Novel domains of the prokaryotic two-component signal transduction systems


FEMS Microbiology Letters 203 (2001) 11-21

www.fems-microbiology.org

MiniReview

Michael Y. Galperin *, Anastasia N. Nikolskaya, Eugene V. Koonin

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA

Received 25 April 2001; received in revised form 28 June 2001; accepted 28 June 2001
First published online 4 August 2001

Abstract

The archetypal two-component signal transduction systems include a sensor histidine kinase and a response regulator, which consists of a receiver CheY-like domain and a DNA-binding domain. Sequence analysis of the sensor kinases and response regulators encoded in complete bacterial and archaeal genomes revealed complex domain architectures for many of them and allowed the identification of several novel conserved domains, such as PAS, GAF, HAMP, GGDEF, EAL, and HD-GYP. All of these domains are widely represented in bacteria, including 19 copies of the GGDEF domain and 17 copies of the EAL domain encoded in the Escherichia coli genome. In contrast, these novel signaling domains are much less abundant in bacterial parasites and in archaea, with none at all found in some archaeal species. This skewed phyletic distribution suggests that the newly discovered complexity of signal transduction systems emerged early in the evolution of bacteria, with subsequent massive loss in parasites and some horizontal dissemination among archaea. Only a few proteins containing these domains have been studied experimentally, and their exact biochemical functions remain obscure; they may include transformations of novel signal molecules, such as the recently identified cyclic diguanylate. Recent experimental data provide the first direct evidence of the participation of these domains in signal transduction pathways, including regulation of virulence genes and extracellular enzyme production in the human pathogens Bordetella pertussis and Borrelia burgdorferi and the plant pathogen Xanthomonas campestris. Gene-neighborhood analysis of these new domains suggests their participation in a variety of processes, from mercury and phage resistance to maintenance of virulence plasmids. It appears that the real picture of the complexity of phosphorelay signal transduction in prokaryotes is only beginning to unfold.

© 2001 Federation of European Microbiological Societies. Published by Elsevier Science B.V. All rights reserved.

Keywords: Signal transduction; Genome sequencing; Conserved motif; Domain organization; Histidine kinase; Phosphorylation; Phosphodiesterase; Cyclic nucleotide

* Corresponding author. Tel.: +1 (301) 435 5910; Fax: +1 (301) 435 7794. E-mail address: [email protected] (M.Y. Galperin).

0378-1097/01/$20.00 © 2001 Federation of European Microbiological Societies. Published by Elsevier Science B.V. All rights reserved. PII: S0378-1097(01)00326-3

1. Introduction

The availability of the complete sequences of more than 40 microbial genomes representing eight of the 10 main bacterial phyla and both major branches of archaea (http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/new_micr.html) increasingly impacts our understanding of microbiology, providing ample data for the analysis of the metabolism, cell organization, and evolution of prokaryotes [1,2]. Cross-genome comparisons improve our understanding of each particular genome, allowing prediction of general (biochemical) functions of uncharacterized genes based on their conserved domain organization, gene neighborhood (operon organization), and phylogenetic patterns (presence in some species but not others) [3-5]. Domain architecture has proven particularly informative for analyzing multi-domain proteins involved in signal transduction. In sensor histidine kinases, several new domains have been described, such as the phosphotransfer Hpt domain [6], the heme- and flavin-binding PAS domain [7,8], the extracellular ligand-binding Cache domain [9], the cGMP-binding GAF domain [10,11], and the HAMP linker domain [12] (see [13-17] for recent reviews).

Analysis of the downstream signal transduction module revealed an even greater diversity. In addition to well-known response regulators, which consist of a CheY-like phosphoacceptor domain and a helix-turn-helix (HTH) DNA-binding domain, bacterial genomes were found to encode a variety of response regulators with unusual domain organization, featuring still poorly characterized domains, such as GGDEF [18-20], EAL [19,21], and HD-GYP [22,23]. Because of their complex domain organization, signaling proteins are often poorly annotated in sequence databases such as GenBank, most often just as 'sensor protein' or 'response regulator'. However, detailed sequence and structure analyses of these novel domains have been performed, and sequence alignments are currently available in several protein domain databases, including SMART [24], COGs [4], and Pfam [25] (Table 1). Here, we briefly review the diversity of newly discovered prokaryotic signaling domains and discuss the emerging complex picture of prokaryotic signal transduction.

2. Recently discovered prokaryotic signaling domains

The archetypal two-component signal transduction systems include a sensor module, which consists of an extracytoplasmic or membrane-associated sensor input domain and a cytoplasmic histidine kinase domain with ATPase and phosphoacceptor subdomains (Table 1), and a response regulator, which consists of a receiver CheY-like domain and a DNA-binding domain. Functions of these domains and of the PAS domain, commonly found in the sensor module, have been reviewed recently [13-17] and will not be considered here in detail.

2.1. Extracytoplasmic ligand-binding sensor domains

Periplasmic (in Gram-positive bacteria, extracytoplasmic) ligand-binding sensor domains are extremely diverse. The most common type of such domains (FliY-type, Table 1) is homologous to the periplasmic solute-binding protein components of the ATP-dependent transport systems [26]. Several sensor kinases, for example Escherichia coli EvgS, contain duplicated FliY-type domains followed by a transmembrane segment that anchors them to the membrane. Another periplasmic ligand-binding domain, Cache, is found in sensor kinases and in the extracytoplasmic parts of methyl-accepting chemotaxis proteins, such as Bacillus subtilis McpA and McpB [9]. There are many other types of ligand-binding sensor domains that are apparently specific for the recognition of narrow groups of substrates, such as metals, citrate, nitrate, etc. Despite their diversity, many sensor modules have the same domain architecture with an N-terminal transmembrane segment (likely uncleavable signal peptide), a relatively large (100-300 aa) periplasmic domain, and a second transmembrane segment, followed by a HAMP domain and a cytoplasmic signal-transducing domain. In addition to extracytoplasmic ligand-binding domains, membrane-bound signaling domains also exist, as exemplified by the recently identified MHYT, a predicted metal-binding, redox-sensing domain (MYG, T.A. Gaidenko, A.Y. Mulkidjanian, and C.W. Price, submitted for publication). The diversity of sensor domains probably reflects

Table 1
Conserved domains of the bacterial signal transduction systems

| Domain name | Length (aa) | Function | Structure | SMART(a) | COG(a) | Pfam(a) | Reference |
|---|---|---|---|---|---|---|---|
| Sensor module(b) | | | | | | | |
| FliY | ~220 | amino acid binding | 2lao | PBPb | 3438 | PF00497 | [26] |
| Cache | ~120 | small ligand binding | - | - | 3290(d) | PF02743 | [9] |
| MHYT | ~200 | metal binding? | transmembrane | - | 3300 | - | - |
| PAS | ~100 | FAD, heme, and cinnamic acid binding | 2phy | PAS, PAC | 2202 | PF00989, PF00785 | [13] |
| GAF | ~150 | cGMP binding, photopigment binding | 1f5m | GAF | 2203 | PF01590 | [10,11] |
| HAMP | ~50 | dimerization? | 1joy | HAMP | 2770 | PF00672 | [12] |
| His kinase 1 | ~80 | phosphoacceptor, dimerization | 1joy | HisKA | 0642(d) | PF00512 | [14,15,17] |
| His kinase 2 | ~120 | phosphorylation of His kinase 1 domain | 1bxd | HATPase | 0642(d) | PF02518 | [14,15,17] |
| Hpt | ~100 | phosphoacceptor | 2a0b | HPT | 2198 | PF01627 | [6,14] |
| Response module(b) | | | | | | | |
| CheY | ~120 | phosphoacceptor | 2che | REC | 0784 | PF00072 | [15,51] |
| HTH | ~120 | DNA binding | | -(c) | -(c) | -(c) | [15,17] |
| AAA | ~300 | σ54-binding ATPase | 1d2n | AAA | 2204 | PF00158 | [52] |
| GGDEF | ~170 | c-diGMP formation? | similar to 1cjv? | DUF1 | 2199 | PF01590 | [19,33] |
| EAL | ~250 | c-diGMP hydrolysis? | - | DUF2 | 2200 | PF00990 | [19] |
| HD-GYP | ~170 | phosphodiesterase? | - | - | 2206 | - | [22] |

(a) Domain database entry. Protein domain databases that contain sequence alignments of these signaling domains include SMART (http://smart.embl-heidelberg.de [24]), COGs (http://www.ncbi.nlm.nih.gov/cog [4]), and Pfam (http://www.sanger.ac.uk/Software/Pfam [25]).
(b) GGDEF, EAL and HD-GYP domains occur in fusions with sensor module domains as well as response module domains (CheY). Therefore, their classification as parts of response modules is based on their predicted functions and requires experimental verification.
(c) HTH domains of response regulators are not listed as separate domains in SMART, COGs, or Pfam.
(d) Members of this COG contain more than one domain.


the wide range of environmental stimuli that elicit regulatory responses in bacterial cells.

2.2. GAF domain

The GAF domain was originally described as a noncatalytic cGMP-binding domain conserved in cyclic nucleotide phosphodiesterases [27]. Subsequently, this domain was recognized in cyanobacterial adenylate cyclases and, finally, in histidine kinases and certain other proteins [10]. In spite of limited sequence similarity, the structure of the GAF domain turned out to be very similar to that of the PAS domain [11], indicating their common ancestry. In bacterial and plant phytochromes, the GAF domain contains a small insertion with a conserved Cys residue that serves for covalent attachment of photopigments [28,29]. GAF domains have also been found in association with a variety of other protein domains, such as PEP-dependent phosphotransferase (PTS Enzyme I [30]), PP2C-type protein phosphatase, NtrC-type ATPase, GGDEF, and EAL [10].

2.3. GGDEF domain

The GGDEF domain (Fig. 1A) was first discovered in the response regulator PleD that controls cell differentiation in the swarmer-to-stalked cell transition in Caulobacter crescentus [18]. PleD and its cognate histidine kinase PleC were first described as members of a typical two-component signal transduction system. However, instead of a typical CheY-HTH domain organization, PleD was found to consist of a CheY domain and a previously uncharacterized domain that was dubbed GGDEF based on its conserved sequence motif (Fig. 1A) [18]. This observation attracted little attention until it turned out that GGDEF is encoded in many bacterial genomes (Table 2), including 19 copies in E. coli and four copies in B. subtilis [22]. Recently, it was shown that PleD mutants with an intact CheY domain but lacking the GGDEF domain are defective in flagellar degradation and stalk formation during cell differentiation in C. crescentus [20].
These data directly demonstrate the involvement of the GGDEF domain in signal transduction, a role that was previously proposed on the basis of the association of this domain with CheY and PAS domains [18] in multi-domain proteins.

Although the functions of most of the GGDEF domain-containing proteins remain uncharacterized, some clues have emerged from a study of the regulation of the biosynthesis of extracellular cellulose in Acetobacter xylinum (recently renamed Gluconacetobacter xylinum) [19]. An extensive study of this process by Benziman and colleagues showed that it is regulated by cyclic diguanylate (c-diGMP, bis(3',5')-cyclic diguanylic acid), a novel effector molecule that consists of two cGMP moieties bound head-to-tail [31]. The exact mechanism remains unclear, but it apparently involves c-diGMP binding to some membrane proteins that activate expression and/or secretion of cellulose synthetase [32]. The search for the enzymes that synthesize and hydrolyze c-diGMP resulted in the identification of six open reading frames (ORFs) with an almost identical domain composition [19]. All of these proteins contain N-terminal PAS domains, followed by the GGDEF domain and another uncharacterized domain, which was dubbed EAL based on its conserved sequence motif (see below). Based on the properties of mutants in which these ORFs were inactivated by insertions, Tal et al. concluded that the GGDEF domain of each of these proteins was responsible for its diguanylate cyclase activity [19]. This conclusion has not been directly verified by demonstrating the enzymatic activity of the recombinant protein, so there remains a (remote) possibility that the ORFs described by Tal et al. [19] just regulate expression of diguanylate cyclases and phosphodiesterases. However, the notion that the GGDEF domain is a diguanylate cyclase has recently received support from a detailed analysis of its sequence. Using iterative PSI-BLAST searches and threading, Pei and Grishin aligned the GGDEF domain with eukaryotic adenylate cyclases [33]. Although the level of sequence similarity between the two domains was low, conservation of the proposed nucleotide-binding loop, which corresponds to the GGDEF motif, was compatible with the cyclase activity of the GGDEF domain [33].

2.4. EAL domain

The EAL domain (Fig. 1B) was originally described in a study of BvgR protein in Bordetella pertussis [21]. Under the conditions in which the genes encoding the major virulence factors are activated, virulence-repressed (vrg) genes are turned off in a BvgR-dependent fashion. Both these processes are under the control of a two-component regulatory system, BvgAS [21]. These observations established BvgR as a component of the signal transduction system in B. pertussis.
A direct interaction of BvgR with DNA was suggested based on the similar expression patterns and molecular masses of BvgR and a putative transcriptional regulator, previously demonstrated to bind to a regulatory sequence within the coding region of vrg genes [34]. However, no direct experimental evidence for such an interaction has been provided. Sequence comparisons indicated the presence of a BvgR-like domain in a number of other poorly characterized proteins from diverse bacteria [21]. The same BvgR-like domain, dubbed EAL after its conserved residues, was independently discovered in tandem with the GGDEF domain in putative diguanylate cyclases and phosphodiesterases that regulate cellulose synthesis in A. xylinum [19]. Since diguanylate cyclase activity has been assigned to the GGDEF domain (see above), the EAL domain emerged as a good candidate for the role of a diguanylate phosphodiesterase. Indeed, the sequence of this domain contains several conserved acidic residues that could participate in metal binding and potentially might form a phosphodiesterase active site (Fig. 1B).

Other experimentally characterized proteins containing the EAL domain are listed in Table 3. It should be noted that YuxH (ComB) protein from B. subtilis, which was originally described as a transcriptional regulator of late competence genes, was later shown not to be required for this regulation [35]. The rtn gene of Proteus vulgaris, originally identified through its effect on the infection by phages λ and N4 [36], was later found in E. coli. The effect of rtn mutation was suppressed in cells grown on maltose, indicating that it might affect expression or membrane localization of LamB, which participates both in maltose transport and attachment of λ and N4.

2.5. HD and HD-GYP domains

Although the predicted phosphodiesterase activity of the EAL domain has not yet been demonstrated, some (predicted) signal transduction proteins do contain bona fide phosphodiesterase domains similar to the ones found in eukaryotic cyclic-nucleotide phosphodiesterases [37]. These domains belong to the recently identified superfamily of metal-dependent phosphohydrolases, designated the HD superfamily after the principal conserved residues implicated in metal binding and catalysis. This superfamily also includes such enzymes as bacterial dGTP triphosphohydrolase and the ppGpp(p) hydrolase SpoT [37]. The version of the HD-type domain that is fused with a CheY domain in response regulator-like proteins from several organisms (Table 4) has many additional highly conserved residues, including a conserved GYP motif (Fig. 1C); this domain was therefore dubbed HD-GYP [22]. Like the GGDEF and EAL domains, the HD-GYP domain was originally implicated in signal transduction on the basis of its association with CheY-like and other signaling domains [22]. Recently, its role in signaling has been demonstrated experimentally. In the plant pathogen

Fig. 1. Consensus sequences of the recently discovered signaling domains. A: GGDEF domain. B: EAL domain. C: HD-GYP domain. The residue numbering is from T. maritima protein TM0107 (A), E. coli YdiV (B), and A. aeolicus aq_2027 (C). The GGDEF motif comprises residues 114-118 in A, the EAL motif comprises residues 29-31 in B, the HD-GYP motif corresponds to the residues 54-55 and 115-117 in C. Consensus sequences of the 218 GGDEF domains, 128 EAL domains, and 36 HD-GYP domains encoded in complete microbial genomes (Table 3) were drawn using the SeqLogo program [60] in the WWW-based implementation by Steven Brenner (http://www.bio.cam.ac.uk/seqlogo). The letters in each position represent amino acid residues found in that position; the height of each letter reflects the fraction of sequences with the corresponding amino acid residue in that position (the degree of conservation). The total height of each column indicates statistical importance of the given position. The residues are colored as follows: N, Q: green; K, R, H: blue; D, E: red; F, L, I, M, V: yellow; the rest: purple.
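The relationship between letter height and column height described in this caption can be illustrated with a short sketch. It assumes the commonly used sequence-logo formulation, in which the height of a column is log2(20) minus the Shannon entropy of that column and each letter receives a share proportional to its frequency, with no small-sample correction; the SeqLogo program may differ in detail. The function name `logo_columns` and the toy alignment are hypothetical and are not taken from the alignments used for Fig. 1.

```python
import math
from collections import Counter


def logo_columns(alignment, alphabet_size=20):
    """For each column return (information_in_bits, {residue: letter_height}).

    Assumption: column height = log2(alphabet_size) - Shannon entropy,
    without any small-sample correction. Gap characters ('-') are ignored.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    max_bits = math.log2(alphabet_size)
    columns = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in alignment if seq[pos] != "-")
        total = sum(counts.values())
        if total == 0:
            # Column made of gaps only: nothing to draw.
            columns.append((0.0, {}))
            continue
        freqs = {aa: n / total for aa, n in counts.items()}
        entropy = -sum(p * math.log2(p) for p in freqs.values())
        info = max_bits - entropy
        # Each letter gets a share of the column height equal to its frequency.
        heights = {aa: p * info for aa, p in freqs.items()}
        columns.append((info, heights))
    return columns


# Hypothetical toy alignment: four five-residue motifs, one with D -> E.
toy_alignment = ["GGDEF", "GGDEF", "GGEEF", "GGDEF"]
cols = logo_columns(toy_alignment)
for position, (bits, heights) in enumerate(cols, start=1):
    print(position, round(bits, 3), {aa: round(h, 3) for aa, h in heights.items()})
```

In this toy case the invariant columns reach the maximal height of log2(20) bits, whereas the third column is shorter and is split between D and E in a 3:1 ratio.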


Table 2
Inventory of signaling domains in complete prokaryotic genomes

Sensor module(b) columns: Cache, PAS, GAF, HisKin, CheY(c), Hpt. Response module(b) columns: CheY(d), GGDEF, EAL, HD-GYP.

| Species(a) | Genome size (kb) | Total number of proteins | Cache | PAS | GAF | HisKin | CheY(c) | Hpt | CheY(d) | GGDEF | EAL | HD-GYP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Bacteria | | | | | | | | | | | | |
| Mesorhizobium loti | 7036 | 6752 | 2 | 36 | 10 | 62 | 15 | 2 | 57 | 32 | 18 | 1 |
| Pseudomonas aeruginosa | 6264 | 5565 | 6 | 42 | 9 | 63(e) | 19 | 11 | 75 | 33 | 21 | 3 |
| Escherichia coli | 4639 | 4289 | 3 | 14 | 9 | 28(e) | 5 | 6 | 32 | 19 | 17 | 0 |
| Bacillus subtilis | 4215 | 4100 | 10 | 14 | 3 | 33(e) | 0 | 1 | 36 | 4 | 3 | 0 |
| Bacillus halodurans | 4202 | 4066 | 8 | 15 | 5 | 36(e) | 2 | 1 | 48 | 4 | 2 | 2 |
| Mycobacterium tuberculosis | 4412 | 3918 | 0 | 2 | 3 | 15 | 0 | 0 | 13 | 1 | 2 | 0 |
| Vibrio cholerae | 4033 | 3827 | 20 | 30 | 5 | 41(e) | 11 | 10 | 49 | 41 | 22 | 9 |
| Caulobacter crescentus | 4017 | 3737 | 3 | 26 | 6 | 62 | 28 | 2 | 45 | 11 | 10 | 0 |
| Synechocystis sp. | 3573 | 3169 | 2 | 26 | 28 | 42(e) | 17 | 7 | 41(e) | 23 | 13 | 2 |
| Mycobacterium leprae | 3268 | 2720 | 0 | 3 | 1 | 5 | 0 | 0 | 5 | 3 | 2 | 0 |
| Xylella fastidiosa | 2679 | 2766 | 0 | 4 | 1 | 14 | 5 | 2 | 20 | 3 | 3 | 1 |
| Deinococcus radiodurans | 2649 | 2580 | 0 | 7 | 3 | 21 | 5 | 0 | 25 | 16 | 5 | 4 |
| Lactococcus lactis | 2365 | 2266 | 0 | 2 | 1 | 7 | 0 | 0 | 7 | 0 | 0 | 0 |
| Neisseria meningitidis | 2184 | 2121 | 0 | 1 | 1 | 5 | 0 | 0 | 5 | 0 | 0 | 0 |
| Thermotoga maritima | 1861 | 1846 | 6 | 4 | 2 | 9 | 0 | 1 | 12 | 9+2(f) | 0 | 9 |
| Haemophilus influenzae | 1830 | 1709 | 0 | 1 | 0 | 4 | 0 | 2 | 6 | 0 | 0 | 0 |
| Campylobacter jejuni | 1641 | 1654 | 5 | 4 | 0 | 7 | 1 | 1 | 11 | 1+1 | 0 | 0 |
| Helicobacter pylori | 1668 | 1566 | 1 | 0 | 0 | 4 | 1 | 1 | 9 | 0 | 0 | 0 |
| Aquifex aeolicus | 1551 | 1522 | 1 | 7 | 4 | 4 | 0 | 0 | 5 | 11 | 8 | 1 |
| Chlamydia pneumoniae | 1230 | 1052 | 0 | 1 | 0 | 1 | 0 | 0 | 2 | 0 | 0 | 0 |
| Treponema pallidum | 1138 | 1031 | 1 | 0 | 2 | 1 | 0 | 1 | 4 | 1+1 | 0 | 3 |
| Chlamydia trachomatis | 1042 | 894 | 0 | 2 | 0 | 1 | 0 | 0 | 2 | 0 | 0 | 0 |
| Borrelia burgdorferi | 911 | 850 | 0 | 1 | 0 | 4 | 1 | 2 | 6 | 1 | 1 | 1 |
| Rickettsia prowazekii | 1111 | 834 | 0 | 0 | 0 | 4 | 0 | 0 | 5 | 1 | 1 | 0 |
| Mycoplasma pneumoniae | 816 | 677 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Ureaplasma urealyticum | 752 | 611 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Buchnera sp. APS | 641 | 564 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Mycoplasma genitalium | 580 | 467 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Archaea | | | | | | | | | | | | |
| Archaeoglobus fulgidus | 2178 | 2420 | 0 | 25 | 5 | 14 | 0 | 1 | 11 | 0 | 0 | 0 |
| Halobacterium sp. NRC-1 | 2014 | 2058 | 0 | 15 | 7 | 14 | 3 | 1 | 6 | 0 | 0 | 0 |
| M. thermoautotrophicum | 1751 | 1869 | 0 | 15 | 4 | 16 | 3 | 0 | 8 | 0 | 0 | 0 |
| Pyrococcus abyssi | 1765 | 1765 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| Pyrococcus horikoshii | 1739 | ~1750 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 |
| Aeropyrum pernix | 1670 | ~1720 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Methanococcus jannaschii | 1665 | 1715 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Thermoplasma acidophilum | 1565 | 1478 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

(a) The names and data for non-obligate parasites are in bold. Complete genome sequences and corresponding references are available in the NCBI Entrez Genome division at http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/micr.html.
(b) Each number represents the number of proteins in a given genome that contain the corresponding domain; multiple occurrences of the same domain (e.g., PAS, CheY) on a single polypeptide chain are counted as one. The numbers are from the COG database (http://www.ncbi.nlm.nih.gov/COG, [4]) and/or from the results of iterative PSI-BLAST searches of domain-specific profiles against a database of proteins encoded in completely sequenced microbial genomes [38,53,54]. These numbers were also compared against those in SMART [24] and those reported in [41]. Complete lists of proteins that contain each particular domain are available at ftp://ncbi.nlm.nih.gov/pub/galperin/TwoCompCensus.html.
(c) CheY-like domains of the 'hybrid' sensor kinases (found on the same polypeptide chain with the His kinase domain, see [39]).
(d) CheY-like domains found in the response regulators (associated with HTH, ATPase, CheB, GGDEF, or HD-GYP domains), as well as stand-alone CheY-like domains.
(e) These numbers do not include signaling proteins of the LytS family (COG3275, COG2972), predicted to be divergent His kinases [40]. The discrepancies with the numbers reported in [39] are due to the addition of Synechocystis sp. His kinase slr1212 and response regulators slr0687, sll1544, sll1879, and slr2041 and the exclusion of stand-alone Hpt domain slr0073 from the list of histidine kinases.
(f) These domains are likely to be inactivated.
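The counting rule stated in footnote b of Table 2 (a protein is counted once per domain type, however many copies of that domain it carries) can be sketched as follows. This is only an illustration under stated assumptions: the input format, a list of (genome, protein, domain) hits such as might be derived from profile searches, the function name `domain_census`, and all identifiers in the example are hypothetical and do not come from the census itself.

```python
from collections import defaultdict


def domain_census(hits):
    """Count proteins per genome that contain each domain at least once.

    `hits` is an iterable of (genome, protein_id, domain) tuples
    (hypothetical format). Repeated hits of the same domain on one
    protein collapse into a single count, as in footnote b of Table 2.
    """
    proteins = defaultdict(set)
    for genome, protein_id, domain in hits:
        proteins[(genome, domain)].add(protein_id)

    counts = defaultdict(dict)
    for (genome, domain), protein_ids in proteins.items():
        counts[genome][domain] = len(protein_ids)
    return dict(counts)


# Hypothetical hits: prot_001 carries two PAS domains and one GGDEF domain.
example_hits = [
    ("genome_1", "prot_001", "PAS"),
    ("genome_1", "prot_001", "PAS"),
    ("genome_1", "prot_001", "GGDEF"),
    ("genome_1", "prot_002", "PAS"),
    ("genome_2", "prot_101", "CheY"),
]
census = domain_census(example_hits)
print(census["genome_1"])  # {'PAS': 2, 'GGDEF': 1}
```

Collecting protein identifiers in a set before counting is what makes the duplicated PAS hit on `prot_001` count only once.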

Xanthomonas campestris, response regulator RpfG, which contains a CheY-like and an HD-GYP domain, has been shown to activate the synthesis of extracellular enzymes and the extracellular polysaccharide [23]. In-frame deletion of the rpfG gene abolished production of extracellular endoglucanase and significantly decreased the levels of polygalacturonate lyase and extracellular polysaccharide [23]. Increased expression of the cognate histidine kinase RpfC stimulated the production of these extracellular enzymes and even overcame the effect of the rpfG mutation. This latter result suggests that response regulators of the CheY-HD-GYP class, like RpfG, represent important, but not the only, output modules for their corresponding sensor kinases. If the HD-GYP domain is indeed a phosphatase or a phosphodiesterase, its highly conserved sequence suggests high substrate specificity. Notably, at least two proteins, Aquifex aeolicus aq_2027 and Deinococcus radiodurans DRA0342, contain a HD-GYP-GGDEF domain combination [22]. Thus, the HD-GYP domain may be involved in the metabolism of cyclic diguanylate or in dephosphorylation of a phosphotransfer domain. A modified version of the HD-GYP domain is fused to the C-terminus of the EAL domain in the ComB (YuxH) protein from B. subtilis, two A. aeolicus proteins, and three Vibrio cholerae proteins. This version lacks the conserved distal portion of the HD-GYP domain (Fig. 1C) and has certain substitutions in the characteristic metal-binding residues of the HD superfamily phosphohydrolases [37], which likely render the domain catalytically inactive.


3. Census of signaling domains in completely sequenced prokaryotic genomes

With the complete sequences of over 30 bacterial and archaeal genomes currently available, we were interested in obtaining accurate counts of the number of signaling domains in each of them. Sequence profiles were constructed for each of these domains (see Fig. 1) and compared using iterative BLAST searches [38] against a database of protein sequences encoded in each of the completely sequenced genomes. The results obtained (Table 2) are very close to those reported earlier for E. coli, Synechocystis sp., and C. crescentus [39-42], and reveal several interesting trends. First, some variations notwithstanding, these domains are abundant in the genomes of all free-living bacteria but are much less common in obligate parasites. This difference is particularly striking in the case of the GGDEF, EAL, and HD-GYP domains (compare the data in Table 2 for A. aeolicus and Helicobacter pylori, two bacteria with nearly the same number of genes,

Table 3
Partly characterized proteins containing GGDEF, EAL, and HD-GYP domains

| Organism, protein name | GenBank accession number | Domain organization(a) | Function, operon structure | Reference |
|---|---|---|---|---|
| Cell differentiation | | | | |
| C. crescentus PleD | L42554 | CheY-xCheY-GGDEF | required for swarmer-to-stalked cell transition | [18,20] |
| Biosynthesis of extracellular polysaccharides | | | | |
| A. xylinum DGC1 | AF052517 | PAS-GGDEF-EAL | required for cellulose biosynthesis | [19] |
| Rhizobium leguminosarum CelR2 | AF121341 | CheY-GGDEF | required for cellulose biosynthesis | [55] |
| X. campestris RpfG | AJ251547 | CheY-HD-GYP | required for biosynthesis of extracellular polysaccharide | [23] |
| Virulence, biosynthesis of extracellular proteins, adhesion | | | | |
| B. pertussis BvgR | AF071567 | EAL | regulates transcription of vrgs | [21] |
| Klebsiella pneumoniae FimK | AAA25064 | HTH-EAL | follows the operon encoding type 1 (mannose-sensitive) fimbriae; is required for their adhesiveness, but not for their formation | S. Clegg |
| V. cholerae MshH (VC0398 or YhdA) | AF079406 | PER-GGDEF-EAL | precedes the operon encoding type IV (mannose-sensitive) fimbriae, but is not required for its expression | [56] |
| V. cholerae VieA | AAC38449 | CheY-EAL-xCheY-HTH | forms an operon with a gene specifically induced in infection | [57] |
| Salmonella typhimurium AdrA | AJ271071 | TM-GGDEF | required for intercellular adhesion (biofilm formation), biosynthesis of extracellular polysaccharide | [47,48] |
| Resistance to phages, toxic metals | | | | |
| E. coli Rtn | U83404 | PER-EAL | overexpression confers resistance to phages λ and N4 | [36] |
| Pseudomonas stutzeri Urf2 (TnpM) | AAC38223 | EAL | encoded on mercury resistance transposons, but not required for resistance; appears to affect transposition rate | [45,46] |
| Aeromonas jandaei RrpX | U67070 | xCheY-CheY-GGDEF | permits overexpression in E. coli of A. jandaei β-lactamase AsbB1, but not of β-lactamases AsbA1 or AsbM1 | [58] |
| Response to oxygen and light | | | | |
| E. coli Dos | BAA15160 | PAS-GGDEF-EAL | reversibly binds O2, CO, and NO with high affinity | [59] |
| Synechocystis sp. Cph2 | BAA10536 | GAF-GAF-GGDEF-EAL-GAF-GGDEF | phytochrome, dark-induced and repressed by light | [29] |

(a) Only readily discernible domains are listed. TM, multiple (four or more) transmembrane segments; xCheY, a CheY domain that is inactivated by mutations; PER, a periplasmic sensor module of typical topology followed by a single transmembrane segment.
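The hyphen-separated domain-organization notation used in Tables 3 and 4 (for example CheY-xCheY-GGDEF) can be split into individual domains programmatically, with the caveat that the name HD-GYP itself contains a hyphen. The sketch below is a hypothetical helper, not a tool used in this review: the list of domain names is taken from the tables, whereas the function names, the rule of matching the longest name first, and the three-way grouping (a reading of the row groups of Table 4) are assumptions made for illustration.

```python
import re

# Domain abbreviations as they appear in Tables 3 and 4.
KNOWN_DOMAINS = ["HD-GYP", "xCheY", "CheY", "GGDEF", "EAL", "PAS", "GAF",
                 "HTH", "AAA", "FliY", "PER", "TM"]
# Try longer names first so that "HD-GYP" and "xCheY" are matched whole.
_TOKEN = re.compile("|".join(
    re.escape(name) for name in sorted(KNOWN_DOMAINS, key=len, reverse=True)))

EXTRACYTOPLASMIC = {"FliY", "PER"}
CYTOPLASMIC = {"PAS", "GAF"}


def parse_architecture(text):
    """Split a string such as 'CheY-HD-GYP' into ['CheY', 'HD-GYP']."""
    domains, pos = [], 0
    while pos < len(text):
        if text[pos] == "-":          # separator between domains
            pos += 1
            continue
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unrecognized domain at position {pos}: {text!r}")
        domains.append(match.group())
        pos = match.end()
    return domains


def classify(text):
    """Assign an architecture to one of three groups (assumed reading of Table 4)."""
    domains = set(parse_architecture(text))
    if domains & EXTRACYTOPLASMIC:
        return "extracytoplasmic sensor"
    if domains & CYTOPLASMIC:
        return "cytoplasmic sensor"
    return "no recognized sensor domain"


for architecture in ["CheY-HD-GYP", "PER-GGDEF-EAL", "CheY-PAS-GGDEF-EAL"]:
    print(architecture, parse_architecture(architecture), classify(architecture))
```

Matching against a fixed vocabulary rather than splitting on every hyphen is the design choice that keeps HD-GYP intact; an unknown abbreviation raises an error instead of being silently mis-split.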


but with a free-living and parasitic life style, respectively). It appears that these newly discovered domains might be particularly important for sensing the more diverse environmental stimuli encountered by free-living or non-obligatory parasitic bacteria. The minimal genomes of mycoplasmas and Buchnera do not encode any signaling proteins at all (Table 2).

Second, the signaling domains are generally less abundant and less evenly distributed in archaea than they are in bacteria. Although certain archaeal species, such as Archaeoglobus fulgidus and Methanobacterium thermoautotrophicum, encode a significant number of histidine kinases, CheY-like response domains, and PAS domains, the GGDEF, EAL, and HD-GYP domains have not been detected in any of the archaeal genomes sequenced thus far (Table 2). This result is particularly surprising because all of these domains are abundant in the hyperthermophilic bacteria A. aeolicus and Thermotoga maritima, which appear to have undergone horizontal gene exchange with the archaea on a massive scale [43,44]. Furthermore, two of the sequenced archaeal genomes, those of Aeropyrum pernix (the only sequenced representative of the Crenarchaeota, one of the two major archaeal branches) and Methanococcus jannaschii, do not appear to encode any of the currently recognized signaling domains (Table 2). This uneven phyletic distribution lends credence to a scenario whereby the two-component signaling system and the potential c-diGMP signaling system emerged early during bacterial evolution, with some of the components subsequently acquired by certain archaeal lineages via horizontal gene transfer.

4. Genomic context of the new signaling domains

The abundance of the uncharacterized signaling domains (Table 2) was one of the most unexpected features of bacteria revealed by genome sequencing. Indeed, none of the 19 copies of the GGDEF domain and 17 copies of the EAL domain encoded in the E. coli genome belongs to an experimentally characterized protein (see, e.g., COG2199 and COG2200 in the COG database, http://www.ncbi.nlm.nih.gov/COG [4]). The number of these domains in the recently sequenced genomes of P. aeruginosa and V. cholerae is even higher, and again the functions of the encoded proteins are as obscure as they probably are diverse (Table 3). Gram-positive bacteria appear to encode fewer of these signaling domains; there are only four proteins with the GGDEF domain and three proteins with the EAL domain in B. subtilis (Table 2), and none of them has been characterized either. Therefore, it is becoming increasingly clear that we have been missing major aspects of the regulatory circuits present in bacterial cells.

In the absence of direct experimental data, some clues to the range of functions of these newly described domains could be revealed by their genomic context, including operon structures and conserved domain fusions [5]. Unfortunately, the GGDEF-, EAL-, and HD-GYP-encoding genes are seldom found in operons, let alone conserved ones. Indeed, of the 29 E. coli genes that encode a GGDEF domain, an EAL domain, or both, only one, yfiN, forms a potential operon with another uncharacterized gene. Six more are paired into potential operons yddVU, yeaIJ, and yliEF. Thus, the majority of the genes coding for these domains in E. coli (and in most other bacteria) are not predicted to be in operons. Furthermore, in many cases the orientation of these genes is opposite to that of their nearest neighbors.

An interesting feature of many GGDEF and EAL domain-encoding genes is their presence in various transposons. For example, an EAL-encoding Urf2 has been found at the end of the mer operons in transposons Tn21 and Tn501, although it is not required for mercury resistance and is deleted in Tn5053 [45]. A part of this gene, named tnpM for transposition modulator, has been shown to enhance Tn21 transposition by activating transposase expression and decreasing resolvase expression [46]. Whether the full-length Urf2 has the same activity is unknown. Genes for stand-alone EAL and GGDEF domains have also been found in the Lactococcus lactis transposon Tn5481.

4.1. Conserved fusions of novel signaling domains

Two-domain fusion proteins consisting of a phosphoacceptor CheY-like domain and either a GGDEF or an EAL domain were classified as response regulators even before the abundance of such fusions became apparent [18,39,40]. A systematic analysis of domain organization of signaling proteins in completely sequenced bacterial genomes shows numerous domain fusions that pair novel output domains (GGDEF, EAL, and HD-GYP) not just with CheY-like domains but also with extracytoplasmic ligand-binding sensor domains or with cytoplasmic PAS and GAF sensor domains (Table 4). The variety of these multi-domain proteins seems to mirror that of sensor kinases.
This circumstance apparently reflects an underlying uniformity of the mechanisms of signal transduction in the cell, from an N-terminal sensor domain to a transmitter domain to a C-terminal response output domain, and suggests that the novel domains comprise a distinct signaling (cyclic diguanylate-based?) system that complements the classical two-component system. Indeed, in several independent cases [18,20,21,23] it has been shown that predicted response regulators containing these novel domains are regulated by, and act in parallel with, the 'standard' (CheY-HTH) response regulators. They seem to provide an additional output module and, potentially, a means of feedback control. The systems that are regulated by these novel response regulators include those responsible for the interaction of the bacteria with the environment (fimbriae, extracellular proteins, virulence) and with each other (biofilm formation [47,48], Table 3). Although such systems are extremely important in vivo, they may act in response to stimuli that have not yet been replicated in vitro.

Table 4
Diversity of domain fusions in bacterial response regulators

Domain organization
Experimentally studied examples
Examples identified solely from genomic sequences

Simple response regulators
CheY-HTH
E. coli ArcA, CitB, FimZ, UvrY, KdpE, NarL, NarP, OmpR, PhoB B. subtilis CitT, ComA, GerE, PhoP, ResD
E. coli YgiX, YedH, YlcA
CheY-AAA-HTH
E. coli AtoC, GlnG, HydG
CheY-GGDEF CheY-EAL CheY-HD-GYP
C. crescentus PleD, R. leguminosarum CelR2 – X. campestris RpfG
CheY-GGDEF-EAL HD-GYP-GGDEF
– –

Response regulators with extracytoplasmic sensor domains
FliY-GGDEF – FliY-HD-GYP – PER-GGDEF – PER-EAL – PER-GGDEF-EAL – PER-HD-GYP –
VC1067, VCA0557 TM1170 VC2285, VC2454, VCA1082, PA0847 YhjK, PA2072, PA1433, slr2077 TM1682, VCA0895

Response regulators with cytoplasmic sensor domains
PAS-GGDEF-EAL A. xylinum DGC1, PHE CheY-PAS-GGDEF-EAL – GAF-GGDEF – CheY-GAF-GGDEF – GAF-GGDEF-EAL – GAF-PAS-GGDEF-EAL
B. subtilis YdbG, YdfI, YfiK, YhcZ, YocG, YufM, YvfU, YvqC, YxjL E. coli YfhA, Cj1024c, HP0703, RP562, CT468, CPn0586, TP0519, BB0763 BB0419, VC1086, RP237, Cj0643 slr1588, PA3947, VC1652 PA2572, PA4781, slr2100, sll1624, TM0186, TM1147, VC1087, VC1348, VCA0210, XF1113 Thiocystis violacea ORF5 (S54369)a, XF0401 aq_2027, DRA0342
aq_1442 slr1305, PA4959, XF2624 E. coli YeaP, DRB0044, slr1143, sll0048, PA2771 slr0687 Rhizobium etli ORF1 (AF034831), Rv1354c, VCA0080, VCA0785, PA2567, ML1750 Azorhizobium caulinodans YntC (X63841), PA5017
–

Fusions of sensory transduction domains from various signaling systems
GAF-PtsI E. coli PtsP (AAB40476), Azotobacter vinelandii PtsP (Y14681) CheY-RsbU – PAS-RsbU B. subtilis RsbP (YvfP)
PA0337, VC0672, mll3436 PA3346, VCA1086, slr1983 Rv1364c

a For experimentally studied proteins not listed in Table 2, GenBank accession numbers are given in parentheses. The names of proteins encoded in completely sequenced genomes are listed exactly as in genome annotations; the corresponding sequences can be retrieved from the NCBI WWW site at http://www.ncbi.nlm.nih.gov/Entrez/ or http://www.ncbi.nlm.nih.gov/COG/. Protein names shown in italics indicate presence of other domains in addition to those listed in the first column.

4.2. Cross-talk between different signaling systems

The variety of fusions between signaling domains discussed above extends to fusions of these domains with components of other signaling pathways, creating a complex network of regulatory interactions. Perhaps the most interesting case is a fusion of a GAF domain to the Enzyme I of the PEP-dependent sugar:phosphotransferase system, first described for the E. coli PtsP protein [30]. PtsP and similar proteins encoded in the genomes of P. aeruginosa, V. cholerae, and Mesorhizobium loti might modulate the activity of the phosphotransferase system in response to the levels of cGMP or some other ligand that interacts with their N-terminal GAF domains.
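The domain-organization labels used in Table 4 (for example, CheY-GGDEF-EAL) can be read mechanically. The following Python sketch is an illustration only and is not part of the original analysis. The role assignments follow the groupings used in the text (PAS, GAF, FliY, and PER as sensor domains; CheY as the receiver; HTH, GGDEF, EAL, and HD-GYP as outputs), whereas the function names and the catch-all label 'other' for AAA, PtsI, and RsbU are assumptions introduced here for convenience.

```python
# Domain names as written in Table 4. "HD-GYP" contains a hyphen itself,
# so a plain arch.split("-") would wrongly break it into "HD" and "GYP".
KNOWN_DOMAINS = [
    "HD-GYP", "GGDEF", "CheY", "FliY", "PtsI", "RsbU",
    "AAA", "EAL", "GAF", "HTH", "PAS", "PER",
]

# Roles as grouped in the text; anything not listed is labeled "other"
# (a hypothetical catch-all, not a classification made by the authors).
ROLE = {
    "PAS": "sensor", "GAF": "sensor", "FliY": "sensor", "PER": "sensor",
    "CheY": "receiver",
    "HTH": "output", "GGDEF": "output", "EAL": "output", "HD-GYP": "output",
}


def parse_architecture(arch):
    """Split a label such as 'CheY-HD-GYP' into a list of domain names."""
    domains = []
    rest = arch
    while rest:
        for name in KNOWN_DOMAINS:
            # A domain matches only if it is followed by '-' or ends the label.
            if rest == name or rest.startswith(name + "-"):
                domains.append(name)
                rest = rest[len(name) + 1:]  # skip the name and one hyphen
                break
        else:
            raise ValueError("unrecognized domain in %r at %r" % (arch, rest))
    return domains


def roles(arch):
    """Return the role of each domain, N-terminus to C-terminus."""
    return [ROLE.get(d, "other") for d in parse_architecture(arch)]


if __name__ == "__main__":
    for label in ("CheY-HTH", "CheY-HD-GYP", "GAF-PAS-GGDEF-EAL", "GAF-PtsI"):
        print(label, parse_architecture(label), roles(label))
```

Read this way, most of the fusions in Table 4 place a sensor or receiver domain N-terminal to an output domain, which is the sensor-to-output ordering described in Section 4.1.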

Another notable example of cross-talk between different regulatory systems (Table 4) is the fusion of CheY-like and GAF domains with phosphatase domains of the PP2C type, found in RsbU-like regulators of the σB subunit, which participate in the stress response in B. subtilis and many other bacteria [49,50]. Such fusion proteins should be able to couple stress responses directly to perturbations in oxygen and/or cGMP levels.

5. Conclusions and perspectives

Comparative analysis of complete microbial genomes reveals a network of regulatory interactions that is much more complex than was assumed previously. This complexity is mostly limited to free-living bacteria, whereas parasites with degraded genomes have few, if any, sensory transduction systems. Functions of some of the signaling domains are already known, whereas the functions of others remain to be discovered. If the GGDEF and EAL domains indeed function as a diguanylate cyclase and a c-diGMP phosphodiesterase, respectively [19], c-diGMP could emerge as a major cell regulator in bacteria but, remarkably, not in archaea. Experimental characterization of the functions of these domains will significantly advance our understanding of the principles and mechanisms governing the prokaryotic regulatory machinery.

Note added in proof

While this paper was under review, the c-diGMP phosphodiesterase activity of the Acetobacter xylinum DGC1-like protein (see Table 3) was shown to be regulated by oxygen [61]. Also, we became aware of the data implicating GGDEF-containing proteins in hemin storage in Yersinia pestis [62] and in flagellar function in E. coli [63].

References

[1] Koonin, E.V. and Galperin, M.Y. (1997) Prokaryotic genomes: the emerging paradigm of genome-based microbiology. Curr. Opin. Genet. Dev. 7, 757–763.
[2] Nelson, K.E., Paulsen, I.T., Heidelberg, J.F. and Fraser, C.M. (2000) Status of genome projects for nonpathogenic bacteria and archaea. Nat. Biotechnol. 18, 1049–1054.
[3] Tatusov, R.L., Koonin, E.V. and Lipman, D.J. (1997) A genomic perspective on protein families. Science 278, 631–637.
[4] Tatusov, R.L., Galperin, M.Y., Natale, D.A. and Koonin, E.V. (2000) The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res. 28, 33–36.
[5] Galperin, M.Y. and Koonin, E.V. (2000) Who's your neighbor? New computational approaches for functional genomics. Nat. Biotechnol. 18, 609–613.
[6] Matsushika, A. and Mizuno, T. (1998) The structure and function of the histidine-containing phosphotransfer (HPt) signaling domain of the Escherichia coli ArcB sensor. J. Biochem. Tokyo 124, 440–445.
[7] Ponting, C.P. and Aravind, L. (1997) PAS: a multifunctional domain family comes to light. Curr. Biol. 7, R674–R677.
[8] Zhulin, I.B., Taylor, B.L. and Dixon, R. (1997) PAS domain S-boxes in Archaea, Bacteria and sensors for oxygen and redox. Trends Biochem. Sci. 22, 331–333.
[9] Anantharaman, V. and Aravind, L. (2000) Cache – a signaling domain common to animal Ca2+-channel subunits and a class of prokaryotic chemotaxis receptors. Trends Biochem. Sci. 25, 535–537.
[10] Aravind, L. and Ponting, C.P. (1997) The GAF domain: an evolutionary link between diverse phototransducing proteins. Trends Biochem. Sci. 22, 458–459.
[11] Ho, Y.S., Burden, L.M. and Hurley, J.H. (2000) Structure of the GAF domain, a ubiquitous signaling motif and a new class of cyclic GMP receptor. EMBO J. 19, 5288–5299.
[12] Aravind, L. and Ponting, C.P. (1999) The cytoplasmic helical linker domain of receptor histidine kinase and methyl-accepting proteins is common to many prokaryotic signalling proteins. FEMS Microbiol. Lett. 176, 111–116.
[13] Taylor, B.L. and Zhulin, I.B. (1999) PAS domains: internal sensors of oxygen, redox potential, and light. Microbiol. Mol. Biol. Rev. 63, 479–506.


[14] Dutta, R., Qin, L. and Inouye, M. (1999) Histidine kinases: diversity of domain organization. Mol. Microbiol. 34, 633–640.
[15] Grebe, T.W. and Stock, J.B. (1999) The histidine protein kinase superfamily. Adv. Microb. Physiol. 41, 139–227.
[16] Hoch, J.A. (2000) Two-component and phosphorelay signal transduction. Curr. Opin. Microbiol. 3, 165–170.
[17] Stock, A.M., Robinson, V.L. and Goudreau, P.N. (2000) Two-component signal transduction. Annu. Rev. Biochem. 69, 183–215.
[18] Hecht, G.B. and Newton, A. (1995) Identification of a novel response regulator required for the swarmer-to-stalked-cell transition in Caulobacter crescentus. J. Bacteriol. 177, 6223–6229.
[19] Tal, R., Wong, H.C., Calhoon, R., Gelfand, D., Fear, A.L., Volman, G., Mayer, R., Ross, P., Amikam, D., Weinhouse, H., Cohen, A., Sapir, S., Ohana, P. and Benziman, M. (1998) Three cdg operons control cellular turnover of cyclic di-GMP in Acetobacter xylinum: genetic organization and occurrence of conserved domains in isoenzymes. J. Bacteriol. 180, 4416–4425.
[20] Aldridge, P. and Jenal, U. (1999) Cell cycle-dependent degradation of a flagellar motor component requires a novel-type response regulator. Mol. Microbiol. 32, 379–391.
[21] Merkel, T.J., Barros, C. and Stibitz, S. (1998) Characterization of the bvgR locus of Bordetella pertussis. J. Bacteriol. 180, 1682–1690.
[22] Galperin, M.Y., Natale, D.A., Aravind, L. and Koonin, E.V. (1999) A specialized version of the HD hydrolase domain implicated in signal transduction. J. Mol. Microbiol. Biotechnol. 1, 303–305.
[23] Slater, H., Alvarez-Morales, A., Barber, C.E., Daniels, M.J. and Dow, J.M. (2000) A two-component system involving an HD-GYP domain protein links cell-cell signalling to pathogenicity gene expression in Xanthomonas campestris. Mol. Microbiol. 38, 986–1003.
[24] Schultz, J., Copley, R.R., Doerks, T., Ponting, C.P. and Bork, P. (2000) SMART: a web-based tool for the study of genetically mobile domains. Nucleic Acids Res. 28, 231–234.
[25] Bateman, A., Birney, E., Durbin, R., Eddy, S.R., Howe, K.L. and Sonnhammer, E.L. (2000) The Pfam protein families database. Nucleic Acids Res. 28, 263–266.
[26] Tam, R. and Saier Jr., M.H. (1993) Structural, functional, and evolutionary relationships among extracellular solute-binding receptors of bacteria. Microbiol. Rev. 57, 320–346.
[27] Charbonneau, H., Prusti, R.K., LeTrong, H., Sonnenburg, W.K., Mullaney, P.J., Walsh, K.A. and Beavo, J.A. (1990) Identification of a noncatalytic cGMP-binding domain conserved in both the cGMP-stimulated and photoreceptor cyclic nucleotide phosphodiesterases. Proc. Natl. Acad. Sci. USA 87, 288–292.
[28] Davis, S.J., Vener, A.V. and Vierstra, R.D. (1999) Bacteriophytochromes: phytochrome-like photoreceptors from nonphotosynthetic eubacteria. Science 286, 2517–2520.
[29] Park, C.M., Kim, J.I., Yang, S.S., Kang, J.G., Kang, J.H., Shim, J.Y., Chung, Y.H., Park, Y.M. and Song, P.S. (2000) A second photochromic bacteriophytochrome from Synechocystis sp. PCC 6803 spectral analysis and down-regulation by light. Biochemistry 39, 10840–10847.
[30] Reizer, J., Reizer, A., Merrick, M.J., Plunkett, G., Rose, D.J. and Saier, M.H. (1996) Novel phosphotransferase-encoding genes revealed by analysis of the Escherichia coli genome: a chimeric gene encoding an Enzyme I homologue that possesses a putative sensory transduction domain. Gene 181, 103–108.
[31] Ross, P., Mayer, R. and Benziman, M. (1991) Cellulose biosynthesis and function in bacteria. Microbiol. Rev. 55, 35–58.
[32] Weinhouse, H., Sapir, S., Amikam, D., Shilo, Y., Volman, G., Ohana, P. and Benziman, M. (1997) c-di-GMP-binding protein, a new factor regulating cellulose synthesis in Acetobacter xylinum. FEBS Lett. 416, 207–211.
[33] Pei, J. and Grishin, N.V. (2001) GGDEF domain is homologous to adenylyl cyclase. Proteins 42, 210–216.
[34] Beattie, D.T., Mahan, M.J. and Mekalanos, J.J. (1993) Repressor binding to a regulatory site in the DNA coding sequence is sufficient to confer transcriptional regulation of the vir-repressed genes (vrg genes) in Bordetella pertussis. J. Bacteriol. 175, 519–527.
[35] Dubnau, D. (1991) Genetic competence in Bacillus subtilis. Microbiol. Rev. 55, 395–424.
[36] Chae, K.S. and Yoo, O.J. (1986) Cloning of the lambda resistant genes from Brevibacterium albidum and Proteus vulgaris into Escherichia coli. Biochem. Biophys. Res. Commun. 140, 1101–1105.
[37] Aravind, L. and Koonin, E.V. (1998) The HD domain defines a new superfamily of metal-dependent phosphohydrolases. Trends Biochem. Sci. 23, 469–472.
[38] Chervitz, S.A., Aravind, L., Sherlock, G., Ball, C.A., Koonin, E.V., Dwight, S.S., Harris, M.A., Dolinski, K., Mohr, S., Smith, T., Weng, S., Cherry, J.M. and Botstein, D. (1998) Comparison of the complete protein sets of worm and yeast: orthology and divergence. Science 282, 2022–2028.
[39] Mizuno, T., Kaneko, T. and Tabata, S. (1996) Compilation of all genes encoding bacterial two-component signal transducers in the genome of the cyanobacterium, Synechocystis sp. strain PCC 6803. DNA Res. 3, 407–414.
[40] Mizuno, T. (1997) Compilation of all genes encoding two-component phosphotransfer signal transducers in the genome of Escherichia coli. DNA Res. 4, 161–168.
[41] Koretke, K.K., Lupas, A.N., Warren, P.V., Rosenberg, M. and Brown, J.R. (2000) Evolution of two-component signal transduction. Mol. Biol. Evol. 17, 1956–1970.
[42] Nierman, W.C., Feldblyum, T.V., Laub, M.T., Paulsen, I.T., Nelson, K.E., Eisen, J., Heidelberg, J.F., Alley, M.R., Ohta, N., Maddock, J.R., Potocka, I., Nelson, W.C., Newton, A., Stephens, C., Phadke, N.D., Ely, B., DeBoy, R.T., Dodson, R.J., Durkin, A.S., Gwinn, M.L., Haft, D.H., Kolonay, J.F., Smit, J., Craven, M.B., Khouri, H., Shetty, J., Berry, K., Utterback, T., Tran, K., Wolf, A., Vamathevan, J., Ermolaeva, M., White, O., Salzberg, S.L., Venter, J.C., Shapiro, L. and Fraser, C.M. (2001) Complete genome sequence of Caulobacter crescentus. Proc. Natl. Acad. Sci. USA 98, 4136–4141.
[43] Aravind, L., Tatusov, R.L., Wolf, Y.I., Walker, D.R. and Koonin, E.V. (1998) Evidence for massive gene exchange between archaeal and bacterial hyperthermophiles. Trends Genet. 14, 442–444.
[44] Nelson, K.E., Clayton, R.A., Gill, S.R., Gwinn, M.L., Dodson, R.J., Haft, D.H., Hickey, E.K., Peterson, J.D., Nelson, W.C., Ketchum, K.A., McDonald, L., Utterback, T.R., Malek, J.A., Linher, K.D., Garrett, M.M., Stewart, A.M., Cotton, M.D., Pratt, M.S., Phillips, C.A., Richardson, D., Heidelberg, J., Sutton, G.G., Fleischmann, R.D., Eisen, J.A. and Fraser, C.M. (1999) Evidence for lateral gene transfer between Archaea and bacteria from genome sequence of Thermotoga maritima. Nature 399, 323–329.
[45] Brown, N.L., Misra, T.K., Winnie, J.N., Schmidt, A., Seiff, M. and Silver, S. (1986) The nucleotide sequence of the mercuric resistance operons of plasmid R100 and transposon Tn501: further evidence for mer genes which enhance the activity of the mercuric ion detoxification system. Mol. Gen. Genet. 202, 143–151.
[46] Hyde, D.R. and Tu, C.P. (1985) tnpM: a novel regulatory gene that enhances Tn21 transposition and suppresses cointegrate resolution. Cell 42, 629–638.
[47] Romling, U., Rohde, M., Olsen, A., Normark, S. and Reinkoster, J. (2000) AgfD, the checkpoint of multicellular and aggregative behaviour in Salmonella typhimurium regulates at least two independent pathways. Mol. Microbiol. 36, 10–23.

[48] Zogaj, X., Nimtz, M., Rohde, M., Bokranz, W. and Romling, U. (2001) The multicellular morphotypes of Salmonella typhimurium and Escherichia coli produce cellulose as the second component of the extracellular matrix. Mol. Microbiol. 39, 1452–1463.
[49] Vijay, K., Brody, M.S., Fredlund, E. and Price, C.W. (2000) A PP2C phosphatase containing a PAS domain is required to convey signals of energy stress to the sigmaB transcription factor of Bacillus subtilis. Mol. Microbiol. 35, 180–188.
[50] Koonin, E.V., Aravind, L. and Galperin, M.Y. (2000) A comparative-genomic view of the microbial stress response. In: Bacterial Stress Responses (Storz, G. and Hengge-Aronis, R., Eds.), pp. 417–444. ASM Press, Washington, DC.
[51] Volz, K. (1993) Structural conservation in the CheY superfamily. Biochemistry 32, 11741–11753.
[52] Neuwald, A.F., Aravind, L., Spouge, J.L. and Koonin, E.V. (1999) AAA+: A class of chaperone-like ATPases associated with the assembly, operation, and disassembly of protein complexes. Genome Res. 9, 27–43.
[53] Natale, D.A., Galperin, M.Y., Tatusov, R.L. and Koonin, E.V. (2000) Using the COG database to improve gene recognition in complete genomes. Genetica 108, 9–17.
[54] Galperin, M.Y. (2001) Conserved 'hypothetical' proteins: new hints and new puzzles. Comp. Funct. Genomics 2, 14–18.
[55] Ausmees, N., Jonsson, H., Hoglund, S., Ljunggren, H. and Lindberg, M. (1999) Structural and putative regulatory genes involved in cellulose synthesis in Rhizobium leguminosarum bv. trifolii. Microbiology 145, 1253–1262.
[56] Marsh, J.W. and Taylor, R.K. (1999) Genetic and transcriptional analyses of the Vibrio cholerae mannose-sensitive hemagglutinin type 4 pilus gene locus. J. Bacteriol. 181, 1110–1117.
[57] Lee, S.H., Angelichio, M.J., Mekalanos, J.J. and Camilli, A. (1998) Nucleotide sequence and spatiotemporal expression of the Vibrio cholerae vieSAB genes during infection. J. Bacteriol. 180, 2298–2305.
[58] Alksne, L.E. and Rasmussen, B.A. (1997) Expression of the AsbA1, OXA-12, and AsbM1 beta-lactamases in Aeromonas jandaei AER 14 is coordinated by a two-component regulon. J. Bacteriol. 179, 2006–2013.
[59] Delgado-Nixon, V.M., Gonzalez, G. and Gilles-Gonzalez, M.A. (2000) Dos, a heme-binding PAS protein from Escherichia coli, is a direct oxygen sensor. Biochemistry 39, 2685–2691.
[60] Schneider, T.D. and Stephens, R.M. (1990) Sequence logos: a new way to display consensus sequences. Nucleic Acids Res. 18, 6097–6100.
[61] Chang, A.L., Tuckerman, J.R., Gonzalez, G., Mayer, R., Weinhouse, H., Volman, G., Amikam, D., Benziman, M. and Gilles-Gonzalez, M.A. (2001) Phosphodiesterase A1, a regulator of cellulose synthesis in Acetobacter xylinum, is a heme-based sensor. Biochemistry 40, 3420–3426.
[62] Jones, H.A., Lillard Jr., J.W. and Perry, R.D. (1999) HmsT, a protein essential for expression of the haemin storage (Hms+) phenotype of Yersinia pestis. Microbiology 145, 2117–2128.
[63] Ko, M. and Park, C. (2000) Two novel flagellar components and H-NS are involved in the motor function of Escherichia coli. J. Mol. Biol. 303, 371–382.