Bacterial FHA domains: neglected players in the phospho-threonine signalling game?


Opinion

TRENDS in Microbiology Vol.10 No.12 December 2002

Mark Pallen, Roy Chaudhuri and Arshad Khan

Forkhead-associated (FHA) domains bind phospho-threonine peptides and are known to mediate phosphorylation-dependent protein–protein interactions in a variety of eukaryotic settings. However, their role in bacterial physiology and signalling has been largely neglected. We have surveyed bacterial FHA domains and discovered that they are implicated in many bacterial processes, including regulation of cell shape, type III secretion, sporulation, pathogenic and symbiotic host–bacterium interactions, carbohydrate storage and transport, signal transduction and ethambutol resistance. The way is now open to identify the targets of each FHA domain, and their roles in cellular physiology, and perhaps even to develop novel FHA-blocking antibacterial agents.

Published online: 31 October 2002

Mark Pallen*, Roy Chaudhuri and Arshad Khan
Division of Immunity & Infection, Birmingham University Medical School, Birmingham, UK B15 2TT.
*e-mail: [email protected]

The forkhead-associated (FHA) domain was first described in 1995 as a conserved sequence of ~75 amino acids found in 20, mainly eukaryotic, proteins, including several forkhead-type transcription factors [1]. Subsequent studies established that the FHA domain is a phosphoprotein-recognition unit with a clear preference for phospho-threonine (pT) peptides (see recent reviews [2–5]). The FHA domain has now been found in >200 proteins (PFAM entry PF00498) involved in diverse eukaryotic processes, including signal transduction, response to DNA damage, vesicular transport, chromosome segregation, protein degradation and cell cycle regulation.

The structures of three FHA domains with associated phosphopeptides have been reported [3,6,7]. From these, it is clear that the structural domain encompasses a larger region than is obvious from sequence homology and is folded into an 11-stranded β-sandwich. The most conserved residues are involved in recognition of the phosphopeptide backbone or the pT residue; the specificity-determining residues (e.g. those contacting the pT+3 residue) are less well conserved [3,6]. As Durocher and Jackson recently pointed out [3], the discovery of an FHA domain in a protein has predictive significance: it strongly suggests that the FHA-domain-containing protein interacts with a protein partner in a process regulated by reversible protein phosphorylation.

Regulatory phosphorylation or dephosphorylation of serine or threonine residues by serine/threonine protein kinases (STPKs) and phosphatases (STPPs) is an emerging theme in prokaryotic signalling [8–10], particularly given the discovery of many STPKs and STPPs through genome sequencing and sequence surveys [11–14]. However, although it has been clear from the outset that FHA domains occur in bacterial proteins (64 bacterial FHA proteins are now listed in the SMART database, http://smart.ox.ac.uk, and Leonard et al. have estimated the number of FHA domains in STPK-containing genomes [15]), there have been no reports of experimental characterization of the physiological role, or identification of the binding partner(s), of any bacterial FHA domain. (One marginal exception demonstrated the threonine-phosphorylation-dependent peptide-binding specificity, but not actual binding partners, of FHA domains from two mycobacterial proteins, Rv1827 and Rv0020c [6].) Nor has there been any systematic attempt to survey the occurrence of FHA domains in bacteria with a view to making predictions as to their roles or binding partners that could be explored in the laboratory. We have therefore undertaken a search for FHA domains in bacterial proteins, drawing particularly on proteins predicted from completed genome sequences.

Methods

An iterative PSI-BLAST search with an FHA domain of known structure (Fha1 domain of yeast Rad53; PDB entry 1G3GA) was performed on our ViruloGenome web site (http://www.vge.ac.uk) against a database combining the NCBI NR database with products predicted by GLIMMER (http://www.tigr.org/softlab/glimmer/glimmer.html) from unfinished genomes. The default settings were used (expect value for inclusion in iteration 1: 0.002; BLOSUM62 matrix; composition-based statistics; no filter) and the search converged, terminating after eight iterations. Although protein sequences from unfinished genomes were recovered in our searches and might have played a role in bridging gaps in sequence space [16], for economy of space further analyses and descriptions were largely restricted to proteins from published completed genome sequences. The genomic contexts of genes encoding FHA domains were scrutinized for STPKs and STPPs and for other clues as to function. The domain structures of protein sequences derived from adjacent genes were analyzed using PFAM (http://www.sanger.ac.uk/Pfam/) and SMART (http://smart.ox.ac.uk) [17]. Additional PSI-BLAST searches on novel FHA domains were performed using default parameters as required.

Results

FHA domains are found in a wide range of bacteria

Although FHA domains were found in many bacterial genomes, they are not a universal feature of bacterial signalling, being restricted to three main groups of bacteria: the mycobacteria and their relatives; the cyanobacteria; and selected Gram-negative proteobacteria together with the chlamydias (Tables 1,2). As can be seen from the alignment (Fig. 1), the conserved motifs identified in eukaryotic FHA domains [3] (GR at the end of strand three, SXXH just before strand five and NG just before strand seven) are common in the bacterial domains. Given the conservation of the pT-contact residues (equivalent to Arg70, Ser85 and Asn107 in Rad53 FHA1) [3] and what is known about eukaryotic FHA domains, together with the limited experimental evidence on mycobacterial FHA binding specificities [6], it is reasonable to assume that most of these FHA domains bind pT-peptides, although interactions with other phospho-residues (e.g. phospho-aspartate, phospho-histidine, phospho-serine or phospho-tyrosine) or other functions cannot be ruled out for the most variant forms.

Table 1. FHA domains in Gram-positive bacteria

| Organism | Protein | GI | Length | FHA domains | Associated STPK/STPPs | Comments |
|---|---|---|---|---|---|---|
| Bacillus halodurans | BH1777 | 15614340 | 230 | 18–102 | BH2504? | Contains carboxy-terminal DNA-binding domain |
| Deinococcus radiodurans | DRA0333 | 15807993 | 314 | 167–311 | STPK: DRA0332 | Contains Ala/Pro-rich low-complexity sequence |
| Clostridium acetobutylicum | CAC0036 | 15893334 | 468 | 379–468 | STPP: CAC0035 | Contains TM domain; clusters with WXG100 and tetratricopeptide repeat genes |
| | CAC0039 | 15893337 | 1544 | 111–201 | STPP: CAC0035 | Contains TM domain and three SpoIIIE/FtsK domains; clusters with WXG100 and tetratricopeptide repeat genes |
| | CAC0406 | 15893697 | 516 | 420–516 | STPP: CAC0407, STPK: CAC0404 | Contains TM domain; clusters with WXG100 and tetratricopeptide repeat genes |
| | CAC0408 | 15893699 | 1524 | 82–190 | STPP: CAC0407, STPK: CAC0404 | Contains TM domain and three SpoIIIE/FtsK domains; clusters with WXG100 and tetratricopeptide repeat genes |
| | CAC0504 | 15893795 | 159 | 84–158 | | In rodA/pbp gene cluster; involved in regulating cell shape? |
| Mycobacterium tuberculosis | Rv0019c | 15607161 | 155 | 57–149 | STPK: pknA/pknB, STPP: ppp | In rodA/pbp gene cluster; involved in regulating cell shape? |
| | Rv0020c | 15607162 | 527 | 389–525 | STPK: pknA/pknB, STPP: ppp | In rodA/pbp gene cluster; involved in regulating cell shape? |
| | Rv1267c | 15608407 | 388 | 257–380 | STPK: pknH | EmbR; contains an amino-terminal BAD domain; regulates cell wall arabinosyltransferases; involved in ethambutol resistance; adjacent cluster of ABC transporter genes |
| | Rv1747 | 15608885 | 865 | 177–299 | STPK: pknE/pknF | Contains ABC transporter domain |
| | Rv1827 | 15608964 | 162 | 35–156 | ? | GarA; implicated in regulation of glycogen storage |
| | Rv3360 | 15610496 | 122 | 4–92 | ? | No functional assignment |
| Streptomyces coelicolor | SCH69.13 | 7480073 | 290 | 159–286 | STPK: SCH69.18, STPP: SCH69.15 | Orthologue of Rv0020c |
| | SCH69.14 | 7480074 | 172 | 46–168 | STPK: SCH69.18, STPP: SCH69.15 | Orthologue of Rv0019c |
| | SC1A8A.04c | 7649595 | 267 | 141–260 | ? | Orthologue of GarA |
| | SCE50.03 | 7546665 | 1345 | 94–201 | STPK: pkaA/pkaB | Contains proline-rich low-complexity sequence; contains FtsK/SpoIIIE domain; adjacent cluster of ABC transporter genes |
| | SCBAC1A6.03 | 13276806 | 183 | 102–179 | ? | Clusters with amylase genes; involved in glycogen metabolism? |
| | SC6D10.12 | 6855393 | 604 | 452–594 | Two STPKs | Contains low-complexity proline-rich sequence; in sugar transport system operon; regulates uptake of unidentified carbohydrate? |
| | SCI33.05c | 12718426 | 876 | 1–89, 201–307 | STPP | Contains two amino-terminal FHA domains flanking proline-rich low-complexity sequence, followed by an ABC-transporter domain and TM domains |

Abbreviations: ABC, ATP-binding cassette; FHA, forkhead-associated; STPK, Ser/Thr protein kinase; STPP, Ser/Thr protein phosphatase; TM, transmembrane.

0966-842X/02/$ – see front matter © 2002 Elsevier Science Ltd. All rights reserved. PII: S0966-842X(02)02476-9

FHA domains in the Bacillus/Clostridium group

Bacillus halodurans possesses one obvious FHA-domain-containing protein, BH1777, where the FHA domain sits at the amino terminus; the carboxyl terminus contains a DNA-binding domain of the sort (termed trans_reg_C in PFAM) usually found in association with two-component response regulators but occasionally found in other transcriptional regulators (e.g. HilA from Salmonella;


Table 2. FHA domains in Gram-negative bacteria and chlamydias

| Organism | Protein | GI | Length | FHA domains | Associated STPK/STPPs | Comments |
|---|---|---|---|---|---|---|
| Pseudomonas aeruginosa PAO1 | PA1665 | 11349270 | 397 | 25–107 | STPP: stp1, STPK: stk1 | In impA-N-like gene cluster |
| P. aeruginosa PAO1 | PA0081 | 11348758 | 497 | 23–121 | STPP: PA0075, STPK: ppkA | In impA-N-like gene cluster |
| Rhizobium leguminosarum bv. trifolii | ImpI | 16326459 | 399 | 15–113 | STPP: impM, STPK: impN | In impA-N-like gene cluster |
| Agrobacterium tumefaciens | AGR_L_1057 | 15890648 | 399 | 28–113 | STPP: AGR_L_1064, STPK: AGR_L_1065 | In impA-N-like gene cluster |
| Mesorhizobium loti | mlr2345 | 13472144 | 486 | 13–106 | STPP: mlr2361, STPK: mlr2363 | In impA-N-like gene cluster |
| Escherichia coli O157 | Ecs0229 | 15829483 | 616 | 18–120 | No STPK in genome | In impA-N-like gene cluster; in O-island |
| Vibrio cholerae | vca0112 | 15600883 | 495 | 8–108 | No STPK in genome | In impA-N-like gene cluster |
| Myxococcus xanthus | ‘orf1’ | 2736192 | 833 | 136–279, 271–385 | STPK: pkn3 | Contains a proline-rich amino-terminal domain followed by two FHA domains |
| | EspA | 5713126 | 388 | 32–148 | ? | Histidine kinase; a complex domain architecture; regulates the timing of sporulation |
| Chlamydia trachomatis | CT664 | 15605397 | 768 | 355–476 | STPK: CT673 | In type III secretion gene cluster, see below |
| Yersinia enterocolitica | YscD | 267565 | 829 | | ? | Amino-terminal putative FHA domain also evident in some other SctD proteins, e.g. PscD from Pseudomonas aeruginosa, HrpQ from Erwinia amylovora (Fig. 1) |

Abbreviations: STPK, Ser/Thr protein kinase; STPP, Ser/Thr protein phosphatase.

http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?PF00486). The FHA domain is presumably involved in transcriptional regulation, yet scrutiny of the surrounding genes gives few clues as to its target genes.

Four of the five FHA-domain-containing proteins from Clostridium acetobutylicum are encoded in two similar gene clusters that also encode WXG100 proteins (Table 1). WXG100 proteins are short (~100 residues), carry a central conserved WXG motif and, in the archetypal ESAT-6 protein from Mycobacterium tuberculosis at least, are secreted without a signal peptide, probably relying on SpoIIIE/FtsK-domain-bearing membrane-associated ATPases for their export [18]. One of the C. acetobutylicum FHA/WXG100 clusters also encodes an STPK. It is likely that the FHA domains mediate signalling interactions between proteins encoded in these clusters. Additional interactions among these proteins are probably also mediated by tetratricopeptide repeats encoded in some of the clustering genes (tetratricopeptide repeats represent a protein–protein interaction module found in multiple copies in many functionally different proteins and are thought to facilitate specific interactions with partner proteins [19]).

Nölling et al. have speculated that proteins encoded by these gene clusters are involved in Ser/Thr-phosphorylation-regulated cell division in C. acetobutylicum. An alternative or additional hypothesis, based on the homologies to the mycobacterial ESAT-6 gene cluster and the presence of SpoIIIE/FtsK domains [18], is that they encode secretion systems involved in the export of some as-yet-unknown macromolecular substrate. The presence of FHA domains in the C. acetobutylicum WXG100 clusters but not in the mycobacterial ESAT-6 gene clusters suggests clear differences in the regulation of the function of the proteins encoded in these clusters. Interestingly, an additional VGE-PSI-BLAST search with the FHA domain from C. acetobutylicum CAC0039 retrieves with significance (E = 0.001) an amino-terminal region from the Bacillus halodurans WXG100-linked, membrane-associated FtsK/SpoIIIE-like ATPase BH0975, hinting that this bacterium might also regulate a WXG100 secretion system by Ser/Thr phosphorylation (note that some proteins associated with macromolecular transport in eukaryotes, the kinesins, also possess FHA domains [3]).

The remaining FHA-domain-containing protein from C. acetobutylicum, CAC0504, is encoded by a gene just upstream of genes implicated by homology in determining cell shape (encoding homologues of the shape-determining RodA and Pbp2 from Escherichia coli [20]).

FHA domains in the mycobacteria and their relatives

Four of the six FHA-domain-containing proteins in Mycobacterium tuberculosis are encoded within gene clusters containing STPK- or STPP-encoding genes (Table 1). Rv0019c and Rv0020c are encoded within the pknA/pknB/ppp gene cluster [8]. Curiously, as with CAC0504 from C. acetobutylicum, there is clustering of these mycobacterial FHA-encoding genes with those involved in determining cell shape, suggesting a role for Ser/Thr phosphorylation in this process in both organisms; indeed, surprisingly, PknA can influence cell shape even when expressed in E. coli [21]. Obvious potential binding partners for Rv0019c and Rv0020c include the two kinases (both known to engage in autophosphorylation [21,22]) and their other, as-yet-unidentified substrates.

Rv1267c, or EmbR, is encoded by a gene adjacent to pknH and sits near a cluster of ATP-binding cassette (ABC) transporter genes. Rv1267c contains an amino-terminal bacterial transcriptional-activator domain (BAD; PFAM entry PF03704) and a carboxy-terminal FHA domain and is thought to be a transcriptional regulator of the embA and embB genes, which encode cell wall arabinosyltransferases involved in ethambutol resistance [23]. This FHA domain probably mediates protein–phosphoprotein interactions that regulate arabinan synthesis. One possibility is that PknH phosphorylates EmbR, which is then activated by dimerization through an FHA-mediated interaction. Interestingly, a mutation associated with ethambutol resistance affects a residue at the carboxyl terminus of the EmbR FHA domain [24], suggesting that alteration in FHA-mediated interactions can result in an ethambutol-resistance phenotype.

Rv1747 is encoded by a gene in the pknE/pknF cluster and contains an ABC transporter domain, suggesting that Rv1747 binds the target(s) of PknE and/or PknF (which has been shown to phosphorylate myelin basic protein on Ser and Thr residues [25]) and regulates transport of an unknown substrate. Rv1827, or GarA (glycogen-accumulation regulator A), consists almost entirely of the FHA domain and has been implicated in the regulation of glycogen storage in Mycobacterium smegmatis. The phosphorylated proteins to which it binds remain unknown, although peptide-binding studies make it clear that it does indeed bind pT phosphopeptides [6,26]. There are no clues as to the role of the remaining mycobacterial FHA-domain-containing protein (Rv3360).

Of the seven FHA-domain-containing proteins found in Streptomyces coelicolor, three were orthologues of M. tuberculosis proteins and four were specific to S. coelicolor (Table 1).
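The counts quoted in this section can be cross-checked against Table 1 with a few lines of code. The snippet below is only an illustrative sketch: the record layout and the helper name `tally` are hypothetical, and the rows were transcribed by hand from Table 1 rather than drawn from any database. A protein is treated as "linked" when Table 1 names an associated STPK or STPP and as unlinked when the entry is "?".

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    organism: str
    protein: str
    linked: bool                 # Table 1 names an associated STPK/STPP
    orthologue: Optional[str]    # M. tuberculosis orthologue noted in Table 1


# Hand-transcribed subset of Table 1 (hypothetical layout, for illustration only).
TABLE1_ROWS = [
    Row("M. tuberculosis", "Rv0019c", True, None),
    Row("M. tuberculosis", "Rv0020c", True, None),
    Row("M. tuberculosis", "Rv1267c", True, None),
    Row("M. tuberculosis", "Rv1747", True, None),
    Row("M. tuberculosis", "Rv1827", False, None),
    Row("M. tuberculosis", "Rv3360", False, None),
    Row("S. coelicolor", "SCH69.13", True, "Rv0020c"),
    Row("S. coelicolor", "SCH69.14", True, "Rv0019c"),
    Row("S. coelicolor", "SC1A8A.04c", False, "Rv1827"),
    Row("S. coelicolor", "SCE50.03", True, None),
    Row("S. coelicolor", "SCBAC1A6.03", False, None),
    Row("S. coelicolor", "SC6D10.12", True, None),
    Row("S. coelicolor", "SCI33.05c", True, None),
]


def tally(rows, organism):
    """Return (total proteins, proteins linked to an STPK/STPP, proteins with an orthologue)."""
    selected = [r for r in rows if r.organism == organism]
    linked = sum(1 for r in selected if r.linked)
    with_orthologue = sum(1 for r in selected if r.orthologue is not None)
    return len(selected), linked, with_orthologue


if __name__ == "__main__":
    for organism in ("M. tuberculosis", "S. coelicolor"):
        print(organism, tally(TABLE1_ROWS, organism))
```

For M. tuberculosis the tally gives six proteins, four of them linked to a kinase or phosphatase gene; for S. coelicolor it gives seven proteins, three with a mycobacterial orthologue, leaving four specific to that organism.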
SCE50.03 is encoded by a gene adjacent to the genes encoding the previously characterized PkaA/PkaB kinases [27]. Both SCE50.03 and PkaB contain large regions of low-complexity proline-rich sequence. In addition, SCE50.03 contains a carboxy-terminal FtsK/SpoIIIE domain, suggesting that it is involved in the movement of DNA and/or proteins. As PkaA and PkaB can both autophosphorylate [27], the FHA domain in SCE50.03 might bind to one or other or both; alternatively, it might recruit other phosphorylated proteins into a multi-protein complex, perhaps involving the ABC transporter system encoded by an adjacent but divergent gene cluster (SCE99.01c-03c).

SCBAC1A6.03 is taken up entirely by an FHA domain and this gene clusters with those encoding amylases, suggesting that it might play a similar role to the mycobacterial GarA in regulating glycogen metabolism. The SC6D10.12 gene sits at the end of an operon including two STPKs and components of a sugar transport system. As with SCE50.03, most of the SC6D10.12 sequence is taken up with low-complexity proline-rich sequence, with the FHA domain at the carboxyl terminus. This FHA domain is probably involved in regulating the uptake of an unidentified carbohydrate.

SCI33.05c contains two amino-terminal FHA domains flanking low-complexity proline-rich sequence, followed by a central ABC transporter domain and several transmembrane helices at the carboxyl terminus. This suggests a role in regulating the export of an unidentified substrate. The gene for SCI33.05c is located near an STPP gene, with both gene products presumably affecting the same phosphoprotein target.

FHA domains in the proteobacteria and chlamydias

Several pathogenic or symbiotic proteobacteria possess clusters of homologous genes similar to the impA–N cluster from Rhizobium leguminosarum bv. trifolii (GenBank entry AF361470) and centred on a gene encoding an FHA domain (Fig. 2; Table 2). Some of these clusters also contain an STPP/STPK gene pair (experimentally characterized for the two clusters in Pseudomonas aeruginosa [28–30]). However, in other cases (e.g. E. coli O157 and Vibrio cholerae), STPP/STPK genes are absent – not just from the cluster but from the entire genome – and gene clusters that contain homologues of some of the impA–N genes exist in some organisms that lack both the FHA-domain-encoding gene and the STPP/STPK gene pair (e.g. in several Yersinia pestis clusters and the centisome-7 Salmonella pathogenicity island [31]; M. Pallen et al., unpublished). The little that is known of the function of these impA–N-like gene clusters suggests a role in bacterial–host interactions; in several cases, genes from the clusters have been shown to be expressed in vivo, the PpkA kinase from P. aeruginosa is required for virulence in an animal model and the imp gene cluster from R. leguminosarum bv. trifolii is involved in temperature-dependent protein secretion and has been linked to nodulation [30,32,33]. The clusters also contain homologues of the gene icmF, which is required for macrophage killing by Legionella pneumophila [34,35]. It is reasonable to assume that the ImpI-like FHA-domain-containing proteins bind to the targets of the associated STPKs in P. aeruginosa, R. leguminosarum bv. trifolii, Agrobacterium tumefaciens and Mesorhizobium loti (including the autophosphorylating STPKs themselves [28,29]) and regulate some aspect of the bacteria–host interaction. However, the function of FHA domains in E. coli O157 and V. cholerae remains a mystery, with the absence of STPKs in these species. 
One explanation might be that they are non-functional remnants of a previous Ser/Thr protein-phosphorylation system lost from a gene cluster; alternatively, they might bind proteins phosphorylated on other residues. A single FHA-domain-containing protein (CT664 from Chlamydia trachomatis and its orthologues) is found in each of the three chlamydial species for which

560

Opinion

Rad53/FHA1 Rad53/FHA2 HrpQ/Panag HrpQ/Erwam SCH69.13 Rv3360 CAC0504 CAC0408 EmbR SCI33.05c/FHA2 Rv1747 BH1777 all4083 all0156 all3776 all1175 SCI33.05c/FHA1 SCBAC1A6.03 alr1728 Rv1827 SC1A8A.04c alr4954 Rv0020c alr3269 EspA/myxxa alr0548 Rv0019c SCH69.14 SCE50.03 SC6D10.12 cyaD/Nostoc DRA0333 ImpI/Rhile AGR_L_1057p Pa1665 mlr2345 VCA0112 Pa0081 all1730 CAC0406 CAC0036 alr1603 CT664 alr4579 CAC0039 all4084 orf/myxxa/FHA2 yscD_yeren orf/myxxa/FHA1

TRENDS in Microbiology Vol.10 No.12 December 2002

66 601 23 23 216 26 87 115 308 218 230 31 48 3 42 40 25 110 35 77 190 27 455 90 62 26 83 99 130 523 24 231 27 26 28 26 34 28 435 440 392 205 406 64 130 31 315 24 198

3 4 5 6 ---> --> -----> ----> WTFGRNP---ACDYHLGN---ISRLSNKHFQILLG---------EDGNLLLND------ISTNG FFIGRSE---DCNCKIED----NRLSRVHCFIFKKRHAVGKSMYESPAQGLDD----IWYCHTG WWIGAAE---DADLALFD----PGIKDRHCQVIKT----------PQSWMVKA--------LEG WWIGAAQ---DADLALFD----PGIKDRHCRLSKT----------DLGWEVTA--------LEG LVMGRST---EADVRIDD----PGVSRRHCEIRTG----------TPSTIQ-D-----LGSTNG VVVGSDL---RADMRVAH----PLIARAHLLLRFD----------RGNWIAID-----NDSQSG ITIGRKD---DNSIMLNE----GYVSGHHARVYLR----------NNQYILED-----LNSTNG FTIGRGK---FNDIVFDD----IKVSEKHAEIFED----------NGKYVLVD-----LNSTNK TRIGRLH---DNDIVLDS----ANVSRHHAVIVDT---------GTN-YVIND-----LRSSNG MRIGRAL---ENDLVVSD----LQVSRNHAEFHST---------PDGRMEIRD-----LGSHNG VRIGRAN---DNDIVIPE----VLASRHHATLVPT---------PGG-TEIRD-----NRSING GRLGKSW---KPDIAFDN----VFISRKHALLYVE----------EGQVFVKD-----LDSKHG AELDGKR---VSRMLLNS----DQVSRYHALIVWE----------NNQLVVID-----QDSVNG SENNGQR---VSRITIED----DLIADYHALIDWQ----------NQDLIIID-----QNTDNG YSIGRDK---ESNIRLVS----QFVSRRHATLVRLP------KNNSYYYRIVDGDGKGRASANG YSIGRHT---SNAIVLHS----RSVSRQHAILLRVTLP----ETDQCSFRIIDGNFKGQGSTNG YALGRDP---QGELVFDD----ARVSWRHATIS----------FNGRGWVVED-----HGSTNG LRIGRDP---ASGLRLS----HETVSRVHAELSRQGG----------MWVLRD-----LGSTNG VVIGRDP---SCQVVLDAMM-YRMVSRRHAVVRPVASS----VDSKFSWVLCD-----LNSANG TSAGRHP---DSDIFLDD----VTVSRRHAEFRLE----------NNEFNVVD-----VGSLNG TTAGRHP---QSDIFLDD----VTVSRRHVEFRRSP---------DGSFTVAD-----VGSLNG CIMGRSP---EANIQLPDDAEHKTISRYHCLLDIM----------PPNIRIRD-----FGSKNG NIIGRGQ---DAQFRLPD----TGVSRRHLEIRW----------DGQVALLAD-----LNSTNG WTIGRDR---HNGICTYD----KLLSRHHAAIKYV---------ENQGFLLID-----FQSTNG HIIGRGS---DVTVRIDD----HGVSRKHARV--V-------RAGDGACHVTD-----LDSTNG IRIGRAA---DNHVILSD----NLVSRHHLEIRQV-------SSGGGGSWQVV-----SKGTNG VLIGRAD---DSTLVLTD----DYASTRHARLS----------MRGSEWYVED-----LGSTNG ITLGRAH---DSTIVLDD----DYASSRHARIYP---------DQNGQWIVED-----LGSTNG IRLGRSA---DADVALDD----PDVSRMHCAVTV---------GPDARVSVAD-----LGSTNG HSTGDTP---DIDLAVPPE--DPGVSHQHAVLVQQP---------DGSWAVVD-----QNSTNG 
FTIGRLP---ECNLYLP----FAGVSRKHAQLVKKA---------DGKWIIED-----LGSKNG FDASSGPV--DIDLSSLPG--AEHISRHHAELYRE----------GSQWFVRD-----LGSTNG RTLGRAP---DCDWRLPED--RRSVSKLHCIIERDRE----------GFLLRD------QSANG RAIGRSR---DCDWQIDDN--ERRVSKLHCTLSRDGE----------GFIILD------QSANG GLIGRGG---ECDWAIPDR--KRHLSKQHARVSYRNG----------AFYLTD------TSSNG LVIGRSA---DAGWQIDDP--DMFVSRAHCKIRGDRD----------GYFVTD------TSSSG GVIGSSP---NAQWRLVDA--QGSVKPMHCEVMMVDG----------AYCLKD------SCGST LTIGRGP---DNDWVLPDP--ERLVSSRHCTILNRD----------GVYYLTD------TSTNG TRIGRTK---DNDIVIP----ELSVSKRHAEILCRNNFT---GNQARTYYLQD------FSTYG FKIGRLTG--SVDYVSDN----RAIGKMHAEIRKI----------NSEYYLMD-----LDSKNG FKIGRISG—-QADYISDN----KAVGKLHAEIRKQ----------NEKYYLID-----LTSRNG VHIGKPNDRIPPDVDVSGFANSEIVSRVHADIRLE----------GDAHYIED-----VGSSNG YIVGSDPQ--VADIVLSD----MSISRQHAKIIIG---------NDNSVLIED-----LGSKNG YVLGRSSK--SSDIVIRN----PVVSQIHLSLSRDSS------QRTPVFIIKD-----ENSTNG LSIGRDE---DNNISIRD----DLIDRKHCEIKCD---------SNNKFYVTD-----LKSKYG VRIGRDP--LRCDIVLTN----PTVSGLHVEIFFHS--------QQQNFYIRN-----LRSQNP LTIGLAH----CDLSFPGD---EGLAGRHCELSPT----------PTGALLRD-----LSGGLG CVFGSDP--LQSDIVLSD----SEIAPVHLVLMVDEEGIR--LTDSAEPLLQEG----LPVPLG CVVGRQR----GAILFADD---AFVSPLHATFLVK----------DGALYVRD-----ESSASG TRENDS in Microbiology

there are complete genome sequences. Two FHA domains flank a proline-rich sequence towards the amino terminus of the protein and are followed by a transmembrane domain and a presumably extra-cytoplasmic carboxy-terminal domain. The gene for this protein lies within a type III secretion system cluster that also encodes an STPK (CT673 in C. trachomatis). This suggests a role for CT664 in mediating phosphorylation-dependent protein–protein interactions in the chlamydial type III secretion system. As VGE-PSI-BLAST searches with several bacterial FHA domains revealed significant hits to the amino-terminal regions of some HrpQ proteins (of unknown function, but associated with type III http://tim.trends.com

secretion systems in phytopathogens), a new search was initiated with the amino-terminal 120 residues of HrpQ from Erwinia amylovora. Within six iterations this revealed significant homology between this region, other FHA domains and the amino-terminal domains from several other members of the SctD family of type III secretion system proteins (a group of membrane-associated proteins of unknown function, but required for type III secretion, which includes the archetypal YscD from Yersinia [36,37]). In an attempt to confirm this unexpected finding, VGE-PSI-BLAST and SMART searches were carried out on the amino-terminal domain (1–120 residues) of YscD from Y. enterocolitica. Under default conditions the

Opinion

TRENDS in Microbiology Vol.10 No.12 December 2002

Fig. 1. Multiple alignment of conserved regions from bacterial forkhead-associated (FHA) domains. Sequences obtained from the PSI-BLAST search were retrieved from the NCBI NR database. Sequences were aligned using ClustalW/Jalview. Alignments were then shaded using the BOXSHADE server (red, >50% sequences with identical residue; blue, >50% sequences with similar residue). Shown above the alignment are arrows corresponding to β-sheets three to six in Rad53 FHA1. The numbers after each sequence name indicate residues at which the aligned segment begins. To save space, sequences from Synechocystis were omitted. Sequence designations and NCBI GI numbers as follows: RAD53/FHA1 and RAD53/FHA2: FHA domains from RAD53, GI 134835; HrpQ/Panag, HrpQ from Pantoea agglomerans, GI 14588838; HrpQ/Erwam, HrpQ from Erwinia amylovora, GI 1181168; SCH69.13 from S. coelicolor, GI 7480073; Rv3360 from M. tuberculosis, GI 15610496, CAC0504 from C. acetobutylicum, GI 15893795; CAC0408 from C. acetobutylicum, GI 15893699, EmbR from M. tuberculosis, GI 15608407, SCI33.05c from S. coelicolor, GI 12718426; Rv1747 from M. tuberculosis, GI 15608885, BH1777 from B. halodurans, GI 15614340; all4083 from Nostoc sp. PCC 7120, GI 17231575; all0156 from Nostoc sp. PCC 7120, GI 17227652; all3776 from Nostoc sp. PCC 7120, GI 17231268; all1175 from Nostoc sp. PCC 7120, GI 17228670; SCBAC1A6.03 from S. coelicolor, GI 13276806; alr1728 from Nostoc sp. PCC 7120, GI 17130818; RV1827 from M. tuberculosis, GI 15608964; SC1A8A.04c from S. coelicolor, GI 7649595; alr4954 from Nostoc sp. PCC 7120, GI 17232446; Rv0020c from M. tuberculosis, GI 15607162; alr3269 from Nostoc sp. PCC 7120, GI 17230761; EspA/myxxa, Espa from M. xanthus, GI 5713126; alr0548 from Nostoc sp. PCC 7120, GI 17228044; Rv 0019c from M. tuberculosis, GI 15607161; SCH69.14 from S. coelicolor, GI 7480074; SCE50.03 from S. coelicolor, GI 7546665; SC6D10.12 from S. coelicolor, GI 6855393; cyaD/Nostoc from Nostoc sp. 
PCC 7120, GI 17228238; DRA0333 from D. radiodurans, GI 15807993; ImpI/Rhile, ImpI from Rhizobium leguminosarum bv. Trifolii, GI 16326459; AGR_L_1057p from Agrobacterium tumefaciens, GI 15890648; PA1665 from P. aeruginosa, GI 11349270; mlr2345 from M. loti, GI 13472144; VCA0112 from V. cholerae, GI 15600883; PA0081 from P.aeruginosa, GI 11348758, all1730 from Nostoc sp. PCC 7120, GI 17229222; CAC0406 from C. acetobutylicum, GI 15893697; CAC0036 from C. acetobutylicum, GI 15893334; alr1603 from Nostoc sp. PCC 7120, GI 17229095; CT664 from C. trachomatis, GI 15605397; alr4579 from Nostoc sp. PCC 7120, GI 17232071; CAC00039 from C. acetobutylicum GI 15893337; all4084 from Nostoc sp. PCC 7120, GI 17231576; orf/myxxa from M. xanthus, GI 2736192; yscD_yeren, YscD from Y. enterocolitica, GI 267565.

Rhizobium leguminosarum bv. trifolii impA

impI

impM impN

Vibrio cholerae vca0107

vca0112

vca0116

Agrobacterium tumefaciens str. C58 (Cereon) AGR_L_1042

Mesorhizobium loti mlr2336

AGR_L_ 1057

AGR_L_ AGR_L_ 1064 1065

mlr2345

mlr mlr 2361 2363

Pseudomonas aeruginosa (strain PAO1) PA1657

PA1665

stp1 stk1

Pseudomonas aeruginosa (strain PAO1) PA0089

PA0081

FHA-domain gene IcmF-like gene

ppkA

Ser/Thr protein phosphatase

Escherichia coli O157 RIMD 0509952 Ser/Thr protein kinase ECs0229

TRENDS in Microbiology

Fig. 2. Gene clusters from the proteobacteria centred on a forkhead-associated (FHA) domain gene. Arrowed boxes show the orientation and size of genes in clusters. The left-most and right-most genes are identified by gene names that will allow the entire clusters to be identified in GenBank. Homology searches with members of the Rhizobium leguminosarum bv. Trifolii impA–N gene cluster were used to identify homologs in each of the other clusters and assign colors or shades to homologous genes. impI-like genes encoding FHA-domain proteins are shown in red; icmF-like genes in pink; and genes encoding Ser/Thr protein phosphatases and kinases in dark green and light green, respectively. For reasons of space, genes absent from the imp cluster but sometimes associated with similar clusters (e.g. clpB-like genes) or similar clusters without a impI-like gene (e.g. the sci gene cluster in the centisome 7 Salmonella pathogenicity island, GenBank accession number AJ320483 [31]) are not shown.

http://tim.trends.com

561

PSI-BLAST search terminated after a single iteration. However, if a more generous than usual threshold in the expect value needed for inclusion in a subsequent iteration was adopted (0.05 versus the usual 0.002), then the search quickly linked the amino terminus of YscD with the other FHA domains. A search of the SMART database with YscD also reported a hit to the FHA domain but with an unimpressive e value (8.76). Given the link in Chlamydia between type III secretion and an FHA-domain-containing protein, it seems plausible, but not conclusive, that the SctD proteins all contain a variant FHA domain. Whether this domain mediates phosphorylation-dependent protein–protein interactions and/or signalling remains a tantalizing hypothesis, particularly as possession of an STPK would appear to be a common, but not universal, feature of bacteria engaged in type III secretion (see SMART entries STYKc and S_TKc). Myxococcus xanthus is a δ-proteobacterium with a complex developmental life cycle and is known to possess an abundance of STPKs [9]. Our search revealed only two proteins containing FHA domains in this organism, however, this number is likely to increase dramatically once the genome sequence is complete. One of these proteins contains a prolinerich amino-terminal domain followed by two FHA domains and is encoded by a gene adjacent to the STPK gene pkn3. It is therefore likely to interact with the as-yet-unknown targets of Pkn3. The other myxococcal FHA-domain-containing protein is EspA, a histidine kinase with a complex domain architecture that regulates the timing of sporulation [38]. The presence of an FHA domain in a histidine kinase suggests that there is cross-talk between His-Asp and Thr phosphorylation systems and perhaps hints that some FHA domains might recognize the His or Asp residues phosphorylated by two-component sensor kinases. FHA domains in the cyanobacteria

Nostoc sp. PCC 7120 and Synechocystis each contain >12 proteins with FHA domains, many with genes that cluster with those for STPK/STPPs (M. Pallen et al., unpublished; see SMART). From their co-occurrence in proteins with other signalling domains, for example, a PAS domain, a GAF domain, and a DUF1 and a DUF2 domain in the protein all1175 (Entrez UID 17228670), it is likely that most cyanobacterial FHA domains are components of signalling pathways. In some cases, for example the protein alr0548 (Entrez UID 17228044), the FHA domain is actually embedded in an STPK. Intriguingly, the gene for the Nostoc FHA-domain-containing protein all0156 (Entrez UID 17227652) sits in a cluster with genes encoding PbpA- and RodA-like proteins; alr4579 (Entrez UID 17232071) is a penicillin-binding protein with an FHA domain, suggesting that cyanobacteria, like mycobacteria and clostridia, probably regulate cell shape via an FHA-mediated phosphoprotein–protein interaction.


Discussion

This analysis provides persuasive prima facie evidence that FHA domains and, by implication, protein phosphorylation and de-phosphorylation play previously unsuspected roles in a wide range of processes in bacteria, including regulation of cell shape, protein secretion, sporulation, pathogenic and symbiotic host–bacterium interactions, carbohydrate storage and transport, signal transduction and even antibiotic resistance. Studies targeting FHA-mediated protein–protein interactions are likely to shed light on all these phenotypes. The challenge now is to do the experiments: (1) to determine the binding partner(s) and specificities of each FHA domain (do they all bind phospho-threonine peptides or do some bind phospho-aspartate, phospho-histidine, phospho-serine or phospho-tyrosine peptides?), using, for example, two-hybrid or peptide-scanning screens, pull-downs, site-directed or truncation mutagenesis and affinity blotting [39]; (2) to determine how each FHA-mediated interaction fits within its physiological context (e.g. why the recurrent links to FtsK/SpoIIIE domains and proline-rich domains?) by, for example, studying the phenotypes of over-expression or knock-out mutants, including proteomics and microarray studies; and (3) to uncover the evolutionary processes involved in recruitment or loss of FHA domains and phosphoprotein signalling in various bacterial processes by more detailed molecular phylogenetic studies and by looking for evidence of horizontal gene transfer through, for example, GC bias (why the link to cell shape in three distinct groups, and why the differences in the complement of FHA domains between close relatives such as M. tuberculosis and S. coelicolor?).

These challenges will multiply as novel FHA domains are found encoded in future completed genome sequences, as will the opportunities to make comparisons between key FHA-mediated interactions in bacterial and eukaryotic cells. Furthermore, given recent progress in developing therapeutic inhibitors of other phosphopeptide-binding domains such as SH2 domains [40] and hints that STPK inhibitors can inhibit bacterial growth and development [41,42], blocking bacterial FHA–phosphopeptide interactions might prove a fruitful source of novel antibacterial agents.

Supplementary information online

The results of the PSI-BLAST searches can be found online at http://archive.bmn.com/supp/tim/FHA/FHAsupplement.html

Acknowledgements

We thank the BBSRC for funding ViruloGenome and Alex Lam and Nick Loman for help in establishing and maintaining it. M. Pallen gratefully acknowledges staff at the Wellcome Trust Sanger Institute, the John Innes Centre, The Institute for Genomic Research, the University of Oklahoma and Genome Therapeutics for making incomplete genome sequence data publicly available.

References

1 Hofmann, K. and Bucher, P. (1995) The FHA domain: a putative nuclear signalling domain found in protein kinases and transcription factors. Trends Biochem. Sci. 20, 347–349
2 Durocher, D. et al. (1999) The FHA domain is a modular phosphopeptide recognition motif. Mol. Cell 4, 387–394
3 Durocher, D. and Jackson, S.P. (2002) The FHA domain. FEBS Lett. 513, 58–66
4 Yaffe, M.B. and Elia, A.E. (2001) Phosphoserine/threonine-binding domains. Curr. Opin. Cell Biol. 13, 131–138
5 Yaffe, M.B. and Smerdon, S.J. (2001) PhosphoSerine/threonine binding domains: you can’t pSERious? Structure 9, R33–38
6 Durocher, D. et al. (2000) The molecular basis of FHA domain:phosphopeptide binding specificity and implications for phospho-dependent signaling mechanisms. Mol. Cell 6, 1169–1182
7 Liao, H. et al. (1999) Structure and function of a new phosphopeptide-binding domain containing the FHA2 of Rad53. J. Mol. Biol. 294, 1041–1049
8 Av-Gay, Y. and Everett, M. (2000) The eukaryotic-like Ser/Thr protein kinases of Mycobacterium tuberculosis. Trends Microbiol. 8, 238–244
9 Inouye, S. et al. (2000) A large family of eukaryotic-like protein Ser/Thr kinases of Myxococcus xanthus, a developmental bacterium. Microb. Comp. Genomics 5, 103–120
10 Bakal, C.J. and Davies, J.E. (2000) No longer an exclusive club: eukaryotic signalling domains in bacteria. Trends Cell Biol. 10, 32–38
11 Ponting, C.P. et al. (1999) Eukaryotic signalling domain homologues in archaea and bacteria. Ancient ancestry and horizontal gene transfer. J. Mol. Biol. 289, 729–745
12 Takami, H. et al. (2000) Complete genome sequence of the alkaliphilic bacterium Bacillus halodurans and genomic sequence comparison with Bacillus subtilis. Nucleic Acids Res. 28, 4317–4331
13 Parkhill, J. et al. (2001) Complete genome sequence of a multiple drug resistant Salmonella enterica serovar Typhi CT18. Nature 413, 848–852
14 Kaneko, T. et al. (2001) Complete genomic sequence of the filamentous nitrogen-fixing cyanobacterium Anabaena sp. strain PCC 7120. DNA Res. 8, 205–213
15 Leonard, C.J. et al. (1998) Novel families of putative protein kinases in bacteria and archaea: evolution of the ‘eukaryotic’ protein kinase superfamily. Genome Res. 8, 1038–1047
16 Park, J. et al. (1997) Intermediate sequences increase the detection of homology between sequences. J. Mol. Biol. 273, 349–354
17 Schultz, J. et al. (2000) SMART: a web-based tool for the study of genetically mobile domains. Nucleic Acids Res. 28, 231–234
18 Pallen, M.J. (2002) The ESAT-6/WXG100 superfamily – and a new Gram-positive secretion system? Trends Microbiol. 10, 209–212
19 Blatch, G.L. and Lassle, M. (1999) The tetratricopeptide repeat: a structural motif mediating protein–protein interactions. Bioessays 21, 932–939
20 de Pedro, M.A. et al. (2001) Constitutive septal murein synthesis in Escherichia coli with impaired activity of the morphogenetic proteins RodA and penicillin-binding protein 2. J. Bacteriol. 183, 4115–4126
21 Chaba, R. et al. (2002) Evidence that a eukaryotic-type serine/threonine protein kinase from Mycobacterium tuberculosis regulates morphological changes associated with cell division. Eur. J. Biochem. 269, 1078–1085
22 Av-Gay, Y. et al. (1999) Expression and characterization of the Mycobacterium tuberculosis serine/threonine protein kinase PknB. Infect. Immun. 67, 5676–5682
23 Belanger, A.E. et al. (1996) The embAB genes of Mycobacterium avium encode an arabinosyl transferase involved in cell wall arabinan biosynthesis that is the target for the antimycobacterial drug ethambutol. Proc. Natl. Acad. Sci. U. S. A. 93, 11919–11924
24 Ramaswamy, S.V. et al. (2000) Molecular genetic analysis of nucleotide polymorphisms associated with ethambutol resistance in human isolates of Mycobacterium tuberculosis. Antimicrob. Agents Chemother. 44, 326–336
25 Koul, A. et al. (2001) Serine/threonine protein kinases PknF and PknG of Mycobacterium tuberculosis: characterization and localization. Microbiology 147, 2307–2314
26 Belanger, A.E. and Hatfull, G.F. (1999) Exponential-phase glycogen recycling is essential for growth of Mycobacterium smegmatis. J. Bacteriol. 181, 6670–6678


27 Urabe, H. and Ogawara, H. (1995) Cloning, sequencing and expression of serine/threonine kinase-encoding genes from Streptomyces coelicolor A3(2). Gene 153, 99–104
28 Mukhopadhyay, S. et al. (1999) Characterization of a Hank’s type serine/threonine kinase and serine/threonine phosphoprotein phosphatase in Pseudomonas aeruginosa. J. Bacteriol. 181, 6615–6622
29 Motley, S.T. and Lory, S. (1999) Functional characterization of a serine/threonine protein kinase of Pseudomonas aeruginosa. Infect. Immun. 67, 5386–5394
30 Wang, J. et al. (1998) A novel serine/threonine protein kinase homologue of Pseudomonas aeruginosa is specifically inducible within the host infection site and is required for full virulence in neutropenic mice. J. Bacteriol. 180, 6764–6768
31 Folkesson, A. et al. (1999) Multiple insertions of fimbrial operons correlate with the evolution of Salmonella serovars responsible for human disease. Mol. Microbiol. 33, 612–622
32 Roest, H.P. et al. (1997) A Rhizobium leguminosarum biovar trifolii locus not localized on the sym plasmid hinders effective nodulation on plants of the pea cross-inoculation group. Mol. Plant–Microbe Interact. 10, 938–941
33 Das, S. et al. (2000) Comparison of global transcription responses allows identification of Vibrio cholerae genes differentially expressed following infection. FEMS Microbiol. Lett. 190, 87–91
34 Purcell, M. and Shuman, H.A. (1998) The Legionella pneumophila icmGCDJBF genes are required for killing of human macrophages. Infect. Immun. 66, 2245–2255
35 Segal, G. et al. (1998) Host cell killing and bacterial conjugation require overlapping sets of genes within a 22-kb region of the Legionella pneumophila genome. Proc. Natl. Acad. Sci. U. S. A. 95, 1669–1674

BSE – a wolf in sheep’s clothing?

Matthew Baylis, Fiona Houston, Rowland R. Kao, Angela R. McLean, Nora Hunter and Mike B. Gravenor

The entire sheep flock in the UK has been threatened with slaughter if BSE is found in farmed sheep, largely on the grounds that an epidemic of BSE in sheep could be harder to contain than was the case for cattle, and that lamb could present a greater risk to consumers than beef. However, identifying BSE in a sheep is not straightforward, because of its similarities to the related disease, scrapie. Here, we review the likelihood that any UK sheep have BSE, how they might have got it, how a case could be identified and what the Government is doing in terms of surveillance and possible control methods.

Published online: 31 October 2002

In September 2001, the UK government published its contingency plan should naturally occurring BSE be found in sheep [1]. The contingency plan considers, as a worst-case scenario, slaughtering the entire national flock, with possible catastrophic impact on the livelihoods of tens of thousands of farmers and others involved directly or indirectly with the sheep meat industry. It would also change the face of the British landscape. By contrast, despite the fact that there were >1300 cases of BSE in UK cattle in 2001 and >100 people have died from variant Creutzfeldt–Jakob disease (vCJD), probably following the dietary consumption of the BSE agent [2,3], the culling policy for cattle has been selective, restricted to BSE-affected animals, their offspring and other cattle exposed to the same source of infection.


36 Plano, G.V. and Straley, S.C. (1995) Mutations in yscC, yscD, and yscG prevent high-level expression and secretion of V antigen and Yops in Yersinia pestis. J. Bacteriol. 177, 3843–3854
37 Hueck, C.J. (1998) Type III protein secretion systems in bacterial pathogens of animals and plants. Microbiol. Mol. Biol. Rev. 62, 379–433
38 Cho, K. and Zusman, D.R. (1999) Sporulation timing in Myxococcus xanthus is controlled by the espAB locus. Mol. Microbiol. 34, 714–725
39 Phizicky, E.M. and Fields, S. (1995) Protein–protein interactions: methods for detection and analysis. Microbiol. Rev. 59, 94–123
40 Shakespeare, W.C. (2001) SH2 domain inhibition: a problem solved? Curr. Opin. Chem. Biol. 5, 409–415
41 Jain, R. and Inouye, S. (1998) Inhibition of development of Myxococcus xanthus by eukaryotic protein kinase inhibitors. J. Bacteriol. 180, 6544–6550
42 Drews, S.J. et al. (2001) A protein kinase inhibitor as an antimycobacterial agent. FEMS Microbiol. Lett. 205, 369–374

The relative economics of cattle versus sheep have probably contributed to the difference in the policies for cattle and sheep BSE but the different nature of the diseases and the absence of precise data on sheep BSE have played an even greater part. Encouragingly, sheep themselves could have an alternative solution encoded in their genes, and the UK government is earnestly pursuing this possibility by instigating a policy of selective breeding for resistance.

Cause for concern

There have been no proven or even putative cases of BSE identified in farmed sheep. To date, all known cases of the disease have been induced under laboratory conditions by the experimental infection of sheep with tissue from infected animals (Fig. 1). Nevertheless, there are reasonable grounds for believing that up to several thousand farmed sheep could have been infected with BSE and some could even have developed the disease but were overlooked. The recycling of ruminant tissue into meat and bone meal (MBM) in the 1980s, which was commonly fed to cattle as an ingredient of feed concentrates, is clearly linked to the development of the epidemic of BSE in cattle (Box 1). Sheep, mostly adult pregnant ewes, are also fed concentrates, and these could also have contained MBM in the 1980s [4]. However, the amount of infectious material consumed by sheep would have been much less than that consumed by cattle. A sheep eats, on average, only 1–2% of the volume of feed concentrate eaten by a bovine [5] and the amount of MBM in sheep concentrate was at most equal to, and probably much less than, that present in

0966-842X/02/$ – see front matter © 2002 Elsevier Science Ltd. All rights reserved. PII: S0966-842X(02)02477-0