Adaptation of an enzyme to regulatory function: structure of Bacillus subtilis PyrR, a pyr RNA-binding attenuation protein and uracil phosphoribosyltransferase

Adaptation of an enzyme to regulatory function: structure of Bacillus subtilis PyrR, a pyr RNA-binding attenuation protein and uracil phosphoribosyltransferase

Research Article 337 Adaptation of an enzyme to regulatory function: structure of Bacillus subtilis PyrR, a pyr RNA-binding attenuation protein and ...

739KB Sizes 1 Downloads 10 Views

Research Article

337

Adaptation of an enzyme to regulatory function: structure of Bacillus subtilis PyrR, a pyr RNA-binding attenuation protein and uracil phosphoribosyltransferase Diana R Tomchick1, Robert J Turner2†, Robert L Switzer2 and Janet L Smith1* Background: The expression of pyrimidine nucleotide biosynthetic (pyr) genes in Bacillus subtilis is regulated by transcriptional attenuation. The PyrR attenuation protein binds to specific sites in pyr mRNA, allowing the formation of downstream terminator structures. UMP and 5-phosphoribosyl-1pyrophosphate (PRPP), a nucleotide metabolite, are co-regulators with PyrR. The smallest RNA shown to bind tightly to PyrR is a 28–30 nucleotide stemloop that contains a purine-rich bulge and a putative GNRA tetraloop. PyrR is also a uracil phosphoribosyltransferase (UPRTase), although the relationship between enzymatic activity and RNA recognition is unclear, and the UPRTase activity of PyrR is not physiologically significant in B. subtilis. Elucidating the role of PyrR structural motifs in UMP-dependent RNA binding is an important step towards understanding the mechanism of pyr transcriptional attenuation. Results: The 1.6 Å crystal structure of B. subtilis PyrR has been determined by multiwavelength anomalous diffraction, using a Sm co-crystal. As expected, the structure of PyrR is homologous to those proteins of the large type I PRTase structural family; it is most similar to hypoxanthine-guanine-xanthine PRTase (HGXPRTase). The PyrR dimer differs from other PRTase dimers, suggesting it may have evolved specifically for RNA binding. A large, basic, surface at the dimer interface is an obvious RNA-binding site and uracil specificity is probably provided by hydrogen bonds from mainchain and sidechain atoms in the hood subdomain. These models of RNA and UMP binding are consistent with biological data.

Addresses: 1Department of Biological Sciences, Purdue University, West Lafayette, Indiana 47907, USA and 2Department of Biochemistry, University of Illinois, Urbana, Illinois 61801, USA. †Present

address: The Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, California 92037, USA. *Corresponding author. E-mail: [email protected] Key words: attenuation, biosynthesis, phosphoribosyltransferase, pyrimidine, transcription Received: 26 November 1997 Revisions requested: 22 December 1997 Revisions received: 15 January 1998 Accepted: 15 January 1998 Structure 15 March 1998, 6:337–350 http://biomednet.com/elecref/0969212600600337 © Current Biology Ltd ISSN 0969-2126

Conclusions: The B. subtilis protein PyrR has adapted the substrate- and product-binding capacities of a PRTase, probably an HGXPRTase, producing a new regulatory function in which the substrate and product are co-regulators of transcription termination. The structure is consistent with the idea that PyrR regulatory function is independent of catalytic activity, which is likely to be extremely low under physiological conditions.

Introduction The Bacillus subtilis protein PyrR regulates the expression of the pyrimidine nucleotide (pyr) biosynthetic pathway by modulating the attenuation of transcription at three points in the pyr operon, in response to exogenous pyrimidines [1,2]. PyrR exerts its regulatory function by binding, in the presence of the co-regulator UMP, to the pyr mRNA transcripts of the attenuator regions of the operon and disrupting an antiterminator stem-loop secondary structure. This permits the downstream stem of the antiterminator to participate in formation of an alternative stem-loop structure, which is a factor-independent transcription terminator (Figure 1a). In this way, UMP-dependent binding of PyrR converts the transcription system from a readthrough mode, which permits expression of the downstream pyrimidine biosynthetic genes, to a termination mode, which reduces expression of these genes.

The model for transcriptional attenuation of the B. subtilis pyr operon, originally based on molecular genetic experiments [1], was confirmed by reconstruction of transcriptional attenuation in vitro [3,4]. Together UMP and PyrR strongly promoted transcriptional termination in a simple in vitro system consisting of purified B. subtilis RNA polymerase, pyr DNA templates, rNTP substrates and purified PyrR. Only UMP and UTP promoted termination, whereas 5-phosphoribosyl-1-pyrophosphate (PRPP), a central nucleotide metabolite, antagonized the action of UMP on termination. The segments of pyr mRNA to which PyrR binds were identified from conservation of their sequences in the three pyr attenuation regions [1]. Gel mobility shift analysis shows that purified PyrR binds with high affinity and specificity to the expected RNA segments [5]. PyrR affinity for RNA is increased in the presence of UMP. A minimal sequence of 28 nucleotides

338

Structure 1998, Vol 6 No 3

Figure 1 (a)

(b)

A G

A

1

2 3

pyr genes

4

U

mRNA [UMP] low [PRPP]

[UMP] high [PRPP]

G

A

C

G

C

G

A

U

A

U

C

G

G

C

G G A A

PRPP

PyrR 2 1

U

A

UMP

U

A

PyrR

U

A

U

G

3

3 4

pyr genes

1

2

4 UUUUU

Antitermination

Termination

Structure

within the RNA segment is required for PyrR binding. The target RNA folds into a stem-loop secondary structure called the anti-antiterminator or the binding loop ([4]; ER Bonner, J D’Elia and RLS, unpublished results) (Figure 1b). PyrR belongs to a growing class of proteins that regulate transcriptional attenuation by binding to mRNA, thereby altering the frequency of transcription termination [6–13]. Such proteins provide a very general mechanism for the regulation of gene expression, a mechanism that is in principle as general as the regulation of transcriptional initiation by the binding of repressors or activator proteins to operator sites in DNA. High-resolution structures of the trp RNA-binding attenuation protein (TRAP) from B. subtilis [14] and of the RNA-binding domain of B. subtilis SacY [12,13] have been described, and although the specific RNA sequences to which they bind are known, the details of protein–RNA recognition are not yet fully defined. Analysis of the deduced amino acid sequences of TRAP and SacY does not reveal any relationship of these proteins to each other, to PyrR or to the recognized sequence motifs found in other RNA-binding proteins [15,16]. PyrR has another remarkable property: it is an enzyme. PyrR is a uracil phosphoribosyltransferase (UPRTase), even though its primary sequence bears little resemblance to those of other bacterial UPRTases, except for a short sequence that is characteristic of nearly all PRPP-binding

U

A

C

G

C

G

A

U

C

G

Transcriptional attenuation of pyr gene expression in B. subtilis. (a) Schematic representation of the transcriptional attenuation mechanism of PyrR. Numbered segments correspond to base paired stems of the proposed RNA secondary structures shown in the lower half of the figure. In the absence of PyrR binding (left), pairing of segments 1 and 2 leads to formation of the antiterminator stemloop, which prevents segment 3 from basepairing with segment 4. Transcription termination is prevented and downstream genes are transcribed. UMP-stimulated PyrR binding to segment 1, which stabilizes the alternative anti-antiterminator stem-loop (right), disrupts the antiterminator and frees segment 3 for base pairing with segment 4 to form a factor-independent terminator structure. Increased transcription termination reduces expression of the downstream genes. (b) Proposed secondary structure of the antiantiterminator stem-loop from the pyrR–pyrP attenuation region. The sequence shown binds tightly and specifically to PyrR [5]. The structure shown is that predicted by the PCFOLD program of Zuker and Stiegler [69], except that the pentaloop at the top has been folded to form a GNRA tetraloop [53].

proteins [1]. Other than the provision of specific sites for the binding of the regulatory metabolites UMP and PRPP, it is not clear why PyrR should possess enzymatic activity, nor is it certain that this activity is required for the normal regulatory functions of PyrR. The UPRTase activity of PyrR is not quantitatively important in B. subtilis cells [1,17]. These bacteria also produce another UPRTase, encoded by the upp gene, which lies outside of the pyr operon. The upp gene product has much greater sequence similarity to other bacterial UPRTases and contributes most of the UPRTase activity of the cell under physiological conditions. UPRTase activity of PyrR at physiological pH is extremely low [5]. Phosphoribosyltransferases (PRTases) are enzymes involved in nucleotide synthesis and salvage; they catalyze the displacement of PRPP pyrophosphate by nitrogenous compounds, mostly from purine and pyrimidine bases. Crystal structures of several PRTases have been reported [18–25]. Very recently, the structure of the upp-encoded UPRTase from Bacillus caldolyticus was determined (A Kadziola and S Larsen, personal communication). An additional objective of the present work is to understand the structural relationships between these enzymes and PyrR, which differs so markedly from them in biochemical function. PyrR is not unique to B. subtilis (Figure 2). Homologs that probably function in the regulation of pyr genes have been found in B. caldolyticus [26,27], Enterococcus faecalis ([28]; S-Y Ghim, JH Kim, ER Bonner, GK Grabner and RLS,

Research Article Structure of PyrR Tomchick et al.

339

Figure 2 β1 B. subtilis B. caldolyticus E. faecalis L. plantarum T. ZO5 M. tuberculosis Synechocystis sp. H. influenza

MN Q K MQ K MP K MA VRF K MG A A G D A A I G R E S MA A Q I ME

AV AV KE RE AE RE I E KI

I V V V L L I I

trimer loop

α1 10 L DE QA I RRA MD E Q A I R R A V D A V T MK R A V D A MT MR R A MN A P E MR R A MS A A D V G R T L SPEEI RRT I DHDRF L RT

L L L L L I L I

T T T T Y S T S

20 RI A RI A RI S RI T RI A RI A RL A RI S

H E MI HEI I YEI I YEI I HEI V HQI I S QV I HEI I

Q

β3 B. subtilis B. caldolyticus E. faecalis L. plantarum T. ZO5 M. tuberculosis Synechocystis sp. H. influenza

I I L L I I I V

E E K K A T E E

QI QI QL QL EF EY ML EL

E E E E E S E S

60 GNP V GA S V DI DV GV DV GK E V GI HV QV K V SI NL

T V GE P V GE P V GE P V GS P V GV GHGA P V GA P S ME

I L L L L L I L

DI DI DI DI DI DI DV DI

V DV GRP MD L G R P MD L G R P MD H G R P I DL GRP RDV GRP T E Y GRP T DF GRA I

130 S S I QL A RI QL RKI SL AKI SL RRI YL RA V QL QV I RL AKI EL

70 TL TL TL TL TL TL TL TF

Y Y Y Y Y Y Y Y

V V V V V V T I G

L L L L L L L F

V V A V V V V V

140 DRGHRE DRGHRE A RGHRE DRGHRE DRGHRE DRGHRE DRGHRE DRGHRE

L L L F L L L I

V V V I V L L V

GI GI GI GI GI GI GI GI

40 K T RGI K T RGI K T RGI K T RGI HT RGI P T RGV Y T RGV K RRGA

Y Y Y F P T P E

L L I L L L L I

A A A A A A A A

50 KRL AER RRL AER QRL A E R QRL A QR HRI ARF NRL A GN HQL A QQ E L L QRR

I

β4

80 RDDL SKKT SNDEPL V RDDL T VKT DDHEPL V RDDVKDM EEPEL RDDHHA V D V A GQA K L RDDL T EI G Y RP QV R D D L MI K P PRPL A RDDL KRI K T RT P R D D L T L V D Q E D K MP V Y

β5

90 K GA DI P K GT NV P HSSDVP NGA DI P RET RI P STSI PA AKTKI P S GS S QY

VDI FPV VSI VDI F DL GGI L SL L NI

100 T DQK V T ERNV E GK E V NGK HV T GK A I DDAL V T GK RV QDK T V

I I I I V I V I

L L L L L L L L

α3

PRPP

V V V V V V V V

DDV DDV DDV DDV DDV DDV DDV DDV

L L L L L L I L

Y F Y F Y Y Y F

110 T GRT T GRT T GRT T GRT T GRT S GRS K GRT T GRT

V V I V A V I I

RA RA RA RA RA RS RA RA

120 G MD A L A MD A V A MD A V AL DAL AL DAL AL DAL AL NAV A MD A L

Y

dimer loop

A A A A A A L V

30 R N K G MN N C I R NK GI DGCV R NK GI QDI V Q NK GV GNL V A NK GT E GL A K T A L DDP V GP DA P RV V K NSDL SEL V K HQT L DDL V

α2

A

flexible loop

β6 B. subtilis B. caldolyticus E. faecalis L. plantarum T. ZO5 M. tuberculosis Synechocystis sp. H. influenza

E E E E E E E E

β2

β7 L L L L L L L L

P P P P P P P P

I I I I I L I I

RA RA RA RP RA RA HP RA

DY DF DY DF DF DY DF DY

150 I GK V GK V GK I GK V GK V GK V GK V GK

Q

β8 NI NV NI NI NV NV I L NV

P P P P P P P P

T T T T T T T T

S S S A S S A S

160 KSEKV RSEL I KTEI I L DE QV RNEVV RSESV A E E QV RDEL V

I

MV VV V SV KV HV KV QV

β9 170 QL DE V DQNDL V E L S E V DGI DQV E ME E R E G A D R I A L E E HDGHDGI K V E E V DGE DRV RL RE HDGRDGV Y L QDP DGRDT V RT E K QDGCY E V

AI SI MI SI EL VI EL AI

180 YENE HEK S K GNE EKL EE WE K E G A SR I KG L GK Structure

Sequence alignment of the PyrR homologs. Residues identical in all homologs are highlighted in red and similar residues are highlighted in yellow and boxed. The secondary structure and sequence numbering for the B. subtilis protein are indicated; cylinders represent α helices and arrows represent β strands. The following loops are labeled: the trimer loop, which contributes to the threefold interface in the hexamer and is disordered in the dimeric structure; the flexible loop, which is near the PRTase active site and is disordered in both structures; the dimer loop, which forms the primary dimer contact; and the PRPP

loop, which contains the PRPP sequence motif. Open triangles indicate the complete PRPP sequence motif, filled magenta triangles indicate the spontaneous mutations of the B. subtilis protein isolated by Ghim and Switzer [35] and the filled blue circles indicate sitedirected mutations of B. subtilis PyrR that lead to defects in pyrimidine regulation (H Savacool and RLS, unpublished results). Amino acid substitutions of the mutations are given beneath the blue circles and magenta arrows. The figure was prepared with ALSCRIPT [70].

unpublished results), and Lactobacillus plantarum [29]. The identification of a typical attenuator sequence at the 5′ end of the Lactococcus lactis pyrKDbF operon suggests that a PyrR protein is also produced by this organism [30]. A pyrR gene was also identified in the Thermus strain ZO5 and was proposed to regulate pyr gene expression by RNA binding [31], but by a mechanism that differs substantially in detail from that established for B. subtilis. Genes with open reading frames having significant sequence similarity to B. subtilis pyrR have also been identified in the genomes of Haemophilus influenzae [32], Mycobacterium tuberculosis [33] and a Synechocystis species [34], but the functions of any protein products of these genes are unknown.

defects in the regulation of expression of the pyr operon has been developed [35]; analysis of these mutants would be greatly aided by the availability of a high-resolution structure of this protein. We report here the crystal structures of two forms of PyrR not bound by ligand, a dimeric form to 1.6 Å resolution and a hexameric form to 2.3 Å. The molecular architecture of PyrR is typical of a type I PRTase, but the interaction between monomers to form dimers is quite different from the other members of the PRTase family. The electrostatic surface potential of the dimeric form of PyrR suggests an obvious RNA-binding face.

Knowledge of the three-dimensional structure of PyrR is a necessary step in determining how PyrR recognizes specific pyr mRNA sequences in a UMP-dependent manner. A simple procedure for the isolation of PyrR mutants with

Results The structures of two ligand-free forms of PyrR, a dimeric form to 1.6 Å resolution and a hexameric form to 2.3 Å resolution, have been determined. Crystals of the hexameric form were obtained initially; heavy-atom screening to solve the phase problem led to the crystallization of the dimeric

340

Structure 1998, Vol 6 No 3

Figure 3 (a)

C

N

C

N

180

160

180

160 140

20

60

140 20

40

120 100

60

40

120 100

(b) HOOD

TRIMER LOOP

PRPP

HOOD

TRIMER LOOP FLEXIBLE LOOP

PRPP

Tertiary and quaternary structure of PyrR. (a) Stereo Cα trace of the PyrR monomer from the hexamer structure. The chain trace is numbered at every twentieth residue, and a dashed line representing the disordered residues of the PRTase flexible loop is drawn for residues 73–84. The PRPP loop, residues 101–113, is shown in green. (b) Stereo view of the PyrR dimer from the dimer crystal structure. One monomer is depicted with green helices and red β strands, and the other with blue helices and yellow β strands. The disordered residues of the trimer loop (residues 30–33) and the PRTase flexible loop (74–81) are indicated by the dashed black lines. Black spheres represent the positions of the Sm3+ ions used in the MAD structure determination. (c) Orthogonal views of the PyrR hexamer. The left-hand view is along the threefold axis, and the right-hand view is along a twofold axis. The approximate outer dimensions of the hexamer are 59 Å by 73 Å. The six monomers are colored differently. The figure was prepared with the program MOLSCRIPT [71].

FLEXIBLE LOOP

(c)

Structure

form, which were Sm3+ co-crystals. The structure of the dimeric form was solved by multiwavelength anomalous diffraction (MAD), and the resulting PyrR structural model was used to solve the structure of the hexameric form by molecular replacement. Sm3+ prevents threefold aggregation of dimers by binding to PyrR residues that contribute to the threefold subunit contact of the hexamer. Subunit structure

The PyrR monomer (Figure 3a) displays an overall architecture similar to other type I PRTases [18–23,25,36]. The

type I PRTase monomers are composed of a core domain that contains a parallel β sheet (β3, β2, β5, β6 and β7), flanked by α helices (α1, α2 and α3), and a second domain or subdomain. The core domain (residues 10–157 in PyrR) includes the active site, principal elements of which are the ‘PRPP loop’, the ‘PPi loop’ and the ‘PRTase flexible loop’. The PRPP loop, which connects the central β strand (β5 in PyrR) and the following α helix (α3), binds the ribose-5-phosphate moiety of PRPP. It is also the central segment of a sequence motif (101VILVDDVLYTGRT113), conserved in type I PRTases. The PPi loop, connecting β2

Research Article Structure of PyrR Tomchick et al.

341

Figure 4 Unique dimer interface of PyrR. The stereo view shows Cα traces for the dimers of B. subtilis PyrR (thick lines) and T. foetus HGXPRTase (thin lines). A monomer of HGXPRTase is superimposed on a monomer of PyrR, illustrating the very similar structure of the two proteins. The positions of the second monomers demonstrate the radically different dimerization mode of PyrR. All other type I PRTases of reported structure dimerize in a similar manner to HGXPRTase. The rmsd values for superposition of other PRTase monomers on PyrR are: 1.67 Å for 124 common Cα atoms with HGXPRTase ([21]; PDB code 1HGX); 1.48 Å for 103 common Cα atoms with HGPRTase ([20]; 1HMP); 1.44 Å for 83 common Cα atoms with XPRTase ([25]; 1NUL); 1.41 Å for 68 common Cα atoms with S. typhimurium OPRTase ([18]; 1STO); 1.33 Å for 70 common Cα atoms of E. coli OPRTase ([23]; 1ORO); 1.58 Å for 65 common Cα atoms with E. coli GPATase ([51]; 1ECF); and 1.46 Å for 63 common Cα atoms with B. subtilis GPATase ([39]; 1AO0). The figure was prepared with the program MOLSCRIPT [71].

and α2, binds the PPi moiety of PRPP [37,38] and is adjacent to the PRPP loop. The PRTase flexible loop, between β3 and β4, becomes ordered and closes the active site during catalysis; it includes residues that bind the pyrophosphate group of PRPP [38]. Residues 74–81 of the PRTase flexible loop in PyrR are disordered, as in most other PRTase structures. The second domain or subdomain (residues 3–9 and 158–180 in PyrR) specifically binds the nucleophile that will displace the pyrophosphate group from PRPP and is known as the ‘hood’ in those PRTases in which it is a small subdomain [18]. A second PRTase structural family was identified recently and is unrelated to PyrR [24].

Structure

cis-Thr47 in the GMP-bound T. foetus hypoxanthineguanine-xanthine PRTase (HGXPRTase) [21], cis-Tyr72 in E. coli OPRTase [23], cis-Asp282 in B. subtilis glutamine PRPP amidotransferase (GPATase) [39], cis-Glu303 in E. coli GPATase [36] and cis-Arg37 in E. coli XPRTase [25]. In our structure of PyrR, Lys40 is the trans conformation, but adopts an unfavorable backbone conformation (φ = +73°, ψ = +155°). In the structure of HGXPRTase from Toxoplasma gondii in complex with the product XMP, the equivalent residue (Lys79) of the PPi loop is 3–4 Å closer to the active site than in the free enzyme; a cis-peptide has not been reported for this residue [22]. Dimeric PyrR

The PRTase structural elements of substrate and product binding — the PRPP loop, the PPi loop, the flexible loop and the hood — are interpreted in terms of PyrR function because the UPRTase substrate and product — PRPP and UMP — are PyrR co-regulators. In both oligomeric forms of PyrR, a sulfate anion from the crystallization medium is bound in the PRPP loop, to the binding site for the 5′-phosphate of PRTase substrates, products or inhibitors. The sidechain of Lys40 in the PPi loop is extended across the active-site cleft and forms a salt bridge to the sulfate ion. Similar anion binding occurs in Salmonella typhimurium orotate PRTase (OPRTase), in which the corresponding lysine residue, Lys73, forms a salt bridge with the 5′-phosphate group in the OMP complex [18], and with the pyrophosphate group in the orotate–PRPP complex [37]. In several other PRTase structures, the PPi loop includes a cis peptide, specifically

The PyrR dimer has a flattened disc shape with maximal dimensions of 73 Å by 49 Å by 34 Å (Figure 3b). The PRTase active sites are separated by approximately 30 Å. Dimerization occurs through helix α3, strand β6 and the long ‘dimer loop’ between strands β6 and β7; the molecular dyad is also a crystallographic twofold axis. Five symmetric pairs of hydrogen bonds surround a hydrophobic core in this interface, which buries a total of 820 A2 of surface area per monomer, approximately 11% of the monomer surface area. The buried surface area and number of hydrogen bonds in the dimer interface are similar to a typical protease–inhibitor or antibody–antigen complex [40–43]. Residues 30–33 are disordered in the crystals of dimeric PyrR. The PyrR subunit dimerizes very differently to other type I PRTases (Figure 4). As a result, the two active sites are

342

Structure 1998, Vol 6 No 3

on opposite ends of the PyrR dimer, in contrast to other PRTases in which the two active sites are near the dimer interface, where the flexible loop may contribute to both active sites [23,37,38]. Each PyrR monomer must therefore comprise all of the residues necessary for catalysis, including basic residues that can interact with the pyrophosphate group of PRPP. The conserved residues (Figure 2) in PyrR that are near the active site and may interact with the 1pyrophosphate group of PRPP are Lys40 and Arg42 in the PPi loop and Arg73 in the flexible loop. In agreement with the structural data, there is no kinetic evidence for site–site interactions in the UPRTase reaction catalyzed by PyrR [5]. This is in contrast to the UPRTase from E. coli, which is positively regulated by the allosteric effector GTP and displays increased activity upon GTP- and PRPP-induced higher oligomerization [44]. Lanthanide ions disrupt subunit contacts along the threefold axis of the PyrR hexamer and were essential to crystallization of the dimeric form. Calcium ions do not have the same effect despite their similar radius, ligand preference and protein-binding sites [45], even at 50-fold greater concentration (100 mM versus 2 mM). The coordination geometry of Sm3+ does not explain the ion selectivity, which may be due to the greater electropositivity of lanthanide ions in the 3+ state compared with Ca2+. Sm3+ crosslinks PyrR dimers in the crystal because it is located on a crystallographic twofold axis and is octahedrally coordinated to the sidechains of Glu10, Asp167 and Asp173 and their twofold symmetry mates. Binding of the more bioavailable Ca2+ in the same manner would have rather deleterious effects on PyrR function because of the possibility of forming extended polymers of dimers. A second fully hydrated Sm3+ binds near the PRTase flexible loop, where it may influence loop conformation. Hexameric PyrR

The PyrR hexamer is a trimer of the dimer structure described above, having D3 molecular symmetry (Figure 3c). Molecular and crystallographic threefold axes are coincident; the molecular twofold axis of the hexamer is noncrystallographic. The hexamer shape resembles a flattened hollow cylinder, approximately 59 Å high along the threefold cylinder axis and 73 Å wide. The diameter of the inner space is approximately 30 Å, at its widest point in the center of the hexamer, and 8 Å at its narrowest near the top and bottom. The 8 Å opening is too small to accommodate any form of RNA. Threefold contacts are formed between the hood of one monomer and the PRTase core of another. Each subunit interface buries approximately 380 Å2 of surface area per monomer, less than half the surface area buried in the dimer interface, and includes one salt bridge, several van der Waals contacts and no hydrogen bonds. An important contributor to the threefold contact is the ‘trimer loop’ (residues 30–33), which is disordered in the crystal structure of dimeric

PyrR. The trimer loop contacts hood residues 5–8 of a threefold related subunit, and the sidechain of Met31 is buried within its own subunit. The PyrR dimer interface is flexible, as evident in comparison of the dimer and hexamer structures. The root mean square deviation (rmsd) for superposition of the 320 common Cα atoms in the dimers for the two oligomerization states is 0.98 Å. This reflects a small (3°) rotation of monomers at the dimer interface. Individual monomers superimpose with an rmsd of 0.53 Å.

Discussion PyrR aggregation state

The aggregation state of PyrR is variable. The apparent molecular weight for the major species eluted in gel filtration experiments is directly proportional to protein concentration [5]. On the basis of the crystal structures, rapid equilibrium between dimeric and hexameric species is inferred. The small buried surface area and lack of hydrogen bonds in the threefold interface of the hexamer imply that it is not a biologically relevant subunit interaction [46]. The hexameric form of PyrR therefore may be non-physiological and due only to the relatively high in vitro concentrations used during both gel filtration and crystallographic studies. The aggregation state of PyrR when complexed with RNA has not been determined, but we propose that PyrR is a dimer at intracellular concentrations. Relationship to other phosphoribosyltransferases

Type I PRTases from different biological sources that catalyze the same reaction (orthologs) typically show significant sequence identity; however, the family is characterized by low sequence identity (< 20%) among PRTases that catalyze different reactions (paralogs). Indeed, family members are most easily identified at the sequence level by the characteristic motif of the PRPP loop. The six different type I PRTases of known structure have a common fold and active-site structure, as expected, despite their low overall sequence identity. PyrR occupies a distinct branch on the family tree of type I PRTases. PyrR and the upp-encoded bacterial UPRTases appear to be distant PRTase relatives, even though they catalyze the same chemical reaction. The sequence identity between B. subtilis PyrR and upp-encoded UPRTase is only 15%, outside of the PRPP sequence motif region. The UPRTases are typically 30 residues longer than PyrR and its close homologs, due mostly to insertions in the hood subdomain. There are two differences in the PRPP sequence motif between these proteins. Firstly, the second of the two conserved aspartate residues in PyrR (Asp106) is a proline in the UPRTases; the carboxylate group of this acidic residue interacts with O3′ of ribose hydroxyl groups in nucleotide–type I PRTases complexes [22,38,39]. Secondly, a conserved aromatic residue in

Research Article Structure of PyrR Tomchick et al.

PurR (Tyr109), which may stack on the uracil base of UMP, is an alanine in the UPRTases. The structure of the PyrR subunit is more similar to the structures of Tritrichomonas foetus HGXPRTase [21] and human HGPRTase [20] than to the other type I PRTase structures (Figure 4). A core of 124 Cα atoms are structurally equivalent in PyrR and T. foetus HGXPRTase; the largest differences occur in the PRTase flexible loop and the hood. In contrast, structures of UPRTase, XPRTase, OPRTase and GPATase have only 60–80 residues structurally equivalent to those of PyrR. The significant difference in number of equivalent residues is because some structural elements shared by PyrR and HG(X)PRTase differ from those of other PRTases; these include a similarly folded hood, a loop rather than a helix in the last crossover connection of the central parallel β sheet and a shelf-like extension on the central β sheet. PyrR may therefore have originated from an HG(X)PRTase and has been optimized to bind UMP and RNA. The existence of a relationship between PyrR and HG(X)PRTase is also indicated by the greater, albeit low, similarity between the amino acid sequences of PyrR and HG(X)PRTases than between PyrR and other PRTases (RJT, unpublished results). Regulatory versus catalytic function of PyrR

The primary function of PyrR is to regulate transcription of genes for pyrimidine biosynthesis. It has not been established whether PyrR catalytic activity is essential to this regulatory function. PyrR displays typical PRTase activity under nonphysiological conditions (pH 8.5); however, the Km for uracil rises sharply as the pH is decreased to physiological values and is probably much higher (> 1 mM) than the intracellular uracil concentration [5]. On the basis of these data, we propose that the catalytic activity of PyrR is not an essential feature of its regulatory function. UPRTase activity of PyrR may be impaired at neutral pH by lack of a suitable base for deprotonation of uracil (pK = 9.5). The identity of the catalytic base has not been established for any PRTase, so there is little basis for comparison of active-site structures. Substrate-assisted catalysis with the PPi leaving group as the catalytic base has been proposed in a structure-based mechanism for GPATase [38]. Catalysis by such a mechanism could be impaired in PyrR by binding PRPP in a non-optimal conformation. Nonenzymatic regulatory function of a type I PRTase homolog has also been characterized in another system. Expression of genes for de novo purine biosynthesis in B. subtilis is controlled by a protein that contains the PRTase sequence motif [47]. This dimeric purine repressor (PurR) regulates transcription initiation of the pur operon in vivo in response to excess adenine, but is not related to E. coli PurR or other members of the LacI family. Binding of B. subtilis PurR to control-region DNA in vitro is inhibited by

343

PRPP, yet is unaffected by nucleotides or purines [48]. The 25% sequence identity between PurR and B. subtilis XPRTase indicates that PurR is a member of the type I PRTase structural family. No PRTase enzymatic activity was detected for this repressor protein [48]. PyrR homologs

In contrast to the low sequence identity it has with other type I PRTases, PyrR has significant sequence identity (45–73%) with a group of other bacterial proteins. The structure of PyrR should be an excellent three-dimensional model for these homologs. In the context of the type I PRTase family, the high level of sequence identity among homologs is itself evidence that the proteins have a common function. PyrR homologs have been implicated in the regulation of transcription of genes involved in pyrimidine biosynthesis in B. subtilis, B. caldolyticus, E. faecalis and L. plantarum. Other potential homologs from Thermus ZO5, Mycobacterium tuberculosis, Haemophilus influenzae and a Synechocystis species have been identified at gene level. Attenuation sequences and candidate PyrR regulatory sites are present in all these organisms, except H. influenzae and the Synechocystis species. Three highly conserved regions are identified in the sequence alignment of the eight PyrR homologs (Figure 2). These are the PRTase flexible loop (70–75), the PRPP loop (101–113) and dimer loop (138–146). Conservation of the dimer loop implies that all of the PyrR homologs dimerize in the same way as the B. subtilis protein. The dimer loop also contributes to a large basic surface that may be important for RNA interactions. In the trimer loop of PyrR, the sequence 28NKGX31 (where X is a small preferably hydrophobic residue) forms a type III′ reverse turn that makes hydrophobic contacts in the PyrR hexamer. The sidechains of Lys29 and Arg27 make hydrophobic contacts with Ile177 and Pro155, respectively, at the threefold interface. Five of the eight PyrR homologs contain the NKGX sequence motif; at elevated concentrations, these proteins may form hexamers like the PyrR hexamer. Relationship of PyrR to other RNA-binding attenuation regulatory proteins

Transcriptional attenuation is a common mechanism of gene regulation in eubacteria, and several proteins that regulate attenuation have been identified. None of these other transcriptional attenuation proteins appear to be related to PyrR or its homologs, and none of them include the PRTase sequence motif. Three-dimensional structures have been determined for two of these attenuator proteins [12–14], but no structures of these proteins in complex with RNA have been reported. The B. subtilis tryptophan RNA-binding attenuation protein (TRAP; [9]) controls transcription of the trp operon in response to changes in the intracellular levels of tryptophan. Like UMP-bound PyrR, tryptophan-bound TRAP

344

Structure 1998, Vol 6 No 3

Figure 5

Electrostatic surface potential of the PyrR dimer and a model for RNA binding. The left-most and right-most images are opposite views of PyrR along the dimer axis. The two active-site clefts are apparent in the upper right and lower left corners of the right-most image, which shows the acidic surface of PyrR and is in the same orientation as Figure 3b. The basic surface is prominent in the left-most image. The concavity of the basic surface is clear in the third image, which is a side view of the dimer with the basic surface facing left. The model of

the 28-nucleotide pyr mRNA stem-loop shown in Figure 1b is docked to the basic surface of PyrR in the two center images. The GNRA tetraloop is at the top of the dimer and the four-base bulge fits into the concavity in the basic surface. The displayed surface potential varies approximately from –10 kT to 10 kT with acidic surfaces in red and basic in blue. The electrostatic surface potential was calculated and rendered in the program GRASP [72].

induces formation of an mRNA stem-loop terminator in the control region of the operon. Similarly, in the absence of TRAP, formation of an alternative antiterminator is favored. Neither the undecamer TRAP structure [14] nor the 11fold trinucleotide repeat to which it binds are similar to the PyrR transcription attenuation system, however.

crystal structures of apo PyrR should be a reasonable starting point for modeling both UMP and RNA complexes.

Another major family of transcriptional attenuation proteins controls transcription of genes involved in the utilization of aromatic β-glucosides and oligosaccharides. The non-phosphorylated dimeric forms of E. coli BglG [49] and B. subtilis SacY [7,12,13] are postulated to recognize a 29-nucleotide mRNA antiterminator stem-loop that overlaps with the downstream terminator stem-loop [6,50]. Despite their common dimeric aggregation and the similar sizes of their RNA stem-loop recognition sites, PyrR and SacY/BglG are unrelated proteins [12,13]. Likewise, details of the RNA recognition sequences are dissimilar (E Bonner, J D’Elia and RLS, unpublished results; [50]). Model of RNA binding

UMP and PRPP are the physiological effectors of transcription attenuation by PyrR. Our working hypothesis is that UMP and PRPP bind PyrR in the same way for transcription attenuation as for catalytic turnover, in a manner similar to the binding of nucleotide products and PRPP substrate to other type I PRTases. PyrR binds RNA specifically in absence of effectors, but binding is enhanced by UMP [5]. Likewise, PRTase structures are very similar with and without their nucleotide products [21,22,39,51]. Only subtle structural differences are therefore anticipated in PyrR with and without UMP, and the

A stem-loop pyr mRNA model was constructed to reflect the specific sequence and predicted secondary structure elements of the RNA target sequence (Figure 1b). Two features of the RNA stem-loop sequence that are conserved among pyr attenuation regions and are sensitive to mutagenesis [52] are the loop, which is proposed to form a GNRA tetraloop, and a purine-rich four-nucleotide bulge in the stem. The GNRA tetraloop is a common, stable, RNA structural motif in which the N (any) and R (purine) bases are stacked over the A base of a G–A base pair [53,54]. The choice of a GNRA tetraloop, as opposed to other possible loops is conjectural at this time. RNA stemloops built with other types of loops could also be modeled onto the surface of PyrR. To build the 28nucleotide pyr mRNA stem-loop, the GNRA tetraloop of a DNA–RNA hammerhead ribozyme [54] was attached to an A-form RNA stem. A four-base bulge was inserted in the stem, based on the bulge in ligand-free HIV-1 Tar RNA [55]. The GNRA tetraloop and the four-base bulge are on the same side of the pyr mRNA model because of the intervening six base pairs. PyrR contains no recognized RNA-binding structure or sequence motifs, but a large basic region on the surface of the PyrR dimer is the obvious choice for an RNA-binding site (Figure 5). This surface is concave, whereas the convex surface that includes the active-site cleft is negatively charged. The model of pyr mRNA has a shape complementary to the concave, basic, surface of PyrR, such

Research Article Structure of PyrR Tomchick et al.

345

Figure 6 Stereo view of a model of UMP bound to the active site of the PyrR monomer from the dimer crystal structure. Protein residues implicated in catalytic activity and/or UMP coordination are shown with black bonds; UMP is shown with white bonds; carbon atoms are white; oxygen, nitrogen and phosphorus are shaded. Potential hydrogen bonds between the UMP base and protein are shown as dashed lines. One of these hydrogen bonds is to invariant Arg138 in the dimer loop. The figure was prepared with the program MOLSCRIPT [71].

T156

T156

R138

Y109

R138

Y109

R73

V162

R73

V162

D106

D106 D105 K40

R42

D105 K40 R42

T41

T41 Structure

that the bulge created by the four unpaired nucleotides fits into the concavity of PyrR (Figure 5). Nucleotides that are protected from hydroxyl radical cleavage by PyrR (J D’Elia and RLS, unpublished results) are on one side of the RNA model, oriented towards the protein face. Among the eight PyrR homologs, most of the conserved sidechains that are on the surface of the dimer map to the proposed RNAbinding surface. Prominent among these is the invariant dimer loop (Figure 2), which forms the deepest part of the concavity. The unique dimer contact found in PyrR may therefore have evolved specifically to favor the binding of RNA, at the expense of active-site interactions seen in other type I PRTase dimers (Figure 4). The model represents an asymmetric RNA molecule bound to a symmetric protein dimer; such behavior has been observed in the complex of the RNA bacteriophage MS2 coat protein and a 19-nucleotide stem-loop RNA [56]. Mutagenesis results are consistent with the model of pyr mRNA binding. A spontaneous mutant of PyrR, E23A, was selected in a screen for defects in regulation by pyrimidines [35]; residue 23 maps to the concave surface of the dimer. The sidechain of Glu23 in the wild-type protein forms a salt bridge to the guanidinium groups of Arg19 and Arg27, which contribute to the concave basic surface. Glu23 may orient these arginine residues for RNA recognition, or it may be directly involved in RNA binding. Current site-directed mutagenesis studies are being directed towards conserved basic residues on the proposed RNA-binding surface of PyrR, particularly, Arg15, Arg19, Arg27, Arg141, Arg146 and Lys152. Preliminary analysis of R19Q and R146Q mutants reveals altered in vivo regulation and in vitro RNA binding (H Savacool and RLS, unpublished results).

Nucleotide binding to PyrR

Nucleoside monophosphates bind in the PRPP site of type I PRTases in a variety of orientations [18,22,38,39]; however, in all of these complexes the 5′-phosphate group is bound in the PRPP loop to residues at the N terminus of the α helix (α3 in PyrR), as SO42– is bound in PyrR crystals. Base specificity is typically conferred through recognition by residues of the hood subdomain. On the basis of the complex between T. foetus HGXPRTase and GMP [21], UMP was modeled into the active site of dimeric PyrR (Figure 6). The protein structure accommodates UMP without modification except for the sidechain of Lys40, which forms a salt bridge to SO42– in the crystal structures and was re-oriented to avoid close contacts with the nucleotide. Complete uracil specificity is provided by hydrogen bonds with the backbone of Val162 in the hood subdomain, namely, uracil N3 to the Val162 carbonyl and the Val162 NH to uracil O4. Similar hydrogen bonds to the protein backbone are made by guanine in the complex of T. foetus HGXPRTase and GMP, but the analogous amino acid is further from the PRPP loop in HGXPRTase than in PyrR, presumably to accommodate the larger purine base. Purine bases would not fit in the PyrR site. Our model includes an additional hydrogen bond to uracil O4 from the guanidinium group of invariant Arg138 in the dimer loop (residues 138–144). The interaction of Arg138 with UMP may transmit a structural change to the proposed RNAbinding region of the protein. The flexible loop in type I PRTases (residues 71–82) shields the active site from the bulk solvent during catalysis. Conserved residues in this loop are involved in PRPP binding [38]. The sidechain of invariant Arg73 is less than 5 Å from uracil O2 of the UMP

346

Structure 1998, Vol 6 No 3

model; a small movement of the loop could position it within hydrogen bonding distance of the uracil base. The high concentration of SO42– in the crystallization medium competes with UMP for binding to the PRPP loop, and crystals of the PyrR–UMP complex cannot be produced by soaking. Co-crystallization experiments under a variety of high and low ionic strength conditions and pH values, however, have also not yielded a crystalline complex. The affinity of PyrR for UMP in absence of RNA is unknown. It may be that UMP binds nonspecifically to the basic surface of PyrR in absence of RNA. UTP also functions as a co-regulator of transcriptional attenuation at approximately tenfold higher concentrations than UMP, but UDP is not an effector [3]. It has been interpreted from the results of these studies that UTP binds PyrR differently than UMP or PRPP. Indeed, nucleoside di- and triphosphates have not been observed in the active sites of any type I PRTase structures and have been shown to be excluded in one case [39]. PRPP does not compete effectively with UTP in co-regulation of transcription termination [3]. A different binding site for UTP is not apparent in the PyrR structure. PyrR functional mutants

Among the spontaneous mutants selected in a screen for defects in regulation by pyrimidines are T41I, D106Y, V134G and T156I [35]. Residue 106 is the second of the invariant aspartate residues of the PRPP loop; tyrosine substitution would effectively block the UMP-binding site. Thr41 is near the proposed binding site of the pyrophosphate group of PRPP and is conserved in most PyrR homologs; substitution by the bulkier hydrophobic isoleucine may curtail the flexibility of the PPi loop and interfere with UMP and PRPP binding. Val134 is in the hydrophobic core of PyrR. The V134G mutation may introduce a destabilizing cavity in the protein interior or may affect the transduction of the UMP-binding signal from the active site to the RNA-binding site. There is a hydrogen bond between the sidechains of invariant residues Thr156 and Arg138. Arg138 forms an important connection between the UMP site and the basic RNA-binding surface. The T156I mutation may therefore disrupt the orientation of Arg138, and indirectly affect UMP binding or signaling between the UMP and RNA sites.

Biological implications Transcriptional attenuation proteins bind mRNA and regulate transcription termination by inducing changes in the secondary structure of the RNA. Expression of genes for pyrimidine nucleotide biosynthesis in Bacillus subtilis is regulated by the transcriptional attenuation protein PyrR, together with the co-regulators UMP and PRPP. PyrR binds three sites in the control regions of pyr mRNA that have a consensus 28 to 30 nucleotide

stem-loop structure, allowing formation of downstream terminator structures. PyrR is not related to other transcriptional attenuation proteins, but is a homolog of the type I phosphoribosyltransferases (PRTases) and probably exhibits very low levels of uracil PRTase catalytic activity at physiological uracil concentrations. The threedimensional structure of PyrR reveals that its closest structural homolog is Tritrichomonas foetus hypoxanthine-guanine-xanthine PRTase. A large basic region on the otherwise acidic surface of the PyrR dimer is an obvious choice for the RNA-binding site. Co-regulators PRPP and UMP are proposed to bind to the enzyme active site like the substrate and product molecules. The primary, if not the only, physiological function of PyrR is regulation of expression of the pyr operon. The tertiary structure and surviving UPRTase activity of PyrR demonstrate that this regulatory protein originated from an ancestral PRTase in which UMP- and PRPPbinding sites were retained and an RNA-binding surface arose, apparently by a novel interaction between monomeric units to form dimers in a fashion quite unlike dimer formation by other PRTases. Another instance in which a regulatory protein appears to have arisen from a PRTase is the PurR repressor protein from B. subtilis [47]. The ability of PurR to bind to pur operator DNA is also regulated by PRPP and its sequence is most similar to the B. subtilis XPRTase sequence. We suggest that many proteins regulating gene expression may have arisen from enzymes by evolution of specific DNA- or RNA-binding surfaces.

Materials and methods Crystallization and data collection B. subtilis PyrR was overexpressed in the E. coli strain S∅408 using plasmid pTSROX3 and purified as described previously [5]. The protein was stored in 10 mM Tris-acetate, pH 7.5, at –80°C. Crystals of hexameric PyrR were grown by hanging drop vapor diffusion at 20°C. Many conditions favorable for crystal growth were obtained initially from the Hampton Research Crystal Screen I [57], and some very small and thin but extremely fragile crystals grew in 2.0 M ammonium sulfate. An additional search for more suitable conditions was conducted via an experimentally designed orthogonal array screen [58], which yielded a microcrystalline precipitate from PEG 6000 and ammonium sulfate. Final conditions for crystal growth were obtained via refinement of these conditions over a fine screen of 0.2 pH units. Protein at 10 mg/ml in 10 mM Tris-acetate, pH 7.5, and 0.2% β-octylglucoside was mixed with an equal volume of reservoir solution, and equilibrated against 13–16% PEG 6000, 300 mM ammonium sulfate, and 50 mM sodium succinate at pH 5.1. Cube-shaped crystals appeared within 7–10 days and grew to typical dimensions of 0.3 × 0.3 × 0.3 mm3 over a period of 6–8 weeks. Crystals of dimeric PyrR were grown under the same conditions, except for the addition of 2.0 mM Sm(NO3)3 to the reservoir solution; long thin rods appeared overnight and grew to maximum dimensions of 0.05 × 0.15 × 1.0 mm3 over a period of 2–5 weeks. Other lanthanide ions such as Ho3+, Lu3+ and Yb3+ could be used to obtain this crystal form, but the crystals grown with Sm3+ were thicker and diffracted further. Crystals were cryoprotected by soaking for approximately 30 seconds in 20% (meso and DL)2,3-butanediol, 16% PEG 6000, 300 mM ammonium sulfate, 50 mM sodium succinate pH 5.1 and 0.2% β-octylglucoside. The crystals were quickly mounted in a small plastic fiber loop and either flash

Research Article Structure of PyrR Tomchick et al.

347

Table 1 Data collection and phasing statistics. C2 crystal form

Data range (Å)

R32 crystal form

λ1 (1.6950 Å)

λ2 (1.6959 Å)

λ3 (0.9799 Å)

CuKα

40.0–2.20

40.0–2.25

40.0–1.55

40.0–2.3

No. of observations

33,325

37,191

104,445

92,317

No. of unique reflections

8342

8644

26,609

23,877

Completeness (%)

87.6

93.7

98.0

98.6

Rsym (%)*

4.2

5.4

8.5

5.0

Energy (keV)

7.315

7.311

12.653

(e–)

–12.5

–16.2

–0.6

f′′ (e–)

16.9

13.7

5.9

Rcullis (iso)†

0.73

0.69

Rcullis (ano)‡

0.41

0.50

1.28

1.70

f′

Phasing

power#

0.75

*Rsym =Σ |Io–| / Σ Io, where Io are observed intensities and are average intensities for redundant measurements. †R cullis (iso) = Σ | | |FPH|obs +/–|FP|obs |–|FH|calc | / Σ | |FPH|obs +/–|FP|obs |, where |FP|obs and |FPH|obs are the native and derivative observed structure amplitudes, respectively, and |FH|calc is the calculated heavy

atom structure amplitude. ‡Rcullis (ano) = Σ | |DPH|obs–|DPH|calc | / Σ |DPH|obs , where |DPH|obs and |DPH|calc are the observed and calculated anomalous differences for |FPH|. #Phasing power = rms ( |FH| / E ), where E is the residual lack of closure error.

frozen in liquid ethane [59] and stored in liquid nitrogen for later use or frozen in a nitrogen gas stream at 120K with an Oxford Cryostream.

Sm LII edge on Sm3+ co-crystals. The diffraction data were treated as in a conventional multiple isomorphous replacement experiment with the inclusion of anomalous scattering. The three sets of data were placed on the same relative scale in the program SCALEIT, and inspection of an anomalous difference Patterson map calculated from data at λ1 revealed a high occupancy Sm3+ site on the crystallographic twofold axis at x = 0, z = 0, and a lower occupancy Sm3+ site in a general position. Difference Fourier syntheses confirmed that there were no additional Sm ions bound to the protein. For convenience, the y-coordinate of the high occupancy Sm3+ was set to 0.0 during phase refinement, which was performed in the program MLPHARE [63] using data with a d = 20.0 to 2.3 Å. In this procedure, data from λ3 were used as the ‘native’ data set, and a unitary scattering factor was used for the Sm ion (f′ = f′′ = 1.00 electron) [64]. A total of 94% of all unique reflections were assigned a phase in this procedure, and the final overall figure of merit was 0.81. Phasing statistics are shown in Table 1. The resulting electron-density map calculated from these MLPHARE phases was clearly interpretable for the bulk of the protein (Figure 7). The program O [65] was used to build an initial model for 165 residues of the protein, with a resulting R factor of 45.6% for data with d = 6.0 to 2.7 Å. Simulated annealing refinement of this model followed by cycles of positional refinement against data with d = 6.0 to 2.3 Å in the program X-PLOR [66] resulted in an Rwork of 28.0% and an Rfree of 40.5% (for a random 10% subset of all data). During all cycles of simulated annealing and standard positional refinement, the y-coordinate of the high occupancy Sm3+ ion was set to 0.0 to prevent the oscillation of this parameter in the polar C2 space group. Subsequent refinement in X-PLOR employed a bulk solvent mask and a Bayesian weighting scheme of structure factors as implemented in the program HEAVY [67]. The occupancies of the Sm3+ ion on the crystallographic twofold axis and Sm3+ ion in the general position were fixed at 0.5 and 0.4, respectively, and their individual B factors were refined. Standard positional and individual isotropic thermal factor refinement coupled with cycles of model rebuilding, data extension and addition of solvent sites yielded a final Rwork of 19.2% and Rfree of 23.3% for data with d = 15.0 to 1.6 Å. The current model includes two Sm3+ ions, one sulfate ion, 263 waters, protein residues

X-ray diffraction data at two wavelengths near the LII absorption edge of Sm (peak, λ1 = 1.6950 Å; inflection point, λ2 = 1.6959 Å) and at one remote wavelength (λ3 = 0.9799 Å) were collected from a single crystal of dimeric PyrR on Beamline 19 (BM14) of the European Synchrotron Radiation Facility. Although anomalous scattering is greater at the LIII absorption edge of Sm (1.846 A), greater absorption of X-rays by the crystal, air and cryosolvent at this wavelength significantly diminish diffracted intensities. The data were recorded on a Mar Research image plate detector. At each wavelength, 180° of data were recorded as 2° oscillation images in the standard mode (i.e. inverse beam geometry to minimize absorption effects was not used). A fresh spot on the crystal was used to record data at each wavelength. Crystals of the dimeric form of the 181-residue protein grew in the space group C2, a = 76.7 Å, b = 57.7 Å, c = 55.0 Å, β = 129.0°, with one monomer in the asymmetric unit. X-ray diffraction data were collected from a single crystal of hexameric PyrR frozen to 120K, using an R-axis II imaging plate system mounted on a Rigaku RU-200 rotating anode (CuKα) operated at 100 mA and 50 kV. Crystals of the hexameric form of the protein grew in the space group R32, a = b = 100.58 Å, c = 275.15 Å (hexagonal setting), with one dimer in the asymmetric unit. All data were processed and scaled in the programs DENZO and Scalepack [60]. Bijvoet pairs for the C2 crystal form data from λ1 and λ2 were scaled independently and not merged. Negative intensities were truncated and intensities were converted to structure factors and placed on an approximate absolute scale in the program TRUNCATE [61] from the CCP4 package [62]. Data collection and processing statistics for both crystal forms are shown in Table 1.

Structure determination and refinement The structure of the dimeric form of PyrR was phased via the multiwavelength anomalous diffraction (MAD) method from data collected at the

348

Structure 1998, Vol 6 No 3

Figure 7

Table 2 Refinement statistics.

Total non-hydrogen atoms SO42– sites Water sites Sm3+ sites Data range (Å) No. of reflections Sigma cutoff [F/σ(F)] R (%)* Rfree (%)† Mean B value (Å2) mainchain sidechain Rms deviations from target values bonds (Å) angles (°) Bonded B values (Å2) mainchain sidechain Non-crystallographic symmetry (Å) Missing residues Ramachandran outliers

Electron density at 2.25 Å resolution calculated from the MAD phases. The 1.6 Å refined coordinates for a portion of the central β sheet of the protein are superimposed on the density; contours are drawn at the rms level of the map. This figure was prepared in the program O [65].

3–29, 34–73 and 82–180 (Asn180 is modeled as an alanine) and seventeen residues modeled in two conformations. Due to the high degree of conformational flexibility of the missing residues (30–33, 74–81), a greater than usual number of solvent sites were modeled into the positive density peaks in these regions of the map. A last cycle of refinement that included the 10% of the data used in the free R factor calculation resulted in an R of 19.5%. Complete refinement statistics for the C2 crystal form can be found in Table 2. One monomer of the dimeric form of PyrR was truncated prior to use as a search model for the molecular replacement solution of the hexameric form (residues 70–73, 82–85 of the flexible loop, the sidechain of Lys40, and the sulfate ion were deleted). Rotation and translational searches in the program AMORE [68] using data with d = 10.0 to 3.5 Å located two protomers close enough to the crystallographic threefold axis to generate a hexamer of D3 point symmetry. This solution resulted in a correlation coefficient of 56.7% and an Rwork of 58.3%; rigid-body refinement improved the correlation coefficient to 65.4% and Rwork to 38.1%. Subsequent rigid-body refinement in the program X-PLOR against data with d = 10.0 to 3.0 Å resulted in an Rwork of 37.0% and Rfree of 37.8% (for a random 10% subset of all data). Inspection of electron-density maps in the program O allowed a model for residues 30–33 of the previously disordered loop, the sidechain of Lys40 and a sulfate ion in the PRPP loop to be built into both monomers. Simulated annealing refinement of this model followed by cycles of positional refinement against data with d = 10.0 to 3.0 Å in X-PLOR resulted in an Rwork of 28.2% and an Rfree of 38.5%. Subsequent refinement in X-PLOR using

C2 crystal form

R32 crystal form

1649 1 263 2 15.0–1.6 24,217 0.0 19.2 23.3

2958 2 239 NA 15.0–2.3 23,799 0.0 21.1 24.7

15.8 18.9

33.0 37.4

0.011 1.60

0.011 1.68

0.95 1.34

2.48 4.04

NA 1–2, 30–33, 74–81, 181 Lys40

0.61 A: 1–3, 74–83, 181 B: 1–3, 73–82, 181 Lys40 in both monomers

*R =Σ | |Fobs|–|Fcalc| | / Σ |Fobs|, where Fobs is the observed structure factor and Fcalc is the structure factor calculated from the final model. The values given included all of the reflections, including the ones used for the Rfree calculation. †R for a random 10% subset of the total reflections collected. data with d = 15.0 to 2.3 Å employed non-crystallographic symmetry restraints, a bulk solvent mask and a Bayesian weighting scheme of structure factors as implemented in the program HEAVY. Standard positional and individual isotropic thermal factor refinement coupled with model rebuilding and addition of solvent sites yielded a final Rwork of 19.2% and Rfree of 24.0% for the model, which includes two sulfate ions, 258 waters, residues 3–73, 84–180 (Leu86 is modeled as an alanine) of the A monomer and residues 4–72, 83–180 of the B monomer. Six residues of the A monomer and four residues of the B monomer were modeled in two conformations. A last cycle of refinement that included the 10% of the data used in the free R factor calculation resulted in an R of 19.6%. Complete refinement statistics for the R32 crystal form can be found in Table 2.

Accession numbers The coordinates of the dimeric (1A3C) and hexameric (1A4X) PyrR structures have been deposited with the Brookhaven Protein Data Bank. Accession codes for the structure factors are R1A3CSF and R1A4XSF, respectively.

Acknowledgements The authors thank AW Thompson for expert advice in data collection at the ESRF and S Larsen and A Kadziola for sharing their UPRTase coordinates prior to publication. This work was supported by NIH DK42303 to JLS and NIH GM47112 to RLS. RJT was supported in part by NIH Cell and Molecular Biology Training grant No. GMO7283.

References 1. Turner, R.J., Lu, Y. & Switzer, R.L. (1994). Regulation of the Bacillus subtilis pyrimidine biosynthetic (pyr) gene cluster by an autogenous transcriptional attenuation mechanism. J. Bacteriol. 176, 3708-3722.

Research Article Structure of PyrR Tomchick et al.

2. Lu, Y., Turner, R.J. & Switzer, R.L. (1995). Roles of the three attenuators of the Bacillus subtilis pyrimidine biosynthetic operon in the regulation of its expression. J. Bacteriol. 177, 1315-1325. 3. Lu, Y. & Switzer, R.L. (1996). Transcriptional attenuation of the Bacillus subtilis pyr operon by the PyrR regulatory protein and uridine nucleotides in vitro. J. Bacteriol. 178, 7206-7211. 4. Lu, Y., Turner, R.J. & Switzer, R.L. (1996). Function of RNA secondary structures in transcriptional attenuation of the Bacillus subtilis pyr operon. Proc. Natl. Acad. Sci. USA 93, 14462-14467. 5. Turner, R.J., Bonner, E.R., Grabner, G.K. & Switzer, R.L. (1998). Purification and characterization of Bacillus subtilis PyrR, a bifunctional pyr mRNA-binding attenuation protein/uracil phosphoribosyltransferase. J. Biol. Chem. 273, 5932-5938. 6. Houman, F., Diaz-Torres, M.R. & Wright, A. (1990). Transcriptional antitermination in the bgl operon of E. coli is modulated by a specific RNA binding protein. Cell 62, 1153-1163. 7. Crutz, A.-M., Steinmetz, M., Aymerich, S., Richter, R. & LeCoq, D. (1990). Induction of levansucrase in Bacillus subtilis: an antitermination mechanism negatively controlled by the phosphotransferase system. J. Bacteriol. 172, 1043-1050. 8. Debarbouille, M., Arnaud, M., Fouet, A., Klier, A. & Rapoport, G. (1990). The sacT gene regulating the sacPA operon in Bacillus subtilis shares strong homology with transcriptional antiterminators. J. Bacteriol. 172, 3966-3973. 9. Babitzke, P. & Yanofsky, C. (1993). Reconstitution of Bacillus subtilis trp attenuation in vitro with TRAP, the trp RNA-binding attenuation protein. Proc. Natl. Acad. Sci. USA 90, 133-137. 10. Otridge, J. & Gollnick, P. (1993). MtrB from Bacillus subtilis binds specifically to trp leader RNA in a tryptophan-dependent manner. Proc. Natl. Acad. Sci. USA 90, 128-132. 11. Gollnick, P. (1994). Regulation of the Bacillus subtilis trp operon by an RNA binding protein. Molec. Microbiol. 11, 991-997. 12. Manival, X., Yang, Y., Strub, M.P., Kochoyan, M., Steinmetz, M. & Aymerich, S. (1997). From genetic to structural characterization of a new class of RNA-binding domain within the SacY/BglG family of antiterminator proteins. EMBO J. 16, 5019-5029. 13. van Tilbeurgh, H., Manival. X., Aymerich, S., Lhoste, J.M., Dumas, C. & Kochoyan, M. (1997). Crystal structure of a new RNA-binding domain from the antiterminator protein SacY of Bacillus subtilis. EMBO J. 16, 5030-5036. 14. Antson, A.A., et al., & Gollnick, P. (1995). The structure of trp RNAbinding attenuation protein. Nature 374, 693-700. 15. Burd, C.G. & Dreyfuss, G. (1994). Conserved structures and diversity of functions of RNA-binding proteins. Science 265, 615-621. 16. Draper, D.E. (1995). Protein–RNA recognition. Annu. Rev. Biochem. 64, 593-620. 17. Martinussen, J., Glaser, P., Andersen, P.S. & Saxild, H.H. (1995). Two genes encoding uracil phosphoribosyltransferase are present in Bacillus subtilis. J. Bacteriol. 177, 271-274. 18. Scapin, G.S., Grubmeyer, C. & Sacchettini, J.C. (1994). Crystal structure of orotate phosphoribosyltransferase. Biochemistry 33, 1287-1294. 19. Smith, J.L., et al., & Satow, Y. (1994). Structure of the allosteric regulatory enzyme of purine biosynthesis. Science 264, 1427-1433. 20. Eads, J. C., Scapin, G., Xu, Y., Grubmeyer, C. & Sacchettini, J.C. (1994). The crystal structure of human hypoxanthine-guanine phosphoribosyltransferase with bound GMP. Cell 78, 325-334. 21. Somoza, J.R., Chin, M.S., Focia, P.J., Wang, C.C. & Fletterick, R.J. (1996). Crystal structure of the hypoxanthine-guanine-xanthine phosphoribosyltransferase from the protozoan parasite Tritrichomonas foetus. Biochemistry 35, 7032-7040. 22. Schumacher, M.A., Carter D., Roos, D.S., Ullman, B. & Brennan, R.G. (1996). Crystal structures of Toxoplasma gondii HGXPRTase reveal the catalytic role of a long flexible loop. Nat. Struct. Biol. 3, 881-887. 23. Henriksen, A., Aghajari, N., Jensen, K.J. & Gajhede, M. (1996). A flexible loop at the dimer interface is a part of the active site of the adjacent monomer of Escherichia coli orotate phosphoribosyltransferase. Biochemistry 35, 3803-3809. 24. Eads, J.C., Ozturk, D., Wexler, T.B., Grubmeyer, C. & Sacchettini, J.C. (1997). A new function for a common fold: the crystal structure of quinolinic acid phosphoribosyltransferase. Structure 5, 47-58. 25. Vos, S., de Jersey, J. & Martin, J.L. (1997). Crystal structure of Escherichia coli xanthine phosphoribosyltransferase. Biochemistry 36, 4125-4134. 26. Ghim, S.-Y., Nielsen, P. & Neuhard, J. (1994). Molecular characterization of pyrimidine biosynthesis genes from the thermophile Bacillus caldolyticus. Microbiology 140, 479-491.

349

27. Ghim, S.-Y. & Neuhard, J. (1994). The pyrimidine biosynthesis operon of the thermophile Bacillus caldolyticus includes genes for uracil phosphoribosyltransferase and uracil permease. J. Bacteriol. 176, 3698-3707. 28. Li, X., Weinstock, G.M. & Murray, B.E. (1995). Generation of auxotrophic mutants of Enterococcus facaelis. J. Bacteriol. 177, 6866-6873. 29. Elagöz, A., Abdi, A., Hubert, J.-C. & Kammerer, B. (1996). Structure and organization of the pyrimidine biosynthesis pathway genes in Lactobacillus plantarum: a PCR strategy for sequencing without cloning. Gene 182, 37-43. 30. Andersen, P.S., Martinussen, J. & Hammer, K. (1996). Sequence analysis and identification of the pyrKDbF operon from Lactococcus lactis including a novel gene, pyrK, involved in pyrimidine biosynthesis. J. Bacteriol. 178, 5005-5012. 31. Van de Casteele, M., Chen, P., Roovers, M., Legrain, C. & Glansdorff, N. (1997). Structure and expression of a pyrimidine gene cluster from the extreme thermophile Thermus strain Z05. J. Bacteriol. 179, 3470-3481. 32. Fleischmann, R.D., et al., & Venter, J.C. (1995). Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269, 496-512. 33. Philipp, W.J., et al., & Cole, S.T. (1996). An integrated map of the genome of the tubercle bacillus, Mycobacterium tuberculosis H37Rv, and comparison with Mycobacterium leprae. Proc. Natl. Acad. Sci. USA 93, 3132-3137. 34. Kaneko, T., et al., & Tabata, S. (1995). Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803 I. Sequence features in the 1Mb region from map positions 64% to 92% of the genome. DNA Res. 2, 153-166. 35. Ghim, S.-Y. & Switzer, R.L. (1996). Mutations in Bacillus subtilis PyrR, the pyr regulatory protein, with defects in regulation by pyrimidines. FEMS Microbiol. Lett. 137, 13-18. 36. Kim, J.H., Krahn, J.M., Tomchick, D.R., Smith, J.L. & Zalkin, H. (1996). Structure and function of the glutamine phosphoribosylpyrophosphate amidotransferase glutamine site and communication with the phosphoribosylpyrophosphate site. J. Biol. Chem. 271, 15549-15557. 37. Scapin, G.S., Ozturk, D.H., Grubmeyer, C. & Sacchettini, J.C. (1995). The crystal structure of the orotate phosphoribosyltransferase complexed with orotate and α-D-5-phosphoribosyl-1-pyrophosphate. Biochemistry 34, 10744-10754. 38. Krahn, J.M., Kim, J.H., Burns, M.R., Parry, R.R., Zalkin, H. & Smith, J.L. (1997). Coupled formation of an amidotransferase interdomain ammonia channel and a phosphoribosyltransferase active site. Biochemistry 36, 11061-11068. 39. Chen, S., et al., & Zalkin, H. (1997). Mechanism of the synergistic endproduct regulation of Bacillus subtilis glutamine phosphoribosylpyrophosphate amidotransferase by nucleotides. Biochemistry 36, 10718-10726. 40. Janin, J. & Chothia, C. (1990). The structure of protein–protein recognition sites. J. Biol. Chem. 265, 16027-16030. 41. Davies, D.R., Padlan, E.A. & Sheriff, S. (1990). Antibody–antigen complexes. Annu. Rev. Biochem. 59, 439-473. 42. Bode, W. & Huber, R. (1991). Ligand binding: proteinase–protein inhibitor interactions. Curr. Opin. Struct. Biol. 1, 45-52. 43. Wilson, I.A. & Stanfield, R.L. (1993). Antibody–antigen interactions. Curr. Opin. Struct. Biol. 3, 113-118. 44. Jensen, K.F. & Mygind, B. (1996). Different oligomeric states are involved in the allosteric behavior of uracil phosphoribosyltransferase from Escherichia coli. Eur. J. Biochem. 240, 637-645. 45. Glusker, J.P. (1991). Structural aspects of metal liganding to functional groups in proteins. Adv. Prot. Chem. 42, 1-76. 46. Janin, J. & Rodier, F. (1995). Protein–protein interaction at crystal contacts. Proteins 23, 580-587. 47. Weng, M., Nagy, P. L. & Zalkin, H. (1995). Identification of the Bacillus subtilis pur operon repressor. Proc. Natl. Acad. Sci. USA 92, 7455-7459. 48. Shin, B.S., Stein, A. & Zalkin, H. (1997). Interaction of Bacillus subtilis purine repressor with DNA. J. Bacteriol. 179, 7394-7402. 49. Amster-Choder, O. & Wright, A. (1992). Modulation of the dimerization of a transcriptional antiterminator protein by phosphorylation. Science 257, 1395-1398. 50. Aymerich, S. & Steinmetz, M. (1992). Specificity determinants and structural features in the RNA target of the bacterial antiterminator proteins of the BglG/SacY family. Proc. Natl. Acad. Sci. USA 89, 10410-10414. 51. Muchmore, C.R.A., Krahn, J.M., Kim, J.H., Zalkin, H. & Smith, J.L. (1998). Crystal structure of glutamine phosphoribosylpyrophosphate amidotransferase from Escherichia coli. Protein Sci. 7, 39-51.

350

Structure 1998, Vol 6 No 3

52. Ghim, S.-Y. & Switzer, R.L. (1996). Characterization of cis-acting mutations in the first attenuator region of the Bacillus subtilis pyr operon that are defective in pyrimidine-mediated regulation of expression. J. Bacteriol. 178, 2351-2355. 53. Heus, H. A. & Pardi, A. (1991). Structural features that give rise to the unusual stability of RNA hairpins containing GNRA loops. Science 253, 191-194. 54. Pley, H.W., Flaherty, K.M. & McKay, D.B. (1994). Three-dimensional structure of a hammerhead ribozyme. Nature 372, 68-74. 55. Aboul-Ela, F., Varani, G. & Karn, J. (1996). Structure of HIV-1 Tar RNA in the absence of ligands reveals a novel conformation of the trinucleotide bulge. Nucl. Acids Res. 24, 3974-3981. 56. Valegård, K., Murray, J.B., Stockley, P.G., Stonehouse, N.J. & Liljas, L. (1994). Crystal structure of an RNA bacteriophage coat protein–operator complex. Nature 371, 623-626. 57. Jancarik, J. & Kim, S.H. (1991). Sparse matrix sampling: a screening method for crystallization of proteins. J. Appl. Cryst. 24, 409-411. 58. Kingston, R.L., Baker, H.M. & Baker, E.N. (1994). Search designs for protein crystallization based on orthogonal arrays. Acta Cryst. D 50, 429-440. 59. Teng, T.Y. (1990). Mounting of crystals for macromolecular crystallography in a free-standing thin film. J. Appl. Cryst. 23, 387-391 60. Otwinowski, Z. (1993). Oscillation data reduction program. In Data Collection and Processing (Sawyer, N.I.L., & Bailey, S., eds) pp. 5662, Science and Engineering Research Council Daresbury Laboratory, Daresbury, UK. 61. French S. & Wilson K. (1978). On the treatment of negative intensity observations. Acta Cryst. A 34, 517-525. 62. Collaborative Computational Project, Number 4 (1994). The CCP4 suite: programs for protein crystallography. Acta Cryst. D 50, 760-763. 63. Otwinowski, Z. (1991). Maximum likelihood refinement of heavy atom parameters. In Isomorphous Replacement and Anomalous Scattering (Wolf, W., Evans, P.R., Leslie, A.G.W., eds) pp. 80-86, Science and Engineering Research Council Daresbury Laboratory, Daresbury, UK. 64. Glover, I.D., et al., & Tame, J.R.H. (1995). Structure determination of OppA at 2.3 Å resolution using multiple-wavelength anomalous dispersion methods. Acta Cryst. D 51, 39-47. 65. Jones, T.A., Zou, J.Y., Cowan, S.W. & Kjeldgaard, M. (1991). Improved methods for the building of protein models in electron-density maps and the location of errors in these models. Acta Cryst. A 47, 110-119. 66. Brünger, A.T. (1992). X-PLOR Version 3.1. A system for X-ray Crystallography and NMR, Yale University, New Haven, CT. 67. Terwilliger, T.C. & Berendzen J. (1995). Difference refinement: obtaining differences between two related structures. Acta Cryst. D 51, 609-618. 68. Navaza, J. (1994). AMoRe: an automated package for molecular replacement. Acta Cryst. A 50, 157-163. 69. Zuker, M. & Stiegler, P. (1981). Optimal computer folding for large RNA sequences using thermodynamics and auxiliary information. Nucl. Acids Res. 9, 133-148. 70. Barton, G.J. (1993). ALSCRIPT: a tool to format multiple sequence alignments. Protein Eng. 6, 37-40. 71. Kraulis, P.J. (1991). MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J. Appl. Cryst. 26, 946-950. 72. Nicholls, A., Sharp, K.A. & Honig, B. (1991). Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins 11, 281-296.