Article No. mb981848
J. Mol. Biol. (1998) 280, 137–151
Crystal Structure of λ-Cro Bound to a Consensus Operator at 3.0 Å Resolution
Ronald A. Albright and Brian W. Matthews*
Institute of Molecular Biology, Howard Hughes Medical Institute and Department of Physics, University of Oregon, Eugene, OR 97403-1229, USA
The structure of the Cro protein from bacteriophage λ in complex with a 19 base-pair DNA duplex that includes the 17 base-pair consensus operator has been determined at 3.0 Å resolution. The structure confirms the large changes in the protein and DNA seen previously in a crystallographically distinct low-resolution structure of the complex and, for the first time, reveals the detailed interactions between the side-chains of the protein and the base-pairs of the operator. Relative to the crystal structure of the free protein, the subunits of Cro rotate 53° with respect to each other on binding DNA. At the same time the DNA is bent by 40° through the 19 base-pairs. The intersubunit connection includes a region within the protein core that is structurally reminiscent of the "ball and socket" motif seen in the immunoglobulins and T-cell receptors. The crystal structure of the Cro complex is consistent with virtually all available biochemical and related data. Some of the interactions between Cro and DNA proposed on the basis of model-building are now seen to be correct, but many are different. Tests of the original model by mutagenesis and biochemical analysis corrected some but not all of the errors. Within the limitations of the crystallographic resolution it appears that operator recognition is achieved almost entirely by direct hydrogen-bonding and van der Waals contacts between the protein and the exposed bases within the major groove of the DNA. The discrimination of Cro between the operators OR3 and OR1, which differ in sequence at just three positions, is inferred to result from a combination of small differences, both favorable and unfavorable. A van der Waals contact at one of the positions is of primary importance, while the other two provide smaller, indirect effects. Direct hydrogen bonding is not utilized in this distinction.
© 1998 Academic Press
*Corresponding author
Keywords: Cro; bacteriophage lambda; conformational change; repressor; helix-turn-helix
Introduction

The Cro protein from bacteriophage λ and the catabolite gene activator protein from Escherichia coli were the first sequence-specific DNA-binding proteins for which three-dimensional structures were determined (Anderson et al., 1981; McKay & Steitz, 1981). Together with the λ-repressor protein (Pabo & Lewis, 1982) they were the prototypical helix-turn-helix DNA-binding proteins (Anderson et al., 1982; Ptashne, 1986; Brennan & Matthews, 1989; Harrison, 1991). The crystal structure of a complex of Cro with a 17 base-pair segment of operator (Brennan et al., 1990) subsequently confirmed the overall nature of the complex of Cro with DNA that had been anticipated from model building (Anderson et al., 1981; Ohlendorf et al., 1982). This structure was, however, limited to a nominal resolution of 3.9 Å in the a and b directions, and about 6 Å along c. For this reason neither the individual base-pairs nor the amino acid side-chains could be resolved (Brennan et al., 1990). Thus, it was impossible to ascertain the specific interactions between the protein and the DNA and to distinguish between alternative modes of interaction that had been proposed (Ohlendorf et al., 1982; Hochschild & Ptashne, 1986; Hochschild et al., 1986; Takeda et al., 1989; Benson & Youderian, 1989).

We here describe the structure of a complex between λ-Cro and a 19 base-pair DNA duplex (Figure 1) that includes the 17 base-pair consensus operator (Ptashne, 1986). The crystals diffract isotropically to 3.0 Å resolution and reveal, for the first time, the detailed interactions that are responsible for the sequence-specific recognition by λ-Cro of its different operators.

Present address: R. A. Albright, Department of Molecular Biophysics and Biochemistry, Yale University, J. W. Gibbs Building, Room 423, 260 Whitney Avenue, New Haven, CT 06520, USA.
Abbreviations used: HTH, helix-turn-helix motif; MIR, multiple isomorphous replacement.
0022–2836/98/260137–15 $30.00/0

Figure 1. DNA fragment used in the cocrystal. The overall fragment is a 19 base-pair duplex with 5′ single "sticky-ended" overhangs. The central 17 base-pairs correspond to a consensus operator. The dot indicates the pseudo-dyad axis of the operator. Regions in which the sequence palindrome is strictly upheld are shaded. The circles indicate the locations and the nomenclature used to identify the phosphate groups that are directly contacted by Cro.
Results

Quality of the structure

The relatively high quality of the final 2Fo − Fc map (Figure 2) at 3.0 Å resolution is presumably due to both the isotropic nature and the completeness of the diffraction data. Except for the disordered terminal residues 1 and 62 to 66, the electron density for the protein side-chains is well defined and the base-pairs are resolved, allowing the details of the Cro-operator interactions to be observed. All non-glycine main-chain torsion angles fall within allowed regions of a Ramachandran plot (not shown), with 94.2% in the most-favored regions as calculated by PROCHECK (Laskowski et al., 1993). These values are better than average for structures of this resolution. The average main-chain thermal factor is 29.7 Å² for residues 4 to 59, with higher values occurring at the chain termini and in the protruding β-hairpin near residue 47 (data not shown). In the crystal, the complex is statistically averaged about a 2-fold axis of symmetry. This affects the central three and the outermost base-pairs, which are not 2-fold symmetric (Figure 1), but not the remainder of the structure. The coordinates have been deposited in the Brookhaven Data Bank for immediate release (accession code 6CRO).

Overall complex

The Cro-operator complex is shown in Figures 3 and 4. Each 66 residue protein monomer consists of three α-helices and three β-strands: β1 (residues 3 to 6), α1 (7 to 14), α2 (16 to 23), α3 (27 to 36), β2 (39 to 45) and β3 (49 to 56) (Anderson et al., 1981; Ohlendorf et al., 1998). α2 and α3, along with the intervening residues, form the helix-turn-helix (HTH) unit. Cro has an unusual dimer interface, with the C-terminal part of the β3-strand (residues 54 to 56) extending away from its own globular domain toward the partner subunit. The β3-strand terminates in a rigid hook-like conformation (Pro57, Phe58 and Pro59), which allows Phe58 to be donated to the core of the partner subunit (Figure 3). The dimer interface allows substantial flexibility between the two subunits. Consistent with the low-resolution Cro-DNA complex described by Brennan et al. (1990), both the protein and the DNA are seen to undergo large induced-fit changes upon complex formation (Figure 4; see below).

The Cro dimer contacts the operator symmetrically, with the α3 "recognition" helices inserted into the major groove. Only the HTH unit of each Cro subunit makes direct contacts with the DNA bases. Those base-specific contacts shown to be most important for tight operator-binding (Takeda et al., 1989) all occur within the recognition helix (α3). Cro contacts the sugar-phosphate backbone of the operator in two regions per half-site, serving to fix the orientation of the HTH relative to the major groove. The last five residues of each monomer are disordered in the complex, as they are in the free protein (Anderson et al., 1981; Ohlendorf et al., 1998), but are located in the vicinity of the DNA backbone in the middle region of the operator. The Cro-operator interface buries 2751 Å² of surface area through the central 17 base-pairs, as calculated by rolling a sphere of 1.4 Å radius over the surface of the molecules using EDPDB (Zhang & Matthews, 1995).

Figure 2. Section of the final 2Fo − Fc electron density map (contoured at 1.0σ) showing base-pairs 2 through 8 of the operator and the recognition helix of Cro in the major groove. The sugar-phosphate backbone can be seen entering the Cro subunit channel on the left, with the main-chain crossing into the minor groove around residue 60. One subunit of Cro is white, with the portion of the subunit being donated by the other monomer colored green. The plus-strand of the operator is yellow and the minus-strand is pink.

Figure 3. View of the overall complex between Cro and the operator DNA. The direction of view is parallel with the major grooves of the DNA (and parallel with the recognition helices). Phe58, which penetrates the core of the partner subunit, is shown.

Interactions with bases in the major groove
Figure 4. (a) Conformation of the Cro dimer as seen in the crystal structure of the free protein (Ohlendorf et al., 1998). The elements of secondary structure are identified with the recognition α-helices shown in red. (b) The conformation adopted by the Cro dimer when in complex with operator DNA. In contrast to Figure 3, in which the direction of view is along the grooves of the DNA, the view direction here is perpendicular to the DNA.
As was anticipated in the original Cro structure determination (Anderson et al., 1981; Ohlendorf et al., 1982), specific recognition of the operator sequence primarily involves "direct-readout" interactions between the Cro HTH units and the edges of the base-pairs exposed in the major groove. The Cro dimer contacts 14 of the 17 base-pairs of the operator, directly interacting with a single base in each of the seven outermost positions per half-site (Figure 5). Strict 2-fold symmetry of the operator sequence is maintained in this region (Figure 1). The central three base-pairs are not contacted. In all, 14 of the 34 bases are involved; eight are contacted via hydrogen bonds, while the remaining six include van der Waals or hydrophobic interactions with the methyl groups of thymine bases. Details of the base-specific interactions are shown in Figure 6.

The only base-pairs that are invariant in all 12 natural operator half-sites are at positions 2 and 4 (Ptashne, 1986). Changes at either of these positions drastically impair Cro-binding both in vitro (Takeda et al., 1989) and in vivo (Benson & Youderian, 1989). Cro contacts bases at both of these positions in a highly specific manner, in each case utilizing multiple hydrogen bonds with all exposed polar groups of a single base. Gln27, the first residue of the recognition helix, makes a pair of hydrogen bonds with Ade(2). The approximate coplanarity of the carboxamide group and the base allows the side-chain NH2 of Gln27 to donate a hydrogen bond to N7 of the adenine, while the side-chain oxygen simultaneously accepts a hydrogen bond from N6 (Figures 5 and 6). Gln27 is further restricted by additional hydrogen bonding and van der Waals interactions within the protein-DNA interface. The side-chain donates a hydrogen bond to Gln16 and hydrophobic contacts (4.3 Å) are made with the methyl group of Thy(1).

Similarly, Ser28, the second residue of the recognition helix, is primarily responsible for recognition at base-pair 4. The side-chain hydroxyl donates a bifurcated hydrogen bond to both the O6 and N7 of G(−4) (Figures 5 and 6). Favorable van der Waals contacts (3.8 Å) occur between the Cβ and Oγ atoms of Ser28 and the methyl group of Thy(−5), further restricting possible side-chain rotation. The planar face of the peptide between Ser28 and Ala29 forms part of a hydrophobic pocket that favorably contacts (3.6 Å) the methyl group of T(−5). Near the outermost edge of the complex, the Cγ atom of Thr17 contacts (3.5 Å) the methyl group of T(1).

Asn31, the middle residue of the recognition helix, is primarily responsible for distinguishing base-pair identities at position 3. The side-chain NH2 of Asn31 donates hydrogen bonds to both the side-chain oxygen atom of Gln16 and a phosphate group of the DNA backbone. At the same time, the side-chain oxygen atom of Asn31 accepts a hydrogen bond from the Nδ1 atom of His35, while Cγ and Oδ1 make favorable van der Waals interactions (3.8 Å) with the methyl group of T(3). The rigidly held nature of Asn31 is reflected by the fact that the B-factor for this side-chain (30.0 Å²) is essentially equal to that of the Cro main-chain for residues 4 to 59.

The ε-amino group of the unbuttressed Lys32 side-chain is in weak density, but appears able to donate hydrogen bonds to both N7 and O6 of G(−6), as well as to N7 of G(−7) (Figures 5 and 6). This is the only case in which a single side-chain hydrogen bonds to more than one base. Overall, the base-specific interactions follow an alternating pattern, with van der Waals contacts occurring at positions 1, 3 and 5, and hydrogen bonding at positions 2, 4, 6 and 7.

Figure 5. Schematic of the Cro-operator interactions. The Figure corresponds to the left half of the operator, as shown in Figure 1, with the central base-pair at the bottom. The bases on the left correspond to the top strand; those on the right to the bottom strand. Hydrogen bonds are shown as continuous lines with arrows pointing from the donor to the acceptor. Broken lines depict van der Waals contacts. bkbn indicates a contact with the protein backbone. The dotted lines show presumed electrostatic interactions.

Figure 6. Stereo pair showing the inferred sequence-specific interactions between Cro and operator DNA. The Figure includes all direct hydrogen bonds (red broken lines) between the HTH unit of Cro (blue) and the operator DNA (green plus-strand, gray minus-strand). For simplicity, only Gln16 of the α2-helix is shown, and residues of the recognition helix not directly involved in operator contacts have been truncated to alanine. An extended hydrogen-bonding network extends from Gln27 at base-pair 2 to the DNA backbone and helps to position Asn31 for recognition at base-pair 3. Cut-away portions of the relevant van der Waals surfaces of Asn31 and T(3) are shown as lavender dots. The single-letter code is used for both residues and bases, labeling only those involved in direct contacts.

Interactions with the sugar-phosphate backbone

Cro contacts the sugar-phosphate backbone in two regions per half-complex, one on each strand. These interactions flank the major groove into which the recognition helix is placed and serve to position the HTH relative to the base-pairs, thereby determining the "context" in which recognition takes place. Cro donates hydrogen bonds to at least ten phosphate groups, or five per half-operator. Keeping the nomenclature used for the λ-repressor complex (Jordan & Pabo, 1988; Beamer & Pabo, 1992), these phosphate groups are identified as PA to PE (or PA′ to PE′) per half-site (Figures 1 and 5). A less optimal contact also appears to be made with an additional phosphate group, termed PX (or PX′). The character of the interactions in these two regions of phosphate contact differs substantially. The "outermost" region of interaction, including phosphate groups PA and PB, remains fairly solvent-exposed. Contacts are made by both ends of the HTH unit. The main-chain amide group of Gln16, the first residue of helix α2, donates a hydrogen bond to a phosphodiester oxygen atom at PA, which is also optimally positioned for favorable dipole interactions with the helix.
Phosphate group PB interacts with three side-chains: Gln16 (the first residue of α2), Asn31 (the middle residue of α3) and His35 (the next-to-last residue of α3). The side-chain -NH2 groups of Gln16 and Asn31 each donate a hydrogen bond to a phosphodiester oxygen atom at PB, while the His35 interaction is presumably induced-electrostatic (Figure 6).

The "innermost" region of interaction, including contacts with the phosphate groups PC, PD and PE (Figure 5), is much less solvent-exposed. Here the DNA backbone passes through a channel on the surface of each Cro subunit, becoming substantially buried (Figure 3). At the beginning of the channel, a phosphodiester oxygen atom at PC accepts a hydrogen bond from the main-chain amide group of Tyr26. This phosphate group is substantially buried by the side-chains of Val25 and Tyr26, both "turn" residues of the HTH, as well as Ala29, the third residue of α3. In the deepest part of the channel, PD is completely buried by Val25, Ala29, Ala33, Arg38, Phe58′, Pro59′, Ser60′ and Asn61′ (where the prime symbol signifies residues donated by the partner monomer). At the bottom of the channel, Arg38 forms a salt-bridge with PD, with the terminal NH1 and NH2 groups donating hydrogen bonds to both phosphodiester oxygen atoms (2.5 Å and 2.3 Å, respectively). In free Cro (Anderson et al., 1981; Ohlendorf et al., 1998), as well as in a Cro monomer mutant (Albright et al., 1996), a salt-bridge with Glu54 greatly restricts movement of the Arg38 side-chain. This effectively prealigns Arg38 for interaction with the DNA backbone and presumably reduces the entropic cost of binding. The salt-bridge is maintained on complex formation, with Nε of Arg38 donating a hydrogen bond to a side-chain oxygen atom of Glu54. As the Cro polypeptide chain crosses over the DNA backbone to enter the minor groove, the main-chain amide group of Ser60′ donates a hydrogen bond to a phosphodiester oxygen atom at PD. The serine side-chain is also close to this phosphate group, but the electron density is less clear, making a possible interaction uncertain. The last residue sufficiently ordered to be observed is Asn61′, which dips into the minor groove, but not deep enough to make any base-specific contacts. As will be described, the minor groove becomes particularly narrow and deep in this region at the center of the complex. The DNA backbone emerges from the protein channel at phosphate group PE, which forms a salt-bridge (2.6 Å) with Lys56′.

Beyond the other end of the channel, additional van der Waals interactions occur between the aromatic ring of Tyr26 and the phosphate group of Gua(−4), termed PX (Figure 5). The hydroxyl group of Tyr26 may donate a hydrogen bond to a phosphodiester oxygen atom at PX, but if so, the bond angle is far from optimal, lying approximately 65° out of the plane of the aromatic ring.
A Tyr26 to Phe mutant exhibits decreased in vivo activity (Eisenbeis et al., 1985), indicating that the hydroxyl group presumably enhances operator-binding.

Hydrogen-bonding network

An extended hydrogen-bonding network, not seen in free Cro (Anderson et al., 1981; Ohlendorf et al., 1998), forms along the Cro-operator interface, linking residues involved in specific recognition to contacts with the DNA backbone. Residues spanning the HTH are involved, including the first residue of α2 (Gln16), as well as the first, middle and second to last residues of α3 (Gln27, Asn31 and His35, respectively). Three of these four residues also interact directly with DNA phosphate groups. This target-induced network of interactions serves, in part, to restrict the positions of Gln27 and Asn31, both of which make base-specific contacts. In particular, the immobilization of the short Asn31 side-chain allows it to reliably distinguish between various base-pair identities at position 3 by its ability to form favorable van der Waals contacts only with T(3). Glutamine residues conserved in the HTH units of λ-repressor, 434-Cro and 434-repressor also participate in similar but less extensive hydrogen bonding networks (Aggarwal et al., 1988; Jordan & Pabo, 1988; Wolberger et al., 1988; Mondragon & Harrison, 1991; Beamer & Pabo, 1992; Rodgers & Harrison, 1993; Shimon & Harrison, 1993).
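The base contacts described in the preceding sections can be tabulated to check the stated totals (14 of 34 bases contacted, eight by hydrogen bonds and six by van der Waals contacts) and the alternating pattern. The sketch below is only a bookkeeping aid: it keeps one principal contact per position as described in the text, and the field layout, the "hbond"/"vdw" labels and the function names are illustrative assumptions rather than anything defined in the paper.

```python
from collections import Counter

# One operator half-site, simplified to the principal base contact at each
# position as described in the text. Layout and labels are illustrative only.
HALF_SITE_CONTACTS = [
    # (position, base, contacting protein group, contact type)
    (1, "T(1)", "Thr17 side-chain", "vdw"),
    (2, "A(2)", "Gln27 side-chain", "hbond"),
    (3, "T(3)", "Asn31 side-chain", "vdw"),
    (4, "G(-4)", "Ser28 side-chain", "hbond"),
    (5, "T(-5)", "Ser28-Ala29 peptide plane", "vdw"),
    (6, "G(-6)", "Lys32 side-chain", "hbond"),
    (7, "G(-7)", "Lys32 side-chain", "hbond"),
]


def contact_summary(contacts, n_half_sites=2):
    """Count contacted bases by contact type over the whole operator."""
    per_half = Counter(kind for _, _, _, kind in contacts)
    return {kind: n * n_half_sites for kind, n in per_half.items()}


def is_alternating(contacts, upto=6):
    """True if the contact type alternates over positions 1..upto."""
    kinds = [kind for pos, _, _, kind in sorted(contacts) if upto >= pos]
    return all(a != b for a, b in zip(kinds, kinds[1:]))


summary = contact_summary(HALF_SITE_CONTACTS)
print(summary)                             # {'vdw': 6, 'hbond': 8}
print(sum(summary.values()))               # 14
print(is_alternating(HALF_SITE_CONTACTS))  # True
```

Position 7 is left out of the alternation check because Lys32 hydrogen bonds to both G(−6) and G(−7), so the pattern alternates strictly only through position 6.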
Conformational changes in Cro

In the crystal structure of free Cro (Anderson et al., 1981; Ohlendorf et al., 1998), the recognition helices (α3 and α3′) are essentially anti-parallel, with a midpoint to midpoint separation of 34.2 Å (Figure 4(a)). Upon binding the operator, these helices rotate 53° and move closer together, reducing the midpoint to midpoint spacing to 29.3 Å (Figure 4(b)). This is achieved with only minimal change in the amount of buried surface area at the dimer interface (from 1312 Å² to 1401 Å²). Rotation of the subunits is not the result of a simple hinge-like motion, but rather is achieved by a series of modest torsion-angle changes along the β-ribbon that connects one monomer to the other. The largest of these main-chain torsion angle adjustments occurs in the β-strand at Glu53 and Glu54, a point of transition from the core part of the β2β3-ribbon to the solvent-exposed β3β3′-ribbon of the dimer interface.

In addition to changes in the relative positioning of the Cro subunits, perturbations occur within the core of each subunit. A difference-distance plot comparing the Cα-Cα distances in one monomer of operator-bound Cro to those of apo Cro (Ohlendorf et al., 1998) is shown in Figure 7. The largest feature indicates a 2.5 Å movement of the solvent-exposed β2β3-hairpin away from the center of the dimer interface (Val55), corresponding to a straightening of the β2β3-ribbon upon complex formation. In addition, residues 16 to 33 of the HTH unit move 0.5 Å to 1 Å away from the central β3-strand region near residue 52. A hydrogen bond linking the main-chain amide group of residue 58′ to the main-chain carbonyl group of residue 52 results in an associated movement in the subunit core, such that the recognition helix is withdrawn 0.7 Å from the tip of the Phe58′ aromatic ring. This outward movement of the HTH unit from the center of the dimer corresponds to a widening of the channel into which the DNA backbone fits.
Model-building indicates that this flexing of the core is essential to avoid steric interference that would otherwise occur between the recognition helix and the DNA backbone. The energetic cost of this adjustment is presumably modest, since similar shifts have been observed within the core of a monomer mutant of Cro in the absence of DNA (Albright et al., 1996). These movements do not require the making or breaking of any hydrogen bonds.
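As a rough illustration of how the relative orientation and midpoint separation of the two recognition helices might be measured from coordinates, the sketch below estimates each helix axis from the centroids of its first and last four Cα atoms. This is an assumed, simplified procedure, not the method used in the paper, and the inter-axis angle is only a proxy for the subunit rotation reported above. All function names are hypothetical, and the coordinates come from a synthetic ideal helix (2.3 Å radius, 1.5 Å rise, 100° per residue) rather than from the deposited structure.

```python
import math


def centroid(points):
    """Mean position of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def helix_axis(ca):
    """Approximate (unit axis, midpoint) of a helix from its C-alpha atoms.

    The axis runs from the centroid of the first four C-alpha atoms to the
    centroid of the last four (roughly one helical turn each).
    """
    start, end = centroid(ca[:4]), centroid(ca[-4:])
    d = tuple(e - s for s, e in zip(start, end))
    length = math.sqrt(sum(c * c for c in d))
    axis = tuple(c / length for c in d)
    midpoint = tuple(0.5 * (s + e) for s, e in zip(start, end))
    return axis, midpoint


def interhelix_geometry(ca_a, ca_b):
    """Return (angle between axes in degrees, midpoint separation)."""
    u, m_a = helix_axis(ca_a)
    v, m_b = helix_axis(ca_b)
    cos_angle = max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v))))
    return math.degrees(math.acos(cos_angle)), math.dist(m_a, m_b)


def ideal_helix(n, radius=2.3, rise=1.5, turn_deg=100.0):
    """Synthetic C-alpha trace of an ideal alpha-helix along +z (test data)."""
    return [
        (radius * math.cos(math.radians(turn_deg * i)),
         radius * math.sin(math.radians(turn_deg * i)),
         rise * i)
        for i in range(n)
    ]
```

With real data, `ca_a` and `ca_b` would be the Cα positions of residues 27 to 36 in each subunit. Evaluating the function for the free and the bound dimer and comparing the results gives the change in separation directly; the change in inter-axis angle need not equal the rotation obtained by superimposing whole subunits.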
Figure 7. Perturbations within the folding domain of each Cro subunit on binding operator DNA. The Figure is a difference-distance plot showing the changes in distance between all pairs of Cα atoms within one-half of the operator-bound Cro structure (this work) relative to the wild-type Cro structure (Ohlendorf et al., 1998). The figure includes residues 3 to 55 of one monomer plus 55′ to 60′ of the second monomer. Residues 1 to 2 and 61 to 66 are not shown, because they are very mobile or disordered. The locations of the α-helices and β-strands are indicated. Positive contours (continuous) and negative contours (broken) are drawn at increments of 0.5 Å with the zero contour omitted. Positive contours indicate increased separation in the operator complex.
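A difference-distance matrix of the kind plotted in Figure 7 can be computed without superimposing the two structures, because only internal distances are compared. The following is a minimal sketch under the sign convention of the figure (positive means increased separation in the complex). The function names and the three-atom toy coordinates are illustrative assumptions, not data from either structure.

```python
import math


def distance_matrix(coords):
    """All pairwise distances between (x, y, z) points."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def difference_distance(complex_ca, free_ca):
    """D[i][j] = d_complex(i, j) - d_free(i, j) for equivalent C-alpha atoms.

    Both lists must contain the same atoms in the same order.
    """
    if len(complex_ca) != len(free_ca):
        raise ValueError("structures must list the same atoms in the same order")
    d_c = distance_matrix(complex_ca)
    d_f = distance_matrix(free_ca)
    n = len(free_ca)
    return [[d_c[i][j] - d_f[i][j] for j in range(n)] for i in range(n)]


# Toy example: three atoms on a line; the third moves 2.5 Å outward.
free = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
bound = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (10.1, 0.0, 0.0)]
dd = difference_distance(bound, free)
print(round(dd[0][2], 2))  # 2.5
```

Contouring such a matrix at 0.5 Å increments, with the zero level omitted, would reproduce the style of plot described in the caption.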
Conformation of the DNA

As shown in Table 1, there are substantial perturbations throughout the bound operator relative to standard B-form DNA. The 19 base-pairs are bent toward the protein by 40°, as calculated using CURVES (Lavery & Sklenar, 1988). While there are no sharp kinks, bending is greatest at the T-C step between base-pairs 3 and 4 (13° roll) and at the C-C step of base-pairs 6 and 7 (12° tilt). Except for this C-C step, the middle of the operator is overwound, while the ends are slightly underwound, as reflected in the helical twist values. The C-C step corresponds to the region of the operator where the DNA backbone passes through the protein channel. The minor groove becomes sharply compressed at the center of the complex (Figure 3). The space available between the phosphate groups of Gua(−7) and Gua(−7′), also identified as PE and PE′, is reduced to less than 3.6 Å, compared to 7 Å to 9 Å elsewhere (Table 1). Also, the minor groove reaches a depth of 7 Å, as opposed to 3 to 5 Å elsewhere.

Notwithstanding the presence of the "sticky" single 5′-T/A overhangs (Figure 1), the DNA fragments do not form a pseudo-continuous helix throughout the crystal lattice. Rather, the "overhang" bases rotate away from the axis of the DNA and stack with the overhang bases of symmetry-related operators. These interactions occur in a hydrophobic region between the Phe14 residues of two symmetry-related Cro molecules. The outermost base-pair on each end of the DNA fragment stacks against the aromatic ring of Tyr51 of a symmetry-related Cro molecule. In terms of crystal packing, the two ends of the complex are indistinguishable, consistent with the complex being equally distributed between the two orientations. This lack of "overhang specificity" is also supported by the observation that it was possible to grow isomorphous crystals using operator fragments with "like" 5′-T/T overhangs.

Table 1. Conformation of bound DNA

Base-pair parameters
Base-pair   Propeller twist (deg.)   Buckle (deg.)   Minor groove width (Å)   Minor groove depth (Å)
1            24                      −17             –                        –
2            −2                      −4              –                        –
3             5                      −6              7.9                      2.8
4             1                      −10             8.5                      3.5
5            −8                       8              8.7                      3.3
6            −8                       8              7.3                      4.7
7            −4                       12             7.6                      5.4
8           (−27)                    (−3)            7.3                      4.6
9            11                      −3              3.6                      7.0

Base-pair step parameters
Step   Rise (Å)   Tilt (deg.)   Roll (deg.)   Helical twist (deg.)
1–2     3.1         7             4             38
2–3     3.3        −4            −7             23
3–4     3.3         5            13             32
4–5     3.1        −2             3             33
5–6     3.4        −5             7             35
6–7     3.4        12            −1             25
7–8    (3.6)      (−2)           (7)           (42)
8–9    (3.6)      (−3)          (−10)          (38)

The parameters are as defined by Saenger (1984) and Ravishanker et al. (1989). The calculations were performed using CURVES (Lavery & Sklenar, 1988). The outermost base-pair is not included in the analysis because of distortions due to crystal contacts (see the text). The values in parentheses are the averages for the half-operator and its "symmetry mate" (see the text). The width of the minor groove as defined by CURVES is the distance between the phosphorus atoms minus the van der Waals radii of the two phosphate groups (i.e. 2 × 2.85 Å).
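Two statements made about Table 1 can be checked with simple arithmetic: which base-pair steps carry the largest roll and tilt, and how a minor-groove width relates to the underlying phosphorus-phosphorus distance under the definition in the table footnote. The sketch below transcribes the step parameters from Table 1. The assignment of rows to steps 1-2 through 8-9, the dictionary layout and the function names are assumptions made for illustration, and parenthesized (symmetry-averaged) entries are entered as plain numbers.

```python
# Step parameters from Table 1: (rise in Å, tilt, roll, helical twist in deg.).
STEPS = {
    "1-2": (3.1, 7, 4, 38),
    "2-3": (3.3, -4, -7, 23),
    "3-4": (3.3, 5, 13, 32),
    "4-5": (3.1, -2, 3, 33),
    "5-6": (3.4, -5, 7, 35),
    "6-7": (3.4, 12, -1, 25),
    "7-8": (3.6, -2, 7, 42),
    "8-9": (3.6, -3, -10, 38),
}
RISE, TILT, ROLL, TWIST = range(4)

# Van der Waals radius of a phosphate group used in the table footnote (Å).
P_VDW_RADIUS = 2.85


def largest_step(steps, column):
    """Step whose parameter in the given column has the largest magnitude."""
    return max(steps, key=lambda step: abs(steps[step][column]))


def groove_width(pp_distance):
    """Minor-groove width (Å) from a P-P distance, per the footnote."""
    return pp_distance - 2 * P_VDW_RADIUS


def pp_distance(width):
    """Inverse of groove_width: P-P distance (Å) for a tabulated width."""
    return width + 2 * P_VDW_RADIUS


print(largest_step(STEPS, ROLL))   # 3-4
print(largest_step(STEPS, TILT))   # 6-7
print(round(pp_distance(3.6), 1))  # 9.3
```

Under this definition, the 3.6 Å width at the center of the operator corresponds to phosphorus atoms about 9.3 Å apart, compared with roughly 13 to 14.4 Å for the 7.3 to 8.7 Å widths tabulated elsewhere.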
Discussion

In terms of the large change observed in the relative orientation of the Cro subunits on binding operator, together with the associated bending of the DNA, the complex described here is in good agreement with the low-resolution structure reported by Brennan et al. (1990). The consistency of these two complexes, crystallized in different space groups using different DNA fragments under different conditions, strongly suggests that they correspond to the structure in solution. Due to the limited resolution of the previous complex (Brennan et al., 1990), neither side-chains nor base-pairs could be resolved, leaving many questions unanswered. The increased resolution of the structure presented here allows details of a Cro-operator complex to be seen for the first time. We begin by discussing the consistency of the complex with available biochemical and mutant data.
Consistency with biochemical and mutagenic data

Eisenbeis et al. (1985) utilized Cro mutants to demonstrate the importance of Gln27 and Ser28 for in vivo operator binding. Pakula et al. (1986) also identified Gln16, Tyr26, Gln27, Ser28, Lys32, Arg38 and Lys56 as residues that were important for DNA-binding or recognition. Three of these are now seen to contact base-pairs in the major groove, while the remaining four contact the phosphate backbone (Figure 5). A number of the same residues, as well as Asn31, were identified by Caruthers et al. (1986). A subsequent study provided strong evidence that Gln27 contacts base-pair 2 (Caruthers et al., 1987), as is observed. The observation that Ser28 to Ala mutants lose the ability to discriminate between some base-pairs at position 4 (Hochschild & Ptashne, 1986; Takeda et al., 1989) is now explained by the direct interaction of Ser28 with G(−4). Similarly, the inability of Lys32 mutants to discriminate at base-pair 6 (Hochschild et al., 1986) and base-pair 7 (Takeda et al., 1989) is now seen as the result of Lys32 simultaneously contacting G(−6) and G(−7). These direct contacts (Figure 5) also explain why the binding of Cro protects only three guanine N7 groups per half-site from methylation, namely G(−4), G(−6) and G(−7) (Johnson et al., 1979; Johnson, 1980), and why methylation of G(−4) and G(−6) interferes with Cro-binding (Hochschild & Ptashne, 1986). The almost total burial of the methyl group of T(−5) in a hydrophobic pocket is consistent with results from 19F NMR studies of a Cro-OR3 complex (Metzler & Lu, 1989). Phenylazide-mediated photocrosslinking studies (Chen & Ebright, 1993) have indicated that Cα of Thr17 is within 12 Å of parts of T(−2) and A(−3) in the specific operator complex, consistent with this structure showing these distances to be about 10 Å and 11 Å, respectively.

The structure is also in excellent agreement with contacts inferred from methylation protection and interference, ethylation interference and DNaseI footprinting studies (Johnson et al., 1979; Johnson, 1980), as well as hydroxyl radical footprinting experiments (Tullius & Dombroski, 1986). Additionally, Takeda et al. (1986) showed that lysine residues 32 and 56 were strongly protected from reductive methylation in the operator complex, Lys62 and Lys63 were less protected, while Lys8, Lys18 and Lys39 were not protected. In the complex, Lys32 makes specific contacts with two bases, while Lys56 forms a salt-bridge interaction with a phosphate group. Lys62 and Lys63 of the C-terminal tail probably contact the DNA backbone, but remain predominantly solvent-exposed and flexible. The remaining lysine residues do not interact with the DNA.

The C-terminal tail of Cro (residues 60 to 66) has been shown to be important for both specific and non-specific binding (Takeda et al., 1986; Hubbard et al., 1990). Mutant studies (Hubbard et al., 1990) reveal little side-chain preference at residue 60, consistent with the observation that it is the main-chain of Ser60 that hydrogen bonds to a phosphate group. The impact of charged side-chains is greatest at residues 61 and 62, then rapidly decreases and becomes inconsequential for residues 64 to 66. Similarly, deletion mutants show that operator-binding is drastically impaired if the negatively charged carboxy terminus is located at residue 62, but only moderately disfavored if at residue 63 (Hubbard et al., 1990). Deleting residues 65 and 66 has almost no impact. These observations are consistent with the Cro-operator complex presented here, which shows residue 61 to be positioned within the minor groove and the remainder of the tail to be disordered.
Residue 62 must remain close to the DNA backbone, whereas subsequent residues are progressively less restricted. There is no evidence in the present structure, however, that the C-terminal tail is bent back against the protein (Brennan et al., 1990).

The relative spacing of the operator half-sites has a critical impact on the ability of Cro to bind. A single base-pair insertion or deletion at the center of a Cro operator sequence effectively abolishes specific binding both in vitro (Takeda et al., 1989) and in vivo (Benson & Youderian, 1989). The structure suggests that such an alteration would not allow the specific Cro-operator interactions to be maintained simultaneously in both half-sites.

Previous models for the interaction of Cro with DNA

The first detailed model for the Cro-operator complex (Ohlendorf et al., 1982; Anderson et al., 1981) was developed by docking and energy-minimizing the structure of apo Cro with successively bent B-form DNA. Novel features included insertion of the recognition helices (α3 and α3′) into the major groove, allowing specific interactions with the base-pair edges, extensive contacts with the sugar-phosphate backbone flanking the major groove, substantial bending of the DNA toward the protein, as well as a small rotation of the Cro subunits. Many of the general features of that model are now confirmed and virtually all of the residues and base-pairs proposed to be important are, indeed, involved in protein-DNA interactions of some type. However, failure to take into account the large relative rotation of the Cro subunits (a possible caveat the authors mentioned) resulted in the misalignment of each subunit relative to the operator half-sites. As a result, some of the proposed contacts were correct but many are now seen to be in error.

An alternative model for Cro-binding was presented by Hochschild & Ptashne (1986) based, in part, on the altered specificity of certain Cro mutants for methylated operators. In particular, it was demonstrated that the Cro mutant Ser28→Ala lost the ability to discriminate at base-pair 4. This model effectively repositioned the Cro HTH unit relative to the base-pairs such that its interactions would be very similar to those for λ-repressor (Jordan & Pabo, 1988). In particular, Ser28 was proposed to contact G(−4) instead of A(−3). Another model (Takeda et al., 1989), based on binding studies using a systematically mutated operator site and Cro mutants, implied that the HTH units of Cro and λ-repressor recognized their operators in inherently different ways. An analysis of the in vivo binding of Cro to a set of 40 mutant λ-operators by Benson & Youderian (1989) did not effectively differentiate between the various alternatives.

The interaction predicted by Hochschild & Ptashne (1986) between Ser28 and G(−4) is in agreement with the present structure.
However, their assertion that Lys32 makes specific contacts at base-pairs 5 and 6 is only partially correct. Rather, Lys32 contacts base-pairs 6 and 7. The data of Takeda et al. (1989), as well as those of Benson & Youderian (1989), are, in general, consistent with the structure presented here. In particular, there is a contact at base-pair 5 but it comes from the protein main-chain. The inherent assumption at the time was that all interactions with the DNA bases came from protein side-chains. Accordingly, side-chains were shifted in each of the proposed Cro-operator models in an attempt to account for this contact, leading in turn to additional errors. The following specific predictions of the model proposed by Takeda et al. (1989) are, however, incompatible with the structure presented herein: that Tyr26 contacts T(1) and T(−2), that Ser28 contacts A(−3), that Lys32 contacts G(−4) and T(−5), and that Arg38 contacts G(−6) and (−7). Some errors in the original model tended to propagate, by default, to subsequent models. A case in point is Arg38, long thought to make important base-specific contacts because it was strictly
required for binding. Instead, Arg38 is now seen to make a multiply buttressed phosphate contact.

Is Cro designed to flex?

In solution, the structure of the Cro dimer is presumably very flexible (Matsuo et al., 1995; Ohlendorf et al., 1998). Compared to the crystal structure of free Cro, the Cro dimer undergoes substantial changes in conformation on binding operator, including both interdomain hinge-bending, and changes within each subunit that widen the channel occupied by the DNA backbone. Dimer flexibility, therefore, appears to be important for Cro function. Furthermore, the central β-sheet region of the dimer, better described as a series of slightly overlapping anti-parallel β-ribbons, appears designed to permit flexibility (Figure 4(a) and (b)). Large changes in the relative orientation of the subunits result primarily from a series of small torsion angle changes along the central β3β3′-ribbon, allowing the preservation of all hydrogen bonding interactions with only a slight change in buried surface area. Presumably, the crystal structures of the apo-protein and the complex of Cro with DNA represent just two of the many possible conformations that the protein dimer can adopt in solution.

A Val55 to Cys mutant of Cro allows the spontaneous formation of an intersubunit disulfide bond at the center of the dimer interface (Hubbard et al., 1990; Shirakawa et al., 1991; Griko et al., 1992; Baleja & Sykes, 1994). The resulting threefold decrease in operator affinity is thought to be due to restriction of subunit rotation. Consistent with these results, analysis of the side-chain geometry of residue 55 suggests that a disulfide bridge (Sowdhamini et al., 1989) is more compatible with free Cro than with the operator-bound form, although not ideal. Presumably, this mutant can still achieve the subunit rotation required for a specific complex, but at a higher energetic cost. Additionally, Hubbard et al.
(1990) have shown that this disulfide mutant binds about fourfold more tightly to non-specific DNA than does wild-type Cro. This might indicate that a change in intersubunit conformation occurs when switching from the non-specific to the specific complex. It might also suggest that the specific complex is more intimate than the non-specific one and places greater restriction on the geometry of the disulfide bridge.

A ball-and-socket-like joint

The burial of Phe58 of one monomer in the core of the other is unusual, and random mutagenesis of the Cro dimer interface has shown the identity of this residue to be critical for stability (Mossing & Sauer, 1990; Mollah et al., 1996). Phe58 is preceded by Pro57, which breaks away from the β3-strand, and is followed by Pro59, which is in the cis conformation. The protruding Phe-cisPro unit and the
surrounding pocket of Cro (Figure 8(a)) are reminiscent of the highly conserved molecular ball and socket joint found in the flexible switch regions of immunoglobulins (Figure 8(b)), as well as in T-cell receptors (Lesk & Chothia, 1988). In immunoglobulins, sizable movements of the VL-VH dimer relative to the CL-CH1 dimer are facilitated by the ability of a Phe-cisPro ``ball'' (of CH1) to undergo large rotations and small translations relative to the hydrophobic ``socket'' residues (of VH) into which it fits (Lesk & Chothia, 1988). In the case of Cro, the ball is contributed by the central β-sheet subdomain, while the socket is formed by the α-helical subdomain. Although the ball is donated by the partner monomer, its motion is linked to that of the central β-sheet region of the given monomer (Figure 7) by virtue of a main-chain hydrogen bond between Phe58′ and residue 52 (of β3), as well as by the preceding β3β3′ interactions. As with immunoglobulins, a three-residue covalent linker attaches the two regions. In Cro, however, a second connection between the subdomains is established by interactions directly preceding the α1-helix, including those of the β1β2-ribbon and a main-chain hydrogen bond between residue 7 (of α1) and residue 40 (of β2). These additional interactions may limit the ability of the Cro subdomains to undergo large rotations, making this a ``semi-fused'' joint. Nevertheless, relatively large movements within the globular core, such as those observed upon operator binding, can still occur without disruption of hydrogen bonds and with minimal change in solvent-exposed surface area. The small size of the Cro subunit might also facilitate structural flexibility. It has been shown by NMR that peptides with the sequence Ar-Pro-Ar, where Ar is an aromatic residue, form an unusually stable structure in which the proline residue is in the cis conformation (Yao et al., 1994).
The Phe-cisPro dipeptides seen in Cro and in the immunoglobulins both incorporate this motif.

Figure 8. (a) Stereo view showing the flexible ball and socket joint of Cro. The Phe-cisPro ball of one monomer (green) is inserted into the hydrophobic socket of the other monomer (blue). For simplicity, only the Cα main-chain trace is shown, together with the side-chains that form the ball and socket. The short, covalent linker between the ball and socket is shown in red, and hydrogen bonds between the β1 and β2 strands are shown as broken lines. (b) Ball and socket as seen in the Fab switch region of the immunoglobulin McPC603 (Satow et al., 1986; Brookhaven accession code 2MCP). As in (a), the ball is shown in green, the socket in blue and the covalent linker in red. All of the immunoglobulin residues are from the heavy chain.

Bending of the DNA

In the absence of Cro, the operator is thought to exist as regular B-form DNA (Chou et al., 1983; Ulrich et al., 1983; Kirpichnikov et al., 1984a,b; Baleja et al., 1990; Evertsz et al., 1991; Torigoe et al., 1991). The expected conformation of the DNA in the Cro-operator complex, however, has been a source of debate. A number of NMR studies have suggested that the structure of operator DNA changes on binding Cro, although the nature and magnitude of the conformational change were not specified (Kirpichnikov et al., 1984a,b; Lee et al., 1987; Metzler & Lu, 1989). Gel-mobility studies have indicated a bend angle of about 30° (Kim et al., 1989). Cyclization studies of 21 base-pair OR3-containing fragments have suggested a bend of 40° to 45° (Lyubchenko et al., 1991). Atomic force microscopy conducted under low-salt conditions, however, indicated a substantially larger value of 69(±11)° (Erie et al., 1994). The structure described here, as well as the previous low-resolution structure (Brennan et al., 1990), shows the operator to be bent by about 40°. There is no strong evidence for interactions with the DNA backbone extending beyond the 17 base-pair operator (Tullius & Dombroski, 1986). Model-building suggests that, in the absence of substantial additional bending or kinking, a longer piece of DNA would not make substantive further contacts with the protein. Therefore it seems reasonable to assume that the bending of about 40° seen in the two crystal structures should occur in solution as well.

The DNA is especially distorted near the center of the operator, where large propeller twisting (−33° and −20°) occurs at the base-pairs 8 and 8′, flanking the central base-pair (Table 1). This results in substantial base-pair destacking in the middle GC-rich region of the operator, consistent with Raman studies (Evertsz et al., 1991). The changes near the middle of the operator induced by Cro binding are consistent with distortions observed using 19F NMR (Metzler & Lu, 1989), as well as local overwinding inferred from circular dichroism spectroscopy (Torigoe et al., 1991). The propeller twisting appears to allow an additional non-Watson-Crick interaction between two adjacent base-pairs of one strand in the minor groove, with the N2 of G(−9) donating a hydrogen bond (2.7 Å) to O2 of C(−8). However, since this is a region of statistical disordering in the crystal structure, due to a breakdown of the DNA sequence symmetry, this interaction must remain in question.

Table 2. X-ray data collection statistics for native and derivative Cro-operator cocrystals

Crystal   Resolution  Rsym(a,c)  Total         Unique reflections  Unique reflections  Completeness    Unit cell  Riso(d)
          (Å)         (%)        observations  observed            calculated          (%)             (Å)        (%)
Native    20-3.0      7.2        132,786       7493 (3758)(a)      7506 (3760)(a)      99.8 (99.9)(a)  103.0      -
IUS(b)    20-3.0      7.9        125,248       7492                7506                99.8            102.9      11.3
ICS(b)    20-3.0      8.9        105,306       7496                7506                99.9            103.0      11.6
IUA(b)    20-3.3      7.4        143,116       5657                5661                99.9            102.5      10.0
ICA(b)    20-3.0      7.4        136,853       7402                7506                99.8            102.9      5.8
PtCl4     20-3.6      8.1        96,441        4386                4386                100.0           103.7      14.1

(a) Data were initially measured and processed assuming space group P2₁3 (see the text). This resulted in the inclusion of many reflections that were very weak and that were eliminated when the correct space group was confirmed as I2₁3. The numbers given in parentheses are for space group I2₁3.
(b) IUS has iodouridine at positions (−2) and (−2′), ICS has iodocytidine at positions (7) and (7′), IUA has iodouridine at position (−2), ICA has iodocytidine at position (7).
(c) Rsym gives the average agreement between independently measured intensities. Intensities with I < 2σ(I) were not included.
(d) Riso is the average change in structure amplitude due to the introduction of the heavy-atom.

Specificity of operator recognition: OR1 versus OR3

The in vivo function of Cro is determined primarily by its ability to distinguish between the OR1 and OR3 operators (Ptashne, 1986). These operators differ in just three base-pair positions (3′, 5′ and 8′), all located in the non-consensus half-site. Takeda et al. (1989) have systematically tested the effect of all possible single base-pair changes within an OR1 operator on Cro-binding affinity, including thymine to uracil substitutions to measure the contributions of individual methyl groups. These thermodynamic binding data can be combined with structural data to better understand how Cro distinguishes between the OR1 and OR3 sites.

The most energetically important position for this discrimination is at base-pair 3′, where TA (OR3) is favored by Cro over CG (OR1) by about 1.3 kcal/mol. This can be explained by differences in van der Waals interactions with Asn31. The side-chain of Asn31 is held in place by a full set of hydrogen bonds, linking it to two other residues (Gln16 and His35) as well as to the PB phosphate group of the DNA backbone (Figure 6). Favorable van der Waals interactions are made between the hydrophobic part of the Asn31 side-chain and the methyl group of Thy(3′), but would not occur with cytosine at this position.
A thymine to uracil mutation of this base has shown the methyl group to contribute 1.4 kcal/mol of favorable binding energy (Takeda et al., 1989), fully accounting for the observed preference for Thy(3′) of OR3 rather than Cyt(3′) of OR1.

The base-pair most preferred by Cro at position 5′ is AT, which occurs in the consensus half of all six operators, as well as the non-consensus half of three of them. The cocrystal structure also contains AT, allowing favorable van der Waals contacts between the rigid main-chain of the recognition helix and the methyl group of thymine (−5). AT does not, however, occur in the non-consensus half of either OR1 or OR3. Rather, the TA of OR1 is preferred by Cro over the CG of OR3 by about 0.8 kcal/mol. At first, this may seem counter-intuitive, since it is opposite to the overall pattern of operator preference exhibited by Cro, namely OR3 over OR1. Unexpectedly, the binding studies indicated that the discrimination between these two sets of (isosteric) base-pairs occurs primarily at the
non-contacted 5′ base. In particular, a thymine to uracil mutation showed the 5′ methyl group to provide 0.6 kcal/mol of favorable binding energy (Takeda et al., 1989), accounting for most of the observed difference between TA and CG. The benefit of the 5′ methyl group does not appear to come from direct interaction with any part of the protein. Rather, the effect of the methyl group appears to be on the local deformability of the operator. A similar situation is observed for λ-repressor, which also makes no direct contact with the 5′ methyl group (Beamer & Pabo, 1992).

The least-important position for distinguishing between OR1 and OR3 is base-pair 8′, where Cro favors the TA of OR3 over GC of OR1 by just 0.2 kcal/mol (Takeda et al., 1989). Consistent with this observation, Cro does not contact this position. However, this base-pair does undergo substantial propeller twisting upon Cro-binding (Table 1) associated with the overwinding of the middle of the operator. The weak preference of Cro for TA or AT over GC or CG might be explained by the ease of twisting against two hydrogen bonds rather than three.
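The three per-position preferences quoted above can be combined into a rough overall estimate. The following is only a sketch: it assumes the three contributions are independent and additive (the text does not claim this), and it assumes a temperature of 298.15 K for converting a free-energy difference into an affinity ratio. The names `DDG_BY_POSITION`, `net_preference` and `affinity_ratio` are illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

# Per-position preferences quoted in the text (kcal/mol).
# Sign convention (an assumption of this sketch): positive favors OR3,
# negative favors OR1.
DDG_BY_POSITION = {"3'": 1.3, "5'": -0.8, "8'": 0.2}


def net_preference(ddg_by_position):
    """Sum the per-position terms, ASSUMING they are additive."""
    return sum(ddg_by_position.values())


def affinity_ratio(ddg_kcal, temperature_k=298.15):
    """Convert a free-energy difference into a ratio of binding constants.

    The default temperature is an assumption, not a value from the text.
    """
    return math.exp(ddg_kcal / (R_KCAL * temperature_k))


net = net_preference(DDG_BY_POSITION)
ratio = affinity_ratio(net)
print(f"net preference for OR3: {net:.1f} kcal/mol")
print(f"implied OR3/OR1 affinity ratio: {ratio:.1f}")
```

Under these assumptions the unfavorable 5′ term cancels more than half of the 3′ term, which is one way to read the statement that the discrimination arises from a combination of small differences, both favorable and unfavorable.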
Conclusion

Both Cro and its operator undergo extensive induced-fit conformational changes upon complex formation, including the creation of an extended hydrogen-bonding network along the protein-DNA interface. Within the limitations of this 3 Å resolution structure determination, Cro appears to recognize its operator sites almost entirely through a combination of direct hydrogen bonds and van der Waals interactions with base-pair edges in the major groove. In each case, a given residue appears to make not one, but multiple base-specific hydrogen bonds, thereby increasing the all-or-nothing character of these recognition contacts. Accordingly, changes in direct hydrogen-bonding seem to be avoided in making finer distinctions between operator sites. Rather, Cro appears to utilize the more subtle differences afforded by changes in van der Waals interactions and, to a lesser extent, indirect effects. The key interaction used to distinguish between OR1 and OR3, for example, appears to involve the target-induced immobilization of a polar side-chain, which, in turn, positions the nonpolar part of the same side-chain to distinguish the key base-pair through van der Waals interactions.
Materials and Methods

Crystallization

Purified Cro protein was kindly supplied by Y. Takeda and stored as a frozen glycerol stock. Ammonium sulfate-precipitated protein was resuspended in 20 mM sodium cacodylate (pH 6.9) and concentrated by centrifugation to 15 mg/ml using Centricon 3000 Da cutoff filters. Oligonucleotides were synthesized without a trityl group, purified by reverse-phase HPLC
on a Hamilton PRP-1 column at 60 °C to eliminate potential secondary structure, then annealed by slow-cooling from 80 °C to 4 °C. Cocrystallization trials utilized a wide variety of DNA fragments ranging from 17 to 23 base-pairs in length, each containing the 17 base-pair consensus operator in the middle, and with differing end-types. The best cocrystals were obtained using a 19 base-pair DNA duplex with single-base 5′-overhangs, shown in Figure 1, mixed in 20% to 40% molar excess over Cro dimer. Hanging drops (5 μl) containing 7.0 mg/ml complex, 50 mM to 80 mM ammonium sulfate, and 11% to 14% polyethylene glycol (PEG) 3350 were allowed to equilibrate over a sealed reservoir containing double these concentrations, except complex. Cocrystals typically appeared only after 3.5 to 4 months at room temperature, growing up to 0.6 mm per side. Often cubic in shape, the cocrystals exhibited no birefringence and diffracted isotropically to a limit of 2.8 Å resolution under the conditions described below.
Space group ambiguity

Screened precession photographs revealed a cubic unit cell of dimensions a = b = c = 103.0 Å, with diffraction appearing to obey the condition h + k + l = 2n. These apparent systematic absences suggested that the space group was either I23 or I2₁3, but either such choice led to a number of ambiguities. If there was one complex per asymmetric unit, the solvent content parameter, VM, would be 1.65 Å³/Da, which is generally regarded as unacceptably low (Matthews, 1968). A more reasonable value of 3.3 Å³/Da for VM required that the complex be located on a crystallographic 2-fold axis with one monomer per asymmetric unit. Due to the inexact symmetry of the DNA sequence, however (Figure 1), the complex does not have exact 2-fold symmetry. This led to two possibilities. (1) The complex might have a distinct orientation in one or other of the lower-symmetry space groups P23 or P2₁3, with pseudo symmetry corresponding to I23 or I2₁3. (2) The complex might crystallize in a statistically disordered fashion in one or other of the higher-symmetry space groups I2₁3 or I23. Because of the presence of ``sticky ends'' on the DNA fragment (Figure 1), the single-orientation mode initially seemed most probable, since it might permit the DNA to stack end-to-end throughout the crystal (Jordan et al., 1985). Careful analysis, however, as described below, showed that the molecule is, in fact, statistically disordered in space group I2₁3.

Data collection and reduction

X-ray data were collected on a Xuong-Hamlin area detector (Hamlin, 1985; Howard et al., 1985; Zhang & Matthews, 1993) using graphite-monochromated CuKα radiation from a Rigaku rotating anode generator. Each data set was collected from a single crystal at room temperature. The high symmetry of the cubic space group allowed the unique data to be collected quickly and also resulted in essentially 100% complete data sets with high redundancy (Table 2). In order to allow analysis of the putative systematic absences, P-lattice symmetry was assumed during data collection and reduction. Inspection of possible reflections with indices h + k + l = 2n + 1 revealed only background intensities at all resolutions, and for all data sets. This is consistent with the systematic absences of an I-lattice space group. These absences were maintained for iodo-derivatives, which were constructed to be asymmetric by iodinating only one-half of the operator (see below). After the space group was confirmed to be I2₁3, the native data were reprocessed accordingly.

Multiple isomorphous replacement

Difference Patterson maps were evaluated both by visual inspection and by using the program VERIFY written by S. Roderick (personal communication). In order to clearly distinguish between space groups P2₁3 and I2₁3, two types of iodinated derivatives were prepared, symmetric and asymmetric. The symmetrically labeled derivatives, IUS (Iodo-U, symmetric) and ICS (Iodo-C, symmetric; Table 2), contained iodine atoms at two positions related by the molecular pseudo-dyad. The corresponding asymmetrically labeled derivatives, IUA (Iodo-U, asymmetric) and ICA (Iodo-C, asymmetric), contained an iodine atom at just one of these positions. Despite the deliberately introduced asymmetry, the difference Patterson map calculated in the lower-symmetry space group P2₁3 for IUA had peaks at exactly the same positions as IUS. Similarly, the peaks for ICA were at the same positions as for ICS. This strongly suggested that the single iodine atoms in the crystals of IUA and ICA were occupying positions that were statistically disordered. A variety of Fourier maps (not shown) gave an identical result.

Refinement of heavy-atom positions and occupancies was carried out using HEAVY (Terwilliger & Eisenberg, 1983). Statistics are shown in Table 3. Consistent with the Patterson functions, the single-iodine derivatives refined to not one, but two sites, identical with those found in their symmetrically labeled counterparts. Furthermore, for each asymmetrically labeled derivative, the refined occupancies at these two heavy-atom sites were equal, with values approximately half those observed in the corresponding symmetrically labeled counterparts (data not shown). These results confirmed the space group to be I2₁3 with an ``averaged'' half-complex in the asymmetric unit and a solvent content parameter, VM, of 3.3 Å³/Da. The complex is randomly distributed between two orientations related by a 180° rotation about the molecular pseudo-dyad, which is coincident with a crystallographic 2-fold axis.

Table 3. Heavy-atom refinement statistics for the Cro-DNA complex

Derivative  Sites  Number of    Resolution  Centric reflections    Acentric reflections
                   heavy atoms  range (Å)   Number    R-factor     Number    Phasing power
IUS         1      1            20-3.5      327       0.65         2044      1.41
ICS         1      1            20-3.5      326       0.72         2044      1.34
PtCl4       1      1            20-3.5      315       0.71         1916      1.10

The statistics quoted are for space group I2₁3 (see the text). The centric R-factor is $\sum \big|\,|F_{PH} - F_P| - F_H\big| \,/\, \sum |F_{PH} - F_P|$, where $F_P$, $F_{PH}$ and $F_H$ are the structure amplitudes of the native protein, the protein plus heavy atom, and the calculated scattering of the heavy atoms. The phasing power is $\langle F_H \rangle / \langle \big|\,|F_{PH} - F_P| - F_H\big| \rangle$, where $\langle\;\rangle$ indicates the root-mean-square value. The mean figure of merit is 0.57.

Molecular replacement

Molecular replacement was also used to provide an independent check on the multiple isomorphous replacement (MIR) solution. Searches were carried out using the program package ROTFUN (Zhang & Matthews, 1994) assuming the lower-symmetry space group of P2₁3. The best search model proved to be a truncated version of the low-resolution complex (Brennan et al., 1990), containing residues 4 to 59 of each Cro monomer with solvent-exposed residues reduced to alanine, together with base-pairs 1 to 7 from each side of the operator. The rotation search solution was 4.0σ above average (0.5σ above the next-highest peak) and the translation search solution was 17.5σ above average (6.1σ above the next-highest peak). The anticipated sites for the iodine atoms bound to the appropriate bases were in excellent agreement with those determined via difference Patterson functions (see the preceding section). The consistency of the two results gave additional confidence in the space group determination, as well as the location and orientation of the complex within the unit cell.
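Two pieces of the space-group reasoning can be checked with simple arithmetic: the body-centering reflection condition and the solvent-content parameter VM. The sketch below is illustrative only. The cell edge (103.0 Å) comes from the text, and the figure of 24 asymmetric units per cell is the standard number of general positions for I2₁3. The complex mass of 27,600 Da is an assumption chosen for illustration and is not stated in the text. The function names are hypothetical.

```python
from itertools import product


def is_allowed_body_centred(h, k, l):
    """I-lattice reflection condition: h + k + l must be even."""
    return (h + k + l) % 2 == 0


def fraction_allowed(max_index):
    """Fraction of indices with |h|,|k|,|l| <= max_index that survive
    I-centring (the origin 000 is excluded)."""
    indices = [
        hkl
        for hkl in product(range(-max_index, max_index + 1), repeat=3)
        if hkl != (0, 0, 0)
    ]
    allowed = [hkl for hkl in indices if is_allowed_body_centred(*hkl)]
    return len(allowed) / len(indices)


def matthews_coefficient(cell_edge, n_asym_units, mass_da):
    """V_M (cubic angstroms per dalton) for a cubic unit cell."""
    return cell_edge ** 3 / (n_asym_units * mass_da)


CELL_EDGE = 103.0        # angstroms, from the text
N_ASYM_I213 = 24         # general positions in space group I2(1)3
COMPLEX_MASS = 27_600.0  # daltons; ASSUMED for illustration only

# One whole complex versus one half-complex per asymmetric unit.
vm_full = matthews_coefficient(CELL_EDGE, N_ASYM_I213, COMPLEX_MASS)
vm_half = matthews_coefficient(CELL_EDGE, N_ASYM_I213, COMPLEX_MASS / 2)

print(f"fraction of reflections allowed by I-centring: {fraction_allowed(10):.2f}")
print(f"V_M, whole complex per asymmetric unit: {vm_full:.2f}")
print(f"V_M, half complex per asymmetric unit:  {vm_half:.2f}")
```

The halving of the allowed reflections is in line with the native entry of Table 2, where 7493 unique reflections in the primitive setting reduce to 3758 in the body-centered one. With the assumed mass, halving the contents of the asymmetric unit doubles VM, which mirrors the contrast between 1.65 and 3.3 Å³/Da described in the text.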
Refinement

The IUS, ICS and PtCl4 derivatives were used to calculate an MIR map at 3.5 Å resolution. The asymmetrically labeled derivatives were not included because they essentially corresponded to half-occupancy versions of their symmetric counterparts. This map clearly showed the overall arrangement of the protein and DNA, and agreed with the location suggested by molecular replacement. Rigid-body refinement of the search model using TNT (Tronrud et al., 1987; Tronrud, 1992) marginally reduced the R-factor from 40.0% to 39.1% for data between 9.0 Å and 4.5 Å resolution. Inspection of the MIR map, however, revealed that many parts of the model needed to be substantially adjusted. After adjusting the protein and DNA using FRODO (Jones, 1982), the central three base-pairs were added, resulting in a continuous 17 base-pair blunt-end DNA fragment. B-factors were fixed at 35 Å². After an initial round of positional refinement and model-building in space group P2₁3, the resulting map was further improved by combining the model phases with those from MIR. Further rounds of model-building and positional refinement reduced the R-factor to 28.1% for data from 20 Å to 4.0 Å resolution.

From this point on, full I2₁3 symmetry was used with the model in one asymmetric unit consisting of a Cro monomer plus an ``averaged'' half-operator. The model for the DNA was normal, throughout, except having two bases, each at 50% occupancy, bonded to a single ribose moiety at positions where the sequence palindrome breaks down (Figure 1). These bases were allowed to refine in a normal manner, except that they ignored the presence of each other. It was not necessary to average any part of the sugar-phosphate backbone and this was not done. Because a crystallographic 2-fold axis passed directly through the central base-pair of the operator, it required special care to maintain geometric and B-factor restraints between the averaged half-operator and its symmetry-mate. Sugar puckering could not be unambiguously determined at the given resolution and was initially restrained to 2′-endo. These restraints were relaxed in the final stages of refinement after model-building was unable to remedy the repeated inversion of two sugar chiral centers. Consistent with this observation, Raman studies have suggested that some sugars in this complex are not 2′-endo (Evertsz et al., 1991). Flanking DNA bases were added to the model, as were previously truncated side-chains and residues near the protein termini.

The initial R-factor of this averaged half-complex was 29.5% at 3.0 Å resolution, but dropped quickly with alternating rounds of positional and correlated B-factor refinement, interspersed with model building. The final R-factor is 19.3% (using all data from 20 Å to 3.0 Å resolution), with root-mean-square deviations for bond lengths, bond angles and correlated B-factors of 0.014 Å, 2.6° and 2.0 Å², respectively. Both Fo − Fc and 2Fo − Fc omit maps were used to check the entire structure. A Luzzati (1952) plot (not shown) suggests a coordinate error of 0.32 Å. The final refined model, 891 atoms in all, contains Cro residues 2 through 61, the averaged half-operator, seven water molecules, and a single sulfate ion (at 33.3% occupancy) straddling a crystallographic 3-fold axis. The crystallographic 2-fold symmetry was used to generate the other half of the complex, after which the statistically averaged bases at the three central and the outermost positions (Figure 1) were edited to give the correct DNA sequence.
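The R-factors quoted in the Refinement section follow the conventional crystallographic definition, R = Σ| |Fo| − |Fc| | / Σ|Fo|. The sketch below shows that arithmetic on a handful of synthetic amplitudes invented purely for illustration; they are not data from this structure. It also assumes that the observed and calculated amplitudes are already on a common scale, a step that a refinement package would normally handle. The function name `r_factor` is hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor.

    R = sum(| |Fo| - |Fc| |) / sum(|Fo|), with both lists assumed to be
    on the same scale and in the same reflection order.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    if not f_obs:
        raise ValueError("at least one reflection is required")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Synthetic structure-factor amplitudes, for illustration only.
f_obs_demo = [100.0, 80.0, 60.0, 40.0]
f_calc_demo = [90.0, 85.0, 50.0, 45.0]

r_demo = r_factor(f_obs_demo, f_calc_demo)
print(f"R = {100 * r_demo:.1f}%")
```

On this definition a perfect model gives R = 0, so the drop from 29.5% to 19.3% reported above reflects progressively better agreement between the calculated and observed amplitudes.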
Acknowledgments

We thank Dr Yoshinori Takeda for generous gifts of purified Cro protein, Drs Steve Roderick, Ray Jacobson and Phil Pjura for numerous helpful discussions, Dr Cai Zhang for advice on molecular replacement, Dr Larry Weaver for help in preparing Figures, and Dr Dale Tronrud for insights regarding refinement of the statistically disordered DNA. This work was supported in part by NIH grant GM20066 to B.W.M.
References Aggarwal, A. K., Rodgers, D., Drottar, M., Ptashne, M. & Harrison, S. C. (1988). Recognition of a DNA operator by the repressor of phage 434: a view at high resolution. Science, 242, 899± 907. Albright, R. A., Mossing, M. C. & Matthews, B. W. (1996). High-resolution structure of an engineered Cro monomer shows changes in conformation relative to the native dimer. Biochemistry, 35, 735± 742. Anderson, W. F., Ohlendorf, D. H., Takeda, Y. & Matthews, B. W. (1981). Structure of the Cro repressor from bacteriophage l and its interaction with DNA. Nature, 290, 754± 758. Anderson, W. F., Takeda, Y., Ohlendorf, D. H. & Matthews, B. W. (1982). Proposed a-helical supersecondary structure associated with protein-DNA recognition. J. Mol. Biol. 159, 745± 751. Baleja, J. D. & Sykes, B. D. (1994). A nuclear magnetic resonance study of the DNA-binding af®nity of Cro repressor protein stabilized by a disul®de bond. Biochem. Cell. Biol. 72, 95 ± 108. Baleja, J. D., Pon, R. T. & Sykes, B. D. (1990). Solution structure of phage l half-operator DNA by use of NMR, restrained molecular dynamics, and NOEbased re®nement. Biochemistry, 29, 4828± 4839.
150 Ê crystal Beamer, L. J. & Pabo, C. O. (1992). Re®ned 1. 8 A structure of the repressor-operator complex. J. Mol. Biol. 227, 177± 196. Benson, N. & Youderian, P. (1989). Phage l Cro protein and cI repressor use two different patterns of speci®c protein-DNA interactions to achieve sequence speci®city in vivo. Genetics, 121, 5 ± 12. Brennan, R. G. & Matthews, B. W. (1989). The helixturn-helix DNA-binding motif. J. Biol. Chem. 264, 1903± 1906. Brennan, R. G., Roderick, S. L., Takeda, Y. & Matthews, B. W. (1990). Protein-DNA conformational changes in the crystal structure of a l Cro-operator complex. Proc. Natl Acad. Sci. USA, 87, 8165± 8169. Caruthers, M. H., Barone, A. D., Beltman, J., Bracco, L. P., Dodds, D. R., Dubendorff, J. W., Eisenbeis, S. J., Gayle, R. B., Prosser, K., Rosendahl, M. S., Sutton, J. & Tang, J.-Y. (1986). The interaction of Cro, cI, and Escherichia coli RNA polymerase with operators and promoters. In Protein Structure, Folding and Design, pp. 221± 228, Alan R. Liss, Inc., New York. Caruthers, M. H., Gottlieb, P., Bracco, L. & Cummins, L. (1987). The thymine 5-methyl group: a proteinDNA contact site useful for redesigning Cro repressor to recognize a new operator. In Structure & Expression (Sarma, R. H. & Sarma, M. H., eds), vol. 1, pp. 157± 166, Adenine Press, New York. Chen, Y. & Ebright, R. H. (1993). Phenyl-azide-mediated photocrosslinking analysis of Cro-DNA interaction. J. Mol. Biol. 230, 453±460. Chou, S.-H., Hare, D. R., Wemmer, D. E. & Reid, B. R. (1983). Sequence-speci®c recognition of deoxyribonucleic acid. Chemical synthesis and nuclear magnetic resonance assignment of the imino proteins of lambda OR3 operator deoxyribonucleic acid. Biochemistry, 22, 3037± 3041. Eisenbeis, S. J., Nasoff, M. S., Noble, S. A., Bracco, L. P., Dodds, D. R. & Caruthers, M. H. (1985). Altered Cro repressors from engineered mutagenesis of a synthetic cro gene. Proc. Natl Acad. Sci. USA, 82, 1084± 1088. Erie, D. A., Yang, G., Schultz, H. C. 
& Bustamante, C. (1994). DNA bending by Cro protein in speci®c and nonspeci®c complexes: implications for protein site recognition and speci®city. Science, 266, 1562± 1566. Evertsz, E. M., Thomas, G. A. & Peticolas, W. L. (1991). Raman spectroscopic studies of the DNA Cro binding site conformation, free and bound to Cro protein. Biochemistry, 30, 1149± 1155. Griko, Y. V., Rogov, V. V. & Privalov, P. L. (1992). Domains in lambda Cro repressor. A calorimetric study. Biochemistry, 31, 12701± 12705. Hamlin, R. (1985). Multiwire area X-ray diffractometers. Methods Enzymol. 114, 416±452. Harrison, S. C. (1991). A structural taxonomy of DNAbinding domains. Nature, 353, 715± 719. Hochschild, A. & Ptashne, M. (1986). Homologous interactions of l repressor and l Cro with the l operator. Cell, 44, 925± 933. Hochschild, A., Douhan, J., III & Ptashne, M. (1986). How l repressor and l Cro distinguish between OR1 and OR3. Cell, 47, 807± 816. Howard, A. J., Nielsen, C. & Xuong, N. H. (1985). Software for a diffractometer with multiwire area detector. Methods Enzymol. 114, 452± 471. Hubbard, A. J., Bracco, L. P., Eisenbeis, S. J., Gayle, R. B., Beaton, G. & Caruthers, M. H. (1990). Role of the Cro carboxy-terminal domain and ¯exible dimer
Structure of a Cro-Operator Complex linkage in operator and nonspeci®c DNA binding. Biochemistry, 29, 9241± 9249. Johnson, A. D. (1980). Dissertation, Harvard University, Boston, MA. Johnson, A., Meyer, B. J. & Ptashne, M. (1979). Interactions between DNA-bound repressors govern regulation by the l phage repressor. Proc. Natl Acad. Sci. USA, 76, 5061± 5065. Jones, T. A. (1982). FRODO: a graphics ®tting program for macromolecules. In Computational Crystallography (Sayre, D., ed.), pp. 303± 317, Oxford University Press, Clarendon, Oxford. Jordan, S. R. & Pabo, C. O. (1988). Structure of the Ê resolution: Details of the lambda complex at 2.5 A repressor-operator interactions. Science, 242, 893± 899. Jordan, S. R., Whitcombe, T. V., Berg, J. M. & Pabo, C. O. (1985). Systematic variation in DNA length yields highly ordered repressor-operator cocrystals. Science, 230, 1383± 1385. Kim, J., Zwieb, C., Wu, C. & Adhya, S. (1989). Bending of DNA by gene-regulatory proteins: construction and use of a DNA bending vector. Gene, 85, 15 ± 23. Kirpichnikov, M. P., Hahn, K. D., Buck, F., RuÈterjans, H., Chernov, B. K., Kurochkin, A. V., Skryabin, K. G. & Bayev, A. A. (1984a). 1H NMR study of the interaction of bacteriophage l Cro protein with the OR3 operator. Evidence for a change of the conformation of the OR3 operator on binding. Nucl. Acids Res. 12, 3551± 3561. Kirpichnikov, M. P., Kurochkin, A. V., Chernov, B. K. & Skryabin, K. G. (1984b). Interactions between cro repressor and the model speci®c binding site. FEBS Letters, 175, 317±320. Laskowski, R. A., MacArthur, M. W., Moss, D. S. & Thornton, J. M. (1993). PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallog. 26, 283± 291. Lavery, R. & Sklenar, H. (1988). The de®nition of generalized helicoidal parameters and of axis curvature for irregular nucleic acids. J. Biomol. Struct. Dynam. 6, 63 ± 91. Lee, S. 
J., Shirakawa, M., Akutsu, H., Kyogoku, Y., Shiraishi, M., Kitano, K., Shin, M., Ohtsuka, E. & Ikehara, M. (1987). Base sequence-specific interactions of operator DNA fragments with the λ-cro repressor coupled with changes in their conformations. EMBO J. 6, 1129–1135.
Lesk, A. M. & Chothia, C. (1988). Elbow motion in the immunoglobulins involves a molecular ball-and-socket joint. Nature, 335, 188–190.
Luzzati, P. V. (1952). Traitement statistique des erreurs dans la determination des structures cristallines. Acta Crystallog. 5, 802–810.
Lyubchenko, Y., Shlyakhtenko, L., Chernov, B. & Harrington, R. E. (1991). DNA bending induced by Cro protein binding as demonstrated by gel electrophoresis. Proc. Natl Acad. Sci. USA, 88, 5331–5334.
Matsuo, H., Shirakawa, M. & Kyogoku, Y. (1995). Three-dimensional dimer structure of the λ-Cro repressor in solution as determined by heteronuclear multidimensional NMR. J. Mol. Biol. 254, 668–680.
Matthews, B. W. (1968). Solvent content of protein crystals. J. Mol. Biol. 33, 491–497.
McKay, D. B. & Steitz, T. A. (1981). Structure of catabolite gene activator protein at 2.9 Å resolution suggests binding to left-handed B-DNA. Nature, 290, 744–749.
Metzler, W. J. & Lu, P. (1989). λ Cro repressor complex with OR3 operator DNA: 19F nuclear magnetic resonance observations. J. Mol. Biol. 205, 149–164.
Mollah, A. K. M. M., Aleman, M. A., Albright, R. A. & Mossing, M. C. (1996). Core packing defects in an engineered Cro monomer corrected by combinatorial mutagenesis. Biochemistry, 35, 743–748.
Mondragon, A. & Harrison, S. C. (1991). The phage 434 Cro/OR1 complex at 2.5 Å resolution. J. Mol. Biol. 219, 321–334.
Mossing, M. C. & Sauer, R. T. (1990). Stable, monomeric variants of λ Cro obtained by insertion of a designed β-hairpin sequence. Science, 250, 1712–1715.
Ohlendorf, D. H., Anderson, W. F., Fisher, R. G., Takeda, Y. & Matthews, B. W. (1982). The molecular basis of the DNA-protein recognition inferred from the structure of cro repressor. Nature, 298, 718–723.
Ohlendorf, D. H., Tronrud, D. E. & Matthews, B. W. (1998). Refined structure of Cro repressor protein from bacteriophage λ suggests both flexibility and plasticity. J. Mol. Biol. 280, 129–136.
Pabo, C. O. & Lewis, M. (1982). The operator-binding domain of λ repressor: structure and DNA recognition. Nature, 298, 443–447.
Pakula, A. A., Young, V. B. & Sauer, R. T. (1986). Bacteriophage λ cro mutations: effects on activity and intracellular degradation. Proc. Natl Acad. Sci. USA, 83, 8829–8833.
Ptashne, M. (1986). A Genetic Switch. Gene Control and Phage λ, Blackwell, Palo Alto.
Ravishanker, G., Swaminathan, S., Beveridge, D. L., Lavery, R. & Sklenar, H. (1989). Conformational and helicoidal analysis of 30 ps of molecular dynamics on the d(CGCGAATTCGCG) double helix: ``curves'', dials and windows. J. Biomol. Struct. Dynam. 6, 669–699.
Rodgers, D. W. & Harrison, S. C. (1993). The complex between phage 434 repressor DNA-binding domain and operator site OR3: structural differences between consensus and non-consensus half-sites. Structure, 1, 227–240.
Saenger, W. (1984).
Principles of Nucleic Acid Structure, Springer-Verlag, New York.
Satow, Y., Cohen, G. H., Padlan, E. A. & Davies, D. R. (1986). Phosphocholine binding immunoglobulin Fab McPC603. An X-ray diffraction study at 2.7 Å. J. Mol. Biol. 190, 593–604.
Shimon, L. J. W. & Harrison, S. C. (1993). The phage 434 OR2/R1-69 complex at 2.5 Å resolution. J. Mol. Biol. 232, 826–838.
Shirakawa, M., Matsuo, H. & Kyogoku, Y. (1991). Intersubunit disulfide-bonded λ-Cro protein. Protein Eng. 4, 545–552.
Sowdhamini, R., Srinivasan, N., Shoichet, B., Santi, D. V., Ramakrishnan, C. & Balaram, P. (1989). Stereochemical modeling of disulfide bridges. Criteria for introduction into proteins by site-directed mutagenesis. Protein Eng. 3, 95–103.
Takeda, Y., Kim, J. G., Caday, C. G., Steers, E., Jr, Ohlendorf, D. H., Anderson, W. F. & Matthews, B. W. (1986). Different interactions used by Cro repressor in specific and nonspecific DNA binding. J. Biol. Chem. 261, 8608–8616.
Takeda, Y., Sarai, A. & Rivera, V. M. (1989). Analysis of the sequence-specific interactions between Cro repressor and operator DNA by systematic base substitution experiments. Proc. Natl Acad. Sci. USA, 86, 439–443.
Terwilliger, T. C. & Eisenberg, D. (1983). Unbiased three-dimensional refinement of heavy-atom parameters by correlation of origin-removed Patterson functions. Acta Crystallog. sect. A, 39, 813–817.
Tronrud, D. E. (1992). Conjugate-direction minimization: An improved method for the refinement of macromolecules. Acta Crystallog. sect. A, 48, 912–916.
Tronrud, D. E., Ten Eyck, L. F. & Matthews, B. W. (1987). An efficient general-purpose least-squares refinement program for macromolecular structures. Acta Crystallog. sect. A, 43, 489–503.
Torigoe, C., Kidokoro, S., Takimoto, M., Kyogoku, Y. & Wada, A. (1991). Spectroscopic studies on λ cro protein-DNA interactions. J. Mol. Biol. 219, 733–746.
Tullius, T. D. & Dombroski, B. A. (1986). Hydroxyl radical ``footprinting'': high-resolution information about DNA-protein contacts and application to λ repressor and Cro protein. Proc. Natl Acad. Sci. USA, 83, 5469–5473.
Ulrich, E. L., John, E.-M., Gough, G. R., Brunden, M. J., Gilham, P. T., Westler, W. M. & Markley, J. L. (1983). Imino proton assignments in the proton nuclear magnetic resonance spectrum of the lambda phage OR3 deoxyribonucleic acid fragment. Biochemistry, 22, 4362–4365.
Wolberger, C., Dong, Y., Ptashne, M. & Harrison, S. C. (1988). Structure of a phage 434 cro/DNA complex. Nature, 335, 789–795.
Yao, J., Feher, V. A., Espejo, B. F., Reymond, M. T., Wright, P. E. & Dyson, H. J. (1994).
Stabilization of a type VI turn in a family of linear peptides in water solution. J. Mol. Biol. 243, 736–753.
Zhang, X.-J. & Matthews, B. W. (1993). STRAT: a program to optimize X-ray data collection on an area detector system. J. Appl. Crystallog. 26, 457–462.
Zhang, X.-J. & Matthews, B. W. (1994). Enhancement of the method of molecular replacement by incorporation of known structural information. Acta Crystallog. sect. D, 50, 675–686.
Zhang, X.-J. & Matthews, B. W. (1995). EDPDB: a multifunctional tool for protein structure analysis. J. Appl. Crystallog. 28, 624–630.
Edited by P. E. Wright (Received 24 December 1997; received in revised form 3 April 1998; accepted 8 April 1998)