Structure, Vol. 9, 717–723, August, 2001, 2001 Elsevier Science Ltd. All rights reserved.
PII S0969-2126(01)00632-3
Beyond the “Recognition Code”: Structures of Two Cys2His2 Zinc Finger/TATA Box Complexes Scot A. Wolfe,1 Robert A. Grant,1 Monicia Elrod-Erickson,1 and Carl O. Pabo1,2 1 Department of Biology Howard Hughes Medical Institute Massachusetts Institute of Technology Cambridge, Massachusetts 02139
Summary Background: Several methods have been developed for creating Cys2His2 zinc finger proteins that recognize novel DNA sequences, and these proteins may have important applications in biological research and gene therapy. In spite of this progress with design/selection methodology, fundamental questions remain about the principles that govern DNA recognition. One hypothesis suggests that recognition can be described by a simple set of rules—essentially a “recognition code”—but careful assessment of this proposal has been difficult because there have been few structural studies of selected zinc finger proteins. Results: We report the high-resolution cocrystal structures of two zinc finger proteins that had been selected (as variants of Zif268) to recognize a eukaryotic TATA box sequence. The overall docking arrangement of the fingers within the major groove of the DNA is similar to that observed in the Zif268 complex. Nevertheless, comparison of Zif268 and the selected variants reveal significant differences in the pattern of side chain-base interactions. The new structures also reveal side chainside chain interactions (both within and between fingers) that are important in stabilizing the protein-DNA interface and appear to play substantial roles in recognition. Conclusions: These new structures highlight the surprising complexity of zinc finger-DNA interactions. The diversity of interactions observed at the protein-DNA interface, which is especially striking for proteins that were all derived from Zif268, challenges fundamental concepts about zinc finger-DNA recognition and underscores the difficulty in developing any meaningful recognition code. Introduction Like many other studies of macromolecular interactions, studies of Cys2His2 zinc finger-DNA complexes [1, 2] have been motivated by the desire to understand the physical/chemical principles of recognition and by the hope that such knowledge may allow the design of useful new molecules. The crystal structure of the Zif268DNA complex [3, 4] provided the foundation for thinking about zinc finger-DNA recognition, and the simple, repeating pattern of DNA contacts observed in the Zif268 2
Correspondence:
[email protected]
complex formed the basis for proposals about a recognition code [5, 6]. Numerous selection and design studies have explored the idea of a code, but almost all of these have (in one way or another) implicitly incorporated the pattern of Zif268 contacts into the design and/or analysis of these experiments. Our laboratory has employed a less biased approach toward exploring the versatility of the zinc finger motif. We developed a selection protocol [7] that used the Zif268 framework, but fingers were added sequentially in a way that allows context-dependent interactions and that places relatively few restrictions on the pattern of base contacts or the docking arrangement of the fingers. This method has successfully produced zinc finger variants that recognize novel DNA sequences, including the TATA box of the adenovirus major late promoter, a portion of a p53 binding site, and a site normally recognized by a nuclear hormone receptor [7]. In this paper we describe the structures of two zinc finger proteins that recognize the TATA box (Figure 1). These structures are interesting because the sequence of the A/T-rich TATA site is radically different from that of the G/C-rich Zif268 site. Since the selection protocol that was used for the creation of these proteins made no assumptions about the specific pattern of DNA contacts, these structures allow a critical evaluation of the validity of a recognition code. Results and Discussion TATA-Specific Fingers Dock in the Major Groove Like the Fingers of Zif268 To understand how variants derived from Zif268 could recognize such a radically different DNA sequence, we crystallized and determined the structures of two TATAspecific zinc finger proteins (TATAZF and TATAZF*; Table 1; Figure 1). These proteins represent the two somewhat different consensus sequences obtained for finger 2 within the final set of TATA binding zinc finger proteins [7]. For simplicity, we will focus on the TATAZF complex, which has been biochemically characterized in greater detail, and will only comment on the TATAZF* complex when discussing differences in the DNA contacts made by finger 2. As previously reported, the DNA binding specificity of TATAZF has been defined by DNA site selections [8]. Furthermore, this protein displays good specificity in vitro, and several experiments show that these fingers are functional in vivo ([9]; J.A. Hurt, J.K. Joung, E.I. Ramm, and C.O.P., unpublished data). Although the sequential selection protocol [7] placed relatively few constraints on the orientation of the fingers in the major groove (other than the use of a canonical linker as each new finger was added), the crystal structure shows that the docking arrangement of the fingers in the TATAZF complex is very similar to the original Zif268 complex (Figure 1a). As expected, the TATA binding site adopts Key words: protein-DNA recognition; TATA box; X-ray crystallography; zinc finger; recognition code
Structure 718
a B-DNA conformation (with 10.7 base pairs per turn). There is no major distortion of the DNA in the TATAZF complex, and thus it is radically different from the severely bent structure observed for this DNA sequence when it is complexed with TBP [10, 11]. Analysis of the helical parameters for this site shows that it has an enlarged major groove (as observed in the Zif268 DNA [12]) and that it has a compressed minor groove (ⵑ3.5 A˚ wide) within its adenine tract [13]. TATAZF Recognizes Its Binding Site by Using a Different Pattern of DNA Contacts than Zif268 Although the TATAZF and Zif268 fingers dock in the major groove in a similar manner (with comparable patterns of phosphate contacts and similar linker conformations between the fingers), the pattern of interacting side chain and base positions is rather different in these complexes. The fingers in Zif268 make a relatively simple pattern of base contacts, and this pattern has provided the basis for all discussions of a zinc finger-DNA recognition code. Base-specific contacts involve residues at positions ⫺1, 2, 3, and 6 of the recognition helix, and the majority of these contacts are made to the “primary” strand of the DNA (Figure 1e). According to this pattern, the residue at position 6 of the ␣ helix will contact the 5⬘ base on the primary strand, the residue at position 3 will contact the central base, the residue at position ⫺1 will contact the 3⬘ base, and the residue at position 2 will contact the flanking base on the opposite (secondary) strand. In this arrangement, each finger recognizes a 4 bp site that overlaps with the binding site of neighboring fingers by one base pair (Figure 1d). Surprisingly, this simple pattern breaks down in the TATAZF complex (Figure 1d). Zif268-like DNA contacts are observed from position ⫺1 and position 3 in the TATAZF fingers, but there are unexpected contacts involving positions 1, 2, and 6 of the recognition helix. Several of these contacts are described in detail below, but the fundamental observation is that the pattern of interacting residue and base positions in the TATAZF fingers is different from that in Zif268, and it is thus different from that used in the formulation of zinc finger-DNA recognition codes. Several surprising features emerge from an analysis of side chain-base interactions in the TATAZF complex.
Figure 1. Comparison of the DNA Recognition of the TATAZF, TATAZF*, and Zif268 Fingers (a) Left panel: structure of TATAZF bound to a 16 bp oligonucleotide. At this level of detail, this structure is virtually indistinguishable from the TATAZF*-DNA complex. Right panel: superposition of the TATAZF and Zif268 structures [4] (protein: red and gray, respectively; DNA: blue and magenta, respectively). Each protein uses a tandem array of three fingers to recognize its binding site. Every finger is folded into a ␣ motif around a single metal ion (displayed as a van der Waals surface). The ␣ helix of each finger fits into the major groove, and base-specific recognition is mediated by residues within (or immediately preceding) this helix. (b) Aligned sequence of the three Cys2His2 zinc fingers from Zif268 [26], with the residues involved in base contacts highlighted in blue [3, 4]. The ␣-helical region of each finger is denoted by a cylinder (above), and  strands are indicated by filled arrows. The numbers just below the orange cylinder indicate the position of the residue
within the recognition helix, with ⫺1 indicating the residue immediately preceding the helix. (c) Sequences of the recognition helices from Zif268, TATAZF, and TATAZF* displaying all of the positions (⫺1, 1, 2, 3, 5, and 6) that had been randomized by Greisman and Pabo [7]. Residues involved in base contacts are shown in blue. Residues involved in interfinger interactions are indicated in red. (d) Diagram comparing the base contacts made by Zif268, TATAZF, and TATAZF*. Hydrophobic or van der Waals interactions are indicated by black dots, and hydrogen bonds are indicated by arrows. Color coding of the DNA region contacted by each finger emphasizes the size (and overlapping arrangement) of the binding sites. Red bars between residues at positions 6 and ⫺1 on neighboring fingers indicate the presence of interfinger interactions. (e) Diagram comparing the DNA recognition pattern of the fingers from Zif268 and the TATA complexes. Solid lines denote DNA interactions that occur in most of the fingers, while dashed lines indicate DNA interactions that occur less frequently.
Two Cys2His2 Zinc Finger/TATA Box Complexes 719
Table 1. Data Collection and Refinement Statistics Data Collection and MIR Phasinga TATAZF Complex Data Set
ZF Native-2
ZF Native-1
ZF I-dU #1
ZF I-dU #2
ZF* Native-1
Resolution range (A˚) Measured reflections Unique reflections Completeness (percent) Rsymb (percent) Average I/I Risoc (percent) Phasing powerd (acentrics/centrics) Figure of merit, MIR
20–2.0 496,779 43,331 99.5 (96.2) 7.4 (56.7) 28.52 (6.58)
20–2.2 181,526 31,505 98.8 (93.1) 6.4 (39.0) 22.36 (9.55)
20–2.85 133,389 15,452 100 (100) 7.1 (69.3) 9.94 (7.19) 9.3 0.85/0.87
20–2.50 83,303 21,938 98.4 (96.8) 14.3 (34.8) 11.27 (6.12) 16.5 1.33/1.21
20–2.2 358,651 32,875 99.5 (95.5) 6.6 (44.8) 30.04 (9.35)
0.37
Refinement TATAZF Complex
ZF
ZF*
Resolution range (A˚) Reflections, F ⬎ 2(F) Number of non-H atoms Rcryste (percent) Rfreef (percent) Number of water molecules Average B value (A˚2) Rmsd from ideal, bond lengths, A˚ (protein/DNA) Rmsd from ideal, bond angles, ⬚ (protein/DNA)
20–2.0 40,380 2,980 23.3 26.3 227 41.6 0.007/0.006 1.235/1.042
20–2.2 31,492 2,997 22.5 26.2 224 38.6 0.006/0.005 1.242/0.927
a
Values in parentheses are for the highest resolution shell. Rsym ⫽ ⌺ (|I ⫺ ⬍I⬎|) / ⌺ (I) c Riso ⫽ ⌺ (|Inat ⫺ Ider|) / ⌺ (Inat) d Phasing power ⫽ ⌺ (|FH|) / ⌺ (||FPH| ⫺ |FP ⫹ FH||) e Rcryst ⫽ (⌺(h,k,l)||Fo| ⫺ |Fc|| / ⌺(h,k,l)|Fo|) f Rfree was calculated for the 10% of reflections omitted from the refinement. b
Position 1 of the recognition helix, which is not even considered in discussions of a recognition code, makes important DNA contacts: i.e., in fingers 2 and 3, this residue interacts with two bases in the secondary strand (Figure 2). This creates a 5 bp binding site for some of the fingers and thus leads to an unexpected 2 bp overlap in recognition sites of neighboring fingers (Figures 1d and 1e). (DNA site selections performed on TATAZF confirm that the residues at position 1 have a direct role in DNA recognition. Thus, there is a preference for thymine at positions 4 and 7 within the secondary strand of the DNA [8] even though there are no base contacts with the corresponding adenines on the primary strand.)
Our new crystal structures also show that there are differences in the pattern of contacts from positions 2 and 6 of the recognition helix. In each finger of the TATAZF complex, the residue at position 2 of the recognition helix contacts the complementary base in the same base pair recognized by position ⫺1 of the recognition helix (Figure 1e). This secondary-strand interaction from position 2 occurs instead of (or in addition to) the “expected” contact with the 5⬘ base of the neighboring subsite. Another distinctive and interesting interaction (which again breaks the expected pattern of contacts) involves the arginine at position 6 of finger 3; this arginine contacts a guanine on each strand of the DNA, Figure 2. Unexpected DNA Contacts Are Observed from Position 1 of the Recognition Helix In fingers 2 and 3 of both TATA complexes, the amino acid at position 1 interacts with the secondary strand of the DNA as well as with the sugar-phosphate backbone. In this example, Leu-75 in finger 3 of TATAZF makes contacts to bases T7* and A6* on the secondary strand of the DNA. (Note: In this and subsequent figures, interactions of special interest are indicated by dotted lines. Numbers near dotted lines indicate the distance in angstroms. Zinc fingers are modeled as ribbons that are color coded by finger as shown in Figure 1d. The primary DNA strand is colored brown, and the secondary stand is colored magenta. Asterisks denote DNA bases in the secondary strand. Portions of Figure 1d that correspond to the displayed region of the structure are shown as an inset for each of the more detailed pictures.)
Structure 720
Figure 3. Interfinger Interactions Play Important Roles in Stabilizing the Protein-DNA Interface (a) In TATAZF, Thr-24 (position 6 of finger 1) interacts with the side chain of Gln-46 (position ⫺1 of finger 2) which in turn packs against the methyl group of T6. Presumably, these interactions stabilize the DNA contact from Gln-46 to A5. Thr-24 also interacts with Arg27 to further organize the interfinger junction. Arg-27, in turn, makes interfinger contacts to Pro-34 and to the backbone carbonyl of Ser-45. (b) Although they have received no attention in previous studies, interfinger interactions also occur in the Zif268 structure [4]. As illustrated here, both Thr-52 (position 6 of finger 2) and Asp-76 (position 2 of finger 3) help to stabilize the interaction of Arg-74 (position ⫺1 of finger 3) with a guanine on the primary strand of the DNA. Interfinger interactions between position 6 of one finger and position ⫺1 of a neighboring finger also occur in the GLI structure [27]. (Coloring scheme is the same as in Figure 2.)
and thus makes contacts that are different from the “classical” arginine ⇔ guanine interaction which is a recurring theme in the Zif268 structure. (An interaction that bridges guanines on opposite strands of the DNA has been previously observed for a lysine at position 6 in a designed zinc finger protein [14].) The different pattern of side chain-base interactions observed in the TATAZF structure demonstrates that a wider range of DNA contacts are accessible from each position of the recognition helix than predicted by the recognition code, and it illustrates the remarkable versatility of zinc fingers in DNA recognition. We note that in the TATAZF complexes there are an essentially equal number of contacts with the “primary” and “secondary” strands of the DNA. Interfinger Interactions Stabilize the Protein-DNA Interface The TATAZF structures also reveal important side chainside chain interactions between neighboring fingers. For example, threonine is highly conserved at position 6 of finger 1 in the set of clones obtained in the TATA selections [7], but our structures show that it does not directly
contact the DNA. Rather, the ␥-methyl group of Thr-24 stabilizes the orientation of the side chain of Gln-46 (at position ⫺1 of the neighboring finger), and thus helps organize the protein-DNA interface at the interfinger junction (Figure 3a). This type of interfinger interaction also occurs between Ala-52 and Thr-74 at the other finger-finger interface of the TATAZF complex. Corresponding interactions between residues at position 6 of one finger and position ⫺1 of a neighboring finger occur in other zinc finger structures, but they have previously been overlooked. For example, a similar interfinger interaction also occurs between Thr-52 and Arg-74 in the Zif268 complex (Figure 3b), and this interfinger interaction appears to complement the prominent Arg-74 ⇔ Asp-76 interaction that orients Arg-74 to interact with the DNA. Intrafinger Side Chain-Side Chain Interactions Are Important Elements of DNA Recognition Side chain-side chain interactions that stabilize DNA contacts also are observed within individual fingers. These interactions explain the correlated sequence dif-
Two Cys2His2 Zinc Finger/TATA Box Complexes 721
Figure 4. Side Chain-Side Chain Interactions within the Recognition Helix Stabilize the Protein-DNA Interface In the selected TATA proteins, whenever glutamine was present at position 6 of finger 2, threonine was absolutely conserved at position 2 [7]. The structure of TATAZF* explains this preference; the ␥-methyl group of Thr48 and the methyl group of T8 pack against opposite faces of Gln-52. Presumably, this stabilizes the interaction of Gln-52 with A7. This is analogous to the interfinger interaction described between fingers 1 and 2 of TATAZF in Figure 3a. (Coloring scheme is the same as in Figure 2.)
ferences between the two consensus sequences that were obtained in the TATA zinc finger selections [7]. These sequences differ in finger 2, with correlated changes in the residues present at positions 2, 3, and 6 (Figures 1b and 1c). One consensus sequence (represented by TATAZF*) has Thr at position 2, Gly or Ala at position 3, and Gln at position 6; the other (represented by TATAZF) has Ala at position 2, Ser at position 3, and Ala at position 6. Prior to the solution of the TATAZF and TATAZF* cocrystal structures, it seemed plausible that the observed sequence differences in finger 2 might reflect different docking arrangements of these fingers relative to the DNA. However, the structures show that the docking of finger 2 is virtually indistinguishable in the two complexes. Instead, we find that the correlated sequence changes characterizing these two different consensus sequences actually reflect different patterns of intrafinger interactions that stabilize the protein-DNA interface. Thus, in TATAZF* the ␥-methyl group of Thr-48 (position 2) contacts the side chain of Gln-52 (position 6) and stabilizes its interaction with the DNA (Figure 4). In TATAZF, both Ser-49 (position 3) and Ala-52 (position 6) help to stabilize the protein-DNA interface. The hydroxyl of Ser-49 hydrogen bonds to the backbone N-H and carbonyl of position ⫺1, and stabilizes the backbone conformation of this DNA contacting residue. As described above, Ala-52 (position 6) makes a contact to Thr-74 (position ⫺1 of the neighboring finger). Implications for Future Zinc Finger Studies The structures of the TATA complexes have profound implications for the selection and design of novel zinc finger proteins, and they highlight the complexity of protein-DNA recognition inherent for even such an apparently “simple” DNA binding motif. Although the TATA fingers are derived from Zif268 and dock in the major groove in a manner similar to that of the “parent” molecule, the pattern of interacting residue and base positions cannot be described within the framework of any standard recognition code. Our structures thus show that DNA recognition by canonically docked zinc fingers is much more versatile and “plastic” than had been inferred from the Zif268 structure and from the structures of variants with changes in a single finger [15]. These
unexpected contacts emphasize the importance of using an “unbiased,” context-sensitive selection method (like our strategy for sequential selection [7]) for generating optimal multifinger proteins. The numerous side chain-side chain interactions within and between fingers at the protein-DNA interface emphasize the importance of these supporting interactions. The consensus sequences observed in the final pool of selected fingers [7, 8] clearly confirm the value of these interactions; side chain-side chain packing at the protein-DNA interface plays a critical stabilizing role in zinc finger-DNA recognition. This fact, coupled with the potential for each finger to recognize a 5 bp subsite (with the help of residues at positions 1 and 2), implies that intra- and interfinger interactions will need to be considered more carefully in future selection, modeling, and design studies. In a deeper sense, these limitations of a recognition code, as revealed in this paper, emphasize the combinatorial and geometric complexity [16] inherent in even a relatively “simple” protein-DNA interface such as that of the zinc finger-DNA complex. The possibilities for DNA recognition are richer, and the pattern of contacts more complex, than most previous studies have assumed. Biological Implications Cys2His2 zinc fingers are the most common motif found in the human genome [17, 18] and thus appear to constitute the most “successful” and versatile DNA binding domain found in higher eukaryotes. The simplicity and stability of this fold, the apparent modularity of zinc finger proteins, and the simple repeating pattern of contacts observed in the first DNA bound zinc finger structure (Zif268) have made zinc fingers the domain of choice for the design of proteins with novel DNA binding specificity. The apparent simplicity of the zinc finger motif, however, belies the complexity that exists in its interactions with DNA, and the simple pattern of contacts observed in the Zif268 structure cannot be assumed to pertain in all Zif268 variants obtained in design and selection studies. The TATAZF structures reported here require a
Structure 722
reevaluation of many of the early hypotheses about zinc finger-DNA recognition. DNA recognition is not simply defined by a series of one-to-one interactions between amino acids at specific positions on the zinc finger recognition helix and DNA bases at specific positions in the binding site. We find numerous contacts that are inconsistent with previously proposed recognition codes, and we also find that side chain-side chain interactions (both within and between fingers) play critical roles in DNA recognition. Experimental Procedures Crystallization The TATAZF and TATAZF* peptides are clones number six and two, respectively, from the TATA box sequential selections [7], and our structural studies used a 90 amino acid region corresponding to the portion that had been used for the Zif268 crystallization [3]. These proteins were overexpressed and purified as previously described [15]. The oligonucleotides used to form the 16-mer duplex contain the TATA box of the adenovirus major late promoter (top strand, 5⬘-GACGCTATAAAAGGAG-3⬘; bottom strand, 5⬘-TCCTTTTA TAGCGTCC-3⬘). These were synthesized and purified as previously described [19]. When annealed, this duplex has a 1 bp overhang at each end. All crystallization steps were carried out in an anaerobic chamber ([O2] ⫽ ⵑ1 ppm). Prior to complex formation, the peptides were refolded by the use of 1.5 equivalents/finger CoCl2 in 100 mM Bis-Tris Propane (pH 8.0). The protein-DNA complex was then formed by the mixing of equal volumes of protein (1 mM) and DNA (1.1 mM) in 100 mM Bis-Tris Propane (pH 8.0). Crystals were grown via hanging-drop vapor diffusion, starting with a 1:1 mixture of complex and well solution (30%–35% PEG-400, 100–200 mM NaCl, and 10–50 mM Bis-Tris Propane [pH 8.0]). Data Collection Crystals were looped directly out of the drop, flash frozen in liquid nitrogen, and then mounted in a liquid nitrogen cooled nitrogen stream (⫺145⬚C). Crystals belonged to the space group P64 with unit cell dimensions a ⫽ b ⫽ 104.8 A˚, c ⫽ 104.2 A˚ for TATAZF and a ⫽ b ⫽ 105.1 A˚, c ⫽ 104.6 A˚ for TATAZF*. Diffraction data were typically collected with a rotating anode X-ray generator (Rigaku RU-200) and an R-axis IV image plate system, but one data set (TATAZF Native-2) was collected at the National Synchrotron Light Source on beamline X4A. Data were indexed, integrated, and scaled with the HKL [20] and CCP4 [21] suites with a 2 cutoff (Table 1). MIRAS Phasing, Model Building, and Refinement There are two complexes in the asymmetric unit for both TATAZF and TATAZF*. The structure of the TATAZF complex was solved first by MIRAS with heavy-atom derivatives generated by the substitution of 5-iodo-deoxyuracil (I-dU) for thymine at positions 4 or 14 of the bottom DNA strand (derivatives #2 and #1, respectively). Heavyatom coordinates were determined from visual inspection of the isomorphous-difference and anomalous-difference Patterson maps, and initial phases were calculated and refined for the TATAZF Native-1 data set with MLPHARE [21]. The resulting experimental electron density map was used to build an initial model with the program O [22], and this model was iteratively refined and rebuilt to generate a high-quality 2.2 A˚ resolution structure. Model refinement was carried out by the use of the simulated annealing, positional refinement, and isotropic B factor refinement protocols in X-PLOR [23] and included the use of a bulk solvent correction. The TATAZF structure was subsequently refined to 2.0 A˚ after a data set (Native-2) was obtained at the synchrotron. TATAZF* crystals were essentially isomorphous with the TATAZF crystals (Rmerge ⫽ 14.9%), and thus molecular replacement was entirely straightforward. The TATAZF* structure was refined to 2.2 A˚ by the same method. The test sets of reflections used for Rfree calculations were identical for both structures (where the resolution ranges overlap), which avoided bias. For checking the quality of both models, simulated-annealing omit maps were created for residues in
the recognition helices. PROCHECK analysis of the final models of TATAZF and TATAZF* indicate that 89.2% and 89.8% of the residues, respectively, lie in the core regions of the Ramachandran plot, with the remaining residue conformations corresponding to allowed regions. Analysis of the TATAZF and TATAZF* Structures The TATAZF unit cell contains two crystallographically distinct complexes. Superposition of these complexes (using the protein backbone and DNA) provides a rms difference of 0.32 A˚ for the two TATAZF complexes. The corresponding analysis of the two TATAZF* complexes provides a rms difference of 0.32 A˚. Comparisons between the TATAZF and TATAZF* complexes provides an average rms difference of 0.35 A˚. By use of Luzzoti plots, the mean coordinate error estimates calculated for the TATAZF and TATAZF* structures are 0.37 and 0.38 A˚, respectively [24]. The helical pitch was calculated for the 10 base pairs of DNA bound by TATAZF by the use of Curves 5.0 [25]. Acknowledgments We would like to thank B. Wang, J. K. Joung, J. Miller, L. Nekludova, E. Peisach, S. Fay-Richard, and C. Ogata for their helpful advice and discussions. Research was carried out in part at the National Synchrotron Light Source, Brookhaven National Laboratory, which is supported by the U.S. Department of Energy, Division of Material Sciences and Division of Chemical Sciences. S.A.W. was supported as a Special Fellow of the Leukemia and Lymphoma Society; M.E.-E. was supported by a National Institutes of Health interdepartmental training grant (#NIH5T32GM08334); and C.O.P. and R.A.G. were supported by the Howard Hughes Medical Institute. Received: April 9, 2001 Revised: June 15, 2001 Accepted: June 28, 2001 References 1. Wolfe, S.A., Nekludova, L., and Pabo, C.O. (2000). DNA recognition by Cys2His2 zinc finger proteins. Annu. Rev. Biophys. Biomol. Struct. 29, 183–212. 2. Pabo, C.O., Peisach, E., and Grant, R.A. (2001). Design and selection of novel Cys2His2 zinc finger proteins. Annu. Rev. Biochem. 70, 313–340. 3. Pavletich, N.P., and Pabo, C.O. (1991). Zinc finger-DNA recognition: crystal structure of a Zif268-DNA complex at 2.1 A˚. Science 252, 809–817. 4. Elrod-Erickson, M., Rould, M.A., Nekludova, L., and Pabo, C.O. (1996). Zif268 protein-DNA complex refined at 1.6 A˚: a model system for understanding zinc finger-DNA interactions. Structure 4, 1171–1180. 5. Desjarlais, J.R., and Berg, J.M. (1992). Toward rules relating zinc finger protein sequences and DNA binding site preferences. Proc. Natl. Acad. Sci. USA 89, 7345–7349. 6. Choo, Y., and Klug, A. (1997). Physical basis of a protein-DNA recognition code. Curr. Opin. Struct. Biol. 7, 117–125. 7. Greisman, H.A., and Pabo, C.O. (1997). A general strategy for selecting high-affinity zinc finger proteins for diverse DNA target sites. Science 275, 657–661. 8. Wolfe, S.A., Greisman, H.A., Ramm, E.I., and Pabo, C.O. (1999). Analysis of zinc fingers optimized via phage display: evaluating the utility of a recognition code. J. Mol. Biol. 285, 1917–1934. 9. Wolfe, S.A., Ramm, E.I., and Pabo, C.O. (2000). Combining structure-based design with phage display to create new Cys2His2 zinc finger dimers. Structure 8, 739–750. 10. Kim, Y., Geiger, J.H., Hahn, S., and Sigler, P.B. (1993). Crystal structure of a yeast TBP/TATA-box complex. Nature 365, 512–520. 11. Kim, J.L., Nikolov, D.B., and Burley, S.K. (1993). Co-crystal structure of TBP recognizing minor groove of a TATA element. Nature 365, 520–527. 12. Nekludova, L., and Pabo, C.O. (1994). Distinctive DNA conformation with enlarged major groove is found in Zn-finger-DNA
Two Cys2His2 Zinc Finger/TATA Box Complexes 723
13.
14.
15.
16.
17. 18. 19.
20.
21.
22.
23.
24. 25.
26.
27.
and other protein-DNA complexes. Proc. Natl. Acad. Sci. USA 91, 6948–6952. DiGabriele, A.D., and Steitz, T.A. (1993). A DNA dodecamer containing an adenine tract crystallizes in a unique lattice and exhibits a new bend. J. Mol. Biol. 231, 1024–1039. Kim, C.A., and Berg, J.M. (1996). A 2.2 A˚ resolution crystal structure of a designed zinc finger protein bound to DNA. Nat. Struct. Biol. 3, 940–945. Elrod-Erickson, M., Benson, T.E., and Pabo, C.O. (1998). Highresolution structures of variant Zif268-DNA complexes: implications for understanding zinc finger-DNA recognition. Structure 6, 451–464. Pabo, C.O., and Nekludova, L. (2000). Geometric analysis and comparison of protein-DNA interfaces: Why is there no simple code for recognition? J. Mol. Biol. 301, 597–624. Lander, E.S. et al. (2001). Initial sequencing and analysis of the human genome. Nature 409, 860–921. Venter, J.C. et al. (2001). The sequence of the human genome. Science 291, 1304–1351. Klemm, J.D., Rould, M.A., Aurora, R., Herr, W., and Pabo, C.O. (1994). Crystal structure of the Oct-1 POU domain bound to an octamer site: DNA recognition with tethered DNA-binding modules. Cell 77, 21–32. Otwinowski, Z., and Minor, W. (1997). Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307–326. CCP4 (Collaborative Computational Project 4) (1994). The CCP4 suite: programs for protein crystallography. Acta Crystallogr. D 50, 760–763. Jones, T.A., Zou, J.Y., Cowan, S.W., and Kjeldgaard (1991). Improved methods for binding protein models in electron density maps and the location of errors in these models. Acta Crystallogr. A 47, 110–9. Bru¨nger, A.T. (1992). X-PLOR Version 3.1: a system for X-ray crystallography and NMR (New Haven, CT: Yale University Press). Kleywegt, G.J., and Brunger, A.T. (1996). Checking your imagination: applications of the free R value. Structure 4, 897–904. Ravishanker, G., Swaminathan, S., Beveridge, D.L., Lavery, R., and Sklenar, H. (1989). Conformational and helicoidal analysis of 30 PS of molecular dynamics on the d(CGCGAATTCGCG) double helix: “Curves”, dials and windows. J. Biomol. Struct. Dyn. 6, 669–699. Christy, B.A., Lau, L.F., and Nathans, D. (1988). A gene activated in mouse 3T3 cells by serum growth factors encodes a protein with “zinc finger” sequences. Proc. Natl. Acad. Sci. USA 85, 7857–7861. Pavletich, N.P., and Pabo, C.O. (1993). Crystal structure of a five-finger GLI-DNA complex: new perspectives on zinc fingers. Science 261, 1701–1707.
Accession Numbers The TATAZF and TATAZF* structures have been deposited with the Protein Data Bank with accession codes 1G2F and 1G2D, respectively.