MACROMOLECULAR STRUCTURES
DAVID NEUHAUS AND DANIELA RHODES
Putting the finger on DNA

The first high-resolution crystal structure of a complex of zinc finger domains with DNA reveals another variation on the use of an α-helix in DNA recognition.

One of the hottest areas of structural biology is the study of sequence-specific DNA recognition, that is, how transcription factors recognize regulatory sequences of genes. This is not without good reason; it is largely through the remarkable specificity with which these proteins recognize specific DNA sequences that the process of transcription is regulated, leading to the production of particular proteins only at the appropriate place and time within an organism.

The DNA-binding domains of several transcriptional factors have been crystallized as complexes with DNA, and we now know some of the structural motifs in these proteins that do the business. The best understood example is the helix-turn-helix motif found in bacterial repressors and the homeodomains (see [1] for review), but more recent examples include the steroid hormone receptor DNA-binding domains (see [2] for review). However, one of the frustrations in this field is that the sequence and biological functions of a protein are usually known well in advance of the structure of its complex with DNA. This has been particularly true in the case of one of the most talked about DNA-binding motifs of recent years, the so-called ‘zinc finger’ first identified in 1985 by Miller et al. [3]. This motif, which is repeated nine times consecutively in the Xenopus transcription factor IIIA (TFIIIA), consists of a 30-amino-acid sequence containing two histidines, two cysteines and three hydrophobic residues, all at conserved positions. From this, together with measurements of zinc content and partial proteolysis data, Miller et al. proposed in essence that each finger motif forms a small, independently folded, zinc-containing mini-domain, used repeatedly in a modular fashion to achieve sequence-specific recognition of DNA [3].
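The conserved pattern of two cysteines, two histidines and hydrophobic residues described above lends itself to a simple sequence scan. The following is only a minimal sketch: the spacing encoded in `FINGER_PATTERN` is an assumption modelled on a commonly quoted consensus for this class of finger (the text gives no spacings), it checks only one of the three conserved hydrophobic positions, and the Zif268 peptide sequence has been typed in from the published 90-residue construct and should be verified against a sequence database. The names `ZIF268_PEPTIDE`, `FINGER_PATTERN` and `find_fingers` are illustrative.

```python
import re

# Zif268 three-finger peptide (90 residues, single-letter code), typed in from
# the published sequence; verify against a database before relying on it.
ZIF268_PEPTIDE = (
    "MERPYACPVESCDRRFSRSDELTRHIRIHTGQKP"
    "FQCRICMRNFSRSDHLTTHIRTHTGEKP"
    "FACDICGRKFARSDERKRHTKIHLRQKD"
)

# ASSUMED spacing (not given in the text): Cys, 2-4 residues, Cys, 3 residues,
# one hydrophobic residue, 8 residues, His, 3-5 residues, His.
# The capture group marks the first zinc-binding histidine.
FINGER_PATTERN = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}(H).{3,5}H")


def find_fingers(sequence):
    """Return (start, end, first_his) for each match, 1-based and inclusive."""
    hits = []
    for match in FINGER_PATTERN.finditer(sequence):
        # match.end() is exclusive in 0-based terms, so it equals the
        # 1-based inclusive end position.
        hits.append((match.start() + 1, match.end(), match.start(1) + 1))
    return hits


if __name__ == "__main__":
    for start, end, first_his in find_fingers(ZIF268_PEPTIDE):
        print(f"finger core {start}-{end}, first zinc-binding His at {first_his}")
```

Under these assumptions the scan picks out three Cys-to-His cores in the peptide, and the first histidine of each falls at the positions numbered His25, His53 and His81 in the discussion of the Zif268 complex. A pattern of this kind flags candidate fingers only; it says nothing about whether a given domain folds or binds DNA.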
Since that time, zinc finger motifs have turned up in hundreds of proteins. So many of these have been shown to interact with DNA that the motif has in itself almost become a diagnostic for DNA binding. NMR studies [4,5] have shown that the structured region of the zinc-finger motif comprises about 25 residues, confirming an earlier proposal for the structure of the finger [6]. A small antiparallel β-sheet containing the two cysteines lies at the N-terminus and is packed against a short helix containing the two histidines on successive turns near its C-terminus. This arrangement also brings together the three conserved hydrophobic residues, one on each strand of the β-sheet and one in the helix, and the first metal-binding histidine to form a miniature hydrophobic core to the structure. However, all the NMR results to date concern only the protein in the absence of DNA. Now the first crystal structure of a complex between DNA and a zinc-finger protein, the DNA-binding domain of Zif268 (a mouse protein, also called Krox 24, NGFI-A or Egr1, containing three zinc fingers), has been reported by Nikola Pavletich and Carl Pabo [7]. Figure 1 shows the sequences of the zinc-finger peptide and the 11-base-pair oligonucleotide in the complex. The structure has been solved to 2.1 Å resolution with an R factor of 18.2% using isomorphous replacement, and it reveals a wealth of information that will be avidly studied by all those in the field.

Fig. 1. Map of the contact points between the Zif268 three-finger peptide and its DNA-binding site. (a) Sequence of the 90-residue peptide (single-letter amino-acid code) with the three fingers aligned. The zinc-binding cysteines and histidines are shown in bold letters. The β-sheets are indicated by zig-zag lines and the helices are boxed. Circles represent a contact to the indicated base, squares represent a contact to the indicated phosphate. (b) Sequence of the consensus DNA-binding site. Circles and squares are used to indicate the protein contacts as in (a). The DNA strand that makes all but one of the protein contacts (the G-rich strand) is anti-parallel to the protein chain, in the sense that the 3’ end lies at the N-terminus of the protein, and the 5’ end lies at the C-terminus.
[Figure 1 graphic not reproduced; the G-rich strand shown in (b), numbered 1-11, reads AGCGTGGGCGT.]
© 1991 Current Biology

The structure of the protein-DNA complex, shown schematically in Fig. 2, contains some expected features, but also some surprises. The crystal structure confirms the overall three-dimensional structure of the protein as determined using NMR methods [4,5], and the arrangement of the fingers on the DNA has similarities to models proposed from sequence comparisons and mutation studies [8]. The three zinc-finger domains of Zif268 are wound continuously around the DNA following the major groove, each occupying a binding site of three base pairs. Overall, therefore, the protein wraps around almost one complete turn of the DNA double helix. A striking and unusual feature of the complex is that virtually all the protein-DNA contacts involve only the G-rich strand of the DNA - only one, the Ser75-phosphate contact in the third finger, involves the other strand. In total, the protein makes 11 hydrogen bonds to the bases as well as several contacts to the backbone of the DNA (Fig. 1). Each finger makes nearly, but not exactly, equivalent contacts with the DNA, as a consequence of the highly repetitive nature of both the protein and the DNA sequences. The N-terminus of the helix in each finger lies at the bottom of the major groove of the DNA, and is held there by hydrogen bonds involving the Arg-Ser-Asp sequence at the start of each helix; the Arg guanidinium group contacts a guanine on the DNA, and is ‘buttressed’ by hydrogen bonds to the carboxylate group of the Asp. In DNA-binding proteins, recognition helices can lie along the bottom of the major groove, or can climb the wall of the groove to varying degrees towards their C-termini.
The helices of the zinc fingers climb the wall right to the top, such that there is a phosphate backbone contact near the C-terminus of the helix in two of the fingers (His25 and His53; Fig. 1). This contact is one of the surprises, as it is made through the ring of one of the metal-binding histidines, which contacts a phosphate oxygen through the Nδ nitrogen (the zinc is bound to Nε); in each case the phosphate lies within the binding site of the adjacent finger (Fig. 2). Although His81 cannot make an equivalent phosphate contact to those of His25 and His53 in the other fingers, it contacts the 3’ terminal OH group of Adenine 1 through an intervening water molecule. There is also a third contact between the DNA and each helix. Fingers 1 and 3 make a contact between the Arg immediately preceding the first metal-binding His, and a guanine. This interaction is absent in finger 2, but it is replaced by one involving a (non-metal-binding) His earlier in the helix and G6 of the DNA. Thus, for fingers 1 and 3, contacts are made from turns 1 and 3 of the recognition helix of the peptide to guanines separated by one base pair in the three-base-pair recognition site (GCG), whereas for finger 2, contacts are made from turns 1 and 2 of the recognition helix to adjacent guanines on the corresponding DNA recognition site (TGG). In addition, Arg87 in finger 3 contacts a phosphate on a neighbouring molecule of DNA in the crystal. Away from the peptide recognition helix, the main interaction is a phosphate contact from the Arg two residues beyond the second Cys in the β-sheet in all three fingers. If the only DNA contacts were with the recognition helix, the small zinc-finger domain would probably have considerable freedom to roll around in the major groove, but presumably the contact from the β-sheet helps to prevent this, pinning the protein to one side of the groove.

Fig. 2. Schematic view of the three zinc fingers of Zif268 complexed to their 11-base-pair consensus DNA-binding site. The protein as a whole spirals around in the major groove, each finger making approximately equivalent contacts.

Volume 1 Number 4 1991
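The modular arrangement described above (one finger per three-base-pair subsite, with the G-rich strand running antiparallel to the protein chain) can be written out as a small worked example. This is a sketch under stated assumptions: the nine-base core `GCGTGGGCG` is simply the three triplets named in the text (GCG, TGG, GCG) joined 5' to 3', the figure of 10.5 base pairs per turn of B-DNA is a textbook average that the article does not state, and the function names are hypothetical.

```python
# ASSUMED textbook average for B-DNA; not stated in the article.
B_DNA_BP_PER_TURN = 10.5

# G-rich strand core, 5'->3', assembled from the triplets named in the text.
ZIF268_CORE_SITE = "GCGTGGGCG"


def finger_subsites(site, n_fingers=3, bp_per_finger=3):
    """Map finger number (1 = N-terminal finger) to its triplet on the G-rich strand.

    The strand is antiparallel to the protein: its 3' end lies at the
    N-terminus, so finger 1 reads the last triplet and the final finger
    reads the first.
    """
    if len(site) != n_fingers * bp_per_finger:
        raise ValueError("site length must equal n_fingers * bp_per_finger")
    triplets = [site[i:i + bp_per_finger] for i in range(0, len(site), bp_per_finger)]
    return {finger: triplets[n_fingers - finger] for finger in range(1, n_fingers + 1)}


def helical_turns(n_fingers, bp_per_finger=3, bp_per_turn=B_DNA_BP_PER_TURN):
    """Number of helical turns spanned if every finger occupies bp_per_finger base pairs."""
    return n_fingers * bp_per_finger / bp_per_turn


if __name__ == "__main__":
    print(finger_subsites(ZIF268_CORE_SITE))  # {1: 'GCG', 2: 'TGG', 3: 'GCG'}
    print(round(helical_turns(3), 2))         # 0.86
```

With these numbers three fingers cover nine base pairs, or roughly 0.86 of a turn, which agrees with the description of the protein wrapping around almost one complete turn. Extrapolating the same mode to the nine fingers of TFIIIA would give 27 base pairs, about 2.6 turns, which is purely a hypothetical extension of the Zif268 geometry and not a statement about how TFIIIA binds.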
In the Zif268-DNA complex there are almost no contacts between adjacent fingers, and the linkers are largely extended and make no significant DNA contacts, so that the orientation of adjacent fingers with respect to each other seems to be determined very largely by the DNA. This conclusion is reinforced by the observation from NMR that, in a two-finger peptide, the linker is flexible and adjacent finger domains do not interact in the absence of DNA [9]. As for the DNA in the Zif268 complex, although it is said to be B-DNA, there must be at least some A-type character, as there is clearly a central hole corresponding to a displacement of the base pairs by about 2 Å from the axis of the DNA (visible on the cover picture of the 10 May issue of Science).

How similar is this zinc-finger complex likely to be to others? Being amongst the smallest protein motifs used for DNA binding, zinc-finger domains lack much of the ‘scaffolding’ seen in larger DNA-binding domains, and are thus likely to be less inhibited sterically in their interactions with DNA. This implies that zinc-finger proteins in general are likely to have a wider repertoire of possible interactions with DNA than just those seen in this particular complex, as pointed out by Pavletich and Pabo themselves [7]. Indeed, all the specific contacts for Zif268 involve guanines, and although some other recognition sequences for zinc fingers are G-C rich, others are A-T rich, so at the very least there must be interactions with other bases in other complexes. Also, several of the residues involved in the specific contacts in Zif268 are not conserved in other zinc-finger sequences. Perhaps most fundamentally, although it is quite reasonable for three fingers to wrap around the major groove of DNA as seen here, there must be a severe topological problem if one attempts to use only this mode of binding for proteins having a significantly larger number of consecutive finger domains (for example, TFIIIA with nine fingers). The fingers and linkers in Zif268 are very similar to one another, but quite different arrangements, particularly in the sequence and length of the linkers, exist in other proteins. Furthermore, there seems to be no simple way of reconciling the footprinting data on other zinc-finger proteins, such as TFIIIA and Sp1, with the particular geometry seen in the Zif268-DNA complex. Thus, the suggestions made previously that successive fingers in multi-finger proteins might be arranged differently, or even jump the minor groove [10], are not excluded by the Zif268 results. Zinc fingers are so ubiquitous and varied that it seems likely that the arrangement of the three fingers seen in this complex is just the beginning of the story.
References

1. STEITZ TA: Structural studies of protein-nucleic acid interaction: the sources of sequence-specific binding. Quart Rev Biophys 1990, 23:205-280.
2. SCHWABE JWR, RHODES D: Beyond zinc fingers: steroid hormone receptors have a novel structural motif for DNA recognition. Trends Biochem Sci 1991, 16:291-296.
3. MILLER J, MCLACHLAN AD, KLUG A: Repetitive zinc-binding domains in the protein transcription factor IIIA from Xenopus oocytes. EMBO J 1985, 4:1609-1614.
4. LEE MS, GIPPERT GP, SOMAN KV, CASE DA, WRIGHT PE: Three-dimensional solution structure of a single zinc-finger DNA-binding domain. Science 1989, 245:635-637.
5. KLEVIT RE, HERRIOTT JR, HORVATH SJ: Solution structure of a zinc-finger domain of yeast ADR1. Proteins 1990, 7:215-226.
6. BERG JM: Proposed structure for the zinc-binding domains from transcription factor IIIA and related proteins. Proc Natl Acad Sci USA 1988, 85:99-102.
7. PAVLETICH NP, PABO CO: Zinc finger-DNA recognition: crystal structure of a Zif268-DNA complex at 2.1 Å. Science 1991, 252:809-817.
8. NARDELLI J, GIBSON TJ, VESQUE C, CHARNAY P: Base sequence discrimination by zinc-finger DNA-binding domains. Nature 1991, 349:175-178.
9. NEUHAUS D, NAKASEKO Y, NAGAI K, KLUG A: Sequence-specific [1H]NMR resonance assignments and secondary structure identification for 1- and 2-zinc finger constructs from SWI5. FEBS Lett 1990, 262:179-184.
10. FAIRALL L, RHODES D, KLUG A: Mapping of the sites of protection on a 5S RNA gene by the Xenopus transcription factor IIIA. A model for the interaction. J Mol Biol 1986, 192:577-591.
David Neuhaus and Daniela Rhodes, MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK.

IN THE FEBRUARY 1992 ISSUE OF CURRENT OPINION IN STRUCTURAL BIOLOGY

Simon Phillips will edit eleven reviews covering the past year’s developments in Protein-Nucleic Acid Interactions:

Protein-RNA interactions by K Nagai
Zinc fingers by R Kaptein
Leucine zippers by P Sigler
Aminoacyl-tRNA synthetases by D Moras
Structure and function of restriction endonucleases by F Winkler
Nuclease structure and catalytic function by D Suck
DNA bending and kinking - sequence dependence and function by A Travers
DNA recognition by the helix-turn-helix motif by R Brennan
Single-stranded DNA binding proteins by G Kneale
Viral protein-nucleic acid interactions by P Stockley
Protein-nucleic acid interactions in nucleosomes by J Baldwin

The same issue will also contain eleven reviews on Protein Folding and Binding edited by Thomas Creighton.