Crystal Structure of Citrobacter freundii Restriction Endonuclease Cfr10I at 2.15 Å Resolution


J. Mol. Biol. (1996) 255, 176–186

Damir Bozic1*, Saulius Grazulis1,2, Virginijus Siksnys2 and Robert Huber1

1 Max-Planck-Institut für Biochemie, D-82152 Planegg-Martinsried, Germany

2 Institute of Biotechnology FERMENTAS, Lt-2028 Vilnius, Lithuania

The X-ray crystal structure of Citrobacter freundii restriction endonuclease Cfr10I has been determined at a resolution of 2.15 Å by multiple isomorphous replacement methods and refined to an R-factor of 19.64%. The structure of Cfr10I represents the first structure of a restriction endonuclease recognizing a degenerated nucleotide sequence. Structural comparison of Cfr10I with previously solved structures of other restriction enzymes suggests that recognition of the specific sequence occurs through contacts in both the major and the minor grooves of DNA. The arrangement of the putative active site residues shows some striking differences from previously described restriction endonucleases and supports a two-metal-ion mechanism of catalysis. © 1996 Academic Press Limited

*Corresponding author

Keywords: restriction endonuclease; crystal structure; degenerated DNA sequence; active site arrangement; two-metal-ion mechanism

Introduction

Type II restriction endonucleases are widely used in genetic engineering experiments for their unique ability to recognize specific nucleotide sequences in DNA, and to cut the phosphodiester bond at a defined location within their recognition sequence. About 2400 type II restriction endonucleases have been identified, representing 188 different specificities (Roberts & Macelis, 1993; Roberts & Halford, 1993). This is probably one of the largest classes of functionally similar enzymes with different specificities. Since their primary sequences show little or no similarity, these enzymes might represent a great diversity of structures and mechanisms employed in sequence-specific recognition of DNA and catalysis.

Restriction endonuclease Cfr10I recognizes the hexanucleotide sequence 5'-Pu↓CCGGPy and cleaves after the first Pu (as indicated by the arrow) to produce 5'-overhanging ends (Janulaitis et al., 1983). This is in contrast to the well characterized restriction enzymes EcoRI (Kim et al., 1990), EcoRV (Winkler et al., 1993; Kostrewa & Winkler, 1995), BamHI (Newman et al., 1994a,b) and PvuII (Cheng et al., 1994; Athanasiadis et al., 1994), which show a stringent sequence specificity. All these enzymes recognize stringent hexanucleotide sequences, e.g. G↓AATTC is recognized by EcoRI and GAT↓ATC by EcoRV. The two latter enzymes differ in cleavage position relative to their recognition sequence, and are suggested to form different families of restriction endonucleases (Winkler et al., 1993; Newman et al., 1994a,b), producing either 5'-overhanging ends (EcoRI and BamHI) or blunt ends (EcoRV and PvuII).

In this paper we describe the structure of Cfr10I restriction endonuclease at 2.15 Å resolution and compare it with the structures of EcoRI and EcoRV, representing different cleavage pattern prototypes of restriction enzymes. Although Cfr10I cleaves DNA to produce 5'-overhanging ends characteristic of the members of the EcoRI family, it also exhibits structural similarities to EcoRV and may be regarded as a structural link between these two families. Like other type II restriction endonucleases, Cfr10I requires Mg²⁺ as a cofactor for catalysis, and is functional as a dimer. Comparison of the organization of active site residues between Cfr10I and EcoRV allowed us to assume a similar two-metal-ion mechanism for Cfr10I, as suggested recently for EcoRV (Kostrewa & Winkler, 1995; Baldwin et al., 1995; Vipond et al., 1995).

At present, most of our knowledge of sequence-specific DNA recognition by restriction endonucleases comes from structural studies of enzymes with distinct specificity. The structure of Cfr10I represents the first structure of a restriction endonuclease recognizing a degenerated nucleotide sequence and may help us to understand the principles of Pu and Py base discrimination. Although the structure of Cfr10I presented here is of the free enzyme, structural resemblance to other restriction endonucleases has enabled us to identify structural elements that are probably involved in specific interactions with DNA.

Table 1. Data collection and phasing statistics

Dataset         Maximum resolution (Å)  Total observations  Unique reflections  Completeness (%) (a)  Rmerge (b) (%)  Rsym (c) (%)  Binding sites  Riso (d), 25–3 Å (%)  Phasing power (e)
Native I        2.8                     22158               5735                73.9 (40.1)           6.9             12.0          -              -                     -
Native II       2.7                     6324                3893                43.4 (45.3)           8.3             9.0           -              -                     -
Native III (f)  2.15                    56341               15745               90.2 (62.6)           3.4             7.0           -              -                     -
HGAC            3.1                     22001               4964                82.9 (86.4)           6.1             13.2          2              22.9                  1.48
UOAC            2.9                     24412               6457                82.8 (52.1)           4.2             5.9           2              14.5                  1.07
PCMB (g)        3.0                     24152               6430                88.2 (74.3)           5.4             7.9           2              24.2                  2.09
LANI            2.9                     23107               6166                84.6 (28.1)           5.7             8.7           2              17.2                  1.17
LAHG            2.9                     22190               6712                92.1 (31.0)           5.7             7.0           3              25.2                  2.09

(a) Completeness of the highest resolution shell in parentheses.
(b) $R_{merge} = \sum_h \sum_i |I(h,i) - \langle I(h) \rangle| / \sum_h \sum_i I(h,i)$, where $I(h,i)$ is the intensity value of the ith measurement of h and $\langle I(h) \rangle$ is the corresponding mean value of h for all i measurements of h; the summation is over all measurements.
(c) $R_{sym} = \sum (I_F - \langle I_F \rangle) / \sum I_F$, where $I_F$ is the average value of point group related reflections and $\langle I_F \rangle$ is the average value of a Bijvoet pair.
(d) $R_{iso} = \sum |F_{PH}^2 - F_P^2| / \sum (F_{PH}^2 + F_P^2)$, where $F_{PH}$ and $F_P$ are the derivative and the native structure-factor amplitudes, respectively.
(e) $F_H$/residual: rms mean heavy-atom contribution/rms residual, the latter defined as $[\sum (F_{PHC}^2 - F_{PH}^2)/n]^{1/2}$ with the sum over all reflections, where $F_{PHC}$ is the calculated structure-factor and $F_{PH}$ is the structure-factor amplitude of the heavy-atom derivative, respectively.
(f) Data were collected at DESY, Hamburg (Beamline BW6) at λ = 0.96 Å.
(g) Same sites as HGAC.

The abbreviations used represent the following treatments: HGAC, mercuric acetate (5 mM, 2 hours); UOAC, uranyl acetate (saturated, 20 hours); PCMB, p-chloromercuribenzoic acid (saturated, 24 hours); LANI, La(NO3)3 (100 mM, 16 hours); LAHG, La(NO3)3 (100 mM, 15 hours), followed by washing the crystal in harvesting buffer and soaking in PCMB (saturated, 8 hours). All crystals were soaked in harvesting buffer containing 1.5 M ammonium acetate in 0.075 M Mes (pH 6.5) with 10% (w/v) PEG 8000.
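The Rmerge statistic defined in the footnotes to Table 1 can be illustrated with a minimal Python sketch. The function name, the data layout (a list of index/intensity pairs) and the toy intensities are assumptions made for illustration; they are not taken from the data-reduction software used in this work.

```python
from collections import defaultdict


def r_merge(observations):
    """R_merge = sum_h sum_i |I(h,i) - <I(h)>| / sum_h sum_i I(h,i).

    `observations` is an iterable of (hkl, intensity) pairs in which the
    same hkl index may occur several times (repeated measurements).
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(value - mean) for value in intensities)
        denominator += sum(intensities)
    return numerator / denominator


# Hypothetical toy measurements, not data from this study.
toy_observations = [
    ((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
    ((0, 2, 0), 50.0), ((0, 2, 0), 46.0), ((0, 2, 0), 54.0),
]
print(f"R_merge = {100 * r_merge(toy_observations):.1f}%")
```

For the toy data the deviations from the two means add up to 18 and the intensities to 360, so the sketch reports 5.0%.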

Results

Quality of the model

The data collection and refinement statistics for Cfr10I are summarized in Table 1 and Table 2. The current model consists of 283 out of 285 residues (two carboxy-terminal serine residues are not visible in the density) and 98 water molecules. The model has been refined to an R-factor of 19.64% using all data between 10 Å and 2.15 Å resolution. All main-chain dihedral angles fall within allowed regions of the Ramachandran plot (Ramachandran & Sasisekharan, 1968). One part of the final 2Fobs − Fcalc density map is shown in Figure 1.

Table 2. Final refinement statistics of Cfr10I

Average temperature factors          No.      Bav
Protein atoms (main chain)           1132     27.4
Protein atoms (side-chain)           1111     30.1
Protein atoms (all)                  2243     28.7
Ordered water molecules              98       43.0
All atoms                            2341     29.3

rms deviations from ideality
Bond length                          0.011 Å
Bond angles                          1.49°

No. of reflections (10.00–2.15 Å)    15396
Rfactor (%)                          19.64

Figure 1. A stereo view of the final 2Fobs − Fcalc density map contoured at 1σ level. One part of the central β-sheet is shown with the view down on helix α3.

Overall structure

The Cfr10I monomer (Figure 2) is folded into a compact α/β-structure with approximate dimensions of 55 Å × 50 Å × 35 Å. According to the program DSSP (Kabsch & Sander, 1983) Cfr10I consists of nine α-helices (45%), three 3₁₀-helices (4%) and seven β-strands (12%). A list of secondary structure elements of Cfr10I is given in Table 3. The most dominant feature of the structure is the central five-stranded mixed β-sheet, which is flanked on both sides by helices. This central β-sheet has a left-handed twist of approximately 80°. Within the β-sheet the strands β3, β4 and β5 are oriented antiparallel, while the parallel strands β5, β6 and β7 resemble a mononucleotide binding fold (Rosenberg, 1991) with the parallel helices α7 and α8/3₁₀ 3 acting as the crossovers. The carboxy-terminal helix α9 is packed on the opposite side of the central β-sheet to the crossover helices. Secondary structure elements 3₁₀ 1/α4 and α5/α6/3₁₀ 2 connect antiparallel strands β3,β4 and β4,β5, respectively. Helix α4 flanks the central β-core over its top and protrudes about 10 Å out of the molecule, whereas the antiparallel helices α5 and α6/3₁₀ 2 flank the molecule from the axial side, connecting strand β4 with β5. On the bottom side of the central β-core, helix α3 is packed against the antiparallel strands of the central β-sheet and is sandwiched between helices α5/α6 and strand β5, and protrudes like helix α4 out of the core. An amino-terminal domain of Cfr10I, consisting of the structural elements α1, α2, β1 and β2, precedes helix α3 and is packed against its amino terminus.

Within the monomer structure a cleft is formed by helices α3, α7 and strands β3 and β4. It has approximate dimensions of 15 Å × 6 Å in width and 10 Å in depth.

The U-shaped dimer structure is generated by applying one crystallographic dyad along the c-axis. The dimer interface is formed by helices α7 and α8/3₁₀ 3. Additional contacts are made by the structural elements connecting strand β4 to helix α4, which protrudes about 10 Å towards the symmetry-equivalent molecule. A surface area of about 2600 Å² per monomer becomes buried upon dimerization. The overall dimer dimensions are about 80 Å × 65 Å × 40 Å, including a cleft of about 18 Å diameter and 25 Å length, which may provide enough space to accommodate B-form DNA. The cleft is mainly formed by the symmetry-related amino-terminal domains; specifically, helices α2 and α3 and their interconnecting segment point into the cleft. The amino termini of symmetry-equivalent α7 helices generate the roof of the cleft. The binding of DNA may be stabilized by the positive electrostatic potential at the amino termini of helices α3 and α7.

Figure 2. Stereo ribbon diagrams of a Cfr10I monomer. Secondary structure elements are labelled according to Table 3.


Table 3. Secondary structure elements of Cfr10I

Element   Residues        Element   Residues
β1        I4–K6           α6        L161–N169
β2        Y13–I15         3₁₀ 2     L170–F174
α1        S17–S26         β5        I183–V189
α2        F35–L53         α7        P195–R218
α3        D59–E84         β6        R228–A233
β3        F89–K93         α8        N238–L244
3₁₀ 1     V102–I105       3₁₀ 3     T249–I252
α4        S108–L123       β7        E265–K268
β4        F135–D139       α9        V273–I282
α5        R144–M150

The assignment of secondary structure elements was done with DSSP (Kabsch & Sander, 1983).
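As a cross-check of the secondary structure content quoted in the text, the residue ranges of Table 3 can be tallied with a short Python sketch. The dictionary keys (`a1`, `b1`, `310_1`, ...) are ASCII stand-ins chosen here for the Greek labels, and the helper names are illustrative assumptions; only the residue ranges and the chain length of 285 come from the paper.

```python
import re

TOTAL_RESIDUES = 285  # chain length quoted in the text

# Residue ranges transcribed from Table 3 (a = alpha helix, b = beta strand).
TABLE_3 = {
    "b1": "I4-K6", "b2": "Y13-I15", "a1": "S17-S26", "a2": "F35-L53",
    "a3": "D59-E84", "b3": "F89-K93", "310_1": "V102-I105",
    "a4": "S108-L123", "b4": "F135-D139", "a5": "R144-M150",
    "a6": "L161-N169", "310_2": "L170-F174", "b5": "I183-V189",
    "a7": "P195-R218", "b6": "R228-A233", "a8": "N238-L244",
    "310_3": "T249-I252", "b7": "E265-K268", "a9": "V273-I282",
}


def span_length(residue_range):
    """Number of residues in a range such as 'S17-S26' (inclusive)."""
    first, last = re.fullmatch(r"[A-Z](\d+)-[A-Z](\d+)", residue_range).groups()
    return int(last) - int(first) + 1


def element_class(name):
    """Map an element label to its secondary structure class."""
    if name.startswith("310"):
        return "3-10 helix"
    return "alpha helix" if name.startswith("a") else "beta strand"


def residues_per_class(table=TABLE_3):
    """Sum the residues belonging to each secondary structure class."""
    totals = {}
    for name, residue_range in table.items():
        kind = element_class(name)
        totals[kind] = totals.get(kind, 0) + span_length(residue_range)
    return totals


totals = residues_per_class()
for kind, count in totals.items():
    print(f"{kind}: {count} residues ({100 * count / TOTAL_RESIDUES:.1f}%)")
```

The tally gives 128 helical residues and 33 strand residues, close to the 45% and 12% quoted from DSSP. The three 3₁₀-helices sum to 13 residues (about 4.6%), against the quoted 4%.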

Comparison with other restriction endonucleases

Despite no significant primary sequence similarities, the structures of Cfr10I, EcoRI and EcoRV are related to each other. Spatial alignment of Cfr10I with EcoRI and EcoRV reveals (Figure 3) that a central five-stranded β-sheet sandwiched by two α-helices is common to all restriction endonucleases. Common structural motifs of Cfr10I (α7, α8, α9 and β3 to β7) and EcoRI (α4, α5, α6 and β1 to β5) can be superimposed with an rms deviation of 2 Å. A fit of 1.5 Å rms deviation is obtained between similar structural elements of Cfr10I (α3, α9 and β3 to β7) and EcoRV (αA, αB, βc to βe, βg and βh). The topological diagrams of Cfr10I, EcoRI and EcoRV (Figure 4) indicate the similarity of these common structural motifs (these elements are shaded). Common elements of secondary structure are shown in similar positions to those in Figure 3. Apart from these structural elements, which form a substructure of a mixed five-stranded β-sheet flanked by helices, there are no further structural similarities shared by all these enzymes. However, pairwise structural comparison reveals that helix α3 of Cfr10I is structurally equivalent to the αB helix of EcoRV, and helices α7 and α8 of Cfr10I appear to be analogous to helices α4 and α5 of EcoRI, respectively. Note that the structural elements of Cfr10I and EcoRI involved in the formation of the dimer interface (shown in blue in Figure 4) are similar to each other; however, they are different from those of EcoRV.

The structures of EcoRI and EcoRV have been regarded as prototypes for restriction endonucleases with different cleavage properties and DNA-recognition modes (Winkler et al., 1995; Newman et al., 1994a,b). Restriction enzymes of the EcoRI family (BamHI and Cfr10I) cleave their recognition sequence with a 4 bp stagger to produce 5'-overhanging ends, while EcoRV and PvuII cut the DNA in the middle of their recognition sequence, leaving blunt ends. It appears that differences in cleavage position are related to the modes of dimer formation and DNA binding. In EcoRI, BamHI and Cfr10I dimers the active sites are separated by about 18 Å.
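The relation between cleavage position and end type described above can be made concrete with a small Python sketch. The regular expressions encode the recognition sequences quoted in this paper (Pu = A or G, Py = C or T); the dictionary layout, function names and the test sequence are assumptions introduced for illustration.

```python
import re

# Recognition sites written 5'->3' on the top strand. "cut" is the number of
# nucleotides that precede the cleavage point, as quoted in the text.
ENZYMES = {
    "Cfr10I": {"pattern": "[AG]CCGG[CT]", "cut": 1},  # Pu^CCGGPy
    "EcoRI": {"pattern": "GAATTC", "cut": 1},         # G^AATTC
    "EcoRV": {"pattern": "GATATC", "cut": 3},         # GAT^ATC
}


def find_sites(enzyme, dna):
    """Return 0-based start positions of all sites on the top strand."""
    pattern = ENZYMES[enzyme]["pattern"]
    # The lookahead reports overlapping occurrences as well.
    return [m.start() for m in re.finditer(f"(?={pattern})", dna.upper())]


def overhang_length(enzyme, site_length=6):
    """Length of the 5' overhang left by symmetric cleavage of a palindrome.

    The top strand is cut after `cut` nucleotides and the bottom strand at
    the symmetry-related position, so the stagger is site_length - 2 * cut.
    Zero means blunt ends.
    """
    return site_length - 2 * ENZYMES[enzyme]["cut"]


# Hypothetical test sequence containing two Cfr10I sites (ACCGGT and GCCGGC).
example = "TTACCGGTAAGCCGGCTT"
print(find_sites("Cfr10I", example))
for name in ENZYMES:
    print(name, overhang_length(name))
```

With these inputs the Cfr10I sites start at positions 2 and 10, Cfr10I and EcoRI give a 4-nucleotide overhang, and EcoRV gives 0 (blunt ends).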


Figure 3. Stereo views showing superimposition of Cα chains of Cfr10I on EcoRI (a) and EcoRV (b) restriction endonucleases. Cfr10I is always shown in black.


Figure 4. Topological diagrams of Cfr10I (a), EcoRI (b) and EcoRV (c) in the same orientation as in Figure 3. Common structural elements are shaded. Dimerisation regions are emphasised in blue and structural elements that contact DNA are in red.

All these enzymes share a similar dimer interface formed by crossover helices (α7 and α8/3₁₀ 3 in Cfr10I and their equivalents in EcoRI and BamHI) of the mononucleotide binding fold (Rosenberg, 1991). Such an arrangement of the monomers ensures the correct positioning of the active sites spaced apart by 4 bp in the major groove of DNA. In EcoRI (Kim et al., 1990) all sequence-specific contacts with DNA are also made from the major groove side. In contrast to the EcoRI family, in blunt-end cutters (EcoRV and PvuII) the active site residues are separated by only 2 Å. Since the active site residues of enzymes generating blunt or 5'-overhanging ends, respectively, are similarly arranged within topologically equivalent secondary structure elements (see below), different dimerization modes are used to achieve different distances between the active sites. Consequently, EcoRV dimerizes via strands βa and βb (see Figure 4). The mode of DNA orientation of EcoRV is also different: the minor groove of DNA points towards the protein and the active site residues. Consequently, sequence-specific contacts are made by EcoRV from both the major and minor groove sides.

This structural resemblance of Cfr10I to EcoRI and EcoRV allows the identification of structural elements of Cfr10I that could be involved in DNA contacts and recognition. X-ray studies of the EcoRI–DNA complex (Kim et al., 1990) supported by mutagenesis (Heitman & Model, 1992) and biochemical experiments (Jeltsch et al., 1995) demonstrated that sequence-specific contacts come from different parts of the protein: namely, the extended chain motif (M137 to A142) and the “inner” α4 and “outer” α5 recognition helices. Superimposition of Cfr10I and EcoRI structures revealed that helix α7 of Cfr10I structurally coincides with the “inner” recognition helix α4 of EcoRI, suggesting that the α7 helix of Cfr10I and the preceding short loop are probably involved in sequence-specific contacts with the recognized sequence and the DNA backbone. As this sequence of Cfr10I is rather basic, due to a cluster of three arginine residues in a short region (R194PDRRLQ), it may be involved in DNA–phosphate backbone contacts. In Cfr10I the extended chain motif is missing. The long connection between strand β3 and helix α4 (the “inner arm”) of EcoRI is replaced by only four residues between the topologically equivalent structural elements β5 and α7 in Cfr10I (Figure 4).

In BamHI (Newman et al., 1994a,b) the extended chain motif is also missing, but the region between β6 and α6 (P144 to N158) is supposed to make specific contacts with the outer G base of the recognized sequence (Figure 5). This part of BamHI structurally corresponds to the “outer arm” of EcoRI; however, it is much shorter (15 amino acid residues in BamHI versus 30 residues in EcoRI). In EcoRI it connects strand β4 with the outer recognition helix α5 and contains residues R200/R203, which make water-mediated contacts with the outer G base. In Cfr10I, the structurally equivalent connection between β6 and α8 (Figure 4) contains five amino acid residues and would not be able to contact DNA without large conformational changes.

As has been discussed above, Cfr10I also shows structural similarities to EcoRV. Superposition of the central β-core reveals the equivalence of Cfr10I helix α3 and helix αB of EcoRV. Notably, residues E71 and K64 of Cfr10I superimpose with the E45 and K38 residues of EcoRV. In EcoRV E45 forms a part of the active site (Kostrewa & Winkler, 1995; and see below), while K38 makes unique specific contacts with the central T base of its recognition sequence (GAT↓ATC) within the minor groove (Winkler et al., 1993).

The active site arrangement

Mg²⁺ ions play an essential role in phosphodiester-bond hydrolysis by restriction endonucleases.

Structural comparison of EcoRI and EcoRV (Winkler, 1993) revealed a conserved structural motif of two acidic and one basic side-chain residues similarly located in the vicinity of the scissile phosphodiester bond. A sequence motif with PDXn(E/D)YK (where X is any residue and Y is a hydrophobic one) has been proposed (Anderson, 1991) as an active site signature motif, though only nine out of 36 type II restriction endonucleases within the Swissprot protein sequence database obey this motif. However, it seems that the list of such restriction enzymes is growing (Siksnys et al., 1995). Recent structural analysis of EcoRV (Kostrewa & Winkler, 1995) supported by biochemical studies (Baldwin et al., 1995; Vipond et al., 1995) showed that EcoRV exhibits a second Mg²⁺ binding site, where glutamate E45 of helix αB acts as a Mg²⁺ ligating residue. In EcoRI no structural counterpart to E45 was found, as there is no equivalent secondary structure element (see Figure 4). The structurally conserved Mg²⁺ binding site is formed by P90D91, E111 and K113 residues in EcoRI and P73D74, D90 and K92 in EcoRV, where the lysine residues are supposed to stabilize the doubly charged pentavalent transition state (Kostrewa & Winkler, 1995). Selent et al. (1992) showed that, instead of a lysine, another acidic residue in combination with Mn²⁺ can fulfil the same function in EcoRV. Interestingly, a negatively charged residue was found at the active site of BamHI (Newman et al., 1994a,b) instead of a conserved lysine, where E111/E113 residues appeared to be involved in Mg²⁺ coordination. These residues of BamHI coincide structurally with E111/K113 in EcoRI and D90/K92 in EcoRV.

Figure 5. Stereo view of the superimposition of the DNA-recognition region of Cfr10I (red) on BamHI (purple). Residue R155 in BamHI is supposed to contact the outer base-pair of the recognition sequence.

Figure 6. Stereo view of the superimposition of active site motifs of Cfr10I (green), EcoRI (yellow) and EcoRV (white). A structurally equivalent motif SVK is present in Cfr10I instead of the E/DYK motif of EcoRI and EcoRV. Glutamate E204 of Cfr10I is supposed to be functionally similar to E111/D90 residues of EcoRI and EcoRV, respectively. Residues E71 of Cfr10I and E45 of EcoRV are supposed to form the second Mg²⁺ binding site.

The conservation of the active site regions within restriction endonucleases provided a strong indication of the catalytic residues in Cfr10I. Similar to P73D74 in EcoRV and P90D91 in EcoRI, Cfr10I exhibits a structurally equivalent P133D134 counterpart pair (Figure 6). At a distance of 3.8 Å from D134, glutamate E71 is located on helix α3 of Cfr10I. A binding site for La³⁺, which was used in the heavy-atom derivative search, was located between these two latter residues. The E71 residue of Cfr10I structurally coincides with the E45 residue of EcoRV, which is involved in the binding of a second Mg²⁺, suggesting a similar role for this residue in Cfr10I. Site-directed mutagenesis of E71 to A led to a complete loss of activity (Siksnys et al., 1995), which supports the assumption that this residue is essential for catalysis.

Surprisingly, Cfr10I does not exhibit the second (E/D)YK part of the active site signature motif that is characteristic of EcoRV and EcoRI, but has residues S188, V189, K190 on the equivalent structural element (Figure 6). However, the acidic residue glutamate E204 is located 2.8 Å from S188, and we assume that this residue is the functional counterpart of D90 in EcoRV and E111 in EcoRI, respectively. Strikingly, this E204 residue of Cfr10I is positioned on helix α7, which we suppose makes sequence-specific contacts with DNA. Probably only the correct positioning of the DNA recognition sequence with respect to helix α7 would align E204 for correct Mg²⁺ binding and would ensure the coupling between recognition and cleavage.
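A scan for the PDXn(E/D)YK signature motif discussed in this section can be sketched in Python as follows. The paper does not state the allowed spacer length n or which residues count as hydrophobic, so `MIN_SPACER`, `MAX_SPACER` and `HYDROPHOBIC` are assumptions of this sketch, and both test sequences are synthetic placeholders rather than real enzyme sequences.

```python
import re

HYDROPHOBIC = "AVILMFWY"         # assumption: the text only says "hydrophobic"
MIN_SPACER, MAX_SPACER = 5, 30   # assumption: n is not specified in the text

# PD, a spacer of n arbitrary residues, an acidic residue (E or D),
# a hydrophobic residue, then K.
MOTIF = re.compile(
    "PD(.{%d,%d}?)([ED])([%s])K" % (MIN_SPACER, MAX_SPACER, HYDROPHOBIC)
)


def find_signature(protein):
    """Return (1-based position of P, spacer length, tail) for each hit."""
    hits = []
    for match in MOTIF.finditer(protein.upper()):
        spacer, acid, hydrophobic = match.groups()
        hits.append((match.start() + 1, len(spacer), acid + hydrophobic + "K"))
    return hits


# Synthetic sequences: the first carries a complete motif, the second has
# an SVK tail in place of (E/D)YK, as described for Cfr10I.
with_motif = "GGG" + "PD" + "A" * 15 + "DIK" + "GGG"
without_motif = "GGG" + "PD" + "A" * 15 + "SVK" + "GGG"

print(find_signature(with_motif))
print(find_signature(without_motif))
```

The second sequence yields no hit, which mirrors why a purely sequence-based search would miss Cfr10I: the acidic residue E204 that is proposed to complete the site lies elsewhere in the chain.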

Discussion

Despite the lack of primary sequence similarities, restriction endonucleases share a similar structural arrangement. Superimposition of all restriction endonucleases with known three-dimensional structure revealed the presence of a common mixed five-stranded β-sheet flanked by helices (Cheng et al., 1994; Aggarwal, 1995). The structural organization of active site residues within this mixed β-sheet appeared to be similar, suggesting a common mechanism. The major difference between the active sites of restriction endonucleases with known X-ray structures is the presence of a second Mg²⁺ binding site within EcoRV, suggesting a two-metal-ion mechanism of catalysis (Kostrewa & Winkler, 1995; Baldwin et al., 1995; Vipond et al., 1995). Structural similarity between the active sites of Cfr10I and EcoRV, and site-directed mutagenesis studies (Siksnys et al., 1995), allow us to suggest a similar functional role for residues E45/D74 and E71/D134 in EcoRV and Cfr10I, respectively. Nevertheless, experiments in which Cfr10I crystals were soaked with Mg²⁺ (100 mM, 3 days) showed no significant binding at either putative Mg²⁺ binding site. However, Mg²⁺ binding may be synergistic with DNA binding, as was demonstrated by kinetic studies of EcoRV (Halford & Goodall, 1988; Taylor & Halford, 1989).

Within EcoRI and BamHI most of the sequence-specific and DNA backbone contacts are made by equivalent secondary structure elements from topologically similar β-loop-α motifs, as was initially suggested for EcoRI (Kim et al., 1990). In EcoRI, loops from the latter motifs connecting β-strands to the helices of the mononucleotide binding fold extend out of the protein and form the so-called inner (β3 to α4 connection) and outer arms (β4 to α5 connection) that wrap around the DNA. The sequence-specific contacts of EcoRI with the inner AATT tetranucleotide are made by helix α4 (inner recognition helix) and a short region spanning from M137 to A142 (extended chain motif) buttressed between the inner arm and the α4 helix. The specific contacts to the outer G base are made by arginine R200 and R203 residues through water-mediated hydrogen bonds. These residues are positioned at the carboxy-terminal part of the outer arm and the amino terminus of helix α5 (outer recognition helix).

BamHI (Newman et al., 1994a,b) does not exhibit similar inner arm and extended chain motifs. Sequence-specific contacts of BamHI with the central GAAT nucleotides are supposed to be made by amino acids from the amino-terminal part of helix α4 that is structurally equivalent to the inner recognition helix α4 of EcoRI. The outer base-pair is supposed to be recognized by BamHI through arginine R155, which is positioned in the loop preceding helix α6. This loop is topologically similar to the outer arm of EcoRI; however, it is much shorter, and does not protrude out of the protein core.

Superposition of Cfr10I and EcoRI structures revealed that helix α7 of Cfr10I structurally coincides with the inner recognition helix α4 of EcoRI, suggesting that the α7 helix of Cfr10I, and the loop preceding it, probably make DNA backbone and sequence-specific contacts with an inner tetranucleotide, as in EcoRI and BamHI. The helix α8 of Cfr10I is the structural counterpart of the outer recognition helix of EcoRI; however, the equivalent outer arm motif is missing. The topologically similar connection is very short (five residues) and would probably not be able to reach an external base as in EcoRI and BamHI, at least without major conformational changes.

Interestingly, structural comparison of Cfr10I with EcoRV revealed that helices α3 and αB are structurally equivalent. Structure-based protein sequence alignment showed that only two amino acid residues are conserved between helix α3 of Cfr10I and αB of EcoRV, i.e. residues E71 and K64 in Cfr10I and E45 and K38 in EcoRV, respectively. Structural analysis (Kostrewa & Winkler, 1995) suggested that the E45 residue in EcoRV is involved in Mg²⁺ binding and catalysis (Baldwin et al., 1995; Vipond et al., 1995), and K38 makes unique contacts with the central A·T/T·A base-pairs of the recognition sequence (GAT↓ATC; Winkler et al., 1993). The recognition of these two central A·T/T·A base-pairs proceeds through minor groove contacts of the protein.
A kink in DNA of about 50° and unwinding at the central four base-pairs make the minor groove wider and thus permit the accommodation of both symmetry-equivalent K38 residues of EcoRV. Such interactions would not be possible with canonical B-form DNA, since its minor groove does not provide enough space for multiple contacts between amino acid residues and DNA. Because of the different dimerization interface in Cfr10I, the equivalent K64 residues of the dimer are separated by about 18 Å from each other. This distance is similar to the distance between outer base-pairs within a hexanucleotide sequence, suggesting possible contacts of K64 residues with the external base-pair from the minor groove side, as in EcoRV. As Py bases are indistinguishable, with respect to their hydrogen bonding pattern, from the minor groove side of DNA, such a contact of lysine K64 with Py bases could rationalize the degenerate outer base-pair recognition in Cfr10I.

The structures of EcoRI (Kim et al., 1990) and BamHI (Newman et al., 1994a,b) revealed that within a hexanucleotide sequence the inner four base-pairs are recognized by one common structural element, which is equivalent to the inner recognition helix of EcoRI; and the outer base-pairs are recognized by another module, which corresponds to the outer recognition helix in EcoRI and the topologically equivalent element in BamHI, suggesting that within restriction endonucleases a modular organization of DNA-recognition elements is present. The “central recognition module” seems to be present in Cfr10I, but the presence of a structurally equivalent “outer recognition module”, as in EcoRI and BamHI, is not evident. A similar modular organization of DNA-recognizing elements is also found in the family of blunt-end-cutting restriction endonucleases, but within EcoRV (Winkler et al., 1993) and PvuII (Cheng et al., 1994; Athanasiadis et al., 1994) the central two base-pairs are recognized by one element and the outer four base-pairs by a different one, in contrast to the family of 5'-overhanging end cutting restriction endonucleases.

In Cfr10I the α3 helix is structurally equivalent to the αB helix of EcoRV. The latter helix of EcoRV contacts DNA from the minor groove side and contains a K38 residue, which is involved in sequence-specific contacts with the two central base-pairs of the recognized sequence. Note that, due only to the different dimer organisation, the K64 residues of Cfr10I that are structurally equivalent to the K38 residues of EcoRV become separated by about 18 Å from each other and may be able to contact the outer base-pairs. Therefore, the α3 helix of Cfr10I could be considered as the equivalent outer recognition module. However, in contrast to EcoRI and BamHI, in Cfr10I the contacts with the outer base-pair would be made from the minor groove side of DNA, just as in EcoRV. This suggests that the different dimer interfaces govern not only the different cleavage patterns of restriction endonucleases, as proposed by Winkler et al. (1993) and Newman et al. (1994a,b), but also the relative positioning of the recognized DNA sequence, which is contacted in a different pattern in the two families. If EcoRV were artificially dimerized in order to produce 5'-overhanging ends, just like Cfr10I, the recognition pattern would be that one “module” recognizes the central four base-pairs and the other one the outer base-pairs with symmetry-equivalent K38 residues.

Figure 7. Stereo view of (a) Cfr10I with modeled DNA; and (b) EcoRI in complex with DNA (Kim et al., 1990). Within EcoRI, α4 helices (blue) point into the major groove of DNA, and the inner and outer arms (red) wrap around DNA. Within Cfr10I the equivalent parts of the structure are marked using the same color code. The symmetry-equivalent α7 helices of Cfr10I also point into the major groove of DNA, but the inner and outer arms are missing in Cfr10I. Putative Cfr10I contacts from the minor groove of DNA via α3 helices are shown.

In order to identify structural elements of Cfr10I that may be involved in DNA recognition we superimposed the central core of Cfr10I on the equivalent structural elements of EcoRI and included EcoRI DNA in this superimposition. Such a Cfr10I–DNA complex showed some steric clashes due to the kink in the DNA complexed with EcoRI. Therefore, we chose regular B-form DNA and superimposed equivalent regions of the first half of the EcoRI recognition sequence on B-form model DNA (Figure 7(a) and (b)). This model DNA fitted well into the U-shaped cleft of the Cfr10I dimer. The α7 helices from different subunits protrude into the major groove of DNA, suggesting a role for them in sequence-specific contacts, just as occurs in EcoRI and BamHI. However, helices α8 seem to be positioned too far from the DNA to make any sequence-specific contacts, at least with a model DNA. The symmetrically related α3 helices of Cfr10I are located at the minor groove side of DNA, which suggests that K64 may contact the outer base-pairs of the recognized sequence, as occurs in EcoRV.
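The comparison between the K64 separation of about 18 Å and the spacing of the outer base-pairs of a hexanucleotide can be checked with a back-of-the-envelope Python sketch. The rise of 3.4 Å per base-pair is the textbook value for canonical B-form DNA and is an assumption brought in here, not a number from this paper; the function measures distance along the helix axis only and ignores helical twist and groove geometry.

```python
RISE_PER_BP = 3.4  # Å per base-pair step, textbook B-form value (assumption)


def axial_distance(first_bp, second_bp, rise=RISE_PER_BP):
    """Distance in Å along the helix axis between two base-pairs.

    Base-pairs are numbered 1..6 within the hexanucleotide site.
    """
    return abs(second_bp - first_bp) * rise


outer_pairs = axial_distance(1, 6)    # outer base-pairs of the hexanucleotide
central_pairs = axial_distance(3, 4)  # the two central base-pairs

print(f"outer base-pairs:   {outer_pairs:.1f} Å apart")
print(f"central base-pairs: {central_pairs:.1f} Å apart")
```

Five steps of 3.4 Å give 17 Å, in line with the separation of about 18 Å quoted for the K64 residues, whereas the central base-pairs contacted by the K38 residues of EcoRV are a single step apart.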

Materials and Methods

Crystallization

Cfr10I was expressed and purified as described by Janulaitis et al. (1983). Crystals were grown by mixing the protein (12 mg/ml in 20 mM Tris-HCl, pH 7.5, 200 mM NaCl) with mother liquor (1 M ammonium acetate in 75 mM Mes, pH 6.5) and equilibrating the mixture against 1 ml of the latter solution at 20°C in Linbro (FB16-24TC) plates using the hanging drop vapor diffusion method. Egg-shaped crystals grew within 5 days in the space group I222 with a = 64.5 Å, b = 81.3 Å, c = 119.7 Å, and α = β = γ = 90°, and contain one molecule in the asymmetric unit with a solvent content of 54%.

Data collection

All data (except dataset Native III) were collected on a MARresearch™ image plate mounted on a Rigaku rotating anode generator RU200 at a wavelength of λ = CuKα = 1.5418 Å. Dataset Native III was collected at DESY-Hamburg (Beamline BW6) at λ = 0.96 Å.

Phasing

The X-ray intensities were evaluated with the MOSFLM package (Leslie, 1991); data scaling and reduction were performed with the CCP4 package (Collaborative Computational Project, number 4, 1994) and the data were loaded into PROTEIN (Steigemann, 1991). Heavy-atom data were analyzed by difference Patterson methods and cross-phased difference Fourier maps. Heavy-atom refinement was performed within PROTEIN. A summary of data collection and phasing statistics is given in Table 1. The resultant phases were further improved by solvent flattening, density histogram matching and phase extension to a resolution of 2.8 Å using the program SQUASH (Zhang, 1993). The resulting electron density map allowed the identification of all secondary structure elements.
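The relation between the unit cell quoted under Crystallization and a solvent content can be sketched with the usual Matthews estimate. This is an illustration only: the function names are hypothetical, the factor 1.23 is the conventional approximation for a typical protein density, the eight asymmetric units per cell follow from space group I222, and the protein mass is left as an input because it is not stated in this section. How the 54% quoted above was obtained is not described in the text, so this estimate need not reproduce it.

```python
def orthorhombic_cell_volume(a, b, c):
    """Unit cell volume in cubic angstroms for alpha = beta = gamma = 90 degrees."""
    return a * b * c


def matthews_coefficient(cell_volume, asu_per_cell, mass_da):
    """Crystal volume per unit of protein mass, in cubic angstroms per dalton."""
    return cell_volume / (asu_per_cell * mass_da)


def solvent_fraction(vm, protein_constant=1.23):
    """Approximate solvent fraction from the Matthews coefficient.

    protein_constant = 1.23 is the conventional value; it is an assumption
    here, not a number taken from the paper.
    """
    return 1.0 - protein_constant / vm


if __name__ == "__main__":
    volume = orthorhombic_cell_volume(64.5, 81.3, 119.7)  # cell from the text
    assumed_mass_da = 30000.0  # hypothetical placeholder, not from the paper
    vm = matthews_coefficient(volume, asu_per_cell=8, mass_da=assumed_mass_da)
    print(f"cell volume = {volume:.1f} A^3")
    print(f"Vm = {vm:.2f} A^3/Da, solvent fraction = {solvent_fraction(vm):.2f}")
```

The estimate is sensitive to the assumed mass per asymmetric unit, so the placeholder should be replaced by the actual molecular mass before comparing with the value quoted in the text.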

Model building and refinement

An initial Cα backbone trace was built into the 2.8 Å map using the program O (Jones et al., 1991). The model was subsequently completed and refined by energy-restrained crystallographic refinement with X-PLOR (Brünger, 1992) using the parameters derived by Engh & Huber (1991). After each cycle of refinement, a phase-combined Fourier map was calculated and the model was checked and rebuilt. The resolution was extended from 2.8 Å to 2.15 Å using dataset Native III during this refinement procedure. The stereochemical quality of the model was assessed with the program PROCHECK (Laskowski et al., 1993). The secondary structure was assigned with the program DSSP (Kabsch & Sander, 1983). All Figures were drawn with the program SETOR (Evans, 1993) except Figure 3a and b (Kraulis, 1991). Coordinates have been deposited in the Brookhaven Protein Data Bank with a delay of release of two years. They are available from the author on request.
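The resolution limits and wavelengths quoted in this section are linked by Bragg's law, λ = 2d sin θ. The short sketch below, with a hypothetical function name, converts a resolution d and a wavelength λ into the scattering angle 2θ at which the corresponding reflections appear; it is meant only to illustrate the geometry and does not describe the actual detector setup.

```python
import math


def bragg_two_theta_deg(d_spacing, wavelength):
    """Scattering angle 2-theta (degrees) for a given d-spacing and wavelength.

    Both arguments are in angstroms. Bragg's law: wavelength = 2 * d * sin(theta).
    """
    sin_theta = wavelength / (2.0 * d_spacing)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("d-spacing is not reachable at this wavelength")
    return 2.0 * math.degrees(math.asin(sin_theta))


if __name__ == "__main__":
    # Values quoted in the text: 2.8 A with CuK-alpha, 2.15 A at 0.96 A.
    for d, lam in ((2.8, 1.5418), (2.15, 0.96)):
        print(f"d = {d} A, wavelength = {lam} A -> 2theta = "
              f"{bragg_two_theta_deg(d, lam):.1f} deg")
```

At the shorter wavelength a given resolution shell falls at a smaller scattering angle, so more of the diffraction pattern fits on a detector of fixed size and distance.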

Acknowledgements

We thank Milton Stubbs, Richard Engh and Doriano Lamba for helpful discussions, and Vida Petrusyte and Audra Kesminiene for help in protein purification. This work was supported by the Deutsche Volkswagen Stiftung.

References

Aggarwal, A. K. (1995). Structure and function of restriction endonucleases. Curr. Opin. Struct. Biol. 5, 11–19.
Anderson, J. E. (1993). Restriction endonucleases and modification methylases. Curr. Opin. Struct. Biol. 3, 24–30.
Athanasiadis, A., Vlassi, M., Kotsifaki, D., Tucker, P. A., Wilson, K. S. & Kokkinidis, M. (1994). Crystal structure of PvuII endonuclease reveals extensive homologies to EcoRV. Nature Struct. Biol. 1, 469–475.
Baldwin, G. S., Vipond, I. B. & Halford, S. E. (1995). Rapid reaction analysis of the catalytic cycle of the EcoRV restriction endonuclease. Biochemistry, 34, 705–714.
Brünger, A. T. (1992). X-PLOR, Version 3.1: A System for Crystallography and NMR, Yale University Press, New Haven, CT.
Cheng, X., Balendiran, K., Schildkraut, I. & Anderson, J. E. (1994). Structure of PvuII endonuclease with cognate DNA. EMBO J. 13, 3927–3935.
Collaborative Computational Project, number 4 (1994). The CCP4 Suite: Programs for protein crystallography. Acta Crystallog. sect. D, 50, 760–763.
Engh, R. A. & Huber, R. (1991). Accurate bond and angle parameters for X-ray protein structure refinement. Acta Crystallog. sect. A, 47, 392–400.
Evans, S. V. (1993). SETOR: Hardware-lighted three-dimensional solid model representation of macromolecules. J. Mol. Graph. 11, 134–142.
Halford, S. E. & Goodall, A. J. (1988). Modes of DNA cleavage by the EcoRV restriction endonuclease. Biochemistry, 27, 1771–1777.
Heitman, J. & Model, P. (1990). Substrate recognition by the EcoRI endonuclease. Proteins, 7, 185–197.
Janulaitis, A., Stakenas, P. & Berlin, Yu. A. (1983). A new site-specific endodeoxyribonuclease from Citrobacter freundii. FEBS Letters, 161, 210–212.
Jeltsch, A., Alves, J., Urbanke, C., Maass, G., Eckstein, H., Lianshan, Z., Bayer, E. & Pingoud, A. (1995). A dodecapeptide comprising the extended chain-α4 region of the restriction endonuclease EcoRI specifically binds to the EcoRI recognition site. J. Biol. Chem. 270, 5122–5129.
Jones, T. A., Zou, J. Y., Cowan, S. W. & Kjeldgaard, M. (1991). Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Crystallog. sect. A, 47, 110–119.
Kabsch, W. & Sander, C. (1983). Dictionary of secondary structures: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 22, 2577–2637.
Kim, Y., Grable, J. C., Choi, P. J., Greene, P. & Rosenberg, J. M. (1990). Refinement of EcoRI endonuclease structure: A revised protein chain tracing. Science, 249, 1307–1309.
Kostrewa, D. & Winkler, F. K. (1995). Mg2+ binding to the active site of EcoRV endonuclease: A crystallographic study of complexes with substrate and product DNA at 2 Å resolution. Biochemistry, 34, 683–696.
Laskowski, R. A., MacArthur, M. W., Moss, D. S. & Thornton, J. M. (1993). PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallog. 26, 283–291.
Leslie, A. G. W. (1991). In Crystallographic Computing 5 (Moras, D., Podjarny, A. D. & Thierry, J. C., eds), pp. 50–61, Oxford University Press, Oxford.
Newman, M., Strzelecka, T., Dorner, L. F., Schildkraut, I. & Aggarwal, A. K. (1994a). The structure of restriction endonuclease BamHI and its relationship to EcoRI. Nature, 368, 660–664.
Newman, M., Strzelecka, T., Dorner, L. F., Schildkraut, I. & Aggarwal, A. K. (1994b). Structure of restriction endonuclease BamHI phased at 1.95 Å resolution by MAD analysis. Structure, 2, 439–452.
Ramachandran, G. N. & Sasisekharan, V. (1968). Conformation of polypeptides and proteins. Advan. Protein Chem. 23, 283–437.
Roberts, R. J. & Halford, S. E. (1993). Type II restriction endonucleases. In Nucleases (Linn, S. M., Lloyd, R. S. & Roberts, R. J., eds), 2nd edit., pp. 35–88, Cold Spring Harbor Laboratory Press, New York.
Roberts, R. J. & Macelis, D. (1993). Restriction enzymes and their isoschizomers. Nucl. Acids Res. 21, 3125–3137.
Rosenberg, J. M. (1991). Structure and function of restriction endonucleases. Curr. Opin. Struct. Biol. 1, 104–113.
Selent, U., Rüter, T., Köhler, E., Liedtke, M., Thielking, V., Alves, J., Oelschläger, T., Wolfes, H., Peters, F. & Pingoud, A. (1992). A site-directed mutagenesis study to identify amino acid residues involved in the catalytic function of the restriction endonuclease EcoRV. Biochemistry, 31, 4808–4815.
Siksnys, V., Timinskas, A., Klimasauskas, S., Butkus, V. & Janulaitis, A. (1995). Sequence similarity among type-II restriction endonucleases related by their recognized 6-bp target and tetranucleotide overhang. Gene, 157, 311–314.
Steigemann, W. (1991). In Crystallographic Computing (Moras, D., Podjarny, A. D. & Thierry, J. C., eds) vol. 5, pp. 115–125, Oxford University Press, Oxford.
Taylor, J. D. & Halford, S. E. (1989). Discrimination between DNA sequences by the EcoRV restriction endonuclease. Biochemistry, 28, 6198–6207.
Vipond, I. B., Baldwin, G. S. & Halford, S. E. (1995). Divalent metal ions at the active sites of EcoRV and EcoRI restriction endonucleases. Biochemistry, 34, 697–704.
Winkler, F. K., Banner, D. W., Oefner, C., Tsernoglou, D., Brown, R. S., Heathman, S. P., Bryan, R. K., Martin, P. D., Petratos, K. & Wilson, K. S. (1993). The crystal structure of EcoRV endonuclease and of its complexes with cognate and non-cognate DNA fragments. EMBO J. 12, 1781–1795.
Zhang, K. Y. J. (1993). SQUASH–Combining constraints for macromolecular phase refinement and extension. Acta Crystallog. sect. D, 49, 213–222.

Edited by A. R. Fersht (Received 20 June 1995; accepted 28 September 1995)