Evolutionary Optimization of Computationally Designed Enzymes: Kemp Eliminases of the KE07 Series


doi:10.1016/j.jmb.2009.12.031

J. Mol. Biol. (2010) 396, 1025–1042

Available online at www.sciencedirect.com

Olga Khersonsky 1, Daniela Röthlisberger 2, Orly Dym 3, Shira Albeck 3, Colin J. Jackson 4, David Baker 2,5,6 and Dan S. Tawfik 1*

1 Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 76100, Israel

2 Department of Biochemistry, University of Washington, Seattle, WA 98195, USA

3 Israel Structural Proteomics Center, Weizmann Institute of Science, Rehovot 76100, Israel

4 Institut de Biologie Structurale, Centre National de la Recherche Scientifique, Grenoble 38027, France

5 Biomolecular Structure and Design, University of Washington, Seattle, WA 98195, USA

6 Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA

Understanding enzyme catalysis through the analysis of natural enzymes is a daunting challenge: their active sites are complex and combine numerous interactions and catalytic forces that are finely coordinated. Study of more rudimentary (wo)man-made enzymes provides a unique opportunity for better understanding of enzymatic catalysis. KE07, a computationally designed Kemp eliminase that employs a glutamate side chain as the catalytic base for the critical proton abstraction step and an apolar binding site to guide substrate binding, was optimized by seven rounds of random mutagenesis and selection, resulting in a >200-fold increase in catalytic efficiency. Here, we describe the directed evolution process in detail and the biophysical and crystallographic studies of the designed KE07 and its evolved variants. The optimization of KE07's activity to give a kcat/KM value of ∼2600 s−1 M−1 and an ∼10^6-fold rate acceleration (kcat/kuncat) involved the incorporation of up to eight mutations. These mutations led to a marked decrease in the overall thermodynamic stability of the evolved KE07s and in the configurational stability of their active sites. We identified two primary contributions of the mutations to KE07's improved activity: (i) the introduction of new salt bridges to correct a mistake in the original design that placed a lysine for leaving-group protonation without consideration of its “quenching” interactions with the catalytic glutamate, and (ii) the tuning of the environment, the pKa of the catalytic base, and its interactions with the substrate through the evolution of a network of hydrogen bonds consisting of several charged residues surrounding the active site. © 2010 Elsevier Ltd. All rights reserved.

Received 23 August 2009; received in revised form 15 December 2009; accepted 17 December 2009 Available online 28 December 2009 Edited by I. Wilson

Keywords: directed evolution; computational protein design; enzymatic catalysis

*Corresponding author. E-mail address: [email protected].
Abbreviations used: HisF, imidazole-3-glycerolphosphate synthase cyclase subunit; TS, transition state; WT, wild type; PEG, polyethylene glycol.

Introduction

Enzyme active sites apply a complex combination of forces and factors to achieve extremely high rate accelerations.1 The catalytic cycle involves the concerted action of different elements of the active site acting as base, acid, nucleophile, or charge relays and of conformational changes that may facilitate turnover.2–4 A detailed understanding of enzymes has come not only from studying natural enzymes by classical approaches5 but also by making new enzymes and enzyme-like catalysts. Indeed, the making of new enzymatic catalysts is often considered the most severe test of our understanding of enzyme catalysis. The advent of computational design methods for predicting structure from sequence at atomic


Scheme 1. Kemp elimination of 5-nitrobenzisoxazole.

accuracy provides a new and powerful way of generating tailor-made active sites.6–8 The computational design of enzymes depends on two factors: (i) the ability to design an active-site structure that confers efficient catalysis and (ii) the ability to design a sequence that confers the desired structure. Both steps are currently performed with considerable success but are far from being optimal. The first step, in particular, challenges our knowledge of enzyme catalysis. Directed evolution, which requires no previous knowledge of structure–function, can be applied to improve computationally designed enzymes and can therefore bridge part of the gap created by our limited design skills. However, the bottleneck of directed evolution is the very limited sequence space that can be covered by screening. We have recently described a series of computationally designed enzymes that catalyze an unnatural reaction dubbed the Kemp elimination.9 The Kemp elimination was chosen as a model reaction for proton transfer—a critical step in numerous enzymatic reactions. In this highly activated model system, a base-catalyzed proton transfer from carbon, concerted with the cleavage of the nitrogen–oxygen bond, leads to the cyanophenol product (Scheme 1). The physical organic chemistry of the

Designed KE07 and Its Evolved Variants

Kemp elimination has been extensively studied, and it has been used as a probe for studying medium effects in catalysis.10–12 Several enzyme-like systems that catalyze this reaction have been explored, including catalytic antibodies13–15 and synthetic polymers.16 The Kemp elimination has also been shown to be promiscuously catalyzed by various proteins, such as serum albumins.17,18 In both catalytic antibodies and serum albumins, the reaction is catalyzed by a base in a hydrophobic active site. The alignment of the catalytic base relative to the substrate, medium effects that lead to the activation of the base catalysts, and charge-dispersing interactions that stabilize the negatively charged transition state (TS) have all been shown to play important roles in these synthetic polymer and protein catalysts.12,15,19 The computationally designed Kemp eliminases were also designed to have apolar active sites and a residue acting as a base aligned against the C–H bond. The locations of the catalytic residues around the TS were optimized by quantum mechanical calculations. Next, the RosettaMatch algorithm was used to search for constellations of protein backbones capable of supporting these catalytic residues based on a large set of natural proteins with known structures. The Kemp eliminase active-site residues were then installed within the natural scaffolds by replacing up to 20 residues within the active sites of the chosen scaffold and around it.9 KE07 is based on the TIM barrel scaffold of a thermostable imidazole-3-glycerolphosphate synthase [HisF; Protein Data Bank (PDB) accession code 1THF] from Thermotoga maritima (Fig. 1a). Its key active-site residues comprise Glu101 serving as the catalytic base,

Fig. 1. (a) The KE07 design, showing the TIM barrel scaffold of HisF (PDB accession code 1THF), the modeled 5-nitrobenzisoxazole substrate (red), and the 13 residues that were replaced to create the designed Kemp eliminase active site (green). (b) Details of the active site of the designed KE07. Shown are the 5-nitrobenzisoxazole substrate (cyan), the catalytic base (Glu101), the general acid/H-bond donor (Lys222), and the stacking residue (Trp50).


Trp50 to facilitate substrate binding and charge delocalization of the TS, and Lys222 to serve as a hydrogen-bond donor for phenoxide stabilization (Fig. 1b). Our previous report described the preliminary enzymatic and structural characterization of the designed KE07 and seven rounds of directed evolution to yield variants with up to 200-fold higher kcat/KM values.9 Here, we provide a detailed description of the directed evolution of KE07 throughout 10 rounds of mutagenesis and screening; an in-depth analysis of the designed and evolved KE07 variants, including X-ray structures of three evolved variants from the fourth, sixth, and seventh rounds of directed evolution; and mechanistic data that shed light on the properties of KE07 and the routes that led to its catalytic optimization. The data indicate critical contributions from the refinement of the electrostatics in the vicinity of the catalytic base and that of the active-site pocket in general.

Results

Directed evolution of KE07

We performed three additional rounds of directed evolution beyond the seven previously described rounds,9 exploring random mutagenesis, stabilizing consensus mutations, and targeted replacements of positions adjacent to designed residues. However, these rounds yielded no further improvement. The mutagenesis protocols and outcome of all 10 rounds are summarized in Materials and Methods. We therefore assumed that the catalytic potential of the KE07 design has been largely exhausted, certainly within the sequence space that has been currently explored.

All the evolved variants exhibit turnover numbers (kcat) higher than that of the designed KE07, while KM values remained in the millimolar range (0.3–2.4 mM) (Table 1). Several mutations that were repeatedly observed in evolved variants could be divided into three groups:

(a) Ile7Asp (as well as Ile7Thr and Ile199Thr, which appeared in the first two rounds and were later superseded by Ile7Asp) is located at the bottom of the active site but not in direct contact with the substrate. As shown below, this mutation refined the environment of the catalytic Glu101 base, increased its basicity, and thereby led to higher catalytic efficiency.

(b) Val12Leu/Met, Gly202Arg, and Asn224Asp are mutations located in the upper part of the active site. Residues Val12 and Gly202 are adjacent to the designed residues Ile11 and His201, and Asn224 is in itself a designed position. Two other mutations in the helix spanning residues 224–229 were observed (Val226Ala and Phe229Ser). This group of mutations seems to be involved in refining the shape of the active-site entry.

(c) Lys19Glu/Thr and Lys146Glu/Thr are surface mutations that seem to increase the levels of soluble expression of the KE07 variants.
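As a quick consistency check on the kinetic parameters in Table 1, the following minimal sketch recomputes kcat/KM from reported kcat and KM values and derives the fold improvement from the reported kcat/KM values. The function names are illustrative and not taken from the paper, and the recomputed ratios need not match the reported efficiencies exactly, since the reported values carry their own fitting errors.

```python
# Illustrative sketch; function names are hypothetical, values are from Table 1.

def catalytic_efficiency(kcat_per_s: float, km_mM: float) -> float:
    """Return kcat/KM in M^-1 s^-1 from kcat (s^-1) and KM (mM)."""
    return kcat_per_s / (km_mM * 1e-3)  # convert mM to M


def fold_improvement(efficiency: float, reference: float) -> float:
    """Return the ratio of a variant's kcat/KM to that of a reference."""
    return efficiency / reference


# (kcat in s^-1, KM in mM, reported kcat/KM in M^-1 s^-1)
TABLE_1 = {
    "KE07 design": (0.018, 1.4, 12.2),
    "Round 4 1E/11H": (0.699, 2.40, 291),
    "Round 7 1/3H": (0.76, 0.54, 1414),
    "Round 7 10/11G": (1.37, 0.54, 2590),
}

if __name__ == "__main__":
    design_reported = TABLE_1["KE07 design"][2]
    for name, (kcat, km, reported) in TABLE_1.items():
        recomputed = catalytic_efficiency(kcat, km)
        fold = fold_improvement(reported, design_reported)
        print(f"{name}: recomputed {recomputed:.0f}, "
              f"reported {reported}, fold {fold:.0f}")
```

The unit conversion is the only subtle step: KM is tabulated in mM, whereas kcat/KM is quoted in M−1 s−1.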

Table 1. Summary of mutations and kinetic and structural parameters of representative KE07 variants

Kinetic parameters

Variant           kcat (s−1)        KM (mM)       kcat/KM (M−1 s−1)   Fold improvement in kcat/KM relative to the designed KE07
KE07 design       0.018 ± 0.001     1.4 ± 0.1     12.2 ± 0.1          1
Round 2 11/10D    0.0213 ± 0.0004   0.31 ± 0.02   66 ± 2              5.4
Round 3 I3/10A    0.206 ± 0.003     0.48 ± 0.03   425 ± 16            35
Round 4 1E/11H    0.699 ± 0.001     2.40 ± 0.07   291 ± 9             24
Round 5 10/3B     0.49 ± 0.01       0.59 ± 0.03   836 ± 18            69
Round 6 3/7F      0.60 ± 0.07       0.69 ± 0.09   872 ± 25            71
Round 7 1/3H      0.76 ± 0.03       0.54 ± 0.05   1414 ± 84           116
Round 7 10/11G    1.37 ± 0.14       0.54 ± 0.12   2590 ± 302          212

Active-site mutations

Variant           Ile7   Val12   Gly202   Asn224
KE07 design       -      -       -        -
Round 2 11/10D    -      -       Arg      Asp
Round 3 I3/10A    Gln    -       Arg      Asp
Round 4 1E/11H    Asp    -       Arg      Asp
Round 5 10/3B     Asp    -       Arg      Asp
Round 6 3/7F      Asp    Met     Arg      Asp
Round 7 1/3H      Asp    Leu     Arg      Asp
Round 7 10/11G    Asp    Met     Arg      Asp

Surface mutations (Lys19, Lys146), helix 224–229 mutations (Phe229), and other mutations

Round 7 1/3H: Thr (Lys19 or Lys146), Phe229Ser, Phe77Ile, His84Tyr, Met207Thr
Round 7 10/11G: Phe229Ser, Phe77Ile, Ile102Phe
Entries for the remaining variants, in source order (column assignment not recoverable from the flattened table): Glu, Thr; Thr; Glu; Gln123Arg; Ser, Phe86Leu; Glu, Thr.

Structural parameters

Variant           Structure resolution (Å)   Molecules in asymmetric unit   Ile/Asp7–Lys222 distance (Å) (a)   Glu101–Lys222 distance (Å) (a)
KE07 design       2.25 (ref. 9)              1                              6.1                                2.9
Round 2 11/10D    No structure available
Round 3 I3/10A    No structure available
Round 4 1E/11H    2.25                       6                              3.3–5.7                            3.3–4.8
Round 5 10/3B     No structure available
Round 6 3/7F      2.3                        6                              2.8–5.1                            4.0–5.7
Round 7 1/3H      1.8                        2                              4.0–4.2                            3.3–3.5
Round 7 10/11G    No structure available

Biophysical parameters

Variant           pKa(kcat)   pKa(kcat/KM)   Tm app (°C) (e)   Temperature dependency (f)   Fluorescence emission (peak; nm)
KE07 design       <4.5 (b)    ∼5.0 (b)       >95               16.2 ± 1.5                   337
Round 2 11/10D    <4.5        5.0            >95               10.0 ± 0.8                   339
Round 3 I3/10A    ND (c)      ND (c)         ∼95               5.8 ± 0.1                    351
Round 4 1E/11H    ND (d)      5.4            ∼86               4.5 ± 0.2                    350
Round 5 10/3B     ND (c)      ND (c)         79                5.2 ± 0.2                    350
Round 6 3/7F      5.1         5.3            ∼87               4.7 ± 0.3                    353
Round 7 1/3H      6.2         6.0            76                3.4 ± 0.3                    345
Round 7 10/11G    5.9         5.7            76                2.94 ± 0.06                  352

a The distances were measured for NɛLys–OβAsp or NɛLys–Cβ2Ile and for NɛLys–OγGlu.
b The pH-rate profile could not be determined for the KE07 design due to its low activity. The variant 11/10D from round 2, not bearing the Ile7Asp mutation, probably represents the pKa of Glu101 in the KE07 design.
c ND indicates not determined.
d The pH-rate (kcat) profile could not be determined for round 4 1E/11H due to its high KM and limited substrate solubility, and only the pH-rate profile for kcat/KM could be obtained from the linear phase of Michaelis–Menten plots.9
e Tm app is the apparent midpoint temperature of melting determined by CD spectroscopy.
f Provided is the ratio of enzymatic activity (initial rates) at 60 versus 20 °C with 0.25 mM 5-nitrobenzisoxazole.

Structural analysis of the KE07 variants

As previously described, the backbone of the crystal structure of the designed KE07 (PDB accession code 2RKX)9 is virtually superimposable with the computed model, thus validating the accuracy of the design methodology. The active-site residues show only minor deviations from the model (rmsd of 0.95 Å), both in their backbone conformations and the conformations of the vast majority of side chains. Two loops (residues 18–24 and 51–58) adopt different conformations; however, they are situated away from the active site and are therefore unlikely to significantly affect catalysis.9

Here, we describe the crystal structures of three variants obtained by directed evolution of the designed KE07: round 4 1E/11H, round 6 3/7F, and round 7 1/3H. Superposition of the structure of the starting point (the computationally designed KE07) with the crystal structures of the evolved eliminases indicates that the backbone conformation has remained virtually unchanged (Supplementary Fig. 1). To understand the origins of the increases in kcat, it is therefore necessary to focus on subtle changes in surface loops (e.g., residues 201–207) and on localized changes in side-chain conformations and the interactions mediated by these side chains.

Unfortunately, our attempts to obtain structures of KE07 variants in the presence of substrate analogues failed. Positively charged substrate-like ligands that could interact with the catalytic glutamate, such as benzimidazole (which was previously used to generate catalytic antibodies for the Kemp elimination),13 2-amino-benzimidazole, 2-amino-5,6-dimethyl-benzimidazole, and benzotriazole, were examined, but they exhibited no inhibition. Substrate analogues such as the unsubstituted benzisoxazole and the product 5-nitro-2-cyanophenol exhibited very low affinity (≥1 mM). These were nevertheless used for co-crystallization and soaking experiments, but no electron density corresponding to these compounds was observed. Thus, we opted for the alternative of docking the TS model within the various crystal structures using the high-resolution RosettaLigand program20 while using the distance and orientation between the carboxylate of Glu101 and the substrate proton that gets abstracted (Fig. 1b) as initial constraints. The computed TS–enzyme energies for the structures of the initial design and the evolved KE07s were found to be very similar. Thus, in agreement with the relatively small changes in the KM values (Table 1), no major change occurred in the mode of binding and packing interactions of the TS or the substrate. It therefore appears that the mutations optimized key catalytic interactions with the TS, rather than binding, or tightness of packing of the TS, which could in principle be predicted by the Rosetta algorithm. It has to be noted that the structural analysis was performed on the basis of the ligand-free structures. Indeed, docking using a mode that enabled side-chain rotations indicated no conformational change in the active site upon TS binding. That said, the accuracy of docking and modeling is obviously limited, and thus the possibility that the conformations of the actual enzyme–TS complexes differ from the models analyzed here cannot be ruled out.

Refining the environment of the catalytic base

The previously determined structure of the designed KE07 (PDB accession code 2RKX) revealed that the catalytic Glu101 forms a salt bridge with Lys222 (Fig. 2a and b).9 The distance in the crystal structure is shorter than that in the designed model (2.9 versus 3.6 Å). Sub-angstrom differences can be explained by the positional error of the computed model, which is on the order of 0.7–1.0 Å, but even 3.6 Å is an interacting distance. Thus, in the

designed model, the ammonium group of Lys222 interacts with and stabilizes the negative charge that develops on the phenoxide product, whereas in the crystal structure, which contains no ligand, Lys222 is within salt-bridge distance of the carboxylate group of Glu101 (Fig. 2b). It is possible that this interaction is a unique feature of the free enzyme. Indeed, fluctuations of the side chains of Glu101 and Lys222 were observed following the docking of the TS and optimization of rotamers in the KE07 design structure, and the Glu101–Lys222 distance varied from 3.0 to 4.7 Å. However, in the early rounds of directed evolution, the hydrophobic residues Ile7 and Ile199, situated at the bottom of the active site, were mutated to polar residues (Table 2). From round 4 onward, the mutation Ile7Asp appeared in all the selected variants. We hypothesized that the Ile7Asp mutation prevents the interaction between Lys222

Fig. 2. Refinement of the environment of Glu101. (a) In the designed KE07, the amino group of Lys222 was placed 4.1 Å away from the TS's phenolic oxygen, with the aim of stabilizing the negative charge of the product phenoxide. However, at its designed location, Lys222 can also form a weak salt bridge with the catalytic Glu101 with a distance of 3.6 Å (NɛLys222–OγGlu101). (b) In the crystal structure of the KE07 design (PDB accession code 2RKX;9 with the substrate overlaid from the KE07 designed model), the Glu101–Lys222 distance was found to be 2.9 Å. (c) In the evolved variants of KE07 bearing the Ile7Asp mutation (e.g., round 4 1E/11H shown here), Asp7 largely breaks the Glu101–Lys222 salt bridge (in the evolved variants, the NɛLys222–OγGlu101 distance is 3.3–5.7 Å), in some cases directly interacting with Lys222 (the NɛLys222–OβAsp7 distance is 2.8–5.7 Å). (d) An overlay of the structures of the KE07 design (magenta) and the evolved round 4 1E/11H (cyan) illustrates how the Ile7Asp mutation causes the shift of the Lys222 side chain away from Glu101.
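The distance comparisons in Fig. 2 can be illustrated with a minimal sketch. It computes the distance between two atoms from Cartesian coordinates and compares it with a 4.0 Å cutoff, the value the text uses as the limit for a direct salt bridge. The coordinates are hypothetical placeholders chosen only to reproduce the 2.9 Å, 3.6 Å, and 5.7 Å separations quoted for the crystal structure, the designed model, and the upper end of the evolved range. They are not taken from PDB entry 2RKX, and the function names are illustrative.

```python
import math

SALT_BRIDGE_CUTOFF = 4.0  # Å; limit quoted in the text for a direct salt bridge


def distance(a, b):
    """Euclidean distance (Å) between two (x, y, z) coordinates."""
    return math.dist(a, b)


def within_salt_bridge(a, b, cutoff=SALT_BRIDGE_CUTOFF):
    """True if the two atoms are no farther apart than the cutoff."""
    return distance(a, b) <= cutoff


# Hypothetical coordinates (Å), not real atomic positions.
glu101_o = (0.0, 0.0, 0.0)
lys222_n_crystal = (2.9, 0.0, 0.0)  # 2.9 Å, as in the crystal structure
lys222_n_model = (0.0, 3.6, 0.0)    # 3.6 Å, as in the designed model
lys222_n_evolved = (0.0, 0.0, 5.7)  # 5.7 Å, upper end of the evolved range

for label, n in [("crystal", lys222_n_crystal),
                 ("model", lys222_n_model),
                 ("evolved", lys222_n_evolved)]:
    print(label, round(distance(glu101_o, n), 1),
          within_salt_bridge(glu101_o, n))
```

With real structures, the same two functions would be applied to the Lys222 and Glu101 (or Asp7) side-chain atoms of each molecule in the asymmetric unit, which is why Table 1 reports ranges rather than single values.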


Table 2. Summary of the directed evolution of KE07

Entries per round: random mutagenesis (no. of variants taken, mutagenesis technique, no. of mutations/gene); recombination; randomization of specific positions by spiking oligonucleotides; key mutations identified in the most active variants; fold improvement measured with crude lysates, relative to the KE07 design (a).

Round 1
  Random mutagenesis: Error-prone PCR, “wobble” base analogues dPTP and 8-oxo-dGTP,21 5 ± 3 mutations per gene
  Key mutations: Ile7Thr, Ile173Val, Gly202Arg, Asn224Asp, Asn224Ser, Phe227Ser
  Fold improvement: 5-fold

Round 2
  Recombination: Shuffling of the 23 best variants of round 1 plus the KE07 design for back-crossing (20%)
  Key mutations: Ile7Thr, Lys19Thr/Glu, Phe86Leu, Lys146Thr, Leu152Pro, Ile199Thr, Val226Ala
  Fold improvement: 15-fold

Round 3
  Recombination: Shuffling of the four best variants of round 2
  Randomization: Library A: Ile7Thr/Val/Ala/Phe/Ser/Glu/Asp/Gln/His and Ile199Thr/Val/Ala/Phe/Ser/Glu/Asp/Gln/His (bottom of active site); library B: Tyr128Leu/Pro/Ile/Thr/Val/Ala/Phe/Ser and His201Cys/Ser/Tyr/Thr/Asn (attempt to optimize the stacking residues); library C: insertion of one to two amino acids between residues 224–225 and 225–226 (helix elongation)
  Key mutations: Ile7Asp/Thr/Ser/Gln/Val; Lys146Thr/Glu; Ile199Gln/His/Thr; Phe227Leu
  Fold improvement: 70-fold, only in library A; no improvement in libraries B and C

Round 4
  Recombination: Shuffling of the 17 best variants of round 3 plus the KE07 design for back-crossing (25%)
  Randomization: Ile199Thr/Val/Ala/Phe/Ser/Glu/Asp/Gln/His; Ile173Ala/Val; Leu176Ile/Asp/Asn/Ile
  Key mutations: Ile199Gln/Val/His/Thr; Ile173Ala/Val; Leu176Ile/Asp/Asn/Ile
  Fold improvement: 200-fold

Round 5
  Random mutagenesis: Twelve best variants of round 4, Mutazyme (Stratagene),22 approximately one mutation per gene
  Key mutations: Val12Met, Leu47Ile, His84Tyr
  Fold improvement: 300-fold

Round 6
  Random mutagenesis: Twelve best variants of round 5, Mutazyme (Stratagene),22 3 ± 1 mutations per gene
  Key mutations: Val12Leu/Met, Phe77Ile, Gly171Ala
  Fold improvement: 300-fold

Round 7
  Recombination: Shuffling of the 20 best variants of round 6
  Key mutations: Ile102Phe
  Fold improvement: 300-fold

Round 8
  Random mutagenesis: Fourteen best variants of round 7, Mutazyme (Stratagene),22 3 ± 1 mutations per gene
  Key mutations: Ala223Thr
  Fold improvement: No improvement relative to round 7

Round 9
  Recombination: Shuffling of the 14 best variants of round 7
  Randomization: Consensus mutagenesis (b)
  Key mutations: Three to eight consensus mutations
  Fold improvement: No improvement relative to round 7

Round 10
  Recombination: Library A: shuffling of the 14 best variants of round 7; library B: shuffling of the 18 best variants of round 9
  Randomization: Randomization of residues adjacent to designed: Ala8, Leu10, Phe49, Asp51, Val100, Ile102, Val127, Ile129, Ala131, Arg175, Gly177, Ala200, Gly202, Ala221, Ala223
  Key mutations: One to three mutations of the residues adjacent to designed
  Fold improvement: No improvement relative to round 7

a The activity improvement was measured in crude lysates and is not corrected for protein expression. It is therefore only a preliminary measure for an increase of protein activity.
b Detailed information about consensus mutagenesis of KE07 can be found in Supplementary Material.

and Glu101 and breaks the salt bridge between them, thus rendering the catalytic Glu101 more basic and therefore more reactive.9 This hypothesis is supported by the structures of the evolved variants and their pH-rate profiles.

The crystal structures of the three evolved variants indicated that the negative charge introduced by the Ile7Asp mutation results in a movement of the Lys222 side chain away from the catalytic Glu101. The distance between Lys222 and Glu101 increased from 2.9 Å in the crystal structure of the designed KE07 to 3.3–5.7 Å in the evolved variants (Fig. 2c and d; Table 1). The increased distance between the amine group of Lys222 and the carboxylate of Glu101 is concomitant with the reduced distance of Lys222 from the side chain of Asp7 (or Ile7 in the designed template) from 6.1 Å in the structure of the designed KE07 to 2.8–5.1 Å in the evolved variants. So while the difference between the Lys222–Glu101 distances in the computed model and its crystal structure (3.6 versus 2.9 Å) may represent the deviation between the bound and unbound forms, the Lys222–Glu101 distances in the evolved variants reach 5.7 Å and therefore significantly differ both from the KE07 model and from its actual structure. Asp7 was reverted to Ile in the round 7 1/3H variant to probe the importance of the mutation Ile7Asp. As expected, the mutation Asp7Ile caused a significant decrease in the activity (∼50-fold decrease in kcat/KM; Supplementary Table 4). In the evolved variants of KE07, Asp7 not only interacts with the Lys222 side chain but also is involved in a network of electrostatic interactions at the bottom of the active site (Fig. 3). In addition, the Lys222Ala mutation, which was found to cause a mild increase in the activity of the designed KE07 (∼3-fold),9 was introduced to the round 7 1/3H variant, resulting in an even more detrimental effect on activity (>500-fold decrease of kcat/KM; Supplementary Table 4). Expression and solubility were not compromised in the cases of Asp7Ile and Lys222Ala mutants. Indeed, in the evolved variants, Lys222 is involved in a network of electrostatic interactions with the adjacent residues other than Glu101. Although in some cases the distance between Lys222 and Asp7 is too large for a direct salt bridge (>4.0 Å), an interaction mediated through a water molecule introduces Lys222 into the network (Fig. 3), and its removal

therefore seems to disrupt favorable electrostatic interactions. The change in the environment of Glu101 is also manifested in changes in the pH optimum of the evolved variants. Due to its very low activity, the pH-rate profile of the designed KE07 could not be determined. However, we assumed that round 2 11/10D, which bears no mutation at position 7 or 199, represents the pH-rate profile of the designed KE07. This variant exhibits no significant decrease in kcat down to pH 5.0, indicating a pKa ≤ 4.5. The decrease in kcat/KM below pH 5.0 is mostly attributed to an increase in the KM (Fig. 4). Below pH 4.5, the activity of round 2 11/10D decreases, probably due to a loss of conformational stability. Indeed, the KM values of all the KE07 variants tested increased below pH 5.0, and CD measurements indicated a conformational change at pH 4.5.9 In contrast to round 2 11/10D, and presumably the designed KE07, the pH-rate profiles of the evolved KE07 variants bearing the Ile7Asp mutation exhibit a fully pronounced acidic shoulder, consistent with the deprotonation of the catalytic base, Glu101. The pKa(kcat) values therefore range from not measurable (<4.5) in the absence of the Ile7Asp mutation to ∼6.0 for the evolved round 7 variant. The correlation between the increase in pKa(kcat) and the larger distance between Glu101 and Lys222 (and decreasing Ile/Asp7–Lys222 distances) (Table 1) supports the hypothesis that Asp7 eliminates salt-bridge formation between the catalytic Glu101 and Lys222, thus rendering the catalytic Glu101 more basic and increasing the kcat of the evolved variants. Notably, above pH 4.5, the changes in KM values with pH were relatively small, and most differences between the variants relate to the pKa values for kcat (Table 1). This pattern suggests that the change in the environment of Glu101 is manifested primarily

Fig. 3. Electrostatic networks at the bottom of the active site. (a) The KE07 design, with the network composed of Arg5, Glu46, Lys99, and Glu167. (b) KE07 round 7 1/3H variant (monomer B), with the mutated residue Asp7 as part of the new network. The interaction of Asp7 and Lys222 via a water molecule introduces Lys222 to this network as well.


Fig. 4. pH-rate profiles of the KE07 variants. (a) kcat (s−1) versus pH. (b) kcat/KM (M−1 s−1) versus pH. Round 2 11/10D, which bears no mutation at position 7 or 199, is assumed to represent the pH-rate profile of the designed KE07. The resulting parameters are given in Table 1.
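Apparent pKa values are commonly extracted from pH-rate profiles such as those in Fig. 4 by fitting a single-ionization model, in which activity depends on the deprotonated form of one catalytic group: k(pH) = kmax / (1 + 10^(pKa − pH)). The sketch below fits that model by a simple grid search. The fitting equation actually used for Fig. 4 is not given in this section, so the model, the function names, and the synthetic data (generated with pKa = 6.2 and an arbitrary kmax of 1.0) are assumptions for illustration only.

```python
def rate_at_ph(ph, k_max, pka):
    """Single-ionization model: rate depends on the deprotonated base."""
    return k_max / (1.0 + 10.0 ** (pka - ph))


def fit_pka(ph_values, rates, pka_min=3.0, pka_max=9.0, step=0.01):
    """Grid-search the pKa; for each candidate pKa the best k_max
    has a closed-form linear least-squares solution."""
    best = None
    n_steps = int(round((pka_max - pka_min) / step))
    for i in range(n_steps + 1):
        pka = pka_min + i * step
        fractions = [1.0 / (1.0 + 10.0 ** (pka - ph)) for ph in ph_values]
        k_max = (sum(r * f for r, f in zip(rates, fractions))
                 / sum(f * f for f in fractions))
        sse = sum((r - k_max * f) ** 2 for r, f in zip(rates, fractions))
        if best is None or sse < best[0]:
            best = (sse, pka, k_max)
    return best[1], best[2]


# Synthetic, noise-free profile (not experimental data).
ph_points = [4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0]
synthetic_rates = [rate_at_ph(ph, k_max=1.0, pka=6.2) for ph in ph_points]
pka_fit, k_max_fit = fit_pka(ph_points, synthetic_rates)
print(f"fitted pKa = {pka_fit:.2f}, k_max = {k_max_fit:.2f}")
```

In this model the rate is half-maximal at pH = pKa. A profile that shows no decline down to pH 5.0, as described for round 2 11/10D, can therefore only place an upper bound on the pKa.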


at the level of the enzyme–substrate complex (kcat values), rather than the free enzyme (kcat/KM).

Changes in active-site architecture

The crystal structures of the evolved KE07 variants also reveal that replacement of side chains via mutations, combined with subtle backbone changes, may have allowed the introduction of new enzyme–substrate interactions. As shown in Fig. 5 for round 4 1E/11H, the mutation Gly202Arg caused a relocation of the adjacent loop at the top of the active site (residues 175–177, which in the original design would clash with the Cβ of the arginine), and now the loop can accommodate the arginine residue. The mutation also brought the side chain of Arg202 close enough to the substrate to potentially enable an interaction with its nitro group. The latter interaction appears in the docked structures of all six copies in the asymmetric unit of the round 4 1E/11H structure. In round 6 and round 7 variants, however, the side chain of Arg202 seems to adopt different conformations, and the interactions with the nitro group are predicted for only some of the copies in the asymmetric unit. However, even in the absence of explicit interactions with the substrate, the Gly202Arg mutation seems to affect substrate positioning within the active site, and the KM values of the evolved variants are 2- to 3-fold lower than that of the designed KE07.

The directed evolution has also led to new interaction networks of charged surface residues that could facilitate the reaction by favorably interacting with the substrate and/or properly aligning certain parts of the active site (Fig. 6). In the round 7 variant, Asp224 can potentially interact with Arg202 (distance of 3.9 Å) and with His201 (3.6 Å). This network of Arg202–Asp224–His201 has evolved gradually. In the designed KE07, residue 202 is Gly, residue 224 is Asn, and its distance to His201 is 7.9 Å (OAsn–NHis). In the evolved variants, following the Gly202Arg and Asn224Asp mutations, Asp224 and His201 gradually became closer, with distances of 4.9 Å for Asp224–His201 in the round 4 variant and 4.4 and 3.6 Å in the round 6 and round 7 variants, respectively. This network also brings His201 closer to the substrate leaving-group oxygen (based on the docking models, His201–Osub is 4.2 Å in the designed KE07, is 4.4–4.7 Å in the round 4 and round 6 variants, and is 2.8 Å in the round 7 variant, monomer B). This change therefore provides potentially favorable interactions with the TS. The unanticipated incorporation of potential interactions with the substrate and TS (Figs. 5 and 6) illustrates that electrostatic interactions can play a critical role in catalysis and that

Fig. 5. Overlay of the structures of the KE07 design (gray) and evolved KE07 round 4 1E/11H monomer A (green). The TS was docked into the structure with RosettaLigand. The mutation Gly202Arg induced a move of the adjacent loop (residues 175–177; marked with an arrow) and may have introduced a new interaction with the nitro group of the substrate.


Fig. 6. The evolution of electrostatic networks at the upper part of the active site. (a) Crystal structure of the KE07 design, with His201, Gly202, and Asn224 marked as sticks. (b) The same region in the structure of KE07 round 4 variant (monomer A), with Arg202 and Asp224. (c) Structure of KE07 round 7 variant (monomer B), with His201, Asp224, and Arg202 at interacting distances. (d) Structure of KE07 round 7 variant (monomer B) with the two possible rotamers of Arg202, obtained by docking with higher allowed mobility of rotamers. The TS (cyan) was docked into these structures with RosettaLigand.

better modeling of surface electrostatics will be important for generating improved designs. The His201–Osub interaction is likely to be disfavored by the design process because of the unfavorable interactions between Arg202 and His201 (indeed, in the designed KE07, residue 202 is glycine) and because burying charged groups in the relatively non-polar substrate–enzyme complex is costly. When the TS was docked into the KE07 structures, with optimization of rotamer position (“dock_repack” protocol),20 the position of Arg202 varied greatly and in some runs the rotamers were different from those observed in the crystal structures (Fig. 6d). Thus, it seems that these different rotamers of Arg202 are close in energy, which provides an additional explanation for the difficulty of surface electrostatic modeling.

Overall, a significant and consistent change in the interior of the barrel occurred. Ten charged residues were found in the interior of the β-barrel in the evolved variants of KE07 (Fig. 7, Supplementary Table 5), as opposed to four in the template used for the designed KE07 (HisF protein; PDB accession code 1THF). Three charged residues were added as part of the original design, and three more charged residues accumulated in the course of its directed evolution (Asp7, Arg202, and Asp224). These make the tunnel of KE07 unusually rich in charged residues and in interaction networks mediated by these charged residues.


Alternative conformers of a round 7 variant

Fig. 7. Charged residues in the interior of KE07's barrel. Shown is the structure of KE07 round 7 1/3H (monomer B), with the charged residues of the template HisF (Arg5, Glu46, Lys99, and Glu167; cyan), the charged residues added by the computational design (Glu101, Lys222, and His201; yellow), and the charged residues added in the course of directed evolution (Asp7, Arg202, and Asp224; magenta).

Beyond the above-described specific interactions, more global changes in the surface of the evolving enzymes and in the active-site periphery in particular seem to have taken place. The KE07 design aimed at a solvent-excluded hydrophobic active site where the substrate is relatively buried. In the KE07 model, the entrance and the walls of the active site are therefore mostly uncharged (Supplementary Fig. 2a). However, in the crystal structure of the designed KE07, the active-site surface is composed primarily of positively charged residues, such as Lys19, His201, and Lys222 (Supplementary Fig. 2b), thus generating a positive surface potential. In the evolved variants, the positive charge is much less pronounced (Supplementary Fig. 2c and d), both due to mutations that introduced opposing charges (Asn224Asp) and due to rearrangement of charged residues (e.g., rotations of the Lys222 side chain and exposure of Glu167). Notably, these changes in the electrostatic properties of the active sites were caused primarily by different side-chain rotamers with hardly any backbone changes. The changes in the electrostatic potential of the active site therefore seem to have brought the evolved variants closer to the designed model and might have also contributed to the increased pKa of the catalytic Glu101, as observed by changes in the surface potential of other enzymes.23 Changes in the active site of the evolved KE07s were also manifested in their fluorescence spectra. While we could not find a consistent explanation for these changes, the observed changes in tryptophan fluorescence of the free enzyme seem to largely parallel changes in the active-site structure and the improved kinetic parameters of the evolved variants (Table 1; Supplementary Fig. 3).

The crystal structure of one of the variants from the last round of directed evolution, KE07 round 7 1/3H, reveals unexpected changes in the active site. The symmetry contacts in the crystal structure involve the insertion of the polyhistidine tail (introduced in the construct) of one molecule in the asymmetric unit (molecule B) into the active site of a symmetry-related molecule within the same asymmetric unit (molecule A). The imidazole ring seems to stack against Trp50, in a mode reminiscent of the designed substrate binding, and the imidazole's nitrogen is within a hydrogen-bonding distance from Glu101 (Fig. 8a). Due to the similarity between imidazole and the oxazole moiety of the substrate and the potential for hydrogen bonding with the catalytic glutamate (benzimidazole was previously used as a hapten for generating catalytic antibodies for this reaction13), this mode of binding could be relevant and might reflect certain features of the substrate–enzyme, or product–enzyme, complexes. However, the active-site conformation in molecule A significantly differs from the designed conformation as observed in other structures, including molecule B of the same variant. In particular, the side chain of Trp50 that is supposed to stack against the substrate is rotated by >90° (Fig. 8b). Among other factors, it appears that this alternative conformation of Trp50 is related to the movement of Lys222 in response to the Ile7Asp mutation (Fig. 2). Due to changes in positioning of several side chains, the active-site surfaces of the two round 7 1/3H molecules are quite different: in molecule A with the flipped Trp50 conformation, Trp50 actually lies at the bottom of the active site, thus forming a shallow cavity (Fig. 8c) that differs from the designed cavity and the cavity observed in all other evolved variants, including molecule B within the same structure (Fig. 8d).
The increased flexibility of the active-site region also correlates with the observed changes in the temperature dependency of the evolved KE07 variants as described below. Interestingly, a flip of the substrate-stacking tryptophan was observed in another laboratory-made Kemp eliminase, the catalytic antibody 34E4.24 In this catalytic antibody, the flipped Trp conformation actually blocks the entrance to the active site. Pre-steady-state kinetics of binding of a TS analogue indicated that, in solution, the free 34E4 exists primarily in the flipped conformation and that isomerization to give the substrate-accessible active conformation is limiting the turnover rate of 34E4. In the evolved KE07 variants, however, the flipped conformation was observed only in one of the evolved variants, and only in one of the two molecules of the asymmetric unit. Pre-steady-state kinetics provided no support to the possibility that the flipped conformation actually exists in solution and is not the outcome of crystal packing forces. A linear rate of product release was observed in stopped-flow runs within the timescale of a single turnover, and pre-steady-state kinetics of ligand binding could not be

Fig. 8. The two conformers of the KE07 round 7 1/3H variant. The active-site conformation in molecule A significantly differs from the designed conformation that was observed in all other structures, including molecule B within the asymmetric unit of round 7 1/3H. (a) The imidazole ring of the polyhistidine chain of one molecule within the asymmetric unit (chain B; magenta) stacks against the side chain of Trp50 of the other molecule (chain A) and is within a hydrogen-bond distance from Glu101. (b) An overlay of the KE07 round 7 1/3H chain A (yellow) versus the designed model (green) showing the different positioning of the side chain of Trp50 and that of other residues, such as Lys222 and Arg202. The substrate (in magenta) is part of the KE07 model. (c) Surface image of KE07 round 7 1/3H chain A; note that Trp50 overlaps the substrate. (d) Surface image of KE07 round 7 1/3H chain B. Residues Trp50, Gly/Arg202, and Asn/Asp224 are shown in yellow. The catalytic Glu101 is shown in magenta. The substrate is overlaid from the KE07 model, as none of the structures was obtained with ligand.

performed since no suitable micromolar affinity ligand for the KE07s has been identified.

Thermal properties and stability

The KE07 design was based on a thermophilic imidazole-3-glycerolphosphate synthase (HisF, from T. maritima; PDB accession code 1THF). The design process also aimed at higher stability, as the energy of the enzyme–TS comprised the primary optimization factor. That is, at each step, sequences exhibiting the lowest configurational energy were chosen. This process resulted in the KE07 design being extremely stable: the apparent melting temperature (Tm) of the designed KE07, determined by CD spectroscopy, is >95 °C (Fig. 9a; Table 1). The evolved variants became increasingly less stable as the apparent Tm decreased from >95 °C for the designed KE07 and round 2

variants down to 72 °C for round 7 variants (Fig. 9a; Table 1). Directed evolution of KE07 was performed at ambient temperature, and the temperature dependency of the catalyzed reaction rates also changed. The enzymatic activity of the designed KE07 increased with the temperature and was 16-fold higher at 60 °C than at 20 °C (Fig. 9b; Table 1). This temperature dependency is similar to, or even slightly weaker than, the temperature dependency of the spontaneous reaction in buffer, whose rate is >60-fold higher at 60 °C than at 20 °C (Fig. 9b). The enzymatic activity of the evolved KE07 variants became less temperature dependent, with the change being gradual and consistent throughout the evolutionary process. By the seventh round, the reaction rate of KE07 variants became only ∼3-fold higher at 60 °C than at 20 °C (Fig. 9b; Table 1).
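The fold changes quoted above can be put on a common per-10 °C scale. The following is a minimal sketch, assuming that the rate changes by a constant factor for every 10 °C between 20 and 60 °C (the usual Q10 convention); the function name `q10` is ours, and the value for the spontaneous reaction is only a lower bound because the text quotes it as more than 60-fold.

```python
def q10(fold_change: float, t_low: float, t_high: float) -> float:
    """Fold increase in rate per 10 degrees implied by a fold change between two temperatures."""
    return fold_change ** (10.0 / (t_high - t_low))


# Fold changes between 20 and 60 degrees C quoted in the text.
quoted = [
    ("designed KE07", 16.0),
    ("spontaneous reaction (lower bound)", 60.0),
    ("round 7 variants", 3.0),
]

for label, fold in quoted:
    print(f"{label}: Q10 = {q10(fold, 20.0, 60.0):.2f}")
```

Under this assumption the designed KE07 comes out at 2-fold per 10 °C, while the round 7 variants fall to roughly 1.3-fold per 10 °C, well below the spontaneous reaction.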

Fig. 9. Changes in the thermostability and temperature dependency of the KE07 variants. (a) Melting curves of the KE07 variants determined by CD spectroscopy. (b) Relative rates of the Kemp elimination reaction catalyzed by various KE07 variants and of the spontaneous reaction in buffer at 20 to 60 °C. Rates were determined with 5-nitrobenzisoxazole as substrate under kcat/KM conditions and normalized to the activity at 20 °C.

Discussion

Certain properties of natural proteins are amenable to evolutionary optimization in the laboratory, but the degree and ease by which they can be optimized are dramatically different.25 For example, in natural enzymes, the catalytic efficiency for the native substrate is usually optimal and is not likely to further increase. The results with KE07 and with two other designed Kemp eliminases that possess different active-site configurations and mechanisms (KE59 and KE70; O.K. et al., unpublished results) support the notion that designed proteins are generally evolvable. However, at present, there appears to be a limit, or a ceiling, to the optimization

of KE07 in the range of 100-fold or so. In fact, 100-fold improvements are the norm in most directed evolution projects that target natural enzymes and may not be unique to designed enzymes. Although some directed evolution experiments resulted in >10,000-fold increases in activity via only a small number of mutations, these cases are relatively rare (for examples, see Refs. 26–28). It was also demonstrated that while the promiscuous functions of natural enzymes are plastic and can be easily altered by mutations, their native, primary functions, which had been under selection for long periods, are more robust to the effects of mutations.29 The Kemp eliminase function of the designed enzymes could be regarded as their primary, or native, function and could therefore be resistant to evolutionary optimization. Indeed, with one exception, the designed active-site residues were not mutated and the designed active-site configuration and mechanism remained unchanged. Nonetheless, the catalytic efficiency of the designed KE07 could be significantly improved. The catalytic efficiency (kcat/KM) of the most active designed–evolved KE07 is 2.6 × 10³ M⁻¹ s⁻¹, with a rate acceleration (kcat/kuncat) of ∼1.2 × 10⁶. While these numbers are comparable with other enzyme models, or mimics, including catalytic antibodies with Kemp eliminase activity,13,17 they are orders of magnitude lower than those observed for the most efficient natural enzymes. Higher throughput screens (only 800–1600 variants could be screened with the chromogenic 5-nitrobenzisoxazole substrate) might yield higher improvements. It is also conceivable that the improvement ceiling could be raised by exploring larger deviations in sequence space that introduce more dramatic changes in the active-site configuration, such as changes in loop lengths (insertions/deletions). 
Subsequent rounds of computational design may help minimize the huge sequence diversities that are associated with such changes (in particular, with insertions of additional residues) and thereby enable their exploration by low- or medium-throughput screens. The crystal structures of the designed KE07 and of its evolved variants from successive rounds of evolution provide a unique view of how an enzyme's catalytic machinery can be refined by sequence optimization. Directed evolution experiments generally aim at changing substrate specificity, rather than improving the catalytic machinery, and crystal structures of the evolved enzymes provide insights into how substrate specificity has been altered.30 In contrast, the series of evolved KE07 eliminases have the same substrate specificity and similar binding affinity for the substrate but progressively increased kcat values. The crystal structures in our study therefore illuminate how increases in catalytic prowess can be achieved. Fundamental aspects of enzyme catalysis can be examined, and in particular, the current shortcomings and paths for improvement of computational enzyme design methodologies can be identified.

The results of the evolutionary optimization suggest that the design represents a deep energy minimum that cannot be easily altered, certainly not by the number and type of mutations and the screening throughput applied here. The only mutation observed in a designed residue is Asn224Asp, and this residue is not a substrate-contacting residue. With the exception of the proton-donating Lys222, the main active-site features remained unchanged, as indicated by the near-complete loss of activity in Glu101Ala/Gln mutants.9 The mutations optimized the design, and in some cases they corrected inaccuracies and drawbacks of the design and refined the electrostatic properties of the active site. Indeed, as might be the case with natural enzymes,31 the electrostatic effects seem to be the most conspicuous changes and the ones that may account for most of the rate improvements. The issue of leaving-group stabilization is a clear example. Lys222 was placed in the original design to interact and stabilize the phenolic oxygen of the TS (Fig. 1a). The structural and pH-rate effects of the Ile7Asp mutation revealed that Lys222 actually interferes with catalysis, and its relocation away from the catalytic base (Glu101) resulted in increased rates. In fact, the same has been observed with the two other Kemp eliminase designs, KE59 and KE70, whereby the removal of the H-bond donor that was placed against the phenol oxygen (Ser131 and Ser137, respectively) increased catalytic efficiency (D.R. & O.K., unpublished results). Having H-donors in the active site, and especially in the vicinity of the catalytic base, reduces its basicity and catalytic power, as has been shown by mechanistic studies of the Kemp elimination.14 In the evolved variants from round 7, the predicted interaction between His201 and the phenolic oxygen of the TS (Fig. 
6) may potentially replace the role designated for Lys222 without “quenching” the catalytic base, although at present there is no kinetic evidence that supports a role for such interaction (e.g., efficient catalysis of benzisoxazoles with higher pKa leaving groups). Coordinating the position and the action of two opposing functions, such as deprotonation and protonation, is a matter of routine for natural enzymes, but the design of these features remains a major challenge. There are several lessons for computational design provided by the structures of the evolved variants. First, computational design processes must consider alternative arrangements of the catalytic residues, some of which may be slightly energetically disfavored (e.g., the interactions of His201). Second, small backbone readjustments can allow for introduction of dramatic new interactions (such as the interaction of Arg202 with the substrate; see Fig. 5). Third, improvements in computation of electrostatic interactions are necessary to predict salt bridges that favor catalysis. The major obstacle to this is that such surface interactions are weak on their own, but they can contribute significantly to the active site and entire protein architecture when combined or networked. This is evident by the detrimental effects

of disturbing the electrostatic networks at the bottom of the active site (Fig. 3) by mutating Asp7 and Lys222. The design should also fix or optimally restrict the mobility of the catalytic residues with additional designed interactions so that the intended interactions important for catalysis are indeed realized. This follows a primal requirement emphasized early on by Jencks:32 the catalytic residues need to be precisely positioned and preorganized to ensure that the interactions essential to catalysis are optimal.31 The catalytic base, for example, needs to be forced into the position necessary for abstraction of the proton, and this state is energetically unfavorable compared with interactions of the carboxylate with water molecules prior to substrate entry. The mobility of the carboxylate seen for most KE07 variants (except round 6 variants; see Supplementary Fig. 4) and the mobility of Lys222 that is corrected by the Ile7Asp mutation highlight the need for future computational efforts to incorporate second shell interactions that confine the catalytic residues to their desired conformations. Other effects of the evolutionary optimization relate to thermal properties of the KE07 variants. Directed evolution of the designed KE07 resulted in loss of stability on the one hand and in increase in enzymatic activity on the other. Gains of enzymatic functions at the expense of stability have been reported,28,33,34 but the two processes need not necessarily trade off, as exemplified by the directed evolution of enzymes that retain full activity at ambient temperature while gaining higher thermostability.35 The average mutation is destabilizing,36 and mutations that alter enzymatic functions comprise no exception.37 Thus, the accumulation of mutations (four to eight mutations per evolved KE07 variant) is expected to reduce stability to the degree observed here. 
However, the differences between the thermal properties of the designed KE07 and the evolved variants (Fig. 9; Table 1) suggest a link between the reduction in stability, increased flexibility of the active-site region, and increased catalytic efficiency.28 The temperature dependency of the designed enzyme largely follows the expected increase in rate for any reaction (ca 2-fold per 10 °C, and slightly more for the spontaneous Kemp elimination). This suggests not only that the designed KE07 is highly stable overall (Tm >95 °C) but also that its catalytic configuration is maintained at relatively high temperature (≤ 60 °C). Relative to the spontaneous reaction, the evolved variants actually lose their activity as the temperature increases. Thus, the active sites of the evolved KE07s seem to lose their catalytic configuration as the temperature rises, even before global unfolding occurs. The evolved active sites also became better preorganized for catalysis (the catalytic side chains are fixed in more optimal positions for catalysis), and the accompanying strain may have led to a parallel decrease in stability.28,34 The notion that active-site

Table 3. Summary of the crystallographic data collection and analysis

Parameter | KE07 round 4 1E/11H | KE07 round 6 3/7F | KE07 round 7 1/3H

Crystal parameters
Data collected | BM14 ESRF | ID14-2 | BM14 ESRF
Wavelength (λ) | 0.954 | 0.933 | 0.954
Crystallization conditions | 0.2 M NaF, 0.1 M BisTris, pH 8.5, 20% polyethylene glycol (PEG) 3350 | 0.2 M NaI, BisTris, pH 7.5, 20% PEG 3350 | 0.2 M MgCl₂, BisTris, pH 8.0, 20% PEG 6000
No. of copies in the asymmetric unit | 6 | 6 | 2
Space group | P3₁ | P3₁ | P2₁2₁2₁
Unit cell parameters
a (Å) | 106.58 | 106.87 | 38.77
b (Å) | 106.58 | 106.87 | 82.32
c (Å) | 128.75 | 127.66 | 175.40
α (°) | 90.00 | 90.00 | 90.00
β (°) | 90.00 | 90.00 | 90.00
γ (°) | 120.00 | 120.00 | 90.00
PDB accession code | 3IIO | 3IIP | 3IIV

Data collection
Resolution range (Å) | 50–2.25 | 50–2.30 | 50–1.80
Last resolution shell (Å) | 2.33–2.25 | 2.38–2.30 | 1.86–1.80
No. of observations | 333,948 | 354,016 | 313,607
No. of unique reflections | 77,187 | 72,858 | 52,231
Completeness (%)ᵃ | 99.1 (95.7) | 100.0 (100.0) | 98.1 (92.4)
Redundancy | 4.3 | 4.9 | 6.0
〈I〉/〈σ(I)〉ᵃ | 15.5 (2.0) | 19.6 (3.5) | 36.2 (12.3)
Rmergeᵇ on I (%)ᵃ | 9.2 (45.2) | 8.7 (46.2) | 4.9 (11.2)

Refinement and model statistics
Total no. of reflections | 73,236 | 69,140 | 49,076
No. of reflections in the test set | 3874 | 4981 | 2632
Water molecules | 220 | 167 | 319
Mean B value (Å²) | 34.8 | 26.6 | 21.4
Rcryst (%)ᶜ | 21.4 | 21.8 | 17.6
Rfree (%)ᵈ | 27.7 | 26.6 | 22.8
rmsd bond length (Å) | 0.015 | 0.032 | 0.026
rmsd bond angles (°) | 1.5 | 2.47 | 2.0

Stereochemical parameters: Ramachandran plot (%)
Residues in most favored regions | 90.9 | 91.2 | 92.6
Residues in additionally allowed regions | 6.8 | 7.9 | 6.0
Residues in generously allowed regionsᵉ | 1.8 | 0.9 | 0.9
Residues in disallowed regionsᵉ | 0.5 | 0.0 | 0.4

ᵃ Highest-resolution shell data are given in parentheses. The reason for the high 〈I〉/〈σ(I)〉 in the last shell of the structure of KE07 round 7 1/3H is that one of the crystal axes was long (175.4 Å) and the reflections were very close to each other at the distance at which the data were collected. Moving the detector closer caused the reflections to be too close to each other, and the reflections could not be separately integrated.
ᵇ $R_{\mathrm{merge}} = \sum_{hkl}\sum_{j} |I_j(hkl) - \langle I(hkl)\rangle| \,/\, \sum_{hkl}\sum_{j} I_j(hkl)$, where $I_j(hkl)$ and $\langle I(hkl)\rangle$ are the intensity of measurement j and the mean intensity for the reflection with indices hkl, respectively.
ᶜ $R_{\mathrm{cryst}} = \sum_{hkl} \big|\,|F_{\mathrm{obs}}| - |F_{\mathrm{calc}}|\,\big| \,/\, \sum_{hkl} |F_{\mathrm{obs}}|$, where $F_{\mathrm{obs}}$ denotes the observed structure factor amplitude and $F_{\mathrm{calc}}$ denotes the structure factor calculated from the model.
ᵈ Rfree is for 5% of randomly chosen reflections excluded from the refinement. The somewhat high Rfree (22.8%) of the KE07 round 7 1/3H structure is mostly due to the local disorder in one of the two copies in the asymmetric unit.
ᵉ Residues Arg202 and Asp224 were found in the disallowed or generously allowed regions in the structures of all the evolved KE07 variants.
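The Rmerge and Rcryst definitions in the table footnotes translate directly into code. The sketch below is illustrative only: the function names, the dictionary layout, and the toy intensities are our assumptions and are not taken from the deposited data. Rfree is the same quantity as Rcryst, evaluated on the reflections held out of refinement.

```python
def r_merge(measurements):
    """Rmerge from a dict mapping (h, k, l) to the list of repeated intensities I_j(hkl)."""
    numerator = 0.0
    denominator = 0.0
    for intensities in measurements.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """Rcryst (or Rfree on a held-out set) from paired structure factor amplitudes."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / sum(abs(o) for o in f_obs)


# Toy numbers, invented for illustration.
toy_intensities = {(1, 0, 0): [100.0, 110.0, 90.0], (0, 1, 0): [50.0, 50.0]}
print(f"Rmerge = {100 * r_merge(toy_intensities):.1f}%")
print(f"Rcryst = {100 * r_factor([10.0, 20.0], [9.0, 22.0]):.1f}%")
```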

optimization involved structural strain is supported by the fact that two mutated residues at the upper part of the active site (Gly202Arg and Asn224Asp) appear in the disallowed region or in the generously allowed region of the Ramachandran plot (Table 3) in the evolved KE07 variants. The increase in the pKa of the catalytic Glu101 and the loss of its interaction with Lys222 are also a possible source of instability since a deprotonated carboxylate within the desolvated environment of the enzyme–substrate complex is thermodynamically unfavored.

The difficulty of predicting the effects of protein stability and dynamics on the catalytic efficiency is another limitation of the design process that is highlighted by the directed evolution results. In contrast to the restriction of movement of the catalytic residues, an increase in flexibility of other regions of the active site could contribute favorably to substrate binding and/or product release.3,4 The alternative conformation seen in the round 7 1/3H variant, with the flipped Trp side chain, suggests that binding elements in the active sites of the evolved variants have become more

flexible. However, the actual existence of the “flipped” conformation in solution and its relevance for catalysis remain questionable. Structural data, such as complexes with substrate, or TS analogues, may provide further insights regarding this issue, and these insights might also be useful for the integration of dynamics in the computational design process. One can envisage, for example, a design strategy that aims at identifying sequences that fold with similar energy into two alternative conformations that are both relevant to the catalytic cycle, rather than a singular conformation applied in the design of KE07. Future developments may also enable a tighter information exchange between the computational design and the directed evolution processes, such as using computation to direct library designs so that mutagenesis is targeted to certain positions and/or mutational compositions or using the sequences and structures of the evolved variants to refine the design algorithm. In summary, optimization of a simple active site created by computational design was accomplished through a combination of (i) correction of a mistake in the designed site that “quenched” the catalytic glutamate via interactions with an adjacent lysine and (ii) tuning of the pKa of the catalytic glutamate and electrostatic optimization of the active site through the evolution of intricate salt-bridge networks. These changes were accompanied by a decrease in the overall stability of the evolved enzymes and in the configurational stability of their active sites. The current and non-trivial challenge is to improve the computational design methodology to incorporate the modeling of these and other contributions and thus enable the computational design of highly active enzyme catalysts.

Materials and Methods

Cloning

The synthetic gene encoding the designed KE07 protein was purchased from Codon Devices, Inc., and the gene was cloned into His-tag expression vector pET29b (Novagen) using Nde and Xho restriction sites. In the libraries, the Nde site was replaced by Nco, which involved addition of amino acid alanine to the N-terminus, after the initial methionine.

Library making

Several mutagenesis methods were used to create genetic diversity in KE07 genes. Two random mutagenesis techniques were used: (i) error-prone PCR using the “wobble” base analogues dPTP (1–4 μM) and 8-oxo-dGTP (25–100 μM)21 and (ii) error-prone PCR with Mutazyme (Genemorph™ PCR mutagenesis kit, Stratagene).22 At certain mutagenesis rounds (Table 2), the genes were shuffled.38 Various positions in the genes were randomized by incorporation of spiking oligonucleotides during assembly of DNA fragments.39 After mutagenesis, the KE07 genes were recloned into the original pET29b

plasmid and the ligated DNA was transformed into Escherichia coli DH5α cells. Transformants (10⁴–10⁵) were obtained, and the plasmids encoding the libraries were extracted. It should be noted that random mutagenesis and gene shuffling are not exclusive; that is, during random mutagenesis, the genes are shuffled and the shuffling procedure incorporated a certain level of random mutations created by polymerase errors (one mutation per gene, on average).

Screening procedure

The substrate, 5-nitrobenzisoxazole, was prepared by nitration of benzisoxazole.18 Its identity and purity were confirmed by ¹H NMR and MS. The libraries were screened by growing the cultures of E. coli in 96-deep-well plates and testing the activity of the crude lysates with 5-nitrobenzisoxazole. Briefly, E. coli BL21 (DE3) cells transformed with the libraries were grown on LB agar plates (containing 50 μg/ml of kanamycin). Individual colonies were inoculated into 2YT supplemented with 50 μg/ml of kanamycin (300 μl) in 96-deep-well plates and grown overnight at 37 °C. Overnight cultures (20 μl) were inoculated into 2YT supplemented with 50 μg/ml of kanamycin (500 μl) in 96-deep-well plates and grown to an OD600 of ∼0.6. Overexpression was induced by adding 1 mM IPTG; the cultures were grown for another 5 h and centrifuged, and the pellet was frozen overnight at −20 °C. The cells were lysed with lysis buffer (Hepes, pH 7.25, 50 mM, 0.2% triton, 0.1 mg/ml of lysozyme, 250 μl/well); the lysates were cleared by centrifugation and assayed with 5-nitrobenzisoxazole (0.125 mM) by following the release of the phenol product at 380 nm (PowerWave HT microtiter scanning spectrophotometer). Overnight cultures of the most active clones were plated on LB agar plates containing 50 μg/ml of kanamycin. To ensure monoclonality and verify the activity of the selected variants, we reassayed the hydrolysis rates after growing two subclones from each original colony under the same conditions described above. Plasmids were extracted and used for sequencing and as templates for subsequent mutagenesis and screening rounds.

Characterization of purified enzyme variants

Variants subjected to detailed analysis were retransformed into E. coli BL21 (DE3) cells. Five milliliters of 2YT medium supplemented with 50 μg/ml of kanamycin was inoculated with a single colony and shaken at 37 °C for ∼15 h. 2YT medium (500 ml) supplemented with 50 μg/ml of kanamycin was inoculated with 5 ml of the overnight culture and grown at 37 °C until an OD600 of ∼0.6 was reached. Overexpression was induced by adding 1 mM IPTG; the cultures were grown for another 5 h and harvested, and the pellet was frozen overnight at −20 °C. The cells were resuspended in lysis buffer (Hepes, pH 7.25, 25 mM, 5% glycerol, 100 mM NaCl, 50 ml/l of expression culture) and lysed by sonication. The soluble fraction was loaded onto a Ni-NTA (nitrilotriacetic acid) column (Qiagen) and washed with 10 and 20 mM imidazole; the protein was eluted with 250 mM imidazole. This protocol yielded ≥ 90% pure protein as judged by SDS-PAGE. After extensive dialysis against lysis buffer, protein concentrations were determined using a BCA protein assay kit (Pierce), and the samples were stored at 4 °C, supplemented with 0.02% sodium azide.

For the kinetic characterization, the reactions were started by adding 150 μl of substrate (final concentration of 0.13–1.05 mM) in 25 mM Hepes, pH 7.25, with 100 mM NaCl to 50 μl of KE07 variant (various concentrations were used for different variants) in 25 mM Hepes, pH 7.25, with 100 mM NaCl and 5% glycerol (or no protein for the background reaction) in a 96-well plate. 5-Nitrobenzisoxazole was used from 0.1 M stock in acetonitrile, the cosolvent percentage was equalized to 1.5%, and the glycerol percentage was 1.25% in all the reaction mixtures. Product formation was monitored spectrophotometrically at 380 nm (PowerWave HT microtiter scanning spectrophotometer), in 200-μl reaction volumes, using 96-well plates. The reported results are the average of at least three independent measurements.
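Initial rates from an assay of this kind are fitted to the Michaelis–Menten equation (see Data analysis). The following is a minimal sketch of such a fit, not the authors' procedure: it swaps the nonlinear fit performed in Kaleidagraph for a Hanes–Woolf linearization, [S]/v0 = [S]/Vmax + KM/Vmax, so that only the Python standard library is needed. All parameter values, the even spacing of the eight substrate concentrations, and the function names are hypothetical, and the rates are noise-free synthetic data.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_michaelis_menten(s, v0, e0):
    """Estimate (kcat, KM) from the Hanes-Woolf form [S]/v0 = [S]/Vmax + KM/Vmax."""
    slope, intercept = linear_fit(s, [si / vi for si, vi in zip(s, v0)])
    vmax = 1.0 / slope
    return vmax / e0, intercept * vmax


def kcat_over_km_low_substrate(s, v0, e0):
    """Slope of v0 versus [S] through the origin, divided by [E]0 (valid for [S] << KM)."""
    slope = sum(si * vi for si, vi in zip(s, v0)) / sum(si * si for si in s)
    return slope / e0


# Synthetic illustration only: these parameters are invented, not measured values.
TRUE_KCAT, TRUE_KM, E0 = 1.0, 1.0e-3, 1.0e-6  # s^-1, M, M
substrate = [0.13e-3 + i * (1.05e-3 - 0.13e-3) / 7 for i in range(8)]  # eight points, 0.13-1.05 mM
rates = [TRUE_KCAT * E0 * s / (s + TRUE_KM) for s in substrate]

kcat, km = fit_michaelis_menten(substrate, rates, E0)
print(f"kcat = {kcat:.3f} s^-1, KM = {km * 1e3:.3f} mM")
```

With real, noisy data a nonlinear least-squares fit of the untransformed equation is generally preferable, because the linearization distorts the error structure.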

two independent measurements. Measurements above 60 °C were precluded by high spontaneous substrate decomposition.

Fluorescence measurements

Fluorescence spectra of KE07 variants (7–10 μM in phosphate buffer, pH 7.0, supplemented with 50 mM NaCl) were obtained using a Cary Eclipse fluorescence spectrophotometer (Varian) by excitation at 280 nm and fluorescence monitoring at 300–550 nm. The data presented are the average of triplicate measurements.

Data analysis

pH-rate profile

For KE07 round 2 11/10D, round 6 3/7F, round 7 1/3H, and round 7 10/11G, kcat and KM values were determined with 5-nitrobenzisoxazole at pH 4.0–9.0. Initial velocities (v0) were determined at eight substrate concentrations (0.13–1.05 mM). For KE07 round 4 1E/11H, kcat/KM values were determined with 5-nitrobenzisoxazole at pH 4.0–9.0 by measuring the initial velocities (v0) with low substrate concentrations (0.1–0.3 mM). It was impossible to obtain a pH-rate profile for the original KE07 design because of its low activity, and round 2 11/10D was assumed to represent the pH-rate profile of the KE07 design. The buffers used (at 50 mM) were citrate, pH 4.0–5.5; Mes, pH 5.5–6.5; and Bis-Tris propane, pH 6.5–9.0. The ionic strength was adjusted to a total of 0.1 M with NaCl. The enzyme stocks were kept in 10 mM Hepes, pH 7.25, containing 100 mM NaCl and 5% glycerol. At pH 5.5 and pH 6.5, the activity was measured with both relevant buffers.
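The single-ionization model used to fit such pH-rate profiles (given under Data analysis) predicts half of the plateau rate when the pH equals the apparent pKa. The sketch below simply evaluates that model over the pH range studied; the plateau value and the pKa of 6.0 are hypothetical numbers chosen for illustration, not fitted values from this work.

```python
def ph_rate(ph, k_max, pka):
    """Rate constant at a given pH for one titratable group that is active when deprotonated."""
    ka = 10.0 ** (-pka)
    return k_max * ka / (10.0 ** (-ph) + ka)


K_MAX = 1.0  # plateau value in arbitrary units (hypothetical)
PKA = 6.0    # apparent pKa, chosen only for illustration

for ph in (4.0, 5.0, 6.0, 7.0, 8.0, 9.0):
    print(f"pH {ph:.1f}: k/k_max = {ph_rate(ph, K_MAX, PKA) / K_MAX:.3f}")
```

The same function applies to either kcat or kcat/KM, with `k_max` standing for the corresponding plateau value.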

Kinetic parameters (kcat, KM, and kcat/KM) were obtained by fitting the data to the Michaelis–Menten equation, $v_0 = k_{\mathrm{cat}}[E]_0[S]_0/([S]_0 + K_M)$, using the program Kaleidagraph 5.0. When working with low substrate concentrations, the data were fitted to the linear regime of the Michaelis–Menten model, $v_0 = [S]_0[E]_0\,k_{\mathrm{cat}}/K_M$, and kcat/KM was deduced from the slope. All the data presented are the averages of at least three independent experiments with standard deviations. The pH-rate data (kcat and kcat/KM values) for each pH value [(kcat)H and (kcat/KM)H] were fitted using the equations

$$(k_{\mathrm{cat}})_H = (k_{\mathrm{cat}})_{\max} \times \frac{10^{-\mathrm{p}K_a}}{10^{-\mathrm{pH}} + 10^{-\mathrm{p}K_a}}$$

and

$$(k_{\mathrm{cat}}/K_M)_H = (k_{\mathrm{cat}}/K_M)_{\max} \times \frac{10^{-\mathrm{p}K_a}}{10^{-\mathrm{pH}} + 10^{-\mathrm{p}K_a}},$$

where (kcat)max and (kcat/KM)max are the plateau values of kcat and kcat/KM, respectively, and pKa is the apparent pKa value for the acidic group. The apparent midpoint temperatures of melting ($T_m^{\mathrm{app}}$) were obtained by fitting the normalized ellipticity data (ɛ) to the following equation:

$$\varepsilon = \frac{(\alpha_N + p\,T) + (\alpha_U + q\,T)\,\exp\!\left[m_{N-U}\,(T - T_m^{\mathrm{app}})/RT\right]}{1 + \exp\!\left[m_{N-U}\,(T - T_m^{\mathrm{app}})/RT\right]}$$
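The two-state melting equation can be checked numerically: at the apparent midpoint temperature the exponential term equals 1, so the signal is the mean of the native and unfolded baselines. The sketch below evaluates the model under stated assumptions: temperatures are taken in kelvin, R is the gas constant in J mol⁻¹ K⁻¹, and the transition slope is a hypothetical value in matching units. The midpoint of 72 °C is the value quoted in the text for round 7 variants; every other number is invented for illustration.

```python
import math

GAS_CONSTANT = 8.314  # J mol^-1 K^-1


def two_state_ellipticity(t, alpha_n, p, alpha_u, q, m, tm):
    """Normalized ellipticity at temperature t (K) for a two-state unfolding transition."""
    k = math.exp(m * (t - tm) / (GAS_CONSTANT * t))
    return ((alpha_n + p * t) + (alpha_u + q * t) * k) / (1.0 + k)


TM = 72.0 + 273.15  # apparent midpoint quoted for round 7 variants, in kelvin
SLOPE = 5000.0      # hypothetical transition slope m_(N-U)

# Flat baselines (p = q = 0), native signal 1 and unfolded signal 0.
for celsius in (25.0, 60.0, 72.0, 80.0, 95.0):
    value = two_state_ellipticity(celsius + 273.15, 1.0, 0.0, 0.0, 0.0, SLOPE, TM)
    print(f"{celsius:5.1f} C: normalized signal = {value:.3f}")
```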

Directed evolution rounds of KE07

The 10 rounds of directed evolution of KE07 are summarized in Table 2. A detailed description of the directed evolution rounds can be found in Supplementary Material.

CD measurements

Far-UV CD wavelength scans (210–300 nm) were collected on an Aviv 202 spectrometer in a 1-mm pathlength cuvette. The proteins were used at an ∼10 μM concentration in 50 mM phosphate buffer at pH 7.0. For melting curves, the samples were heated at a rate of 2 °C/min from 25 to 95 °C, and the CD ellipticity signal at 223 nm, which showed the maximal change with the temperature, was monitored.

Temperature profile of KE07 variants

Activity of KE07 variants with 0.25 mM 5-nitrobenzisoxazole was determined at 20–60 °C by following the absorbance at 380 nm (Cary 50 Bio UV-Visible Spectrophotometer, Varian). Proteins at various concentrations were used in 50 mM phosphate buffer at pH 7.0, and blank reaction rates (without the KE07 proteins) were subtracted. The data presented are the average of at least

In the thermal denaturation equation, αN is the normalized ellipticity signal at T = 25 °C, p is the slope of the ellipticity change of the native state, αU is the normalized ellipticity signal at T = 95 °C, q is the slope of the ellipticity change of the unfolded state, and mN−U is the slope of the transition.23 In cases where the denaturation was not complete at 95 °C, $T_m^{app}$ could only be roughly estimated.

Docking of the TS model into the crystal structures of the KE07 design and its evolved variants
Each copy of the protein in the asymmetric unit was superimposed on the designed model of KE07, and the xyz coordinates of the TS model were copied into the crystal structures. This position served as a starting point for the subsequent docking protocols. During all the repack and minimization steps, geometric constraints between the oxygen of Glu101 (abstracting the proton) and the carbon of the TS (from which the proton is abstracted) were applied to ensure a catalytically productive conformation (since all variants show enzymatic activity).

To explore the potential conformational space of the TS–enzyme complex, we used “dock_repack.”20 First, the rigid-body position of the TS was randomly perturbed in a box of 2 Å × 2 Å and rotated up to 45°, and 100 models were generated. For these 100 models, the total energy of the system was minimized while allowing the rigid-body position of the TS to move and the amino acid side chains to repack (i.e., to adopt different conformations). Of the 10 best structures for each copy, the 5 predictions with the lowest TS energy were selected for inspection.

A “dock_minimize” protocol was also applied, assuming that the conformation of the side chains found in the crystal structure does not change upon substrate binding. The total energy of the starting conformation was minimized while the rigid-body position of the TS was allowed to move, but only small changes in side-chain torsion angles were allowed (i.e., the side chains stayed in the crystallized rotameric state). One model was generated per starting point and analyzed directly.
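The sampling and selection logic of the “dock_repack” step described above can be sketched as follows. This is a hypothetical illustration and not the Rosetta protocol itself: the energies are random placeholders standing in for scores from a real minimization, the perturbation box is assumed to extend ±1 Å along each of three axes, and the "10 best structures" are assumed to be ranked by total energy. The names `Model`, `random_perturbation`, and `select_models` are invented for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Model:
    shift: tuple         # rigid-body translation of the TS in Å (x, y, z)
    angle_deg: float     # rotation angle applied to the TS, in degrees
    total_energy: float  # placeholder for the minimized total energy
    ts_energy: float     # placeholder for the TS (ligand) energy term


def random_perturbation(rng, box=2.0, max_angle=45.0):
    """Draw a translation inside a box of edge `box` and a rotation up to `max_angle`."""
    shift = tuple(rng.uniform(-box / 2, box / 2) for _ in range(3))
    angle = rng.uniform(0.0, max_angle)
    return shift, angle


def select_models(models, n_total=10, n_ts=5):
    """Keep the n_total lowest total energies, then the n_ts lowest TS energies."""
    best_total = sorted(models, key=lambda m: m.total_energy)[:n_total]
    return sorted(best_total, key=lambda m: m.ts_energy)[:n_ts]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
models = []
for _ in range(100):    # 100 perturbed starting models per protein copy
    shift, angle = random_perturbation(rng)
    # Random stand-in energies; a real run would score each minimized model.
    models.append(Model(shift, angle, rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)))

selected = select_models(models)
for m in selected:
    print(round(m.total_energy, 2), round(m.ts_energy, 2))
```

The two-stage filter first removes models with poor overall energy and only then ranks by the TS term, so that a favorable TS energy cannot compensate for a strained protein environment.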

Crystallization, data collection, and refinement
The evolved KE07 variants subjected to crystallization were purified on a Ni-NTA HiTrap chelating HP column (Amersham), followed by gel filtration (HiLoad 16/60 Superdex™, Amersham). Crystals of the evolved KE07 variants were obtained by the microbatch method under oil by using the Oryx6 robot (Douglas Instruments Ltd., East Garston, Hungerford, Berkshire, UK). The protein concentration used for crystallization of all variants was 20–25 mg/ml, and all the crystals were grown at 4 °C. Diffraction data were integrated, scaled, and reduced using the HKL2000 program package.40 The structures of the evolved variants were solved using the KE07 structure as a model (PDB accession code 2RKX). All the steps of atomic refinement were carried out with the program CCP4/Refmac5.41 The models were built into σA-weighted 2Fobs − Fcalc and Fobs − Fcalc maps using the program COOT.42 Water molecules were built into peaks greater than 3σ in the Fobs − Fcalc maps.

The final refined structure of KE07 round 4 1E/11H contained six (A–F) copies in the asymmetric unit, each comprising residues 2–250. While monomers A–D exhibited a well-defined electron density, a few regions of monomers E and F did not. The final refined structure of KE07 round 7 1/3H contained two copies in the asymmetric unit. The regions encompassed by residues 22–30 and 54–60 of monomer B did not exhibit a well-defined electron density. The final refined structure of KE07 round 6 3/7F contained six (A–F) copies in the asymmetric unit, each comprising residues 2–250. The coordinates of all the KE07 variants have been deposited in the PDB, and their accession codes are listed in Table 3. The KE07 models were evaluated with the program PROCHECK.43 Details of the data collection and structure refinement are described in Table 3.

Accession numbers
Coordinates and structure factors for the structures of KE07 variants have been deposited in the PDB with the following accession numbers: 2RKX for the KE07 design,9 3IIO for round 4 1E/11H, 3IIP for round 6 3/7F, and 3IIV for round 7 1/3H.

Acknowledgements
We gratefully acknowledge financial support by the BioModularH2 EU Network, the Sasson and Marjorie Peress Philanthropic Fund, the Defense Advanced Research Projects Agency, and the Adams Fellowship (Israel Academy of Science) to O.K. We thank the reviewers for their insightful comments.

Supplementary Data
Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.jmb.2009.12.031

References
1. Wolfenden, R. & Snider, M. J. (2001). The depth of chemical time and the power of enzymes as catalysts. Acc. Chem. Res. 34, 938–945.
2. Boehr, D. D., Nussinov, R. & Wright, P. E. (2009). The role of dynamic conformational ensembles in biomolecular recognition. Nat. Chem. Biol. 5, 789–796.
3. Pisliakov, A. V., Cao, J., Kamerlin, S. C. & Warshel, A. (2009). Enzyme millisecond conformational dynamics do not catalyze the chemical step. Proc. Natl Acad. Sci. USA, 106, 17359–17364.
4. Jackson, C. J., Foo, J. L., Tokuriki, N., Afriat, L., Carr, P. D., Kim, H. K. et al. (2009). Conformational sampling, catalysis, and evolution of the bacterial phosphotriesterase. Proc. Natl Acad. Sci. USA, 106, 21631–21636.
5. Kraut, D. A., Carroll, K. S. & Herschlag, D. (2003). Challenges in enzyme mechanism and energetics. Annu. Rev. Biochem. 72, 517–571.
6. Das, R. & Baker, D. (2008). Macromolecular modeling with Rosetta. Annu. Rev. Biochem. 77, 363–382.
7. Jiang, L., Althoff, E. A., Clemente, F. R., Doyle, L., Rothlisberger, D., Zanghellini, A. et al. (2008). De novo computational design of retro-aldol enzymes. Science, 319, 1387–1391.
8. Rohl, C. A., Strauss, C. E., Misura, K. M. & Baker, D. (2004). Protein structure prediction using Rosetta. Methods Enzymol. 383, 66–93.
9. Rothlisberger, D., Khersonsky, O., Wollacott, A. M., Jiang, L., DeChancie, J., Betker, J. et al. (2008). Kemp elimination catalysts by computational enzyme design. Nature, 453, 190–195.
10. Casey, M. L., Kemp, D. S., Paul, K. G. & Cox, D. D. (1973). The physical organic chemistry of benzisoxazoles: I. The mechanism of the base-catalyzed decomposition of benzisoxazoles. J. Org. Chem. 38, 2294–2301.
11. Kemp, D. S., Cox, D. D. & Paul, K. G. (1975). The physical organic chemistry of benzisoxazoles: IV. The origins and catalytic nature of the solvent rate acceleration for the decarboxylation of 3-carboxybenzisoxazoles. J. Am. Chem. Soc. 97, 7312–7318.
12. Hollfelder, F., Kirby, A. J. & Tawfik, D. S. (2001). On the magnitude and specificity of medium effects in enzyme-like catalysts for proton transfer. J. Org. Chem. 66, 5866–5874.
13. Thorn, S. N., Daniels, R. G., Auditor, M. T. & Hilvert, D. (1995). Large rate accelerations in antibody catalysis by strategic use of haptenic charge. Nature, 373, 228–230.
14. Seebeck, F. P. & Hilvert, D. (2005). Positional ordering of reacting groups contributes significantly to the efficiency of proton transfer at an antibody active site. J. Am. Chem. Soc. 127, 1307–1312.
15. Debler, E. W., Ito, S., Seebeck, F. P., Heine, A., Hilvert, D. & Wilson, I. A. (2005). Structural origins of efficient proton abstraction from carbon by a catalytic antibody. Proc. Natl Acad. Sci. USA, 102, 4984–4989.
16. Hollfelder, F. & Tawfik, D. (1997). Efficient catalysis of proton transfer by synzymes. J. Am. Chem. Soc. 119, 9578–9579.
17. Hollfelder, F., Kirby, A. J. & Tawfik, D. S. (1996). Off-the-shelf proteins that rival tailor-made antibodies as catalysts. Nature, 383, 60–62.
18. Hollfelder, F., Kirby, A. J., Tawfik, D. S., Kikuchi, K. & Hilvert, D. (2000). Characterization of proton-transfer catalysis by serum albumins. J. Am. Chem. Soc. 122, 1022–1029.
19. Hu, Y., Houk, K. N., Kikuchi, K., Hotta, K. & Hilvert, D. (2004). Nonspecific medium effects versus specific group positioning in the antibody and albumin catalysis of the base-promoted ring-opening reactions of benzisoxazoles. J. Am. Chem. Soc. 126, 8197–8205.
20. Davis, I. W. & Baker, D. (2009). RosettaLigand docking with full ligand and receptor flexibility. J. Mol. Biol. 385, 381–392.
21. Zaccolo, M., Williams, D. M., Brown, D. M. & Gherardi, E. (1996). An approach to random mutagenesis of DNA using mixtures of triphosphate derivatives of nucleoside analogues. J. Mol. Biol. 255, 589–603.
22. Barlow, M. & Hall, B. G. (2002). Predicting evolutionary potential: in vitro evolution accurately reproduces natural evolution of the TEM beta-lactamase. Genetics, 160, 823–832.
23. Fersht, A. (1999). Structure and Mechanism in Protein Science. W.H. Freeman and Company, New York, NY.
24. Debler, E. W., Muller, R., Hilvert, D. & Wilson, I. A. (2008). Conformational isomerism can limit antibody catalysis. J. Biol. Chem. 283, 16554–16560.
25. Romero, P. A. & Arnold, F. H. (2009). Exploring protein fitness landscapes by directed evolution. Nat. Rev. Mol. Cell Biol. 10, 866–876.
26. Vick, J. E., Schmidt, D. M. & Gerlt, J. A. (2005). Evolutionary potential of (beta/alpha)8-barrels: in vitro enhancement of a “new” reaction in the enolase superfamily. Biochemistry, 44, 11722–11729.
27. Varadarajan, N., Gam, J., Olsen, M. J., Georgiou, G. & Iverson, B. L. (2005). Engineering of protease variants exhibiting high catalytic activity and exquisite substrate selectivity. Proc. Natl Acad. Sci. USA, 102, 6855–6860.
28. Wang, X., Minasov, G. & Shoichet, B. K. (2002). Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J. Mol. Biol. 320, 85–95.
29. Amitai, G., Gupta, R. D. & Tawfik, D. S. (2007). Latent evolutionary potentials under the neutral mutational drift of an enzyme. HFSP J. 1, 67–78.
30. Fasan, R., Meharenna, Y. T., Snow, C. D., Poulos, T. L. & Arnold, F. H. (2008). Evolutionary history of a specialized P450 propane monooxygenase. J. Mol. Biol. 383, 1069–1080.
31. Warshel, A. (1998). Electrostatic origin of the catalytic power of enzymes and the role of preorganized active sites. J. Biol. Chem. 273, 27035–27038.
32. Jencks, W. P. (1975). Binding energy, specificity, and enzymic catalysis: the Circe effect. Adv. Enzymol. Relat. Areas Mol. Biol. 43, 219–410.
33. Meiering, E. M., Serrano, L. & Fersht, A. R. (1992). Effect of active site residues in barnase on activity and stability. J. Mol. Biol. 225, 585–589.
34. Roca, M., Liu, H., Messer, B. & Warshel, A. (2007). On the relationship between thermal stability and catalytic power of enzymes. Biochemistry, 46, 15076–15088.
35. Arnold, F. H., Wintrode, P. L., Miyazaki, K. & Gershenson, A. (2001). How enzymes adapt: lessons from directed evolution. Trends Biochem. Sci. 26, 100–106.
36. Tokuriki, N., Stricher, F., Schymkowitz, J., Serrano, L. & Tawfik, D. S. (2007). The stability effects of protein mutations appear to be universally distributed. J. Mol. Biol. 369, 1318–1332.
37. Tokuriki, N., Stricher, F., Serrano, L. & Tawfik, D. S. (2008). How protein stability and new functions trade off. PLoS Comput. Biol. 4, e1000002.
38. Abecassis, V., Pompon, D. & Truan, G. (2000). High efficiency family shuffling based on multi-step PCR and in vivo DNA recombination in yeast: statistical and functional analysis of a combinatorial library between human cytochrome P450 1A1 and 1A2. Nucleic Acids Res. 28, E88.
39. Herman, A. & Tawfik, D. S. (2007). Incorporating Synthetic Oligonucleotides via Gene Reassembly (ISOR): a versatile tool for generating targeted libraries. Protein Eng. Des. Sel. 20, 219–226.
40. Otwinowski, Z. & Minor, W. (1997). Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307–326.
41. Murshudov, G. N., Vagin, A. A. & Dodson, E. J. (1997). Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr., Sect. D: Biol. Crystallogr. 53, 240–255.
42. Emsley, P. & Cowtan, K. (2004). Coot: model-building tools for molecular graphics. Acta Crystallogr., Sect. D: Biol. Crystallogr. 60, 2126–2132.
43. Laskowski, R. A., MacArthur, M. W., Moss, D. S. & Thornton, J. M. (1993). PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallogr. 26, 283–291.