Journal of Biotechnology 66 (1998) 11 – 26
Protein engineering the surface of enzymes
Steffen B. Petersen a,*, Per Harald Jonson a, Peter Fojan a, Evamaria I. Petersen a, Maria Teresa Neves Petersen a, Sissel Hansen b, Rodney J. Ishak b, Edward Hough b
a Biostructure and Protein Engineering Laboratory, Department of Biotechnology, University of Aalborg, Sohngaardsholmsvej 57, DK-9000, Aalborg, Denmark
b Department of Chemistry, Faculty of Science, University of Tromsø, N-9037 Tromsø, Norway
Received 14 November 1997; received in revised form 19 June 1998; accepted 1 July 1998
Abstract
The protein surface is the interface through which a protein senses the external world. Its composition of charged, polar and hydrophobic residues is crucial for the stability and activity of the protein. The charge state of seven of the twenty naturally occurring amino acids is pH dependent. A total of 95% of all titratable residues are located on the surface of soluble proteins. In evolutionarily related families of proteins such residues are particularly prone to substitutions, insertions and deletions. We present here an analysis of the residue composition of 4038 proteins, selected from 125 protein families with <25% identity between core members of each family. Whereas only 16.8% of the residues were truly buried, 40.7% were >30% exposed on the surface and the remainder were <30% exposed. The individual residue types show distinct differences. The data presented provide an important new approach to protein engineering of protein surfaces. Guidelines for the optimization of solvent exposure for a given residue are given. The cutinase family of enzymes has been investigated. The stability of native cutinase has been studied as a function of pH, and has been compared with the cutinase activity towards tributyrin. Whereas the onset of enzymatic activity is linked with the deprotonation of the active site HIS188, destabilization of the 3D structure as determined by differential scanning calorimetry is coupled with the loss of activity at very basic pH values. A modeling investigation of the pH dependence of the electrostatic potentials reveals that the activity range is accompanied by the development of a highly significant negative potential in the active site cleft. The 3D structures of three mutants of the Fusarium solani pisi cutinase have been solved to high resolution using X-ray diffraction analysis. Preliminary X-ray data are presented. © 1998 Published by Elsevier Science B.V. All rights reserved.
Keywords: Protein engineering; Protein stability; Entropy stabilisation; Cutinase; Microcalorimetry; Protein surface; Protein electrostatics
* Corresponding author. Tel.: +45 96 358080; fax: + 45 98 142555; e-mail:
[email protected] 0168-1656/98/$ - see front matter © 1998 Published by Elsevier Science B.V. All rights reserved. PII S0168-1656(98)00153-9
S.B. Petersen et al. / Journal of Biotechnology 66 (1998) 11–26
1. Introduction
The surface of a protein is composed of charged, hydrophilic and hydrophobic residues. For soluble, globular proteins the fraction of residues capable of facilitating the transfer of the protein into the aqueous phase is higher at the surface than in the core of the protein. More than 95% of all charged residues reside on the surface of the protein, and when charged residues are buried they almost always participate in salt bridge formation or take part in extensive hydrogen bond networks. The protein surface is the point of contact with the solvent, whether this is water, organic solvent or other media. Thus the static or dynamic conditions of the physical chemical environment are sensed intrinsically through the surface residues. The binding of a ligand to a protein, e.g. a substrate to an enzyme, constitutes a special case of a change in the physical chemical environment of the protein and will be given special attention in the present paper. Titratable residues are influenced by pH, and depending on their individual pK values the choice of pH determines whether or not a titratable residue carries charge. Such changes may interfere with, e.g. the enzymatic activity of an enzyme, and may also alter its thermostability. Likewise, an increase and sometimes also a decrease in temperature may lead to denaturation of the enzyme, thus abolishing catalytic activity. The protein surface thus constitutes a very challenging and highly relevant target for protein engineering, and mutations introduced are likely to influence one or more of the parameters alluded to above. A common view is that the protein surface consists of hydrophilic residues, whereas the interior consists of hydrophobic residues. The wealth of 3D structural information stemming from X-ray diffraction analysis as well as multidimensional NMR studies has prompted investigations into the accurate amino acid composition of the protein surface as compared to the protein interior.
Such analyses consistently indicate that the interior of proteins contains more hydrophobic residues than the protein surface, thus supporting the popular view that one central mechanism contributing to protein folding involves a collapse of a hydrophobic core. Another result that is much less appreciated is that the protein surface may still contain large numbers of hydrophobic residues. Thus in principle a protein surface may display all types of residues. From a physical point of view this indicates a significant amount of ‘frustration’: it is impossible for the protein to ‘hide’ all hydrophobic residues in the core of the protein. At least this appears to be true for a representative set of naturally occurring proteins. One may speculate whether this ‘frustration’ contributes to the intrinsic lability of protein structures: all seem to be surprisingly unstable, most displaying a net stabilization of only a few kcal mol⁻¹. This intrinsic instability may be the underlying mechanism that provides the necessary structural flexibility when a protein adapts its 3D structure in response to the presence of a ligand. Earlier investigations have used several different ways of defining whether or not a residue is on the surface or in the interior of a protein, but all investigations are based on a static structure, normally determined by X-ray crystallography. The earlier investigations can be divided into two different classes, those based on accessible area and those based on fractions of accessibility, as can be seen from Table 1. Reasons for the different choices vary, e.g. 27% accessibility (Go and Miyazawa 1980), because this gave the same number of residues in each set (surface and interior).

Table 1
Overview of some earlier investigations of protein surfaces and interiors

Authors                      Decision criteria (a)
Miller et al. (1987)         5%
Chothia (1976)               0%, 5%
Chothia (1975)               5%
Janin (1979)                 0, 10 and 20 Å²
Go and Miyazawa (1980)       27%
Koshi and Goldstein (1995)   18%

(a) The decision criteria are the criteria used in each case to decide whether or not a residue is on the surface or in the interior.

A globular, monomeric protein is normally considered to be an approximately spherical structure. This very simplified view is somewhat misleading. Many proteins display highly oblate 3D structures (e.g. calmodulin). Also, at the more microscopic level, the protein surface is far from smooth: the individual amino acid residues protrude from the surface in a complex manner. In the active site region this is often responsible for the specificity of the enzyme. The local surface residues likewise influence secondary binding sites on the protein surface. Both the stability and activity of a protein as a function of pH are linked with the protein surface composition. The exact location and nature of titratable residues on the protein surface will determine its titration behavior. Whereas aspartic acid and glutamic acid both display negative charges if pH > 5, lysine and arginine display positive charges if pH < 9. Histidine, which often figures as an active site residue, is positively charged below pH 6.5, whereas tyrosine and cysteine are negatively charged at pH > 9.5. Thus the charge state of 7 of the 20 naturally occurring amino acids is pH-dependent. In addition, both the N-terminal and C-terminal residues titrate. Individual residues display a titration behavior that is modulated by their local environment, e.g. the effective pKa of an aspartic acid will depend on its local electrostatic environment. The charge composition of the protein surface is therefore a complex function of pH, ionic strength and temperature. Although the term ‘crystal’ invokes the idea of a hard, dry object with well defined facets, only the last of these applies to protein crystals. These are often beautiful, but are quite soft and 30–80% of the crystal is water, most of which is liquid. A protein crystal is thus more a ‘3-dimensionally ordered solution’ than a solid. The intermolecular interactions that form the crystal lattice are weak and are frequently mediated via water molecules so that there is little direct protein-protein interaction.
It is not unusual to find several crystal forms growing in the same solution, implying that the forces controlling the formation of the crystal lattice are weak. Crystallisation usually occurs in a narrow pH range, which is not necessarily near pI or the optimum pH for catalysis. However, caution is needed when assuming that the internal pH in a protein crystal is the same as that in the
mother liquor; it is quite possible to change the pH in a mother liquor by many pH units without changing the diffraction quality of the crystals. X-ray structure determination shows that the surface of a crystalline protein is covered by at least one shell of tightly bound water molecules so that, apart from possible perturbations in intermolecular contact regions, the crystal structure provides biochemically relevant information about the surface of the molecule. The structures of native F. solani cutinase and 34 mutant forms of F. solani cutinase have been determined (Longhi et al., 1996). Almost all of these have been determined at resolutions better than 2 Å, so that the water structure is well defined. This has enabled a comprehensive study of the dynamics of the cutinase molecule, but no systematic analysis of the surface and its interaction with solvent has been carried out. Protein engineering of surface positioned residues will change the size of the residue in question and possibly the functionality of the residue. Whereas it was previously thought that few structural restrictions existed with respect to which types of substitutions could be made, the present paper presents evidence that significant restrictions may apply to such substitutions.
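The pH dependence of side-chain charge described in the Introduction can be illustrated with a minimal Henderson-Hasselbalch sketch. The pKa values below are approximate, textbook-style assumptions chosen for illustration (they are not taken from this paper), and the function names are hypothetical. The optional `pka` argument stands in for the shift in effective pKa caused by the local environment.

```python
# Approximate side-chain pKa values and the charge of the ionised form.
# These numbers are illustrative assumptions, not values from the paper.
MODEL_SIDE_CHAINS = {
    "D": (4.0, -1),   # aspartic acid
    "E": (4.4, -1),   # glutamic acid
    "H": (6.5, +1),   # histidine
    "C": (9.0, -1),   # cysteine
    "Y": (10.0, -1),  # tyrosine
    "K": (10.4, +1),  # lysine
    "R": (12.0, +1),  # arginine
}


def fractional_charge(residue, ph, pka=None):
    """Average side-chain charge of one residue at a given pH.

    Non-titratable residues return 0.0. Passing `pka` overrides the model
    value, mimicking an environment-shifted effective pKa.
    """
    if residue not in MODEL_SIDE_CHAINS:
        return 0.0
    model_pka, sign = MODEL_SIDE_CHAINS[residue]
    if pka is None:
        pka = model_pka
    if sign < 0:
        # Acidic group: neutral when protonated, -1 when deprotonated.
        return -1.0 / (1.0 + 10.0 ** (pka - ph))
    # Basic group: +1 when protonated, neutral when deprotonated.
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def side_chain_net_charge(sequence, ph):
    """Sum of side-chain charges; the termini are deliberately ignored."""
    return sum(fractional_charge(aa, ph) for aa in sequence)


if __name__ == "__main__":
    for ph in (4.0, 6.5, 8.5, 10.5):
        print(ph, round(side_chain_net_charge("DEHCYKR", ph), 2))
```

Treating each residue independently is the simplest possible choice; it ignores the coupling between titrating sites and the ionic strength and temperature effects mentioned above.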
2. Methods
The core sequence data used in the present work consist of 125 single chain protein sequences for which the protein 3D structures have been determined. No pair of protein sequences displays >25% identity. This data set is a subset of the 25% identity set of protein sequences made available by Hobohm and Sander (1994). For each of the 125 core sequences an HSSP file (Sander and Schneider, 1991) is available, which, among other data, lists sequences homologous to the sequence in question as well as the solvent accessibility of the residues in the core sequence. These sequences were included, and a total of 4038 sequences with 629472 residues thus formed the dataset analyzed in the present paper. Please note that although the core sequences share <25% identity, the sequences extracted from each
of the HSSP files may display a much higher identity. The average number of sequences per HSSP file was 32; the lowest number of HSSP homologous sequences was 0, and the highest 290. The ratio of actual to maximum solvent accessibility, racc, was assumed not to change for a given sequence position. We believe that this approximation is a reasonable one. The solvent accessibility given in the data files was calculated by a rolling-sphere method derived by Lee and Richards (1971) with slightly different values for the atomic radii (Kabsch and Sander, 1983). In the present context we have examined the composition of the protein surface at different levels of racc, considering only completely buried residues as interior and residues with more than 30% solvent accessibility as surface residues. The remaining residues are classified as intermediate. The amino acids are grouped into charged, polar and non-polar residues, with all amino acids containing sulphur, oxygen and/or nitrogen considered polar when not charged. This is similar to the classification used by Chothia (1976) and Miller et al. (1987). The classification of Go and Miyazawa (1980), based on polarity values estimated by Grantham (1974), has also been used.
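The layer assignment described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the toy input are hypothetical, racc is assumed to be given in percent, and the residue groups are the three-group split (non-polar, weakly polar, polar) listed for Fig. 2 in the Results.

```python
from collections import Counter

# Three-group split as listed for Fig. 2 of the paper.
RESIDUE_GROUPS = {
    "non-polar": set("VLIMFWYC"),
    "weakly polar": set("GAPST"),
    "polar": set("HRKQEND"),
}


def classify_layer(racc_percent):
    """Assign a residue to a layer from its relative accessibility (in %)."""
    if racc_percent <= 0.0:
        return "interior"      # completely buried
    if racc_percent > 30.0:
        return "surface"       # more than 30% accessible
    return "intermediate"      # everything in between


def residue_group(aa):
    """Return the polarity group of a one-letter amino acid code."""
    for name, members in RESIDUE_GROUPS.items():
        if aa in members:
            return name
    raise ValueError(f"unknown amino acid code: {aa!r}")


def layer_composition(residues):
    """Percentage of each polarity group within each layer.

    `residues` is an iterable of (one_letter_code, racc_percent) pairs.
    """
    counts = {}
    for aa, racc in residues:
        layer = classify_layer(racc)
        counts.setdefault(layer, Counter())[residue_group(aa)] += 1
    return {
        layer: {g: 100.0 * n / sum(c.values()) for g, n in c.items()}
        for layer, c in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical toy input, only to show the call pattern.
    toy = [("L", 0.0), ("V", 0.0), ("D", 0.0), ("S", 12.0),
           ("K", 45.0), ("E", 80.0), ("A", 31.0)]
    print(layer_composition(toy))
```

In the actual analysis the (residue, racc) pairs would come from the HSSP files, with the racc of each core-sequence position reused for the aligned homologous residues.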
2.1. DSC measurements
Differential scanning calorimetry (DSC) measurements were performed on a MicroCal™ VP-DSC (MicroCal, Northampton, MA) with cell volumes of 0.52 ml and interfaced to a personal computer (Plotnikov et al., 1997). The cutinase concentration was 0.7 mg ml⁻¹. The DSC runs were performed at a scan rate of 90°C h⁻¹ from pH 2.50 to 10.50. The runs were programmed identically, with a starting temperature of 10°C and a stop temperature of 90°C in each run. Buffers were selected according to their ionization enthalpies to minimize pH drift during heating. Phosphate buffer was used in the pH-range from 2 to 4, acetate buffer between 4 and 5, citric acid between 5.5 and 7.0 and TRIS in the pH-range between 8.0 and 10.5, all at 20 mM. Raw experimental data were processed with the Microsoft Windows™-based software Origin supplied by MicroCal™. Calorimetric investigations were performed under a constant pressure of 30 psi in order to avoid the formation of gas bubbles during the experiment. The instrument was interfaced to a PC equipped with a data translation board for instrument control and automatic data collection. Samples were degassed for at least 20 min prior to the experiments using a membrane vacuum pump. The excess heat capacity functions were obtained by baseline subtraction and concentration normalization. To provide maximum reproducibility of data when taking the thermal history of the instrument into account, scans were performed with the same initial and final temperatures. The data were corrected for the instrument response time (20 s).
2.2. Crystallization, data collection, structure determination and refinement
Three mutant cutinases (W69F, L81W and L81G/L182G) were crystallized using the vapour phase diffusion method. Data sets were collected at room temperature at the Swiss–Norwegian beamline (BM01), ESRF, Grenoble, using a MAR image plate detector. The data sets were processed using DENZO (Otwinowski, 1993), whereas the CCP4 program package (Collaborative Computational Project, Number 4, 1994) was used for scaling, merging, and molecular replacement calculations. For the double mutant, which crystallizes in space group P6, the phase problem was solved using the program AMoRe (Navaza, 1994) with the coordinates of the native enzyme (PDB code 1CUS, Martinez et al., 1992) as a search object. For both single mutants, the coordinates of the native enzyme were used directly as a starting model in the refinement (X-PLOR, Brünger, 1992). The program PEAKMAX in the CCP4 package was used for locating water molecules. Final refinement statistics are given in Table 2. Full details of the structures will be published elsewhere.
Table 2
Details of crystallization, data collection and refinement

                            W69F                                            L81W                                            L81G/L182G
Crystallization conditions  18–24% PEG6000, Hepes buffer pH 7.5             10–15% PEG4000, NaAc buffer pH 4.8              15–18% PEG6000, Hepes buffer pH 7.5
Space group                 P21                                             P21                                             P6
Cell dimensions (Å)         a = 35.06, b = 67.08, c = 36.85, beta = 94.8    a = 35.14, b = 67.50, c = 36.99, beta = 94.4    a = b = 131.40, c = 37.02
Resolution (Å)              10–1.65                                         10–1.55                                         10–1.65
Completeness (%)            98.6                                            96.4                                            98.9
Rmerge                      7.4                                             5.8                                             4.6
Rfactor, Rfree (%)          16.8, 20.4                                      17.7, 21.0                                      18.7, 22.3

2.3. Construction of an E. coli expression system
Plasmid pFCEX1, a T7-promoter based E. coli expression system for the cutinase gene (cut) of Fusarium solani, was constructed by replacing the NdeI/BamHI fragment of pET11a (Studier et al., 1990) by the phoA-signal/cutinase (phoA-cut) fusion gene of pMa5-L (Lauwereys et al., 1990). In order to perform this cloning, it was necessary to introduce an NdeI site at the translational start ATG of the phoA-cut gene and a BamHI site at the polylinker downstream of the cutinase gene. Plasmid pMa5-L, which carries a fusion between the signal sequence for the alkaline phosphatase (phoA) and the cutinase gene corresponding to the mature cutinase, was a gift from M. Lauwereys.

2.4. Expression of the cutinase gene in E. coli and purification of the protein
Expression of the F. solani cutinase gene in E. coli BL21(DE3), using pFCEX1, was performed in 1 l of Luria Broth (LB) containing ampicillin (100 mg ml⁻¹). Cells were grown at 25°C. Induction of the cultures was carried out at OD600 1.5 for about 6 h with 0.1 mM isopropyl-β-D-thiogalactopyranoside (IPTG). Since the cutinase gene (cut) is cloned behind the signal peptide for the alkaline phosphatase (phoA), the gene product is directed to the periplasm of E. coli. The periplasmic fraction was prepared by osmotic shock using a TES buffer containing 20% sucrose. The cells were then washed with pure water. The supernatants of the TES and water fractions contain 80–90% pure cutinase. For further purification a high-resolution strong cation-exchange column (SP Sepharose Fast Flow, Pharmacia LKB) was used.

All investigations of the mutants W69F, L81G and L81G/L182G have been performed with purified protein provided by Professor Joaquim Cabral, IST, Technical University of Lisbon.
3. Results
The protein dataset used in the present study contained a total of 629472 residues. Only 16.8% of the residues had zero surface contact, whereas 40.7% had >30% of their side chain exposed to solvent contact. Even though the interior of proteins mostly contains hydrophobic residues, we still find 5% charged residues in the protein interior. The distribution of amino acids as a function of their surface exposure can be seen in Fig. 1. As is seen from Fig. 1a, a clear trend is visible in the data. In the protein interior (0% accessibility), more than half of the residues are non-polar; as we move up towards full surface exposure this number drops to about 10%. Concurrently the weakly polar residues become more and more abundant, constituting almost half of the residues at full solvent exposure. Surprisingly, the fraction of polar residues displays its maximum abundance not at full exposure, but at 50–80% exposure. After grouping the amino acids into interior, surface and intermediate residues, the trend in amino acid distribution becomes more evident. The number of non-polar residues decreases by 50% when we move from the interior to the surface, but even at full solvent exposure at the protein surface, one third of the residues are non-polar
(Table 3). Note that all 20 amino acids are present in each layer. The fraction of residues with solvent exposure less than a certain cut-off value is plotted in Fig. 2. This figure shows that the variation with the cut-off is very pronounced, and clearly related to the type of residue in question. The non-polar residues (V, L, I, M, F, W, Y, and C) predominantly populate the deeper regions of the protein surface, displaying a 50% population at <13% solvent exposure (Fig. 2a). Tyrosine and methionine behave differently from the rest of the non-polar amino acid subset, and cysteine is the amino acid with the lowest fraction in contact with solvent. In Fig. 2b, the weakly polar residues (G, A, P, S, and T) are given, displaying 50% population levels between 13 and 32% solvent exposure, which correlates well with the physical chemical nature of these residues. Alanine clearly displays the least polar nature of this subset of the amino acids. Finally, in Fig. 2c, the polar and charged residues (H, R, K, Q, E, N, and D) display 50% population levels in the range between 17 and 43% solvent exposure. Since this amino acid subset includes the most polar residues, the high solvent exposure revealed by our data is to be expected. Surprisingly, in this context histidine appears as the most hydrophobic residue, despite the fact that histidine is titratable. In Fig. 3 the local molecular environment around the active site of the Fusarium solani pisi cutinase is displayed. The active site of cutinase is positioned close to the molecular surface, and it is believed that the reported specificity towards short chain triacylglycerides is linked with this fact. We have studied the native form as well as three mutants of cutinase. The pH dependence of the thermal stability of the native form of cutinase was investigated using DSC (Fig. 4). This figure also shows the pH dependence of cutinase activity towards tributyrin (data from Lauwereys et al., 1990).
The pH-range where cutinase displays high activity against tributyrin (pH 8–9) coincides with maximum thermal stability. If the kinetics of unfolding is slow, then the unfolding kinetics can be probed by altering the scan rate, provided that the folding/unfolding process can be analyzed using equilibrium thermodynamics. We investigated the scan rate dependence of Tm at pH 8.5 (Fig. 5A, coinciding with the activity maximum towards tributyrin). In accordance with the investigations of Sanchez-Ruiz et al. (1988) and assuming a two state kinetic model (Eq. (1)) for the unfolding, the effect of the scan rate, v, on Tm is given by Eq. (2):

$$N \underset{k_2}{\overset{k_1}{\rightleftharpoons}} U \xrightarrow{k_3} D \qquad (1)$$

$$\frac{v}{T_m^2} = \frac{A_0 R}{E_A} \exp\left(-\frac{E_A}{R T_m}\right) \qquad (2)$$

where A0 is the Arrhenius frequency factor. The activation enthalpy, EA, can be calculated from the slope of a linear fit of ln(v/Tm²) versus 1/Tm (Fig. 5B). The correlation coefficient for the linear fit was 0.99. The EA (activation enthalpy of unfolding) was 146 ± 11 kcal mol⁻¹. Similarly, the activation enthalpy of unfolding for the double mutant was calculated to be 162 ± 10 kcal mol⁻¹. The theoretical pK values of the titratable residues of native cutinase were computed using TITRA (Anthonsen et al., 1994; Martel et al., 1996). The electrostatic map at pH 8.5 was displayed on the molecular surface of cutinase as shown in Fig. 6A.
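The linear fit behind Eq. (2) can be sketched as follows. Taking logarithms gives ln(v/Tm²) = ln(A0·R/EA) − EA/(R·Tm), so the slope of ln(v/Tm²) against 1/Tm equals −EA/R. The code is a minimal illustration with hypothetical function names; the scan-rate data it is exercised on are synthetic, generated from Eq. (2) itself with an assumed EA, and are not the measurements of Fig. 5.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def activation_energy(scan_rates, melting_temps_k):
    """Estimate E_A (kcal/mol) from scan rates and Tm values (in kelvin).

    Fits ln(v / Tm^2) against 1 / Tm; the slope is -E_A / R.
    """
    xs = [1.0 / tm for tm in melting_temps_k]
    ys = [math.log(v / (tm * tm)) for v, tm in zip(scan_rates, melting_temps_k)]
    slope, _intercept = fit_line(xs, ys)
    return -slope * R_KCAL


def synthetic_scan_rate(tm_k, e_a, ln_prefactor):
    """Scan rate implied by Eq. (2) for a given Tm (used only for test data)."""
    return tm_k * tm_k * math.exp(ln_prefactor - e_a / (R_KCAL * tm_k))


if __name__ == "__main__":
    assumed_ea = 146.0  # kcal/mol, used only to generate synthetic data
    # Choose ln(A0 * R / E_A) so that v = 1.5 K/min at Tm = 326 K (arbitrary).
    ln_pref = math.log(1.5 / 326.0 ** 2) + assumed_ea / (R_KCAL * 326.0)
    tms = [324.0, 325.0, 326.0, 327.0]
    rates = [synthetic_scan_rate(tm, assumed_ea, ln_pref) for tm in tms]
    print(round(activation_energy(rates, tms), 3))
```

The units of the scan rate only shift the intercept, so the estimated EA is the same whether v is expressed per minute or per hour. An uncertainty estimate such as the ± values quoted above would require the residuals of the fit to real data.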
4. Discussion
The analysis of surface residues has led to several interesting observations. The most important one is that the local environment around a given surface positioned residue appears to be very important. This is readily apparent from the clear environmental preferences that emerge from our data analysis of protein surfaces. Hydrophobic residues on the surface are only marginally exposed to solvent, typically displaying a 50% population at <10% surface exposure. This observation indicates that hydrophobic residues, if present at the surface, are preferentially found in tightly packed hydrophobic environments. The composition of these environments has not been analyzed in the present context, but we propose that other non-polar residues, or non-polar chain fragments from polar residues, such as the CH2-groups of lysine, glutamic acid and glutamine, may provide the necessary environment.

Fig. 1. The distribution of amino acids in different layers of the protein. (a) The values for the different groups of residues. (b) The values for each of the 20 different amino acids found in proteins. The trend toward a more hydrophilic composition can easily be seen as we move out from the interior toward the surface. The number of residues in each column is: 0%, 105615; 0–10%, 132217; 10–20%, 69647; 20–30%, 65878; 30–40%, 59624; 40–50%, 55264; 50–60%, 51539; 60–70%, 38372; 70–80%, 25658; 80–90%, 14175; 90–100%, 6686; >100%, 4797. The total number of residues is 629472.

Fig. 2. The fraction of interior residues for the amino acids as a function of surface accessibility. A cutoff of e.g. 20% means that all residues with less than or equal to 20% solvent accessibility are considered interior residues. Also given in the figure is the total fraction of interior residues. (a) The non-polar residues (V, L, I, M, F, W, Y and C). (b) Weakly polar residues (G, A, P, S and T) and (c) Polar residues (H, R, K, Q, E, N and D). The curve without markings is the average curve.

Table 3
Distribution of amino acid types in the three different areas of proteins

              Chothia (1976) (a)                         Go and Miyazawa (1980) (a)
              Charged (%)  Polar (%)  Non-polar (%)      Polar (%)  Weak (%)  Non-polar (%)      Fraction (%)
Interior      5            28         67                 9          34        57                 16.8
Intermediate  18           34         48                 28         28        44                 42.5
Surface       36           33         31                 50         35        15                 40.7

(a) The classification of residues is given by Chothia (1976) (charged, polar and non-polar) and Go and Miyazawa (1980) (polar, weak and non-polar). At the far right the fraction of amino acids in the given layer is given.
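As a consistency check, the layer fractions quoted in the text and in Table 3 (16.8%, 42.5% and 40.7%) can be recomputed from the per-bin residue counts listed in the caption of Fig. 1. The sketch below uses only those published counts; the variable and function names are hypothetical.

```python
# Residue counts per accessibility bin, as listed in the caption of Fig. 1.
FIG1_BIN_COUNTS = [
    ("0%", 105615),
    ("0-10%", 132217),
    ("10-20%", 69647),
    ("20-30%", 65878),
    ("30-40%", 59624),
    ("40-50%", 55264),
    ("50-60%", 51539),
    ("60-70%", 38372),
    ("70-80%", 25658),
    ("80-90%", 14175),
    ("90-100%", 6686),
    (">100%", 4797),
]


def layer_fractions(bins):
    """Percentage of residues in the interior, intermediate and surface layers.

    Interior is the 0% bin, intermediate covers the bins up to 30%
    accessibility, and surface covers everything above 30%.
    """
    total = sum(count for _, count in bins)
    interior = bins[0][1]
    intermediate = sum(count for _, count in bins[1:4])
    surface = sum(count for _, count in bins[4:])
    return {
        "interior": 100.0 * interior / total,
        "intermediate": 100.0 * intermediate / total,
        "surface": 100.0 * surface / total,
    }


if __name__ == "__main__":
    for layer, percent in layer_fractions(FIG1_BIN_COUNTS).items():
        print(f"{layer}: {percent:.1f}%")
```

The bin counts sum to the stated total of 629472 residues, and the three rounded percentages agree with the right-hand column of Table 3.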
Conversely, the observation that the highly polar residues display 50% population levels at around 40% surface exposure indicates that these residues typically expose about half of their side chain. Therefore, we can anticipate a significant difference in side chain mobility at protein surfaces: on average the hydrophobic side chains will be more sterically hindered than the long chain polar residues. Cysteines belong to the non-polar group of residues. It is very likely that the majority of the
cysteines in the data set are involved in disulphide bridge formation. Such bonds will presumably form between a buried cysteine and a cysteine in the surface layer. The formation of a disulphide bond has to involve a spatial co-location of the two side chains, which forces the surface layer cysteine to point its side chain towards the protein interior, and the number of highly exposed cysteines will, as a consequence, be low. This is further supported by the fact that the percentage of cysteines with 100% solvent exposure is virtually nil (Fig. 1b). The tyrosines contain a hydroxyl group, and we propose that the polar, as well as titratable, nature of the hydroxyl group is causing the behavior seen in Fig. 2a. The polar properties of the hydroxyl group are expected to give a low number of totally buried residues (racc = 0%). The relatively high solvent exposure of the longer hydrophilic side chains may imply that these side chains reside in more flexible parts of the protein. This group of residues also includes the most conspicuous titratable residues: lysine, arginine, aspartic acid and glutamic acid. It is conceivable that an alteration in the titration state of a residue caused by molecular interactions, pH or solvent changes will induce attraction or repulsion between residues that can be compensated for by reorientation of the side chain. Molecular biology owes much of its current understanding of the underlying molecular mechanisms to the steadily growing number of high resolution structures of biological macromolecules. Although high resolution nuclear magnetic resonance is becoming more and more popular, it is still fair to state that the majority of protein structures are solved by X-ray diffraction analysis. It is therefore appropriate to discuss the limitations that the crystal phase imposes on our interpretation of protein structure. The crystal phase always contains some solvent, usually water.
The water content of a protein crystal may vary dramatically, ranging from a few percent to >80% water by weight. In a low water content protein crystal, the nearest neighboring molecular entity for a surface residue is likely to be a residue on another protein molecule; conversely, in a high water content protein crystal, the nearest
neighbor will most likely be a water molecule. The most relevant situation is one where the water content is high, such that the local environment at the protein surface resembles the biological situation. Using the classification scheme developed as part of the present paper, the surface of the native Fusarium solani pisi cutinase shows very clearly that the active site is in a hydrophobic environment. Only in the immediate vicinity of the active site residues (S120, D175 and H188) is a hydrophilic region apparent. This is seen in Fig. 6B as a white (SER) and green (ASP and HIS) patch in the bottom of the active cleft. The affinity of a hydrophobic entity towards this part of the surface is thus accounted for. Four residues are flanking the lower part of the active site: L81; L182; N84; and V184. L81 and L182 are located on the left rim of the active site, presumably providing hydrophobic surfaces that stabilize the substrate when bound to the active site. L182 and V184 form a wall, which terminates the lower part of the active site. In the case of the native Fusarium solani pisi cutinase, we observe an almost classic pH-stability profile, where maximum thermal stability is achieved in the pH-range 5–9. On both sides of this range the stability, as measured by the calorimetric melting temperature, drops sharply. The existence of an intact 3D structure is a prerequisite for catalytic activity. The concurrent loss of activity and drop in thermal stability above pH 9 is therefore fully accounted for. However, the onset of catalytic activity appears to be uncoupled from the thermal stability. Whereas the 3D structure at pH 5 is as stable as it is at the pH where maximum activity is reported, no detectable activity is reported at this pH. 
Obviously, catalytic activity is dependent on more parameters than structural stability alone, and the most likely explanation for the delayed onset of catalytic activity is that the HIS 188 in the active site of cutinase has to be deprotonated before the trypsin-like catalytic triad can become functional. The map of electrostatic potentials at pH 8.5 (Fig. 6A) reveals another interesting aspect of the active site physical chemistry. Several tyrosines are located in the active site, and although they possess clear hydrophobic characteristics, their hydroxyl groups are titratable, with typical pK values around 8–10. At pH 8.5 we start to observe a progressive titration of these residues, leading to a pronounced negative electrostatic potential (red) in the bottom of the active cleft. We propose that this negative potential in the active site provides an efficient ejection mechanism for the free fatty acid, which is also negatively charged at this pH. We further propose that the maximum of the pH-activity profile towards triolein at pH ≈ 8.5 is a direct consequence of deprotonation of tyrosine residues in the active site. The binding of a substrate to the active site is associated with a change in dielectric constant, since the water molecules (dielectric constant, D = 80) will be forced out of the active site as a consequence of the binding of the triacylglyceride (D = 5–10). Theoretical electrostatic calculations using TITRA (Anthonsen et al., 1994; Martel et al., 1996) predict that the pK values for histidine and tyrosines will decrease as a consequence of the drop in dielectric constant. Apart from alternative side chain conformations for a number of polar surface residues, the structures of the three mutant forms, which we have determined so far, are very similar to native cutinase. Cα rms deviations from the native enzyme are 0.17 and 0.17 Å for W69F and L81W, respectively, and 0.37 and 0.28 Å for the two independent molecules in the L81G/L182G double mutant. The rms difference between the two independent molecules in the asymmetric unit of the L81G/L182G double mutant is 0.38 Å.

Fig. 3. The active site of Fusarium solani pisi cutinase. Active site residues are outlined in ball and stick models. The surface residues L81 and L182 are highlighted as well.
The overall impression is that backbone differences are restricted to the flexible regions that were observed by Longhi et al. (1996). The same general picture applies to the distribution of temperature factors, with the highest values for polar side chains that protrude into the solvent and for the N- and C-termini. This observation correlates very well with our analysis of surface positioned residues and our conclusion that the long chained polar residues can be expected to display high mobility (vide supra). Temperature factors in the core of the molecule are low but, again, are higher in the region of ALA62 and the 24–31 loop and somewhat higher in the 80–87 and 180–188 loops, which form the outer edge of the active site. It is interesting to note that in our case L81G/L182G has crystallized in the hexagonal space group P6 with two independent molecules in the asymmetric unit rather than in the native P2₁ reported in the Longhi et al. determination of this mutant. A surprising observation for the double mutant is that temperature factors in the 180–188 loop are about 10 Å² lower than in the native enzyme. This also applies to the 80–87 loop but the decrease here is much smaller. In neither case can the reduction in temperature factors be attributed to changes in the crystal packing since there are no short intermolecular contacts in this region. The observation is directly contrary to the normal experience that mutation to glycine usually yields a more flexible structure. It is also surprising that the double mutant, where the entrance to the
Fig. 4. Cutinase thermal denaturation curve (triangles) and activity towards tributyrin [squares, activity data adapted from Lauwereys et al. (1990)].
active site is more open, only displays ca. 1% of the activity towards triolein shown by native cutinase. This may be due to the absence of side chains, which provide a hydrophobic environment (Longhi et al., 1997) for the aliphatic side chains in the enzyme-substrate complex. In the case of the W69F mutant the main chain lies buried below the surface of the protein with the side chain pointing toward the surface. In this case the mutation has resulted in simple replacement of the indole ring by a phenyl ring. With regard to water structure the overall impression is that this is largely conserved for all three mutants. It is interesting to note that the volume created by deletion of the leucine side chains in the double mutant remains completely empty — no new water molecules are observed. On the other hand, replacement of LEU81 by TRP has caused the displacement of three water molecules from the native structure. Interestingly, for the mutants L81W and
L81G/L182G the enzymatic activity is almost abolished, although for quite different reasons. In the case of L81W, the introduction of the large tryptophan residue undoubtedly hinders the entrance of the substrate into the active site (Fig. 6C). In the case of the double mutant (L81G/L182G), the reduction in enzymatic activity must be caused by structural or dynamic changes (Fig. 6D). The X-ray structure of the double mutant shows no significant changes from the native enzyme apart from the mutated residues themselves. Thus we must conclude that the reduction in catalytic activity is probably due to changes in the dynamic properties of this mutant.

Fig. 5. (A) Dependence of the denaturation temperature of cutinase on scan rate, v, at pH 8.50. (B) Plot of ln(v/Tm²) versus 1/Tm.

Fig. 6. The surface of cutinase. (A) The electrostatic potential distribution at pH 8.5 on the molecular surface of native cutinase. The pK values were calculated using TITRA (Anthonsen et al., 1994; Martel et al., 1996), and the potential map was visualized using GRASP (Nicholls et al., 1991). (B) Colored according to residue property. The polar residues are green, the weakly polar are white and the non-polar cyan. (C) The surface of the L81W mutant and (D) the L81G/L182G double mutant using the same color-coding as in (B).

The activation energy for global unfolding was measured for both the native enzyme and the double mutant at pH 8.5, i.e. at the pH of maximum activity. The activation energy of 142 kcal mol⁻¹ obtained for native cutinase represents a considerable barrier to unfolding. The thermal energy becomes sufficient for this transition to take place at 54 ± 0.05°C. For the double mutant the barrier to unfolding was 162 kcal mol⁻¹, and the transition occurred at 51 ± 0.048°C. The double mutant represents an entropic stabilization of the unfolded form, since the leucine to glycine mutations provide more motional freedom to the unfolded
state than do the leucines. A small decrease in Tm is therefore to be expected. The increase in activation energy is somewhat surprising, and may point towards differences in protein hydration involving the mutated sites. It is likely that the peptide NH and CO of the glycines are more involved in hydrogen bond formation
with the solvent water than are those of the leucines. This can account for a less hydrophobic environment in the binding cleft with a concomitant reduction in activity towards triolein as well as an increase in the barrier for unfolding, due to increased interaction with water. The decrease in hydrophobicity may in turn explain the drop in enzyme activity through less efficient substrate binding.
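The activation energies discussed above relate to the scan-rate dependence of Tm plotted in Fig. 5B. Assuming the two-state irreversible denaturation model of Sanchez-Ruiz et al. (1988), ln(v/Tm²) is linear in 1/Tm with slope −Ea/R, so that Ea follows from a straight-line fit. The minimal sketch below illustrates such a fit. It is an assumption that the reported values were obtained in exactly this way, the function names are hypothetical, and the data are synthetic, generated from an assumed Ea rather than taken from the measurements.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def activation_energy(scan_rates, tm_kelvin):
    """Return Ea (kcal/mol) from the least-squares slope of ln(v/Tm^2) vs 1/Tm."""
    x = [1.0 / t for t in tm_kelvin]
    y = [math.log(v / t ** 2) for v, t in zip(scan_rates, tm_kelvin)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx  # equals -Ea/R under the assumed model
    return -slope * R_KCAL


def synthetic_scan_rates(ea, intercept, tm_kelvin):
    """Hypothetical helper: scan rates that satisfy the assumed model exactly."""
    return [t ** 2 * math.exp(intercept - ea / (R_KCAL * t)) for t in tm_kelvin]


# Synthetic example only: Tm values in kelvin and an assumed Ea of 142 kcal/mol.
tms = [326.0, 327.0, 328.0, 329.0]
rates = synthetic_scan_rates(142.0, 207.0, tms)
print(f"Recovered Ea = {activation_energy(rates, tms):.1f} kcal/mol")
```

Note that Tm must be supplied in kelvin, and that the scan rate units only affect the intercept of the fit, not the slope from which Ea is derived.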
5. Conclusion

The residue composition of protein surfaces has been analyzed with respect to type and prevalence as a function of water accessibility. When the residues are grouped into three categories (polar, weakly polar and non-polar), distinct differences in their occurrences are documented. Whereas the non-polar residues reach 50% population already at 7% accessibility, the polar residues first reach the 50% population level above 40% accessibility. In order to investigate a model enzyme, we have studied the cutinase from Fusarium solani pisi, using both experimental and modelling techniques. The thermal stability was studied as a function of pH, and a bell shaped stability curve was obtained, with maximum thermal stability in the pH range 4–9. Whereas the loss of activity is coupled to loss of thermal stability, the onset is coupled with deprotonation of the active site histidine-188. We propose that a negative electrostatic potential stemming from deprotonated, negatively charged tyrosines in the binding site cleft is necessary for maximum activity. When the Connolly solvent accessible surface is colored by residue property, the hydrophobic nature of the cutinase active site is evident. The structures of three mutants of cutinase have been solved to atomic resolution using X-ray diffraction analysis. Only minor structural changes are observed when comparing with the native cutinase structure. Two of the mutants are surface mutations. In the case of L81W, loss of enzymatic activity is caused by simple blocking of the entrance to the active site. The enhanced thermodynamic stability of the double mutant may be due to entropic stabilisation of the unfolded state. The loss of activity in the mutant, we propose, is a consequence of a less hydrophobic binding
cleft, combined with increased binding of solvent water to the protein backbone. The activation energy for unfolding increases for this mutant.
Acknowledgements

We are indebted to Dr M. Lauwereys and Professor Joaquim Cabral for providing us with the cutinase mutants W69F, L81W and L81G/L182G. The Obels family fund and the Energy Foundation of Northern Jutland are gratefully acknowledged for generous financial support. PHJ thanks the Research Council of Norway for financial support (NFR-116316/410). EIP gratefully acknowledges a Schrödinger grant from the Austrian Research Foundation. MTNP thanks the EU for a TMR grant under the Marie Curie program (ERBFMBICT972574).
References

Anthonsen, H.W., Baptista, A., Drabløs, F., Martel, P., Petersen, S.B., 1994. The blind watchmaker and rational protein engineering. J. Biotechnol. 36, 185–220.
Brünger, A.T., 1992. X-PLOR, version 3.1: a system for X-ray crystallography and NMR. Yale University Press, New Haven, CT.
Chothia, C., 1975. Structural invariants in protein folding. Nature 254, 304–308.
Chothia, C., 1976. The nature of the accessible and buried surfaces in proteins. J. Mol. Biol. 105, 1–14.
Collaborative Computational Project, Number 4, 1994. The CCP4 suite: programs for protein crystallography. Acta Cryst. D50, 760–763.
Go, M., Miyazawa, S., 1980. Relationship between mutability, polarity and exteriority of amino acid residues in protein evolution. Int. J. Pept. Protein Res. 15, 211–224.
Grantham, R., 1974. Amino acid difference formula to help explain protein evolution. Science 185, 862–864.
Hobohm, U., Sander, C., 1994. Enlarged representative set of protein structures. Protein Sci. 3, 522–524.
Janin, J., 1979. Surface and inside volumes in globular proteins. Nature 277, 491–492.
Kabsch, W., Sander, C., 1983. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22, 2577–2637.
Koshi, J.M., Goldstein, R.A., 1995. Context-dependent optimal substitution matrices. Protein Eng. 8, 641–645.
Lauwereys, M., de Geus, P., de Meutter, J., Stanssens, P., Matthyssens, G., 1990. Cloning, expression and characterization of cutinase, a fungal lipolytic enzyme. In: Alberghina, L., Schmid, R.D., Verger, R. (Eds.), Lipases: Structure, Mechanism and Genetic Engineering. Verlag Chemie, Weinheim.
Lee, B., Richards, F.M., 1971. The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol. 55, 379–400.
Longhi, S., Nicolas, A., Creveld, L., Egmond, M., Verrips, C.T., de Vlieg, J., Martinez, C., Cambillau, C., 1996. Dynamics of Fusarium solani cutinase investigated through structural comparison among different crystal forms of its variants. Proteins 26, 442–458.
Longhi, S., Mannesse, M., Verheij, H.M., et al., 1997. Crystal structure of cutinase covalently inhibited by a triglyceride analogue. Protein Sci. 6, 275–286.
Martel, P.J., Baptista, A., Petersen, S.B., 1996. Protein electrostatics. Biotechnol. Ann. Rev. 2, 315–372.
Martinez, C., De Geus, P., Lauwereys, M., Matthyssens, G., Cambillau, C., 1992. Fusarium solani cutinase is a lipolytic enzyme with a catalytic serine accessible to solvent. Nature 356, 615–618.
Miller, S., Janin, J., Lesk, A.M., Chothia, C., 1987. Interior and surface of monomeric proteins. J. Mol. Biol. 196, 641–656.
Navaza, J., 1994. AMoRe: an automated package for molecular replacement. Acta Cryst. A50, 157–163.
Nicholls, A., Sharp, K., Honig, B., 1991. Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins 11, 281–296.
Otwinowski, Z., 1993. Oscillation data reduction program. In: Sawyer, L., Isaacs, N., Bailey, S. (Eds.), Data Collection and Processing. SERC Daresbury Laboratory, Warrington, UK, pp. 56–62.
Plotnikov, V.V., Brandts, J.M., Lin, L.N., Brandts, J.F., 1997. A new ultrasensitive scanning calorimeter. Anal. Biochem. 250, 237–244.
Sander, C., Schneider, R., 1991. Database of homology-derived protein structures and the structural meaning of sequence alignment. Proteins 9, 56–68.
Sanchez-Ruiz, J.M., Lopez-Lacomba, J.L., Cortijo, M., Mateo, P.L., 1988. Differential scanning calorimetry of the irreversible thermal denaturation of thermolysin. Biochemistry 27, 1648–1652.
Studier, F.W., Rosenberg, A.H., Dunn, J.J., Dubendorff, J.W., 1990. Use of T7 RNA polymerase to direct expression of cloned genes. Methods Enzymol. 185, 60–89.