Chemical Physics Letters 432 (2006) 275–280 www.elsevier.com/locate/cplett
On the energetics of protein folding in aqueous solution
Yuichi Harano a, Roland Roth b,c, Masahiro Kinoshita a,*
a International Innovation Center, Kyoto University, Uji, Kyoto 611-0011, Japan
b Max-Planck-Institut für Metallforschung, Heisenbergstr. 3, D-70569 Stuttgart, Germany
c ITAP, Universität Stuttgart, Pfaffenwaldring 57, D-70569 Stuttgart, Germany
Received 25 August 2006; in final form 4 October 2006 Available online 17 October 2006
Abstract
We argue that upon the structural change of a protein, the gain or loss of the intramolecular energy is largely compensated by the loss or gain of the hydration energy, when the folding is considered under the isochoric condition. We introduce the sum of the intramolecular energy and the hydration free energy as the key function. The change in this function is governed by the change in the hydration entropy. A protein is designed to fold into the structure that maximizes the entropy of water under the requirement that sufficiently many intramolecular hydrogen bonds be formed to compensate the dehydration penalty. © 2006 Elsevier B.V. All rights reserved.
* Corresponding author. Fax: +81 774 38 3508. E-mail address: [email protected] (M. Kinoshita).
0009-2614/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.cplett.2006.10.038
1. Introduction
Protein folding is the most fundamental and universal example of biological self-assembly [1]. Uncovering the mechanism through which folding occurs is central to understanding life at the molecular level. Despite the enormous effort devoted to it, the folding mechanism is not yet understood. One of the reasons is that many groups attempt to solve the problems related to protein folding by a brute-force approach in which all atomistic details are included in the model. As a consequence, the results obtained are often extremely hard to interpret. In this Letter, we wish to reveal the essential physics of protein folding using a completely different approach based on a unique concept.
A protein in aqueous solution under physiological conditions folds into a unique native structure. Other polymers, and even polypeptides with arbitrary amino-acid sequences, do not share this feature. In the native structure of the protein the side chains and backbone are tightly packed with little space in the interior [2–4]. The protein is characterized by side chains with a variety of sizes and geometries, and it seems that the tight packing leads to the uniqueness of the structure finally stabilized. In our opinion, the physics of protein folding appears to be much simpler than that of the conformational propensity of heteropolymers other than proteins.
We have recently shown that a major driving force in protein folding is the gain in the translational entropy of water, which arises mainly from a reduction of the excluded volume [3–5]. This in turn leads to an increase in the total volume available to the translational motion of water molecules (i.e., the volume of the configurational phase space which can be explored by water molecules) that are present in the system. (The idea that a reduction in the excluded volume leads to an entropy gain breaks down at high pressures, which gives rise to the phenomenon called ‘pressure denaturation’ of a protein [6,7].)
In our earlier studies [3,4], we analyzed the structural stability of a protein in terms of only its structure-dependent hydration entropy. In this Letter, we clarify the physical significance of such an analysis by further investigating the role of the water entropy. It is argued that the change in the key function, the sum of the intramolecular energy and the hydration free energy, is governed by the change in the hydration entropy. We theoretically calculate the entropy of water for the native structure and 600 structures of protein G which correspond to local-minimum states of the protein immersed in water. It is found that the entropy is
maximized when the protein takes the native structure. The attractive interaction between water molecules is shown to be an important factor to be taken into account in evaluating the entropy.
2. Sum of intramolecular energy and hydration free energy
In a theoretical analysis of the structural stability of a protein, the quantities of hydration thermodynamics are usually calculated for a fixed structure [11–13]. The hydration free energy $\mu$ (i.e., the excess chemical potential), which is the most important quantity, is the same under isobaric and isochoric conditions [14]. Therefore, in the following we consider structural changes under the isochoric condition. Unrealistic structures with overlaps of protein atoms and/or high values of the torsion energy are excluded from our discussion.
Let $E_I$ be the intramolecular energy of a protein. When a protein changes its structure to a more compact one, there is a gain in $E_I$, i.e., $E_I$ is lowered (e.g., through hydrogen bonds and van der Waals attractive interactions). This effect is strong enough to drive the protein to take a highly compact structure in vacuum. In aqueous solution, however, such a structural change is accompanied by a serious dehydration penalty comprising the loss of hydrogen bonds and van der Waals attractive interactions with water molecules. Baldwin [8] argued that the energy change in the formation of an intramolecular hydrogen bond between ‘O’ and ‘N’ in –CONH– groups in water, CO···W + NH···W → CO···HN + W···W (W denotes a water molecule), is 0 ± 1 kcal/mol. A similar conclusion was drawn by another group [9] who performed a computer simulation for hydrogen-bond formation between two formamide molecules in water. On the other hand, the gain in the intramolecular van der Waals energy upon protein folding was emphasized by Pace [2]. However, the loss of the van der Waals energy with water molecules possessing ‘O’ should be considerable, because the Lennard–Jones (LJ) potential depth is larger for ‘O’ than for ‘H’, ‘C’, and ‘N’.
The anticorrelation between $E_I$ and $\mu$ has already been pointed out in the literature [10,11], but in our view the anticorrelation stems from the feature that changes in the intramolecular energy and the hydration energy are compensating, which is emphasized in this Letter. The key function determining the structural stability of a protein in aqueous solution, which is denoted by $Y$, is the sum of $E_I$ and $\mu$ [11–13]:
$$Y = E_I + \mu. \tag{1}$$
Unlike in vacuum, there is a strong effect that water tries to force the protein to form the structure minimizing $\mu$. $Y$ competes with the conformational entropy of the polypeptide. $\mu$ comprises the hydration energy $E_H$ and the hydration entropy $S$:
$$\mu = E_H - TS, \tag{2}$$
where $T$ is the absolute temperature. $E_H$, which takes a large, negative value, consists of two components. The first component is the energy gain arising from the compression of bulk water due to the protein insertion under the isochoric condition [14]. The second one is the energy gain originating from the appearance of protein–water correlations and the perturbation of water–water correlations in the vicinity of the protein. These gains become larger as the excluded volume $V$ generated by the protein and the water-accessible surface area $A$ increase, respectively. By contrast, the intramolecular-energy gain becomes smaller with the increase in $V$ and/or $A$. The anticorrelation between $E_H$ and $E_I$ thus occurs.
Here, we denote the change in the thermodynamic quantity $X$ by $\Delta X$. $\Delta Y$ is given by
$$\Delta Y = \Delta E_I + \Delta\mu = \Delta E_I + \Delta E_H - T\Delta S. \tag{3}$$
Upon the structural change, the gain or loss of $E_I$ is largely compensated by the loss or gain of $E_H$. The absolute values of $\Delta E_I$ and $\Delta E_H$ are large but comparable in magnitude, and their signs are opposite. For this reason, a significant cancellation occurs when $\Delta E_I$ is added to $\Delta E_H$, with the result that $\Delta Y$ is governed mainly by $-T\Delta S$. Thus, in aqueous solution, the water entropy is likely to be the only quantity which can surpass the conformational entropy and allow a protein to reach its native structure. The tight packing in the protein interior is necessary to ensure that the native structure is stable and that partially unfolded, inactive structures have negligible probability at ambient temperatures [4].
We note that the concept described so far is most valid in the special case where the amino-acid sequences selected by nature are immersed in aqueous solution under physiological conditions. In other cases, the water-entropy effect is not always sufficiently strong in comparison with the other effects, and the polypeptide chain can take a variety of structures including rather extended ones.
3. Theoretical method of calculating hydration entropy
We employ our morphometric method of calculating the solvation free energy, which was first developed for solutes with simple geometries [15] and later extended to complex molecules like proteins [16]. (When the solvent is water, the solvation free energy is referred to as the hydration free energy.) It is capable of predicting results that are almost indistinguishable from those calculated using the three-dimensional integral equation theory [3,4,17–19] in a computation time which is about four orders of magnitude shorter. The idea of the method is to predict the solvation free energy $\mu$ using only four geometrical measures of the protein and the corresponding thermodynamic coefficients [15,16]:
$$\mu = pV + \sigma A + \kappa C + \bar{\kappa} X, \tag{4}$$
where $V$, $A$, $C$, and $X$ are the volume excluded by the protein, the surface area accessible to the solvent, and the integrated mean and Gaussian curvatures of the accessible surface, respectively. The thermodynamic coefficients are the pressure of the solvent $p$, the surface tension of the solvent $\sigma$ at a planar wall, and two bending rigidities, $\kappa$ and $\bar{\kappa}$, which account for the curvature effects. More details are described in our earlier publications [15,16].
The morphometric form of $\mu$, Eq. (4), separates the geometric properties of the protein and the thermodynamical coefficients. This feature allows one to determine the values of $p$, $\sigma$, $\kappa$, and $\bar{\kappa}$ in simple geometries. We determine these coefficients from calculations of $\mu$ of spherical solutes with varying radius using the integral equation with the hypernetted-chain (HNC) closure. In principle these coefficients can be determined via different routes like the integral equation, the density-functional theory, and computer simulations. The solvation entropy $S$ can be calculated through
$$S = -\left(\frac{\partial\mu}{\partial T}\right)_V, \tag{5}$$
where the temperature derivative is numerically evaluated from
$$\left(\frac{\partial\mu}{\partial T}\right)_V = \frac{\mu(T+\delta T) - \mu(T-\delta T)}{2\,\delta T}, \qquad \delta T = 5\ \mathrm{K}. \tag{6}$$
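Eqs. (4)–(6) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class and function names (`Geometry`, `Coefficients`, `morphometric_mu`, `solvation_entropy`, `toy_coeffs`) are hypothetical, and the toy coefficients are placeholder numbers chosen only to make the arithmetic easy to follow. In practice the four coefficients at each temperature would come from an integral-equation calculation for spherical solutes, which is not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Geometry:
    """Four geometric measures of one fixed protein structure (Eq. (4))."""
    V: float  # excluded volume
    A: float  # solvent-accessible surface area
    C: float  # integrated mean curvature of the accessible surface
    X: float  # integrated Gaussian curvature of the accessible surface


@dataclass(frozen=True)
class Coefficients:
    """Solvent-only coefficients at one temperature (hypothetical container)."""
    p: float          # pressure of the solvent
    sigma: float      # surface tension at a planar wall
    kappa: float      # bending rigidity paired with C
    kappa_bar: float  # bending rigidity paired with X


def morphometric_mu(g: Geometry, c: Coefficients) -> float:
    """Eq. (4): mu = p V + sigma A + kappa C + kappa_bar X."""
    return c.p * g.V + c.sigma * g.A + c.kappa * g.C + c.kappa_bar * g.X


def solvation_entropy(
    g: Geometry,
    coeffs_at: Callable[[float], Coefficients],
    T: float,
    dT: float = 5.0,
) -> float:
    """Eqs. (5)-(6): S = -(d mu / d T)_V from a central difference in T."""
    mu_plus = morphometric_mu(g, coeffs_at(T + dT))
    mu_minus = morphometric_mu(g, coeffs_at(T - dT))
    return -(mu_plus - mu_minus) / (2.0 * dT)


def toy_coeffs(T: float) -> Coefficients:
    """Placeholder coefficients proportional to T (not fitted to any solvent)."""
    return Coefficients(p=2.0 * T, sigma=1.0 * T, kappa=0.0, kappa_bar=0.0)


toy_geometry = Geometry(V=1.0, A=1.0, C=0.0, X=0.0)
mu_toy = morphometric_mu(toy_geometry, toy_coeffs(298.0))
S_toy = solvation_entropy(toy_geometry, toy_coeffs, T=298.0)
```

Because the geometry is held fixed while only the coefficients depend on temperature, the derivative in Eq. (5) requires no new geometric calculation. The toy coefficients are proportional to $T$, so `S_toy` equals `-mu_toy / 298.0`, which mirrors the relation $S = -\mu/T$ for a purely hard-sphere solvent.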
Under the isochoric condition, $S$ is governed mainly by the excluded volume effect and is not significantly influenced by the solute–solvent interaction potential [20]. However, $S$ in general depends on the nature of the solvent. We model a protein as a set of fused hard spheres. The diameter of each atom is set at the LJ potential diameter of AMBER99. The complex polyatomic structure is taken into account at the atomic level by our morphometric approach. The nature of the solvent–solvent interaction potential is reflected in the values of the four coefficients.
We consider three models for the solvent. In model 1, the solvent is treated as hard spheres with diameter $d = 0.28$ nm. Within this model $S = -\mu/T$. In model 2, spheres forming the solvent interact through a strongly attractive pair potential $u(r)$ given by
$$u(r) = \infty \quad \text{for } r < d, \tag{7a}$$
$$u(r) = -\varepsilon\,(d/r)^6 \quad \text{for } r \ge d. \tag{7b}$$
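The pair potentials of models 1 and 2 can be written down directly, as in the following minimal sketch. The function and constant names are hypothetical. The potentials are returned in reduced form, $u/(k_B T)$, using the values quoted in the text: $d = 0.28$ nm and, for model 2, $\varepsilon/(k_B T) = 1.8$ at $T = 298$ K. The helper `number_density` simply inverts the reduced density $\rho d^3$.

```python
import math

D_NM = 0.28        # solvent hard-sphere diameter d in nm (from the text)
EPS_OVER_KT = 1.8  # epsilon / (k_B T) at T = 298 K (from the text)


def u_hard_sphere(r: float, d: float = D_NM) -> float:
    """Model 1: reduced pair potential u/(k_B T) of hard spheres."""
    return math.inf if r < d else 0.0


def u_attractive(r: float, d: float = D_NM, eps: float = EPS_OVER_KT) -> float:
    """Model 2: hard core plus attractive tail, Eqs. (7a) and (7b)."""
    if r < d:
        return math.inf            # Eq. (7a): impenetrable core
    return -eps * (d / r) ** 6     # Eq. (7b): attraction decaying as r^-6


def number_density(reduced_density: float, d: float = D_NM) -> float:
    """Number density rho (per nm^3) from the reduced density rho * d^3."""
    return reduced_density / d ** 3


contact_value = u_attractive(D_NM)    # deepest point of the well, at r = d
rho_models_1_2 = number_density(0.7)  # rho for rho * d^3 = 0.7
```

At contact ($r = d$) the model-2 potential equals $-\varepsilon/(k_B T)$, and it decays by a factor of 64 when the separation doubles, so the attraction is short-ranged on the scale of the solvent diameter.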
The value of $\varepsilon/(k_B T)$ ($k_B$ is Boltzmann’s constant) is chosen to be 1.8 for $T = 298$ K. Following our earlier works [3,4], we set the number density of bulk solvent $\rho d^3$ at 0.7 in models 1 and 2. Model 3 is a much more realistic model of water. A solvent molecule is a hard sphere with diameter $d$ in which a point dipole and a point quadrupole of tetrahedral symmetry are embedded [21–26]. The effects of the molecular polarizability are taken into account using a mean-field theory. The angle-dependent integral equation theory [21–26] is employed to calculate the four coefficients in the morphometric form. $\rho d^3$ is set at 0.7317, the value for water at ambient temperature and pressure.
4. Hydration entropy of protein G in different structures
Our concept is that $\Delta Y$ for local-minimum states of a protein immersed in water can be approximated by $-T\Delta S$, which is calculated in the manner described in Section 3 ($\Delta Y \approx -T\Delta S$). When a number of structures are considered, the relative values of $-TS$ among different structures correspond to those of $Y$. We are interested in knowing if $-TS$ takes the minimum value (i.e., the water entropy takes the maximum value) for the native structure of a protein. To this end, we compare the solvent entropy for the native structure and 600 structures of protein G [16] which are taken from the local-minimum states of the protein immersed in water using a computer simulation with the all-atom potentials. They cover a very wide range of different structures including quite compact ones [16].
Figs. 1–3 show the plot of $-S/k_B - (-S/k_B)_{\mathrm{Native}}$ (the subscript denotes the value for the native structure) for models 1, 2, and 3, respectively, against the radius of gyration $R_g$. There is a general trend that $-S/k_B - (-S/k_B)_{\mathrm{Native}}$ is lower for smaller $R_g$. In Fig. 1, the solvent entropy for the native structure is fairly high, but there are several structures giving even higher solvent entropy. In Figs. 2 and 3, on the other hand, the solvent entropy takes the maximum value when the protein is in the native structure. It is interesting to note that the relative values of the solvent entropy among different structures are magnified when the solvent–solvent attractive interaction is incorporated.
It can thus be suggested that the native structure can be characterized as the structure maximizing the entropy of water. The solvent–solvent attractive interaction has been shown to be important in evaluating the entropy. However, the details of water molecules such as the rotational motion appear to be insignificant and the structural stability of a protein can be analyzed using model 2. The solvation entropy calculated using model 2 can also be regarded as the hydration entropy. Since compact structures with small values of $R_g$ share almost the same conformational entropy, their stability can be described in terms of $Y$. The above result supports our concept that $\Delta Y$ can be replaced by $-T\Delta S$ as long as a suitable model is employed for water.
In protein folding, the formation of the intramolecular hydrogen bonds itself is not a driving force. However, their formation is of vital importance to compensate the dehydration penalty accompanying the folding. A protein folds to increase the water entropy while assuring a sufficiently large number of intramolecular hydrogen bonds. Dill [27] claimed that the hydrophobic effect is a major driving force in protein folding. His view is consistent with ours though the physical origin of the hydrophobicity addressed is not the same. In our interpretation the hydrophobicity originates from the translational motion of water molecules [3–5].
Fig. 1. $-S/k_B - (-S/k_B)_{\mathrm{Native}}$ plotted against the radius of gyration $R_g$ for model 1. The value of $R_g$ for the native structure is 1.08 nm. $S$ denotes the solvation entropy. $-S/k_B$ measures the magnitude of the solvent-entropy loss upon the insertion of the protein in a fixed structure. The smaller the loss is, the higher the solvent entropy is. In this figure, there are several structures for which the ordinate is negative and thus the loss is smaller (i.e., the solvent entropy is higher) than that for the native structure.
Fig. 2. $-S/k_B - (-S/k_B)_{\mathrm{Native}}$ plotted against the radius of gyration $R_g$ for model 2.
Fig. 3. $-S/k_B - (-S/k_B)_{\mathrm{Native}}$ plotted against the radius of gyration $R_g$ for model 3.
5. Crucial importance of formation of secondary structures
It is obvious that the formation of the helical structure by a long backbone, which features the α-helix structure, leads to a significant decrease in the excluded volume for the water molecules [3–5]. The formation of the β-sheet structure also results in the excluded-volume decrease due to the lateral contact of backbones [3–5]. The formation of these secondary structures also gives contacts of the side chains followed by a further decrease in the excluded volume. The decrease in the excluded volume means an increase in the volume of the configurational phase space explored by the water molecules and a translational-entropy gain. Another important feature of the formation of the secondary structures is that a large number of intramolecular hydrogen bonds can be assured to make up for the dehydration penalty. It is no surprise that the secondary structures frequently appear in the native structure.
6. Relevance to experimental observations
Our idea, that the translational motion of water molecules is a major driving force in protein folding, is apparently consistent with the experimental observation that the changes in entropy and enthalpy upon the folding are usually both positive at sufficiently low temperatures (<280 K) [28–30] and the folding is entropically driven. It is remarkable that the entropy change is positive despite the huge loss of the conformational entropy. At higher temperatures, the entropy and enthalpy changes are negative for most of the proteins [31]. However, this does not conflict with our idea for the following reason. In general, the change in the hydration free energy for a process occurring in aqueous solution is the same under isochoric and isobaric conditions while the changes in the hydration energy (or enthalpy) and entropy are not. Protein folding is no exception. The experiments are performed under the isobaric condition while we consider the folding under the isochoric condition for theoretical convenience. We have recently made a fundamental analysis on the changes in hydration thermodynamic quantities [14]. An important finding is that under the isobaric condition the folding is accompanied by system-volume compression. The compression leads to a gain in the internal energy of water within the system and a corresponding loss of the water entropy, keeping the free energy of water unchanged. The effect of the compression is quite small below 280 K but becomes increasingly larger with rising temperature. As a result, the changes in enthalpy and entropy upon the folding are strongly dependent on the temperature whereas the temperature dependency of the free-energy change is much weaker due to the cancellation of the enthalpy and entropy changes. This is the so-called entropy–enthalpy compensation phenomenon [31–34] which is known for a variety of physicochemical processes occurring in aqueous solution as well as for protein folding. 
At higher temperatures, the large gain in the internal energy of water gives rise to a negative enthalpy change upon the folding and a water-entropy gain that is smaller than the conformational-entropy loss. In either case, the translational motion of water molecules drives a protein to fold in order to lower the free energy of water.
The view [2] that the native structure is stabilized by the gain in the intramolecular van der Waals energy is only superficially consistent with the negative enthalpy change discussed above. We have recently shown that the van der Waals interaction is weaker than the attractive interaction induced by the water-entropy effect (see Fig. 6 in [4]). Moreover, the burial of the backbone and side chains entails a loss of van der Waals interactions with water molecules that should be considerable. If one looks at the intramolecular van der Waals energy alone, it could be almost the lowest for the native structure. However, in our view the gain in this energy itself is not a driving force in protein folding in aqueous solution. The intramolecular electrostatic energy of the native structure is also quite low due to the formation of sufficiently many intramolecular hydrogen bonds, but the formation is accompanied by a serious loss of hydrogen bonds with water molecules. Again, the gain in this energy is not powerful enough to drive a protein to fold.
The entropy–enthalpy compensation also means that the free-energy change is considerably smaller than the entropy and enthalpy changes. Even under the isochoric condition, $\Delta\mu$ could be much smaller in magnitude than $\Delta E_H$ and $T\Delta S$ due to a significant cancellation. However, $\Delta\mu$ still plays important roles: it has been shown that $E_I$ is not always minimized for the native structure and that the hydration effects must be taken into account to obtain a better energy function [10], one which takes the lowest value for the native structure with higher accuracy.
At 277 K the thermal expansion coefficient of water is zero and there is no effect due to the volume compression of water [14].
As a result, the changes in the thermodynamic quantities upon protein folding under the isobaric condition do not differ from those under the isochoric condition. The energetics presented in this Letter is supported by the experimental observation that the folding is entropically driven at 277 K as mentioned above. When the folding is considered under the isochoric condition, we can take the following view: A protein is designed so that it can fold into the structure maximizing the water entropy under the requirement that a sufficiently large number of intramolecular hydrogen bonds be formed to compensate the dehydration penalty.
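The link between the isobaric (experimental) and isochoric (theoretical) quantities discussed in this section can be summarized in a short sketch. It assumes the standard thermodynamic conversion between constant-pressure and constant-volume hydration quantities, and is not taken from the detailed analysis of Ref. [14]. The symbols $\alpha$ (thermal expansion coefficient of water), $\kappa_T$ (isothermal compressibility of water), $\Delta V$ (system-volume change upon folding under the isobaric condition), and the subscript $P$ (isobaric quantity) are introduced here for illustration only.

```latex
\begin{align}
\Delta\mu &= \Delta E_H - T\Delta S
           = \Delta H_P - T\Delta S_P, \\
\Delta H_P &= \Delta E_H + T\,\frac{\alpha}{\kappa_T}\,\Delta V, \\
\Delta S_P &= \Delta S + \frac{\alpha}{\kappa_T}\,\Delta V .
\end{align}
```

Under these assumptions the extra terms cancel in $\Delta\mu$, so the free-energy change is the same under both conditions. For a compression ($\Delta V < 0$) with $\alpha > 0$, both the enthalpy and entropy changes are shifted toward negative values, and at 277 K, where $\alpha = 0$, the isobaric and isochoric quantities coincide.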
7. Concluding remarks
In aqueous solution, a protein is driven to fold into a tightly packed structure on the condition that a sufficiently large number of intramolecular hydrogen bonds is assured to compensate the dehydration penalty. The water-entropy effect originating from the translational motion of water molecules is a major driving force. The α-helix and β-sheet structures are the most favorable units not only increasing the water entropy but also forming intramolecular hydrogen bonds. They are likely to be formed in an early stage of the folding process. By packing the side chains with a variety of sizes and geometries in a later stage, the uniqueness of the native structure can be assured. This feature is true only for the amino-acid sequences selected by nature (or those which are foldable) immersed in aqueous solution under physiological conditions. In this respect, protein folding is clearly distinguished from the conformational propensity of a heteropolymer or a polypeptide with an arbitrary amino-acid sequence.
The gain in the intramolecular energy is largely compensated by the loss of the hydration energy and the structural stability is governed by the change in the entropy of water. This concept is effective only on the condition that atoms such as ‘O’ and ‘N’ form intramolecular hydrogen bonds when they are buried. The structures of protein G tested in this Letter correspond to local-minimum states of the protein immersed in water. The result from our study is suggestive that such structures, which should avoid serious dehydration penalty, satisfy the condition mentioned above. If we wish to discriminate the native structure from artificially constructed, misfolded structures many of which possess only an insufficient number of intramolecular hydrogen bonds, for example, the water entropy alone is not a good measure and additional considerations are required. Work in this direction is in progress [35].
Acknowledgements
We thank K. Nagayama and N. Matubayasi for useful comments. This work was supported by Grants-in-Aid for Scientific Research on Priority Areas (No. 15076203) from the Ministry of Education, Culture, Sports, Science and Technology of Japan and by the Next Generation Super Computing Project, Nanoscience Program, MEXT, Japan.
References
[1] C.M. Dobson, Nature 426 (2003) 884.
[2] C.N. Pace, Biochemistry 40 (2001) 310.
[3] Y. Harano, M. Kinoshita, Chem. Phys. Lett. 399 (2004) 342.
[4] Y. Harano, M. Kinoshita, Biophys. J. 89 (2005) 2701.
[5] M. Kinoshita, Chem. Eng. Sci. 61 (2006) 2150.
[6] Y. Harano, M. Kinoshita, J. Phys.: Condens. Matter 18 (2006) L107.
[7] Y. Harano, M. Kinoshita, J. Chem. Phys. 125 (2006) 024910.
[8] R.L. Baldwin, J. Biol. Chem. 278 (2003) 17581.
[9] S.F. Sneddon, D.J. Tobias, C.L. Brooks III, J. Mol. Biol. 209 (1989) 817.
[10] B.N. Dominy, C.L. Brooks III, J. Comput. Chem. 23 (2002) 147.
[11] M. Kinoshita, Y. Okamoto, F. Hirata, J. Chem. Phys. 110 (1999) 4090.
[12] M. Kinoshita, Y. Okamoto, F. Hirata, J. Am. Chem. Soc. 120 (1998) 1855.
[13] A. Mitsutake, M. Kinoshita, Y. Okamoto, F. Hirata, J. Phys. Chem. B 108 (2004) 19002.
[14] M. Kinoshita, Y. Harano, R. Akiyama, J. Chem. Phys., to be published.
[15] P.M. König, R. Roth, K.R. Mecke, Phys. Rev. Lett. 93 (2004) 160601.
[16] R. Roth, Y. Harano, M. Kinoshita, Phys. Rev. Lett. 97 (2006) 078101.
[17] M. Ikeguchi, J. Doi, J. Chem. Phys. 103 (1995) 5011.
[18] M. Kinoshita, J. Chem. Phys. 116 (2002) 3493.
[19] M. Kinoshita, Chem. Phys. Lett. 387 (2004) 47.
[20] T. Imai, Y. Harano, M. Kinoshita, A. Kovalenko, F. Hirata, J. Chem. Phys. 125 (2006) 024911.
[21] P.G. Kusalik, G.N. Patey, J. Chem. Phys. 88 (1988) 7715.
[22] P.G. Kusalik, G.N. Patey, Mol. Phys. 65 (1988) 1105.
[23] M. Kinoshita, M. Harada, Mol. Phys. 81 (1994) 1473.
[24] M. Kinoshita, J. Sol. Chem. 33 (2004) 661.
[25] M. Kinoshita, J. Mol. Liq. 119 (2005) 47.
[26] M. Kinoshita, N. Matubayasi, Y. Harano, M. Nakahara, J. Chem. Phys. 124 (2006) 024512.
[27] K.A. Dill, Biochemistry 29 (1990) 7133.
[28] P.L. Privalov, N.N. Khechinashvili, J. Mol. Biol. 86 (1974) 665.
[29] G.I. Makhatadze, P.L. Privalov, J. Mol. Biol. 232 (1993) 639.
[30] P.L. Privalov, G.I. Makhatadze, J. Mol. Biol. 232 (1993) 660.
[31] L. Liu, C. Yang, Q.-X. Guo, Biophys. Chem. 84 (2000) 239.
[32] R. Lumry, S. Rajender, Biopolymers 9 (1970) 1125.
[33] B. Lee, Biophys. Chem. 51 (1994) 271.
[34] K. Sharp, Protein Sci. 10 (2001) 661.
[35] Y. Harano, R. Roth, Y. Sugita, M. Ikeguchi, M. Kinoshita, Phys. Rev. Lett., to be published.