144
ENERGETICS OF BIOLOGICAL MACROMOLECULES
[61
linkage between important structural domains of the enzyme is dominated by the kinetic, rather than the equilibrium, components. We have given the exact solution of the linkage scheme for serine proteases in the presence of an allosteric effector, as an extension of the Botts-Morales treatment of the action of a modifier.23 The solution reveals the substantial complexity of linked functions at steady state and, at the same time, provides a convincing example of how macromolecules can exploit more complicated pathways of communication to accomplish biological function. Our treatment sets the stage for a quantitative analysis of allosteric effects that dominate the blood coagulation cascade. It also provides the necessary framework for casting protein-protein interactions in this biologically relevant system. We have seen that in the equilibrium picture the interference between two ligands is quantified by a coupling free energy in the thermodynamic cycle. In the kinetic scenario all kinetic fluxes in the linkage cycle are to be taken into account. Therefore, it is only from the combination of thermodynamic and kinetic information on linked effects that we can obtain unequivocal answers to molecular mechanisms of site-site communication in biological macromolecules. A comprehensive approach along thermodynamic and kinetic coordinates is best suited to handle the many challenging questions that arise in the study of regulatory interactions in complex macromolecular systems.
Acknowledgments
This work was supported in part by NIH Grant ELL49413, NSF Grant MCB94-06103, and by grants from the American Heart Association and the Monsanto-Searle Company. E. D. C. is an Established Investigator of the American Heart Association and Genentech. Equations (15)-(19) were derived using a symbolic algebraic algorithm in Mathematica, running on a Hewlett-Packard Apollo 9000/730 computer.
[6] Thermal Denaturation Methods in the Study of Protein Folding
By ERNESTO FREIRE
Introduction
Temperature occupies a central and unique role as a perturbant of the equilibrium between different conformational species in macromolecules. The temperature dependence of the equilibrium provides access to the enthalpy, entropy, and heat capacity components of the Gibbs free energy.
METHODS IN ENZYMOLOGY, VOL. 259
Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
[6] THERMAL DENATURATION METHODS 145
Since the enthalpy of a system is the conjugate variable of the temperature (or more properly the inverse temperature), an experimental access to this quantity permits an experimental determination of the partition function and therefore a complete thermodynamic characterization of the system under study. Because differential scanning calorimetry (DSC) provides this information, it is the technique of choice to determine the energetics of protein folding/unfolding transitions and the thermodynamic mechanisms underlying those reactions. Therefore a significant portion of this chapter is dedicated to a discussion of the theoretical foundations of DSC and the statistical thermodynamic characterization of thermally induced transitions. Differential scanning calorimetry measures the heat capacity of a solution directly; however, only recently have instruments with the required sensitivity and precision for absolute measurements of the heat capacity of proteins in dilute solution been developed. The heat capacity itself contains a wealth of information and can be related directly to structural parameters. Consequently, this chapter concludes with an analysis of the heat capacity of proteins in different conformational states and its relation to structural parameters. In this chapter we focus on the folding/unfolding equilibrium of monomeric protein systems under equilibrium conditions; however, the equations and the general treatment are applicable to other macromolecular systems or can be extended in a straightforward fashion to other systems.
General Considerations
Statistical Thermodynamic Representation of Conformational Equilibrium
The most fundamental quantity required to describe the conformational equilibrium of a monomeric protein is the partition function, $Q$, defined as the sum of the statistical weights of all the states accessible to the protein:
$$Q = \sum_{i=0}^{N} \exp(-\Delta G_i/RT) \tag{1}$$
where the statistical weights or Boltzmann exponents [$\exp(-\Delta G_i/RT)$] are defined in terms of the Gibbs free energy $\Delta G_i$ for each state, $R$ is the gas constant, and $T$ the absolute temperature. Because the system under consideration is characterized by a constant number of particles and an average energy, the partition function in Eq. (1) can be equated to the canonical ensemble partition function. The Gibbs free energy of each state is given by the standard thermodynamic relationship:
$$\Delta G_i = \Delta H_i(T_R) + \int_{T_R}^{T} \Delta C_{p,i}\, dT - T\left[\Delta S_i(T_R) + \int_{T_R}^{T} \Delta C_{p,i}\, d\ln T\right] \tag{2a}$$
$$\Delta G_i = \Delta H_i(T_R) + \Delta C_{p,i}(T - T_R) - T\left[\Delta S_i(T_R) + \Delta C_{p,i}\ln(T/T_R)\right] \tag{2b}$$
where $\Delta H_i(T_R)$ and $\Delta S_i(T_R)$ are the relative enthalpy and entropy of state $i$ at the reference temperature $T_R$ and $\Delta C_{p,i}$ is the relative heat capacity of that state. Equation (2a) is the most general equation, while Eq. (2b) is the traditional equation in which the heat capacity difference between states is assumed to be temperature independent. For convenience, the native state (state 0) is chosen as the reference state to express all relative thermodynamic parameters. All thermodynamic parameters can be expressed in terms of the partition function. The average system free energy ($\langle\Delta G\rangle$) is equal to
$$\langle\Delta G\rangle = -RT\ln Q \tag{3}$$
the average excess enthalpy ($\langle\Delta H\rangle$) is equal to
$$\langle\Delta H\rangle = RT^2(\partial\ln Q/\partial T) \tag{4}$$
and the average excess entropy ($\langle\Delta S\rangle$) is equal to
$$\langle\Delta S\rangle = RT(\partial\ln Q/\partial T) + R\ln Q \tag{5}$$
The temperature dependence of the system free energy is used to define the character of phase transitions.
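As a rough numerical illustration of Eqs. (1)-(5), the sketch below evaluates the partition function and the ensemble averages for a single unfolded state in equilibrium with the native state, using Eq. (2b) for the free energy. The parameter values (transition temperature, enthalpy, heat capacity change), the function names, and the finite-difference step are assumptions chosen for illustration only; they are not taken from the text.

```python
import math

R = 8.314  # gas constant, J/(K mol)

def delta_g(T, dH_ref, dS_ref, dCp, T_ref):
    """Eq. (2b): free energy of a state relative to the native state (J/mol)."""
    return dH_ref + dCp * (T - T_ref) - T * (dS_ref + dCp * math.log(T / T_ref))

def ln_partition_function(T, states, T_ref):
    """Natural log of Eq. (1); the native reference state has weight exp(0) = 1."""
    weights = [math.exp(-delta_g(T, dH, dS, dCp, T_ref) / (R * T))
               for dH, dS, dCp in states]
    return math.log(1.0 + sum(weights))

def ensemble_averages(T, states, T_ref, step=1e-3):
    """Eqs. (3)-(5), using a central finite difference for d(ln Q)/dT."""
    ln_q = ln_partition_function(T, states, T_ref)
    dlnq_dT = (ln_partition_function(T + step, states, T_ref)
               - ln_partition_function(T - step, states, T_ref)) / (2.0 * step)
    avg_G = -R * T * ln_q                 # Eq. (3)
    avg_H = R * T**2 * dlnq_dT            # Eq. (4)
    avg_S = R * T * dlnq_dT + R * ln_q    # Eq. (5)
    return avg_G, avg_H, avg_S

# Hypothetical two-state protein (assumed values, not from the text).
T_m = 335.15                 # K, temperature at which dG of the unfolded state is zero
dH_m = 400e3                 # J/mol, enthalpy of the unfolded state at T_m
dCp = 8e3                    # J/(K mol), heat capacity change
unfolded = (dH_m, dH_m / T_m, dCp)   # (dH, dS, dCp) at the reference temperature

avg_G, avg_H, avg_S = ensemble_averages(T_m, [unfolded], T_ref=T_m)
print(f"<dG> = {avg_G:.1f} J/mol, <dH> = {avg_H:.1f} J/mol, <dS> = {avg_S:.2f} J/(K mol)")
```

At the transition temperature the two states are equally populated, so the average excess enthalpy should come out close to half of the unfolding enthalpy, and the three averages should satisfy the usual relation between free energy, enthalpy, and entropy.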
Average System Properties
The observed or measured values of extensive physical properties contain contributions from all the molecules in the system and as such, when normalized on a per-molecule or per-mole basis, they constitute canonical ensemble averages (unless otherwise indicated all quantities in this chapter are expressed on a per-mole basis). So, for example, if we designate by $\alpha$ any arbitrary physical observable of the system, then the total magnitude of $\alpha$ ($\alpha_{\mathrm{Total}}$) will be given by
$$\alpha_{\mathrm{Total}} = \sum_{i=0}^{N} n_i \alpha_i \tag{6}$$
where $n_i$ is the total number of molecules in state $i$ and $\alpha_i$ is the characteristic contribution of state $i$ to the observable $\alpha$. The molar average value of the observable ($\langle\alpha\rangle$) is obtained by dividing $\alpha_{\mathrm{Total}}$ by the total number of moles in the system:
$$\langle\alpha\rangle = \alpha_{\mathrm{Total}}/N_{\mathrm{Total}} = \sum_{i=0}^{N} (n_i/N_{\mathrm{Total}})\,\alpha_i \tag{7a}$$
$$= \sum_{i=0}^{N} P_i \alpha_i \tag{7b}$$
where $P_i$ is the population of molecules in state $i$. The angular brackets ($\langle\ \rangle$) are used to designate ensemble averages as opposed to time averages even though for an ergodic system they should be identical. For the canonical ensemble $P_i$ is equal to the ratio of the statistical weight of that state over the sum of the statistical weights of all the states:
$$P_i = \frac{\exp(-\Delta G_i/RT)}{\sum_{j=0}^{N} \exp(-\Delta G_j/RT)} \tag{8a}$$
$$P_i = \exp(-\Delta G_i/RT)/Q \tag{8b}$$
It follows that the ensemble average of $\alpha$ is equal to
$$\langle\alpha\rangle = \sum_{i=0}^{N} \alpha_i \exp(-\Delta G_i/RT)/Q \tag{9}$$
Equations (7) and (9) establish a rigorous mathematical relationship between an arbitrary physical observable of the system and the thermodynamic parameters that govern the conformational equilibrium. Except for thermodynamic parameters and their conjugate variables there is in general no relationship between $\alpha_i$ and $\Delta G_i$. For example, if $\alpha$ is a spectroscopic observable like the quantum yield or polarization in a fluorescence measurement or ellipticity in a circular dichroism (CD) experiment, it is clear that it will not be related to the free energy in any predictable way. For this reason, Eqs. (7) and (9) can be solved exactly only for the simplest case, in which the equilibrium involves only two states. The situation is different, however, if the observable is the enthalpy and is known as a function of temperature from a DSC experiment.
Two-State Equilibrium
For the case in which only two states are in equilibrium, Eq. (7) reduces to
$$\langle\alpha\rangle = P_0\alpha_0 + P_N\alpha_N \tag{10}$$
which can be solved for $P_N$ by taking advantage of the relationship $\sum P_i = 1$,
$$P_N = (\langle\alpha\rangle - \alpha_0)/(\alpha_N - \alpha_0) \tag{11}$$
Equation (11) establishes a direct link between the observable $\alpha$ and the thermodynamic parameters of the system, because in this case
$$\exp(-\Delta G_N/RT) = P_N/(1 - P_N) \tag{12}$$
The reader should recognize that Eq. (12) is identical to the equilibrium constant ($K_N$) for the reaction $I_0 \rightleftharpoons I_N$. For a two-state equilibrium the population obtained by Eq. (11) corresponds to the true population of molecules in state $N$ and as such the derived thermodynamic parameters correspond to the correct ones. This is, however, not true if the equilibrium involves more than two states. It must be pointed out that under most circumstances the analysis of thermal denaturation curves is performed with Eqs. (11) and (12) even though it is not known a priori whether or not the system conforms to the two-state mechanism. For that reason thermodynamic parameters obtained with these equations are often called "apparent" parameters. Also, for a two-state equilibrium the values obtained with Eqs. (11) and (12) are independent of the nature of the observable used to measure the equilibrium, that is, different observables will yield the same results. This fact was realized many years ago by Lumry and collaborators,1 who devised a series of tests aimed at evaluating the validity of the two-state approximation in a number of experimental situations.
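The two-state relations in Eqs. (11) and (12) can be sketched in a few lines. The signal values below (a hypothetical ellipticity reading with assumed native and denatured baselines) and the function names are illustrative assumptions, not data from the text.

```python
import math

R = 8.314  # gas constant, J/(K mol)

def population_final_state(alpha_obs, alpha_0, alpha_N):
    """Eq. (11): apparent population of state N from a measured observable."""
    return (alpha_obs - alpha_0) / (alpha_N - alpha_0)

def apparent_delta_g(p_N, T):
    """Eq. (12) rearranged: dG_N = -RT ln[P_N / (1 - P_N)], in J/mol."""
    return -R * T * math.log(p_N / (1.0 - p_N))

# Hypothetical spectroscopic signal (assumed values).
alpha_0 = -12000.0    # signal of the native state
alpha_N = -3000.0     # signal of the denatured state
alpha_obs = -9750.0   # signal measured at temperature T
T = 330.0             # K

p_N = population_final_state(alpha_obs, alpha_0, alpha_N)
dG_N = apparent_delta_g(p_N, T)
print(f"P_N = {p_N:.3f}, apparent dG_N = {dG_N:.0f} J/mol")
```

The value obtained this way is an "apparent" free energy in the sense used above: it equals the true free energy only if the transition really is two-state.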
van't Hoff Analysis and Cooperativity
If the value of a physical observable is measured at different temperatures, then it is possible to determine $P_N$ and $K_N$ at those temperatures for a two-state transition. In the usual situation encountered in protein folding/unfolding studies, the population of molecules in the native state ($I_0$) is maximal at low temperatures while that of the denatured state ($I_N$) is maximal at high temperatures. On increasing the temperature the denatured state becomes populated in a characteristic sigmoidal way, as shown in Fig. 1. It must be emphasized that the sigmoidal character or appearance of a transition is not an indication of cooperativity, as has been erroneously argued,2 because a noncooperative transition will also exhibit a sigmoidal temperature profile as shown in Fig. 1. The temperature at which the populations of molecules in the native and denatured states are the same is known as the transition temperature ($T_m$). At this temperature $\Delta G_N = 0$ and $K_N = 1$. The thermodynamic parameters for the transition can be obtained from the temperature dependence of $K_N$:
$$\ln K_N = (-\Delta H_N/RT) + \Delta S_N/R \tag{13}$$
and therefore the slope of a plot of $\ln K_N$ versus $1/T$, ($-\Delta H_N/R$), yields the enthalpy change for the reaction. Equation (13) implements the classic van't Hoff analysis and, as such, the enthalpy obtained in this way is called the van't Hoff enthalpy. The entropy change can be obtained at $T_m$ because at this temperature $\Delta S_N = \Delta H_N/T_m$. It must be realized that the thermodynamic parameters derived from a van't Hoff analysis ultimately depend on whether or not the calculated populations of states correspond to the true populations. If the transition conforms to the two-state situation the calculated populations using Eqs. (11) and (12) correspond to the true populations and the van't Hoff or "apparent" thermodynamic parameters are the correct ones. If the equilibrium involves more than two states the populations will be incorrectly calculated and the van't Hoff or "apparent" thermodynamic parameters will also be incorrect. The problem is that in most experimental situations the number of states involved in the equilibrium is not known. This situation was recognized early on by protein researchers,1 who concluded that one of the best ways of assessing the validity of the two-state approximation was to compare the enthalpy change obtained from a van't Hoff analysis with the one measured directly using calorimetry. The calorimetric enthalpy represents the actual enthalpy for a transition because it is equal to the amount of heat released or absorbed divided by the concentration. As such it is a true state function, depending only on the nature of the initial and final states. The van't Hoff enthalpy, on the other hand, reflects the enthalpy associated with the transformation of 1 mol of "cooperative unit." The physical or structural extent of this cooperative unit is an intrinsic property of the system, independent of the normalization used by the researcher. It is in fact determined by the magnitude and extent of the cooperative interactions within the system. If these cooperative interactions extend to the entire protein molecule then the van't Hoff and calorimetric enthalpies will be identical. If the van't Hoff enthalpy is smaller than the calorimetrically measured enthalpy, then the system-defined cooperative unit will be smaller than the unit defined by the researcher to calculate the calorimetric enthalpy (usually per mole of protein molecules). If this is the case, the transition will proceed through partly folded intermediates because the intrinsic cooperative unit is smaller than the protein itself. The converse occurs if the van't Hoff enthalpy is larger than the calorimetric enthalpy. In this case the system cooperative unit extends beyond a single molecule, indicating the existence of intermolecular interactions (usually oligomerization). For this reason, the ratio of the van't Hoff to calorimetric enthalpy is a reflection of the cooperative interactions existing within the system.
[Figure 1: simulated thermal transition curves plotted against Temp (°C), 0 to 120; vertical axis 0 to 1; curve A labeled.]
FIG. 1. A series of simulated thermal transitions centered at 62°, as would be observed by arbitrary noncalorimetric observables. All of these transitions exhibit a sigmoidal shape even though they are characterized by different cooperativity. Curve A is a fully cooperative two-state transition while curves B, C, and D exhibit progressively less cooperativity. In all cases the overall thermodynamic parameters $\Delta H$, $\Delta S$, and $\Delta C_p$ are the same. This family of curves demonstrates that it is impossible to obtain true thermodynamic parameters from noncalorimetric data unless the transition is of the two-state type (curve A). The reason for this is that the thermodynamic parameters are deduced from the shape of the curves (e.g., van't Hoff analysis) and the shapes are different even though the thermodynamic parameters are the same. Only direct calorimetric measurements provide the true thermodynamic parameters and allow evaluation of the cooperativity of the transition. Without knowledge of the true enthalpy, it is impossible to evaluate the cooperativity of a transition.
1 R. Lumry, R. Biltonen, and J. F. Brandts, Biopolymers 4, 917 (1966).
2 K. A. Dill, S. Bromberg, K. Yue, K. M. Fiebig, D. P. Yee, P. D. Thomas, and H. S. Chan, Protein Sci. 4, 561 (1995).
In general, a van't Hoff analysis provides correct thermodynamic parameters only for a two-state equilibrium. Therefore, a calorimetric technique such as differential scanning calorimetry (DSC), which measures directly the thermodynamic parameters for the conformational equilibrium without any model assumptions, is the only one that provides direct access to the energetics of the system. In addition, DSC also provides a way to study the thermodynamic mechanism of the transition. In the next sections we provide an in-depth discussion of the theoretical foundations of DSC.
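The van't Hoff analysis of Eq. (13) amounts to a straight-line fit of $\ln K_N$ against $1/T$. The following minimal sketch fits simulated two-state data and forms the ratio of the van't Hoff to the calorimetric enthalpy. The simulated parameters, the assumed calorimetric enthalpy, the neglect of any heat capacity change over the fitted range, and the function name are all assumptions made for illustration.

```python
import math

R = 8.314  # gas constant, J/(K mol)

def vant_hoff_fit(temps, ln_k):
    """Least-squares line through ln K versus 1/T (Eq. 13).

    Returns the van't Hoff enthalpy (-R * slope) and entropy (R * intercept).
    """
    x = [1.0 / T for T in temps]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(ln_k) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, ln_k))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return -R * slope, R * intercept

# Simulated two-state data (assumed values; dCp taken as zero over this range).
dH_true = 400e3                  # J/mol
T_m = 335.15                     # K
dS_true = dH_true / T_m          # J/(K mol), since dG = 0 at T_m
temps = [328.0 + i for i in range(15)]   # 328 K to 342 K
ln_k = [-dH_true / (R * T) + dS_true / R for T in temps]

dH_vh, dS_vh = vant_hoff_fit(temps, ln_k)

dH_cal = 400e3                   # hypothetical calorimetric enthalpy, J/mol
ratio = dH_vh / dH_cal           # close to 1 for a two-state transition
print(f"dH_vH = {dH_vh:.0f} J/mol, dS_vH = {dS_vh:.1f} J/(K mol), ratio = {ratio:.3f}")
```

With real data, $\ln K_N$ would be obtained from the measured observable through Eqs. (11) and (12), and a ratio clearly below or above 1 would point to intermediates or to intermolecular cooperativity, respectively, as discussed above.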
Differential Scanning Calorimetry
Differential scanning calorimetry measures the heat capacity of the solution present in the calorimeter cell as a continuous function of temperature. To determine the heat capacity of a protein, the data from both the protein solution scan and the buffer scan are needed. For the buffer (solvent) scan the measured heat capacity can be written as
$$C_{p,b} = m_b c_{p,b} \tag{14}$$
where $m_b$ is the mass of solvent in the cell and $c_{p,b}$ is the specific heat capacity of the buffer solution. Similarly, the heat capacity of the protein solution can be written as
$$C_{p,p} = m_p c_{p,p} + m_b' c_{p,b} \tag{15}$$
where $c_{p,p}$ is the heat capacity of the protein per unit mass, $m_p$ is the mass of protein in the calorimetric cell, and $m_b'$ is the mass of solvent in the protein solution. $c_{p,p}$ can be obtained as follows:
$$c_{p,p} = [(C_{p,p} - C_{p,b}) + (m_b - m_b')c_{p,b}]/m_p \tag{16}$$
where the quantity $(m_b - m_b')$ is equal to the mass of solvent displaced by the protein and can be written in terms of the partial specific volume of the protein as3
$$c_{p,p} = (C_{p,p} - C_{p,b})/m_p + c_{p,b}(V_p/V_b) \tag{17}$$
where $V_p$ and $V_b$ are the partial specific volumes of the protein and solvent, respectively. The partial molar heat capacity function ($C_p$) is simply equal to $c_{p,p}$ multiplied by the molecular weight of the protein. $C_p$ is the main quantity measured by DSC and constitutes the center of our discussion.
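The bookkeeping behind Eqs. (14)-(17) can be checked with a small self-consistency sketch: a hypothetical cell is filled first with buffer and then with protein solution, the two measured heat capacities are simulated from Eqs. (14) and (15), and Eq. (17) is used to recover the specific heat capacity of the protein. The cell volume, specific volumes, masses, heat capacities, and molar mass are assumed values, and the function name is hypothetical.

```python
def protein_specific_heat(C_solution, C_buffer, m_protein, c_buffer, v_protein, v_buffer):
    """Eq. (17): specific heat capacity of the protein, J/(K g)."""
    return (C_solution - C_buffer) / m_protein + c_buffer * (v_protein / v_buffer)

# Hypothetical cell and sample (assumed values).
cell_volume = 1.0        # mL
v_buffer = 1.0           # mL/g, partial specific volume of the solvent
v_protein = 0.73         # mL/g, partial specific volume of the protein
c_buffer = 4.18          # J/(K g), specific heat capacity of the buffer
m_protein = 1.0e-3       # g of protein in the cell
c_protein_true = 1.5     # J/(K g), value used to simulate the measurement

# Solvent mass in the buffer scan and in the protein scan; the protein
# displaces a solvent volume equal to m_protein * v_protein.
m_b = cell_volume / v_buffer
m_b_prime = (cell_volume - m_protein * v_protein) / v_buffer

C_buffer = m_b * c_buffer                                          # Eq. (14)
C_solution = m_protein * c_protein_true + m_b_prime * c_buffer     # Eq. (15)

c_protein = protein_specific_heat(C_solution, C_buffer, m_protein,
                                  c_buffer, v_protein, v_buffer)

molar_mass = 14300.0                 # g/mol, hypothetical protein
Cp_molar = c_protein * molar_mass    # partial molar heat capacity, J/(K mol)
print(f"c_protein = {c_protein:.4f} J/(K g), Cp = {Cp_molar:.0f} J/(K mol)")
```

Note that the measured difference between the two scans is negative in this example, because the displaced solvent has a higher specific heat capacity than the protein; the displacement term in Eq. (17) corrects for this.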
Partial Molar Heat Capacity
Commercial DSC instruments [e.g., MicroCal (Northampton, MA), Hart (Provo, UT), Seiko (Japan), Perkin-Elmer (Norwalk, CT)] do not have the sensitivity and baseline stability for accurate measurements of the partial molar heat capacity of a protein in dilute solution. For that reason, most calorimetric analyses of proteins performed with those instruments have been restricted to relative measurements of the anomalies associated with thermal denaturation or other thermally induced transitions. Prominent in that discussion has been the issue of baseline subtraction, which has involved a significant amount of arbitrariness. Throughout the years different baseline subtraction schemes have been devised; however, these schemes are mostly the product of instrumental shortcomings and not the result of breakthroughs in analytical methodologies. In theory, the only requirement to analyze the heat capacity function associated with a transition is the knowledge of the heat capacity of a reference state, usually the native state. If a DSC instrument is able to measure accurately and reproducibly the partial molar heat capacity of a protein sample the arbitrariness in baseline subtraction can be eliminated. This level of sensitivity and stability has only recently been achieved and has opened the doors to new approaches in data analysis.
3 P. L. Privalov and N. N. Khechinashvili, J. Mol. Biol. 86, 665 (1974).
Excess Heat Capacity Function
If a protein undergoes a transition, the heat capacity function will exhibit an anomaly (one or several peaks) at some characteristic temperature(s). Under these conditions, the heat capacity function can no longer be ascribed to a single structural state because it contains contributions from all the states that become populated during the transition as well as the excess contributions arising from the existence of enhanced enthalpy fluctuations within the temperature transition region. These excess contributions give rise to the characteristic peak or peaks associated with thermally induced transitions.3-5 The most important quantity for the thermodynamic analysis of the thermal unfolding of a protein is the excess heat capacity function ($\langle\Delta C_p\rangle$), which is obtained by subtracting the heat capacity of the native state from the measured heat capacity function:
$$\langle\Delta C_p\rangle = \langle C_p\rangle - C_{p,0} \tag{18}$$
Figure 2 illustrates the procedure required to estimate $\langle\Delta C_p\rangle$ from the experimental data. As indicated in Fig. 2, if the heat capacity of the native state is known, it can always be subtracted from the heat capacity function even if the experimental conditions are such that the native state never becomes fully populated.
[Figure 2: partial molar heat capacity plotted against Temp (°C), 0 to 100, showing transition curves and the baselines $C_{p,0}$ and $C_{p,u}$.]
FIG. 2. The partial molar heat capacity of a protein. The heat capacity of the native state ($C_{p,0}$) exhibits a linear temperature dependence. The heat capacity of the unfolded state (broken line) is a quadratic function of temperature. Shown are three transition curves for the same protein under conditions in which it exhibits three different transition temperatures. The excess heat capacity, the quantity required for statistical thermodynamic analysis, is obtained by subtracting the heat capacity of the native state from the heat capacity of the protein ($\langle\Delta C_p\rangle = C_p - C_{p,0}$). If the calorimeter is precise enough to determine absolute heat capacities, no arbitrary baseline subtraction schemes are required. If this is the case, situations like the one shown in curve A, in which the protein is never in the native state, can be rigorously analyzed. An arbitrary line subtraction under those conditions will lead to substantial errors.10
4 E. Freire, Comments Mol. Cell. Biophys. 6, 123 (1989).
5 E. Freire, W. W. van Osdol, O. L. Mayorga, and J. M. Sanchez-Ruiz, Annu. Rev. Biophys. Biophys. Chem. 19, 159 (1990).
6 E. Freire and R. L. Biltonen, Biopolymers 17, 463 (1978).
6a Deleted in proof.
Statistical Thermodynamic Definition of Excess Heat Capacity Function
In the analysis of DSC data, the most important quantity that needs to be defined at the theoretical level, using the tools of statistical thermodynamics, is the average excess enthalpy function ($\langle\Delta H\rangle$), because $\langle\Delta C_p\rangle$ is equal to the temperature derivative of $\langle\Delta H\rangle$ at constant pressure. The average excess enthalpy function is the sum of the enthalpy contributions of all the states that become populated during the transition:
$$\langle\Delta H\rangle = \sum_{i=0}^{N} P_i \Delta H_i \tag{19}$$
where $P_i$ represents the population or probability of state $i$, and $\Delta H_i$ represents the enthalpy of the $i$th state relative to that of the native state, which is taken as the reference state. The analysis of different transition models involves writing $P_i$ in terms of the specific parameters of the model.5-7 The excess heat capacity function becomes
$$\langle\Delta C_p\rangle = \sum_{i=1}^{N} \Delta H_i(\partial P_i/\partial T) + \sum_{i=1}^{N} P_i \Delta C_{p,i} \tag{20a}$$
$$= \langle\Delta C_{p,tr}\rangle + \langle\Delta C_{p,bl}\rangle \tag{20b}$$
The first term on the right-hand side ($\langle\Delta C_{p,tr}\rangle$) is the transition excess heat capacity function and defines the characteristic transition peak(s) in the heat capacity function (Fig. 3).10 The second term on the right-hand side ($\langle\Delta C_{p,bl}\rangle$) defines the "S-shape" shift in baseline usually associated with protein unfolding or other transitions characterized by positive changes in $\Delta C_p$.10
[Figure 3: excess heat capacity plotted against Temp (°C), 20 to 80, with the baseline term $\langle\Delta C_{p,bl}\rangle$ indicated.]
FIG. 3. The excess heat capacity function is composed of two terms. The first one, called the transition excess heat capacity, gives rise to the peaks associated with thermal transitions. The second term, $\langle\Delta C_{p,bl}\rangle$, is responsible for the sigmoidal shift observed in the heat capacity of transitions that exhibit a $\Delta C_p$. The transition excess heat capacity, $\langle\Delta C_{p,tr}\rangle$, is obtained by subtracting $\langle\Delta C_{p,bl}\rangle$ from $\langle\Delta C_p\rangle$.
7 K. Thompson, C. Vinson, and E. Freire, Biochemistry 32, 5491 (1993).
8 Y. Griko, E. Freire, and P. L. Privalov, Biochemistry 33, 1889 (1994).
9 T. Haltia, N. Semo, J. L. R. Arrondo, F. M. Goñi, and E. Freire, Biochemistry 33, 9731 (1994).
10 M. Straume and E. Freire, Anal. Biochem. 203, 259 (1992).
The transition excess heat capacity measures the enhancement in enthalpy fluctuations associated with the conformational transition. Explicit differentiation of Eq. (20a) yields
$$\langle\Delta C_{p,tr}\rangle = \left\{\left[\sum_{i=0}^{N} \Delta H_i^2 \exp(-\Delta G_i/RT)/Q\right] - \left[\sum_{i=0}^{N} \Delta H_i \exp(-\Delta G_i/RT)/Q\right]^2\right\}/RT^2 \tag{21a}$$
$$= \{\langle\Delta H^2\rangle - \langle\Delta H\rangle^2\}/RT^2 \tag{21b}$$
The reader must recognize that Eq. (21b) is proportional to the second central moment, or dispersion, of the enthalpy distribution. Therefore, the peaks observed in the transition region are a direct reflection of the enhanced enthalpy fluctuations occurring when the protein undergoes interconversions between different enthalpic states.
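The decomposition in Eqs. (20) and (21) can be verified numerically. The sketch below builds a hypothetical three-state system (native, intermediate, unfolded), computes the transition term from the enthalpy fluctuations of Eq. (21b) and the baseline term from the second sum in Eq. (20a), and compares their sum with a finite-difference derivative of the average excess enthalpy. All parameter values and function names are assumptions for illustration.

```python
import math

R = 8.314  # gas constant, J/(K mol)

def state_table(T, states, T_ref):
    """Return (population, dH_i, dCp_i) for every state, native state first."""
    rows = [(1.0, 0.0, 0.0)]  # native reference: weight 1, dH = 0, dCp = 0
    for dH_ref, dS_ref, dCp in states:
        dH = dH_ref + dCp * (T - T_ref)
        dS = dS_ref + dCp * math.log(T / T_ref)
        weight = math.exp(-(dH - T * dS) / (R * T))
        rows.append((weight, dH, dCp))
    Q = sum(w for w, _, _ in rows)
    return [(w / Q, dH, dCp) for w, dH, dCp in rows]

def mean_excess_enthalpy(T, states, T_ref):
    """Eq. (19): population-weighted enthalpy relative to the native state."""
    return sum(p * dH for p, dH, _ in state_table(T, states, T_ref))

def excess_heat_capacity_terms(T, states, T_ref):
    """Transition term from Eq. (21b) and baseline term from Eq. (20a)."""
    table = state_table(T, states, T_ref)
    mean_H = sum(p * dH for p, dH, _ in table)
    mean_H2 = sum(p * dH**2 for p, dH, _ in table)
    cp_tr = (mean_H2 - mean_H**2) / (R * T**2)
    cp_bl = sum(p * dCp for p, _, dCp in table)
    return cp_tr, cp_bl

# Hypothetical three-state system: (dH, dS, dCp) at T_ref for each non-native state.
T_ref = 335.15
states = [(150e3, 150e3 / 330.0, 3e3),      # partly folded intermediate
          (400e3, 400e3 / 335.15, 8e3)]     # unfolded state

T = 333.0
cp_tr, cp_bl = excess_heat_capacity_terms(T, states, T_ref)

# Independent check: finite-difference derivative of <dH> with respect to T.
step = 1e-2
cp_numeric = (mean_excess_enthalpy(T + step, states, T_ref)
              - mean_excess_enthalpy(T - step, states, T_ref)) / (2.0 * step)
print(f"<dCp,tr> = {cp_tr:.0f}, <dCp,bl> = {cp_bl:.0f}, d<dH>/dT = {cp_numeric:.0f} J/(K mol)")
```

The agreement between the fluctuation formula and the direct derivative is what allows the peak in a DSC scan to be read as a measure of enthalpy fluctuations.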
Information Content
Differential scanning calorimetry provides three different types of information: (1) absolute heat capacities, (2) overall thermodynamic parameters, and (3) population and thermodynamic parameters for the states that become populated during the transition, that is, statistical thermodynamic information.
Absolute Heat Capacity
As mentioned above, absolute heat capacity measurements have been limited to a few laboratories with access to appropriate instruments. The absolute heat capacities can be used to obtain structure-related information and the degree of hydration or solvent exposure of the polypeptide chain. This information is extremely important and can be used to assess the degree of unfolding achieved by thermal denaturation. It has been shown that the absolute heat capacity of different protein conformations can be accurately predicted from high-resolution structural parameters.11 This finding permits the development of new approaches to the problem of protein-folding energetics and of more accurate structure-based molecular design strategies. The relationship between heat capacity and structure is reviewed in the last part of this chapter.
11 J. Gómez, J. V. Hilser, D. Xie, and E. Freire, Proteins (in press) (1995).
Overall Thermodynamic Parameters
The most important overall thermodynamic parameters associated with the thermal denaturation of proteins are the free energy ($\Delta G$), enthalpy ($\Delta H$), entropy ($\Delta S$), and heat capacity ($\Delta C_p$) changes between the unfolded and native states. All these parameters are state functions, that is, their values depend only on the nature of the denatured and the native states and not on the specific transition pathway or the presence of partly folded intermediates. From a practical point of view, these parameters are independent of the shape of the measured heat capacity function and can be determined in a model-independent way. The heat capacity change is the difference between the heat capacity of the thermally denatured state and that of the native state. The enthalpy change is the area under the transition excess heat capacity function
$$\Delta H = \int_{T_0}^{T_f} \langle\Delta C_{p,tr}\rangle\, dT \tag{22}$$
where the limits of integration ($T_0$ and $T_f$) are defined by the onset and completion temperatures of the transition (i.e., the temperatures at which essentially all molecules are in the initial and final states, respectively). The entropy change is simply evaluated by means of
$$\Delta S = \int_{T_0}^{T_f} \langle\Delta C_{p,tr}\rangle\, d\ln T \tag{23}$$
Both $\Delta H$ and $\Delta S$, as defined by Eqs. (22) and (23), are referred to the transition temperature, $T_m$, that is, $\Delta H = \Delta H(T_m)$ and $\Delta S = \Delta S(T_m)$. It must be noted that Eqs. (22) and (23) are defined in terms of the transition excess heat capacity curve, implying that $\langle\Delta C_{p,bl}\rangle$ needs to be subtracted from the excess heat capacity function. Throughout the years, different subtraction methods have been utilized. In the past, $\langle\Delta C_{p,bl}\rangle$ used to be approximated by a step function defined by the intersection of a vertical line centered at $T_m$ with the extrapolated initial and final values of the heat capacity function.12,13 Because the baseline is proportional to the degree of unfolding [see Eq. (20)] a more accurate way of estimating it can be achieved by defining its shape in terms of the normalized integral of the heat capacity function.14,15 This alternative is easy to implement with computer-digitized data and is mathematically exact for a two-state transition. In general, the overall thermodynamic parameters $\Delta H$ and $\Delta S$ are relatively insensitive to the exact method used to subtract $\langle\Delta C_{p,bl}\rangle$.16 The situation is different, however, for the statistical thermodynamic analysis of the shape of the heat capacity function.
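The integrals in Eqs. (22) and (23) are straightforward to evaluate on digitized data. The sketch below applies a simple trapezoidal rule to a simulated two-state transition in which the heat capacity change is set to zero, so that no baseline subtraction is needed and the whole excess heat capacity is the transition term. The simulated parameters, the temperature grid, and the helper names are assumptions; with real data the baseline term would have to be subtracted first, as discussed above.

```python
import math

R = 8.314  # gas constant, J/(K mol)

def two_state_cp_tr(T, dH, T_m):
    """Transition excess heat capacity of a two-state transition with dCp = 0.

    For two states the enthalpy variance in Eq. (21b) is dH^2 * p * (1 - p).
    """
    K = math.exp(-(dH / R) * (1.0 / T - 1.0 / T_m))
    p = K / (1.0 + K)
    return dH**2 * p * (1.0 - p) / (R * T**2)

def trapezoid(xs, ys):
    """Trapezoidal-rule integral of ys over xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))

# Simulated scan (assumed values): 300 K to 370 K in 0.1 K steps.
dH_true, T_m = 400e3, 335.15
temps = [300.0 + 0.1 * i for i in range(701)]
cp_tr = [two_state_cp_tr(T, dH_true, T_m) for T in temps]

dH = trapezoid(temps, cp_tr)                                    # Eq. (22)
dS = trapezoid(temps, [c / T for c, T in zip(cp_tr, temps)])    # Eq. (23), d ln T = dT / T
print(f"dH = {dH:.0f} J/mol, dS = {dS:.1f} J/(K mol), dH/dS = {dH / dS:.2f} K")
```

For this simulated case the ratio of the integrated enthalpy to the integrated entropy should fall very close to the transition temperature used to generate the curve.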
12 P. L. Privalov, Adv. Protein Chem. 33, 167 (1979).
13 Y. V. Griko, P. L. Privalov, J. M. Sturtevant, and S. Y. Venyaminov, Proc. Natl. Acad. Sci. U.S.A. 85, 3343 (1988).
14 G. Ramsay and E. Freire, Biochemistry 29, 8677 (1990).
15 J. W. Shriver and U. Kamath, Biochemistry 29, 2556 (1990).
16 P. L. Privalov and S. A. Potekhin, this series, Vol. 131, p. 4.
Statistical Thermodynamic Analysis of Heat Capacity Function
Overall thermodynamic parameters are state functions and as such they depend only on the area of the heat capacity function and are independent of its shape. The shape of the heat capacity function, on the other hand, is defined by the trajectory or path followed by the thermal transition. Therefore an analysis of the shape of the heat capacity function permits evaluation of the states that become populated during the transition. As indicated above, the excess enthalpy function plays a central role in the statistical thermodynamic analysis of DSC data because it provides a direct link between the experiment and the folding/unfolding partition function. $\langle\Delta H\rangle$ is directly accessible from DSC data because it corresponds to the cumulative integral of the measured $\langle\Delta C_p\rangle$,
$$\langle\Delta H\rangle = \int_{T_0}^{T} \langle\Delta C_p\rangle\, dT \tag{24}$$
where $T_0$ is a temperature at which the protein is in the native state. $\langle\Delta H\rangle$ is also related to the partition function by Eq. (4):
$$\langle\Delta H\rangle = RT^2(\partial\ln Q/\partial T)$$
Freire and Biltonen6 first realized that, by rewriting Eq. (4) in integral form, DSC could provide direct numerical access to the folding/unfolding partition function:
$$\ln Q = \int_{T_0}^{T} \frac{\langle\Delta H\rangle}{RT^2}\, dT \tag{25a}$$
$$= \int_{T_0}^{T} \frac{1}{RT^2}\left[\int_{T_0}^{T} \langle\Delta C_p\rangle\, dT\right] dT \tag{25b}$$
Equations (25a) and (25b) provide a rigorous foundation for the deconvolution theory of the excess heat capacity function, because they establish a mathematical linkage between the experimental data and the most fundamental function in statistical thermodynamics. The uniqueness of the enthalpy function as a physical observable can be illustrated by comparing it with the observables measured by other techniques. The situation is different for an observable like the excess enthalpy relative to the native state ($\langle\Delta H\rangle$), which is the conjugate variable of the inverse temperature. In this case, the average enthalpy is also given by Eq. (9):
$$\langle\Delta H\rangle = \sum_{i=0}^{N} \Delta H_i \exp(-\Delta G_i/RT)/Q \tag{26}$$
The main difference is that, contrary to any arbitrary observable, $\Delta H_i$ also occurs inside the exponents in Eq. (26). This unique property of DSC makes a tremendous difference in the analysis and has made possible the development of rigorous deconvolution algorithms aimed at obtaining a complete thermodynamic characterization of a folding/unfolding transition. For other physical observables the characteristic $\alpha_i$ values are not mathematically related to the $P_i$ values, that is, the amplitudes of the melting curves are not related to a thermodynamic function, and the experimental data cannot be used to obtain a complete thermodynamic description of a transition. The main goal of the deconvolution analysis of the heat capacity function is the determination of the number of states that become populated during thermal unfolding and the thermodynamic parameters for each of those states. Throughout the years the deconvolution algorithms have been perfected in different ways.6,16-18 Nowadays, the most effective algorithms involve a recursive deconvolution procedure that includes multiple cycling through each individual transition step combined with nonlinear least-squares optimization, and conclude with a global nonlinear least-squares optimization. Examples can be found in the analyses of the multistate transitions of the molecular chaperone DnaK,18 α-lactalbumin,8 and staphylococcal nuclease.19
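The first step of any such analysis, recovering the partition function from the calorimetric data through Eqs. (24) and (25a), can be sketched as a double cumulative integration. The example below is not one of the published deconvolution algorithms; it only integrates a simulated two-state excess heat capacity curve (heat capacity change set to zero) and compares the resulting $\ln Q$ with the exact two-state value. The simulated parameters, the temperature grid, and the helper names are assumptions.

```python
import math

R = 8.314  # gas constant, J/(K mol)

def two_state_excess_cp(T, dH, T_m):
    """Excess heat capacity of a two-state transition with dCp = 0, J/(K mol)."""
    K = math.exp(-(dH / R) * (1.0 / T - 1.0 / T_m))
    p = K / (1.0 + K)
    return dH**2 * p * (1.0 - p) / (R * T**2)

def cumulative_trapezoid(xs, ys):
    """Running trapezoidal integral of ys over xs, starting from zero at xs[0]."""
    out = [0.0]
    for i in range(1, len(xs)):
        out.append(out[-1] + 0.5 * (ys[i] + ys[i - 1]) * (xs[i] - xs[i - 1]))
    return out

# Simulated scan (assumed values): 300 K to 370 K in 0.05 K steps.
# At 300 K the protein is essentially fully native, so ln Q is taken as 0 there.
dH_true, T_m = 400e3, 335.15
temps = [300.0 + 0.05 * i for i in range(1401)]
cp = [two_state_excess_cp(T, dH_true, T_m) for T in temps]

avg_H = cumulative_trapezoid(temps, cp)                                     # Eq. (24)
ln_Q = cumulative_trapezoid(temps, [h / (R * T**2)
                                    for h, T in zip(avg_H, temps)])         # Eq. (25a)

# With the native state as reference (dG_0 = 0), Eq. (8b) gives P_0 = 1/Q.
native_population = [math.exp(-v) for v in ln_Q]

k = 703  # grid point at (approximately) T_m
print(f"T = {temps[k]:.2f} K, ln Q = {ln_Q[k]:.4f}, P_0 = {native_population[k]:.4f}")
```

Once $Q$ is known, the population of the native state follows directly, which is the starting point for deciding whether additional states are needed to describe the data.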
Analysis of Experimentally Determined Heat Capacity of Proteins

Different Contributions to Heat Capacity of Proteins
The heat capacity of proteins originates from enthalpic fluctuations corresponding to internal vibrational and hindered rotational modes and from interactions with the solvent. Throughout the transition region an additional contribution is given by the enhanced enthalpic fluctuations associated with the transition as described by Eq. (21). Accordingly, the partial molar heat capacity, Cp, of proteins in aqueous solution can be considered as being composed of an intrinsic term and a term due to hydration.20 The intrinsic term contains the contributions from covalent bonds as well as from noncovalent interactions. Thus, the heat capacity of a protein can be written as

    C_p = C_{p,a} + C_{p,b} + C_{p,c} + C_{p,other}    (27)
The first term, Cp,a, is called the primary heat capacity and contains the atomic and covalent bond contributions. By definition, this term depends only on the amino acid composition of the protein and is independent of its conformational state. Experimental studies on a large number of organic molecules indicate that the heat capacity of these compounds is largely additive at the bond or group level.21 The second term, Cp,b, contains the contributions of all noncovalent interactions within the protein molecule. The third term, Cp,c, contains the contributions due to the interactions of the protein with the solvent, that is, hydration. Cp,b and Cp,c depend on the physical state of the protein, the secondary structure content, and interactions with the solvent, respectively. The change in heat capacity associated with folding/unfolding or conformational transitions involves only the terms Cp,b and Cp,c because these protein transitions do not involve changes in mass or primary structure,

    \Delta C_p = \Delta C_{p,b} + \Delta C_{p,c} + \Delta C_{p,other}    (28)

17 C. Rigell, C. de Saussure, and E. Freire, Biochemistry 24, 5638 (1985).
18 D. Montgomery, R. Jordan, R. McMacken, and E. Freire, J. Mol. Biol. 232, 680 (1993).
19 D. Xie, R. Fox, and E. Freire, Protein Sci. 3, 2175-2184 (1994).
20 J. Suurkuusk, Acta Chem. Scand. Ser. B B28, 409 (1974).
21 S. W. Benson, "Thermochemical Kinetics." Wiley, New York, 1968.
Primary Heat Capacity of Proteins

Within the temperature range of interest in biology (about 0-100°) the primary heat capacity of a protein predominantly contains contributions from vibrational frequencies arising from the stretching and bending modes of each valence bond and internal rotations.22 Electronic contributions are negligible in this temperature range.23 The primary heat capacity can be calculated rather accurately from the contribution of individual amino acids plus the additional contribution of the peptide bond, from atomic additivity parameters, or from bond additivity parameters. The heat capacities of all 20 amino acids have been measured in the anhydrous state, as have those of some dipeptides.24 In addition, individual atomic and bond contributions have been tabulated from the analysis of the heat capacities of small organic compounds.22,23,25 Table I and Fig. 4 summarize the calculated primary heat capacity values at 25° for 10 globular proteins. The primary heat capacity values calculated from the contributions of individual amino acids are close to the measured heat capacities of anhydrous proteins, consistent with the idea that the bulk of the absolute heat capacity of a protein originates from the covalent structure. It should be noticed, however, that the experimental values for the anhydrous native protein are generally slightly larger than the calculated primary values, suggesting that noncovalent interactions do contribute, albeit slightly, to the heat capacity. For those proteins for which heat capacity values in the anhydrous state are available (albumin,20,26 chymotrypsinogen,20,26 insulin,26 and lysozyme20) the average is 0.298 ± 0.003 cal/K·g compared with an average primary heat capacity of 0.283 ± 0.006 cal/K·g as calculated from the sum of individual amino acid contributions. In general, the contribution of the primary structure to the heat capacity of anhydrous proteins is similar on a weight basis for all of them and accounts for about 97% of the total. Within the temperature range of interest, the heat capacity of anhydrous proteins increases linearly with temperature, with a slope equal to (9.77 ± 0.2) × 10⁻⁴ cal/K²·g for all proteins.24,26

22 G. J. Janz, "Estimation of Thermodynamic Properties of Organic Compounds." Academic Press, New York, 1958.
23 S. M. Blinder, J. Am. Chem. Soc. 97, 978 (1975).
24 J. O. Hutchens, in "Handbook of Biochemistry and Molecular Biology" (G. D. Fasman, ed.). Chem. Rubber Publ. Co., Cleveland, OH, 1976.
25 S. W. Benson and J. H. Buss, J. Chem. Phys. 29, 546 (1958).
26 J. O. Hutchens, A. G. Cole, and J. W. Stout, J. Biol. Chem. 224, 26 (1969).

TABLE I
HEAT CAPACITIES OF PROTEINS(a)

                                     Cp (25°) (cal/K·g)
Protein                     Primary   Anhydrous   Native   Unfolded
Cytochrome c                0.286     0.293       0.327    0.453
Lysozyme                    0.282     0.290       0.334    0.459
Myoglobin                   0.291     0.299       0.325    0.466
RNase A                     0.284     0.291       0.363    0.453
BPTI                        0.283     0.289       0.348    0.454
Barnase                     0.278     0.286       0.359    0.490
Interleukin 1β              0.289     0.296       0.386    0.488
RNase T1                    0.271     0.278       0.348    0.4635
Ubiquitin                   0.288     0.295       0.353    0.513
Staphylococcal nuclease     0.288     0.297       0.358    0.497

(a) Data for the heat capacity of proteins in solution were obtained from the following papers: cytochrome c, lysozyme, myoglobin, RNase A,28 bovine pancreatic trypsin inhibitor (BPTI),30 barnase,29 interleukin 1β,31 RNase T1,31a ubiquitin,31b and staphylococcal nuclease.19

[Figure 4: bar chart of Cp,primary, Cp,anhydrous, Cp,native, and Cp,unfolded for each protein; vertical axis from 0.00 to 0.60.]
FIG. 4. The magnitude of the primary heat capacity, the anhydrous heat capacity, the heat capacity of the native state in solution, and the heat capacity of the unfolded state in solution at 25° for the nine proteins in the database.
Contribution of Noncovalent Interactions to Heat Capacity of Proteins

As shown in Table I and Fig. 4, the difference in heat capacity between the anhydrous native protein and the primary heat capacity calculated from the contributions of individual amino acids is small, suggesting that the contribution of noncovalent interactions is also small, in agreement with previous results.27 If the difference between the anhydrous heat capacity and the primary heat capacity is taken as an indication of the contribution of noncovalent structure, then noncovalent interactions contribute about 0.007 cal/K·g to the heat capacity. Noncovalent interactions are expected to be a function of the packing density within the protein and, as a first approximation, are expected to scale in terms of the total buried surface area of the protein.
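The figures quoted for the noncovalent contribution can be checked directly against Table I. The short Python sketch below averages the primary and anhydrous columns as transcribed from the table (the first anhydrous entry is read as 0.293) and takes their difference and ratio; the variable names are illustrative only.

```python
# Cp at 25 degrees in cal/(K*g), transcribed from Table I:
# protein -> (primary, anhydrous)
TABLE_I = {
    "Cytochrome c": (0.286, 0.293),
    "Lysozyme": (0.282, 0.290),
    "Myoglobin": (0.291, 0.299),
    "RNase A": (0.284, 0.291),
    "BPTI": (0.283, 0.289),
    "Barnase": (0.278, 0.286),
    "Interleukin 1beta": (0.289, 0.296),
    "RNase T1": (0.271, 0.278),
    "Ubiquitin": (0.288, 0.295),
    "Staphylococcal nuclease": (0.288, 0.297),
}

def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)

mean_primary = mean(primary for primary, _ in TABLE_I.values())
mean_anhydrous = mean(anhydrous for _, anhydrous in TABLE_I.values())

# Difference taken as the noncovalent contribution, as in the text.
noncovalent = mean_anhydrous - mean_primary
# Fraction of the anhydrous heat capacity accounted for by the primary term.
primary_fraction = mean_primary / mean_anhydrous

print(f"mean primary Cp     = {mean_primary:.4f} cal/K.g")
print(f"mean anhydrous Cp   = {mean_anhydrous:.4f} cal/K.g")
print(f"noncovalent term    = {noncovalent:.4f} cal/K.g")
print(f"primary / anhydrous = {primary_fraction:.3f}")
```

With the tabulated values the difference comes out close to 0.007 cal/K·g and the primary term close to 97% of the anhydrous heat capacity, in line with the statements in the text.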
Heat Capacity of Native Proteins in Solution

The proteins in Table I and Fig. 4 were chosen for the calculations because their heat capacities in solution have been measured calorimetrically.8,19,28-31 The heat capacity of the native protein in solution is larger than that of the anhydrous protein, revealing the magnitude of the hydration contribution (Table I and Fig. 4).31a,31b At 25° hydration contributes close to 0.06 cal/K·g or about 15% of the total heat capacity of the native state. The relative hydration contribution, however, is not the same for all proteins, suggesting that the composition of the protein surface mediates the magnitude of this increase. Also, it should be mentioned that at 25° the temperature dependence of the heat capacity of the native protein in solution is larger than that of the anhydrous protein. In solution, the additional contribution to the heat capacity is given by the solvation of those atoms located at the protein surface. The solvent-exposed surface of native proteins is composed of polar and apolar regions in different proportions. On the average, about 55% of the total solvent-accessible surface area in the native state is apolar, which qualitatively explains the positive hydration contribution. In general, the hydration heat capacity of a protein should be proportional to the dimensions of the apolar and polar solvent-accessible surface areas in much the same way as the heat capacity changes associated with conformational transitions.32,33 The heat capacity of the native state in solution is a linear function of temperature within the temperature interval in which it can be measured (about 0-50°). The temperature dependence averages 1.62 × 10⁻³ cal/K²·g; however, significant variations are observed among different proteins, reflecting the heterogeneous composition of the solvent-exposed surfaces.

27 M. R. Eftink, A. C. Anusiem, and R. L. Biltonen, Biochemistry 22, 3884 (1983).
28 P. L. Privalov and G. I. Makhatadze, J. Mol. Biol. 213, 385 (1990).
29 Y. Griko, G. I. Makhatadze, P. L. Privalov, and R. W. Hartley, Protein Sci. 3, 669 (1994).
30 G. I. Makhatadze, K.-S. Kim, C. Woodward, and P. L. Privalov, Protein Sci. 2, 2028 (1993).
31 G. I. Makhatadze, G. M. Clore, A. Gronenborn, and P. L. Privalov, Biochemistry 31, 9327 (1994).
31a Y. Yu, G. I. Makhatadze, C. N. Pace, and P. L. Privalov, Biochemistry 33, 3312 (1993).
31b P. L. Wintrode, G. I. Makhatadze, and P. L. Privalov, Proteins: Struct., Funct., Genet. 18, 246 (1994).
Heat Capacity of Unfolded State

At 25° the heat capacity of the unfolded state is larger than that of the native state, as shown in Table I and Fig. 4. On average the heat capacity of the unfolded state is about 0.12 cal/K·g larger than that of the native state. The difference ranges from about 0.09 to 0.16 cal/K·g, reflecting the different proportions of apolar and polar residues that are buried from the solvent in the native state and become exposed in the unfolded state. While the heat capacity of the native state has a linear temperature dependence, the heat capacity of the unfolded state does not. Within the temperature interval 0-100° the heat capacity of the unfolded state is well approximated by a second-order polynomial in temperature8,19,34:

    C_p = C_p(25) + a(T - 25) + b(T - 25)^2    (29)
In general, the different magnitude and temperature dependence of the heat capacity of the native and unfolded states are primarily a reflection of the different proportions of polar and apolar surfaces that are exposed to the solvent by these protein conformations and, to a lesser extent, of the lack of internal noncovalent interactions in the unfolded state. It has been shown that a single mathematical function accounts for the magnitude and temperature dependence of the heat capacity of different protein conformations.11 The conclusions of this study are summarized below.
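The averages quoted for the hydration contribution and for the difference between the unfolded and native states can be reproduced from the last three columns of Table I. The Python sketch below does this arithmetic with the values as transcribed from the table, all in cal/K·g; the variable names are illustrative only.

```python
# Cp at 25 degrees in cal/(K*g), transcribed from Table I:
# protein -> (anhydrous, native in solution, unfolded in solution)
TABLE_I = {
    "Cytochrome c": (0.293, 0.327, 0.453),
    "Lysozyme": (0.290, 0.334, 0.459),
    "Myoglobin": (0.299, 0.325, 0.466),
    "RNase A": (0.291, 0.363, 0.453),
    "BPTI": (0.289, 0.348, 0.454),
    "Barnase": (0.286, 0.359, 0.490),
    "Interleukin 1beta": (0.296, 0.386, 0.488),
    "RNase T1": (0.278, 0.348, 0.4635),
    "Ubiquitin": (0.295, 0.353, 0.513),
    "Staphylococcal nuclease": (0.297, 0.358, 0.497),
}

# Hydration contribution: native in solution minus anhydrous.
hydration = {name: nat - anh for name, (anh, nat, unf) in TABLE_I.items()}
# Increment on unfolding: unfolded minus native, both in solution.
unfolding = {name: unf - nat for name, (anh, nat, unf) in TABLE_I.items()}

mean_hydration = sum(hydration.values()) / len(hydration)
mean_unfolding = sum(unfolding.values()) / len(unfolding)
smallest = min(unfolding.values())
largest = max(unfolding.values())

print(f"mean hydration contribution = {mean_hydration:.3f} cal/K.g")
print(f"mean unfolded - native      = {mean_unfolding:.3f} cal/K.g")
print(f"range of unfolded - native  = {smallest:.2f} to {largest:.2f} cal/K.g")
```

The tabulated data give a mean hydration contribution near 0.06 cal/K·g and a mean unfolded-minus-native difference near 0.12 cal/K·g with a spread of about 0.09 to 0.16, all on a per-gram basis.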
32 K. P. Murphy, V. Bhakuni, D. Xie, and E. Freire, J. Mol. Biol. 227, 293 (1992).
33 R. S. Spolar, J. R. Livingstone, and M. T. Record, Jr., Biochemistry 31, 3947 (1992).
34 J. C. Martinez, M. E. Harrous, V. V. Filimonov, P. L. Mateo, and A. R. Fersht, Biochemistry 33, 3919 (1994).
Heat Capacity Contribution of Protonizable Groups

Besides specific effects, such as the presence of cofactors, metal ions, etc., which need to be taken into account explicitly, protonation appears to be the only other generic effect that might contribute measurably to the heat capacity of proteins. Side chains with ionizable groups (e.g., histidine, aspartate, glutamate, arginine, and lysine) contribute differently to the heat capacity, depending on whether they are protonated. The contribution of a protonizable group, Cp,p, is given by Eq. (30):

    C_{p,p} = F_p \Delta C^{\circ}_{p,p} + F_p (1 - F_p) \Delta H_p^2 / (R T^2)    (30)

where ΔC°p,p is the protonation heat capacity of the group, Fp is the degree of protonation, and ΔHp is the effective enthalpy of protonation. The first term is directly proportional to the degree of protonation while the second term arises from thermal fluctuations in the degree of protonation. In general, protonation contributions are small in relation to the overall magnitude of the heat capacity. For example, the heat capacity of a protonated imidazole group is 4 cal/K·mol larger than that of an unprotonated one, and that of a protonated carboxylic group is about 30 cal/K·mol larger than that of an unprotonated one. The contribution of the second term in Eq. (30) is maximal when Fp = 0.5, that is, when the pH is equal to the pKa of the ionizable group. For example, for a histidine in a nonbuffered solution at a pH equal to the pKa, the maximal contribution due to thermal fluctuations is expected to be around 70 cal/K·mol because under these conditions ΔHp = −7 kcal/mol. Under the usual conditions, however, the total contribution due to protonation is expected to be small, especially because both the heat capacity and enthalpy changes associated with the release or absorption of protons are generally opposed by the accompanying heat capacity and enthalpy changes in the buffer.
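The histidine estimate can be reproduced with a few lines of Python. The sketch below evaluates the two terms of Eq. (30) separately; the temperature of 25° (298.15 K), the pKa of 6.5, and the use of the single-site Henderson-Hasselbalch relation for the degree of protonation are assumptions made for illustration and are not specified in the text.

```python
R = 1.987  # gas constant, cal/(K*mol)

def degree_of_protonation(pH, pKa):
    """Fraction protonated for a single independent site (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))

def protonation_cp(F_p, dCp_prot, dH_p, T):
    """Return the two terms of Eq. (30) in cal/(K*mol).

    F_p      degree of protonation (0 to 1)
    dCp_prot protonation heat capacity of the group, cal/(K*mol)
    dH_p     effective enthalpy of protonation, cal/mol
    T        absolute temperature, K
    """
    static_term = F_p * dCp_prot
    fluctuation_term = F_p * (1.0 - F_p) * dH_p ** 2 / (R * T ** 2)
    return static_term, fluctuation_term

# Histidine at pH = pKa, using the values quoted in the text:
# dCp_prot = 4 cal/(K*mol) for imidazole and dH_p = -7 kcal/mol.
T = 298.15    # K, assumed
pKa = 6.5     # hypothetical value, only used to set pH = pKa
F_p = degree_of_protonation(pH=pKa, pKa=pKa)
static_term, fluctuation_term = protonation_cp(F_p, 4.0, -7000.0, T)

print(f"F_p               = {F_p:.2f}")
print(f"static term       = {static_term:.1f} cal/K.mol")
print(f"fluctuation term  = {fluctuation_term:.1f} cal/K.mol")
```

At Fp = 0.5 the fluctuation term evaluates to roughly 70 cal/K·mol, and it falls off symmetrically as the pH moves away from the pKa because of the Fp(1 − Fp) factor.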
Single Mathematical Function Accounting for Heat Capacity of Different Protein Conformations

Gómez et al.11 have demonstrated that a single mathematical function accounts for the heat capacity of different protein conformations. According to the discussion above, the heat capacity of any protein conformation can be written as

    C_p = C_{p,a} + C_{p,b} + C_{p,c} + C_{p,p}    (31)
The primary heat capacity, Cp,a, is a linear function of temperature and can be written explicitly in terms of the contribution of individual amino acids as discussed earlier, or using an empirical approximation that takes advantage of the fact that it scales in terms of the molecular weight (MW) of a protein:

    C_{p,a} = [v + w(T - 25)] \, MW    (32)
The contribution of noncovalent interactions, Cp,b, is also assumed to be a linear function of temperature and can be written in terms of the total area that is buried from the solvent, BSA_Total:

    C_{p,b} = [p + q(T - 25)] \, BSA_{Total}    (33)
The contribution of hydration, Cp,c, is a quadratic function of temperature in which each coefficient is a function of the solvent-accessible apolar (ASA_ap) and polar (ASA_pol) surface areas:

    C_{p,c} = a(T) \, ASA_{ap} + b(T) \, ASA_{pol}    (34)

where

    a(T) = a_1 + a_2 (T - 25) + a_3 (T - 25)^2    (35)
    b(T) = b_1 + b_2 (T - 25) + b_3 (T - 25)^2    (36)
Equations (31)-(36) were used by Gómez et al.11 to fit the entire protein thermodynamic database irrespective of protein conformation in order to obtain the best values for each parameter. The best set of parameters is summarized in Table II. Figure 5A-I shows the experimental and calculated heat capacities for the native and unfolded states of the proteins in the database. As seen in Fig. 5, the structural parametrization reproduces the experimental values within the experimental limits. In particular, it must be noted that a single mathematical function and a unique set of parameters predict the almost linear temperature dependence of the heat capacity of the native state as well as the progressively decreasing temperature dependence of the unfolded state. The only case in which a significant deviation was observed was for the native state of interleukin 1β; however, for still unknown reasons the reported specific heat capacity of this protein is also significantly higher than that found for other proteins.31

TABLE II
GLOBAL FITTING PARAMETERS FOR HEAT CAPACITY FUNCTION OF PROTEINS(a)

Parameter (units)          Cp
v  (cal/K·g)               0.28
w  (cal/K²·g)              9.75 × 10⁻⁴
p  (cal/K·mol Å²)          8.7 × 10⁻³
q  (cal/K²·mol Å²)         6.43 × 10⁻⁴
a1 (cal/K·mol Å²)          0.45
a2 (cal/K²·mol Å²)         2.63 × 10⁻⁴
a3 (cal/K³·mol Å²)         −4.2 × 10⁻⁵
b1 (cal/K·mol Å²)          −0.265
b2 (cal/K²·mol Å²)         2.85 × 10⁻⁴
b3 (cal/K³·mol Å²)         4.31 × 10⁻⁵

(a) Parameters obtained by Gómez et al.11
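Equations (31)-(36) translate directly into a short function. The Python sketch below is one possible transcription, not the fitting code of Gómez et al.: the parameter values are those of Table II, with the exponent signs and the negative sign of b1 read so that the coefficients agree with the values quoted in the text for 25° and 100°, and the molecular weight and surface areas in the usage lines are hypothetical numbers chosen only to exercise the function.

```python
# Global fitting parameters transcribed from Table II.
# Units: v, w per gram; p, q, a_i, b_i per mol and per square angstrom.
PARAMS = {
    "v": 0.28, "w": 9.75e-4,
    "p": 8.7e-3, "q": 6.43e-4,
    "a1": 0.45, "a2": 2.63e-4, "a3": -4.2e-5,
    "b1": -0.265, "b2": 2.85e-4, "b3": 4.31e-5,
}

def a_coeff(T, k=PARAMS):
    """Eq. (35): apolar hydration coefficient at T (degrees Celsius)."""
    dT = T - 25.0
    return k["a1"] + k["a2"] * dT + k["a3"] * dT ** 2

def b_coeff(T, k=PARAMS):
    """Eq. (36): polar hydration coefficient at T (degrees Celsius)."""
    dT = T - 25.0
    return k["b1"] + k["b2"] * dT + k["b3"] * dT ** 2

def noncovalent_coeff(T, k=PARAMS):
    """Bracketed factor of Eq. (33), per square angstrom of buried surface."""
    return k["p"] + k["q"] * (T - 25.0)

def heat_capacity(T, mw, bsa_total, asa_ap, asa_pol, cp_protonation=0.0, k=PARAMS):
    """Eq. (31): absolute heat capacity in cal/(K*mol) at T (degrees Celsius)."""
    cp_a = (k["v"] + k["w"] * (T - 25.0)) * mw             # Eq. (32), primary
    cp_b = noncovalent_coeff(T, k) * bsa_total             # Eq. (33), noncovalent
    cp_c = a_coeff(T, k) * asa_ap + b_coeff(T, k) * asa_pol  # Eq. (34), hydration
    return cp_a + cp_b + cp_c + cp_protonation

# Hypothetical protein conformation (illustrative numbers only).
mw = 14_000.0         # g/mol
bsa_total = 12_000.0  # buried surface area, square angstroms
asa_ap = 3_500.0      # solvent-accessible apolar area, square angstroms
asa_pol = 3_000.0     # solvent-accessible polar area, square angstroms

for T in (0.0, 25.0, 50.0, 100.0):
    cp = heat_capacity(T, mw, bsa_total, asa_ap, asa_pol)
    print(f"T = {T:5.1f}  Cp = {cp:8.1f} cal/K.mol  ({cp / mw:.3f} cal/K.g)")
```

Keeping the three coefficient functions separate makes it easy to confirm that the apolar coefficient falls to about 0.23 at 100°, that the polar coefficient nearly vanishes there, and that the noncovalent factor rises to about 0.06.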
Heat Capacity Change on Unfolding

Because protein unfolding does not involve changes in the covalent structure of the protein, the heat capacity change for unfolding is given by

    \Delta C_p = \Delta C_{p,b} + \Delta C_{p,c} + \Delta C_{p,p}    (37)
Two different effects are the primary contributors to the heat capacity change on unfolding: the hydration of polar and apolar groups that are buried from the solvent, and the disruption of noncovalent interactions existing in the native state. According to this analysis, at 25° hydration contributes about 93% of the total heat capacity change of unfolding. However, this contribution decreases at high temperature. The heat capacity of apolar hydration is positive; however, it decreases from 0.45 cal/K·mol Å² at 25° to about 0.23 cal/K·mol Å² at 100°. The heat capacity of polar hydration is negative. It amounts to −0.26 cal/K·mol Å² at 25° but it becomes negligibly small around 100°. The contribution due to noncovalent interactions increases with temperature, reaching a value of about 0.06 cal/K·mol per buried Å² at 100°. Because internal noncovalent interactions are disrupted on unfolding, this term contributes negatively to the ΔCp of unfolding. The values obtained at 25° for the hydration ΔCp are similar to those determined before using a temperature-independent parametrization.32,35 In this sense, the general parametrization includes the previous one as a subset and extends the range of validity of the structural parametrization over the temperature interval 0-100°. Also, the results of Gómez et al.11 reconcile the structural parametrization with the results obtained by Privalov and Makhatadze28 over the entire temperature range studied, as shown in Fig. 5. The elementary contributions summarized in Table II account quantitatively for the observed ΔCp for unfolding and its decrease at high temperatures. The parametrization also predicts a slight decrease in ΔCp at low temperatures. For the proteins in the database, the error is less than 10% over the entire temperature range.

35 D. Xie and E. Freire, Proteins: Struct., Funct., Genet. 19, 291 (1994).
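The decrease of ΔCp at high temperature can be seen by evaluating the hydration and noncovalent terms of Eq. (37) with the Table II parameters. The Python sketch below does so for a hypothetical unfolding transition; the changes in accessible surface area are invented numbers, the protonation term is set to zero, and the buried area lost on unfolding is assumed equal to the total area that becomes exposed, which is a simplification rather than a statement from the text.

```python
# Table II parameters (per mol and per square angstrom); exponent signs and
# the sign of b1 are read so as to agree with the values quoted in the text.
P, Q = 8.7e-3, 6.43e-4
A1, A2, A3 = 0.45, 2.63e-4, -4.2e-5
B1, B2, B3 = -0.265, 2.85e-4, 4.31e-5

def delta_cp_unfolding(T, d_asa_ap, d_asa_pol):
    """Hydration and noncovalent terms of Eq. (37) at T (degrees Celsius).

    d_asa_ap, d_asa_pol: apolar and polar area exposed on unfolding (A^2).
    Assumption of this sketch: the buried surface area decreases by the
    same total amount, so the noncovalent term enters with a minus sign.
    Returns (hydration, noncovalent, total) in cal/(K*mol).
    """
    dT = T - 25.0
    a_T = A1 + A2 * dT + A3 * dT ** 2
    b_T = B1 + B2 * dT + B3 * dT ** 2
    hydration = a_T * d_asa_ap + b_T * d_asa_pol
    d_bsa = -(d_asa_ap + d_asa_pol)   # buried area lost on unfolding
    noncovalent = (P + Q * dT) * d_bsa
    return hydration, noncovalent, hydration + noncovalent

# Hypothetical surface area changes on unfolding (illustrative only).
d_asa_ap, d_asa_pol = 6_000.0, 4_000.0

for T in (25.0, 50.0, 75.0, 100.0):
    hyd, ncv, total = delta_cp_unfolding(T, d_asa_ap, d_asa_pol)
    print(f"T = {T:5.1f}  hydration = {hyd:7.1f}  noncovalent = {ncv:7.1f}  "
          f"total = {total:7.1f} cal/K.mol")
```

For these illustrative areas the hydration term dominates at 25°, while at 100° the shrinking apolar coefficient and the growing noncovalent term together reduce the total ΔCp substantially.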
[Figure 5, panels A-I: heat capacity plotted against Temp (°C) for each protein.]
FIG. 5. (A-I) Comparison between the experimental (circles) and calculated (solid lines) heat capacities for the native and unfolded states of nine proteins in the database (A, barnase; B, BPTI; C, cytochrome c; D, interleukin 1β; E, lysozyme; F, myoglobin; G, RNase A; H, RNase T1; I, ubiquitin). For myoglobin, the squares and the circles represent the experimental and calculated values for the unfolded states obtained by Privalov and Makhatadze,28 respectively. Data points for the native state above 50° were extrapolated in the original references. The calculated values were obtained using the parameter values in Table II (see text for details). (Adapted from Gómez et al.11)
Implications for Structure-Based Energetic Calculations

A major conclusion of the work of Gómez et al.11 is that the absolute heat capacity of different protein conformations can be accurately calculated from structural parameters over a wide temperature range. It is known from previous work that the heat capacity of the unfolded state can be accounted for in terms of the individual contributions of the amino acid side chains and the peptide backbone,28 that is, it exhibits group additivity. The heat capacities of the native state or of partly folded states, on the other hand, do not exhibit that type of additivity: they cannot be predicted from the amino acid sequence. They can be predicted, however, if the three-dimensional structure of the protein is known. The results discussed here indicate that, within experimental error, the heat capacity
is additive in terms of the primary, noncovalent, and hydration terms, which in turn can be expressed in terms of the molecular weight, the surface area buried from the solvent, and the polar and apolar surfaces accessible to the solvent. From a rigorous thermodynamic standpoint, if the heat capacity is additive on a set of system parameters, then the enthalpy and entropy are also additive on those same parameters plus the addition of a constant term. For example, the enthalpy change can be written as

    \Delta H(T) = \Delta H(T_{R,H}) + \sum_i \int_{T_{R,H}}^{T} \Delta C_{p,i} \, dT    (38)

and similarly, for the entropy,

    \Delta S(T) = \Delta S(T_{R,S}) + \sum_i \int_{T_{R,S}}^{T} \frac{\Delta C_{p,i}}{T} \, dT    (39)
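Equations (38) and (39) can be evaluated numerically once each additive heat capacity term is available as a function of temperature. The Python sketch below uses a plain trapezoid rule; the reference temperatures, the reference enthalpy and entropy, and the two constant ΔCp terms are hypothetical values chosen so that the result can be compared with the closed-form expressions for a constant ΔCp. Temperatures are in kelvin.

```python
import math

def integrate(f, lo, hi, n=2000):
    """Composite trapezoid rule for the integral of f from lo to hi."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h

def enthalpy(T, dH_ref, T_ref_H, dCp_terms):
    """Eq. (38): dH(T) = dH(T_ref_H) + sum_i integral of dCp_i dT."""
    return dH_ref + sum(integrate(term, T_ref_H, T) for term in dCp_terms)

def entropy(T, dS_ref, T_ref_S, dCp_terms):
    """Eq. (39): dS(T) = dS(T_ref_S) + sum_i integral of (dCp_i / T) dT."""
    return dS_ref + sum(
        integrate(lambda x, term=term: term(x) / x, T_ref_S, T)
        for term in dCp_terms
    )

# Hypothetical additive terms, cal/(K*mol): hydration and noncovalent.
dCp_terms = [lambda T: 1640.0, lambda T: -87.0]

# Hypothetical reference values (illustrative only).
T_ref_H, dH_ref = 333.15, 100_000.0   # K, cal/mol
T_ref_S, dS_ref = 385.15, 0.0         # K, cal/(K*mol)

T = 298.15
dH = enthalpy(T, dH_ref, T_ref_H, dCp_terms)
dS = entropy(T, dS_ref, T_ref_S, dCp_terms)
dG = dH - T * dS
print(f"dH({T}) = {dH:.1f} cal/mol")
print(f"dS({T}) = {dS:.2f} cal/K.mol")
print(f"dG({T}) = {dG:.1f} cal/mol")
```

Passing the terms as callables means that temperature-dependent contributions, such as the quadratic hydration coefficients, can be substituted for the constants without changing the integration code.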
Because the heat capacity is additive on the system parameters discussed above, within the entire temperature range of interest (0-100°), it is clear that the enthalpy and entropy can be expressed accurately if appropriate reference temperatures are found at which these two quantities can be accurately parametrized. This is true even if the enthalpy or entropy, and hence the Gibbs energy, are not additive on those system parameters at the specified reference temperatures.36 Because the heat capacity can be accurately estimated from structural parameters at any temperature, a reasonable strategy for structure-based energetic predictions is to find the most appropriate conditions for structural estimation of the constant terms ΔH(T_R,H) and ΔS(T_R,S).35,37 It cannot be overstated that the development of an accurate algorithm for structure-based prediction of the free energy constitutes the foundation for any successful strategy for the molecular design of proteins and ligands.

Acknowledgments

Supported by grants from the National Institutes of Health (RR-04328, GM-37911, and NS-24520) and the National Science Foundation (MCB-9118687).
36 A. E. Mark and W. F. van Gunsteren, J. Mol. Biol. 240, 167 (1994).
37 D. Xie and E. Freire, J. Mol. Biol. 242, 62 (1994).