J. Mol. Biol. (1996) 260, 644–648
COMMUNICATION
Helmholtz Free Energy of Peptide Hydrogen Bonds in Proteins
Manfred J. Sippl
Center for Applied Molecular Engineering, Institute for Chemistry and Biochemistry, University of Salzburg, Jakob Haringer Straße 1, A-5020 Salzburg, Austria
We estimate the Helmholtz free energy of peptide hydrogen bonds in native protein structures as a function of spatial separation between donor and acceptor atoms. The resulting potential function has a deep narrow well at H-bond contact but bond formation is hindered by a barrier and the net change in free energy is close to zero. The barrier provides a molecular lock mechanism acting as a kinetic trap. Once formed, H-bonds keep protein chains in a precise orientation. However, bond formation requires energy input and opposes protein folding. In contrast, the free energy functions of most side-chain interactions have no energy barriers. They lack spatial precision but free energy differences of contact formation are substantial. These interactions drive folding and stabilize structures but precision is mediated and maintained by H-bonds.
© 1996 Academic Press Limited
Keywords: protein folding; protein structure; potential of mean force; radial distribution function; liquid structure
Native three-dimensional structures of proteins are encoded by a linear sequence of amino acid building blocks. The process of structure formation is spontaneous, the free energy being lower in the native than in the unfolded state (Anfinsen, 1973). Folding is the result of a large number of opposing intra- and intermolecular interactions, and only a small amount of excess energy remains to drive folding and to stabilize native structures. The physical basis and relative strength of these energies and forces, as well as the general characteristics of protein folding, are controversial issues (e.g. see Honig & Cohen, 1996, or Israelachvili & Wennerström, 1996).
A classical example is the peptide H-bond. All native protein structures are interwoven by these bonds and their mere number points to a special role in protein folding. Each peptide unit can act as an H-bond acceptor through its carbonyl oxygen atom or, except for proline, as an H-bond donor via the amide N-H group. When H-bonds are formed between protein atoms, H-bonds to solvent molecules are lost, and the question of whether peptide H-bonds in proteins are energetically important has remained controversial (Baldwin, 1993). The situation is quite subtle, with strong entropic contributions due to reorganisation of water molecules and loss of mobility in the protein chain, and quantitative estimates of these effects are difficult to obtain (Schellman, 1955; Israelachvili, 1987; Israelachvili & Wennerström, 1996). In addition, backbone H-bonds in proteins depend on the separation of donors and acceptors along the chain. This parameter is particularly important for small separations, exemplified by the unique H-bond pattern of α-helices.
In principle the free energy of H-bonds and other interactions in proteins can be derived from diffraction experiments. The result of such an experiment is the structure factor whose Fourier transform yields the radial density distribution g(r) of atoms in the sample, relative to a central atom taken as the origin of a reference frame (Gingrich, 1943; McQuarrie, 1976; March & Tosi, 1976). g(r) is related to the Helmholtz free energy w(r), also called the potential of mean force, by: w(r) = −kT ln[g(r)], where kT is Boltzmann's constant times absolute temperature. In terms of statistical mechanics w(r) is an average over the associated canonical ensemble, equivalent to the free energy of the interaction at constant density and temperature (e.g. see McQuarrie, 1976). Consequently, a diffraction experiment performed on a solution of protein molecules yields structural and energetic information on the interactions between atoms in the sample. However, g(r) is a superimposition of all interatomic distances unless atoms or atom pairs are selectively
Hydrogen bonds
Figure 1. Helmholtz free energy of the protein backbone N···O interaction compiled from a data base of 289 X-ray structures (Bernstein et al., 1977). Distance measurements of backbone N and O atoms are represented by Dirac delta functions, $\delta(r - r^{p}_{ij})$, where $r^{p}_{ij}$ is the distance between atoms i and j (peptide N or O) in protein p. The joint two-particle distribution function $\rho_{NO}(r)$ is the sum over all peptide N···O distances:
$$\rho_{NO}(r) = \sum_{p,i,j} \delta(r - r^{p}_{ij})$$
The radial distribution function $g_{NO}(r)$ and Helmholtz free energy $w_{NO}(r)$ are obtained from $\rho_{NO}(r)$ by standard procedures of statistical mechanics (McQuarrie, 1976):
$$g_{NO}(r) = \rho_{NO}(r)/\rho^{2}$$
$$w_{NO}(r) = -kT \ln(g_{NO}(r))$$
with $\rho$, the bulk density of N and O atoms. Distances are in Å, and $w_{NO}(r)$ in units of kT. $\rho_{NO}(r)$ is compiled in the distance range from 0 Å to 12 Å, using intervals of 0.3 Å (Sippl, 1990). (a) Distances $r_{ij}$ whose N and O atoms are separated by less than nine peptide units are skipped in the summation. In this case, $w_{NO}(r)$ is the Helmholtz free energy of H-bonds, which are long-range in terms of sequence separation. The total count, n, of N···O distances <12.0 Å is 1.03 × 10^6. (b) Helmholtz free energy for H-bonds separated by three peptide units (relative sequence positions s and s + 4). The total count of distances <12 Å is 5.3 × 10^4.
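The procedure described in the caption of Figure 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program used for the figures: the function names and the input layout (one coordinate array and one residue-index array per atom type and chain) are hypothetical, and the reference density is approximated here by the mean shell-normalised density over the sampled range, which stands in for the bulk density term of the caption.

```python
import numpy as np

def pair_distance_histogram(n_xyz, o_xyz, n_idx, o_idx,
                            min_sep=9, r_max=12.0, dr=0.3):
    """Histogram N...O distances below r_max for one chain.

    Pairs separated by fewer than min_sep peptide units along the chain
    are skipped, as for Figure 1(a). Input layout is an assumption.
    """
    edges = np.linspace(0.0, r_max, int(round(r_max / dr)) + 1)
    # All N-O distances of the chain as a matrix.
    d = np.linalg.norm(n_xyz[:, None, :] - o_xyz[None, :, :], axis=-1)
    sep = np.abs(n_idx[:, None] - o_idx[None, :])
    keep = (sep >= min_sep) & (d < r_max)
    counts, _ = np.histogram(d[keep], bins=edges)
    return counts, edges

def potential_of_mean_force(counts, edges):
    """Return g(r) and w(r)/kT = -ln g(r) from summed distance counts.

    Assumption: g(r) is the count per spherical shell volume divided by
    the mean of that quantity over all populated bins.
    """
    counts = np.asarray(counts, dtype=float)
    if not np.any(counts > 0):
        raise ValueError("no distances were counted")
    shell_vol = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    density = counts / shell_vol
    g = density / density[counts > 0].mean()
    with np.errstate(divide="ignore"):
        w = -np.log(g)  # +inf in bins where no distance was observed
    return g, w
```

In use, the counts returned for each chain in the structure library would be summed before calling `potential_of_mean_force`. Setting `min_sep` to select only pairs at positions s and s + 4 would require a different mask and is not shown here.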
labelled as scattering centres. With the exception of simple solutes, the determination of such specific distribution functions $g_{ab}(r)$ for particular atom types a and b is generally difficult or impossible (e.g. see March & Tosi, 1976). $g_{ab}(r)$ can be estimated directly when the structures of proteins dissolved in the sample are known to atomic resolution. Summation over all distances between atoms a and b in a corresponding library of structures determined by X-ray analysis yields a radial distribution function $g'_{ab}(r)$. Except for minor variations, $g'_{ab}(r)$ and $g_{ab}(r)$ are equivalent, since, on average, the distribution of intramolecular distances calculated from X-ray structures closely matches the corresponding distribution of distances in solution. This equivalence provides a convenient link between the statistical mechanics of liquid systems (McQuarrie, 1976) and data base derived energy functions (e.g. see Sippl, 1990, 1993, 1995). Figure 1 shows the Helmholtz free energy w(r) of peptide N···O interactions, compiled from N and O atom pairs separated by more than eight peptide units along the chain. The data base used consists of 289 unrelated native protein folds determined
by X-ray analysis. The potential function has a pronounced minimum at $r_c$ = 2.9 Å, corresponding to the equilibrium distance between an amide N and carbonyl O atom at H-bond contact. The potential has the familiar core repulsion for r < $r_c$ and superimposed oscillations reminiscent of near neighbour effects well known from diffraction studies and simulation experiments (McQuarrie, 1976). The intriguing feature of w(r) is the energy barrier between the minimum at H-bond contact and large distances. The Helmholtz free energy w(r) is the reversible work required to move two tagged particles from large separations to spatial distance r. Hence, two potential H-bond partners, approaching each other from large distances, experience repulsive forces before they reach the contact valley at ≈3.5 Å. Formation of H-bonds requires work to be done against this energy gradient. When the energy supply is large enough to push the two particles over the energy barrier, free energy is released and the interaction is caught in the deep narrow potential minimum corresponding to H-bond contact. The effects in the reverse
process, the breaking of H-bonds, are quite similar to H-bond formation. Energy input is required to drive the transition. Once pushed over the barrier, the H-bond partners actively separate, releasing energy to the system. Free energy values at H-bond contact and at large separations are nearly identical. Hence, $\Delta w \approx w(\infty) - w(r_c)$, the total change in free energy upon H-bond formation, is close to zero. Since there is essentially no change in free energy, H-bonds do not contribute to the thermodynamic stability of protein folds. However, the narrow potential well at $r_c$ acts as a kinetic trap. Once formed, the H-bond resists unfolding and a network of such bonds holds the chain in a precise configuration. H-bond formation requires a transition of the polypeptide chain from a disordered to an ordered state and, therefore, the Helmholtz free energy w(r) contains a strong entropic component. The approach of two H-bond partners requires navigation through a dense and complex medium composed of a tangled protein chain, protruding side-chains and solvent molecules. This results in a repulsive entropic contribution to w(r), which increases as the H-bond partners come closer. These entropic effects are reinforced by the requirement that, at H-bond contact, the participating N, H and O atoms
Figure 2. Helmholtz free energy functions for side-chain atom interactions. Calculation of w(r) and the data base of proteins used are the same as for Figure 1. (a) Hydrophobic interaction between Cγ atoms of valine, n = 2.5 × 10^5. (b) Ion interaction between the guanido group Cζ of arginine and the Oδ atoms of carboxylic acid groups in aspartic acid, n = 5 × 10^4.
should form a linear arrangement (e.g. see Schuster, 1992). In addition, the approach of peptide H-bond donors and acceptors is always accompanied by electrostatic repulsion of N···N and O···O atoms. Only at short distances is the electrostatic attraction strong enough to compensate for the repulsive entropic and electrostatic components, but the terms merely cancel.
Proteins are covalently linked chains of peptide units and hence the separation along the chain is an important parameter for potential functions (Sippl, 1990). When this separation is small, H-bond partners are confined to stay close in space and the interaction is strongly affected by local steric effects. In general, effects due to separations along the chain are strong for small separations, but diminish with increasing distance of peptide units. In this respect H-bonds in α-helices are unique. Formed between the carbonyl oxygen atom at position s and the nitrogen atom at position s + 4, the H-bond partners are separated by only a few peptide links. Local steric constraints dominate, favouring α-helical and extended conformations (Pauling et al., 1951). The Helmholtz free energy potential of $\mathrm{O}_s$–$\mathrm{N}_{s+4}$ interactions, Figure 1(b), separates the two minima corresponding to these states by an energy barrier and, perhaps surprisingly, the gross features of this potential are similar to the function shown in Figure 1(a). The minimum at H-bond contact is narrow, and formation as well as disruption of H-bonds requires energy input. The total energy change in the transition from large distances to H-bond contact seems to favour H-bond formation, but the energy difference is small.
Formation of the H-bond networks characteristic of all native folds requires energy that is immediately regained when H-bond contacts are formed, the total energy change of the whole process being close to zero. The activation energy for H-bond formation has to be supplied by other interactions. Since the intervening energy barrier is relatively high and H-bonds are abundant, these interactions have to be cooperative. Cooperativity is perhaps even more important for unfolding, since disruption of a single bond in a densely packed protein core is usually impossible without affecting other bonds in spatial proximity. Unfolding studies on ribonuclease A, a β-sheet-rich protein, indicate that the rate-limiting step in protein unfolding is due to the breaking of H-bonds rather than to the loss of side-chain interactions (Kiefhaber & Baldwin, 1995; Kiefhaber et al., 1995). Side-chains start to rotate, losing their precise relative orientation before the H-bond network is disrupted. The observed unfolding behaviour requires that H-bonds are locked in a precisely defined position, in agreement with the Helmholtz free energy functions shown in Figure 1. But it also requires that interactions between side-chain atoms are less precise. The Helmholtz free energy functions for most attractive side-chain interactions have broad energy minima, as exemplified by the hydrophobic methyl-methyl interaction of valine Cγ atoms and the attractive ionic interaction between the guanido Cζ of arginine and the carboxylic Oδ atoms of aspartic acid shown in Figure 2.
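The quantities discussed above (the contact minimum, the intervening barrier, and the net change between large distances and contact) can be read off a tabulated w(r) with a short routine. The following is a hedged sketch rather than part of the original analysis: the function name is hypothetical, the contact is assumed to be the first interior local minimum of the profile, and w(∞) is approximated by the mean of w(r) over the outermost `tail` Å of the tabulated range.

```python
import numpy as np

def summarise_profile(r, w, tail=3.0):
    """Summarise a tabulated potential of mean force w(r), in units of kT.

    Assumptions: the contact is the first interior local minimum, and
    w(infinity) is the mean of w over the last `tail` angstroms.
    """
    r = np.asarray(r, dtype=float)
    w = np.asarray(w, dtype=float)
    finite = np.isfinite(w)          # drop empty bins where w = +inf
    r, w = r[finite], w[finite]

    # Interior points lower than the left neighbour and not above the right.
    minima = np.flatnonzero((w[1:-1] < w[:-2]) & (w[1:-1] <= w[2:])) + 1
    if minima.size == 0:
        raise ValueError("no interior minimum found")
    i_min = int(minima[0])

    # Highest point at or beyond the contact minimum is taken as the barrier.
    i_bar = i_min + int(np.argmax(w[i_min:]))
    w_far = float(w[r >= r[-1] - tail].mean())

    return {
        "r_c": float(r[i_min]),
        "r_barrier": float(r[i_bar]),
        "barrier_forward": float(w[i_bar]) - w_far,            # approach from large r
        "barrier_reverse": float(w[i_bar]) - float(w[i_min]),  # breaking the contact
        "delta_w": w_far - float(w[i_min]),                    # w(inf) - w(r_c)
    }
```

For a profile of the kind shown in Figure 1, one would expect `delta_w` near zero with both barrier terms clearly positive, whereas a broad barrier-free minimum of the kind shown in Figure 2 would give a substantial `delta_w` and a `barrier_forward` close to zero. These expectations restate the text and are not computed results.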
Differences in Helmholtz free energy between large distances and close contact are substantial, so that these interactions drive protein chains to compact states. However, precision in chain topology is added and maintained by H-bonds.
The potentials shown in Figures 1 and 2 are obtained from native folds determined by X-ray analysis, i.e. from well-established experimental facts. The calculation of Helmholtz free energies from these data is a straightforward procedure in the statistical mechanics of dense interacting systems. Numerous attempts have been made to estimate free energies via computer simulation using molecular mechanics force fields in conjunction with molecular dynamics and Monte Carlo techniques. In view of the complexity of these approaches and the approximations involved, outstanding results have been achieved by these techniques in individual cases. For example, Hunt et al. (1994) find that hydrogen bonding determines the specific local geometry of protein chains whereas hydrophobic interactions are responsible for dense packing. Yang & Honig (1995a,b) conclude from their simulations that the free energy balance of H-bond formation is close to zero or slightly unfavourable. Both conclusions are in perfect agreement with the results obtained here. These simulations and the present work are based on an all-atom representation of protein chains. This is a significant fact in view of the results obtained from simplified models and lattice simulations where peptide H-bond interactions are explicitly neglected (Honig & Cohen, 1996).
Acknowledgements
This work was supported by grant P11601 of the FWF/Austria. A set of Helmholtz free energy functions, the data base of protein structures and other supplementary material are available electronically from URL http://lore.came.sbg.ac.at
References
Anfinsen, C. B. (1973). Principles that govern the folding of protein chains. Science, 181, 223–230.
Baldwin, R. L. (1993). Outstanding Papers in Biology, Current Biology Ltd, London.
Bernstein, F. C., Koetzle, T. F., Williams, G. J. B., Meyer, E. F., Jr, Brice, M. D., Rodgers, J. R., Kennard, O., Shimanouchi, T. & Tasumi, M. (1977). The protein data bank: a computer-assisted archival file for macromolecular structures. J. Mol. Biol. 112, 535–542.
Gingrich, N. S. (1943). The diffraction of X-rays by liquid elements. Rev. Mod. Phys. 15, 90–110.
Honig, B. & Cohen, F. E. (1996). Adding backbone to protein folding: why proteins are polypeptides. Folding Design, 1, R17–R20.
Hunt, N. G., Gregoret, L. M. & Cohen, F. E. (1994). The origins of protein secondary structure. Effects of packing density and hydrogen bonding studied by a fast conformational search. J. Mol. Biol. 241, 214–225.
Israelachvili, J. (1987). Intermolecular and Surface Forces, Academic Press, London.
Israelachvili, J. & Wennerström, H. (1996). Role of hydration and water structure in biological and colloidal interactions. Nature, 379, 219–225.
Kiefhaber, T. & Baldwin, R. L. (1995). Kinetics of hydrogen bond breakage in the process of unfolding of ribonuclease A measured by pulsed hydrogen exchange. Proc. Natl Acad. Sci. USA, 92, 2657–2661.
Kiefhaber, T., Labhardt, A. M. & Baldwin, R. L. (1995). Direct NMR evidence for an intermediate preceding the rate-limiting step in the unfolding of ribonuclease A. Nature, 375, 513–515.
March, N. H. & Tosi, M. P. (1976). Atomic Dynamics in Liquids, Macmillan Press, London.
McQuarrie, D. A. (1976). Statistical Mechanics, Harper Collins Publishers, New York.
Pauling, L., Corey, R. B. & Branson, H. R. (1951). The structure of proteins: two hydrogen bonded helical configurations of the polypeptide chain. Proc. Natl Acad. Sci. USA, 37, 205–211.
Schellman, J. A. (1955). The stability of hydrogen-bonded peptide structures in aqueous solution. Compt. Trav. Lab. Carlsberg ser. Chim. 29, 230–259.
Schuster, P. (1992). Hydrogen bonds. In Encyclopedia of Physical Science and Technology, vol. 7, pp. 727–761, Academic Press, New York.
Sippl, M. J. (1990). Calculation of conformational ensembles from potentials of mean force. J. Mol. Biol. 213, 167–180.
Sippl, M. J. (1993). Boltzmann's principle, knowledge-based mean fields and protein folding. J. Comput. Aided Mol. Design, 7, 473–501.
Sippl, M. J. (1995). Knowledge-based potentials for proteins. Curr. Opin. Struct. Biol. 5, 229–235.
Yang, A.-S. & Honig, B. (1995a). Free energy determinants of secondary structure formation. 1. Alpha-helices. J. Mol. Biol. 252, 351–365.
Yang, A.-S. & Honig, B. (1995b). Free energy determinants of secondary structure formation. 2. Antiparallel beta-sheets. J. Mol. Biol. 252, 366–376.
Edited by F. E. Cohen (Received 26 February 1996; received in revised form 20 May 1996; accepted 21 May 1996)