A model for electrostatic effects in proteins

A model for electrostatic effects in proteins

122 et al. M. Delepierre Linderstrom-Lang, K. (1924). C. R. Trav. Lab. Carlsberg, 15, 7-14. Linderstrom-Lang, K. (1955). Chem. Sot. Spec. Publ. 2, ...

982KB Sizes 2 Downloads 36 Views

122

et al.

M. Delepierre

Linderstrom-Lang, K. (1924). C. R. Trav. Lab. Carlsberg, 15, 7-14. Linderstrom-Lang, K. (1955). Chem. Sot. Spec. Publ. 2, l-24. Matthew, J. B. & Richards, F. M. (1983). ,I Biol. Chem. 258, 3039-304-4. Matthew, J. B., Gurd, F. R. N., Flanagan, M. A., March, K. L. & Shire, S. J. (1985). CRC Grit. Rev. Biochem. 18, 91-197. Molday, R., Englander, S. W. & Kallen, R. G. (1972). Biochemistry, 12, 150-158. Nakanishi, M., Nakamura, H., Hirakawa, A., Tsubor, M., Nagamura, T. & Saijo, Y. (1978). J. Amer. Chem. Sot. 100, 272-276. Orttung, W. H. (1969). J. Amer. Chem. Sot. 91, 162-167. Orttung, W. H. (1978). J. Amer. Chem. Sot. 100, 43694375. Peru& M. (1978). Science, 201, 1187-l 191. Pfeil, W. $ Privalov, P. I,. (1976). Biophys. Chem. 4, 23-32. Poulsen, F. M., Hoch, J. C. & Dobson, C. M. (1980). Biochemistry, 19, 2597-2607. Rogers, N. K., Moore, G. R. & Sternberg, M. J. E. (1985), J. Mol. Biol. 182, 613-616. Edited

Scatchard, G. (1949). Ann. N. Y. Acad. Sci. 51. 660-672. Shire, S.. Hanannia, G. I. H. & Gurd, F. R. N. (1974). Biochemistry, 13, 2967-2974. States, D. J. (1983). Ph. D. thesis, Harvard University. Tanford, C. (1957). J. Amer. Chem. Sot. 79, 5340-5347. Tanford, C. & Roxby, R. (1972). Biochemistry, 11, 21922198. Res. Tuchsen. E. & Ottesen, M. (1979). Carlsberg. CommurL. 44, l-10. Turhsen. E. & Woodward, C. K. (1985a). J. Mol. Biol. 185, 405-419. Tuchsen. E. & Woodward, C. K. (19856). J. Mol. Biol. 185, 421-430. 16, Waelder, S. F. & Redfield, A. C. (1977). Biopolymers, 623-629. Warwicker. ,J. (1986). J. Theor. Biol. 121, 199-210. Warwicker, ,J. & Watson, H. C. (1982). J. Mol. Biol. 157, 671-679. Wedin. R. E., Delepierre, M., Dobson, C. M. & Poulson, F. M. (1982). Biochemistry, 21, 1098-1103. Woodward, C., Simon, I. & Tuchsen, E. (1982). Mol. Cell Biochem. 48, 135-160.

by A. Fersht

APPENDIX

A Model for Electrostatic Effects in Proteins D. J. States and M. Karplus are important Electrostatic interactions determinants of macromolecular chemistry (Perutz, 1978). Electric fields couple directly to electronic structure and consequently can have immediate relevance to the chemistry of catalysis. Moreover, the long range of the electrostatic interactions suggests that they are critical to the overall structure and to the solvation of macromolecules (Kauzmann, 1959; Warshel & Russell, 1984; Matthew et al., 1985; Honig & Hubbell, 1986). The complex effects resulting from electrostatic forces have long been recognized and there are a number of early efforts to model such phenomena (Linderstrom-Lang, 1924; Hill, 1956; Tanford & Kirkwood, 1957; Kauzmann, 1959). LinderstromLang (1924) considered proteins to be impenetrable spheres with charges uniformly distributed over their surface. The limitations of this model led Hill (1956) and Tanford & Kirkwood (1957) to apply the Kirkwood-Westheimer formalism (Kirkwood, 1934; Kirkwood & Westheimer, 1938) to proteins. They assumed the protein to be an impenetrable sphere of low dielectric constant in a high dielectric constant solvent and located the charges at discrete sites at a fixed distance beneath the surface of that

sphere. Since at the time of this work, structural data for proteins were not available. they considered a number of models with simple distributions of charge. Interactions between the charged groups modified their behaviour and qualitatively accounted for the titration curves of proteins. In the course of this work it was noted that the depth at which charges were located was critical in determining the resultant electrostatic properties of the system. Information concerning the positions of charges obtained from X-ray structures of proteins was incorporated into the Tanford-Kirkwood model by Orttung (1970) and Tanford & Roxby (1972). They projected the charges onto a spherical shell. The projection scheme resulted in artifactual foreshortening of the distance between some charge centres, the need to readjust or exclude sites from the calculation and the introduction of the depth of the charges as a fitting parameter. Gurd and coworkers (Shire et al., 1974; Friend & Gurd, 1979) improved the projection procedure by using the actual distances between the charges in the calculation. In contrast to Tanford & Roxby (1972), they assumed that all of the charges are on the

Appendix surface of the protein (i.e. the depth parameter was set equal to zero) and attempted to correct for this by incorporating surface accessibility (Lee & Richards, 1971) into the calculations as a scale factor on the charge-charge interactions. This form of the Tanford-Kirkwood model has been applied to a variety of problems (Matthew et al., 1985). A number of criticisms of the modified model have been published (Rees, 1980; Warshel & Russell, 1984). More recently, extensions of the electrostatic models to arbitrarily shaped proteins have been made (Orttung, 1978; Warwicker & Watson, 1982; States, 1983; Rogers et al., 1985; Warwicker, 1986; Klapper et al., 1986), but these have not been used to calculate titration curves. In this Appendix we present a model (States, 1983) for calculating the electrostatic properties of proteins that makes direct use of the threestructure. The charged atoms, dimensional including those with partial charges (Brooks et al., 1983), have the positions given by the X-ray structure or other data. Thus, there is no projection and neither a depth parameter nor a surface accessibility correction is required. The model represents the protein as a spherical core of moderate dielectric immersed in an aqueous highdielectric solvent. Thus, the protein charges may be located in the core region or, as is the case for many charged residues, they may be in the region of high dielectric constant due to the solvent (Fig. Al). The protein core is assumed to be a region of uniform dielectric constant (EJ from which mobile ions are excluded. The solvent is assigned the bulk dielectric constant (a,,J and a mobile ion concentration (p). The transition between the core and solvent regions is chosen as the radius at which the density of protein atoms has fallen to about half the core value; for the exact choice of radius, see below. Titratable groups, which are the primary concern of any electrostatic model, are found from crystallographic data to occur at positions throughout the system; some are inside the core region, others extend into the solvent region and may be in the presence of mobile ions, and a few are near the transition between the two. By such a realistic positioning of the charges, which contrasts with earlier models, we avoid the need to exclude charges from the calculation or for the introduction of charge projections, arbitrary depth parameters or surface accessibilities to obtain meaningful results. By using a spherical core, rather than the true protein surface, we achieve a great simplification in the calculation. For some proteins (e.g. many enzymes), the spherical surface approximation may not be valid for all the residues because of the presence of an active-site cleft. This is the case for certain of the active-site residues of lysozyme. The separation of the system into two regions, a core with a fixed charge distribution from which mobile ions are excluded and a surrounding region containing mobile ions and some fixed charges leads to a boundary value problem that is a generalization of the one solved for the restricted primitive

123

Figure Al. A schematic diagram of the mobile ion model, showing the protein core immersed in a uniform solvent with charges located in both regions.

model of strong electrolytes (McQuarrie, 1976). Inside the core region, the electrostatic potential is given by Laplace’s equation: v2fp = 0.

(Al) Outside of the core, it is assumed that the system can be described by the Debye-Huckel theory. This leads to the linearized Poisson-Boltzmann equation: v2c$ = lc2cp, where ICis the Debye-Huckel

WY

parameter,

477B lc2 =-I. with /? = l/k,T and I equal to the ionic strength. To solve equations (Al) and (A2) simultaneously with a spherical boundary separating the two regions, we follow Kirkwood (1934) and introduce the general solutions of Laplace’s equations and of the linearized Poisson-Boltzmann equation in spherical polar co-ordinates. For Laplace’s equation, we have:

(r < a), t-43) where a is the core radius, the P,,(cos 8) are associated Legendre polynomials and the A,,, Blm are coefficients to be determined. Correspondingly, for the linearized Poisson-Boltzmann equation, we have:

+ Dl,r’eK’K+ (w)

1

Plm(cos 0) eim4, (A41

where K, and K- are polynomials obtained by series solution of the differential equation in r and C,,, D,, again are coefficients to be determined.

124

D. J. States and M. Karplus

In each region, the potential is a sum of a direct interaction term and a reaction field term. The former is of the form (l/rij) in the core region and (exp( - KTij)/rij) in the solvent region for each pair of charges qi and qj. Expansion of these functions in spherical harmonics permits one to find the direct interaction contribution to the potentials in equations (A3) and (A4). Addition of the reaction field term makes it possible to satisfy the boundary conditions of the problem. These are that (1) the potential goes to zero at infinity and has no singularities at the origin; (2) the gradient of the potential tangent to the boundary is continuous across the boundary; and (3) the gradient of the potential normal to the boundary times the dielectric constant is continuous (Jackson, 1975). Condition (1) is satisfied by restricting the reaction field contribution to the potential to positive powers of r in the core region and to negative powers of r in the mobile ion region. Conditions (2) and (3) result in a two-by-two system of linear equations for each order 1 that must’ be solved to obtain the expansion coefficients for the reaction field contribution. The spherical harmonics expansions given in equations (A3) and (A4) converge slowly. Their convergence properties can be improved by separating out the direct interaction term and expressing it as a function of rij. In most regions of space, the remaining terms of the expansion converge rapidly as a geometric series in (r&r,,,), where rmin(rmax) is th e smaller (larger) value of ri and 3. Tf both charges are near the boundary (within 10% of the core radius), this is not suficient and a limiting approximation where both are on the boundary is employed; for such pairs of charges, the present model is equivalent to the simplified projection technique used by Shire et al. (I 974). The expansions were truncated at order 1 = 20, which is sufficient to obtain satisfactory convergence. Given the solutions to equations (A3) and (A4) for a set of fixed charges in the protein and mobile ions in the solvent. the application of the present formulation to electrostatic effects in proteins proceeds in the same way as the model of Tanford & Kirkwood (1957). The essential point is that the potential can be expressed as a superposition of contributions from the individual fixed charges qi and that the total electrostatic free energy is a sum of terms of the form qiqjjij, where Wij is the interaction free energy for unit charges. The latter can be calculated once and for all and stored as a symmetric matrix given the protein structure, the charged groups and the ionic strength of the medium. A variety of applications of the formulation can be made. The primary objective of the TanfordKirkwood approach (Tanford & Roxby, 1972) and its more recent modification (Matthew et al., 1985) has been to calculate titration curves for proteins; that is, to determine self-consistently the change in the pK values of the various ionizable groups due to

the electrostatic free energy of interaction. Considerable success appears to have been achieved in this regard, though it is somewhat diflicult to evaluate the results because of the use of adjustable parameters in the calculation and the lack of complete assignments of the experimental pK values. For such calculations so-called intrinsic pK values (pKi”*) are introduced for each type of titrable amino acid. They refer to the pK value expected for an idealized “discharged” protein but implicitly include any self-energy terms (Tanford & Kirkwood, 1957; Honig & Hubbell, 1984). Given the results of a calculation of this type, the effective charge (qi) of each of the ionizable groups is known as a function of pH so that the electrostatic contribution to the free energy of the protein is given by:

;$,q&wW The present generalization makes it possible to include not only ionizable groups, which are concentrated at or near the surface of the protein, but any fixed partial charges, which may be distributed throughout the protein. The resulting charge distribution can be used to examine the electrostatic potential in the neighbourhood of the protein to analyse substrateprotein, protein-protein and protein-nucleic acid interactions (Matthew et aE., 1985). For treatment of the effect of electrostatic interactions on a variety of protein reactions (e.g. oxidationreduction, hydrogen exchange, thiol-disulphide interchange), it is necessary to calculate the energy difference at a given site between two states; i.e. reactant and product for the free energy change in a reaction, and reactant and transition state for the rate of a reaction. The ability to calculate the potential at interior sites is crucial to such studies of reactivity and catalysis as the phenomena of interest frequently occur in the core region. For such applications, it may be useful to employ experimental pK values (if they are available) and to use them with the model to determine the desired free energy changes; alternatively, the model can be used to calculate the required pK values, relative to suitably chosen values of pKi,,. To illustrate the model and make clear the difference from the projection-type treatment, we briefly outline some simple applications in addition to the results required for the hydrogen exchange analysis described in the main text. The co-ordinate atom comparisons, minimizations, hydrogen constructions, hydrogen bond energy calculations, and protein density evaluations were performed with the program CHARMM (Brooks et al., 1983); the protein parameter set used is PARAM 3. The core radius was chosen as the radius where the protein density falls to half the value in the core region; such a radius is somewhat smaller than that corresponding to the radius of gyration. The dielectric constant of the core (Q~,) was set equal to 10. This value is a compromise between the value of

Appendix about 5 observed for crystalline amides (Baker & Yager, 1972) and the very high dielectric constant (greater than 150) of liquid amides (Meighan & Cole, 1964); it attempts to reflect the extensive internal motion of proteins observed in molecular dynamics calculations (McCammon et al., 1977). The ionic strength was set to O-1 M, a value typical of many experimental protocols, and a,,, to 78. (a) Application to the redox potential of cytochrome c A number of cytochrome c derivatives have been prepared in which the lysines are selectively alkylated. This changes the lysine charge from 1 to 0 and modifies the electrostatic properties of the molecule (Smith et aZ., 1977). Specifically, alkylating a lysine alters the redox potential of the haem (George et al., 1966). Rees (1980) has examined this problem and found that difficulties arise in applying the Shire et aZ., (1974) model to it. Here we use the mobile ion model to calculate the electrostatic interaction between modified lysine sites and the haem centre. This may be done either by calculating the potential change at the haem centre from an alteration in the charge of a lysine, or by calculating the potential at the lysine from the introduction of a charge at the haem. The reduced cytochrome c structure (Mander et al., 1977) was used to represent the protein. Since cytochrome c is approximately spherical with no active-site cleft, it is an ideal case for applying the model. From this structure, a core radius of 14 A was determined; this is equal to 0.9 times the radius of gyration. It was assumed that in the oxidation-reduction reaction the iron charge changed by one unit. Only the direct effect of modifying a given lysine is included; that is, the lysine charge is altered from 1 to 0 but the effect of this on titratable groups in the molecule is neglected. An ionic strength of 0.1 M is used in the calculation; the exact value employed in the experiment is not known. Table Al compares the redox potential changes that result from the lysine modifications (Smith et al., 1977) with the calculated interaction energies. While there are significant uncertainties in the experimental data, there is reasonable agreement between the

Experimental

Table Al and calculated redox shifts in cytochrome c

Modified residue

-Wexp)

AWxp)

AE,(cslc)

None Lys8 Lys13 Lys27 Lys72 Lys79 LyslOO

26Ok5 2600+5 245+5 245f5 245f5 245f5 26Ok5

0 0 - 15 - 15 -15 -15 0

(0) -2 -6 -11 -8 -13 -3

Energies are given in 10m3eV.

125

calculated and observed shifts. The calculated shifts are, on average, somewhat smaller than the observed values. This may reflect an overemphasis of salt effects due to the free penetration of mobile ions among the external charges. Since all of the lysines are external this effect is particularly important here; a somewhat reduced value of the dielectric constant (E g 50) in the region where the fixed external charges occur would yield improved results. (b) Application

to the titration curve of lysozyme

For the analysis of the pH dependence of the hydrogen exchange behaviour of the tryptophans and peptide groups described in the main text, the mobile ion model was applied to the titrating groups of hen egg-white lysozyme, the protein treated in the pioneering study of Tanford & Roxby (1972). Lysozyme is rather polar; there are 32 titrating groups, many of which have been identified by chemical modification (Imoto et al., 1972) and by nuclear magnetic resonance (n.m.r.) (Wedin et al., 1982). The molecule is roughly spherical in shape with a prominent active-site cleft. Making use of the tetragonal crystal structure (Imoto et al., 1972; D. C. Phillips, personal communication), we plot the protein density (polar and non-polar atoms) as a function of distance from the geometric centre of the molecule in Figure A2. In the interior of the protein the density is approximately uniform with an equal distribution of polar and non-polar atoms. Near 15 A, the density falls off rapidly, with the density of non-polar atoms decreasing more quickly than that of the polar atoms. The transition radius was placed at the point (14.7 A) where the protein density is approximately half its core value; this is equal to 0.8 times the radius of gyration. Some density, due primarily to positively charged polar side-chains, extends out to 25 A. Ten of the 12 charged residues at greater than 18 A are positively charged (overall 19 of 32 titrating groups are positively charged); eight of the 32 titrating groups are located inside the transition radius and four are near the core boundary. The model was first applied to lysozyme with stereotypic pKint (Table A2); the values for pKint are the same as those used by Tanford & Roxby (1972) and Shire et al. (1972). Typically, for a given pH, three to seven charge iterations were required for convergence to a set of self-consistent charges; the convergence was tested by starting with different initial values. Titration curves were obtained by varying the pH over the range 0 to 14; the resulting values for each group were fit by a sigmoidal curve from which the effective pK values (pK,,,) were determined. Figure A3 shows the overall titration curve for lysozyme obtained from the model. It is in good general agreement with the experimental results throughout the measured range between pH 1 and 11 (Tanford & Roxby, 1972).

126

D. J. States and M. Karplus

Radius

Figure AZ. Density

(a)

from the geometric centre. The broken line is the number density for non-polar atoms, while the continuous line is the number density of polar atoms (atoms with partial charges greater than 0.1 in magnitude). of lysozyme

as a function

of radius

Table A2 Intrinsic Stereotypic Group

and effective pK values for lysozyme values

Adjusted values

~Kint

PK,,,

pKi.t

PK.,,

Asp18 Asp48 Asp52 Asp66 Asp87 Asp101 Asp1 19

4.0 4.0 4.0 4.0 4.0 4.0 46

3.45 3.46 3.19 3.35 2.56 3.27 3.16

4.0 4.0 4.0 3.0 5.0 5.0 4.0

Glu7 Glu35

4.5 4.5

3.08 4.16

4.0 6.0

3.33 3.52 3.39 2.26 3.70 4.32 3.15 2.52 6.64

3.6

2.66

3.6

2.66

10.0 10.0 10.0

8.58 9.98 10.54 6.29 7.62 11.54 11.35 10.48 11.73 10.82 10.70

11.5 10.0 11.0

10.60 9.92 11.81

Il..5 10.0 11.5

6.0 8.5 10.6 10.6 10.6 10.6 10.6 10.6

5.32 8.42 11.53 11.35 10.48 11.43 10.79 10.70

5.4 8.5

C terminus Tyr20 Tyr23 Tyr53 His15 N terminus Lysl Lys13 Lys33 Lys96 Lys97 Lysl I6

6.8 7.7 10.6 10.6 10.6 10.6 10.6 IO.6

Values were calculated with tetragonal lysozyme co-ordinates (see the text). included in the calculation had pK,,, = 12.0 and pK,,, between 12 and 13. t Data from Delepierre et nl. (1984) and references therein.

pL,t 4.4 3.7 2.0 4.0 4.5 2.0

6.2

The 11 arginines

Appendix

127

-201 00

I 2.5

I 5.0

I 75

I IO.0

I

12.5

PH

Figure A3. Titration curve for tetragonal lysozyme structure. (--) Calculated with stereotypic pK,,,; (- - -) calculated with adjusted pent values (see the text).

Figure A4. Electrostatic free energy for lysozyme as a function of pH (see the text). (--) Tetragonal structure; (- - - ) triclinic structure; both used adjusted pKi,, values.

The total electrostatic interaction energy (no self terms) calculated from the mobile ion model is shown in Figure A4. The electrostatic energy is negative from pH 3.5 to 12.0, and reaches its minimum (approximately - 6.6 kcal) at pH 10.4. These results are qualitatively consistent with the observed pH dependence of denaturation in The non-random lysozyme (Privalov, 1979). distribution of charges in lysozyme results in selective screening effects, which would not be correctly modelled in the Tanford-Kirkwood formalism. Since the charges far from the protein core are predominantly positive, dielectric and mobile ion screening has an important effect on these charges and shifts the potential of the core negatively. By screening these positive charges from each other, the range of electrostatic stability of the protein is significantly extended on the acid side. The isoelectric point is calculated to be 11.2, in good agreement with the experimental values of Il.0 to 11.35 (Imoto et al., 1972). Tf instead of the overall titration results, the individual calculated effective pK values are compared with the available measurements. the agreement is not quantitative (see Table AZ). The magnitude of the calculated shifts (relative to the stereotypic values) are of the order of those found experimentally; i.e. the electrostatic interactions shift the calculated pK,,, by amounts ranging from 0 to over 1.2 pK units. In most cases relatively local interactions were responsible for the larger shifts observed. Certain of the experimental trends are seen to be reproduced in the Table. For example, the negatively charged amino acids (Asp, Glu) are calculated to have shifts to lower pK,,, values, in accord with the available data; Glu35 is an exception due to its position deep in the core (see below). Unfortunately, there are no data for the positively charged lysines, which are calculated to

As have their pK,,, shifted in the opposite direction. to the three tyrosines, the calculations of the pK,,, show a decrease for two of the tyrosines and an increase for one of them, while the experimental values correspond to an increase for two and apparently no change for one. When one looks at individual residues for which data are available, it is evident that the correlation between theory and experiment is not completely satisfactory; e.g. for the four Asp residues with measured pKobsr Asp87 is calculated to have the lowest pK,,, while experimentally it is Asp66. The disagreement between the calculated experimental values for pK,,, may originate in the simplified dielectric protein core model used for the electrostatics or in the neglect of other factors that can alter the value of pKi”t, relative to the stereotypic constants. It is likely that both of these contribute, with one or the other being more important for specific residues. Some of the factors involved in modifying the pKin* of a specific group such include local interactions, as hydrogen bonding and an unusually hydrophobic environment. The local environment of the titratable groups in lysozyme varies from residues like Arg125 and -128, which are completely exposed and solvated, to groups like Asp66, which is buried and extensively hydrogen bonded, and to groups like Glu35, which is deep in the active-site cleft and not extensively hydrogen bonded. Table A3 lists a number of the residues for which experimental data are available, the adjustment in pK,, (ApK,,,) required to obtain approximate agreement with experiment and the hydrogen bonds in which the residues are involved with energies calculated by the CHARMM program (Brooks et al., 1983). The accessible surface areas of the polar group in the titratable residues, as calculated by the Lee & Richards algorithm (1971), are also included. It can

128

D. J. States and M. Karplus

Table A3 Local perturbations

residues

for titratable

Residue

APL

H-bond acceptor? (kcal/mol)

H-bond donor? (kcal/mol)

Surface areat (AZ)

Asp66 Asp87 Asp101

-1 fl fl

-7.8 -3.0 0

0 0 0

0.3 48.3 64.5

Glu7 Glu35

-0-5 +1.5

-4.9 -1.0

0 0

36.9 16.3

TyrfO Tyr53

+1 f1.5

0 0

-0.7 -0.6

20.6 8.2

N-terminus

+ 1.2

0

-3.3

10.0

t Polar hydrogen atoms built with the CHARMM program and allowed to relax by 25 cycles of AUNR minimization with heavy atoms fixed (Brooks et al., 1983). The acceptor and donor column refer to the energies calculated with the titrating group acting as acceptors or donors, respectively. $ Accessible surface area of titratable residue.

be seen that the shifts in pKi,, required to obtain agreement with experiment range from -0.5 to 1.5; this corresponds to a free energy change of no more than 2 kcal/mol. Asp66 and Glu7 are acceptors of energetically favourable hydrogen bonds that would stabilize the negatively charged species. The negative shifts in their PK~,,~(see Table A3) are in accord with this. Similarly, the amino-terminal hydrogen bonds strongly as a donor; this stabilizes the charged species and is in accord with a positive pKin, shift. Tyr20 and -53 both act as hydrogen bond donors, an interaction that would be lost on ionization, so the positive shift in the PK~,,~ is expected; however, the calculated hydrogen bonds are rather weak, so it is not clear that they can explain the magnitude of the shift. The hydrogen bond patterns do not provide a simple explanation for Asp87 and AsplOl; Glu35 is discussed further below. Because of the disagreement between the calculated and experimental results, a set of “adjusted” pKint values were introduced that shifted the calculated pK,,, closer to the pK,,, for the residues that have been studied experimentally. The values used are listed in Table A2. The titration curve calculated with this set of pKin, is also shown in Figure A3. It is seen to be very similar to that obtained with the stereotypic values. The adjusted values are used with the electrostatic model to treat the hydrogen exchange and thiol/ disulphide interchange kinetics. In this way, inaccuracies in the treatment of the pK values are at least partially compensated for in the interpretation of the experiments. The effects of variations in structure, transition radius, dielectric constant and ionic strength were investigated to gain insight into the properties of the model. Since pK measurements are mainly made in solution and crystal co-ordinates are used to position the charges, calculations based on the co-ordinates of tetragonal and triclinic hen eggwhite lysozyme were made; the structures have an r.m.s. (root-mean-square) difference of 1.3 A for all heavy atoms and 2.51 A for the side-chains

involving titratable residues. Figure A5 shows the lysozyme titration curves calculated with the adjusted pKin,; the two curves are similar but not identical. As to the electrostatic energy (Fig. A4), the minimum values are -7.2 kcal/mol for the triclinic co-ordinates versus -6.6 for the tetragonal co-ordinates; also, the triclinic curve is very flat between pH 5.6 and IO.4 while the tetragonal curve has a double minimum (at pH 6.2 and 10.4). For individual residues, with the identical sets of pKin,, the calculated pK,,, values tend to shift in the same way, but there are significant differences in some cases (see Table A4). The transition radius was varied over the range of 0.75 to 0.85 times the radius of gyration of lysozyme. As the core size was reduced, the interaction energies were somewhat reduced, both as a result of shifting groups out of the core region and as a result of decreasing the distance between groups in the core region and the solvent. For groups near the transition radius, changing the core size has an effect similar to altering the depth of the

-20 0.0

I 2.5

I 5.0

I 7.5

1 IO.0

1 12.5

PH

Figure A5. Titration curves for different structures. (------) Tetragonal structure: triclinic

structure;

both used adjusted

pK,,,

lysozyme (----I values.

Appendix

Table A4

Table A5

Calculated titration behaviour for two lysozyme crystal co-ordinates

Cystine sulphur electrostatic potentials and accessibilities

P&, for Group

p&r

Tetragonal co-ordinates

Residue

Triclinic co-ordinates

Asp18 Asp48 Asp52 Asp66 Asp87 Asp101 Asp1 19

4.0 4.0 4.0 3.0 5.0 5.0 4.0

3.47 3.62 3.51 2.36 3.93 4.28 3.35

3.56 3.35 3.39 2.33 4.02 4.05 3.66

Glu7 Glu35

4.0 6.0

2.75 6.30

3.20 6.08

3.6

2.85

2.78

TyRO Tyr23 Tyr53

11.5 10.0 11.0

10.89 9.83 11.49

11.42 9.77 10.98

His15

6.0

5.56

5.75

N terminus

8.5

8.37

8.18

10.6 10.6 10.6 10.6 10.6 10.6

11.36 11.05 10.32 11.04 10.90 10.58

10.71 11.09 10.25 10.95 11.09 10.58

C terminus

Lysl Lysl3 Lys33 Lys96 Lys97 Lysll6

129

charges within the core in the Tanford-Kirkwood (1957) model. Variation of the internal dielectric constant (Q,,,) scales the electrostatic interactions in the core region. For calculations with the internal dielectric varied from 5 to 20, the largest changes (up to 0.8 pK unit) in p&, involved core residues. The effects have a less than linear dependence on the dielectric constant reflecting the number of residues in the exterior region and the spatial averaging involved in dielectric phenomena. Significant changes in titration behaviour resulted from elimination of the ionic strength terms, even at 0.1 M. In the absence of mobile ion shielding, the core of the protein was uniformly shifted to positive electrostatic potentials. This decreases the pH range over which electrostatic interactions lead to stabilization, primarily due to reduced screening of the exterior positively charged (mainly Lys) residues. The PK,.~ values of the carboxyl groups are lowered by about one unit on average. Increasing the ionic strength from 0.1 M to O-2 M has a relatively small effect, though going to 1 M ionic strength would produce a significant change. (c) Thiolldisulphide

exchange

Disulphide bonds covalently stabilize the native conformations of many proteins including lysozyme. These bonds form by base-catalysed thiol/disulphide exchange (Creighton, 1978). In lysozyme the 6-127 disulphide can be selectively reduced under mild conditions (Imoto et al., 1972).

Cys6 cys30 cys64 Cys76 Cys80 cys94 cys115 Cys127

Electrostatic potentialt 0.85 0.13 -0.86 0.27 -0.72 -0.19 - 0.22 0.87

Sulphur accessible surface 17 0 0 1 0 0 0 0

The tetragonal lysozyme co-ordinates were used and the pH was set equal to 8.0. t The electrostatic potentials are expressed in pK units.

To test the possibility that electrostatics were responsible for this selectivity, we calculated the electrostatic potential at all of the cystine sulphur atoms in the enzyme (Table A5). It is evident that Cys6 and -127 have the most positive potential, equal to 1.15 kcal/mol per electronic charge. Such a potential is expected to enhance the attack by negative ions involved in thiol/disulphide exchange. The cystine 6 sulphur is also the only sulphur atom with significant accessibility to solvent as determined by the method of Lee & Richards (1971). Thus, the present analysis suggests that a combination of electrostatic accessibility factors is responsible for the selective reactivity of the 6-127 disulphide bond. The importance of accessibility to reagents is underlined by recent results on disulphide bond formation in the bovine pancreatic trypsin inhibitor (States et al., 1984). Here it has been shown that the form with two disulphide bonds (14-35, 5-55) only slowly forms the third bond (30-51), which is protected from reagents by the near native structure of the intermediate; by contrast the species (30-51, 5-55), which is also very native like, forms the third (14-38) disulphide very easily due to its exposed position. (d) Generalizations of the spherical model The model presented above can be generalized in several ways. The separation of space into two regions is somewhat arbitrary. Clearly, a region of intermediate mobile ion concentration exists at the boundary between the protein and solvent where protruding side-chains exclude ions and solvent from a significant volume. The polarizability of this region might also be expected to be intermediate between that of the protein core and that of the free solvent. The formalism introduced in the current model could be extended by introducing a spherical shell of intermediate dielectric and mobile ion concentration between the core and the solvent regions. This means that there are an additional pair of boundary conditions and an additional pair of free parameters so the problem remains well

130

D. J. States and M. Karplus

determined. Such an extension was not pursued because it was felt that the introduction of additional complexity in the model could not be justified, given the current uncertainty in determining essential parameters, such as the core dielectric constant and transition radius. The use of a spherical geometry is also obviously a crude simplification. Numerical methods can be used to treat a more realistic geometry (States, 1983). Finally, any continuum model for an inhomogeneous system such as a solvated macromolecule must be recognized as an approximation. Molecular dynamics simulations including solvent could be used to obtain more detailed results (Karplus & 1983). Also, the introduction of McCammon, dynamically polarizable atoms or molecules (Pollock & Alder, 1978) is possible. The present although oversimplified, offers great method, savings in computational cost over the more detailed treatments. We are grateful to Axe1 Brunger for helpful discussions and assistance with some of the calculations.

References Baker, W. 0. & Yager, W. A. (1972). J. Amer. Chem. Sot. 64, 2171-2177. Brooks, B. R., Bruccoleri, R. E., Olafson, B. D., States, D. J., Swaminathan, S. & Karplus, M. (1983). J. Comp. Chem. 4, 187-217. Creighton, T. E. (1978). Progr. Biophys. Mol. Biol. 33, 231-297. Delepierre, M., Dobson, C. M. Howarth, M. A. & Paulsen. F. M. (1984). Eur. J. B&hem. 145, 389-395. Friend, S. H. & Gurd, F. R. M. (1979). Biochemistry, 18, 4612-4619. George, P.. Hanania, G. II. H. & Eaton, W. A. (1966). Hemes and Hemoproteins, Academic Press, New York. Hill. T. L. (1956). J. Phys. Chem. 60, 253-255. Honig, B. & Hubbell, W. (1984). Proc. Nat. Acad. Sci., U.S.A. 81, 5412-5416. Honig, B. & Hubbell, W. (1986). Annu. Rev. Biophys. Biophys. Chem. 15, 163-168. Imoto, T., Johnson, L. N., North, A. T. C., Phillips, D. C. & Rupley, J. A. (1972). The Enzymes, 7, 665-868. Jackson, J. D. (1975). Classical Electrostatics, Wiley, New York. Karplus, M. & McCammon, J. A. (1983). Annu. Rev. Biochem. 53, 263-300. Edited

Kauzmann, W. E. (1959). Advan. Protein Chem. 14, l-63. Kirkwood, J. G. (1934). J. Chem. Phys. 2, 351-361. Kirkwood, J. G. & Westheimer, F. H. (1938). J. Chem. Phys. 6, 506-517. Klapper, I., Magstrom, R., Fine, R., Sharp, K. & Honig, B. (1986). Proteins, 1, 47-51. Kurachi, K., Sicker, L. C. & Jensen, L. H. (1976). J. Mol. Biol. 101, 11-24. Lee, B. & Richards, F. M. (1971). J. MOE. Biol. 55, 37% 400. Linderstrom-Lang, C. (1924). C. R. Trav. Lab. Carlsberg, 15, 7-14. Mander, M., Mandel, G., Trus, B. L., Rosenberg, J., Carlson, G. & Dickerson, R. E. (1977). J. Biol. Chem. 252, 4619-4696. Matthew, J. B., Gurd, F. R. N., Flanagan, M. A.. March, K. L. & Shire, S. J. (1985). CRC Grit. Rev. Biochem. 18, 91-197. McCammon, J. A.. Wolynes, P. G. & Karplus, M. (1977). Biochemistry, 18, 585-590. McQuarrie, D. A. (1976). Statistical lMec?uznics, Harper $ Row, New York. Meighan, R. M. & Cole, R. H. (1964). J. Them. Phys. 68, 503-508. Orttung, W. H., (1970). Biochemistry, 9, 2394-2402. Orttung, W. H. (1978). J. Amer. Chem. Sot. 100, 43694375. Perutz, M. (1978). Science, 201. 1187-1191. Pollock, E. L. & Alder, B. J. (1978). Phys. Rev. Letters, 41. 903-906. Privalov. P. L. (1979). Advan. Protein Chem. 33, 167-241. Rees. D. C. (1980). J. Mol. Biol. 141, 323-326. Rogers, N. K., Moore, G. R. & Sternberg, M. J. E. (1985). J. Mol. Biol. 182, 613-616. Shire, S. J., Hanania, G. I. H. & Gurd, F. R. N. (1974). Biochemistry, 13, 2967-2974. Shrake, A. & Rupley, J. A. (1973). J. MOE. Biol. 79, 351371. Smith, H. T. Staudenmayer, N. & Millet, F. (1977). Biochemistry, 16, 49714974. States, D. J. (1983). Ph.D. thesis, Harvard University. States, D. J., Dobson, C. M., Karplus, M. & Creighton, T. E. (1984). J. Mol. BioZ. 174, 411-418. Tanford, C. & Kirkwood, J. G. (1957). J. Amer. Chem. sot. 79, 5333-5339. Tanford, C. & Roxby, R. (1972). Biochemistry, 11, 21292198. Warshel, A. & Russell, S. T. (1984). Quart. Rev. Biophys. 17, 283-352. Warwicker, J. (1986). J. Theor. BioE. 121, 199-210. Warwicker, J. & Watson, H. C. (1982). J. Mol. Biol. 157, 671-679. Wedin. R. E., Delepierre, M., Dobson, C. M. & Poulsen, F. M. (1982). Biochemistry, 21, 1098-1103.

by A. Fersht