Prog. Blophys. molec. Biol, VoL 47, pp. 1-29, 1986 Printed in Great Britain, All rights tesvrved,
0079-6107/8650.00+..50 Copyright © 1986 Pergamon Press Ltd,
QUANTUM CALCULATIONS ON PROTEINS: THE INCORPORATION OF ENVIRONMENTAL EFFECTS IN QUANTUM CHEMISTRY MALCOLM L. J. DRUMMOND Department of CrystaUography, Birkbeck College, Malet Street, London WCI E 7HX, England
CONTENTS I. INTRODUCTION II. QUANTUM CHEMICAL METHODS III. WHY INCLUDE THE ENVIRONMENT.9
4 4 4
IV. ENVIRONMENTAL INFLUENCES
1. The Continuum Electrostatic Model (a) Classical theory (b) Quantum mechanical applications (c) Summary 2. Discrete Point Models (a) Point charge models (b) The solvaton model (c) Electrostatic models including polarizability (d) Conclusion 3. Hybrid Models
7 11 11 12 15 16 26 26
V. CONCLUSION ACKNOWLEDGEMENTS REFERENCES
27 27 27
I. I N T R O D U C T I O N For many years, biochemical and kinetic studies were the only means available for the investigation of enzyme activity. Thermodynamic considerations enabled one to calculate the reduction in the activation barrier of the enzyme-catalyzed reaction, but in the absence of any kind of structural information, such quantities remained purely empirical. The specificity and efficiency of enzyme catalysis indicated that the three dimensional molecular structure was the primary determinant of the mechanism. This structure should recognize the preferred substrate, and energetic considerations showed that the enzyme's active region should be complementary to the transition state of the substrate (Haldane, 1930; Pauling, 1946). Further progress became possible with the advent of protein crystallography. For the first time, atomic resolution three-dimensional structures of significant enzymes became available. Consequently, the detailed relationship of the structure to the mechanism could be investigated. Qualitatively, it was possible to propose plausible chemical theories for the reaction, involving (usually) strategically placed amino acid side chains. Specificity could also be explained as substrate recognition and binding sites were characterized. The pKa's of certain active site residues (reviewed more fully below) could also be qualitatively rationalized by considering the local environment of these groups in the molecule. The logical next step is the quantitative calculation of active site pKa's, and energy changes on the transformation from bound substrate to transition complex. The number ofatoms in a protein molecule makes the realistic calculation of its interaction properties such a mammoth undertaking that it is only by a judicious combination of modern computer JPB 47.*1-A
l
2
M.L.J. DRUMMOND
technology and intelligent approximation procedures that we may hope to make any progress in this field. It is this line of research that is reviewed in the current article. To describe the electronic state of the active site, quantum mechanical methods are essential. Firstly, then, 1 shall describe the techniques used and the limitations of such calculations. Secondly, protein and solvent environmental effects will be shown to be essential for a meaningful description of the active region, before I review the work undertaken to obtain a usable approximate model of such a complex system. II. Q U A N T U M C H E M I C A L M E T H O D S The two main computational approaches to quantum chemistry are the "ab initio" and "semi-empirical" techniques. The overall framework consists of the following approximations: (i) Non-relativistic motion. This breaks down for the innermost core electrons of the heavier elements, but is reasonable otherwise. (ii) Born-Oppenheimer approximation. Because the lightest nucleus is about 2000 times heavier than an electron, the former move so much more slowly than the latter that they are effectively stationary as far as the electrons are concerned. The nuclei then move in the average field of the electrons. (iii) Hartree-Fock approximation. The electrons are considered singly, each moving in the average field of the others. This ignores correlation between the electrons, and although it makes the calculations much easier, the cost is a significant fall in absolute accuracy. The circumstances under which this is likely to be a problem are discussed by Carsky and Urban (1980). The process leading to self-consistency between all the electrons gives rise to the label SCF (Self Consistent Field). (iv) Linear Combination of Atomic Orbitals ( LCAO ). The molecular orbitals (MOs) are expanded in terms of linear combinations of the free atoms' orbitals. This minimizes the effort needed for inner shell electrons, which are only slightly perturbed on the formation of a molecule. The forms of the AOs (a central field problem) are well known. The formulation of the Schrodinger equation in terms of these orbitals at this level is fully given by Carsky and Urban (1980). The discussion of the various integrals required, basis sets, and the dependence of the results upon them, are all contained in this same reference. In ab initio methods, all the integrals are calculated numerically, On the other hand, if some are assigned zero values at the start, and others approximated empirically, we have the semiempirical formalism, such as: C N D O (Complete Neglect of Differential Overlap) INDO (Intermediate NDO) M I N D O (Modified INDO) and so on. Pople and Beveridge (1970) review this method. The parametrization can become quite extensive, and is subject to occasional review (e.g. CNDO/2 or MINDO/2 or /3). The usefulness of the method, and examples of applications, are to be found in Murrell and Harget (1972). Because of the simplifying assumptions made, larger systems can be tackled. Also, some molecular properties (such as dipole moments) frequently come out more accurately than by ab initio techniques, although considering the empirical element of the method, this is not really terribly surprising. The group function approximation (McWeeny and Sutcliffe, 1969) in molecular quantum mechanics is analogous to the Hartree-Fock theory for electrons, but whole molcular groups are considered. The wavefunctions for the separate groups are combined in a totally antisymmetric way to produce the complete molecular wavefunction, assuming only weak interactions between the groups. This concept is used in solvent effect theory, with the solute as one group, and the solvent (represented in an approximate way) as the other. This effectively amounts to a linear separation of terms in the Hamiltonian; the analysis was carried out by Kleiner and Elder (I 975). This approach suffers from the fact that it uses an orthogonal basis, which is computationally disadvantageous and renders the method
Quantum calculations on proteins
3
unsuitable for large systems. Mehler (1977, 1981) derived a non-orthogonal group-function approach which is free of the above criticism. One way of extending full quantum chemical calculations to large molecules was investigated by Davis et al. (1975), who called it the "molecular fragment technique". This is another application of the group function method. The wavefunction for each fragment is expanded over a basis set and optimized; then these functions are used as the basis set for the whole molecule. Even so, however, the size of the molecule that can be treated in this way is fairly small. One approximate way of performing calculations on whole molecules is available. NaraySzabo has reported calculations on inhibitor-enzyme interactions for trypsin (I 984) and for serine proteases (Angyan et al., 1983) using this method based on transferable bond fragments. The wavefunction of the molecule is considered to be made up of strictly localized bond orbitals. These are constructed from Slater-type atomic orbitals separately for tr and rc bonding orbitals. Two-centre and lone-pair orbitals are built up from these, using coefficients either assumed to be transferable, or obtained from C N D O / 2 calculations. The localization of the orbitals enables the calculation to be simplified enormously, and an entire macromolecule may be treated in this approximation (Naray-Szabo, 1979). These methods require comparison with more rigorous (although computationally more expensive) techniques in order for one to be able to evaluate their effectiveness in any absolute sense. In conclusion, it is as well to note that the scale of the operation is really rather small in chemical terms. The best that the ab initio programme Gaussian 76 (Pople et al., 1978) can do is a maximum of 35 atomic centres, for example, and 90 basis functions, so that the size of the quantum mechanically treated region is severely limited. These difficulties arise, in part, from the problems involved in the storage and retrieval of all the necessary integrals, even with modern technology, and are not due to the theory itself in any way. Independently-developed (and highly machine-dependent) programmes have been written, which are able to handle several hundred basis functions, but this in no way invalidates the above general comments. III. WHY I N C L U D E T H E E N V I R O N M E N T ? If a system can be modelled by not more than about 20 atoms, the nuclear coordinates of which are known, it is a simple matter to run a quantum chemistry package and come up with an energy.Varying one or two parameters in the geometry enables one to find the stablest conformation. The number of calculations that have been attempted this way is large, and growing monthly. The question is: to what extent is it permissible to ignore the whole of the rest of the environment of the quantum system, if we want to predict the correct conformation? As an initial step in the investigation of the effects of various protein groups in the active site region as (potentially) part of the enzymic mechanism, one may estimate their contribution to ascertain whether they are important or not. Whether there are other effects equally or more important will need a separate study. For example, Kollman and Hayes (1981), investigating the active site of papain, tried including an extra aspartate (carboxyl group) in their calculations. Previously, Broer et al. (1976) had used the system shown in Fig. 1. Kollman and Hayes found an additional stabilizing effect for the (observed) diionic
H CH3--SH...N//CXNHI
.0 ~C'~ •
HC--'CH
(ys-25
His-159
ASh-175
FIG. 1. Papain active site. Small molecules are placed in the geometry of the catalytically significant residues for ab initio calculations on such systems. This arrangement was used by Broer et al. (1976).
4
M . L . J . DRUMMOND
form due to the presence of the aspartate, but the uncharged form was still predicted to be stabler. These studies (and others to be reviewed below) were initiated because: (i) The experimental pKas of the histidine and the cysteine were anomolous. For example, although the pK, of histidine is usually about 7, in papain it behaves as a strong base. (ii) The charged form is observed in vivo, as a consequence of (i). That the structure of the whole enzyme is essential for a more complete understanding of these phenomena will be seen below. Enzymes usually invoke charge separation processes in their activity (Krishtalik, 1980). Quantum mechanics in vacuo will normally predict a neutral form rather than a diion. Which state occurs is determined by the balance between the electrostatic interaction between the charges moieties, and the difference between the ionization potential of the electron donor and the electron affinity of the receptor. This balance is usually against the diionic form. The inclusion of environmental effects in the situation increases the relative contribution due to the charge-charge interaction, and renders the charge-separated form the stabler of the two. IV. ENVIRONMENTAL INFLUENCES In general, because the number of atoms that can be dealt with quantum mechanically with available programmes is so limited, the "environment" will consist of two regions: (i) the rest of the molecule under study; (ii) the aqueous environment. As will be seen in the following, the structure of an enzyme [e.g. the proximity of the N-terminus of an ~t-helix to the active site of the thiol proteases (van Duijnen et al., 1980)-1, and the nature of the solvent [e.g. water's high static dielectric constant leading to a reduction of the protein's electrostatic field near the solvent (Rees, 1980)-1, can have significant, and often dramatic, effects upon the explicitly quantum mechanical region. The two distinct approaches, which will be dealt with separately below, are the "continuum" and "discrete" models. The former attempts to describe the environmental influences by ignoring all or most of the structure of the surroundings in order to achieve mathematical simplicity, and reduce the computation time (and cost). The latter sacrifices this simplicity and takes greater account of structural details. This, in turn, requires coordinate data from one source or other (for example, crystallography). The relative advantages and disadvantages of the two approaches will be discussed below, for each of the two regions, (i) and (ii). 1. The Continuum Electrostatic Model
(a) Classical theory Dielectric theory (Frohlich, 1958, for example) is successful for bulk matter because, although the individual molecules have "quantum properties" (complex interactions), and there are statistical fluctuations on a molecular scale, the behaviour of large numbers of molecules can be described by a small number of parameters. These (the relative permittivity and permeability, conductivity and so on), obey relatively simple equations. These quantities may be obtained empirically, or from molecular methods of varying degrees of complexity (Hill et al., 1969). Above all, the theory is simple, and a variety of effects can be incorporated: (i) dielectric relaxation (Daniel, 1967); (ii) saturation effects in high field strengths (Buckingham, 1956); (iii) non-zero ionic strength (Hasted, 1973). These effects all add to the potential generality of the model. As an initial approximation, one may account for the difference between the highly mobile and polar water and the relatively immobile (though still polar) core of a protein by assigning a dielectric constant of about three to the protein interior, and about 80 to the surrounding water. This model can be exploited to account for the greater ease of charge-separation processes in enzymically catalyzed reactions (i.e. a lowering of the activation energy). The protein's immobilized polar regions can create an environment particularly suited to the
Quantum calculations on proteins
5
formation of the charge-separated configuration (Warshel, 1978a), without dipolar reorganization (as occurs in free solvent reactions). Krishtalik (1980) has shown that, for a spherical molecule, the optimum radius of the enzyme globule, if it is to reduce the activation energy by this mechanism, is about 15-20 A---comparable to the true values. The assumption of uniform internal dielectric constant ignores all the internal structure of the protein. It is also rather difficult to assign a value to this quantity (Rees, 1980). The assumption that e ~ 80 immediately outside the protein is also highly suspect. Water around proteins will be partly immobilized, by hydrogen bonds or surface ions, reducing the dielectric constant below the bulk value in these regions. The situation is further complicated by the presence of counterions in solution. As Warwicker (1983a) has pointed out, crystallographic data indicate that surface (charged) groups can move about 1 A thermally. Such movements must therefore involve an energy change less than kT, the thermal energy. This is impossible given the kind of dielectric discontinuity mentioned above. To model an enzyme molecule, which is a highly involved and specific system designed for the production of the optimum environment around the active site, by a homogeneous dielectric medium of ill-defined dielectric constant, is clearly rough. It is intended only to reveal the general features of reactions transferred from solvent to protein. It would therefore seem wiser to use a better model for the enzyme itself. Continuum modelling of the solvent is, however, a more appealing possibility, because it does not have a fixed structure. Thus coordinates are not available for any but the waters most tightly bound to appropriate sites on or in the molecule. Pethig (1979) has reviewed the dielectric properties of biological molecules in solution, from both experimental and theoretical viewpoints. Kharkats et al. (1976) have generalized the equations of dielectric theory for materials which are dispersive, inhomogeneous and anisotropic. The model most often adopted for continuum solvent calculations is the "cavity model", in which part of the system is treated in detail, in a cavity in a continuous isotropic dielectric representing the solvent. The idea is due to Debye (1929) originally, in his classic treatment of polar liquids. The deficiencies in his treatment were at least partially rectified by Onsager (1936). He considered a molecule in a cavity of molecular dimensions (which has been the norm since), surrounded by a medium characterized by bulk dielectric parameters. This limits the application to molecules with no short-range orienting forces. Onsager showed that the local field consists of two parts: F=G+R
G: the cavity field, that would exist in the cavity were the molecule absent. R: the reaction field in the cavity due to the polarization induced in the medium by the (dipole) field of the solute. The solute is modelled as a polarizable dipole. By matching the electrostatic boundary conditions on the potential inside and outside the sphere, Onsager found: G-
3~° E=gE 2~o + 1
R -
2(~° - 1) m/a 3 = rm (a = cavity radius) 2~0 + 1 (4/3)ha 3N= 1 defines a
m = total dipole moment of the molecule = p+~F
=
It-agE 1 -ra
Onsager conceded that the weaknesses of his method were: (i) The assumption that the medium has bulk properties right up to the "edge" of the molecule is only a first approximation.
6
M . L . J . DRUMMOND
(ii) Saturation effects are neglected, whereas one would expect the local field to be strong on a molecular scale. (iii) Most molecules are (roughly) ellipsoidal, and the spherical cavity will model them poorly. (iv) The results are very sensitive to the value taken for the radius of the sphere. This turns out to be a problem for all cavity models. A slight underestimate of the cavity radius results in a large overestimate of the solvent effect. There were several extensions made to Onsager's theory. Bonnor (1951) repeated the calculation for a general polarizable charge distribution in a sphere. Kirkwood (1939) extended Onsager's theory to the case of polarizable point dipoles including local interactions, with a more fully statistical treatment. The results are far better in the case of water (locally tetrahedrally coordinated). Frohlich's (1948) similar treatment was shown to be equivalent by Brown (1956) if a correction is made to Kirkwood's original derivation. Norton-Wilson (1939) considered molecules with anisotropic polarizability. Subsequently, Abbott and Bolton (1952) considered a point dipole in a prolate ellipsoidal cavity, as an attempt to describe a greater variety of molecules more realistically. The treatment was made more rigorous by Buckingham (1953a, b). Kirkwood (1934) dealt with the case of an asymmetric charge distribution in a sphere (as a model for a molecule), and tested the results on a number of zwitterionic species, such as amino acids. Many continuum solvent and protein effects have been incorporated into the work of Warwicker (1983a), and which he applied to phosphoglycerate mutase (1983b) in order to calculate its electrostatic field. His method, designed solely for computer calculations, consists of the evaluation of the potential at points on a grid, and the assignment of local charges or volume elements of charge density (due to ions in solution) to the nearest grid point. The computational scheme is incapable of modelling accurately charges which are closer together in the protein than the grid spacing he used without requiring an excessive density of such points. The theory takes into account the electric field of s-helices, modelled as charges of + ½e placed at the end of the helix (Warwicker and Watson, 1982). Otherwise it is a simple internal/external dielectric continuum model modified so as to include: (i) Dielectric saturation, via a simple analytic fit to the theoretical results of Booth (1951) and Laidler (1959). Dielectric discontinuities and sudden, excessive energy changes are avoided by placing a dielectric shell ore = 4 around each charged site, before the smooth change to bulk values begins. (ii) Counterions are included, either as a charge density p = po(e- e ~ b / k T
__
e edp/kT)
where P0 is the charge density of one (monovalent) species in zero electrostatic potential ~b. Alternatively, in the Debye-Huckel limit: p=-
2epo
For the same reason as in (i), discontinuities in this charge distribution at the protein surface need to be smoothed out. This method is attractive for its simplicity, even though it is purely computational and involves a great many fit-functions. The assumption that the or-helix dipoles are the only specifically structural contributors to the electrostatic field will be seen in a subsequent section to be too sweeping for accurate determinations of the relative stabilities of different solute conformations. However, Warwicker, in his application to phosphoglycerate mutase, was primarily interested in the long-range field for the purposes ofsubstrate recognition and binding. For a fast, if rough, estimate, incorporating salient features of the enzyme's gross geometry, the method is of value.
Q u a n t u m calculations on proteins
7
A similar fitting of experimental results to a reasonable functional form for ions in proteins has been carried out by Mehler and Eichele (1984), in order to perform calculations on wateraccessible regions of proteins. Their attempts to calculate the observed pK a shifts of ionizable amino acid side chains meet with qualitative success. The difficulty in defining dielectric constant over microscopic distances, and the expected over-simplification in treating a protein as a continuous environment with no effects depending on its internal conformation, prevent any attempt at further improvements in accuracy from being reasonable, until such effects have been estimated more rigorously. (b) Quantum mechanical applications Rinaldi and Rivail (1973) performed calculations on water and its dimers. The reaction field is most likely to have a sizeable effect on an easily polarizable system, as are hydrogenbonded molecules. They used anisotropic polarizability, and Barriol and Weissbeckers' (1964) method of treating mixtures. The dipole moment in the cavity is: M(t) = m ° + m(t) where m ° is the permanent dipole moment, and m(t) is a statistical or quantum mechanical fluctuation. In the quantum case, m ° = <~,lM(t)l~,> where ~ is the solute wavefunction in the presence of the solvent. Similarly, the reaction field is R ( t ) = R ° + r(t).
So the internal energy is: Ei=
--
½ = - ~R1o. m o _ ½.
The first term in this equation represents the interactions between permanent and induced dipoles. R ° = f °m °
defines f~ the tensorial reaction field factor. The second term in the equation represents the dispersion energy, r(t) depends on the history of the system: r(t) =
foo_~r)m(t- z) dT.
Defining a generalized susceptibility = I ~ ~ ) e - ~ " dT J0
it follows that t _ ~ = f_.~mto
which requires knowledge of e(co) in the actual expression for f. Rinaldi and Rivail used a suitably modified version of O n s a ~ r ' s formulae to evaluate the various expressions. The value of e(co)to use is troublesome, because the fluctuations are very rapid indeed. Neither the static dipolar reorientation nor the atomic polarizability should be included. They used e for simplicity, although this is not quite the same as the square of the refractive index extrapolated to infinite frequency; it does still include atomic contributions. The theory was implemented within the CNDO/2 semi-empirical framework, and revealed only minor changes in the geometry and dipole moments of the water molecules. On the other hand, the energies and relative stabilities of different dimer conformations were found
8
M.L.J.
DRUMMOND
to be strongly dependent upon the presence of the solvent. This will consistently be seen to be characteristic of solvent effects in such systems. Subsequently, Rivail and Rinaldi (1976) extended their theory to include multipoles of all orders. In this case a spherical cavity of radius a is assumed, with an arbitrary distribution of point charges within it, and an isotropic polarizable continuum outside. Kirkwood (1934) had already shown that, for a single charge, the electrostatic potential within the sphere is given by:
+-a ,=o
Ir-rll
~(l+ 1)+i
~
PI(cos 01).
Therefore, for a set of charges at {rj} of magnitude {q~} interacting electrostatically, the energy will be:
u=½Ei,jqiqj ~, (/-I-1)(1--~) a~a2/ 1 /r/-r/'~tpt(cos '," ,=o s ( l + l ) + l
Oij)
when the self-energy terms have been removed. For a molecular system, {q~}is represented by { Z J at {R~} and electrons at {r~}. Then the sum becomes:
U-~-- ~ Ul /=0
u I --
a ~,,#Z~Z#
~(l+l)+l
R~r~ t
a2 1
P~(cos O.p). r:. ~
Using the spherical polynomials, (4n) 1/2 St~(ri) = 21+ 1 r/l Y#(0 i, ~bi) where y m are the usual spherical harmonics, one may expand the Legendre polynomials: 47t
Z 2i+ l ~'(O,,~,)Y,'(oj, ~j)
P~(cos o,j)=
m=
--I
to give I I/1~
--
1 2
E m = -I
P~*mM~ = - ½~.M~_
where
and m
R~-.~Ma w i t h ~ -
(1+1)(1-~)
1
e ( I - 1 ) + l a 2t+l"
The quantities M F can be shown to be related to the usual multipole moments of order 1. Rivail and Rinaldi chose the centre of the sphere as the centre of charge of the nuclei, which then fixes it, in the Born-Oppenheimer approximation. For a reasonable overall dipole moment the centre of charge of the electrons will not deviate significantly from the same
Quantum calculationson proteins
9
point. The radius of this cavity was calculated from the polarizability of the solute molecule, using the formula of Barriol and Weissbecker (1964): a3
e~o+2
0t was calculated by a method the authors had devised (Rinaldi and Rivail, 1974), but experimental information was also available for the fluorinated hydrocarbons they studied. Rivail and Rinaldi now assume that the environment cannot follow the electronic motion, but only the expectation value of the charge distribution. The evaluation of this quantity requires the integration of the electronic contribution to u over all space. The discontinuity in the dielectric constant is ignored on the grounds that the wavefunction is small at the boundary. The integration domain of the potentials obtained above is extended over all space, rather than considering the external space separately, and using the different potential there. Rather than attempt the calculation of the dispersion energy, as in their work on water, they comment that, because of the complication involved in the unknown frequency dependence of the dielectric constant, this term is difficult to evaluate with any certainty. They therefore take for the effective perturbation energy:
)
(1)
rather than
(
,2,
This corresponds to assuming that the active site system is separable from the rest of the molecule, in the sense that there are no correlated interactions (i.e. dispersion) between them. They then implement this within the LCAO approximation for molecular orbitals, reducing all the additional integrals to one or two centre recursive forms. In practice, the series had converged by l = 7, and the actual calculations were done in the CNDO/2 scheme because this has the advantage of predicting molecular dipole and quadrupole moments more or less correctly. Because of the uncertainty over the cavity radius, it is deliberately overestimated in order not to overestimate the solvent's influence. The solvent effect was found to drastically alter the relative stabilities of different conformations of, for example, 1,1,2-trifluoroethane. The continuum approximation is justified because there are no large electric fields (no ionic effects), and the solvent does not interact with the solute in a specific way. Therefore, if applied to water, the model would be expected to be less accurate. That the contribution of the wavefunction to the multipole moments for r > a is negligible has been shown by Rinaldi in unpublished work, and hence there are no convergence problems. The possible origin of such problems will be discussed below, when the alternative treatment using (2) instead of (I) will be considered. A small group of different approaches appeared in the mean time, however. In their study of the conformational stability of acetylcholine, Beveridge et al. (1974) calculated the total solvation energy using Sinanoglu's (1967) solvent effect theory incorporated into the semiempirical INDO scheme of quantum calculations. In this theory, the total energy of the solvated solute is taken as:
Etotal =
E s o l u t e in f . . . . pace "{-
Esolvation
and Esolvatio . = Eel ~ + Edis + Ecav
Een: interaction of induced and permanent dipole moments, using the Onsager reaction field.
tO
M.L.J. DRUMMOND
Edis: dispersion energy calculated from an effective pair potential and the radial distribution function of the solvent around the solute:
1;
Edis = ~
pv~f(r)9(2)(r)4rr r 2 dr.
(3)
Ecav: the energy required to form the cavity occupied by the solute, and estimated from the surface tension of the solvent. This term is important only for the difficult problem of absolute solvation energies. As long as the cavity stays the same size, there is no change in this term in studies of relative conformational stability. This theory requires no arbitrary fitting parameters. The electrostatic part does not include multipoles beyond dipole moments, nor saturation, and there is a degree of uncertainty concerning the dispersion term (Sinanoglu, 1967), which are weaknesses. The usual assumption that there are no specific solute-solvent interactions has also I~een made. Beens and Weller (1969) considered the first order perturbation theory for two interacting species, on including the expectation value of a reaction field. The resulting equations are, of course, non-linear, and the authors show that at a certain minimum solvent polarity, a new solution becomes possible, corresponding to a different, polarized form of the system: a charge-separated state. Much the same result was obtained by Yomosa (1973), when he considered the non-linear Schrodinger equation that corresponds to the case considered by Beens and Weller. He used this to explain the existence of ionic defects H20 + H20~.-~-H3O+ + O H in water and ice (Yomosa and Hagesawa, 1970), as the polar form is stabilized preferentially by the solvent, if its polarity is high enough. Water amply meets this condition. Although these relatively simple calculations cannot be extended to larger systems usefully, they do show that the existence of stabilized polar forms is strongly dependent upon the polarity of the solvent exceeding a certain critical value. Now to return to the question of instantaneous response. McCreery et al. (1976a; Hylton et al., 1974) follow the same path as Rivail and Rinaldi (see above), but instead of averaging over the positions of the electrons for the overall electric moments, the instantaneous values of the electronic positions are retained. The perturbation energy is then (2), not (1). This procedure includes a measure of the dispersion energy, because it includes correlations in fluctuations between solute and solvent, although Rivail and Rinaldi (I 976), following Linder (1962, 1967; Linder and Hoernschmeyer, 1964) again, show that the explicit frequency-dependence of the dielectric constant needs to be included. In the simple case of an infinitely narrow dispersion band, this leads to McCreery et al.'s result being incorrect by a factor of about four. Nevertheless, if changes in the dispersion energy in the course of a reaction are important, this method should at least detect the fact. McCreery et al.'s theory is intended for non-polar solvents, where the polarization relaxes e x t r e m e l y rapidly, and so it should be a good approximation to let the dielectric effects follow the motion of the electrons. Even highly polar liquids have a component of polarizability that is capable of responding in this way, although the orientational contribution relaxes much more slowly, and certainly only responds to the average charge distribution. Interestingly, Burch et al. (1976) did apply this method to calculations of the formamide-water supermolecule in aqueous solution. Such a procedure will be seriously in error for either the dispersion or induction energy, depending on whether the low or high frequency dielectric constant is taken, respectively. However, McCreery et al. came up against difficulties. The most serious of these is that the expansion (2) fails to converge. Tapia (1980) considers this to be due to: (i) Kirkwood's expansion of the potential assumes r < a , and diverges for r > a . In the "averaged" treatment there is no problem, because ( r ) < a . However, in McCreery's theory, the electrons are free to penetrate the boundary, whereupon the potential expansion ceases to be valid. (ii) In the medium, the interaction between the electrons and the nuclei is so reduced that it is
Quantum calculationson proteins
11
more favourable for them to imbed themselves in the continuum once they find themselves in the vicinity than to occupy the usual orbitals. Therefore it was necessary to constrain the electrons within the sphere, and McCreery includes a "penalty function", which is a potential of the form: all electrons
c~ =
~
(ri/a) n
~=1
(4)
n must be greater than the order of the highest multipole retained in the sum (2). They used n=12. McCreery et al. attempt to justify this artifact by considering it to arise from the overlap repulsive forces between the solute and solvent molecules, so that n = 12 is taken to be a model of the r - 12 repulsion term in the Lennard-Jones interaction formula. This is invalid, because all powers of r occur in the expansion, which will eventually overwhelm any finite value ofn in (4). The method is inherently divergent as it stands. Despite this, it produces good results for alkanes dissolved in heavier alkanes. Apart from the penalty function, the model should be good for the electronic and atomic polarizabilities' contributions to the interaction energy of solvation. A further technique is due to Miertus et al. (1981). In this method, the cavity is the van der Waals envelope of the molecule. The reaction field is modelled by the surface charge density induced on the boundary (approximated by a distribution of point charges). This can be simply calculated from the normal component of the electric field at the boundary, which is in turn a simple property to calculate from the quantum mechanical wavefunction. Iterative solution of the resulting system of equations results in the inclusion of continuum effects without the problem of filling the "gaps" between an asymmetric solute and the usual spherical or ellipsoidal cavity wall. This method has been applied to calculations on the spectrum of formaldehyde in aqueous solution (Bonaccorsi et al., 1984). Because the precise cavity used is still somewhat arbitrary (the van der Waals envelope is only one choice), this technique may be useful only for calculating relative stabilities. With this limitation, however, it would seem to be useful for any system for which the boundary surface can be tractably modelled by a set of point charges. In practice, the reaction field once again responds only to the expectation value of the electric field, so there are no convergence problems beyond the number of points needed to model the surface: 100-200 in the case of the water molecule. (c) S u m m a r y Continuum models ignore structure and specific interactions, and gain simplicity and the option to include (for simple spherical cavities) multipoles of any order in the electrostatic energy. This expansion was made about only one centre in the sphere. This in turn enables convergence properties and the importance (or otherwise) of higher order electrostatic moments to be investigated. Both Rivail and Rinaldi, and McCreery et al., found that terms beyond the dipole were not negligible. Indeed, Rein (1973) has calculated that, for a singlecentre multipole expansion, convergence is achieved for quadrupoles at 10 A from the molecular boundary. Bonaccorsi et al. (1974) found that, for water, the convergence is to less than about 7% of the true energy at the van der Waals envelope, falling to about 1% at oneand-a-half times this distance, when terms up to hexadecupoles are included. This is seen to bear out the quantum mechanical convergence studies referred to above. For larger molecules, the convergence properties are correspondingly worse. Since the dipole approximation is often made, these lessons are well worth learning.If many centres are taken, the convergence is naturally more rapid, but the ensuing mathematical complexity of the treatment sacrifices the continuum method's major advantage: its simplicity. 2. Discrete Point M o d e l s
If a system is to be modelled in a manageable and reasonably accurate way, the electrostatic field and interaction properties of the environment with the quantum mechanically treated subsystem are the most amenable to treatment. This is due to the
12
M.L.J. DRUMMOND
relative simplicity of Coulomb's law and its generalizations to multipoles of arbitrary order, and the approximately linear relationship between the induced dipole moment and the applied field: p=~E.
The simplest model is to replace the molecular and atomic orbitals of the environment (the true charge distribution) with an array of point charges at appropriate sites (often atomic nuclei). If the polarizability of the electron distribution is ignored, we have the point charge models. (a) Point charoe models The question is now: how should one decide upon the locations and magnitudes of the point charges so that he obtains the best possible representation of the electrostatic field? The classic work in this field is due to Mulliken (1955). He located the charges upon atomic nuclei. Considering a diatomic molecule, then on the linear combination of atomic orbitals:
I]I=ClZ1 "]- C2~2 where Z1,2 are the wavefunctions of the separate atoms. Then, on integrating, the total number of electrons being N: N = Nc21 + Nc 2 + 2Nclc2S 12
S~2 = (zllz2).
(5)
The first two terms in (5) are gross atomic populations, and the third is the "overlap population". Mulliken assigns half of the latter to each atom, so that ql =N(c3+clc2S12) etc.
This procedure (Mulliken population analysis) is still the most commonly used one for obtaining molecular electrostatic fields, because of its sheer simplicity. The results are usually accepted uncritically. For example, Hayes and Kollman (1976a) report calculations on carboxypeptidase A by this method. They investigated electrostatic enzyme-substrate interactions. (Their finding that the potential depends largely upon the charges of the surface groups is open to question on the grounds that they include neither solvent nor counterions.) Their similar calculations on the active site (Hayes and Kollman, 1976b) indicated that only the nearest groups (and the surface charges) had a significant effect on the field there, the rest of the molecule exercising a "marginal modulating effect". The possibility that only the nearest groups are important in this way gives one cause for optimism that accurate field calculations may be possible. Mulliken's division of the overlap populations can be shown not to be merely the most convenient. Mayer (1983) has derived an effective charge on an atom operator which has the Mulliken charges as its eigenvalues purely as a result of the structure of SCF theory. This mathematical construction shows that there is, at least, a good reason why one should choose this method of population analysis in a SCF calculation. However, numerical accuracy depends upon the degree to which Mulliken population analysis produces an accurate potential. The major objection to the process is that it does not preserve bond dipole moments, introducing an error into the leading term in the electrostatic interaction between moieties bearing no net charge. The results show a dependence upon the basis set used, which makes the comparison of results difficult, and confidence in them wellnigh impossible. Politzer and Harris (1970) point out these and other difficulties, and briefly review other techniques based on Mulliken's method, all of which suffer from one or other of its inadequacies. A more general definition of "charge on an atom" is (Dean and Richards, 1975): q = ~$*~b dr where ~, is the molecular wavefunction, and the integration is over a volume "associated
Q u a n t u m c a l c u l a t i o n s on p r o t e i n s
13
with" the charge centre, chosen so that the total volume is appropriately subdivided. Various attempts have been made at calculating this integral and defining the volumes. Politzer and Harris (1970) define the associated volume in such a way that, in the limit of zero interaction, each atom is found in its natural free state. Applying this to linear molecules, they integrate the free-atom electron distribution along its axis, and mark off the planes where the number of electrons 'to the left of' the integration point reaches an integer equal to the total nuclear charge found there. These planes then define the associated volumes, and the true molecular wavefunction is integrated over them to find the partial charges. Bader et al. (1979) consider the topology of electron distributions, and extend the volume associated with a particular nucleus out to the locus of the density minimum between that atom and the next. This is physically reasonable, and Barrett (1983) has implemented this scheme, using interactive computer graphics to reduce the overall computation time that is needed. Neither of these last two methods is really practical for any but the smallest molecules, but they are not dependent upon the analytical form of the wavefunction assumed. Other work has been done for molecular orbitals constructed from floating spherical Gaussian orbitals (FSGOs). These orbitals differ from the usual Gaussian orbitals in that all are spherical, but they are not all necessarily located upon nuclei. For example, a p-orbital can be modelled by two FSGOs, one "above" and one "below" the nucleus, and their locations can be optimized as a further variational parameter in the quantum calculations. This basis set restriction is the main drawback of these methods. The limitation is made because the product of two FSGOs on different centres is always a third, with its centre on the line joining their centres. Hall (1973) developed a model in this system preserving both total charge and dipole moment. Other multipoles are in error by a predictable amount (Tait and Hall, 1973), and analytical expressions can be obtained for the potentials and fields. The method is only expected to be accurate at long range, as indeed are all point charge representations. The charge density is:
e~,cb/b,,
p(r) = Y, Sd
Ps, is the usual bond-order matrix, constructed from the LCAO expansion coefficients as follows: P,,
=
2 Y CskCtk, * k
~bs and ~b, are FSGOs, centre r s and r t and exponents a s and a, respectively. Then
Centre rst =
Grs + atrt
; exponent as, = as + a,
a s 4- a t
Hall chooses to model the charge density, therefore, as: p*(r) = ~ PstSst 6(r-G,) S,I
That dipole moment is, in fact, conserved is shown as follows: P = S rp(r) d 3 r : ~ estlr@s4)t d 3 r s,l
: ~ est(f(r -- rs,)flpsdptd3r + rs, S~Os~btdar) s,t
Using the spherical symmetry of ~s, about r~,, then
p = ~ Ps,Ss,rs, = ~rp*(r) d3r =p* S,t
14
M . L . J . DRUMMOND
The representation consists of charges Ps, at rs and P~S~t at r~t, i.e. on nuclei and between every pair of nuclei. McCreery et al. (1976b) used this method to consider microscopic orientational effects upon water near a protein. Although the electrostatic point charge model potential is only asymptotically equal to the true potential, calculations are possible at more than the very few orientations that one can afford by supermolecule methods. Since correlation and charge-transfer effects are neglected, the results are more accurate at greater distances from the charge centres. Shipman (1975) suggested that the ~M(M+ 1) centres of Hall, where M is the number of nuclei, is too computationally expensive for large M. He advanced a point charge model for FSGOs using only the nuclei as centres. It = E P,,S~,r,, = ~ P~,S~, ~,r, + ~,r, s,t
s,t
O~s + O~t
2P~,S~, s.t
-~- rs . O~s "+ O~t
Therefore point charges
q s = Z 2 P s , Ss, t
as
O~s+
O~t
provide a reduced set, also preserving charge and dipole moment. A general population analysis, preserving the dipole moment but not dependent upon any particular choice for the wavefunction, was presented by Thole and van Duijnen (1983). It is adjusted to reduce its sensitivity to small changes in geometry, and to optimize its agreement with the full quantum chemical calculation, in a way which is not dependent on the particular system under study. They require {qk} on atomic centres such that Z qk = k
Q and Z
rkqk =
P
k
where Q and P are elements of the set of one and two-centre charges and dipole moments calculated from the quantum mechanics. A local dipole moment (P) is then modelled by charges on all the atoms. Of the many possible ways of doing this, they take the one which minimizes all the charges, especially the ones furthest from the point considered, by minimizing with the constraint: q2 k~ ~Wk is a minimum where w k is a weighting function falling off rapidly with distance. Differentiating with respect to q (~ and p are Lagrange multipliers): qk/W k - ~ - r*kl3= 0 i.e. qk = wk(~+ rktl3)
LetW=~WkSOthat(x
)
=
1~
WkX k
k
Then Q = ~ qk = Wa+ ( r ) t WII k
P = ~ rkqk = (r> W~+ ( r r t ) Wfl k
• W ~ = Q - (r)* Wfl; ~ Wk
= [(rr*) - (r) (r)*] - 1(p _ Q (r)) ---7 - 1 ( p _ Q ( r > )
.. qk = ~ { Q + (rk -- (r))t7- l ( p _ Q(r))}.
Quantum calculations on proteins
15
Taking ( r ) to be the origin, and choosing coordinates to diagonalize ~,- 1, then
Wk qk= W
Q-r~
0
fl-1
0
0
0
?-t
P'
•
If along direction "a" there are groups nearby (with correspondingly high weights) the dipole moment can be constructed by placing small charges upon these, and a is large. In the converse case, when large charges are needed in a particular direction, a can be small, and the matrix ~ can be almost singular. This normally arises along directions perpendicular to a locally planar molecule, when one would expect P' to be small, so it is better to neglect the extra charges than to put large charges on other atoms in an attempt to model this effect. More reasonable higher order multipoles should also be produced by this artifice. However, putting s - I = 0 when a ~ max(a, fl, 7) produces a situation unstable with respect to small geometrical changes out of planarity, and it is better to say: a - 1~ ( a + A(? + A))- 1 where ~ = max (~, fl, y) and A is small and positive. Thole and van Duijnen found this to stabilize the procedure numerically without harming the results. For the weighting function, they chose a form with the same shape and width as the charge density associated with the atom. This should then model it as well as possible, and for a basis which does not include functions ranging over more than two atomic centres they use: wk = exp{ - I r k -r,.,I 2/d2. } where rmn= ~(rm+ rn) and dmn= fir m- r~l. c5is chosen to model the exact charge distribution as accurately as possible. In fact, 6 = 1 seemed to be the best choice. Thole and van Duijnen applied their method to various small molecules, and the results are generally better than the corresponding potential from Mulliken population analysis by a factor of 2-3 in the rms deviation from the full quantum value. On application to the fragments of a larger molecule, this method should be the best available for tractable calculations of the electric field. An example of a simple solvation model using point charges is given by Noell and Morokuma (1975). They treated the hydration of LiF and the H 3 N . . . H F systems by choosing the point charges so as to optimally fit the ab initio calculations they had done on, for example, Li(H20),, complexes. They showed that this approach produced the expected drastic stabilization of the system H3NH • • • F over the uncharged form. Charges can also be fitted to some property, such as the electrostatic potential calculated over a grid. There have been many calculations of this type (references are to be found in the paper of Singh and Kollman, 1984). Although the results reproduce the chosen property well (not surprisingly), they are less concerned with a representation of the quantum mechanical wavefunction than the techniques discussed above, and are generally also far more expensive in computing resources. (b) The soh,aton model Kiopman's (! 967) solvaton model is reminiscent of the "electrostatic images" method of solving problems. A "'soivaton" is a charge associated with a nucleus, and is designed to mimic the eff~,ct of the solvent without physically modelling it. Klopman chose the solvaton charges {qs} as the negative of the Mulliken charge of the atom with which it is associated. The interaction between the solvatons is set to zero, because they are supposed to already represent the full effect of the solvent. The solvaton (S)-atom (A) interaction is: EAS-- 2 r
16
M . L . J . DRUMMOND
which is designed to reproduce the Born solvation energy of a charge, and r = )'van der, Waals radius of A if S is associated with A ~the AA distance if S is associated with A. Germer (1974a) applied this method to quantum chemistry, by including the interaction energy in the Hamiltonian, using the explicit formula for the Mulliken charges. This produces non-standard integrals in the course of the calculation, which have to be specially evaluated (Germer, 1974b). Constanciel and Tapia (1978) develop this idea into the "virtual charge model". They do retain solvaton-solvaton interactions, as modelling the energy needed to polarize the medium in the first place. They also show that the theory as it stands is internally inconsistent, in that it cannot correctly predict both the energy of the solute in the field of the polarized solvent, and the energy of the whole system. They choose the latter, and so the Born solvation energy comes out incorrectly. Lamborelle and Tapia (1979) use the virtual charge density operator:
~(a) =ft~)[~ Z~6tR-R~)- y~ 6(R-r,)] i
{R~} are nuclear coordinates {ri} are electronic coordinates. Then the reaction field potential is: f'(R) = f d R ' (Ol~v(R)ltP)
3
IR-R'l
The interaction energy is: z,o, = E' fl
j=l
where the first sum contains the self-energy terms replaced by
(a~)van der Waals The perturbed Hamiltonian is then H + ~nt, leading to a non-linear Schrodinger equation. The "dielectric constant" e is treated as a variable parameter in their work in order to optimize the results and account for the basis set dependence that they found for small amino acid molecules. Constanciel and Contreras (1984) have shown that, by including "desolvation effects" (i.e. generalizing the Born solvation formula appropriately) it is possible to remove the inconsistency alluded to above. Even so, the model fails to give one any physical intuition into the processes occurring in solvation; it is merely a computational scheme for estimating their values. (c) Electrostatic models including polarizability A pure point charge representation ignores the mutual effects that the solute (in a general sense, taken to mean the "quantum" part) and the solvent (generalized environment) have on one another due to their mutually induced dipolar (and other) fields. Mantione and Daudley (1970) used Claverie's theory (Huron and Claverie, 1969; reviewed in Claverie, 1978) of intermolecular interactions to calculate the energy of rigid spheres packed as closely as possible, in minimum energy conformations, around the solute. This meant calculating the interaction from bond polarizabilities, and semi-empirical dispersion and short range repulsion terms. Then the energy obtained is used in evaluating the solvent's effect upon excited electronic states of the solute; in their case, benzene. The method is expected to be best for non-polar solvents.
Quantum calculationson proteins
17
The microscopic electrostatic model of Warshel and Levitt (1976) treated a given solute as an array of polarizable points and charges such that, at a point i:
Iti =°tiEi; rlj = ri -- rj _ Qrj~
., . [ - Y
Itir"'l
These equations are solved self-consistently for the dipole moments and charges. Warshel and Levitt considered that, to make the treatment more compact, they could write Ei~ ~
ajrjl
J e(r)lrji I 3
with e(r) the same V{i,j}
and the value g of e(r) was 1.36 for lysozyme. When e(r) was replaced by g, only minor changes were found (qualitatively), and they took this value for simplicity. To include the solvent, they put "waters" on the points ofa 3 A cubic lattice and fitted them around the solute using an empirical Lennard-Jones potential. Once the optimum geometry had been found in this way, solutions were once again found self-consistently via: 1
Ek
I t ' = # ° { c o t h x k - - ~ } IEkl ; x k -
C#olEkl
kT
#o is "related to" the permanent dipole moment of water, and C is an empirical parameter designed to estimate the waters' resistance to reorientation, due to hydrogen-bonding. Another dielectric constant is defined for the solvent, and found to vary roughly as: e(r)'-, 1 + (r/,~). Inserting this value gives a rough indication of the fields, without the need to go through the self-consistency iterations every time. The calculation can be repeated with a substrate fitted in place by energy minimization with respect to all its coordinates (Lifson, 1982). The binding energy can be calculated, and the field in the active site can be obtained for use in quantum chemistry. The methods weaknesses are that it incorporates so many empirical fit factors, and makes rather sweeping approximations with the dielectric constant. The fields are, however, fully self-consistent, when treated without the rough e substitutions. The inaccuracies involved in the rather simple way that the water molecules were located need to be clarified. Locating the solvent molecules is always a problem in discrete solvation models. Warshel subsequently upgraded the model, as follows. His surface-constrained soft sphere dipoles model (SCSSD) (Warshel 1978b, 1979) considers the solvent to consist of deformable spheres with point dipoles at their centres. These interact via semi-empirical van der Waals potentials and dipole~lipole coupling, both with each other and with the solute. The system is contained in a large cavity in a dielectric continuum, and the surface dipoles are constrained to have orientations characteristic of the bulk solvent, as they are at the end of a Monte-Carlo run in the absence of the solute. However, the number of empirical parameters is still high. A linear term is introduced into the van der Waals potential to try to remove "excessive dimerization". The entropic contribution to the Gibbs free energy (calculated from the partition function of the system) is introduced into the enthalpy as an empirical scaling factor to reduce the computation time and the complexity. This all makes the theoretical value of the procedure somewhat questionable. Tapia and Goscinski (1975), in their average reaction field theory, use a non-linear Schrodinger equation; the Hamiltonian is: H = H[~P] = n o - Ho: the isolated solute Hamiltonian It: the total dipole moment operator M: the total dipole moment JPB 4 7 , 1 - B
ItTgm
=
H o - ItT_g<~lit J~'>
(6)
18
M.L.J. DRUMMOND
W: the solute's molecular wavefunction ._g:the solute-solvent coupling tensor, which in the case of an Onsager cavity of radius a is:
go = 2(e- 1)a - 3(2e + 1)- l~ij. The rationale behind (6) is that the interaction is between the average reaction field, gM._M_ and the dipole moment producing it. To see the form of g: If H' is the interaction (perturbation) Hamiltonian, and IS> is the solvent's wavefunction, then: N
=Itr~ p=l
N: number of solvent molecules about the vth solute molecule.
(Tvp)O= ViViIR,p] - a: the dipole propagator tensor Let = - jt~Ta(q, Q)
q/Q represents the solute/solvent configuration. If M is the total dipole moment of the solvent molecules, and at their polarizability, then
Taking the average over all configurations of the solvent gives the mean field
-
(7)
The first term in (7) is the reaction field due to the interaction between the solute and the permanent dipoles of the solvent. It is zero in an isotropic environment, and reflects local ordering. It is also associated with the orientation polarization, and so is written as glM. The second term incorporates the dipoles induced in the solvent due,to the permanent dipole moment of the solute. It is associated with the electronic polarization and, assuming that g and M are uncorrelated, it may be expressed as _g2M. Defining g = g l + g 2 , they recover (6), and find that g depends upon both the bulk macroscopic properties of the solvent, and upon its local structure around the solute. Thus they have a general way of calculating g from the molecular structure of the solvent. However, in the actual calculations (implemented in the CNDO/2 formalism), an isotropic g is taken:
go = g6o g is taken as a parameter, to incorporate errors and approximations (hopefully), and to investigate the effect of different solute-solvent coupling strengths. Furthermore: (i) The novel form of the Schrodinger equation necessitates the choice of a different variational principle for the calculation. This property was investigated by Sanhueza et al. (1979). (ii) The field of the solute is approximated only by a dipole field. This, as we have already seen, is a poor approximation for intermolecular interactions. Further work to alleviate this problem will be discussed below. (iii) Only homogeneous fields are dealt with. In the extension to inhomogeneous fields, Tapia and Johannin (1981) used dipole and polarization density functions to build up a more general reaction field. This was considered to be induced by a point charge representation of the protein backbone and atomic/group
Quantum calculationson proteins
19
polarizabilities to take into account induction effects. The tensor g was then modelled by the first term in its expansion (7) as N
p=l
so that the reaction field at a point v is R(v) = g*M(v) where M(v) is the polarization source present at that point. Higher order effects were not estimated. We now have one g tensor for every atomic centre in the quantum region, incorporating a great deal of information about the detailed structure and properties of the region outside, as they influence the active site. The method was applied to liver alcohol dehydrogenase (LADH), in which an NAD + and Zn z + were modelled by appropriate point charges. Each g* turned out to be almost diagonal, and N
g - vE =l
3
Eg /N i=1
turned out to be 0.0022, almost exactly the value they had chosen for it as a purely parametric quantity. The final result, as usual, was that the charged state of the active site residues was preferentially stabilized with respect to the uncharged form (see Fig. 2), and the presence of the backbone charges, the Zn 2 +, and the NAD ÷ all contributed to this effect. (a)
(b)
Ser I • H ~'otH Water
O'~H
Ser I
" ~N._ H I
Hiz
I
.
" H•O
" H~,.~H
H~O~ Water
HIz
FIG. 2. Environmental effects on the active site of LADH. Inclusion ofinhomogcneous fields from the protein, as well as from bound N A D + and Zn 2 + (Tapia and Johannin, 1981) decreases the stability of the in v a c u o uncharged form (a) with respect to the charged form (b). The latter form has a highly nucleophilic hydroxide ion, and is thus a more potent catalyst for the reaction mediated.
Tapia et al. also investigated the effects of inclusion of the environment on second-order effects such as London-van der Waals interactions, by including it as a third order perturbation on the total energy (Tapia et al., 1978a). They found that the predicted dependence upon the angle between the reaction field direction and the dipole moment of the isolated system agreed with calculations that had been done on formaldehyde--water and water dimer and trimer systems, by supermolecule methods• The inhomogeneous self-consistent reaction field theory for protein electrostatic effects was applied to the reaction mechanism of serine proteinases in a series of more recent papers (Longo et al., 1985; Stamato et al., 1985; Tapia et al., 1985). The environment was modelled with the charges of Poland and Sheraga (1967) on the atoms of the protein. Only backbone hydrogens were located explicitly; for the rest, "combined atom charges" were used at the sites of the crystallographically observed side-chain atoms. This approximation reduces the number of atoms that need to be taken into account, and avoids the problem ofthe absence of hydrogen atom coordinates in X-ray crystallographic datasets. Acidic or basic side-chains could be left fully, partially, or completely uncharged, and so it was possible to mimic the effect of solvent screening of the fields due to these charges, pH-dependent effects, or the presence of counterions. This is obviously not quantitative (because one never knows by how much to reduce any given charge), but it was hoped that it would provide useful information on general trends, and on the importance (or otherwise) of the effects. The final result was (as one would qualitatively expect), that all the additional effects (electrostatic and inductive
20
M . L . J . DRUMMOND
effects in the quantum model) tended to favour the charge-separated (single proton-transfer) state in this enzyme. There was no attempt made to allow inductive changes in the bulk of the protein. Apart from this work, g was taken as a parameter in investigations of hydrogen-bonded systems. For example: A water dimer (Tapia et al., 1975) Models of enzymic proton transfer systems (Tapia and Poulain, 1977; Tapia et al., 1978b; Tapia and Sanhueza, 1978) Simple molecules (using MINDO/3) useful in modelling enzyme catalysis (Tapia and Silvi, 1980). In all cases the result is much the same. The presence of the solvent stabilizes preferentially the dipolar species with respect to the uncharged one. One criticism of the work is that it was done in semiempirical frameworks. Such calculations are unable accurately to model charged systems because they do not have the flexibility needed to allow charge to move across the system (Carsky and Urban, 1980). This will be seen further below. Tapia et al. (1980) accounted for this effect by using further empirical parameters--the experimental proton affinities. A comparison between semiempirical and ab initio results was carried out for the water dimer (Sanhueza and Tapia, 1982). These calculations bore out the above observations, although for computing trends the method seemed to be reliable. This area of work was reviewed by Simkin and Sheikhet (1983). Subsequently, the method was applied to a cavity model for the rhodopsin system using a complete multipole expansion of the potential reminiscent of Rivail and Rinaldi's (1976) method, with classical point charges representing important charged groups (Raudino et al.,1983; Zuccarello et al., 1984a). The absorption bands of the covalently bound retinal molecule are correctly predicted to be red shifted, but the spectral broadening is seriously underestimated by this method. They also examined effects on electronic transitions in diazines (Zuccarello et al., 1984b) including some water molecules as point charges and the rest as a continuum outside a spherical cavity. Although dependent upon the cavity, this approach lifts the second objection to the work of Tapia et al., with the important result that multipoles of up to order 20 were needed for convergence. Those after the dipole term were not negligible, and it is therefore not a satisfactory approximation to neglect them in the calculations. The most general approach to a variety of different solvent effects was presented by Tapia (1982), The various different self-consistent reaction field or instantaneous response models can all be derived as special cases of this approach, and some degree of approximation is always needed in order to produce a theory which is practically applicable. The fundamental (and essential) unifying assumption of all the different theories is that it is possible to divide the molecular system into a "quantum region" and a "classical region"; and then allow them to interact with one another in order to model the effects of the environment upon the electronic behaviour of the active site region. Just how this is realized in practice leads to the different model approaches outlined in this review. The need to incorporate the electrostatic field of the protein backbone in most clearly seen in the case of ~-helices, because the dipole moments of the individual peptide groups are almost parallel. This leads to an appreciable overall dipole moment (Wada, 1976), which can be modelled reasonably well as point charges of _ ~e located at the ends of the helix (Sheridan and Allen, 1980). The possible roles of these ~-helix dipoles were considered by Hol et al. (1978) for the (common) case that the N-terminus of an ~-helix is found at or near substrate binding sites. Assuming that solvent effects do not reduce this field too drastically, the possible effects were: (i) Facilitating the binding of negatively charged substrates. (ii) Stabilizing charged groups important in catalytic activity. (iii) [Especially for several parallel helices] attracting substrates from some distance and guiding them into the active site. This last effect is the one most likely to be damped out by the solvent.
Quantum calculations on proteins
21
Ash-175
I
o.~C\
H • S--H
•
•
•
N H
•
NH 2
"
i---- H Hisq59 FIG. 3. Papain active site, as used in the ab initio work of van Duijnen et al. (1979). For the computations, the three groups are broken from the amino acid backbone, with bonds being filled by hydrogen atoms to satisfy valence requirements.
Point (ii) was considered further by van Duijnen et al. (1979) in their study of the active site of papain, which has an appropriately placed ~-helix. The active site was modelled as shown in Fig. 3. The ab initio molecular orbital calculations were done both with and without the helix, and in the former case a 10 kcal/mol relative stabilization of the ion pair structure was obtained. The helix was modelled by "transferable atomic charges" obtained by Mulliken population analysis of small subunits, and tabulated (Poland and Scheraga, 1967). Use of these renders the results qualitative at best, but the effect is so pronounced that van Duijnen et al. conclude: "no serious model for the active site of papain should exclude the helix". A double zeta basis set was used, because charged states require flexible bases to describe the shift of charge. Indeed, in further work on the basis set dependence of the results, van Duijnen et al. (1980) found that a minimal (STO-3G) basis set calculation did not produce the second minimum in the potential for the displacement of the hydrogen from the cysteine sulphur to the histidine nitrogen. Thus it did not predict at all the correct charge state of the system. These calculations did not include any environmental effects other than the ~-helix point charges--i.e, no polarization of the environment• The extension of the calculations to incorporate these effects was carried out by Thole and others. The solvent effect theory due to Thole and van Duijnen (1980) differs fundamentally from that due to Tapia et al. Instead of considering the average reaction field's effect upon the electronic system, the environment is considered to respond instantaneously to the electric field of that system, and so the method is called the "direct reaction field formulation". Clearly, the Hamiltonian will be linear and, as in the case of McCreery et al.'s continuum electrostatic theory, can be expected to hold for electronic/atomic polarizability only. It will also include terms resembling dispersion interactions. The reaction field is introduced directly into the total Hamiltonian, and the total energy is simply the expectation value of this operator, as usual. The solvent-solvent interactions are treated first, in a self-consistent way. The solvent (environment) is modelled as a set of points {p} with isotropic point polarizabilities %:
mp=Otp{fp + ~pf(m,, R)} fp is the original electric field at p f(m, p) is the polarization contribution to the field at p due to a dipole induced at q. This set of equations can be solved self-consistently: m = [ 1 - A'Bq - tA'f where: 0
m=
m2
;A_'=
_=z
-..
T~
; f=
;_B=
21 0
T~
_T23
!113
= dipole propagator.
" ' "
22
DRUMMOND
M.L.J.
The solution depends only upon the choice of the points {p} and their polarizabilities, and the solvent will be treated self-consistently by the use of the effective polarizability tensor A = [I'-A'B__I- 1A'.
Because __Ais symmetric: A= LLt where _Lis a lower triangular matrix. The energy needed to polarize the solvent is then:
Epo,=½~ mtv_~ imp. P
The total solute-solvent energy is E = EQ + ERF + E p°l
where E Qis the energy of the quantum mechanically treated region; E RFis the reaction field (interaction) energy, and
Epol= _½ERF. Tapia's method consists of the construction of the operators p-r p-r~ {P - [p--r[ and f~ = ~c Zc ]P-rcl 3
which are the electronic and nuclear contributions to the field operators relative to the point p. The field produced is then:
f,= (t,) + f~,. Instead of this,Thole introduces the solventeffectinto the solute'sHamiltonian directly: id
i
The ~ now refer to the electronic coordinates directly, rather than to an arbitrary point in space. Using A = _I~_t, then: =
o_½
+ ij
i
Solving H°fW = Ecl~P,using orthonormal doubly-occupiedmolecular orbitalsli)gives,at the Hartree-Fock level: Ec, = (~,olHOlg,o) - ~ (iJg*(1)g(1)[i)+2g t" ~ (ilg(1)li) i
i
- ~ (ilg(1)li)t(jlg(1)lj) + ~ (ilg(1)lj)t(iJg(1)Jj). i.j
i.j
To implement this procedure in the LCAO approximation, they introduce the density matrix 2R_ such that the charge distribution is:
p(r)= -- 2xtR_x + ~ Z~ ~(r- r~) c
and Z are the basis set functions. Then with (][ilAIXj> = Aij
they find, for the total energy:
ET = E 2R,i{ho- ~(gtg)o + g"tgii} i.j
+ Z i.j.k.l
c,d
(2RoRk,- RikRj,){(/j[k/)- gtgk,}
Quantum calculationson proteins
23
where h~j are the matrix elements of the unperturbed single-particle Hamiltonian, and (ijlkl) are the two-electron repulsion/exchange integrals. In principle, this is all that is needed in order to be able to use the method on a real system. However, the sheer number of quantities that require storage can make it impractical by the cost of the computing, and the storage in memory or on disc/tape of all the quantities. For example, if there are P points {p} considered as the solvent: P
g~',, i j ~ k l --- fti j Af - - kl = E ~Apqfqk,
(8)
P,q
with
and since A_is 3P x 3P, g~igkz requires the manipulation of 3P(3P+ ½n(n + 1)) numbers in (8), which for any reasonable number of solvent points rapidly becomes unwieldy in the extreme. Thole and van Duijnen therefore approximate the various matrix elements is a way that is consistent with the interacting dipole model for the solvent, reduces the number of integrals, and avoids explicit reference to the classical part of the system during the quantum calculations. The approximation made is:
p - r --- ~-_]_p(r-a) Ip-rl 3 where the operator is expanded about origin a. a
_
vp
p-a Ip_al3.
Relative to a, r °Plj = t g +
(9)
The coefficients in this expression can be calculated separately; the remaining parts are overlap and dipole moment integrals that are routinely available during ordinary quantum calculations. In fact, if only a sinole origin is taken, the method is no more accurate than a dipole field. If the nuclear coordinates {rc} are taken as expansion centres for the integrals, interpolating between nuclei or choosing the nearest one for two-centre integrals, a much more realistic field representation is produced. This takes better account of the true charge distribution than the simple dipole field. Moreover, all the solvent terms in (9) can be calculated involving only overlap, dipole and quadrupole moment operators. Only the last of these are not routinely available in SCF packages, and are easily calculated from the wavefunction. These terms can then be added to the relevant parts of the unperturbed Hamiltonian and stored with them, thus requiring virtually no additional memory space in the quantum part of the calculation. Clearly this particular application does represent a step forward in the evaluation of solvent effects, even limited as it is to electronic polarization. Because the Schrodinger equation is linear, there is no need to alter the variational principle used in the SCF calculations. Thole et al. (1983) applied the method to cysteinyl proteases (e.g. papain, actinidin). They incorporated the "active site helix" as point charges on its backbone, and the surrounding protein as point polarizabilities. On experimenting with the radius of the zone
24
M.L.J. DRUMMOND
incorporated in this way, about 8 A seemed to be sufficient to determine the relative stabilities of different conformations. Increasing this to 12 A only altered the absolute values of the energies calculated. An 8 A radius corresponds to the inclusion of 311 atoms in the calculation; 12 A corresponds to 1047. The results gave the magnitude of the stabilizing effects as about a factor of ten times less than Tapia's calculation, and as of less importance then the field of the helix. Even this field may have been underestimated, as they had taken the dipole moment of an individual peptide residue as 3.5D. There is evidence, both experimental (Wada, 1976) and theoretical (via minimal basis set quantum calculations: van Duijnen and Thole, 1982; and via classical methods: Drummond, unpublished results) that cooperative effects in long helices (i.e. over 20 residues) increase this to about 5D. Thole and van Duijnen appeal to this to explain the fact that the ion-pair configuration is still slightly less stable than the neutral form in their calculations. This is difficult to estimate, however, because the effect of the solvent (neglected in these studies) will be to reduce the field of the helix once again, and the detailed balance between these two effects has not yet been investigated. The degree to which dispersion energy is included was investigated by Thole and van Duijnen (1982), and they found that it was overestimated by a factor of about four. Rather than including the interaction between excited states of solute and solvent, the latter is treated classically, and so only the excited states of the former appear. To prevent the induced dipoles from polarizing one another to infinity, Thole (1981) introduced a modified interaction which mimics, at short distances, two overlapping dipole distributions rather than point dipoles. The discontinuity that arises when the two distributions begin to overlap is carefully chosen so as to minimize its influence upon physical quantities calculated by the model. This theory is an adaptation of Applequist's interacting point dipole model (Applequist et al., 1972; Applequist, 1977). More recently, Birge et al. (1983) introduced a modification so that the discontinuity is removed and the dipole distribution is not a sharp sphere, but falls off exponentially at the same rate as the atomic orbitals would in the parallel quantum calculation. The exponents to use for any given set of atoms are available in tables. Possible improvements of the calculation scheme of Thole and van Duijnen are (Thole, 1982): (i) Including point charges and polarizabilities on all solvent groups. (ii) Including orientational polarization, and water in general. The electronic polarization of the water could be naturally incorporated if coordinates were known for the water molecules. The possibility of obtaining this information by Monte Carlo techniques is now under investigation (van Duijnen, personal communication). A quite different approach to the problem of the electrostatic field in the active site of papain is provided by Lavery et al. (1983). Although no explicitly quantum calculations were done on the active site per se, this application presents only computational problems. The idea basically follows the standard point charge models, except that dipole and quadrupole moments are also included, in a many-centre multipole expansion, where the centres are not only nuclei, but also points between all bound and unbound atom pairs (Port and Pullman, 1973). In previous calculations the method had been shown to reproduce the ab initio results for distances down to 2 A for nucleotide groups (Bonaccorsi et al., 1972; Pullman and Perahia, 1978), and tbree-membered ring molecules (Ghio and Tomasi, 1973). The multipole moments are obtained from small portions of molecules by SCF minimal basis set ab initio molecular orbital calculations. In practice, C~:~ ' and C~-C p bonds are broken in proteins, and the open bonds filled with hydrogen atoms, to avoid disrupting the electron distribution too badly. The overall procedure is subject to four drawbacks: (i) Basis set size. Minimal basis sets give good agreement with experiment for one electron properties, but this is largely fortuitous (Carsky and Urban, 1980). Charged groups (e.g. side chains) will not be accurately represented, for reasons explained previously. For the uncharged parts, the problem is not so severe. (ii) Following on from the last point; since the hydrogen atom in the active site is not moved to the minimum energy position, whereupon there will be different fields due to the
Quantum calculationson proteins
25
separation of the charge, the results of Lavery et al., only show that if the hydrogen atom was in the uncharged configuration, it would tend to begin to move away from it. (iii) The fragmentation method. This is bound to lead to inaccuracies, which can only be tested by splitting a molecule which can itself be fully computed ab initio. This was done for, among others, formamide-water systems (Alagona et al., 1973) with encouraging results. (iv) Transferability. This is reasonable for environments which are not highly polar or asymmetric, since the original calculations were done in vacuo. Polarizability effects are neglected between fragments, with no measure of how important such effects are likely to be in practice. The absence of both polarizability and cooperativity means that the results will not be accurate in, for example, the s-helix, where cooperativity is important. In regions of high field strength the electronic distortion is expected to be sizeable. The supermolecule technique is itself only applicable to small solutes with a few water molecules in a limited number of configurations, when applied to solvation studies. This is because of the scale of the computational effort required. (Pullman 1976; Pullman et al., 1974; Alagona et al., 1972). The opportunity has already been taken to investigate all the basic building blocks of biologically significant macromolecules, and this is the rationale behind most of the work discussed here. As far as papain is concerned, the significance of counterions was investigated by placing them at stoichiometrically and symmetrically probable sites. They were modelled by pure monopoles. All the side chains considered to be ionized at physiological pH were given unit charges. Although these had an effect on the overall potential, they did not appreciably influence the active site, because they were clustered on the far side of the protein away from it. The usual charge-transfer complex stabilizing effect was found, but in this treatment the s-helix produced less than half of the field. The rest came principally from neighbouring sidechains, especially a charged aspartate (number 158) which exists in close proximity to the active site (Fig. 4). This effect was large (although different quantitively) regardless of whether this group had a counterion associated with it or not. The precise values of the different effects are open to question on the grounds that if the fields are large enough to make the charged state more stable than the uncharged conformation, the transferability condition breaks down. However, Thole et al. did not include point charges, so the effect of the charged aspartate side chain is absent from their model. This serves to emphasize the importance of including at least point charges in their Asp-158 O.~..C/CH3
0
Cys-25 ?~'CH~ H
CH 3 ~C,. I~1~ His-159 II N~,CH HC - H
O•C ~CH~
NH~ Asa -175 FIG.4. Molecularmodelof the papain activesite(comparewith FIG.3) showingthe relativeposition of asp-148. This group exerts an important influenceupon the active site, accordingto Laveryet al. (1983), regardlessof whethera counterionis associatedwith it or not. The effectis obviouslydifferent quantitativelyin the two cases.
26
M . L . J . DRUMMOND
model and, for consistency, induced point dipoles should also be included. That Pullman et al. find it necessary to include quadrupoles on twice as many centres gives some idea of how slowly this expansion converges. (d) Conclusion The discrete models clearly hold out a hope of a reasonably tractable and coherent way of incorporating the protein molecular environment in (for example) quantum chemical calculations on the active sites of enzyme molecules. The incorporation of aqueous effects on a molecular representation would allow automatically for dielectric saturation on a local scale. The potential role ofcounterions would be difficult to evaluate using these procedures. Expensive computational schemes, like Monte Carlo or even molecular mechanics, may provide a way forward. For a description of the detailed interior field of the protein, the discrete approach is essential. It exposes the role of a-helices and other groups not chemically implicated in the reaction mechanism. Whether the approach's more accurate treatment of the liquid solvent outweighs the difficulties involved remains to be seen. 3. Hybrid Models
Of course, it would be simplistic to assert that all the methods of including solvent effects are either purely "continuum" entirely "discrete". We have already seen the example of Warshel's work (see previous section) in which the soft spheres are surrounded by a polarizable continuum. McCreery et al. (1976a) considered that ideally their method should include a large enough sphere of explicitly "quantum mechanical" material that the dependence of the results upon the cavity radius could be minimized as far as possible. This process is limited so stringently by the expense of supermolecule calculations that it is impossible in real terms. Yomosa (1973) considered a quantum system surrounded by protein treated as point charges, and solvent treated as a polarizable continuum. This was a first attempt at using the strengths of the different approaches on the different regions of the environment. Modifications of the approach to include further important effects (as outlined separately in the two preceding sections) would appear attractive. Huron and Claverie (1974a) used an idea similar to Yomosa's, except that the electrostatic field of the molecule outside the sphere was expanded in spherical harmonics about many centres (the nuclei) rather than about the centre of the cavity. This cavity, it was hoped, would not ultimately be spherical, nor even ellipsoidal, but the van der Waals envelope of the molecule. The many-centre expansion ensures faster convergence and better accuracy for the potential outside, whereas the internal potential was shown to be adequately described by a single-centre expansion. The application of the method then became an exercise in numerical analysis and computing. There is no application to quantum mechanics made, and such an application would be extremely difficult. To counteract the deficiencies (hydrogen bonding and saturation) of the continuum model, Beveridge and Schnuelle (1974) included the first hydration shell explicitly in their supermolecule-continuum model. Here the water is located by a Monte-Carlo process. Outside the first hydration shell, the solvent is approximated by a continuum with bulk properties, and the energies are calculated (in principle) quantum mechanically, using Onsager's model. In practice, approximate pair-additive potentials are used, in order to be able to explore a wider range of configurations. This approximation brings its own pitfalls (Finney et al., 1983). The use of pair additive potentials, checked for accuracy at appropriate points by full quantum calculations, can be extended to larger molecules if the potentials are trustworthy and transferable. A definition of classes of atoms and groups within which potentials are considered to be transferable was developed by Clementi (1980). In this classification, a group is assigned to a class not merely on the basis of its atomic content, but also by considering its immediately environment, and the hybridization state of the constituent atoms. For example the carbon atoms in CH 4 and HCOOH would be assigned to different classes, and potential functions would be fitted separately for each class. The result is a large group of potentials which it is hoped will be of use in locating waters around molecules by Monte-Carlo
Quantum calculations on proteins
27
techniques. Nevertheless, the extent to which they are applicable to water-protein interactions is uncertain, and their failure to account for non pair-additive effects renders any results obtained in this way of questionable validity. V. C O N C L U S I O N The dual nature of the environment around the active site (water/protein) calls for a twofold attack for any rigorous calculation of its effects. Because of the difficulty involved in locating sufficient individual water molecules to include the bulk of their (orientational/electronic) polarization contribution, a continuum approach is indicated. The orientational component of the polarizability (most important for water) necessitates use of reaction fields averaged over timescales long compared with electronic fluctuations, because of the long relaxation time, despite the consequent non-linearity of the Schrodinger equation. On the other hand, the protein environment must be modelled so as to include detailed structural effects. This calls for the representation of atoms or small groups individually, via a monopole approximation to their true electrostatic field. Because their orientation is not free to follow even slowly varying fields, their entire polarization contribution can respond allbut-instantaneously to the electronic motion, and so should be included as a direct (rather than averaged) reaction field in the quantum region. There is, as yet, no fully self-consistent way of marrying these various effects (solvent-protein; dynamic-static) together into a unified theory of solvent influence on enzymic catalysis. When such a unification has been achieved, we may expect to see quantitative predictions of those biochemical quantities which have hitherto been purely empirical; elucidation of detailed catalytic pathways; and the prediction of (unobservable) transition state structures, leading to the design of more effective, more specific inhibitor (drug) molecules. Such developments are already within the reach of modern computational methods. ACKNOWLEDGEMENTS I would like to thank my supervisors, Professor T. L. Blundell and Dr. J. L. Finney for their patience and helpful advice in the preparation of this review, and also the comments of Dr. P. Th. van Duijnen and Dr. O. Tapia on the manuscript. I also acknowledge the Science and Engineering Research Council's financial support.
REFERENCES AaBoTT, J. A. and BOLTON,H. C. (1952) Trans. Farad. Soc. 48, 422-430. ALAGONA,G., CIMIRAGLIA,R., SCROCCO,E. and TOMASl,J. (1972) Theoret. Chim. Acta (Berl) (25), 103-119. ALAGONA,G., PULLMAN,A. SCROCCO,E. and TOMASI,J. (1973) Int. J. Peptide Protein Res. 5, 251-259. ANGVA~q,J., SURJAN,P. R. and NARAY-SZABO,G. (1983) Stud. Biophys. 94, 221-224. APPLEQUlST,J. (1977) Accts Chem. Res. 10, 79-85. APPLEQUiST,J., CARL, J. R. and FUNG, K.-K. (1972) J, Am. Chem. Soc. 94, 2952-2960. BADER, R. F. W., NGUYEN-DANG,T. T. and TAL, Y. (1979) J. chem. Phys. 70, 4316~329. BARRETT,A. N. (1983) J molec. Graphics 1, 71-74. BARRIOL,J. and WEISSaECKER,A. (1964) Comp. Rend. 259, 2831-2833. BEENS, H. and WELLER,A. (1969) Chem. Phys. Lett. 3, 666-668. BEVERIDGE,D. L., KELLY, M. M. and RADNA,R. J. (1974) J. Am. chem. Soc. 96, 3769-3778. BEVERIDGE,D. L. and SCHNUELLE,G. W. (1974) J. phys. Chem. 78, 2064-2069. BIRGE, R. R., SCHICK,G. A. and BOOAN, D. F. (1983) J. chem. Phys. 79, 2256-2264. BONACCORSI,R., CIMIRAGLIA,R., SCROCCO,E. and TOMASI,J. (1974) Theoret. Chim..4cta. (Berl.) 33, 97-103. BONACCORSl,R., C1MIRAGLIA,R. and TOMASl,J. 0984) J. tool. Structure (Theochem. 17) 107, 197-219. BONACCORSI,R., PULLMAN,A., SCROCCO,E. and TOMASl,J. (1972) Theoret. Chim. Acta (Berl.) 24, 51-60. BONNOR,W. B. (1961) Trans. Farad. Soc. 47, 1143-1152. Boo'rn, F. (1951) J. chem. Phys. 19, 391-394. BROER, R., VAN DUUNEN,P. TH. and NIEUWPOORT,W. C. (1976) Chem. Phys. Lett. 42, 525-529. BUCKINGHAM,A. D. (1953a) Australian J. Chem. 6, 93-103. BUCKINGHAM,A. D. (1953b) Trans. Farad. Soc. 49, 881-886. BUCKINGHAM, A. D. (1978) In Perspectives in Quantum Chemistry, Vol. H: lntermolecular Interactions: from Diatomics to Biopolymers (ed. B. PULLMAN),pp. 1--76. Wiley, New York. BUSCh, J. L., RAGHUVEER,K. S. and CI-mlSTOFFERSEN,R. E. (1976) In Environmental Effects on Molecular Structure and Properties (ed. B. PULLMAN),pp. 17-29. Reidel, Dordreeht.
28
M . L . J . DRUMMOND
CARSKY,P. and URBAN,M. (1980) Lecture Notes in Chemistry 16: Ab initio Calculations--Methods and Applications in Chemistry. Springer-Verlag, New York. CLAVERIE, P. (1978) In Intermolecular Interactions:from Diatomics to Biopolymers (ed. B. PULLMAN),pp. 69-305. Wiley, New York. CLEMENTI, E. (1980) Lecture Notes in Chemistry, 19: Computational Aspects for Large Chemical Systems. SpringerVerlag, New York. CONSTANCIEL,R. and CONTRERAS, R. (1984) Theoret. Chim. Act (Berl.) 65, 1-11. CONSTANCIEL,R. and TAPIA,O. (1978) Theoret. Chim. Acta (Berl.) 48, 75-86. DANIEL, V. V. (1967) Dielectric Relaxation, Academic Press, New York. DAVlS, T. D., CHRISTOFFERSEN, R. E. and MAGGIORA,G. M. (1975) J. Am. chem. Soc. 97, 1347-1354. DEAN, S. M. and RICHAROS, W. G. (1975) Nature (London) 256, 473-475. DEBYE, P. (1929) Polar Molecules, The Chemical Catalog Company. FINNEY, J. L. and GOODFELLOW, J. M. (1983) In Structure and Dynamics: Nucleic Acids and Proteins (eds. E. CLEMENTI and R. H. SARMA),Adenine Press, New York. FROHLICH, H. (1958) Theory of Dielectrics, Oxford University Press, Oxford. GERMER, H. A., Jr. (1974a) Theoret. Chim. Acta (Berl.) 34, 145-155. GERMER, H. A., Jr. (1974b) Theoret. Chim. Acta (Berl.) 35, 273-274. GHIO, C. and TOMASl,J. (1973) Theoret. Chim. Acta (Berl.) 30, 151-158. HALDANE, J, B. S. (1930) Enzymes, p. 182. Longman, Green and Co. HALL, G. G. (1973) Chem. Phys. Lett. 20, 501-503. HASTED, J. B. (1973) Aqueous Dielectrics, Chapman and Hall, London. HAYES, D. M. and KOLLMAN, P. A. (1976a)J. Am. chem. Soc. 98, 3335-3345. HAYES, O. M. and KOLLMAN, P. A. (1976b) J. Am. chem. Soc. 98, 7811-7816. HILL, N. E., VAUGHAN,W. E., PRICE, A. H. and DAVIES, M. (1969) Dielectric Properties and Molecular Behaviour, van Nostrand Reinhold, London. HURON, M.-J. and CLAVERIE,P. (1969) Chem. Phys. Lett. 4, 429-434; Erratum: Chem. Phys. Lett. 11, 152 (1971). HURON, M.-J. and CLAVERIE,P. (1974) J. phys. Chem. 78, 1853-1861. HYLTON, J., CHRISTOEFERSEN, R. E. and HALL, G. G. (1974) Chem. Phys. Lett. 24, 501-504. KHARKATS, YU. I., KORNYSHEV, A. A. and VOROTYNTSEV,M. A. (1976) J. Chem. Soc. Farad 2 72, 361-371. KIRKWOOD, J. G. (1934) J. chem. Phys. 2, 351-361. KIRKWOOD, J. G. (1939) 3". chem. Phys. 7, 911-919. KLEINER, M. and ELDER, M. (1975) Molec. Phys. 30, 357-372. KLOPMAN, G. (1967) Chem. Phys. Lett. 1,200-202. KOLLMAN, P. A. and HAYES, n. M. (1981) J. Am. chem. Soc. 103, 2955-2961. KRISHTALIK, L. I. (1980). J. Theor. Biol. 86, 757-771. LAIDLER, K. J. (1959) Can. J. Chem. 37, 138 147. LAMBORELLE, C. and TAPIA, O. (1979) Chem. Phys. 42, 25-40. LAVERY, R., PULLMAN,A. and WEN. Y. K. (1983) Int. J. Quantum Chem. 24, 353-371. LIESON, S. (1982) In Structural Molecular Bioloqy (eds. D. B. DAVIES,W. SAENGERand S. S. DANYLUK),pp. 359--385. Plenum, New York. LINOER, B. (1962) J. chem. Phys. 37, 963-966. LINOER, B. 0967) Adv. chem. Phys. 12, 225-282. LINDER, B. and HOERNSCHMEYER, D. (1964)J. chem. Phys. 40, 622-627. LONGO, E., STAMATO, F. M. L, G., FERREIRA, R. and TAPIA, O. (1985) J. theor. Biol. 112, 785-798. MCCREERY, J. H., CHRISTOFFERSEN, R. E. and HALL, G. G. (1976a) J. Am. chem. Soc. 98, 7191-7197. MCCREERY, J. H., CHRISTOFFERSEN, R. E. and HALL, G. G. (1976b) d. Am. chem. Soc. 98, 7198-7202. MCWEENY, R. and SUTCLIFFE, B. T. (I 969) Methods of Molecular Quantum Mechanics, Academic Press, London. MANTIONE, M. J. and DAUDEY, J. P. (1970) Chem. Phys. Lett. 6, 93-96. MAYER, I. (1983) Chem. Phys. Lett. 97, 270-274. MEHLER, E. L. (1977) J. chem. Phys. 67, 2728 2739. MEHLER, E. L. (1981) J. chem. Phys. 74, 6298 6306. MEHLER, E. L. and EICHELE, G. (1984) Biochemistry 23, 3887-3891. MIERTUS, S., SCROCCO, E. and TOMASl,J. (1981) Chem. Phys. 55, 117-129. MULLIKEN, R. S. (1955)J. chem. Phys. 23, 1833 1840. MURRELL, J. N., HARGET, A. J. (1972) Semi-Empirical Sel.f Consistent-Field Molecular Orbital Theory of Molecules, Wiley, New York. NARAY-SZABO, G. (1979) Int. J. Quantum Chem. 16, 265-272. NARAY-SzABO, G. (1984) J. Am. chem. Soc. 106, 4584-4589. NOELL, J. O. and MOROKUMA, K. (1975) Chem. Phys. Lett. 36, 465-469. ONSAGER, L. (1936) d. Am. chem. Soc. 58, 1486-1493. PAULING, L. (1946) Chem. Engng News 24, 1375-1383. PETrlIG, R. (1979) Dielectric and Electronic Properties of Biolo#ical Materials, Wiley, New York. POLAND, D. and SCHERAGA, H. A. (1967) Biochemistry 6, 3791-3800. POLITZER, P. and HARRIS, R. R. (1970) J. Am. chem. Soc. 92, 6451-6454. POPLE, et al. (Gaussian 76)(1978) [B1NKLEY, J. S., WHITESIDE, R. A., HARmARAN, P. C., SEEGER, R., POPLE, J. A., HEHRE, W. J., NEWTON, M. D.] QCPE I!, 368. POPLE, J. A. and BEVERIDGE,D. L. (1970) Approximate Molecular Orbital Theory, McGraw-Hill, New York. PORT, G. N. J. and PULLMAN, A. (1973) FEBS Lett. 31, 70-74. PULLMAN, A. (1976) In Em,ironmental Effects on Molecular Structure and Properties (ed. B. PULLMAN), pp. 1-15. Reidel, Dordrecht. PULLMAN, A., ALAGONA,G. A. and TOMASl,J. (1974) Theoret. Chim. Acta (Berl.) 33, 87-90. PULLMAN, A. and PERAHIA, D. (1978) Theoret. Chim. Acta (Berl.) 48, 29-36. RAUDINO, A., ZUCCARELLO, F. and BUEMI,G. (1983)J. Chem. Soc.; Farad. Trans. H 79, 1759-1770.
Quantum calculations on proteins
29
REES, D, C. (1980) J. molec. Biol. 141, 323-326. REIN, R. (1973) Adv. Quantum Chem. 7, 335-396. RaNALD], D. and RIVAm,J.-L. (1973) Theoret. Chim. Acta (Berl.) 32, 57-70. RaNALDI, D. and RlVA]L,J.-L. (1974) Theoret. Chim. Acta (Berl.) 32, 243-251. RIVAm, J.-L. and RXNALDI,D. (1976) Chem. Phys. 18, 233-242. SANHUEZA,J. E., TAPIA,O., LAIDLAW,W. G. and TRSlC, M. (t979) J. chem. Phys. 70, 3096-3098. SANHUEZA,J. E. and TAPIA,O. (1982) J. molec, struct. (Theochem.) 89, 131-146. SHERIDAN,R. P. and ALLEN, L. C. (1980) Biophys. Chem. 11, 133-136. SmPMAN, L. L. (1975) Chem. Phys. Lett. 31, 361-363. SIMKIN, B. YA. and SHEIKHET,I. I. (1983) J. molec. Liquids 27, 79-123. SINANOGLU,O. (1967) Adv. Chem. Phys. 12, 283-326. SINGH, U. C. and KOLLMAN,P. A. (1984)J. Computational Chem. 5, 129-145. STAMATO,F. M. L. G., LONGO,E., FERREIRA,R. and TAPIA, O. (t985) (in press). TAIT, A. D. and HALL, G. G. (1973) Theoret. Chim. Acta (Berl.) 31, 311-324. TAPtA, O. (1980) In Quantum Theory of Chemical Reactions (eds. R. DAUOEL,A. PULLMAN,L SALEMand A. VEILLARD),Vol. 2, pp. 25-72. Reidel, Dordrecht. TAPIA, O. (1982) In Molecular Interactions (eds. H. RATAJCZAKand W. J. ORVILLE-THOMAS),Vol. 3, pp. 47-117. Wiley, New York. TAP]A, O. and GOSCINSKI,O. (1975) Molec. Phys. 29, 1653-1661. TAP1A, O. and JOHANNIN,O. (1981) J. chem. Phys. 75, 3624-3635. TAPIA, O., LAMBORELLE,C. and JOHANNIN,G. (1980) Chem. Phys. Lett. 72, 334-341. TAPtA, O. and POtJLAtN, E. (1977) Int. J. Quantum Chem. 11, 473-484. TAP1A, O., POULAIN,E. and SUSSMAN,F. (1975) Chem. Phys. Left. 33, 65-70. TAPIA, O., POULAtN,E. and SUSSMAN,F. (1978a) Theoret. Chim. Acta (Berl.) 47, 171-174. TAPIA, O. and SANHUEZA,J. E. (1978) Biochem. biophys. Res. Commun. 81,336-343. TAPIA, O and Smv], B. (1980) J. phys. Chem. 84, 2646-2652. TAPIA, O., STAMATO,F. M. U G. and SMEYERS,Y. G. (1985)J. molec, struct. (Theochem.) 123, 67-84. TAPIA, O., SUSSMAN,F. and POULAIN,E. (1978b) J. Theor. Biol. 71, 49-72. THOLE, B. TH. (1981) Chem. Phys. 59, 341-350. THOLE, B. TH. (1982) Ph.D. thesis, University of Groningen, Netherlands. THOLE, B. TH. and VAN DUHNEN, P. TH. (1980) Theoret. Chim. Acta (Berl.) 55, 307-318. THOLE, B. TH. and VAN DUtJNEN,P. TH. (1982) Chem. Phys. 71, 211-220. THOLE, B. TH. and VAN DUtJNEN,P. TH. (1983) Theoret. Chim. Acta (Berl.) 63, 209 221. VAN DUUNEN,P. TH. and THOLE, B. TH. (1982) Biopolymers 21, 1749-1761. VANDUIJNEN,P. TH., THOLE,B. TH., BROER,R. and NIEUWPOORT,W. C. (1980) Int. J. Quantum Chem. 17, 651-671. VAN DUIJNEN,P.TH., THOLE, B.TH. and HOL, W. G. T. (1979) Biophys. Chem. 9, 273-280. WADA, A. (1976) Adv. Biophys. 9, 1-63, WARSHEL,A. (1978a) Proc. natn. Acad. Sci. U.S.A. 75, 5250-5254. WARSHEL,A. (1978b) Chem. Phys. Lett. 55, 454-458. WARSHEL,A. (1979) J. phys. Chem. 83, 1640-1652. WARSHEL,A. and LEVITT,M. (1976) J. molec. Biol. 103, 227-249. WARWICKER,J. (1983a) Unpublished results. WARWICKER,J. (1983b) Unpublished results. WARWlCKER,J. and WATSON,J. (1982) J. Molec. Biol. 157, 671~79. WILSON,J. N. (1939) Chem. Revs. 25, 377-406. YOMOSA,S. (1973) J. Phys. Soc. Jap. 35, 1738-1746. YOMOSA,S. and HASEGAWA,M. (1970) J. Phys. Soc. Jap. 29, 1329 1334. ZUCCARELLO,F., RAUDINO,A. and BUEMX,G. (1984a) J. molec. Struct, (Theochem. 16) 107, 215 220. ZUCCARELLO,F., RAUDINO,A. and BUEMI,G. (1984b) Chem. Phys. 84, 209-215.