Biochimie (1997) 79, 687-689 © Société française de biochimie et biologie moléculaire / Elsevier, Paris
Computational enzymology
MA Cunningham, PA Bash*
Center for Mechanistic Biology and Biotechnology, Argonne National Laboratory, Argonne, IL 60439, USA
(Received 12 August 1997; accepted 4 November 1997)

Summary - Numerical simulations of enzyme reaction mechanisms are beginning to provide quantitative as well as qualitative insights. Methods based on a hybrid quantum mechanical/molecular mechanical technique permit the natural inclusion of protein solvation effects. Coupled with modern experimental techniques, the numerical simulations are providing details at the atomic level about how enzyme structure influences its function.

numerical simulation / reaction mechanism / QM/MM method
*Correspondence and reprints

Friedrich Wilhelm Kühne first coined the word enzyme in 1878 to emphasize that an agent within the yeast cell, and not the yeast itself, was responsible for fermentation. In the 120 years since, biochemists have deduced that enzymes are proteins, developed powerful techniques to ascertain and modify the amino acid sequences that define particular enzymes and, through crystallographic methods, determined the three-dimensional arrangements of individual atoms within enzyme molecules. Great strides have been made toward understanding how enzyme systems achieve their extraordinary catalytic efficiencies and substrate specificities. There remain, however, some aspects of enzyme-catalyzed reactions that are difficult to probe through direct experimentation. Most notably, the chemical events of bond cleavage and formation that define the reaction mechanism are difficult to investigate directly because they are extremely short-lived. The focus of our efforts has been on the development of numerical tools that can provide insights into enzyme reaction mechanisms, information that is complementary to that obtained experimentally. In this report, we shall review some of the recent progress in the nascent field of computational enzymology.

Atomic and molecular interactions are well characterized by the theory of quantum mechanics. In principle, with a well-defined protein structure, one need just solve the Schrödinger equation to obtain the electron densities of the reacting species. Chemical pathways can be explored by investigating modifications to the structure, for example, by moving a proton from a possible donor atom to a likely acceptor and comparing the energies of initial and final states. In practice, providing an accurate quantum description of a system composed of thousands of atoms is problematic. The current state-of-the-art ab initio theoretical methods based on Hartree-Fock theory [6, 14] require hours of CPU time on large supercomputers to calculate the energy of a system involving a dozen atoms. Semi-empirical quantum methods [7] are several orders of magnitude faster but must be carefully calibrated for specific molecular interactions to provide quantitative results.

Beyond the limitations imposed by finite computational resources and the extensive numerical calculations required to produce solutions of the Schrödinger equation, a further complication arises when one attempts to connect theoretical calculations with experimental measurements. Kinetic rate constants and even crystal structures are macroscopic properties of enzyme systems in solution. To obtain estimates of macroscopic observables from microscopic quantum calculations requires an average over an ensemble of states. Two methods of estimating these ensemble averages are commonly used. The Monte Carlo method generates random conformations and computes the energy of each. The molecular dynamics method starts with an initial configuration, computes the forces on each atom and then integrates the equations of motion to let the system evolve dynamically; an ensemble average is estimated by the time average of the system. In both methods, hundreds of thousands of configurations are typically sampled to provide an estimate of the ensemble average. In this light, it is clear that even using semi-empirical quantum models to represent an entire enzyme system is beyond the scope of current computational resources. We should note, however, that recent advances in semi-empirical methods may permit simulation of thousand-atom systems in the reasonably near future [8, 22].

One approach capable of handling large numbers of atoms is the so-called
molecular mechanics method [4], in which molecules are represented as charged masses connected by springs. This classical representation is computationally efficient and has been used effectively to reproduce experimentally observed crystal structures [24] and to simulate the dynamics of conformational changes in proteins. Unfortunately, it is not possible to adequately represent the chemical events of bond cleavage and bond formation with such methods; to do so requires a quantum-mechanical description of the electronic restructuring.

There have been a variety of approaches developed to study these key chemical steps in enzyme-catalyzed reactions. As one might imagine, as computational resources have improved, so have the complexity and sophistication of numerical simulation methods. The method on which we focus in this report is a hybrid, incorporating both a quantum-mechanical (QM) description of atoms in the active site of the enzyme and a molecular-mechanical (MM) description of the remaining atoms. Originally proposed by Warshel and Levitt [26], several variations on the QM/MM method have since been developed by other groups [11, 21]. The principal assumption of the method is that changes in the electronic structure during the reaction are localized to the vicinity of the active site. In this scheme, the bulk of the protein/solvent system is represented by a molecular mechanics potential; only a small number of atoms in the active site are treated quantum mechanically. While this hybrid QM/MM technique is not as complete a model as one might wish, it has proven quite capable of providing the sort of detailed information about reaction mechanisms previously unavailable from experimental measurement. There have been several comprehensive reviews of the methodology recently [9, 16, 18, 20, 25], so we shall not dwell on these aspects here.

Emil Fischer postulated in 1894 that an enzyme and its substrate must fit together like a lock and key. The fact that enzymes and substrates do indeed have complementary geometrical configurations has been demonstrated repeatedly with modern high-resolution crystallographic techniques. In a refinement of Fischer's hypothesis, Haldane and Pauling suggested that the key to catalysis lay in the ability of the enzyme to stabilize the transition states along the reaction pathway [10]. Given the short lifetimes of the transition states, however, it has proven difficult to experimentally define the transition state structures. This is a situation where numerical modeling can provide insights into reaction mechanisms.

Harrison et al [13] have, for example, investigated amide hydrolysis catalyzed by the enzyme papain, which is typical of the cysteine proteases. The key residues involved in this reaction are a cysteine and histidine pair. The thiolate anion of the cysteine residue attacks a carbonyl carbon of the substrate and the ND2 nitrogen of the histidine ring attacks the amide nitrogen of the substrate. The authors computed the potential energy surface of these two principal reaction coordinates and found the reaction to take place in a concerted manner.
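To illustrate how a potential energy surface over two reaction coordinates can distinguish a concerted from a stepwise mechanism, the following minimal Python sketch compares the highest energy encountered along three idealized paths on a toy model surface. The functional form, the `barrier` and `coupling` parameters, and the names `toy_surface` and `path_barriers` are all hypothetical and are not taken from Harrison et al; in an actual study each energy would come from a QM/MM calculation rather than an analytic formula.

```python
import math

def toy_surface(x, y, barrier=20.0, coupling=30.0):
    """Hypothetical model energy (arbitrary units) over two reaction coordinates.

    x and y each run from 0 (reactant) to 1 (product). A positive coupling
    penalizes asynchronous progress along the two coordinates.
    """
    progress = 0.5 * (x + y)
    asynchrony = x - y
    return barrier * math.sin(math.pi * progress) ** 2 + coupling * asynchrony ** 2

def path_barriers(surface, n=101):
    """Return the highest energy met along three idealized reactant-to-product paths."""
    t = [i / (n - 1) for i in range(n)]
    paths = {
        # both coordinates advance together
        "concerted": [(s, s) for s in t],
        # x completes first, then y
        "x_first": [(s, 0.0) for s in t] + [(1.0, s) for s in t],
        # y completes first, then x
        "y_first": [(0.0, s) for s in t] + [(s, 1.0) for s in t],
    }
    return {name: max(surface(x, y) for x, y in pts) for name, pts in paths.items()}

barriers = path_barriers(toy_surface)
favored = min(barriers, key=barriers.get)
print(barriers)
print("lowest-barrier path:", favored)
```

With the default positive `coupling`, the diagonal path has the lowest barrier in this toy model; a negative `coupling` would instead stabilize the corner intermediates and favor a stepwise route. Only three straight-line paths are compared here, which is a deliberate simplification of a full minimum-energy-path search.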
In a study of the enzyme malate dehydrogenase, which interconverts malate and oxaloacetate as part of the citric acid cycle, Bash and coworkers [2, 5, 15] computed the minimum energy surface associated with the proton- and hydride-transfer reactions catalyzed by the enzyme. The authors identified two separate transition states and concluded that the reaction proceeds sequentially, with the proton transfer occurring first. They also computed a somewhat higher energy barrier for the hydride transfer reaction, indicating that this event is the rate-limiting chemical step of the reaction. There is now some experimental evidence in a mutant enzyme to support this claim.

Numerical methods are, in essence, more malleable than experimental techniques. In the QM/MM method, it is possible, for example, to simply eliminate the influence of the enzyme by setting the charges on protein atoms in the molecular mechanics region to zero. The atoms in the quantum region no longer interact with the protein and the effects of the protein environment can be calculated directly. In their study on malate dehydrogenase, Bash and coworkers determined that the reaction sequence would be reversed without the protein present [5]; the protein environment was essential for stabilizing the intermediate transition-state structures. This result echoes the earlier insights of Haldane and Pauling. A similar study by Mulholland and Richards [19] on the acetyl-CoA enolization in citrate synthase also identified the key contributions of active site residues and a conserved water molecule toward stabilizing the transition-state structure. Conserved water molecules were also identified as active participants in the reaction mechanism in a study of chorismate mutase [17]. These studies and a recent investigation of the mechanism of the enzyme sialidase [1] confirm the importance of the solvation environment provided by the protein [23].

Another issue that arises when simulating a reaction mechanism is the determination of protonation states of titratable groups in the active site. Experimental clues can be obtained through nuclear magnetic resonance measurements; calculations of the electrostatic potential in the active site can also lead to estimates of pKa values for residues in the active site [3, 28]. An alternative, and arguably more accurate, method for estimating pKa values uses the QM/MM method to compute the free-energy change associated with protonating the residues. Hansson et al [12] have recently investigated the protonation state of the cysteine-12 residue in the protein tyrosine phosphatase. Using an empirical valence bond model as the quantum Hamiltonian, the QM/MM method was utilized to compute the free-energy change [27] that occurred when a proton was transferred from the catalytic cysteine residue to the phosphate substrate. Their findings indicate that the enzyme environment does indeed lower the pKa value of the cysteine residue, leaving it unprotonated and poised to begin the next step of the catalytic process, which is a nucleophilic attack by the thiolate anion.

In sum, while a complete ab initio calculation of all aspects of enzyme-catalyzed reactions remains a distant prospect, present computational methods built on the hybrid QM/MM method can certainly provide qualitative insights into enzyme reaction mechanisms, insights that are unavailable from current experimental measurements. Moreover, with careful calibration of the quantum mechanical Hamiltonian employed in the QM region, the QM/MM method can provide quantitative results as well, as has been demonstrated in the research programs outlined above. In the not-too-distant future, we can anticipate that more accurate quantum methods will be incorporated into the QM/MM
method, relaxing the current restriction that the quantum Hamiltonian be calibrated for each enzyme system. Certainly, as computational capabilities inexorably expand, the quantitative capabilities of numerical simulations will improve. We have not quite reached the ultimate goal of a complete understanding of enzyme mechanisms but recent progress indicates that the catalytic functions of enzymes can be elucidated from the combination of modern experimental techniques and numerical simulation.
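As a concrete illustration of the hybrid partition and of the numerical experiment in which the protein charges are set to zero, the sketch below computes a reaction energy for a small QM region with and without its electrostatic coupling to MM point charges. It is a deliberately crude model under stated assumptions: the QM region is represented by fixed partial charges for the reactant and product states, whereas in an actual QM/MM calculation the MM charges enter the quantum Hamiltonian and the electron density responds to them. All coordinates, charges, the isolated reaction energy of 40 kcal/mol, and the function names are hypothetical.

```python
import math

COULOMB = 332.0637  # kcal*Angstrom/(mol*e^2): converts q1*q2/r to kcal/mol

def coupling_energy(qm_atoms, mm_atoms, mm_scale=1.0):
    """Coulomb interaction between QM-region and MM-region point charges.

    Each atom is a (charge in e, (x, y, z) in Angstrom) pair. Setting
    mm_scale=0.0 mimics zeroing the charges on the protein atoms.
    """
    total = 0.0
    for q_qm, r_qm in qm_atoms:
        for q_mm, r_mm in mm_atoms:
            total += COULOMB * q_qm * (mm_scale * q_mm) / math.dist(r_qm, r_mm)
    return total

def reaction_energy(delta_e_isolated, qm_reactant, qm_product, mm_atoms, mm_scale=1.0):
    """Isolated QM reaction energy plus the change in coupling to the MM charges."""
    shift = (coupling_energy(qm_product, mm_atoms, mm_scale)
             - coupling_energy(qm_reactant, mm_atoms, mm_scale))
    return delta_e_isolated + shift

# Hypothetical two-atom QM region: neutral reactant, charge-separated product.
qm_reactant = [(0.0, (0.0, 0.0, 0.0)), (0.0, (3.0, 0.0, 0.0))]
qm_product = [(-0.5, (0.0, 0.0, 0.0)), (0.5, (3.0, 0.0, 0.0))]
# Hypothetical environment: one positive MM charge near the developing negative charge.
mm_atoms = [(1.0, (-3.0, 0.0, 0.0))]

with_protein = reaction_energy(40.0, qm_reactant, qm_product, mm_atoms, mm_scale=1.0)
without_protein = reaction_energy(40.0, qm_reactant, qm_product, mm_atoms, mm_scale=0.0)
print(f"with MM charges:   {with_protein:.2f} kcal/mol")
print(f"MM charges zeroed: {without_protein:.2f} kcal/mol")
```

The difference between the two printed values is the electrostatic stabilization contributed by the environment in this toy model. Van der Waals and bonded MM terms are omitted for brevity.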
Acknowledgments
We would like to thank Andrzej Joachimiak and Fred Stevens for their insightful comments on the rough drafts of this manuscript. This work was supported by the US Department of Energy, Office of Health and Environmental Research, under Contract W-31-109-ENG-38.
References
1 Barnes JA, Williams IH (1996) Quantum mechanical/molecular mechanical approaches to transition state structure: mechanism of sialidase action. Biochem Soc Trans 24, 263-268
2 Bash PA, Ho LL, MacKerell AD, Levine D, Hallstrom P (1996) Progress toward chemical accuracy in the computer simulation of condensed phase reactions. Proc Natl Acad Sci USA 93, 3698-3703
3 Bashford D, Karplus M (1990) pKa's of ionizable groups in proteins: Atomic detail from a continuum electrostatic model. Biochemistry 29, 10219-10225
4 Brooks CL, Karplus M, Pettitt BM (1988) Proteins: A Theoretical Perspective of Dynamics, Structure, and Thermodynamics. Advances in Chemical Physics, LXXI. John Wiley and Sons, New York
5 Cunningham MA, Ho LL, Gillilan RE, Bash PA (1997) Simulation of the enzyme reaction mechanism of malate dehydrogenase. Biochemistry 36, 4800-4816
6 Curtiss LA, Raghavachari K, Redfern PC, Pople JA (1997) Assessment of Gaussian-2 and density functional theories for the computation of enthalpies of formation. J Chem Phys 106, 1063-1079
7 [...] general purpose quantum mechanical molecular model. J Am Chem Soc 107, 3902-3909
8 Dixon SL, Merz KM Jr (1996) Semiempirical molecular orbital calculations with linear system size scaling. J Chem Phys 104, 6643-6649
9 Eurenius KP, Chatfield DC, Brooks BR, Hodoscek M (1996) Enzyme mechanisms with hybrid quantum and molecular mechanical potentials. I. Theoretical considerations. Int J Quantum Chem 60, 1189-1200
10 Fersht A (1985) Enzyme Structure and Mechanism. WH Freeman & Co, New York
11 Field MJ, Bash PA, Karplus M (1990) A combined quantum mechanical and molecular mechanical potential for molecular dynamics simulation. J C[...]
12 [...] 118-127
13 Harrison MJ, Burton NA, Hillier IH, Gould IR (1996) Mechanism and transition state structure for papain catalysed amide hydrolysis, using a hybrid QM/MM potential. Chem Commun 24, 2769-2770
14 Hehre WJ, Radom L, Schleyer PvR, Pople JA (1986) Ab initio Molecular Orbital Theory. John Wiley & Sons, New York
15 Ho LL, MacKerell AD, Bash PA (1996) Proton and hydride transfers in solution: Hybrid QM/MM free energy perturbation study. J Phys Chem 100, 4466-4475
16 Kollman PA (1992) Theory of enzyme mechanisms. Curr Opin Struct Biol 2, 765-771
17 Lyne PD, Mulholland AJ, Richards WG (1995) Insights into chorismate mutase catalysis from a combined QM/MM simulation of the enzyme reaction. J Am Chem Soc 117, 11345-11350
18 Mulholland AJ, Grant GH, Richards WG (1993) Computer modelling of enzyme catalysed reaction mechanisms. Protein Eng 6, 133-147
19 Mulholland AJ, Richards WG (1997) Acetyl-CoA enolization in citrate synthase: a quantum mechanical/molecular mechanical (QM/MM) study. Proteins Struct Funct Genet 27, 9-25
20 Peräkylä M, Kollman PA (1997) A simulation of the catalytic mechanism of aspartylglucosaminidase using ab initio quantum mechanics and molecular dynamics. J Am Chem Soc 119, 1189-1196
21 Singh UC, Kollman PA (1986) A combined