Comparison of database potentials and molecular mechanics force fields
John Moult

The advantages and disadvantages of database and molecular mechanics force fields for the study of macromolecules are compared, with emphasis on the ability to distinguish between correct and incorrect structures. Molecular mechanics force fields have the advantage of resting on a clear theoretical basis, permitting an in-depth analysis of different contributions. On the other hand, large simplifications are necessary for tractable computing, and there has so far been little effective testing at the macromolecular level. Database potentials allow greater freedom of functional form and have been shown to be effective at discriminating between correct and incorrect complete structures. The principal negative is a controversial relationship to free energy. More testing and comparison of both sorts of potential are needed.
Addresses Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, 9600 Gudelsky Drive, Rockville, MD 20850, USA e-mail:
[email protected]
Current Opinion in Structural Biology 1997, 7:194-199 Electronic identifier: 0959-440X-007-00194 © Current Biology Ltd ISSN 0959-440X
Introduction
Potentials for studying biomolecules are following two distinct lines of evolution. Traditionally, potentials have been based on physical models of the interactions between atoms. I will refer to these potentials as 'molecular mechanics' force fields. Recently, potentials based on the statistics of occurrence of particular geometries of interactions in proteins have become increasingly popular. These have become known as 'database', 'statistical' or 'knowledge-based' potentials. This review compares the apparent strengths and weaknesses of each approach for the study of biological macromolecules, as reflected by current literature. Methods for assessing the quality of macromolecular potentials are still largely undeveloped. The ability of a potential to distinguish between correct and incorrect conformations of protein molecules is used as the principal criterion in this review.

Severity of the problem
How difficult is it in principle to use a potential to distinguish between the correct and incorrect structures of the whole or a part of a protein molecule? Statistical mechanics provides a worst case answer: a population of protein molecules will occupy conformational states
according to a Boltzmann distribution, which is usually defined as

$$P(\Delta G_i) = \frac{\exp(-\Delta G_i / kT)}{\sum_j \exp(-\Delta G_j / kT)} \qquad (1)$$
where $P(\Delta G_i)$ is the probability of finding the system in state 'i' with a free energy $\Delta G_i$ relative to some reference state, and the sum is over all states of the system. To establish a link to molecular force field potentials, we also need the thermodynamic relationship between the free energy 'G' and the potential energy 'E'

$$\Delta G = \Delta E - T \Delta S \qquad (2)$$
where 'S' is the entropy of the system, and 'T' is the absolute temperature. The functional conformation of soluble protein molecules typically differs in free energy from the set of nonfunctional ones by only 5-15 kcal mol⁻¹ [1]; however, there is essentially only one functional, and very many nonfunctional conformations. That is, the sum in Equation 1 is over many non-native states, so that the small free energy difference hides a large configurational entropy cost of selecting one conformation from the background of many. Calorimetric data suggest that this cost is equivalent to a free energy of approximately 1-1.5 kcal mol⁻¹ residue⁻¹ [2]. Thus, for a typical protein of 200 residues, the functional conformation is stabilized by a free energy of around 200-300 kcal mol⁻¹ with respect to any single other conformation. For a complete protein, then, a potential need not be very accurate to distinguish the correct conformation from any completely different fold. Lattman and Rose [3] have drawn attention to this distinction between the specificity of a fold (identified here as the relatively large difference in free energy to any other single fold), and its stability (the relatively small free-energy difference to the full set of all alternative conformations). The large specificity difference is the basis of the success of fold identification methods, in which the compatibility of a sequence for a fold is tested on each member of a library of known folds using database potentials.

At the other extreme, we may also consider how difficult it is to resolve small details of a structure in energy terms, such as two alternative conformations of a single sidechain, against the background of a fixed protein molecule. A difference of 2 kT in free energy (approximately 1.2 kcal mol⁻¹ at 300 K) will be sufficient for one conformation to be populated more than three times as much as the alternative conformation.
Both of these arguments lead to the conclusion that we need to calculate the free energy for each residue, excluding configurational entropy effects, with an accuracy of about 1 kcal mol⁻¹ in order to distinguish between correct and incorrect structures.
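The two numerical arguments in this section can be checked with a short sketch. It evaluates Equation 1 for a two-state sidechain separated by 2 kT and multiplies out the per-residue specificity estimate. The function and variable names are illustrative assumptions, not taken from the cited work, and the gas constant is used as Boltzmann's constant per mole.

```python
import math

# Gas constant in kcal mol^-1 K^-1, used as Boltzmann's constant per mole.
R_KCAL = 0.0019872


def boltzmann_populations(delta_g, temperature=300.0):
    """Equation 1: fractional populations for relative free energies in kcal/mol."""
    kt = R_KCAL * temperature
    weights = [math.exp(-g / kt) for g in delta_g]
    total = sum(weights)
    return [w / total for w in weights]


# Two alternative sidechain conformations separated by 2 kT at 300 K.
kt_300 = R_KCAL * 300.0  # roughly 0.6 kcal/mol
populations = boltzmann_populations([0.0, 2.0 * kt_300])
ratio = populations[0] / populations[1]  # equals e^2, comfortably above 3

# Specificity estimate from the text: 1-1.5 kcal/mol per residue, 200 residues.
n_residues = 200
specificity_range = (1.0 * n_residues, 1.5 * n_residues)

print(f"kT at 300 K: {kt_300:.3f} kcal/mol")
print(f"population ratio for a 2 kT gap: {ratio:.2f}")
print(f"specificity for {n_residues} residues: {specificity_range} kcal/mol")
```

The ratio depends only on the gap measured in units of kT, which is why the accuracy requirement of about 1 kcal mol⁻¹ per residue is so demanding at room temperature.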
Molecular mechanics force fields
Molecular mechanics force fields have the advantage of resting on a solid body of physical theory. The quantum mechanics of the covalent and noncovalent interactions between the atoms are well understood. In practice, a series of radical approximations is necessary to arrive at a functional form that is computationally tractable for large molecules. For this reason, the term 'empirical force fields' is often used. A more or less standard functional form has emerged, which has three principal components: covalent, van der Waals, and electrostatic [4].
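As a rough illustration of the three components, the sketch below uses a harmonic bond-stretch term for the covalent part, a 12-6 Lennard-Jones term for the van der Waals part, and Coulomb's law between point charges for the electrostatic part. These are common textbook choices and are assumptions here. Real force fields add angle and torsion terms, and every parameter value shown is invented for illustration rather than taken from any published parameter set.

```python
import math

# Coulomb constant in kcal mol^-1 Angstrom e^-2 (approximate, commonly used value).
COULOMB_CONSTANT = 332.06


def bond_energy(r, r0, k):
    """Covalent term (bond stretch only): harmonic penalty around length r0."""
    return k * (r - r0) ** 2


def lennard_jones(r, epsilon, sigma):
    """van der Waals term: 12-6 Lennard-Jones with well depth epsilon."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r, qi, qj, dielectric=1.0):
    """Electrostatic term: Coulomb's law between atomic point partial charges."""
    return COULOMB_CONSTANT * qi * qj / (dielectric * r)


def total_energy(bonds, nonbonded_pairs):
    """Sum the three components over lists of illustrative interactions."""
    covalent = sum(bond_energy(r, r0, k) for r, r0, k in bonds)
    vdw = sum(lennard_jones(r, eps, sig) for r, eps, sig, _, _ in nonbonded_pairs)
    elec = sum(coulomb(r, qi, qj) for r, _, _, qi, qj in nonbonded_pairs)
    return covalent + vdw + elec


# Hypothetical inputs: (r, r0, k) for bonds; (r, epsilon, sigma, qi, qj) for pairs.
bonds = [(1.53, 1.52, 300.0)]
pairs = [(3.8, 0.1, 3.4, 0.0, 0.0), (3.0, 0.15, 3.0, 0.4, -0.4)]
energy = total_energy(bonds, pairs)
print(f"total energy: {energy:.3f} kcal/mol")
```

Even in this toy form, each term carries its own parameters, which is where the large number of fitted values discussed next comes from.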
All the molecular mechanics formulations incorporate a large number of parameters. Two main approaches have evolved for obtaining these. The first to be used was the adjustment of parameters to obtain the best fit to experimental data on small model compounds. This includes fitting the parameters of expressions for the van der Waals and electrostatic interactions using the geometry and sublimation energies of crystals of small molecules, and fitting those for distortions of covalent geometry to spectroscopic data. More recently, as quantum mechanical methods have improved, partial charges for electrostatic interactions have been obtained from quantum mechanical calculations on model compounds. References to these methods may be found in a recent review in this series [5•]. This strategy of developing force fields by fitting as many properties of simple systems as possible has resulted in the emergence of many slightly different parameterizations and functional forms [6•,7-10,11•]. The current functional forms have recently been thoroughly reviewed [12•]. The strategy of extensive fitting of parameters to the properties of simple systems has the advantage that individual aspects of the force field may be investigated and assessed. The assumption is that when put together, all these components will reproduce the properties of macromolecules. The latest versions of these force fields are often referred to as 'second generation', as they represent extensive revisions based on the experience of applications to many systems. The most developed of these is a new version of the AMBER force field [6•].
In addition to a general reparameterization, several design issues are addressed: the problem of obtaining self-consistency between internal and solvation energies is partly resolved by placing the emphasis on obtaining a good reproduction of the enthalpy and density of simple liquids, following a strategy first adopted for the development of force fields for liquids, OPLS [10]. An improved method for obtaining atomic partial charges by fitting to a set of quantum mechanically derived electrostatic potentials is also introduced. A revised version of the GROMOS simulation package has also been released [11•].
The approximate functional forms of these potentials do not embody some critical aspects of the physics. One example is the use of atomic point partial charges, together with Coulomb's law, to represent electrostatic interactions: the point charges are required to represent complex nonspherical electron clouds around each nucleus. Furthermore, polarization of these clouds by external electric fields in proteins changes their shape significantly [13]. These effects are ignored in nearly all contemporary force fields. A critical issue is the treatment of solvent effects. Inclusion of explicit solvent in a molecular dynamics calculation does, in principle, allow a representation as accurate as that for the protein itself; however, running a molecular dynamics simulation long enough to obtain the solvent effects for each conformation of a macromolecule is not possible in practice, and an approximate model must be used [14,15]. A further difficulty is that the practice of characterizing each part of the force field separately may introduce significant imbalances in the full potential. For example, in the folding of a protein molecule, favorable electrostatic interactions with water molecules are partially replaced by almost compensating favorable internal electrostatic interactions (e.g. see [16]). Thus, uncorrelated errors in protein-solvent interactions and the intramolecular point-charge electrostatics may be critical in determining a preference for the folded state. Some of these issues have been recently discussed [12•].
There has so far been little validation of these force fields at the macromolecular level. The tests used are of two main types: molecular dynamics simulations starting from the experimental crystal structure; and the use of the potentials to distinguish between correct and incorrect structures. In the molecular dynamics simulations, the extent of drift away from the experimental structure is taken as the main measure of the relative accuracy of force fields [17-19]; however, other factors, such as cutoff and effective pressure, can limit drift. Ideally, such simulations should be started from structures removed from the experimental one, and the extent of convergence back to experiment used as a measure of accuracy. So far, almost all attempts to do this have been unsuccessful. It is unclear whether these failures reflect errors in the force fields or difficulties of convergence in the simulation. Recently, encouraging results have been obtained with a partial convergence from relatively high starting deviations [20•]. The characteristics of the motion may also be related to X-ray [21] and NMR data [22]. Limited experimental accuracy restricts the usefulness of tests of this type.
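Drift away from the experimental structure is usually reported as a root mean square deviation (RMSD) of atomic coordinates along the trajectory. The sketch below is a minimal version that assumes every frame has already been superposed on the reference, which real analyses do with an optimal rigid-body fit first. The coordinates are invented and the function names are illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized coordinate lists.

    Assumes the two structures are already superposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def drift_profile(reference, trajectory):
    """RMSD of every trajectory frame from the reference (crystal) structure."""
    return [rmsd(reference, frame) for frame in trajectory]


# Hypothetical three-atom structure and frames shifted by 0, 1 and 2 Angstrom in x.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
frames = [[(x + shift, y, z) for x, y, z in crystal] for shift in (0.0, 1.0, 2.0)]
drift = drift_profile(crystal, frames)
print(drift)
```

A flat drift profile by itself does not prove that the force field is accurate, since, as noted above, cutoffs and effective pressure can also hold a structure near its starting point.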
A number of studies have shown that the intramolecular portion of a molecular mechanics force field alone is not sufficient to distinguish a correct from an incorrect protein fold, but that the addition of an approximate solvation model is effective [23-25]. In one of the most thorough analyses of protein conformational preferences
using molecular mechanics force fields, together with an approximate solvation model, Yang and Honig [26••,27,28] have examined conformational preference as a function of residue type for helices, sheets and β turns in proteins. The CHARMm force field was used [7], together with a finite difference Poisson-Boltzmann solvation model [29]. Reassuring agreement with experiment was obtained for the helix preference of a residue, and for turn type as a function of sequence.

Because these force fields are firmly based on the principles of physics and because many of the limitations imposed by computational power are becoming less severe, the long term prospects for improvement are good. In the short term, much more validation at the macromolecular level is required.
Database potentials
There is an excellent review of the principles of database potentials in Volume 6 of this series [30••]. Database potentials are usually justified in terms of the Boltzmann relationship; that is, utilizing Equation 1 to relate the observed frequency of a feature of protein structure to its free energy. In fact, a purely statistical theory can be used if the purpose is to distinguish between incorrect and correct structures: characteristics of correct structures exist that differ from those of incorrect structures, and these differences can be expressed statistically and utilized for discrimination. Although these differences must obviously arise from the underlying physics, there is no requirement to formulate the theory in physical terms. The purely statistical view has advantages: the difficulties about choosing a physically justifiable reference state that represents all possible states of the system are not encountered, and any appropriate prior distribution may be used, in the normal procedure of Bayesian statistics. Approximate molecular descriptions, which use only one or two points for each residue, need not be justified on physical grounds. Hybrid potentials that incorporate some physics and some statistics may also be more easily formulated. In practice, almost all empirical force fields have a statistical component, as most physics-based contributions are scaled to provide maximum agreement with experimental data. On the other hand, if the potentials are viewed as purely statistical properties of proteins, there is no theoretical basis for expecting a correct distribution of states as a function of relative free energy or of temperature. As a consequence, unless a basis for regarding the potentials as representative of free energy can be found, they may be unsuitable for examining physical properties, such as motion, thermal denaturation or the relative stability of mutations.
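The Boltzmann justification amounts to inverting Equation 1: an apparent free energy is assigned to each feature from the ratio of its observed frequency to its frequency in a reference state. The sketch below shows that inversion for a toy set of contact classes. The counts, the class labels and the choice of reference distribution are all invented for illustration, and the reference state is exactly the ingredient whose physical justification the text identifies as difficult.

```python
import math

# Gas constant in kcal mol^-1 K^-1, used as Boltzmann's constant per mole.
R_KCAL = 0.0019872


def statistical_potential(observed, reference, temperature=300.0, pseudocount=1.0):
    """Return -kT ln(f_observed / f_reference) for every feature class.

    The pseudocount guards against zero counts; with pseudocount=0.0 every
    class must have a nonzero count in both dictionaries.
    """
    kt = R_KCAL * temperature
    classes = sorted(set(observed) | set(reference))
    obs_total = sum(observed.get(c, 0) + pseudocount for c in classes)
    ref_total = sum(reference.get(c, 0) + pseudocount for c in classes)
    potential = {}
    for c in classes:
        f_obs = (observed.get(c, 0) + pseudocount) / obs_total
        f_ref = (reference.get(c, 0) + pseudocount) / ref_total
        potential[c] = -kt * math.log(f_obs / f_ref)
    return potential


# Hypothetical contact counts (H = hydrophobic, P = polar) and reference counts.
observed_contacts = {"H-H": 60, "H-P": 25, "P-P": 15}
reference_contacts = {"H-H": 25, "H-P": 50, "P-P": 25}
potential = statistical_potential(observed_contacts, reference_contacts, pseudocount=0.0)
for contact_class, value in potential.items():
    print(f"{contact_class}: {value:+.3f} kcal/mol")
```

Features seen more often than the reference predicts receive negative values and those seen less often receive positive ones. Under the purely statistical view described above, the same numbers can be used as log-odds scores without any claim that they are free energies.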
Current database potentials
One of the earliest residue-residue contact potentials has now been recompiled using a larger set of proteins [31]. Bahar and Jernigan [32•] have also explored sidechain-sidechain distance dependent potentials. In a variation from most of the earlier work, they use multiple atom-atom distances between sidechains as a measure of sidechain separation. They conclude that less averaging of information occurs than in the more usual center-to-center compilations and note that highly specific, reasonably short range electrostatic interactions (<4 Å) are distinct from longer range nonpolar ones in such an analysis. They also note that much of the information in the potential curves reflects the general close packing of residues and emphasize the importance of removing this to see residue-residue specific interactions.

The most extensive use of database potentials has been in the identification of folds by evaluating the sequence fit to each member of a library of folds. This technique is often referred to as threading, and sometimes as 'inverse folding' or 'fold recognition'. Usually, the potentials use an approximate description of the protein that has one or two points for each residue [33•]. In addition to a potential, algorithms are required to deal with the problem of the optimum alignment of the sequence on a fold, allowing for insertions and deletions. Threading methods have been objectively tested in blind predictions [34••,35•]. Although far from perfect, a number of the methods have been found to be effective at identifying the correct fold a significant fraction of the time. Two groups have demonstrated that threading potentials can be useful for detecting errors in experimental protein structures: a number of examples of structures that were initially incorrectly reported were used as test data, and a clear distinction between each of these and the corresponding corrected structures was obtained [36,37].
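The core of a threading calculation can be sketched as scoring one sequence against the contact map of each library fold and ranking the results. The example below is deliberately minimal. It uses a two-letter hydrophobic/polar alphabet, a hypothetical contact energy table and an ungapped mapping of sequence position to fold position, so the alignment step with insertions and deletions is not represented. The fold names and contact lists are invented.

```python
import statistics

# Hypothetical contact energies for a two-letter (H/P) residue alphabet.
CONTACT_ENERGY = {
    ("H", "H"): -1.0,
    ("H", "P"): 0.0,
    ("P", "H"): 0.0,
    ("P", "P"): 0.0,
}


def fold_score(sequence, contacts):
    """Sum contact energies for a sequence placed, ungapped, on a fold."""
    return sum(CONTACT_ENERGY[(sequence[i], sequence[j])] for i, j in contacts)


def rank_folds(sequence, fold_library):
    """Return (fold name, score) pairs sorted from most to least favorable."""
    scores = {name: fold_score(sequence, c) for name, c in fold_library.items()}
    return sorted(scores.items(), key=lambda item: item[1])


def z_score(score, other_scores):
    """Separation of one score from the others, in sample standard deviations.

    Needs at least two other scores, and they must not all be identical.
    """
    return (score - statistics.mean(other_scores)) / statistics.stdev(other_scores)


# Invented fold library: each fold is a list of residue index pairs in contact.
sequence = "HPPHHPPH"
fold_library = {
    "fold_a": [(0, 3), (0, 7), (4, 7)],
    "fold_b": [(1, 4), (2, 5), (0, 5)],
    "fold_c": [(0, 3), (1, 6), (2, 7)],
}
ranking = rank_folds(sequence, fold_library)
best_name, best_score = ranking[0]
separation = z_score(best_score, [score for _, score in ranking[1:]])
print(ranking)
print(f"best fold: {best_name}, z-score: {separation:.2f}")
```

Reporting a z-score against the other library folds reflects the specificity argument made earlier: the correct fold only has to stand out from each single alternative, not from the whole ensemble.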
Encouraged by the success of database potentials on the residue-residue level, a number of groups are now working on potentials of mean force on the atom-atom level and a preliminary version of one of these has been published [38]. These potentials will provide an alternative to the molecular mechanics force fields for detailed descriptions of the conformational behavior of protein molecules.

Can database potentials be used to derive individual interaction free energies?
There is a serious debate concerning the validity of regarding the quantities used in database potentials as free energies in the same sense that free energies are used in classical physics. In the past year, two key contributions have been made on this subject, one for an interpretation in terms of free energy, the other against. Sippl et al. [39••] have elegantly argued that one of the key requirements for the satisfaction of Equation 1 (a proper sampling of states) is satisfied in the analysis of protein structural features. They do this by suggesting a Gedanken experiment in which a dilute solution of
a set of globular proteins is made, and the scattering properties that represent particular interatomic distances are measured in the same manner as in a traditional potential of mean force experiment for liquids [40]. They point out that the same radial distribution function would be obtained as from compiling distance tables from a database of the same set of proteins. Formally, their argument is correct, although the sampling represented is quite unusual; for example, only those sequences that are able to form stable folded structures are sampled. A similar experiment conducted on a set of random sequences would yield a quite different distribution. Sippl et al. [39••] focus particularly on the form of the potential of mean force for the mainchain hydrogen bonds separated by more than eight residues in the sequence. The relative free energy is found to be approximately zero at short and large distances, but shows a peak in the intermediate range, between 4 and 5 Å. They interpret this peak as a transition-state barrier for the formation and breaking of hydrogen bonds during folding and unfolding respectively. However, the form of this function would be expected to be dependent on the conformation of the rest of the protein: in the (sampled) folded state of a chain, breaking such a hydrogen bond typically involves the breaking of many other interactions, and these will contribute to the apparent transition-state barrier. By contrast, in an unfolding or folding structure, hydrogen bonds would be expected to break or make in an order dependent on the release of other constraining interactions, and this particular form of the transition-state barrier would not exist.
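For reference, the liquid-state relation that underlies a potential of mean force can be written as below. This is the standard textbook form found in statistical mechanics texts such as [40], not a reproduction of the specific expressions in [39••]. The symbols $w(r)$, $g(r)$, $\rho(r)$ and $\rho_{\mathrm{ref}}(r)$ are introduced here for illustration only.

$$w(r) = -kT \ln g(r), \qquad g(r) = \frac{\rho(r)}{\rho_{\mathrm{ref}}(r)}$$

Here $w(r)$ is the potential of mean force at separation $r$, $g(r)$ is the radial distribution function, $\rho(r)$ is the observed density of atom pairs at that separation, and $\rho_{\mathrm{ref}}(r)$ is the density expected in the reference state. In a database compilation, $\rho(r)$ would come from the distance tables, so the peak described above corresponds to a range of separations that is underpopulated relative to the reference.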
Thomas and Dill [41••] have attempted to directly test the assumptions underlying the relationship of structural features to free energy using a simple lattice model. They impose a simple contact or distance-dependent potential on a small two dimensional lattice, and then find the lowest energy conformation for all possible sequences. The observed frequency of lattice-point contacts or separations in these conformations is used to reconstruct the apparent free energy of the interactions, using the Boltzmann relationship. They conclude that the nonindependence of the interactions leads to significant distortions of the energy. For example, using the HP energy model [42], they show that the extracted free energies for hydrophobic to polar and polar to polar contacts are significantly different from the input value of zero and are a function of chain length. There is an apparent difficulty here: the term 'energy' is used to describe the potentials deduced from the analysis, and these potentials are compared with the true energies used to drive the model. Yet, the extracted quantities are free energies and include the entropic restrictions imposed by chain continuity and the compactness associated with low energy states. Thus, the validity of the comparison is not clear. Thomas and Dill [41••] also address the issue of whether the derived free energies are effective at identifying the lowest energy conformations using the lattice model. They find these to be only partly effective as the chains get longer; however, much compelling evidence suggests that database potentials are highly effective at distinguishing between correct and incorrect structures for real protein chains [34••,35•]. Furthermore, the limitations on the success of threading methods appear to arise primarily from imperfect fold models, rather than from weaknesses of the potentials [43•].
Both the work of Sippl et al. [39••] and that of Thomas and Dill [41••] represent a significant step forward in the debate on the interpretation of statistical potentials. A final resolution awaits further work.

Larger scale testing of macromolecule potentials
How can the performance of force fields on macromolecules be more extensively tested? One approach is to compare the performance of as many different potentials as possible on as many test data as possible. A World Wide Web based system for the exchange and comparison of potentials is being developed. It contains a set of benchmarks, contributed by different research groups, and groups are invited to run their potentials against these tests and to deposit the results. The current benchmarks are all based on testing the ability of a potential to distinguish between more or less correct structures of proteins and parts of proteins. When sufficient results have been accumulated, a workshop will be held to discuss the results (for further details, see [44]).

Conclusions
Much more work is needed before any of the potentials for the study of macromolecules can be considered fully developed. On the molecular mechanics front, more testing against large systems is needed, as well as a more direct assessment of the effect of the different approximations. For statistical potentials, the most pressing issue is the validity of the relationship of individual interactions to free energy.
References and recommended reading
Papers of particular interest, published within the annual period of review, have been highlighted as:
• of special interest
•• of outstanding interest

1. Privalov PL: Stability of proteins. Adv Prot Chem 1979, 33:167-241.

2. Privalov PL, Makhatadze GI: Contribution of hydration to protein folding thermodynamics. II. The entropy and Gibbs free energy of hydration. J Mol Biol 1993, 232:660-679.

3. Lattman EE, Rose GD: Protein folding - what is the question? Proc Natl Acad Sci USA 1993, 90:439-441.

4. McCammon JA, Harvey SC: Dynamics of Proteins and Nucleic Acids. Cambridge, UK: Cambridge University Press; 1987.

5. • Halgren TA: Potential energy functions. Curr Opin Struct Biol 1995, 5:205-210.
   A useful description of the methods used to parameterize molecular mechanics force fields.

6. • Cornell WD, Cieplak P, Bayly CI, Gould IR, Merz KM Jr, Ferguson DM, Spellmeyer DC, Fox T, Caldwell JW, Kollman PA: A second generation force field for the simulation of proteins, nucleic acids and organic molecules. J Am Chem Soc 1995, 117:5179-5197.
   An example of a 'second generation' molecular mechanics force field.

7. Brooks BR, Bruccoleri R, Olafson B, States D, Swaminathan S, Karplus M: CHARMm: a program for macromolecular energy, minimization, and dynamics calculations. J Comp Chem 1983, 4:187-193.

8. Dauber-Osguthorpe P, Roberts VA, Osguthorpe DJ, Wolf J, Genest M, Hagler AT: Structure and energetics of ligand binding to proteins: Escherichia coli dihydrofolate reductase-trimethoprim, a drug-receptor system. Proteins 1988, 4:31-47.

9. Roterman I, Lambert M, Gibson K, Scheraga HA: A comparison of CHARMm, AMBER and ECEPP potentials for peptides. II. φ-ψ maps of N-acetylalanine N'-methylamide: comparisons, contrasts and simple experimental tests. J Biomol Struct Dyn 1989, 7:421-453.

10. Tirado-Rives J, Jorgensen WL: Molecular dynamics simulations of the unfolding of apomyoglobin in water. Biochemistry 1993, 32:4175-4184.

11. • Van Gunsteren WF, Billeter SR, Eising AA, Hünenberger PH, Krüger P, Mark AE, Scott WRP, Tironi IG: Biomolecular Simulation: the GROMOS96 Manual and User Guide. Zürich: Hochschulverlag AG an der ETH Zürich; 1996.
   A new release of the GROMOS simulation package.

12. • Hünenberger PH, Van Gunsteren WF: Empirical classical interaction functions for molecular simulation. In Computer Simulation of Biomolecular Systems, Theoretical and Experimental Applications, vol III. Edited by Van Gunsteren WF, Weiner PK, Wilkinson AJ. Leiden: ESCOM; 1997.
   A review of molecular mechanics force fields, including a description of the current functional forms, and a discussion of some of the approximations involved.

13. Dudek MJ, Ponder JW: Accurate modeling of intramolecular electrostatic energy of proteins. J Comput Chem 1995, 16:791-816.

14. Gilson MK, McCammon JA, Madura JD: Molecular dynamics simulation with a continuum electrostatic model of the solvent. J Comp Chem 1995, 16:1081-1095.

15. Fraternali F, Van Gunsteren WF: An efficient mean solvation force model for use in molecular dynamics simulations of proteins in aqueous solution. J Mol Biol 1996, 256:939-948.

16. Braxenthaler M, Avbelj F, Moult J: Structure, dynamics and energetics of initiation sites in protein folding: I. Analysis of a 1 ns molecular dynamics trajectory of an early folding unit in water: the helix I/loop-I fragment of barnase. J Mol Biol 1995, 250:239-257.

17. Kitson DH, Avbelj F, Moult J, Nguyen DT, Mertz JE, Hadzi D, Hagler AT: On achieving better than 1 angstrom accuracy in a simulation of a large protein: Streptomyces griseus protease A. Proc Natl Acad Sci USA 1993, 90:8920-8924.

18. Brunne RM, Berndt KD, Güntert P, Wüthrich K, Van Gunsteren WF: Structure and internal dynamics of the bovine pancreatic trypsin inhibitor in aqueous solution from long-time molecular dynamics simulations. Proteins 1995, 23:49-62.

19. Fox T, Kollman PA: The application of different solvation and electrostatic models in molecular dynamics simulations of ubiquitin: how well is the X-ray structure maintained? Proteins 1996, 25:315-334.

20. • Storch EM, Daggett V: Molecular dynamics of cytochrome b5: implications for protein-protein recognition. Biochemistry 1995, 34:9682-9693.
   One of the first examples of limited convergence towards an experimental protein structure using molecular dynamics.

21. Hünenberger PH, Mark AE, Van Gunsteren WF: Fluctuation and cross-correlation analysis of protein motions observed in nanosecond molecular dynamics simulations. J Mol Biol 1995, 252:492-503.

22. Smith LJ, Mark AE, Dobson CM, Van Gunsteren WF: Comparison of MD simulations and NMR experiments for hen lysozyme. Analysis of local fluctuations, cooperative motions and global changes. Biochemistry 1995, 34:10918-10931.

23. Novotny J, Bruccoleri RE, Karplus M: An analysis of incorrectly folded protein models. Implications for structure predictions. J Mol Biol 1984, 177:788-818.

24. Novotny J, Rashin AA, Bruccoleri RE: Criteria that discriminate between native proteins and incorrectly folded proteins. Proteins 1988, 4:19-30.

25. Wang Y, Zhang H, Scott RA: Discriminating compact non-native structures from the native structure of globular proteins. Proc Natl Acad Sci USA 1995, 92:709-713.

26. •• Yang A-S, Honig B: Free energy determinants of secondary structure formation I. Alpha-helices. J Mol Biol 1995, 252:351-365.
   This paper, together with [27,28], describes one of the most thorough studies that tests the use of a molecular mechanics force field with an approximate solvation model against experimental data on larger molecules.

27. Yang A-S, Honig B: Free energy determinants of secondary structure formation II. Anti-parallel beta sheets. J Mol Biol 1995, 252:366-376.

28. Yang A-S, Honig B: Free energy determinants of secondary structure formation III. Beta-turns and their role in protein folding. J Mol Biol 1996, 259:873-882.

29. Sitkoff D, Sharp KA, Honig B: Accurate calculation of hydration free energies using macroscopic solvent models. J Phys Chem 1994, 98:1978-1988.

30. •• Jernigan RL, Bahar I: Structure-derived potentials and protein simulations. Curr Opin Struct Biol 1996, 6:195-209.
   An excellent review of the principles of database potentials.

31. Miyazawa S, Jernigan RL: Residue-residue potentials with a favorable contact pair term and an unfavorable high packing density term, for simulation and threading. J Mol Biol 1996, 256:623-644.

32. • Bahar I, Jernigan RL: Inter-residue potentials in globular proteins: dominance of highly specific hydrophilic interactions at close separation. J Mol Biol 1997, 266:195-214.
   A new treatment of residue-residue distance dependent potentials of mean force. Highly specific short range electrostatic interactions are apparent.

33. • Jones DT, Thornton JM: Potential energy functions for threading. Curr Opin Struct Biol 1996, 6:210-216.
   A review of the database potentials currently in use for fold recognition applications.

34. •• Lemer CMR, Rooman MJ, Wodak SJ: Protein structure prediction by threading methods: evaluation of current techniques. Proteins 1995, 23:337-355.
   A critical evaluation, based on objective testing, of the effectiveness of fold recognition methods.

35. • Second meeting on the critical assessment of techniques for protein structure prediction on World Wide Web URL: http://ins4.carb.nist.gov/casp2/
   Access to new results of objective tests of threading methods.

36. Sippl MJ: Recognition of errors in three-dimensional structures of proteins. Proteins 1993, 17:355-362.

37. Lüthy R, Bowie JU, Eisenberg D: Assessment of protein models with three-dimensional profiles. Nature 1992, 356:83-85.

38. Subramaniam S, Tcheng DK, Fenton JM: A knowledge-based method for protein structure refinement and prediction. In Proceedings of the Fourth International Conference on Intelligent Systems for Molecular Biology. Edited by States D et al. Menlo Park: AAAI Press; 1996:218-229.

39. •• Sippl MJ, Ortner M, Jaritz M, Lackner P, Flöckner H: Helmholtz free energies of atom pair interactions in proteins. Fold Des 1996, 1:289-298.
   An elegant analysis of the use of database potentials to look at the free energy of interactions in proteins.

40. McQuarrie DA: Statistical Mechanics. New York: Harper Collins; 1976.

41. •• Thomas PD, Dill KA: Statistical potentials extracted from protein structures: how accurate are they? J Mol Biol 1996, 257:457-469.
   A test of the principles of database potentials using fully characterized simple lattice systems.

42. Lau KF, Dill KA: Theory of protein mutability and biogenesis. Proc Natl Acad Sci USA 1990, 87:638-642.

43. • Bryant SH: Evaluation of threading specificity and accuracy. Proteins 1996, 26:172-185.
   This paper demonstrates that the limits of threading methods are mainly due to imperfect fold models, rather than to the inability of database potentials to identify the correct structure.

44. Prostar: the protein potential site on World Wide Web URL: http://prostar.carb.nist.gov