J. Leszczynski (Editor)
Computational Molecular Biology Theoretical Computational Chemistry, Vol. 8 ©1999 Elsevier Science B.V. All rights reserved
85
Chapter 3
C O M P U T A T I O N A L A P P R O A C H E S TO THE S T U D I E S OF THE I N T E R A C T I O N S OF N U C L E I C ACID B A S E S J. Sponer, *a P. H o b z a , a and J. Leszczynski b
aj. Heyrovsk3) Institute of Physical Chemistry, Academy of Sciences of the Czech Republic, Dolej~kova 3, 182 23 Prague, Czech Republic. bDepartment of Chemistry, and Computational Center for Molecular Structure and Interactions, Jackson State University, Jackson 39217, MS. 1. I N T R O D U C T I O N The three dimensional structures and dynamics of biomacromolecules substantially influence all molecular recognition processes. These phenomena have attracted computational chemists for years despite the fact that the application of computational chemistry in studies of biopolymers is not easy. Biopolymers are large and a proper description of them requires inclusion of a solvent and counterions. Sufficiently long molecular dynamics (MD) simulations on large biomolecules are still intractable even with the simplest empirical potentials. Further, an exceptionally accurate energy function is necessary for a proper description of biopolymers since the energetic scale of all biomolecular processes is very tiny. Therefore, theoretical studies of biopolymers must always seek for a balance between the reliability (size) of the studied system and the quality of its description. No single theoretical approach could study all aspects of biopolymers. Numerous techniques and models are used simultaneously, and sometimes even a simple approach can be very useful by providing at least fragmentary explanations and rationalizations. The recent years are characterized by a significant improvement in theoretical methods used in studies of the structures and dynamics of nucleic acids. The most impressive development we have evidenced was the advance of nanosecond MD simulations. Currently, all-atom molecular dynamics techniques for the first time provide stable trajectories of hydrated oligonucleotides on a nanosecond scale without using any constraints. The old MD studies were limited to very short time intervals. When longer simulations
86 become available the DNA double helix was disrupted unless constraints were applied. This was caused by truncation of long-range electrostatic interactions. This deficiency has now been overcome and stable trajectories are obtained when the particle mesh-Ewald (PME) technique for the electrostatic interactions is used [ 1]. After first studies appeared in 1995 [2], dozens of PME papers can be found in the current literature [3], and this method dominates the theoretical studies on nucleic acids these days [3]. Furthermore, the procedure can also be used as an accurate tool in refinements of NMR structures [4]. The method is still limited by the quality of force fields and insufficient sampling (length of simulations)[5]. Simple atom-atom pairwise additive empirical potentials cannot, despite their continuous refinement, properly include all contributions which are important. In addition some results such as A - D N A - B-DNA equilibrium have been shown to be force-field dependent. The introduction of nonadditive polarizable potentials is considered to be the next step forward [6], and such potentials would allow for a consistent description of the solvent effects in polar solvents. They are under development, and the induction effects are mostly accounted for by using point atomic polarizabilities [6]. The current pair-additive empirical potentials are not accurate enough to satisfactorily describe metal-cation containing clusters because of the huge induction and charge-transfer effects in these systems. Of the other recent achievements, the application of high-quality (ab initio) quantum chemical calculations to biological problems should be mentioned. Ab initio calculations possess a unique feature that their quality and accuracy can be compared with accurate gas phase experiments [7]. No other theoretical method provides such reliability. Recent progress in computational quantum chemistry has been highlighted by the 1998 chemistry Nobel prize awarded to Pople and Kohn. The main aim of quantum-chemical calculations for biological systems is to complement experiments and to provide information and predictions which are not accessible by experimental techniques. If we consider studies of nucleic acids and their components, there is currently only one reliable gas phase experiment (mass-field spectroscopy) on the energetics of nucleic acid base pairs [Ta]. This experiment provides interaction enthalpies of several base pairs at rather high temperatures of 350-400K. The experimental technique does not allow for the determination of the geometry of complexes, and it is even not possible to distinguish between H-bonded and stacked structures. Recent state-of-the art theoretical analysis of gas phase thermodynamics of uracil dimer indicates that this pair exists as a mixture of several H-bonded and stacked structures [8]. Furthermore, their relative populations could vary depending on experimental conditions. It is likely that
87 similar situation will occur also for other base pairs which would make any experimental assignment exceptionally difficult. In contrast, current ab initio calculations can characterize any configuration of base pair with similar accuracy even those which can never be observed experimentally for isolated base pair. Literally, the whole conformational space of the base pairs can be investigated. Further information can be obtained by performing vibrational analysis though one must keep in mind that in many cases the harmonic approximation is not valid, and multi-dimensional nonharmonic treatment is inevitable [9]. The ab initio calculations can be used to parameterize a force field, and subsequent simulations can evaluate the thermodynamics of gas phase formation of base pairs. There are now numerous examples of joint experimental and theoretical studies on molecular clusters [10]. Theory and experiments are of similar power for smaller clusters (water clusters, benzene argon clusters, benzene dimer, etc.). However, when going to more complicated systems such as base pairs, the theory seems to be superior these days though progress in experimental studies is very desirable [7]. Because empirical parameters are avoided in the ab initio calculations, such calculations have had a major role in parameterization and testing of many recently introduced force fields for biopolymers [11,12]. In contrast to empirical potentials, ab initio calculations can be used in the study of metal cation-containing complexes which are very important in understanding the many aspects of structure and function of biomolecules [ 13]. The importance of ab initio theory is expected to grow since the calculations are more and more reliable for larger systems. Nevertheless, currently almost all ab initio calculations assume a gas phase systems so that some re-scaling or reparameterization is inevitable if the force field is to be used for condensed phase simulations. The range of quantum-chemical applications to the problems of biological interest can be extended by using the very economical Density Functional Theory (DFT) techniques [15] though presently their accuracy is very limited for weak intermolecular interactions [ 16]. In this chapter we are going to summarize recent quantum-chemical studies on the interactions of complexes of nucleic acid bases. Interactions of bases substantially influence structure and function of nucleic acids, and their comprehensive quantum chemical characterization represents one of the most successful applications of quantum chemistry to biomolecules so far. The review starts with a brief summary of published papers, then it continues by describing the most important methodological aspects of the calculations, and finally summarizes selected recent results obtained in the laboratories of the authors.
88 2. HISTORICAL OVERVIEW OF AB INITIO STUDIES ON NUCLEIC ACID BASE PAIRS Let us present a brief overview of the ab initio calculations of the interactions of nucleic acid bases. Since the literature is rather wide, we cannot give credit to all studies. We are referencing to those papers which have brought new insight to understanding of DNA structure. Before the advance of ab initio method, techniques of a semi-empirical nature were also applied to studies on base pairs. An extensive semi-empirical study of stacking and H-bonding interactions was made by Langlet et al. [ 16]. DNA base interactions were then further studied by Ornstein and Fresco [ 17], Forner et al. [18], and Otto [19]. Del Bene investigated the association of AT and GC base pairs with Li + and H + at the ab initio HF/STO-3G level [20] Hobza and Sandorfy [21] studied 29 H-bonded DNA base pairs at the HF/MINI-1 level with the step-by-step method using the counterpoise correction and including the empirical London dispersion energy. This study set a benchmark for nearly a decade. The empirical potential counterpart of this study has been reported by Poltev and Shulyupina together with an extensive characterization of the base stacking [22]. Both stacking and H-bonding interactions were studied by Aida and Nagata at the HF/4-31G and HF/6-31G levels [23]; the dispersion energy was evaluated by the second-order sum-of-states perturbation method. This was especially important to evaluate the base stacking many years before the second-order M611er-Plesset method could be applied. Anwander et al. [24] studied, using a minimal basis set, the interaction of monovalent and divalent metal ions with AT and GC pairs. They reported significant changes in the stability of the base pairs due to metal ion binding. Similar effects were noticed for model complexes with imidazole [25]. Kozelka and co-workers developed an ab initio-based force field for platinated bases [26]. Colson et al. [27] have thoroughly investigated electron affinities and ionization potentials for bases, base pairs, and backbone mostly at the HF/6-3 I+G*//HF/3-21G level including water molecules from the first hydration shell in some cases. The main focus of their research were nucleic acid radical-cation base pairs and other species involved in the damage of nucleic acids due to irradiation. Ab initio calculations with inclusion of electron correlation were first done on isolated nucleobases. Sponer and Hobza introduced the idea of very flexible and intrinsically nonplanar DNA base amino groups and amino-acceptor interactions into DNA structure analysis [28] and have confirmed the conclusions reached by earlier HF level comprehensive investigations of the nonplanarity of bases by Leszczynski [28f]. Then several base pairs were
89 characterized at the MP2 level using the Hartree-Fock gradiently optimized geometries [29]. Florian et al. investigated double proton transfer in the AT and GC base pairs at various levels of theory up to the MP2/6-31G**//HF/6-31G** level [30]. Hobza et al. analyzed stacked and H-bonded cytosine dimer at the MP2/6-31G* level using diffuse polarization functions [31]. This study is the first generally reliable analysis of nucleic acid base stacking and has ruled out the induction theory of stacking. These calculations were extended by Sponer and co-workers on many other stacked and H-bonded base pairs [12a,b-e,32] including a systematic comparison of the correlated ab initio data with different empirical potential models [12a,b,d] and Density Functional Theory calculations [12b]. The postulated empirical pi-pi ("sandwich") interaction model of aromatic stacking was not confirmed [12a]. A systematic analysis of the nonplanarity and flexibility of the base pairs demonstrated that many base pairs including all GA mismatches are intrinsically nonplanar [33]. The next study investigated nonadditivities in stacked DNA base pair steps and the sugar...base stacking occurring in some crystals of nucleic acids [34]. Brammeld et al. applied local MP2 procedure with an extended basis set of atomic orbitals on several H-bonded base pairs and parameterized a very accurate force field for H-bonded base pairs [12g]. Sponer and Hobza carried out the first CCSD(T) calculations on base pairs and model complexes [35]. They have confirmed that the MP2 procedure works well for H-bonding and nonaromatic stacking while it overestimates the stabilization in all aromatic stacked clusters [35b,36]. Due to a fortunate compensation of errors, the MP2 procedure with medium sized diffuse-polarized basis set should provide a very good estimate of base stacking energies. Hobza and ~;poner also reported the first MP2 gradient optimization on stacked DNA base pairs [37]. ~;pirko et al. attempted the first inharmonic vibrational analysis of base pairs [9]. Bertran et al. studied proton-transfer in radical cation base pairs and provided a set of useful comparative studies on model complexes [38]. Gu and Leszczynski investigated H-bonding structures of guanine tetramers [32d]. Luisi and coworkers carried out a combined crystallographic and ab initio study exploring the ability of amino groups to act as H-acceptors in biomolecules [39]. Also the splitting pathway of irradiation- induced species such as cyclobutane-type uracil dimer radical cation was studied [40]. The thermodynamics of formation of a gas phase uracil dimer was characterized by Kratochvil et al using a combination of empirical potential and ab initio approaches [8]. Due to increased computer power the main interest in ab initio studies of base pairs shifted to metal-cation containing clusters important for biomolecules. Carloni and Andreoni characterized the solid-state structure of a platinated base
90 pair using DFT [14a] while the ab initio method with Effective Core Pseudopotentials (ECP) has been used for some other metalated bases [41]. Burda and co-workers carried out extensive high-level ab initio calculations on the interactions of bases and base pairs with fifteen metal cations using relativistic pseudopotentials [42]. The calculations revealed large nonadditivity of interactions, polarization and charge-transfer effects and demonstrated the qualitative failures of an empirical potential treatment. A large difference between guanine and adenine containing base pairs has been noticed. The calculations were then extended by including the first hydration shell of the cations [13c]. A very different balance of cation-nucleobase and cation-water (hydration) interactions was found for Z n 2+ and M g 2+ divalent cations [13c]. Preliminary investigation of the gas phase energetics of a platinated pair has been reported by Zilberberg and co-workers. Recently the first ab initio studies have been published considering the influence of solvent effects on base stacking and H-bonding [43]. Zhanpeisov and Leszczynski studied selected H-bonded base pairs in an environment of small water clusters mimicking the first hydration shell [43a-d]. Gorb and Leszczynski characterized water-assisted tautomerization of nucleobases [43e]. Orozco, Luque, and coworkers applied sophisticated classical and quantumchemical techniques to predict the tautomeric equilibria of bases in a polar solvent 43f, g]. Subramanian et al. studied base stacking in the cytosine dimer using the Onsager formalism of SCRF (self consistent reaction field) technique [44]. Florian et al. carried out an extensive set of calculations on both H-bonded and stacked pairs by combining the Langevin Dipole (LD) approach with correlated ab initio calculations [45]. Clearly, proper inclusion of solvent effects into the calculations is perhaps the most important though exceptionally difficult task of contemporary quantum-chemistry [46]. Reliable inclusion of solvent effects is necessary to directly compare the calculations with experimental data on the nearest-neighbor stacking properties in nucleic acids [47] and many other molecular recognition experiments carried out in a condensed phase [48]. Much effort is currently devoted to developing more economical quantumchemical methods for biopolymers. Semiempirical techniques are very inaccurate (even compared to force fields) for description of molecular interactions [12d,e]. More promising are the Density Functional Theory methods [14]. A high quality DFT method can provide, for some applications, results comparable to those obtained by good quality, traditional ab initio techniques. However, high quality DFT techniques are still costly. Furthermore, no DFT method currently works for dispersion-controlled interactions [ 12b, 15]
91 including base stacking [12b] which is evidently inadequate for investigations of biomolecules. The development of DFT methods able to treat van der Waals interactions is under way [ 15c, d]. 3. M E T H O D S 3.1. Levels of ab initio treatment of base pairs When considering the complexes of nucleic acid bases, three distinct levels of traditional ab initio treatment can be used. a) Hartree-Fock (HF) approximation. Its applicability is limited by a complete neglect of electron correlation effects which lead to two substantial inaccuracies. i) Due to the absence of intramolecular electron correlation effects, the dipole moments of monomers and electrostatic interactions are overestimated. ii) The dispersion attraction which originates in intermolecular electron correlation effects is neglected. The dispersion energy is especially important in biopolymers, and its role increases with the size of the molecule. The Hartree-Fock method is useful in studies of H-bonded complexes and complexes containing metal cations, although for quantitatively correct results, the electron correlation method should be applied. The Hartree-Fock method underestimates the flexibility of the amino groups of nucleic acid bases [28b] and fails for stacked complexes. b) The second-order M611er-Plesset (MP2) method includes a significant portion of electron correlation effects and can be used for a consistent description (with a reasonable basis set of atomic orbitals) of all interactions of nucleic acid bases. Let us point out that correlated calculations are mostly done within the frozen core approximation, i.e., only valence electrons are considered for the electron correlation. The MP2 method overestimates the stabilization in aromatic stacking clusters, thus care should be taken when selecting a basis set for such systems. (The basis set should be augmented by diffuse d-polarization functions but should not be saturated [3 5b].) The MP2 method is computationally much more demanding compared to the Hartree-Fock theory. There are attempts to develop less costly variants of the MP2 procedure which are called local MP2 or LMP2. For large systems the speed-up of the LMP2 can be enormous [49]. The LMP2 method has recently been used by several groups to study fragments, of biomolecules [12c] including DNA base pairs [ 12g]. c) Coupled Cluster method with noniterative triple excitations (CCSD(T)) is the next step in accuracy [50]. This approach provides essentially the same
92 results as the MP2 level for hydrogen bonding, nonaromatic stacking [51 ], and amino group pyramidalization [51]. It brings a substantial change compared to MP2 results for aromatic stacking clusters [35b,36]. Due to the enormous computer requirements of the CCSD(T) method, there is only a limited data available, and some further differences between the MP2 and CCSD(T) levels might yet be established. The alternative to CCSD(T) is the full MP4(SDTQ) procedure although the CCSD(T) and MP4 results can differ in some aspects [36]. We do not recommend the use of cheaper alternatives of full MP4 such as MP3 or MP4(SDQ) (MP4 without triple electron excitations) [35b]. (d) Let us briefly comment on Density functional theory (DFT). DFT calculations have been widely used in the past few years. Here the exact exchange term employed in the HF method is replaced by a more general expression, the exchange-correlation functional. This includes the exchange term and also contributions to the electron correlation energy. DFT is much less demanding than the standard correlated ab initio methods. Because the exchange-correlation functional in the DFT calculations could be defined in different ways, the results strongly depend on the choice of the functional. This is a significant disadvantage compared to the standard ab initio techniques since no systematic improvement of the results is guaranteed. The DFT calculations (with a proper functional) provide reasonable values of amino group pyramidalization for nucleic acid bases [28e, 52] and related compounds and very good dipole moments, charge distributions [12a,b, 52], and vibrational frequencies of monomers [52]. The currently available DFT techniques completely fail for dispersion-controlled (van der Waals) systems [ 12b, 15]. In view o f the exponentially growing number o f attempts to use DFT for biomoleeules, we have to emphasize that the method is not reliable for weak intermolecular interactions which are very important in biology. 3.2. Choice of basis set
A minimal requirement for all applications is to have a basis set of doublezeta (DZ) quality with at least one set of d-polarization functions on second row elements [53,54]. Such a basis set provides a very good description of the electrostatic field around the bases and electrostatic interactions. However, the amino group nonplanarity is (at the correlated levels) exaggerated [ 12a]. Here a triple-zeta (TZ) quality basis set with two sets of d functions is likely to be sufficient. In case of base stacking calculations, one must use diffuse d- polarization functions on the second row elements [55]. Failure to include diffuse dfunctions results in a qualitative underestimation of stabilization (by ca 50 %).
93 One set of diffuse d-functions with a momentum-optimized exponent of 0.25 provides an already reasonable value for the dispersion stabilization [55]. Two sets of d-polarization functions with standard exponents of 1.6 and 0.4 do not provide better description. The addition of a diffuse sp shell to the standard dpolarization functions (6-31+G* basis set) is not sufficient. Computational studies conclude that p-polarization functions on hydrogen atoms are not very important, and the f-polarization functions on second-row elements bring only a small improvement in stabilization [35,36]. Because the MP2 procedure overestimates the aromatic stacking stabilization energies, attempts to reach the basis set limit at the MP2 level should be avoided. All reference calculations carried out so far indicate that one can use either the MP2 level of theory with medium-sized diffuse polarized basis sets of atomic orbitals or the CCSD(T) method with large very diffuse basis sets. The diffuse d-polarization functions (compared to standard d-functions) have a rather marginal influence on the correlation interaction energy for H-bonded base pairs [35b]. In contrast to stacking the higher momentum angular functions were found to be very important [35b, 56]. Comparison of aug-cc-pVDZ, ccpVTZ, and aug-cc-pVTZ basis sets for H-bonded formamide and formamidine dimers shows still a nonnegligible difference in the correlation interaction energies calculated with the two largest basis sets [35b]. The correlation part of the interaction energy for H-bonded systems converges slowly with the size of the basis set, compared to stacked dimers [35b]. Fortunately, H-bonded systems are dominated by the Hartree-Fock contribution to the interaction energy, and the correlation part of the interaction energy amounts to less than 20-30% of the interaction energy.
3.3. Evaluation of interaction energies Interaction energy between molecules A and B (AE AB) is determined as the difference between the energy of the dimer (E A'B) and the sum of the m o n o m e r energies (EA+ EB). AEAB = (E AB) _ (E a + EB).
(1)
The interaction energy, AE ABC , of a trimer ABC can be expressed in two ways [42b]: as a difference of the electronic energy of the complex and of the monomers AEABC= EABC _ [EA+ E B + EC],
(2)
94
or as a sum of three pair additive contributions and the three-body term AE 3 AEABC=AEAB+AEAC+AEBC+AE3=EAB_[EA+EB]+EAC.[EA+EC]+EBC_[EB+EC]+AE3
(3)
The three-body term can be expressed as: AE 3 = EABC _ E As_ E AC_ EBc+ EA+ EB+ E c
(4)
The methodology is extended accordingly for tetramers and larger complexes. Analysis of extended systems can be simplified by treating a selected group of molecules as one subsystem. For example a hydrated cation (cation plus water molecules) could be treated as one subsystem when evaluating the nonadditivity of interactions in complexes between hydrated cations and base pairs and trimers [ 13c]. The stability of a dimer is further influenced by the deformation of the monomers upon formation of the complex. This can be evaluated by subtracting the energy of the optimized isolated monomers (E Ai, E Ai) from the energies of the monomers in the dimer geometry (E A, EB). The respective deformation (relaxation) energy AEDEFis positive (repulsive). AE oEF = (E A_ E Ai) + (E a - E Bi)
(5)
The total complexation energy AET is thus defined as AET(AB) = AE AB + AEDEF
(6)
and equations (5) and (6) can be extended in a straightforward manner also to trimers and larger clusters. If the calculations are made with inclusion of electron correlation effects, then interaction energies AE and their components consist of HF and electroncorrelation components (AE = AE HF + AE c°R)
(7)
The former term basically includes the electrostatic, induction, charge-transfer, and electron-exchange contributions. The dispersion energy originates in the electron correlation which also influences the other contributions.
95
3.4. Basis set superposition error Evaluation of the ab initio interaction energy is substantially complicated by the so-called basis set superposition error (BSSE) [57]. This artifact arises from the final size of the basis set and is to be eliminated by the full counterpoise procedure (CP) [58]. The monomer energies are evaluated with the full basis set used for the dimer in the geometry of the complexes (designated E A', EB'). Then the equation (1) is modified as follows: AEAB= (EAB) _ (EA'+ E B')
(8)
Elimination of BSSE in a trimer requires to carry out all calculations in the trimer-centered basis set. The full counterpoise procedure has been proposed independently by Jansen and Ross [58a] and Boys and Bernardi [58b]. This technique has been several times alleged to overcorrect the BSSE. However, recent comprehensive investigation conclude that the full counterpoise procedure is rigorously correct for closed-shell interactions and that the overcorrection argument is fundamentally wrong [57]. One should not follow recommendations to skip the CP procedure at correlated levels "because it worsens the agreement with some target (experimental) data or expected trends." [57] These disagreements are always due to other substantial inaccuracies in the calculations opposing the BSSE, namely, the insufficient size of basis set for electron correlation. When the CP correction is not applied some low quality calculations can really seem to be closer to the experiment, but this compensation of errors does not work in the same way when going from one system to another. Magnitude of BSSE differs for small and large complexes, for H-bonded and stacked dimers, and is very dependent on the basis set. Let us note that in some cases the BSSEcorrected correlation interaction energy can be close to zero or even repulsive. It does not indicate any overcorrection by CP; it is a correct result caused by a reduction of electrostatic attraction at a correlated level [57a, 32a]. Since the CP correction provides an exact value of BSSE (it is not an estimate [57]) it is fully justified to make calculations even when this correction is large compared to the interaction energies. Such an effect has been noticed in correlated calculations of stacked clusters with medium sized diffuse basis sets.
3.5. Geometry optimization Before the interaction energies can be evaluated, the optimal geometry of the complexes should be determined. Nowdays geometry optimizations are based almost exclusively on gradient techniques. Gradient optimizations allow the
96 intramolecular and intermolecular degrees of freedom to relax simultaneously so that a fully optimized structure can be obtained. Subsequent evaluations of the harmonic vibrational frequencies verify the nature of the optimized structure and that the thermodynamic properties of base pair formation can be calculated [12g, 53]. The harmonic approach, however, cannot be used in case of base pairs characterized by strongly inharmonic lowest intermolecular vibrational modes [9]. Gradient optimization at the HF level is suitable for studies of the H-bonded base pairs, complexes with metal cations, and hydrated complexes. Stabilization of these systems is dominated by Hartree-Fock interaction energy. A correlated level of theory must be used for stacked base pairs. Base stacking in nucleic acids shows very variable conformations, and many of them are clearly outside the low energy region of a stacked nucleobase dimer [12a]. Thus, a proper analysis of base stacking requires an exhaustive search over the conformational space rather than an optimization. It can be done using step-by-step MP2 calculations with rigid monomers. Some nucleobase dimers even have no minimum corresponding to the stacked arrangement [37]. In many cases it is very useful to perform optimizations with partly frozen geometries, for example, in studies of interactions occurring in bigger systems such as crystals [28a,d, 39]. Unfortunately, the currently used gradient techniques are not corrected for the basis set superposition error which introduces a nonnegligible artifact into the calculations. In the case of stacked base pairs, the sum of basis set superposition errors from HF and correlation contributions compares to the actual stabilization energies [37]. Recently several papers reported BSSE-free gradient optimizations[54], and such calculations will be essential to improve the computations especially with inclusion of electron correlation effects. 4. RESULTS 4.1. Structures of H-bonded DNA base pairs Extensive studies in the past years have provided the molecular geometries of more than 100 nucleic acid base pairs (Figure 1) and trimers which were obtained at the HF level with medium sized polarized basis sets of atomic orbitals [12g,13c,29,30,32,33,43,59] including pairs with protonated bases [32b], rare tautomers [13c,43], modified, and nonnatural bases [59]. Many noncanonical base pairs are nonplanar [33]: all GA mismatch (Figure 2) pairs, TC pairs, 2-aminoadenine.thymine, and several GG pairs. They are significantly propeller-twisted and buckled, frequently with the hydrogens of
97
Figure 1a. Canonical Watson-Crick Adenine-Thymine base pair.
Figure lb. Canonical Watson-Crick Guanine-Cytosine base pair. the amino groups bent out of the molecular plane. These hydrogens can form attractive proton-proton acceptor interactions oriented out of the molecular plane of the base pair while the nitrogen can serve as a weak H-acceptor [28a,39]. The energy difference between nonplanar and planar base pairs is 0.0 4 kcal/mol. The nonplanarity of pairs is due to secondary interactions, amino group flexibility, and steric reasons. The secondary interactions (electrostatic interactions involving groups which do not participate in the same primary hydrogen bond) are both attractive ((C)H..-O, (N)H..-O, (C)H...N, (N)H---N) and repulsive (H.--H, O-..O). The amino groups are planarized by the primary hydrogen bonds but can be nonplanar if there is a hydrogen - hydrogen repulsion with the other monomer. This effect is especially well-pronounced if the amino group does not participate in base pairing. For example, a G(anti)A(anti) base pair contains a free amino group of guanine which interacts with the H2 hydrogen atom of the opposite adenine. After propeller twisting and amino-group pyramidalization, the repulsive electrostatic inter-hydrogen interaction is weakened, and instead, the H(C2) hydrogen atom points towards the negatively charged lone electron pair above the amino nitrogen atom. The large flexibility and nonplanarity of GA mismatch base pairs is known from
98 NMR studies [60] though the authors seem not to be aware that it is an intrinsic property of the pairs. The high-resolved crystal structure of d(CCAAGATTGG)2 B-DNA decamer [61] shows a significantly nonplanar GA anti-anti mismatch base pair with close contact between the unpaired amino group of the guanine and thymine carbonyl group belonging to the adjacent AT base pair. Ab initio calculations unambiguously predict that the guanine amino group is highly nonplanar in this configuration and that formation of an out-ofplane hydrogen bond brings about ca. 3 kcal/mol of additional stabilization [33].
Figure 2. Stereoscopic view of two GA mismatches optimized at HF/6-31G** level [33]. A very interesting result is revealed by an inharmonic vibrational analysis of the AT WC base pair [9]. The effective geometry of this pair is nonplanar despite that the planar structure is a minimum on the potential energy surface. The difference between equilibrium and effective geometries is because the molecular vibrations shift the nuclear positions from their equilibrium positions. This might be the case of many other base pairs which are currently assumed as planar complexes. Thus spectroscopic experiments in the gas phase should reveal a nonplanar, double-minimum structure for base pairs.
99 Also, a rather large conformational flexibility of the pyrimidine rings of the bases could to some extent contribute to the flexibility of pairs [62].
4.2. Energies of H-bonded DNA base pairs H-bonded base pairs are stabilized mainly by electrostatic attraction. The highlevel ab initio calculations provide reliable data on the energetics of H-bonding which can be used for verification and parameterization of force fields and as a substitution for the missing experimental data. Table 1 shows the energetics of selected base pairs evaluated for the planar optimized structures. The most stable neutral pair among the studied species is the GC Watson-Crick pair. Its calculated gas phase interaction energy is -25.8 kcal/mol, and the complexation energy (interaction energy + reorganization energy of bases) amounts to -23.7 kcal/mol. The second most stable pair is the GG1 pair (interaction energy -24.6, complexation energy -22.1 kcal/mol). Interaction (complexation) energies of all four AT pairs (Watson-Crick, Hoogsteen, Reverse Watson-Crick, and Reverse Hoogsteen) are within-12.3 to -13.3 (-11.7 to -12.6) kcal/mol while the complexation energies of the weakest pairs are within -9 to -10 kcal/mol only. The reorganization energy of the bases ranges, for neutral pairs, from +2.5 kcal/mol (GG1 pair) to +0.5 to +0.6 kcal/mol for the weak pairs. Our calculations agree within 2 kcal/mol with the mass field spectroscopy data [7] although this experiment does not reveal the molecular structures of studied complexes, and a mixture of several base pairs could be in fact present under experimental conditions [8]. It is encouraging that our data agree within 1 kcal/mol with the interaction energies obtained with a larger basis set and the LMP2 technique by Brameld et al. [12g] for several base pairs. (Interaction energies reported by Gould and Kollman [29a] are more negative since the BSSE was not properly corrected.) Two other important interactions must be considered in the case of protonated pairs" the induction interaction and the molecular ion- dipole interaction, i.e., interaction of the charged monomer with the electric field (dipole moment) of the neutral monomer. This interaction is very strong in protonated pairs with a highly polar neutral monomer. For example, the calculated gas-phase complexation energy of the triply-bonded CCH + pair is -45 kcal/mol. The molecular ion- dipole interaction leads to protonation of adenine in the (AC)H + pair. The strong electric field around cytosine shifts the proton to adenine [32b] despite that for isolated bases protonation of cytosine is preferred. It is important to notice that the CCSD(T) procedure provides essentially the same interaction energies as the MP2 procedure for H-bonded pairs [35b]. Basis
100
Table 1 The energetics of planar D N A base pairs [32a,b], trimers[ 13c], and pairs containing thiobases [59a] evaluated at the MP2 level with medium-sized polarized basis sets and using HFoptimized geometries. AE nr - HF/6-31G*(0.25)//HF/6-31G* data, A E MP2 MP2/631G*(0.25)//HF/6-31G*, A E DFT - Becke3LYP/6-31G*(0.25)//HF/6-31 G* data, AE v - AE ~2 + deformation energy of monomers. The designation 6-31G*(0.25) means that diffuse dpolarization functions with an exponent of 0.25 were used (See refs. [13e,32a, b,59a] for the nomenclature used for the structures). pair/trimer AE "r AE MP2 AE DFT AEV GCWC CC GA1 GT2 GC1 GA3 TARH TARWC GA4 TC1 TT2 TT3 GG4 6sG6sG A4sUWC 2sU/sU1 CCH+ G.GC(H) a TAT(H). CH+GO
-24.6 -16.1 -12.2 -13.8 -11.6 -10.8 -10.3 -9.6 -7.9 -8.7 -9.3 -9.3 -6.5 - 19.3 -8.4 -6.9 -43.2 -47.1 -19.1 -66.3
-25.8 -18.8 -15.2 -14.7 -14.3 -13.8 -13.2 -12.4 -11.4 -11.4 - 10.6 - 10.6 - 10.0 -22.3 - 11.8 -9.3 -44.8 -50.1 -25.0 -70.9
c(i)6o
-35.4
-39.8
GG1 GG3 GT1 AC1 AC2 TAH TAWC AA1 TC2 AA2 TT1 GA2 AA3 6sGCWC A2sUWC
-25.1 -16.0 -14.2 -10.8 -10.4 -10.4 -9.7 -7.8 -8.9 -7.2 -9.3 -6.8 -6.2 -23.1 -9.6
-24.7 -17.8 -15.1 -14.3 -14.1 -13.3 -12.4 -11.5 -11.6 - 11.0 -10.6 - 10.3 -9.8 -25.0 -12.8
-26.5 -18.4 -14.5 -14.7 -14.2 -12.7 -12.6 -11.4 -10.8 -10.1 -9.8 -9.6 -9.2 -
-25.1 -17.4 -14.8 -13.7 -13.4 -12.5 -11.9 -10.7 -10.7 - 10.3 -9.9 -9.2 -8.8 -
-23.8 -17.5 -14.1 -13.5 -13.4 -13.1 -12.6 -11.7 -10.7 -10.7 - 10.0 -9.9 -9.3 - 19.9 - 11.1 -8.8 -41.7 -44.6 -23.3 -65.2 -36.4 -22.2 -17.0 -13.9 -13.5 -13.2 -12.7 -11.8 -11.0 -10.7 - 10.3 -10.0 -9.6 -9.2 -22.5 -12.1
101 Table 1 (continued) The energetics of planar DNA base pairs [32a,b], trimers[13c], and pairs containing thiobases [59a] evaluated at the MP2 level with medium-sized polarized basis sets and using HFoptimized geometries. AEHF - HF/6-31G*(0.25)//HF/6-31G* data, AE~2 - MP2/631G*(0.25)//HF/6-31G*, A E DFT - Becke3LYP/6-31G*(0.25)//HF/6-31 G* data, AET- AE~2 + deformation energy of monomers. The designation 6-31G*(0.25) means that diffuse dpolarization functions with an exponent of 0.25 were used (See refs. [13e,32a,b,59a] for the nomenclature used for the structures). pair/trimer AEnv AE~2 AEDFT AET 2sU2sU2 -8.7 - 10.2 -9.6 GGC(rH) -40.7 -44.2 -40.0 AAT(rH)* - 16.4 -22.9 -21.7 GAT* - 18.8 -24.8 -23.2 AHGC* -63.8 -68.7 -63.6 *HF/6-31G* geometries ,
sets of a 6-31G* quality underestimate the correlation contribution to the interaction energy and are rather far from being saturated in this respect. However, this type of a basis set exaggerates the HF component of the binding and somewhat reduces the error at least for some pairs [35b]. Nevertheless we think that the presently available data for the H-bonding interaction energies of base pairs are to some extent undervalued. Nonnegligible inaccuracy can be introduced by performing the gradient optimization at the HF level only. It should be stressed that the next step in accuracy will require making several improvements simultaneously. It includes MP2-1evel gradient optimizations, large basis sets with higher-angular momentum polarization functions, and the use of the BSSE-free optimization procedure! Such calculations seem not to be feasible in the near future. Improving just only some aspects of the optimization procedure (e.g., MP2 optimization not corrected for BSSE) is not sufficient. Note the very good performance of the DFT Becke3LYP parameterization used for energy evaluation at the HF-6-31G**-optimized geometries. On the other hand it seems that the gradient optimization at the Becke3LYP/6-31G** level leads to an overestimation of the reorganization energies of bases and an underestimation of the H-bond lengths [32]. It is suspicious that the H-bond length obtained by the gradient procedure at the Becke3LYP/6-31G** level is by 0.02 A shorter in the CC base pair compared to the MP2/6-31G** gradient optimization despite that the MP2/6-31G** optimization certainly underestimates the H-bond length by ca. 0.06 ,A due to an uncorrected BSSE [32]. However, since almost no reference MP2 optimized geometries are available, no final judgment could be made concerning this DFT technique. The
102 main drawback of the DFT methods is, however, the complete failure in predicting for dispersion-controlled base stacking what devaluates even the good agreement for H-bonding energies. One recent paper reported the gas phase interactions of so-called hydrophobic (nonpolar) base pairs [59b]. Interesting information concerning base pairing can be also deduced from the electrostatic studies by Gadre and coworkers [63]. 4.3. Base stacking interactions There were several contradictory theories on the origin of base stacking interactions, and their evaluation was not possible since there are no gas phase data available for the stacked pairs. None of the previously used quantumchemical methods were reliable enough to properly characterize base stacking. The only exception is perhaps the ab initio study done by Aida [23a,b] though even here the differences with respect to the current values [12a, 34] are substantial. On the other hand the description of basis stacking provided by the empirical potential study of Poltev and Shulyupina was quite successful [22]. This unexpected agreement between modem quantum-chemical and empirical potential results can be easily understood. The most demanding part of a quantum-chemical treatment of base stacking is a proper description of intermolecular electron correlation which is responsible for the dispersion attraction. Semi-empirical methods have never succeeded in including this contribution, and it is still not within the reach of DFT techniques. The use of the MP2 method is the minimal requirement. On the other hand, the dispersion attraction is a rather isotropic contribution. It can be well described by the simple empirical London dispersion energy proportional to polarizabilities and the sixth power of the reciprocal interatomic distance. The stability of stacked pairs originates in the electron correlation (dispersion energy); the orientational dependence (twist, displacement) of the stacking energy is controlled by the Hartree-Fock energy (electrostatic interactions). The predicted stacking energies are sensitive both to the size of the basis set and to the inclusion of higher-order electron correlation contributions. Diffuse polarization functions on the second row elements must always be used. The present MP2/medium-sized basis set level of theory is nevertheless sufficient to reveal, for the first time, the nature of base stacking interactions and to validate and/or rule out various models of base stacking. The calculations are probably very close to the actual values due to a compensation of errors (MP2 vs. CCSD(T) level, medium vs. large basis set, see method section and [3 5hi).
103 One of the most surprising and important results in our studies can be highlighted as follows" The standard coulombic term with point charges localized on the atomic centers is sufficient to describe the electrostatic part o f the stacking interactions. The charges must be obtained quantum-mechanically
from the electrostatic potential with inclusion of electron correlation and with a basis set of at least 6-31G* quality. Such charges are already very similar to those obtained with large basis sets [12a,b]. We were able to reproduce (within 1.5 kcal/mol) ab initio stacking energies for almost 250 geometries of 10 different neutral stacked dimers using a standard empirical potential consisting of the coulombic and Lennard-Jones terms. The only exception was a region of vertical separation of bases below 3.3 A [12a,b, 59]. The inaccuracy found for the dimers with reduced intermonomer separations might be, for example, due to the anisotropy of the short repulsion neglected by the potential. The calculations ruled out the "induction" stacking model proposed by Bugg et al. [64] in which a significant stabilization is expected to arise from the interactions of polar exocyclic groups of DNA bases with the delocalized nelectrons of aromatic rings of adjacent bases. No such interactions were found; the induction theory of stacking is due to an inappropriate interpretation of the crystal structures as suggested many years ago by Poltev and Shulyupina [22]. The induction model of stacking was, for example, proposed to explain the unusual high twist- high slide geometry of the CpA steps in B-DNA crystals. The unusual stacking properties of the CpA steps can be rationalized considering the interaction between the guanine amino group and adenine six membered ring [65]. Also, no support was found for the n-n interaction model [66]. This is an important result because both induction and the n-n model of stacking would imply failure in the currently used empirical potential form. We have also tested the performance of a distributed multipole analysis at the correlated level of theory but the results were not satisfactory, at least in our case [12b]. We have also tested the use of additional out-of-plane charges, but these provide no substantial improvement. The present ab initio calculations made for all possible stacked dimers of adenine, cytosine, guanine, and uracil estimate the optimal gas-phase stacking energies to range from -12 kcal/mol (GG) to -7 kcal/mol (UU). Table 2 compares the gas phase interaction energies of the H-bonded and stacked (cf. Figure 3) nucleobase dimers. For all dimers there is at least one H-bonded base pair which is more stable than the best stacked arrangement. Despite this, some populating of the stacked structures in the gas phase dimers can be expected since structures that are energetically less favored are frequently stabilized by the entropic term [8].
104 Table 2 Comparison of gas phase interaction energies of stacked and H-bonded base pairs. The data for the H-bonding base pairs were obtained at the MP2/6-31G*(0.25)//HF-6-31G* level under Cs symmetry. Deformation energy is not included. The data for stacked pairs were obtained at the MP2/6-31G*(0.25) level with rigid monomers; the optimal geometries were estimated using a combined ab initio/empirical potential search with rigid coplanar bases. AEHvHartree-Fock interaction energy, AE c ° R - correlation interaction energy, AE~2 - MP2 interaction energy. Data taken from ref. [12a]. stack AEnr AE c°R AE MP2 H-bond AEHF A E c°R A E MP2 GG -0.84 - 1 0 . 4 7 -11.31 GG1 -25.08 +0.39 -24.69 GA +1.30 - 1 2 . 4 7 -11.16 GA1 -12.22 -3.01 -15.23 GU -1.17 -9.45 -10.62 GT1 -14.23 -0.92 -15.15 CA +0.85 -10.35 -9.50 CA1 -10.83 -3.51 -14.34 GC -1.44 -7.87 -9.32 G C W C -24.58 -1.23 -25.81 AU +1.25 -10.33 -9.08 TAH -10.38 -2.94 -13.32 AA +4.01 -12.84 -8.83 AA1 -7.83 -3.72 -11.55 CC -2.09 -6.17 -8.26 CC -16.15 -2.66 -18.81 CU -1.51 -7.01 -8.52 CT1 -8.68 -2.67 -11.44 UU +0.46 -6.98 -6.52 TT2 -9.29 -1.35 -10.64 We also evaluated the stacking energies for base pair steps in high-resolved BDNA and Z-DNA crystal structures and in an ideal B-DNA double helix [34]. The calculations (Table 3) showed a surprisingly small sequence-dependent variability of the total base pair step stacking energy; its values ranged from 10 to - 14 kcal/mol. However, the intrastrand and interstrand contributions to the stacking energy varied very significantly. We think that base stacking influences the sequence-dependent variability of DNA through the delicate balance between intrastrand and interstrand contributions rather than through the total base pair step stacking energy. The calculations also demonstrate the nonadditivity of interactions for base pair steps which can amount to ca. 20% of the total stacking energy [34]. This nonadditivity has been estimated only at the SCF level and can thus be underestimated. Similar calculations were reported for A- and B-DNA geometries by Alhambra et al. [67]. However, by not using diffuse polarization functions, the result is an underestimation of the stabilization energies. It explains why Alhambra et al. reported weaker base pair step stacking even though they have made calculations for larger system having methylated N1/N9 positions of bases. The protonated stacked dimers are gaining further stabilization from induction and molecular ion - dipole interactions. Neglection of the induction interaction is responsible for the fact that the current empirical potentials do not reproduce the changes in interaction energies due to DNA base protonation [32b]. -
105
G...G
G...A
A...A
I
t N~
o -11.3
-11.2
G...C
-8.8
A...U
~'N/ o
.
G...U
I~ N
H
0 -9.3
0 -10.6
-9.1
U...U
C...U
o H
A...C
C...C
H
o
Nf %
H n ~ N
|
i
0 H -6.5
H
-8.5
-9.5
-8.3
Figure 3. Optimal geometries of ten stacked nucleic acid base pairs. Their structure has been obtained using an ab initio-fitted potential. The interaction energy (kcal/mol) has been evaluated at the MP2/6-31G*(0.25) level. For more details see [12a].
A further step in quantum-chemical studies of base pairing and stacking should be inclusion of solvent effects [44,45]. In a polar solvent the stacked pairs are formed, i.e., the stability order is reversed. The base - base interactions within a nucleic acid helix further differ from the pure solvent. Base pairs embedded between two very polar GC base pairs experience a different electric field than in the case when they are surrounded by two rather nonpolar AT base pairs.
106 Table 3 Stacking energies of consecutive base pairs in standard B-DNA geometry (vertical separation of base pairs of 3.38 A, helical twist of 36°; all other parameters including the propeller twist were set to 0). The interaction energies were evaluated at the MP2/6-31 G*(0.25) level. Values in parentheses were obtained using the MP2-fitted empirical potential and slightly overestimate the stabilization energy which could be improved by a subtle rescaling of the van der Waals term. A E intra -intrastrand stacking, m E inter - interstrand stacking, AEM~- many-body correction obtained at the HF/6-31G*(0.25) level, AET - stacking energy, i.e., sum of the previous three terms. Data extracted from [34]. step A E intra A E inter AEMB AET AA -9.8 (-10.3) -2.2 (-3.3) -0.1 -12.0 GG -4.6 (-5.0) -6.9 (-7.8) +2.0 -9.5 TC -12.5 (-13.1) +0.4 (-0.3) +0.7 -11.4 CT -11.9 (-12.1) -0.3 (-1.3) +0.7 -11.5 AT - 10.3 (- 10.4) -0.4 (- 1.4) +0.1 - 10.6 GC -18.1 (-19.0) +4.0 (+3.7) +0.9 -13.2 GT -9.0 (-9.3) -3.3 (-4.1) +0.6 -11.8 TA -10.1 (-10.4) -1.1 (-2.1) 0.0 -11.2 CG -11.1 (-12.0) -2.7 (-3.2) +0.7 -13.1 CA -8.5 (-9.3) -4.4 (-5.4) +0.6 -11.9 Proper inclusion of the solvent into the calculations is unfortunately quite difficult [46]. One can use classical molecular dynamics or Monte Carlo simulations, classical continuum models based on the Poisson-Boltzman equation, and quantum-chemical studies using various variants of the Self Consistent Reaction Field (SCRF) approach at the semiempirical or ab initio level. There are serious approximations associated with these methods. Continuous models neglect the specific solute-solvent interactions which are very important for polar solvent. Classical methods neglect the changes in the electronic structure of the solute due to the solvent effects. These uncertainties can be illustrated using the predicted solvation energy of adenine treated by various m o d e m approaches. The calculated values vary from -8 to -20 kcal/mol
[68]. At least, a qualitative inclusion of solvent effects into calculations is vital since a picture based only on the gas phase data can sometimes be incomplete. For example, four-stranded intercalated i-DNA contains consecutive hemiprotonated cytosine base pairs [69], and the predicted gas phase stacking is very repulsive due to the charge-charge repulsion [32b]. A similar problem exists in protonated triplexes [70]. We and others have proposed that the charge-charge repulsion between (among) consecutive protonated bases may be efficiently eliminated by a temporary deprotonation of the outer (non H-
107 bonded) position of the amino group of protonated cytosine [32b, 70]. This would lead to a stable cytosine - imino cytosine tautomer base pair with the same base pairing pattern while stacking between consecutive neutral and protonated base pairs is attractive. Surprisingly, nanosecond-scale MD simulations of the i-DNA structure with eight closely spaced protonated base pairs show exceptionally stable trajectories very close to the starting crystallographic structure with no destabilization of the (repulsive) stacking arrangement [71]. We of course cannot rule out that the molecular dynamics structure will be destabilized after a longer simulation, but it seems to be a very stable structure. Interestingly, a related study on triplexes clearly shows fast destabilization of several consecutive protonated triplets of bases [72]. The difference between triplexes and i-DNA noticed by MD simulations agree with the experimental data since consecutive protonated triplets are known to be destabilizing while the stability of i-DNA originates in the protonated cytosine core region. This result is in fact not so surprising because a polar solvent can stabilize complexes of ionic species having the same charge [73]. Let us finally comment on one problem which should be considered by anyone attempting base stacking calculations in DNA. The stacking energy between base pairs is very sensitive to the vertical distance between the extended stacked systems [74]. In all crystal structures solved at high resolution, the vertical distance of the base pairs is optimized irrespective of the sequence and local conformational variations [74b]. On the other hand, many crystal structures reported at lower resolution (starting at ca. 2.0 A) frequently contained base pair steps with evidently unoptimized vertical distances between the consecutive base pairs accompanied with sometimes absurd values of calculated stacking energies [74b]. Geometries of these steps are influenced by data and refinement errors to such an extent that they are not suitable for stacking energy calculations. (It is likely that it is not due to the incorrectly determined average vertical distance of base pairs but rather due to some inaccuracy in angular parameters such as base pair roll). Analogously many computational methods do not provide a correct vertical separation of aromatic stacked clusters [74]. For example the old AMBER 3 force field tends to artificially "compress" correctly stacked structures of base pairs by ca. 0.2 A while the new AMBER 4.1 force field is already properly parameterized in this respect. It can lead to substantial deformations of the correct structures since the energy gradient due to the unoptimized vertical distance is exceptionally very large. Serious problems caused by the unoptimized vertical distance in extended stacked systems are still largely unrecognized, and the outcomes of many
108 studies were substantially influenced by them. When making stacking calculations using the experimental geometries, one should always consider possibility of some geometrical inaccuracies (a simple test is given in [74b]). Also, the force field should be verified, and the vertical distance of stacked base pairs should be allowed to relax when making an investigation of the potential energy surface. It is because the vertical distance is adjusted depending on the other parameters (twist, roll, propeller) rather than being an independent parameter [74,75]. 4.4. Interactions of amino groups of bases
One of the most interesting results predicted by the correlated ab initio calculations is the partial sp 3 pyramidalization of the amino groups of DNA bases. Since this issue has been extensively reviewed before [28e, 32c], let us just briefly summarize the main results (Table 4). Nonplanarity of the guanine amino group is substantially more pronounced than the nonplanarity of adenine and cytosine. The guanine amino group hydrogens are further bent asymmetrically due to a strong interaction with proximal polar H1 ring hydrogen atom. The current reference values of the amino group nonplanarity were obtained at the MP2/6-311G(2df, p) level [32c] and are summarized in Table IV. Studies on model compounds indicate that the MP2 and CCSD(T) methods provide essentially the same results [51 ]. The performance of the ab initio technique has been further verified by congruency between the calculated and measured spectra for aniline inversion motion. For this molecule, a nonharmonic ab initio vibrational analysis has been by explicitly considering five degrees of freedom [51 ]. Nonplanar amino groups are probably involved in various interactions in nucleic acids. Among them are interstrand contacts involving amino groups not restricted by the primary hydrogen bonds [33] (e.g., GA mismatch base pairs), such as observed in the d(CCAAGATTGG)2 crystal structure (see above). Activation of the amino groups of DNA bases is also believed to contribute to mutual interstrand contacts of the amino groups in B-DNA [28a,39,76]. These are surprisingly the most frequent contacts between the exocyclic groups of bases in the B-DNA crystals, and the average amino - amino distance is below the sum of their van der Waals radii [76a]. Their occurrence contradicts the usual view that amino - amino contacts are purely repulsive steric clashes. Formation of amino - amino contacts requires a perturbation of symmetry in base pair steps which can otherwise adopt a twofold symmetry arrangement (CpG, APT). This indeed seems to be supported by crystallography because the
109 Table 4 Nonplanar DNA bases: geometries and inversion barriers for pyramidalization. XCNH1 amino group hydrogen dihedral angle. X - C5 of cytosine and adenine and the N3 of guanine and isocytosine; XCNH2 amino group hydrogen dihedral angle. X = N3 of cytosine, and the N1 of adenine, guanine, and isocytosine. Sum of the HNC and HNH amino group hydrogen valence bond angles. This quantity equals 360 ° for a planar molecule and is less than 360 ° if the amino group is pyramidal. AE -the inversion barrier for pyramidalization, i.e., the energy difference between nonplanar and planar optimized molecules. (Data taken from ref. [32c]). nucleobase method XCNH(°) XCNH(°) ZHNX(°) AE(kcal/mol) HF/6-31G** -5.5 3.3 359.5 -0.00 cytosine MP2/6-31 G* -26.2 14.1 348.8 -0.38 MP2/6-311G(2df, p) -21.4 12.6 351.9 -0.15 HF/6-31G** -5.0 4.6 359.6 -0.00 adenine MP2/6-31G* -21.1 18.7 349.3 -0.34 MP2/6-311G(2df, p) -15.3 16.5 352.9 -0.13 HF/6-31G** -11.1 28.6 348.6 -0.34 guanine MP2/6-31G* -11.8 43.2 338.1 -1.63 MP2/6-311G(2df, p) -13.3 39.2 339.6 -1.12 HF/6-31G** -10.8 27.1 349.6 -0.27 isocytosine MP2/6-31G* -12.3 40.2 340.2 -1.22 MP2/6-311G(2df, p) -13.6 35.7 342.3 -0.78 amino - amino distances are larger in the CpG and ApT steps where the true twofold symmetry has been imposed by the crystal packing [28a]. An extensive search of the small-molecule crystal structure database reveals several clear examples of amino groups of aniline as acceptors while only few such cases were reported for nucleobases [39]. The relatively infrequent occurrence of amino - amino contacts in crystals of D N A constituents is not surprising. Amino groups are mostly involved in primary hydrogen bonds where the amino group serves as an H-donor. The amino group is expected to act as the acceptor when it is not fully involved in the primary H-bonds. This is more likely to occur in crystals of aniline and less likely in crystals of nucleobases which have a much higher capability to form primary H-bonds. There is no direct experimental verification of nonplanarity of isolated bases yet; however, indirect evidence supports the nonplanarity [28a,39,77].
4.5. Interactions of bases and base pairs with metal cations Recent studies on interactions between bases, base pairs, and metal cations were mainly aimed at clarifying the following items. i) H o w the metal cations influence the strength of the base pairing? ii) What is the influence of the first hydration shell of the cation on the
110 interactions? iii) What are the differences between various cations? iv) How reliable is the empirical potential treatment? The reviewed calculations were done assuming a coordination of the cation to the N7 position of purines. 4.5.1. Enhancement of base pairing The enhancement of base pairing by metal cation coordination is the difference between the energies which are necessary to separate two H-bonded bases in the presence and absence of the cation. Such effects have been noticed in older theoretical studies [24,25], and polarization enhancement of base pairing has been proposed as a means of stabilizing the Purine.Purine.Pyrimidine triple helices [78]. The enhancement of a base pair arise from two effects: a) the classical electrostatic interaction between the cation and remote base (the base which does not interact directly with the cation) and b) the "nonclassical" three-body term originating mainly in polarization effects.
Figure 4: GGC triad interacting with a hydrated metal cation. Table 5 summarizes the data for the unsolvated cations interacting with various base pairs. The three-body term is very significant in stabilizing the GG, GC and IC base pairs. Qualitatively different is the picture provided by AA, AT, and 2aminoAT base pairs. Here, the three-body term is basically negligible although there is some base-pair enhancement of the AT pairing due to the cation - thymine attraction. The cations attack the N6 amino group nitrogen and destroy the base pairing. Therefore, in order to preserve H-bonding, all calculations were done imposing Cs symmetry on the AA, AT, and 2-aminoAT base pairs. We tried to optimize the adenine-cytosine and guanine-thymine base
111 pairs with a metal cation. H o w e v e r there is no m i n i m u m for these structures, and the optimizations resulted in cross-link structures with a cation b r i d g i n g the two bases. Table 5 Interaction between cations and base pairs. M - cation, A - proximal purine, B - distant base, AE3 three body term, AET - total interaction energy. MP2/6-31G*//HF/6-31 G* level; values in parentheses represent HF/6-31 G* data. All energies in kcal/mol. For more details see original refs. [42b,c]. AEm AE~ AE~ AE3 AEv GC Mg 2÷ -198.7 (-209.7) - (-) -26.0 (-25.9) (-) -243.8 (-252.9) GC Ca > -133.9 (-143.5) -3.0 (-3.0) -25.8 (-25.5) -10.1 (-8.8) -172.7 (-180.8) GCBa 2+ -118.8(-120.6) -2.0 (-1.9) -25.6(-25.3) -9.6 (-8.3) -156.1 (-156.0) GC Zn 2+ -237.2 (-234.9) (-4.6) -26.1 (-26.0) - (-15.0) -285.4 (-280.6) GC Cd 2+ -192.6 (-190.5) (-4.3) -26.0 (-25.7) - (-12.1) -237.2 (-232.6) GC Hg 2+ -208.0 (-196.7) (-4.3) -25.9 (-25.7) - (-12.9) -253.9 (-239.6) A T M g 2+ -107.9(-111.1) - (-11.0) -10.8 (-8.4) (-2.1) -131.5 (-133.1) AT Ca 2+ -61.6 (-63.8) -10.0 (-11.0) -11.1 (-8.7) -1.5 (-1.5) -84.2 (-85.0) AT Sr 2÷ -48.9 (-51.0) -9.7(-10.7) -11.2 (-8.8) -1.2 (-1.3) -71.0 (-71.7) AT Ba > -51.4 (-49.1) -9.5(-10.4) -11.2 (-8.8) -1.4 (-1.4) -73.5 (-69.7) ATZn > -152.9 (-144.3) - (-11.3) -10.8 (-8.4) - (-2.4) -176.7 (-166.4) AT Cd 2+ -116.6 (-109.9) - (-11.2) -10.9 (-8.5) (-2.2) -129.3 (-123.2) AT Hg 2÷ -141.1 (-126.4) - (-11.0) -10.9 (-8.5) (-2.5) -164.7 (-148.2) IC Ca > -121.1 (-131.4) -8.8 (-9.5) -15.0 (-15.3) -11.9 (-11.0) -167.2 (-156.7) 2aATCa 2+ -72.5 (-73.1) -9.7 (-10.6) -14.2 (-11.4) -2.1 (-2.0) -98.5 (-97.1) GG Ca 2÷ -136.9 (-146.3) -12.8 (-12.4) -18.5 (-17.1) -10.5 (-8.9) -178.7 (-184.7) AACa > -67.8 (-64.8) -3.2 (-2.8) -10.3 (-6.2) -1.4 (-1.4) -82.7 (-75.2) The t h r e e - b o d y term is reduced by the hydration o f the cation (Table 6). H o w e v e r , it still remains significant for the G G and GC structures. The e n e r g y contributions in this study were evaluated for the following trimer" the h y d r a t e d c a t i o n - p r o x i m a l base - remote base. That is, the cation and its h y d r a t i o n shell have been considered as one s u b s y s t e m in the interaction e n e r g y calculations. The e n h a n c e m e n t o f base pairing is especially large for the G G base pair (both terms are large) w h i c h gives strong support for the p r o p o s e d m e t a l - c a t i o n assisted stabilization o f the G G C triplexes. The t h r e e - b o d y term has v a n i s h e d for the A A base pair, and the same holds for the A T base pair. The cation can still attack the adenine amino group t h r o u g h a polarized water in its h y d r a t i o n shell a l t h o u g h the effect is w e a k e r than for a bare cation. A g a i n Cs s y m m e t r y had to be i m p o s e d for adenine-containing pairs. E v e n u n d e r this constraint one of the polarized water molecules actedas a w e a k H - d o n o r to the N 6 adenine a m i n o group nitrogen atom with the H...N distances around 2.4 A. The cation is
112 Table 6 Interaction energies in the hydrated cation (M) - proximal base (A) - remote base (B) complexes evaluated with inclusion of the electron correlation (MP2/6-31G*//HF/6-31G* level). E AB - pairwise interaction energy between the proximal base and the solvated cation, E~ - pairwise interaction energy between remote base and the solvated cation, AEAB - the pairwise base pair interaction energy, AE(3) - the three-body term, AET - the total interaction energy, i.e., the sum of the previous contributions. All energies are in kcal/mol; deformation energies of monomers were not included (for original papers see [ 13c,d]). AEMA AEMB AEga AE3 AET GC Mg 2+ -89.3 -1.5 -26.4 -8.1 -125.4 GC Ca 2+ -82.6 -1.7 -26.3 -5.2 -115.8 GC Sr2+ -76.0 -2.1 -25.8 -4.4 -108.5 GC Ba 2+ -71.2 -7.7 -23.2 -2.1 -104.1 GC Zn 2+ -93.8 -1.5 -26.4 -8.7 -130.4 GC Cd 2+ -87.9 -1.1 -26.3 -8.0 -123.3 GC Hg 2+ -94.3 -1.3 -26.2 -8.7 -130.5 GC MgOH + -58.0 +0.4 -26.1 -4.2 -88.0 GG Mg 2+ -89.1 -9.5 - 19.9 - 10.4 - 129.8 GG Zn 2+ -94.5 -9.5 - 19.8 - 10.9 - 134.7 AA Mg 2+ -45.9 -2.7 -10.5 0.0 -59.2 AA Zn z+ -53.7 -2.6 - 10.6 0.0 -66.9 shifted from the adenine plane. The other water molecule close to the amino group is oriented away from the amino group. Especially for the large alkaline earth cations, there is literally not enough space near the N7 position o f adenine unless the amino group is pyramidal (or perhaps deprotonated).
4.5.2. Differences among cations The cation-purine interaction is primarily determined by the charge-molecular dipole interaction so that the attraction decreases for a given cation in the series guanine, inosine, adenine, and 2-amino adenine. Besides that, Table 5 shows the m u c h larger cation-proximal base attraction for the zinc group (IIb) elements c o m p a r e d to the m a g n e s i u m group (IIa) metals. This difference is due to the filled d-orbitals o f the IIb group and is essential in understanding the difference between these two groups. Table 6 shows that the hydrated cation - proximal base interaction energies are quite uniform compared to the data for bare cations. However, the difference between the IIa and IIb groups reappears upon decomposition o f the interaction between the base and hydrated cation to the individual contributions [ 13c]. The (base-cation-water shell) complexes can be viewed from two different directions. As a hydration o f a metalated base or as an interaction between the
113 base and a hydrated cation. The complex with Zn 2+ is clearly shifted more towards the first description compared to Mg 2+. The Zn 2+ cation is bound more tightly to the nucleobase, and the water molecules around the Z n 2+ a r e more flexible. On the other hand, Mg 2+ can be more easily separated from the N7 position of guanine. This is in agreement with the Cambridge database search showing a much more frequent interaction of Z n 2+ with nitrogen (versus oxygen) compared to Mg 2+ [79] and complements the theoretical calculations considering some other ligands [13a,b].
5. CONCLUDING REMARKS Ab initio evaluations of the structures and energetics of H-bonded and stacked nucleic acids base pairs carried out since 1994 have provided for the first time a reliable picture of these interactions. This could not be achieved by any other experimental or theoretical technique. These calculations are important in understanding the role of molecular interactions of DNA bases in nucleic acids and allow for a parameterization and verification of the empirical force fields. Intense research of the various aspects of interactions between metal cations and bases and base pairs is under way. Ab inito studies of interactions of nucleic acid bases represent one of the most successful applications of quantum chemistry to biological problems.
Acknowledgements: This study was supported by a grant 203/97/0029 from the GA CR, by the National Science Foundation (EHR9108767) and the National Institute of Health (Grant. No. GM08047), and by by ONR grant # N00014-95-1-0049.
REFERENCES 1. (a) T. Darden, D. York, and L.G. Pedersen, J. Chem. Phys., 98 (1993) 10089. (b) D.M. York, W. Yang, H. Lee, T. Darden, and L.G. Pedersen, J. Am. Chem. Sot., 117 (1995) 5001. 2. (a) H. Lee, T. Darden, and L. Pedersen, Chem. Phys. Let., 243 (1995) 229. (b) S. Weerasinghe, P.E. Smith, and B.M. Pettitt, Biochemistry, 34 (1995) 16269. (c) T.A. Cheatham, J.L. Miller, T. Fox, and T.A. Darden, P.A. Kollman, J. Am. Chem. Sot., 117 (1995) 4193.
114 3. P. Auffinger and E. Westhof, Curr. Opinion. Struct. Biol., 8 (1998) 227 and references therein. 4. G.D. Strahan, M.A Keniry, and R.H. Shafer, Biophys. J., 75 (1998) 968. 5. M. Feig and B.M. Pettitt, Biophys. J., 75 (1998) 134. 6 (a) Y. Ding, D.N. Bemardo, K. Krogh-Jespersen, and R.M. Levy, J. Phys. Chem., 99 (1995) 11575. (b) P. Ahlstrom, A. Wallquist, S. Engstrom, and B. Jonsson, Mol. Phys., 68 (1989) 563. (c) D.N. Bemardo, Y.B. Ding, K. Krogh-Jespersen, and R.M. Levy, J. Phys. Chem., 98 (1994) 4180. (d) P.-O. Aastrand, A. Wallqist, and G. Karlstrom, J. Chem. Phys., 100 (1994) 1262, (e) T.A. Halgren, Curr. Opinion Struct. Biol., 5 (1995) 205 and references therein. 7. (a) I.K. Yanson, A.B. Teplitsky, and L.F. Sukhodub, Biopolymers, 18 (1979) 1149. (b) C. Desfrancois, H. Abdoul-Carime, C.P. Schulz, and J.P. Schermann, Science, 269 (1995) 1707. (c) P. D. Schnier, J.S. Klassen, E.F. Stritmatter, and E.R. Williams, J. Am. Chem. Soc., 120 (1998) 9605. 8. M. Kratochvil, O. Engkvist, J. Sponer, P. Jungwirth, and Pavel Hobza, J. Phys. Chem. A, 102 (1998) 6921. 9. V. Spirko, J. Sponer, P. Hobza J. Chem. Phys., 106 (1997) 1472. 10. (a) P. Hobza, H.L. Selzle, and E.W. Schlag, Chem. Rev., 94 (1994) 1767. (b) S. Sun and B.R. Bemstein, J. Phys. Chem., 100 (1996) 13348. 11. (a) W.D. Comell, P. Cieplak, C.I. Bayly, I.R. Gould, K.M. Merz, Jr., D.M. Ferguson, D.C. Spellmeyer, T. Fox, J.W. Caldwell, and P.A. Kollman, J. Am. Chem. Soc., 5179 (1995) 117. (b) A.D. MacKerell, Jr., J. Wiorkiewicz-Kuczera, and M. Karplus, J. Am. Chem. Soc., 117 (1995) 11946. (c) J.R. Maple, M.-J. Hwang, T.P. Stockfisch, U. Dinur, M. Waldman, C.S. Ewig, and A.T. Hagler, J. Am. Chem. Soc., 115 (1994) 162. 12. (a) J. Sponer, J. Leszczynski, and P. Hobza, J. Phys. Chem., 100 (1996) 5590. (b) J. Sponer, J. Leszczynski, P. Hobza, J. Comput. Chem., 12 (1996) 841. (c) M. Beachy, D. Chasman, R. Murphy, T. Halgren, and R. Friesner, J. Am. Chem. Soc., 119 (1997) 5908. (d) P. Hobza, F. Hubalek, M. Kabekic, P. Mejzlik, J. Sponer, and J. Vondr/t~ek, Chem. Phys. Lett., 257 (1996) 31. (e) P. Hobza, M. Kabelgc, P. Mejzlik, J. Sponer, and J. Vondrfigek, J. Comput. Chem., 18 (1997) 1136. (f) S.R. Gadre and S.S. Pundlik, J. Phys. Chem. B, 101 (1997) 3298. (g) K. Brameld, S. Dasgupta, and W.A. Goddard III, J. Phys. Chem. B, 101 (1997) 4851. 13. (a) D.R. Garmer and N. Gresh, J. Am. Chem. Soc., 116 (1994) 3556 (b) D.R. Garmer and N. Gresh, and B.-P. Rogues, Proteins" Struct. Funct. Genet., 31 (1998) 42. (c) J. Sponer, J.V. Burda, M. Sabat, J. Leszczynski, and P. Hobza, J. Phys. Chem. A, 102 (1998) 5951. (d) J. Sponer, J.V. Burda, M. Sabat, J. Leszczynski, and P. Hobza, J. Biomol. Struct. Dyn., 16 (1998) 139. (e) J. Sponer, J.V. Burda, P. Mejzlik, J. Leszczynski and P. Hobza, J. Biomol. Struct. Dyn. 14 (1997) 613. 14. (a) P. Carloni and M. Andreoni, J. Phys. Chem. 100 (1996) 17797. (b) J. Hutter, P. Carloni, and M. Parrinello, J. Am. Chem. Soc., 118 (1996) 8710 (c) D. Sanchez-Portal, P. Ordejon, E. Artacho, and J.M. Soler, Int. J. Quantum Chem., 65 (1997) 453 and references therein. 15. (a) S. Kristian and P. Pulay, Chem. Phys. Lett., 229 (1994) 175. (b) P. Hobza, J. Sponer, and T. Reschel, J. Comput. Chem., 17 (1995) 1315. (c) T.A. Weselowski, O. Parisel, Y.
115 Ellinger, and J. Weber, J. Phys. Chem. A, 101 (1997) 7818. (d) T.A. Weselowski, Y. Ellinger, and J. Weber, J. Chem. Phys., 108 (1998) 6078. 16. J. Langlet, P. Claverie, F. Caron, and J.C. Boevue, Int. J. Quantum Chem., 19 (1981) 299. 17. R. Ornstein and J.R. Fresco, Biopolymers, 22 (1983) 1979. 18. W. Forner, P. Otto, and J. Ladik, Chem. Phys., 86 (1984) 49. 19. P. Otto, Int. J. Quantum Chem., 30 (1986) 275. 20. J.E. Del Bene, J. Mol. Struct (THEOCHEM), 124 (1985) 201. 21. P. Hobza, C. Sandorfy, J. Am. Chem. Sot., 109 (1987) 1302. 22. V.I. Poltev and N.V. Shulyupina, J. Biomol. Struct. Dyn., 3 (1984) 739. 23. (a) M. Aida and C. Nagata, Int. J. Quatum. Chem., 29 (1986) 1253. (b) M. Aida, J. Theor. Biol. 130 (1988) 327. (c) M. Aida, J. Comput. Chem., 9 (1988) 362. 24. (a) K.P. Sagarik and B.M. Rode, Inorg. Chim. Acta, 78 (1983) 177. (b) E.H.S. Anwander, M.M. Probst, and B.M. Rode, Biopolymers 29 (1990) 757. 25. H. Basch, M. Krauss, and W.J. Stewens, J. Am. Chem. Sot., 107 (1985) 7267. 26. (a) J. Kozelka, Met. Ions. Biol. Systems, 33 (1996) 1. (b) J. Kozelka, R. Sanelli, G. Berthier, J.P. Flament, and R. Lavery, J. Comput. Chem., 13 (1992) 45. 27. A.-O. Colson, B. Besler, and M.D. Sevilla, J. Phys. Chem., 97 (1993) 13852 and references therein. (b) A.-O. Colson and M.D. Sevilla, J. Phys. Chem., 100 (1996) 4420. 28. (a) J. Sponer and P. Hobza, J. Am. Chem. Sot., 116 (1994) 709. (b) J. Sponer and P. Hobza, J. Phys. Chem., 98 (1994) 3161. (c) J. Sponer and P. Hobza, J. Mol. Struct. (THEOCHEM), 304 (1994) 35. (d) J. Sponer and P. Hobza, J. Biomol. Struct. Dyn., 12 (1994) 671. (e) J. ~;poner and P. Hobza, Int. J. Quantum Chem., 57 (1995) 959. (f) J. Leszczynski, Int. J. Quantum Chem.; Quantum Biol. Symp., 19 (1992) 43. 29. (a) I.R. Gould and P.A. Kollman, J. Am. Chem. Soc., 116 (1994) 2493. (b) J. Florian and J. Leszczynski, J. Biomol. Struct. Dyn., 12 (1995) 1055. 30. J. Florian and J. Leszczynski, J. Am. Chem. Sot., 118 (1996) 3010 and references therein. 31. P. Hobza, J. Sponer, and M. Pol~t~ek, J. Am. Chem. Sot., 117 (1995) 792. 32. (a) J. Sponer, J. Leszczynski, and P. Hobza, J. Phys. Chem., 100 (1996) 1965. (b) J. Sponer, J. Leszczynski, V. Vetterl, and P. Hobza, J. Biomol. Struct. Dyn. 13 (1996) 695. (c) J. Sponer, J. Leszczynski, and P. Hobza, J. Biomol. Struct. Dyn., 14 (1996) 117. (d) J. Gu and J. Leszczynski, J. Phys. Chem. (in press). 33. J. Sponer, J. Flori~in, J. Leszczynski, and P. Hobza, J. Biomol. Struct. Dyn., 13 (1996) 827. 34. J. Sponer, H.A. Gabb, J. Leszczynski, and P. Hobza, Biophys. J., 73 (1997) 76. 35. (a) P. Hobza and J. ~;poner, J. Mol. Struct. (THEOCHEM), 388 (1996) 115. (b) J. Sponer and P. Hobza, Chem. Phys. Lett., 267 (1997) 263. 36. P. Hobza, H.L. Selzle, and E.W. Schlag, J. Phys. Chem., 100 (1996) 18970. 37. P. Hobza and J. Sponer, Chem. Phys. Lett. 288 (1998) 7. 38. J. Bertran, A. Oliva, L. Rodriquez-Santiago, and M. Sodupe, J. Am. Chem. Soc., 120 (1998) 8159. 39. B. Luisi, M. Orozco, J. Sponer, F.J. Luque, and Z. Shakked, J. Mol. Biol., 279 (1998) 1123. 40. J. Rak, A.A. Voityuk, and N. Rosch, J. Phys. Chem. A, 102 (1998) 7168.
116 41. (a) G.M. Stewart, E.R.T. Tiekink, and M.A. Buntine, J. Phys. Chem. A, 101 (1997) 5368. (b) I. Zilberger, V.I. Avdeev, and G.M. Zhidomirov, J. Mol. Struct. (THEOCHEM), 418 (1997) 73. 42. (a) J.V. Burda, J. Sponer, and P. Hobza, J. Phys. Chem., 100 (1996) 7250. (b) J.V. Burda, J. Sponer, J. Leszczynski, and P. Hobza, J. Phys. Chem. B, 101 (1997) 9670. 43. N. Zhanpeisov and J. Leszczynski J. Phys. Chem. A, 102 (1998) 6167. (b) N. Zhanpeisov and J. Leszczynski, Int. J. Quantum Chem., 69 (1998) 37. (c) N. Zhanpeisov and J. Leszczynski, J.Phys.Chem. B, 102 (1998) 9109. (d) N. Zhanpeisov, J. Sponer, and J. Leszezynski J. Phys. Chem. A, 102 (1998) 10374. (e) L. Gorb and J. Leszczynski, J. Am. Chem. Sot., 120 (1998) 5024. (f) C. Alhambra, F.J. Luque, J. Esterlich, and M. Orozco, J. Org. Chem., 60 (1995) 966.(g) M. Orozco, B. Hernandez, and F.J. Luque, J. Phys. Chem. B, 102 (1998) 5228. 44. V. Subramanian, D. Sinavesan, and T. Ramasami, Chem. Phys. Lett., 289 (1998) 189. 45. J. Florian, J. Sponer, and A. Warshel, J. Phys. Chem. B 103 (1999) 0000. 46. F.J. Luque, J.-M. Lopez-Bes, J. Cemeli, M. Aroztequi, and M. Orozco, Theor. Chem. Ace., 96 (1997) 105 and referenced therein. 47. (a) D.M. Gray, Biopolymers, 42 (1997) 783 and references therein. (b) J. SantaLucia Jr., Proc. Natl. Acad. Sci. USA, 95 (1998) 1460. 48. J. Sartorius and H.-J. Schneider, J. Chem. Soc. Park. Tr. 2, (1997) 2319 and references therein. 49. (a) P. Pulay, S. Saebo, and W. Meyer, J. Chem. Phys., 81 (1984) 1901. (b) M.D. Beachy, D. Chasman, R. A. Friesner, and R.B. Murphy, J. Comput. Chem., 19 (1998) 1030. 50. J. Cizek, Adv. Chem. Phys., 14 (1969) 35. 51. O. Bludsky, J. Sponer, J. Leszczynski, V. Spirko, and P. Hobza, J. Chem. Phys., 105 (1996) 11042. 52. (a) D.A. Estrin, L. Paglieri, and G. Corongiu, J. Phys. Chem. 98, 1994, 5683. (b) G. Bakalarski, P. Groehowski, J.S. Kwiatkowski, B. Lesyng, and J. Leszczynski, Chem. Phys., 204 (1996) 301. 53. (a) P. Hobza and J. Sponer, Chem. Phys. Lett., 261 (1996) 379. (b) S. Scheiner, Hydrogen Bonding. A Theoretical Perspective. Oxford University Press, New York (1997). 54. (a) S. Simon, M. Durand, and J.J. Dannnenberg, J. Chem. Phys., 105 (1996) 11024. (b) P. Hobza and Z. Havlas, Collect. Czech. Chem. Commun., 63 (1998) 1343. (c) P. Hobza and Z. Havlas, Theoret. Chim. Acta., in press. (d) A. Famulari, M. Raimondi, M. Sironi, and E. Gianetti, Chem. Phys., 232 (1998) 275. (e) A. Famulari, R. Specchio, M. Sironi, and M. Raimondi, J. Chem. Phys., 108 (1998) 3296. 55. L.M.J. Kroon-Batenburg and F.B. van Duijneveldt, J. Mol. Struct., 121 (1985) 185. 56. (a) J.J. Novoa and C. Sosa, J. Phys. Chem., 99 (1995) 15837. (b) J. E. del Bene and I. Shavitt, In: Molecular Interactions, S. Scheiner (ed.), John Wiley and Sons (1997), p. 157. 57. F.B. van Duijneveldt, In Molecular Interactions, S. Scheiner (ed.), John Wiley & Sons, 1997, p. 81 (b) G. Chalasinski and M.M. Szczesniak, Chem. Rev., 94 (1994) 1723. (c) M. Urban and P. Hobza, Theoret. Chim. Acta, 36 (1975) 215. (d) M. Gutowski and G. Chalasinski, J. Chem. Phys., 98 (1993) 5540. (e) S.M. Cybulski and G. Chalasinski, Chem. Phys. Lett., 197 (1992) 591. (f) F.B. van Duineveldt, J.G.C.M. van Duineveldt-van de Rijdt, and J. H. van Lenthe, Chem. Rev., 94 (1994) 1873.
117 58. (a) H.B. Jansen and P. Ross, Chem. Phys. Lett, 3 (1969) 140. (b) S.F. Boys and F. Bernardi, Mol. Phys., 19 (1970) 553. 59. (a) J. Sponer, J. Leszczynski, and P. Hobza J. Phys. Chem. A, 101 (1997) 9489. (b) M. Meyer and J. Suhnel, J. Biomol. Struct. Dyn. 15 (1997) 619. (c) P. Cysewski, J. Chem. Soc. Farad. Trans., 94 (1998) 3117. 60. S.-H. Chou, L. Zhu, and B.R. Reid, J. Mol. Biol., 267 (1997) 1055 and references therein. 61. G.G. Prive, U. Heinemann, S. Chandrasegaran, L.-S. Kan, M.L. Kopka, and R. E. Dickerson, Science, 38 (1987) 498. 62. O. V. Shishkin, J. Chem. Soc. Chem. Commun., (1995) 1539. 63. S.S. Pundlik and S.R. Gadre, J. Phys. Chem. B, 101 (1997) 9657. 64. C.E. Bugg, J.M. Thomas, M. Sundaralingam, and S.T. Rao, Biopolymers, 10 (1971) 175. 65. J. Sponer, J. Jursa, and J. Kypr, Nucleosid. Nucleotid., 13 (1994) 671. 66. C.A. Hunter, J. Mol. Biol., 230 (1993) 1025. 67. C. Alhambra, F.J Luque, F. Gago, and M. Orozco, J. Phys. Chem. B, 101 (1997) 3846. 68. J.L. Miller and P.A. Kollman, J. Phys. Chem., 100 (1996) 8587. 69. K. Gehring, J.-L. Leroy, and M. Gueron, Nature, 363 (1993) 561. (b) L. Chen, L. Cai, X. Zhang, and A. Rich, Biochemistry, 33 (1994) 13540. (c) I. Berger, M. Egli, and A. Rich, Proc. Natl. Acad. Sci. USA, 93 (1996) 12116. 70. C. Colominas, F.J. Luque, and M. Orozco, J. Am. Chem. Sot., 118 (1996) 6811. 71. N. Spackovfi, I. Berger, M. Egli, and J. Sponer, J. Am. Chem. Sot., 120 (1998) 6147. 72. R. Soliva, C. Laughton, F.J. Luque, and M. Orozco, J. Am. Chem. Soc., 120 (1998) 11226. 73. K. T. No, K.-Y. Nam, and H. A. Scheraga, J. Am. Chem. Soc., 119 (1997) 12917. 74. (a) J. Sponer and J. Kypr, In: Theoretical Biochemistry and Molecular Biophysics, D.L. Beveridge and R. Lavery (eds.), Adenine Press, NY, (1991) p. 271. (b) J. Sponer and J. Kypr, J. Biomol. Struct. Dyn., 11 (1993) 277. 75. C.A. Hunter and X.-J. Lu, J. Mol. Biol., 265 (1997) 603. 76. (a) J. Sponer and J. Kypr, Int. J. Biol. Macromol., 16 (1994) 3. (b) M. Shatzky-Schwartz, N. Arbuckle, N.D. Eisenstein, D. Rabinovich, A. Bareket-Samish, T.E. Haran, B.F. Luisi, and Z. Shakked, J. Mol. Biol., 267 (1997) 595. 77. H. C. Kung, K.Y. Wang, S. A. Parker, I. Goljer, and P.H. Bolton, Magnet. Res. Chem., 34 (1996) $47. 78. V.N. Potaman and V.N. Soyfer, J. Biomol. Struct. Dyn., 11 (1994) 1035. 79. C.V. Bock, K.A. Katz, and J.P. Glusker, J. Am. Chem. Sot., 117 (1995) 3754.