Chemical Physics Letters 442 (2007) 42–46 www.elsevier.com/locate/cplett
Insufficient description of dispersion in B3LYP and large basis set superposition errors in MP2 calculations can hide peptide conformers Leo F. Holroyd a, Tanja van Mourik b
a,b,*
a Chemistry Department, University College London, 20 Gordon Street, London WC1H 0AJ, UK School of Chemistry, University of St. Andrews, North Haugh, St. Andrews, Fife KY16 9ST, Scotland, UK
Received 9 March 2007; in final form 16 May 2007 Available online 24 May 2007
Abstract B3LYP/6-31+G(d) and MP2/6-31+G(d) calculations predict markedly different structures for one Tyr–Gly conformer. Calculation of the energy profile for rotation around the glycine Ca–N bond reveals one minimum in the B3LYP profile (/gly = 180°) and two in the MP2 profile (75° and 280°). Large intramolecular BSSE values are responsible for masking the 180°-minimum in the MP2 profile: approximate elimination of BSSE in the MP2 calculations – by (1) correction using BSSE values from complexes of phenol and N-formylglycine, (2) the application of local MP2, or (3) employing large basis sets (aug-cc-pVTZ/QZ) and density fitting – yields an unambiguous triple-well potential. Ó 2007 Elsevier B.V. All rights reserved.
1. Introduction The use of electronic structure methods to model simple organic or bioorganic molecules is routine in computational chemistry. Levels of theory ranging from basic Hartree– Fock (HF) theory, through nth order Møller–Plesset (MPn) and density functional theory (DFT), to high-level methods including coupled cluster theory, are widely used and are expected to give accurate results in accordance with the level of sophistication. One of the major advantages of DFT is its computational efficiency as compared to Møller– Plesset methods. Nonetheless, it is often assumed that DFT and MP2 will give similarly accurate results. Comparative studies sometimes bear out this conclusion: for example Ya´n˜ez et al. [1] compare MP2 with a range of density functionals, including B3LYP, for the description of hydrogenbonded clusters of water and hydrogen peroxide, and find * Corresponding author. Current address: School of Chemistry, University of St. Andrews, North Haugh, St. Andrews, Fife KY16 9ST, Scotland, UK. Fax: +44 1334 463808. E-mail address:
[email protected] (T. van Mourik).
0009-2614/$ - see front matter Ó 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.cplett.2007.05.072
good agreement. Similarly, Tuma et al. [2] compare density functionals with both MP2 and coupled cluster theory in the description of hydrogen-bonded complexes, and find the accuracy of hybrid functionals (e.g. B3LYP) to rival that of MP2 while their own non-hybrid functional HCTH38 [3] surpasses it. However, the complexes considered in these studies did not feature strong dispersion (London) forces. This is important because DFT is known to be poor at handling dispersion [2,4–6]. This is because the nonexact functionals used in most DFT methods treat only the local electron density at any given point, hence they cannot model the interaction between transient multipoles arising from fluctuations in the charge density, the source of dispersion energy [4]. Traditional DFT methods, including B3LYP, are therefore not necessarily reliable for systems dominated by dispersion, a fact already recognised in the early 1990s [7,8]. The development of new variants of DFT to overcome this problem is ongoing, based either on adjustments to the functional or on the inclusion of additive dispersive terms [9–16]. The research described here was prompted by the observation that MP2 and B3LYP, when used with the same
L.F. Holroyd, T. van Mourik / Chemical Physics Letters 442 (2007) 42–46
moderately-sized basis set, gave markedly different energy landscapes for the dipeptide tyrosine–glycine (Tyr–Gly) [17]. B3LYP predicted extended conformations to be more stable, whereas MP2 favoured folded structures. The same discrepant conformational preference of these two methods (where B3LYP favours conformers in which the aromatic ring is moved away from the rest of the molecule, whereas MP2 prefers folded conformations) was observed for a number of other molecules containing an aromatic ring in addition to a flexible side chain (Trp–Gly [18–20], Trp–Gly–Gly [19,20] and pindolol [21]). In the case of Tyr–Gly [17], B3LYP geometry optimisation gave an extended structure as the global minimum. However, single-point MP2 calculations significantly changed the order of stability of the conformers, yielding a folded (‘book’) global minimum. This was even more pronounced after full MP2 geometry optimisation, which significantly altered the structure of the book-type conformers, increasing their degree of foldedness. It was suspected that this disagreement was partly due to dispersive interactions involving the phenyl ring, which MP2 treats far better than B3LYP. Further investigation of the sixth most-stable conformer according to the MP2 single-point calculations, which exhibited the largest structural change from B3LYP to MP2 optimisation, suggested that the very different structures obtained by B3LYP and MP2 are likely caused by dispersion (a true physical effect, underestimated by B3LYP) as well as large basis set superposition errors (BSSE; an artificial attraction) in the MP2 calculations [22]. Due to the deficiencies in both methods, the structure of this Tyr–Gly conformer could not be obtained unambiguously. In the current Letter, we employ approaches designed to overcome the problems inherent to the application of the MP2 method: density fitting MP2 (df-MP2) [23] allows the use of large basis sets (thereby reducing BSSE); local MP2 (LMP2) [24–26,23] produces much reduced BSSE even when employing more limited basis sets; and spin-component-scaled MP2 (SCS-MP2) [27] corrects for the overestimation of dispersion in the MP2 calculations. We focus on the fourth most-stable conformer on the MP2 single-point energy surface, labelled ‘book4’ hereafter, as the difference between the B3LYP- and MP2-optimised geometries of this conformer is simply a rotation around one torsion angle. We show that the df-MP2 and df-LMP2 methodologies are promising tools to obtain reliable structures of molecules where the dispersion contribution is important. 2. Methodology 2.1. Rotational energy profiles The B3LYP/6-31+G(d) and MP2/6-31+G(d) optimised geometries of book4 (Fig. 1) were taken from Ref. [17]. The two structures mainly differ in the orientation of the C-terminus, as characterised by the Ramachandran angle /gly (the Ccarb(Tyr)–N(Gly)–Ca(Gly)–Ccarb(Gly) dihedral angle
43
Fig. 1. The B3LYP/6-31+G(d) and MP2/6-31+G(d) optimised geometries of the Tyr–Gly conformer book4.
– see Fig. 1), which is 180° in the B3LYP minimum and 74° in the MP2 minimum. We computed relaxed potential energy profiles for rotation around the glycine Ca–N bond by geometry optimisation at fixed values of /gly, using step sizes of 10–20°. The profiles were computed at the HF/631G(d), B3LYP/6-31+G(d) and MP2/6-31+G(d) levels of theory. In addition, single-point energy calculations were performed at the MP2-optimised structures with the df-MP2 (density fitting MP2 [23]), df-SCS-MP2 (density fitting spin-component-scaled MP2 [27]) and df-LMP2 (density fitting local MP2 [23–26]) methods. These calculations employed the aug-cc-pVnZ (n = D, T, Q) basis sets [28,29] and the corresponding aug-cc-pVnZ/MP2fit fitting basis sets [30] for both the df-HF and df-MP2 parts of the calculation. SCS-MP2 is a modification to MP2 that scales the energy contributions from parallel and antiparallel electron pairs by separate factors [27]. This modification corrects for the MP2 overestimation of the dispersion energy at no extra cost. The default scaling factors (5/6 for antiparallel spins and 1/3 for parallel spins) were employed. In the df-LMP2 calculations, the contributions of the two most diffuse functions of each angular momentum type were ignored in the localisation to yield better-localised orbitals. A completion criterion of 0.985 (aug-cc-pVDZ) and 0.99 (aug-cc-pVTZ) was employed for the orbital domain selection. These options were found to be necessary to obtain smooth rotational energy profiles. The B3LYP and MP2 calculations were performed using the program packages GAUSSIAN [31] and NWCHEM [32] whereas the df-(SCS)-(L)MP2 calculations were done with MOLPRO [33].
44
L.F. Holroyd, T. van Mourik / Chemical Physics Letters 442 (2007) 42–46
2.2. Counterpoise corrections The intramolecular BSSE in book4 was estimated using the same procedure as employed in a previous Letter [22]. Here, the Tyr–Gly structures were modelled by intermolecular complexes of phenol and N-formylglycine, with the conformations and spatial arrangements identical to the partially-optimised structures with fixed /gly angles generated as described above. The Cb(Tyr)H2 and NH2Ca(Tyr)H groups were replaced by hydrogen atoms. The positions of these two hydrogens were optimised for the isolated fragments (phenol and N-formylglycine), at the MP2/631+G(d) level using NWCHEM [32]. The counterpoise [34] (CP)-corrected phenol-[N-formylglycine] interaction energies were computed using Gaussian’s ‘counterpoise’ option. 3. Results
Fig. 3. B3LYP/6-31+G(d), MP2/6-31+G(d), and CP-corrected MP2/631+G(d) potential energy profiles for rotation around the glycine Ca–N bond.
3.1. HF, B3LYP and MP2 rotational energy profiles Fig. 2 shows the HF/6-31G(d) rotational energy profile, whereas the B3LYP/6-31+G(d) and MP2/6-31+G(d) profiles are displayed in Fig. 3. It is instructive to examine first the HF profile. It is a triple-well potential, with energy minima at /gly values of 81°, 180° and 279°, the deepest of which is the minimum at 180° (corresponding to the extended book4 structure also predicted by B3LYP). The minimum at 81° matches the conformation predicted by MP2. The maxima in the energy profile correspond to transition states. The highest of these (DE = 33 kJ/mol) occurs at /gly = 0°/360°. In this structure, the Ccarb(Tyr) and Ccarb(Gly) atoms are in an unfavourable eclipsed arrangement. Likewise, there are two additional transition states at /gly = 111° and 241°, where one of the Ca hydrogens is eclipsed with the Ccarb(Tyr) atom. Thus, the shape of the HF profile originates from three unfavourable eclipses of the Ccarb(Tyr) atom (with Ccarb(Gly) or either of the Ca(Gly) hydrogens), determining the location of the transition states; the minima are located between these.
Fig. 2. HF/6-31G(d) potential energy profile for rotation around the glycine Ca–N bond.
The transition state at 0/360° (DE = 30 kJ/mol) is also very prominent in the B3LYP potential energy curve (Fig. 3). However, in contrast to the HF profile, the B3LYP profile exhibits only one well, at 180°, although inflection points can be seen around 120° and 250°. These inflection points are reminiscent of the minimum/maximum combinations at /gly = 80°/110° and 240°/280° in the HF profile. The MP2 profile features minima at /gly = 74° and 290° (roughly where the HF profile exhibits its two shallower minima), and shows inflection points at 120° (one of the HF transition states) and at 180° (where HF and B3LYP have their deepest minimum). A maximum can be observed around 240° (the second HF transition state). Thus, neither B3LYP nor MP2 predicts the energy minimum located by the other, though there is a hint of the other method’s minima via points of inflection in the potential energy profiles. Fig. 3 also shows the CP-corrected MP2 rotational energy profile, estimated using the intermolecular BSSE values computed for complexes of phenol and N-formylglycine. (The CP-corrected B3LYP profile is not shown, as it is indistinguishable from the uncorrected B3LYP curve on the scale of the figure.) Interestingly, BSSE correction of the MP2 relative energies reveals a shallow minimum in the rotational energy profile at 180°. In addition, the minimum at 290° has gained stability relative to the 74°-minimum, with the result that these two minima are nearly iso-energetic on the CP-corrected MP2 potential energy surface. As expected [2], the BSSE is much larger for MP2 than for B3LYP: whereas the B3LYP-computed BSSE values amount to 2 kJ/mol and are fairly constant throughout the /gly range, the MP2-computed BSSE values range from 11 to 17 kJ/mol, and exhibit a sharp maximum at 120° (Fig. 4a). In the 120°-structure the glycine residue is located above the tyrosine ring, positioning the N(Gly)–Ca(Gly)–Ccarb(Gly)–O(Gly) atom chain directly above the C4–C5–C6–C1 atoms of the aromatic ring. The
L.F. Holroyd, T. van Mourik / Chemical Physics Letters 442 (2007) 42–46
45
Fig. 5. Df-(L)MP2 and df-SCS-(L)MP2 potential energy profiles for rotation around the glycine Ca–N bond, using the aug-cc-pVTZ and augcc-pVQZ basis sets. The inset shows the variation of the BSSE values as a function of /gly (avdz = aug-cc-pVDZ; avtz = aug-cc-pVTZ; avqz = augcc-pVQZ).
Fig. 4. (a) Variation of the BSSE values (in kJ/mol) in the MP2 ˚ ) between the calculations as a function of /gly. (b) The distances (in A glycine and tyrosine ring atoms.
distances between the glycine and tyrosine ring atoms appear to be shortest around /gly = 120° (Fig. 4b). The close approach of the glycine residue to the aromatic ring in this conformation is presumably responsible for the large BSSE around 120°, which artificially stabilises the molecule in this area. Comparison of the uncorrected and CP-corrected rotational profiles clearly shows that the BSSE significantly distorts the MP2 rotational energy profile, thereby masking the 180°-minimum. 3.2. df-MP2 and df-LMP2 rotational energy profiles The BSSE correction method employed in the previous section is only approximate, as some differences between the actual intramolecular BSSE values in Tyr–Gly and the intermolecular BSSE in geometrically conforming complexes of phenol and N-formylglycine can be expected. In the current section, we follow two alternative routes to reduce the BSSE in the MP2 calculations: (i) the application of the local approach of the MP2 method (LMP2). Local methods have much reduced BSSE [22], but the local correlation approximation introduces errors arising from not correlating distant electron pairs. (ii) By employing large basis sets (aug-cc-pVTZ/QZ). This is expected to sufficiently reduce the BSSE to eliminate any artificial extrema in the MP2 rotational energy profile. MP2 calculations with basis sets of this size were feasible by employing the ‘density fitting’ (df) algorithm [23] implemented in MOLPRO to reduce computational cost. Fig. 5 displays the df-MP2 and df-LMP2 rotational energy profiles computed with the aug-cc-pVDZ (avdz),
aug-cc-pVTZ (avtz) and aug-cc-pVQZ (avqz) basis sets. The two df-MP2 curves are very similar, demonstrating that the calculations are nearly converged with respect to basis set size. The BSSE in the df-MP2/avqz calculations is small (<3 kJ/mol) and does not change much in magnitude over the entire /gly range (inset, Fig. 5). Thus, BSSE effects do not distort the computed rotational energy profile. When employing the aug-cc-pVDZ basis set, on the other hand, a sharp increase in the magnitude of the BSSE around /gly = 130° is observed. Note that the shape of the BSSE curve is very similar to that computed with MP2/ 6-31+G(d) (Fig. 4a). The large variation of the BSSE in the df-MP2/avdz calculations results in a small artificial minimum at /gly = 130° in the df-MP2/avdz profile (not shown). Also shown in Fig. 5 is the profile computed with dfSCS-MP2/avqz. The SCS-MP2 method has been shown to yield much improved potential energy curves for the benzene dimer (both in its stacked and T-shaped conformation) [35], significantly reducing MP2’s overestimation of the dispersion energy. In the benzene dimer, the SCSMP2 results are in excellent agreement with high-level CCSD(T) calculations. Fig. 5 shows that the global minimum is slightly destabilised by SCS-MP2 as compared to MP2; the MP2 and SCS-MP2 curves are very similar otherwise, indicating that overestimation of dispersion by MP2 is not a major issue affecting the rotational energy profile of this particular Tyr–Gly conformer. The df-LMP2/avdz and df-LMP2/avtz profiles were found to be almost identical (absolute relative energy differences <0.6 kJ/mol over the 50–300° /Gly range). Therefore, only the df-LMP2/avtz profile is displayed in Fig. 5. The BSSE is very small in the LMP2 calculations (inset, Fig. 5). The close similarity of the aug-cc-pVDZ and augcc-pVTZ LMP2 profiles, in addition to the good agreement
46
L.F. Holroyd, T. van Mourik / Chemical Physics Letters 442 (2007) 42–46
between the LMP2/avtz and MP2/avqz results (Fig. 5), indicates that the LMP2 calculations are converged with respect to basis set size already at the double-zeta level. Presumably, the main effect of basis set improvement in the non-local MP2 calculations is to reduce the BSSE. Again, application of the SCS correction to the df-LMP2/avtz values slightly stabilises the 180°- and 280°-minima (results not shown). 4. Conclusions We investigated the strikingly different conformation of the Tyr–Gly book4 conformer predicted by B3LYP and MP2 by computing potential energy profiles for rotation around the glycine Ca–N bond. The B3LYP/6-31+G(d) profile displays one minimum at 180°. The MP2/631+G(d) profile displays two minima, at 74° (the global minimum) and at 290°. A third minimum (at 180°) only shows up after BSSE correction. The triple-well profile is confirmed by (virtually BSSE-free) df-MP2/aug-cc-pVQZ and df-LMP2/aug-cc-pVTZ calculations. These results show that the large conformational dependence of the BSSE in MP2 calculations with medium-sized basis sets may significantly distort the potential energy surface and hide local minima. The missing minima in the B3LYP profile are likely due to the underestimation of dispersive interactions in the B3LYP calculations. (It is puzzling, however, that the profile computed with HF – which does not describe dispersion interactions either – does show two shallow minima around 80° and 280°.) The SCS correction does not significantly change the MP2 rotational energy profile, indicating that overestimation of dispersion by MP2 is not a major issue for the structure of this particular Tyr–Gly conformer. The main effect of basis set improvement in the non-local MP2 calculations appears to be a reduction of the BSSE; the virtually BSSE-free LMP2 calculations converge much faster with basis set size. Acknowledgements T.v.M. gratefully acknowledges the Royal Society for their support under the University Research Fellowship Scheme, the Engineering and Physical Sciences Research Council for funding for computer systems (Grant codes GR/R63196/01 and GR/S06233/01), and EaStCHEM for computational support via the EaStCHEM Research Computing Facility.
References [1] L. Gonza´lez, O. Mo´, M. Ya´n˜ez, J. Comput. Chem. 18 (1997) 1124. [2] C. Tuma, A.D. Boese, N.C. Handy, Phys. Chem. Chem. Phys. 1 (1999) 3939. [3] F.A. Hamprecht, A.J. Cohen, D.J. Tozer, N.C. Handy, J. Chem. Phys. 109 (1998) 6264. [4] T. van Mourik, R.J. Gdanitz, J. Chem. Phys. 116 (2002) 9620. [5] T. van Mourik, Chem. Phys. 304 (2004) 317. [6] I. Dabkowska, P. Jurecka, P. Hobza, J. Chem. Phys. 122 (2005) 204322. [7] S. Kristya´n, P. Pulay, Chem. Phys. Lett. 229 (1994) 175. [8] P. Hobza, J. Sˇponer, T. Reschel, J. Comput. Chem. 16 (1995) 1315. [9] M. Elstner, P. Hobza, T. Frauenheim, S. Suhai, E. Kaxiras, J. Chem. Phys. 114 (2001) 5149. [10] U. Zimmerli, M. Parinello, P. Koumoutsakos, J. Chem. Phys. 114 (2001) 2693. [11] S. Grimme, J. Comput. Chem. 25 (2004) 1463. [12] O.A. von Lilienfeld, I. Tavernelli, U. Rothlisberger, D. Sebastiani, Phys. Rev. Lett. 93 (2004) 153004. [13] A.D. Becke, E.R. Johnson, J. Chem. Phys. 122 (2005) 154104. ´ ngya´n, J. Chem. Phys. 126 (2007) 044103. [14] I.C. Gerber, J.G. A [15] S. Grimme, J. Antony, T. Schwabe, C. Mu¨ck-Lichtenfeld, Org. Biomol. Chem. 5 (2007) 741. ˇ erny´, P. Hobza, D.R. Salahub, J. Comput. Chem. 28 [16] P. Jurecˇka, J. C (2007) 555. [17] D. Toroz, T. van Mourik, Mol. Phys. 104 (2006) 559. ˇ eha et al., Chem. Eur. J. 11 (2005) 6803. [18] D. R ˇ eha, P. Hobza, J. Phys. Chem. B 110 (2006) 6385. [19] H. Valde´s, D. R ˇ erny´, P. Jurecˇka, P. Hobza, H. Valde´s, J. Phys. Chem. A 111 [20] J. C (2007) 1146. [21] S.C.C. Nunes, A.J.L. Jesus, M.T.S. Rosado, M.E.S. Euse´bio, J. Mol. Struct. (Theochem.) 806 (2007) 231. [22] T. van Mourik, P.G. Karamertzanis, S.L. Price, J. Phys. Chem. A 110 (2006) 8. [23] H.-J. Werner, F.R. Manby, P.J. Knowles, J. Chem. Phys. 118 (2003) 8149. [24] G. Hetzer, P. Pulay, H.-J. Werner, Chem. Phys. Lett. 290 (1998) 143. [25] M. Schu¨tz, G. Hetzer, H.-J. Werner, J. Chem. Phys. 111 (1999) 5691. [26] G. Hetzer, M. Schu¨tz, H. Stoll, H.-J. Werner, J. Chem. Phys. 113 (2000) 9443. [27] F. Grimme, J. Chem. Phys. 118 (2003) 9095. [28] T.H. Dunning Jr., J. Chem. Phys. 90 (1989) 1007. [29] R.A. Kendall, T.H. Dunning Jr., R.J. Harrison, J. Chem. Phys. 96 (1992) 6796. [30] F. Weigend, A. Ko¨hn, C. Ha¨ttig, J. Chem. Phys. 116 (2002) 3175. [31] M.J. Frisch et al., GAUSSIAN 03, Revision D.01, Gaussian Inc., Pittsburgh, PA, 2003. [32] NWCHEM version, Version 4.5, High Performance Computational Chemistry Group, Pacific Northwest National Laboratory, Richland, Washington 99352, USA, 1999. [33] H.-J. Werner, P.J. Knowles, R. Lindh, F.R. Manby, M. Schu¨tz, others, MOLPRO, version 2006.1, A package of ab initio programs.
. [34] S.F. Boys, F. Bernardi, Mol. Phys. 19 (1970) 553. [35] J.G. Hill, J.A. Platts, H.-J. Werner, Phys. Chem. Chem. Phys. 8 (2006) 4072.