An efficient procedure for calculating the molecular gradient, using SCF-CI semiempirical wavefunctions with a limited number of configurations

An efficient procedure for calculating the molecular gradient, using SCF-CI semiempirical wavefunctions with a limited number of configurations

Journal of Molecular Structure (Theochem), 206 (1990) 123-133 Elsevier Science Publishers B.V., Amsterdam - Printed in The Netherlands 123 AN EFFICI...

756KB Sizes 23 Downloads 55 Views

Journal of Molecular Structure (Theochem), 206 (1990) 123-133 Elsevier Science Publishers B.V., Amsterdam - Printed in The Netherlands

123

AN EFFICIENT PROCEDURE FOR CALCULATING THE MOLECULAR GRADIENT, USING SCF-CI SEMIEMPIRICAL WAVEFUNCTIONS WITH A LIMITED NUMBER OF CONFIGURATIONS

MICHAEL J.S. DEWAR Department of Chemistry, University of Texas at Austin, Austin, TX 78712 (U.S.A.) DANIEL A. LIOTARD Laboratoire de Chimie Structurale, Faculte des Sciences, 64000 Pau (France) (Received 24 June 1988; in final form 26 June 1989)

ABSTRACT An effective procedure is presented for calculating analytical derivatives of the energy in Hartree-Fock-type procedures including configuration interaction in cases where the number of configurations is small. This situation commonly arises in calculations involving semiempirical models such as MNDO or AMl. Existing numerical procedures are unsatisfactory, having been designed for ab initio applications where large numbers of configurations must be included.

INTRODUCTION

Theoretical chemists are becoming increasingly concerned with studies of chemical reactions that involve exploration of potential energy surfaces, the object being to locate the relevant stationary points (minima and transition states) involved in a reaction and the reaction paths joining them. For large molecular systems this can be achieved only by using a wide range of optimization methods, all of which require fast and accurate computation of derivatives of the energy. Calculations of this kind also commonly require the use of correlated wavefunctions. In ab initio SCF MO procedures, allowance for correlation is usually made by configuration interaction. To be effective, a large number of configurations must be included. In semiempirical schemes such as MNDO [1 ] and AM1 [ 21, allowance for electron correlation is made implicitly via the parameters (static correlation). In special cases, where correlation effects are unusually large, further allowance is made for this effect by the inclusion of configuration interaction (CI) terms involving a small number of relevant configurations (dynamic correlation). The molecular orbitals are usually determined by optimizing a single

0166-1280/90/$03.50

0 1990 Elsevier Science Publishers B.V.

124

determinant SCF wavefunction. The latter is normally of closed-shell type unless a HOMO/LUMO crossing occurs during the reaction in question, in which case the “half-electron” model [ 31 is commonly used. Analytical expressions for derivatives of the energy were described long ago for SCF closed shells [ 4-61. These results have also been expressed either in perturbational [ 71 or matrix {S] form. The first major applications, by Pulay [9] were to the computation of force fields by analytical-numerical methods. These have since been extended to UHF and MP2 wavefunctions [ 10 1, MCSCF plus CI, including second-order derivatives [ 11-141, to the Roothaan openshell treatment [15,16] and to very large CI wave~ctions using the GUGA [17] or MCSCF [ 181 formalisms. For a review see ref. 19. None of these approaches takes account of the special features of the semiempirical treatments indicated above, i.e. the fact that they are commonly applied to large molecules involving a large number of MOs, the speed with which the Fock matrix can be calculated (due to the small number of electron repulsion integrals), and the small number of con~~rations treated (because the effects of electron correlation are taken into account primarily via the parametrization). We report here an efficient analytical approach to the computation of derivatives of the energy in such models, taking into account the new possibilities offered by multiprocess~g or vectorization in the case of supercompute~. OVERVIEW

The energy of a single-determinant RHF plus CI wavefunction can be written as the sum of two distinct terms, the “SCF energy” and the “CI correction”. This is true for both closed-shell and “half-electron” functions. Each term depends explicitly on the coordinates of the nuclei (via the one- and two-electron integrals in the atomic-orbital basis) and on the electron cloud spanned by the molecular orbitals (MO). The MOs are implicit functions of the nuclear coordinates, determined by the RHF-SCF equations. These define the stationary condition to be satisfied by the MO at each nuclear geometry. The problem involved in an analytical derivation of the energy (E) has a well-known formal solution depending on the nature of the so-called response function [ 201. The corresponding relations can be stated explicitly in matrix form. A useful theorem of linear algebra gives the derivatives of the eigenvalues 2 and eigenvectors T of a hermitian matrix A &l=T’~AT+BIZ-AB

with B=dT’T

(1)

where AT=T,%andT’T=TT’=I

(21

The diagonal elements of an are the derivatives of the eigenvalues. Setting the

125

off-diagonal elements equal to zero leads to the ~tisymmetric matrix B, giving the derivatives of the eigenvectors 8T in a form equivalent to perturbation theory in the basis of the unperturbed eigenvectors of the matrix A [ 7,8,21]. BASIC RELATIONS AND DERIVATIVES

Let us introduce the following notation for both closed-shell and “‘half-electron” cases: C, molecular orbital coef~cients (by columns for different MOs); e, diagonal matrix of Foek eigenvalues; H, one-electron integrals (core hamiltonian matrix); 0, diagonal matrix of MO occupancies; D, the one-electron density matriu V, the fourth-order tensor of two-electron integrals (p(l)q(l) lr(2)e(2)) definedby VPQrs=2 (pq 1rs) - (ps 1rq) ; and B, rotation matrix between the MOs and their derivatives; dC = - CB. The one-electron density matrix is given by D=COC’

(3)

and its derivative by dD=KOC’+COl’K’=C(OB-BO)C’

(4)

The Fock matrix, F, takes the form F=I1+&5tr(OC’VC)

=H+O.56

(5)

where G = tr (DV) stands for G, = =Q,

(6)

V,,,

P9

Therefore, t?F=F1 +F,

(7)

with the non-relaxed term (F,) given by F1 =~H+O.5tr(D~V)

(3)

and the relaxation (F,) by Fz =0.5tr(aDV)

=0.5tr[ (OB-BO)C’VC]

(9)

The “SCF electronic energy” (E) is given by: E=O.&r[ (H+F)D]

00)

with the formal derivatives dE=0.5tr[

(dH+aF)D]+0.5tr[(H+F)dD]

(11)

This expression can be split into non-relaxed and relaxation components

aE=E1 +E,

(12)

126

where El=0.5tr[

(aH+F,)D]

(13)

and E,=O.&r(F,D)

+0.5tr[ (H+F)dD]

(14)

Taking into account the symmetry properties of V gives tr[~Dtr(DV~]=tr~Dtr(~DV~]

(15)

hence E,=tr(aDF) DIFFERENTIATION

(16) OF THE FOCK EQUATION

The response unction linking the electronic cloud (and its derivatives ) to the nuclear coordinates is the Fock equation (without overlap in NDDO formalism) FC=Ce

(17)

The first consequence of this relation appears in the relaxation term &. Fromeqns. (4), (16), and (17) E,=tr[

(OB-BO)e]

=tr[B(eO-Oe)

]

(18)

Since the matrices e and 0 are both diagonal they commute and Ez therefore vanishes. Next let us consider the matrix B, i.e. the MO derivatives and related expressions. Applying the main theorem to eqn. (17) using eqns. (8) and (9) gives de=C’F,C+Be-eB+O.tiC’tr[

(OB-BO)C’VC]C

(19)

The diagonal terms give explicitly the derivatives of the eigenvalues of the Fock matrix as functions of B. Writing down the nullity of the off-diagonal terms gives a linear system of equations for the off-diagonal B matrix elements. As matrix B is antisymmetric, its diagonal elements are zero and the number of unknowns is n (n- 1) /2 with n the number of MOs. Once this system of linear equations has been solved, all the ingredients needed to build the CT matrix and its derivative are readily available, as shown below. DERIVATION OF THE “CI CORRECTION” TERM

Explicit computation and storage of the two-electron density matrix does not offer any advantage if the number of configurations is small. Let Ho, be

127

the CI matrix with eigenvectors Ccl. The CI energy (Ec,) of the selected eigenstate is an eigenvalue of HcI; i.e.

&I = CCIHCICCI and, therefore, (20)

~&I=C&WICCI

The matrix aHcI is built directly from the derivatives of the eigenvalues of the Fock matrix and the derivatives of the (IJ 1KL) integrals over the CI-active MO: d(IJIKL)=(IJld(l/r)IKL)+(aIJIKL) + (Ih’JIKL) + (IJlaKL)

+ (IJIKdL)

(21)

The first term (non-relaxed component) requires a full four-indices transformation which can be done in two steps after computation and storage of the one- and two-center charge-distribution supervectors. These steps are not time consuming in the NDDO formalism. If the geometrical coordinate is Cartesian, the number of non-vanishing two-center integral derivatives ( (w I r3(l/r) I rs) ) over the AOs is small. The four remaining terms (relaxation component) come from a fast oneindex transformation involving the MO derivatives in the unperturbed MO basis (matrix B) and the ( nJ1 KL) integrals where n runs over all the MOs and J, K and L run over the CI-active MOs (dIJIKL)=CBi,(dIKL),etc. n

(22)

MO DERIVATIVES

Let /I be the supervector of unknowns with elements flij = B, (j < i) , i and j referring first to the subset 1 of pairs of MOs with different occupancies and then to the subset 2 of pairs of MOs with identical occupancies. The off-diagonal terms in eqn. (19) lead to (d-a)fi=/3°

(23)

where d is a diagonal super-matrix with elements Ai,,= (ej - ei) and ocis a nonsymmetric supermatrix given by aij,~~=0.5(0i-Oj)(4(~lkZ)-(iZIjk)-(ik(jZ))

(24)

and 8: = (C’F,C)ij Partitioning

eqn. (23) between the subsets 1 and 2 gives

(25)

and

The linear system [ eqn. (26) ] is usually solved by projection onto a small basis set of o~honorm~i~ vectors (a) generated by relaxation [ 10,111. The matrix alI does not need to be explicitly computed and stored. This point is essential for handling a large MO basis. Actually, for any vector pi (corresponding to a matrix Bi) we have lalllfli =C’&C

I@211

(26)

Using eqn. (9) gives Fz =0.5tr~C(OB~-BiO)C’V~

(29)

Thus & can be obtained from eqn. (27) by trivial inversion of the diagonal matrix & Use of eqn. (28) instead of eqns. (24) and (26) presents advantages when CI is limited, when the number of basis-set functions is large, and when the two-electron terms in the Fock matrix can be calculated rapidly. These conditions hold for semiempirical methods based on ZDO approximations. If degeneracy occurs, group theory predicts the associated element of fi2 to be zero (no rotation within the subset of degenerate MO) [22]. If the degeneracy is accidental, and thus very sensitive to changes in the nuclear geometry, the corresponding MO must belong to the same subset of occupancy at the “half-electron”SCF level and must be included in the CI-active MO for the numerical accuracy of both eqns. (26) and (27) not to be affected. CO~P~ATIONAL

DETAILS

In the NDDO formalism, resolution of the linear response equation [eqn. (26) ] is the only time-consuming numerical work to be performed. Care is to be taken in order to obtain the required accuracy (typically 0.001 on each element of 8) by some efficient procedure. Relaxation-projection methods are very sensitive to the dispersion of the eigenvalues of the matrix (A, - a,,) which must be implicitly inverted [ 211. The eigenvalue spectrum is conveniently narrowed by introducing a positivedefinite diagonal metric tensor S with elements nearly equal to the inverse of the square root of the diagonal elements of (d, -all) in place of& as suggested in refs. 10 and 11. Therefore, the linear system [eqn. (26) ] is best written in the form (S&S)

tS-‘A)=@%

(36)

129

i.e. AX=XO

(31)

with the vector of unknowns (32)

Xij=Si;i’(Oi-Oj)fiij and the right-hand side is best written as

x; =sijp;j

(33)

Since A is derived from the electronic Hessian matrix of the SCF equation, it is a symmetric positive-definite matrix with the off-diagonal elements given by Aij,kl= -sij~ij,klSkll(Oi

(34)

-Oj>

and the diagonal ones by

Aij,ij=Sij[ (ej -ei) - ‘Xij,ij]Sij/ (Oi

- Oj)

(35)

Optimum results for both speed and accuracy [ 231 are obtained by collecting a large number of right-hand-side vectors in eqn. (31) and solving them simultaneously. This is accomplished by the following series of steps. (1) extract an initial orthonormalized basis set V from X0; (2 ) update AV using the relationship outlined in eqn. (28); (3 ) update V’ AV and invert directly; (4) update V’XO and find an approximate solution X=BW with W= (V’AV)-‘V’XO; (5 ) compute the residues R = X O- AVW and if the largest residue is lower than the specified threshold then stop. Otherwise; (6) extract, an orthonormalized basis set from the residues, append it to the previous basis and go back to step 2. The extraction of a basis set from a given set of residual vectors R is best achieved by performing a principal-component analysis on the residuals [ 241. More precisely the eigenvectors T of the semipositive definite matrix R’R R’RT=Tp

(36)

provide, in the least-squares sense, an optimally hierarchical orthonormalized basis V: V=RTP-~‘~

(37)

where the eigenvalues ,uare ordered in decreasing order. More than 85% of the variance usually lies in a subspace of dimension about one-third the number of vectors R, thus allowing a slower increase in the size of the basis V without a significant loss of information. Thus to obtain one X vector, six V vectors

130

are usually needed, whereas if 12 X vectors are computed simultaneously, only 24 V vectors are needed. COMPUTER PROGRAM

The procedure indicated above is included in a new version (2.1) of the AMPAC program [ 251 which is now available from the Quantum Chemistry Program Exchange (QCPE).This permits use of analytical derivatives in all connections involving the use of CI wavefunctions. PERFORMANCE

Naturally, an important question concerning the procedure developed here is its performance relative to that used in the current AMPAC program [ 251 where CI derivatives are found numerically, by finite difference using full SCFCI calculations. Accordingly, some timing comparisons were carried out using a VAX 8350 computer the performance of which (1.1 MIPS) is similar to that of the VAX 11-780. For the reasons indicated above, the situations of main interest involve the use of the HE-C1 approximation (half-electron approximation with a 3 x 3 CI). The molecules chosen were a number of cyclic polyethers (CH,OCH,), with n ranging from 1 (oxirane) to 6 (crown ether 18C6). These molecules cover a wide range of molecular sizes, with the number of atoms ranging from 7 to 42, the number of basis-set AOs from 16 to 96, and the number of Cartesian coordinates from 21 to 126. While the calculations were carried out using MNDO, similar results would be expected using AMl, because the timings for MNDO and AM1 calculations are almost the same. In each case the geometry was first optimized at the SCF closed-shell HF level. Single-point calculations were then carried out for the energy, and derivatives of the energy, for the lowest HE-C1 singlet state, the derivatives being calculated both analytically and numerically (DERINU option in AMPAC ) . The PULAY and PRECISE options in AMPAC were used to ensure accurate convergence at the SCF level. In the calculations of the numerical derivatives it was also necessary to narrow the convergence limits (SCFCRT = 0.00000001) in order to obtain results as accurate as the analytical values. The standard deviations on the gradient components were ca. 0.01 kcal A-’ while those between the analytical and numerical gradients ranged between 0.004 and 0.009 kcal A-‘. The time (seconds of CPU time) required for the various geometry optimizations, and the ratios of the times required for calculating energies and derivatives are shown in Table 1. The ratio of the time required for calculating the analytical derivatives to that for calculating the energy increases with molecular size. The value (2.9) for oxirane is similar to the corresponding ratios

131 TABLE 1 The time required” for single-point HECI calculations of energies, and the analytical and numerical gradients for oxirane and its polymers ( C2H40)n n

1 2 4 5 6

Analytical gradients

Numerical gradients

C(num.)/G(anal.)

Eb

G’

G/E

Eb

G’

GIE

8 50 386 720 975

24 207 2566 5161 9930

2.9 4.1 6.7 7.2 10.2

12 53 371 798 1042

863 9924 26839 49593 82823

70.5 187.5 72.4 62.2 79.5

36.5 48.0 10.5 9.6 8.3

“Seconds of CPU time. “Time for calculating energy. ‘Time for calculating derivatives.

reported for ab initio procedures [ 261. Here, however, the ratio increases with molecular size, unlike the corresponding ratios for numerical derivatives and is in contrast to the values reported for ab initio calculations. Thus the advantage of using analytical derivatives in MNDO decreases with molecular size. However, the ratio remains large enough (8.3) even in the case of the crown ether to represent a major practical improvement. These differences can be easily explained. In both semiempirical (ZDO ) and ab initio CI calculations, the computing time required for gradients depends on the number of terms in the relevant expressions which varies in each case as m4, m being the number of basis-set functions. Since the same is true for ab initio calculations of RHF molecular energies, the ratio of times for calculating ab initio energies and derivatives is independent of m. In semiempirical procedures, however, the rate-determining step in the energy calculation is matrix diagonalization. Since the time required for this varies only as m3, the advantage of using analytical derivatives in semiempirical SCFCI calculations is expected to decrease with molecular size, as the results reported here indicate. However, the results also suggest that the use of analytical derivatives should lead to a major saving of computing time in the case of most systems of chemical interest. SUMMARY AND CONCLUSIONS

In the semiempirical procedures that have been developed by our group, allowance for correlation is made via the parametrization. Very extensive tests have shown the effectiveness of this approach. Special allowance for correlation becomes necessary only in the case of biradical-like species where the correlation between the “unpaired” electrons is unusually large. Such species unfortunately play a major role as intermediates in reactions.

132

Work here [ 271 and elsewhere [28] has shown that species of this kind are treated much more effectively by using limited CI, either with normal RHF MOs or with MOs found using the “half-electron” approximation, than by the UHF approximation. However, the latter has until recently been used almost exclusively because of the amount of computing time needed in the HE-C1 approach. For example, in a recent study [29] of rearrangements of C,,H, isomers, using the HE-C1 version of MNDO, the location of each transition state took upwards of 30 h on an IBM 3081 computer, on which our programs run ca. 15 times as fast as on a VAX 11-780. Calculations of this kind are now being carried out here routinely on a VAX 11-780, the use of the procedure described here having reduced the amount of computing time by more than one order of magnitude [ 30 3. Algorithms for calculating analytical derivatives of CI functions have been developed elsewhere and have been used extensively in ab initio calculations. In these cases however, enormous CI matrices must be handled. The procedure described here is primarily intended for systems involving limited CI terms of the kind involved in semiempirical treatments. In such cases it offers special advantages in that the most time-consuming parts of the computations involve standard manipulations of large matrices, which can be vectorized easily. The method should therefore, be particularly appropriate for implementation on supercompu~rs. Developments along these lines are in progress at Pau. ACKNOWLEDGEMENTS

This work was supported by the Air Force Office of Scientific Research (Grant No. ~89-0179), National Science Foundation (~H~7-12022) and the Robert A. Welch Foun~tion (Grant No. F-126).

REFERENCES 1 2

9 10

M.J.S. Dewar and W. Thiel, J. Am. Chem. Sot., 99 (1977) 4899,4907. M.J.S. Dewar, E.G. Zoebisch, E.F. Healy and J.J.P. Stewart, J. Am. Chem. Sot., 107 (1985) 3902. M.J.S. Dewar, J.A. Hashmall and C.G. Venier, J. Am. Chem. Sot., 90 (1968) 1953. S. Bratoz, Coll. Int. CNRS (Paris), 82 ( 1957 ) 287. D.M. Bishop and M. Randic, J. Chem. Phys., 44 (1966) 2480. R. Moccia, Chem. Phys. Lett., 5 (1970) 260. J. Gerratt and I.M. Mihs, J. Chem. Phys., 49 (1968) 1719,173O. L. Bouscasse, R. Phan-Tan-Luu, Ed. Vincent and J. Metzger, Int. J. Quantum Chem., 7 (1973) 357. P. P&y, Mol. Phys., 17 (1969) 197; in H.F. Schaefer III (Ed.), Modern Theoretical Chemistry, Plenum Press, New York, 1977. J.A. PopIe, R. Krishnan, H.B. Schlegel and J.S. Binkley, Int, J. Quantum Chem., 13 (1979) 225.

133 11 M. Dupuis, J. Chem. Phys., 74 (1981) 5758. 12 B.R. Brooks, W.D. Laidig, P. Saxe, J.D. Goddard, Y. Yamaguchi and H.F. Schaefer III, J. Chem. Phys., 72 (1980) 4652. 13 R. Krishnan, H.B. Schlegel and J.A. Pople, J. Chem. Phys., 72 (1980) 4654. 14 T. Takada, M. Dupuis and H.F. King, J. Chem. Phys., 75 (1981) 332. 15 Y. Osamura, Y. Yamaguchi and H.F. Schaefer III, J. Chem. Phys., 75 (1981) 2919. 16 Y. Osamura, Y. Yamaguchi, P. Saxe, M.A. Vincent, J.F. Gaw and H.F. Schaefer III, Chem. Phys., 72 (1982) 131. 17 Y. Osamura, Y. Yamaguchi and H.F. Schaefer III, J. Chem. Phys., 77 (1982) 383. 18 T.U. Helgaker, J. Almlof, H.J. Aa. Jensen and P. Jorgensen, J. Chem. Phys., 84 (1986) 6266. 19 J. Simons and P. Jorgensen (Eds.) Geometrical Derivatives of Energy Surfaces and Molecular Properties, Reidel, Dordrecht, 1986. 20 H.F. King and A. Komornicki, J. Chem. Phys., 84 (1986) 5645. 21 E. Durand, Solutions Numeriques des Equations Algebriques, Vol. 2, Masson, Paris, 1961. 22 M.D. Kumanova, Mol. Phys., 23 (1972) 407. 23 J.E. Rice, R.D. Amos, N.C. Handy, T.J. Lee andH.F. Schaefer III, J. Chem. Phys., 85 (1986) 963. 24 M. Kendall, Multivariate Analysis, Chap. 2, Macmillan, New York, 1980. 25 Quantum Chemistry Program Exchange (QCPE), Indiana University, Program No. 506. 26 See, e.g., A. Komornicki, QCPE Bull., 8 (1988) 1,9. 27 J.J. Dannenberg and K. Tanaka, J. Am. Chem. Sot., 107 (1985) 671. 28 M.J.S. Dewar, S. Olivella and J.J.P. Stewart, J. Am. Chem. Sot., 108 (1986) 5771. 29 M.J.S. Dewar and K.M. Merz Jr., J. Am. Chem. Sot., 108 (1986) 5142,5146. 30 A referee has suggested that the present procedure might be further accelerated by using Handy’s Z vector method [31], which leads to substantial improvement in the case of ab initio procedures. To implement this in AMPAC would, however, be a major undertaking, involving rewriting all the CI routines with explicit computation and storage of the secondorder density matrix after CI. While we hope to do this at some time in the future, the present version represents such a major improvement that we feel it should be presented as it stands, given its immediate practical value in applications to chemical problems. 31 N.C. Handy, J. Chem. Phys., 81 (1984) 5031.