Computer Physics Communications 141 (2001) 39–54 www.elsevier.com/locate/cpc
Time evolution in the unimolecular master equation at low temperatures: full spectral solution with scalable iterative methods and high precision Terry J. Frankcombe ∗ , Sean C. Smith Department of Chemistry, University of Queensland, Brisbane, Qld. 4072, Australia Received 22 March 2001; accepted 4 June 2001
Abstract A new method is presented to determine an accurate eigendecomposition of difficult low temperature unimolecular master equation problems. Based on a generalisation of the Nesbet method, the new method is capable of achieving complete spectral resolution of the master equation matrix with relative accuracy in the eigenvectors. The method is applied to a test case of the decomposition of ethane at 300 K from a microcanonical initial population with energy transfer modelled by both Ergodic Collision Theory and the exponential-down model. The fact that quadruple precision (16-byte) arithmetic is required irrespective of the eigensolution method used is demonstrated. 2001 Elsevier Science B.V. All rights reserved. PACS: 02.70.Hm; 34.50.Lf Keywords: Master equation; Spectral solution; Arnoldi method; Nesbet method; HONE method; Time-dependent concentration
1. Introduction The unimolecular master equation [1–3] (ME) has been a particularly useful approach to statistical unimolecular rate theory [1,2,4], helping to elucidate aspects of important gas phase reactions. However utilising the ME approach is often plagued by gross numerical instability, particularly at lower temperatures. (‘Low’ in this context is usually of the order of several hundred kelvin, but depends largely on the relative sizes of the reaction barrier and kB T , where kB is the Boltzmann constant and T is the temperature in kelvin.) To extend the application of ME formulations, * Corresponding author.
E-mail address:
[email protected] (T.J. Frankcombe).
new numerically stable and scalable methods of solving the ME are needed. For a thermally controlled unimolecular reaction, the ME statistically describing the evolution in time of the molecular population as a function of energy is ∂p(t; E) = ω ∂t
∞
P (E|E )p(t; E ) dE
0
− ω + k(E) p(t; E),
(1)
where p(t; E) is the population distribution, ω is the collision frequency, P (E|E ) is the energy transfer kernel describing the probability that a collision involving a molecule at energy E will leave that molecule at energy E, and k(E) is the microscopic rate constant for unimolecular reaction. Note that the ME pre-
0010-4655/01/$ – see front matter 2001 Elsevier Science B.V. All rights reserved. PII: S 0 0 1 0 - 4 6 5 5 ( 0 1 ) 0 0 2 9 8 - 3
40
T.J. Frankcombe, S.C. Smith / Computer Physics Communications 141 (2001) 39–54
sented in Eq. (1) is one-dimensional in that all quantities are expressed as a function simply of energy. Greater accuracy requires a two dimensional formulation, incorporating explicit dependence on both energy and angular momentum. In this form the unimolecular ME is a differentialintegral equation on a continuous domain. To make the ME more easily tackled with numerical methods it is common to bin the energy domain into a set of n uniformly distributed discrete energies {Ei }, transforming equation (1) into a system of coupled differential equations. This yields what is known as the energy grained master equation (EGME) [1,5], dp(t) = Ap(t), (2) dt where the vector p(t) is a representation of the continuous function p(t; E) sampled in the energy domain (so that pi (t) = p(t; Ei )). The n × n matrix A, the rate matrix, is given by Aij = ω δEP (Ei |Ej )
(3)
for off-diagonal elements and Aii = −ω δE P (Ej |Ei ) − k(Ei )
(4)
j =i
for elements on the main diagonal, where δE is the energy grain size. The solution of Eq. (2) as a function of time is easily shown to be p(t) = exp(At)p0 ,
(5)
where p0 ≡ p(t = 0) is a vector describing the initial population. Appearing in this solution is the matrix exponential exp(At), which is a difficult entity to calculate. A more easily calculated solution can be obtained as [1] p(t) =
n
αj exp(λj t)xj ,
(6)
j =1
2. Existing eigensolution methods
where λj and xj are eigenvalues and eigenvectors of the matrix A (so that Axj = λj xj ) and the projection coefficients αj are determined by projecting the initial population p0 onto the set of eigenvectors so that p0 =
n j =1
αj xj .
Whereas solution via Eq. (5) requires the determination of exp(At) for each time of interest, solution via Eqs. (6) and (7) requires a single eigendecomposition of the matrix A and a single projection for each initial population of interest. For the long-time solution only the smallest eigenvalue and corresponding eigenvector need be calculated. A set of the smallest eigenvalues and corresponding eigenvectors yield the solution for medium to long times and a complete eigendecomposition gives the solution for all times. The inverse relationship between the magnitude of the eigenvalue and the time scale to which it is most relevant means it is common to number the eigenvalues from the smallest to the largest. This convention will be used here, with λ1 being the eigenvalue of smallest magnitude with (for example) x1 as the corresponding eigenvector, together being referred to as the first eigenpair. It is worth noting that the eigendecomposition of Eq. (6) is not the only way to solve the ME. Solution of the ME through numerical Monte Carlo integration techniques has been used by several authors (see, for example, Barker [6], Barker and King [7] and Vereecken et al. [8]). The main advantage of the eigendecomposition approach over the alternative numerical integration approach lies in the computational time required for well resolved populations, particularly at medium to long times when the population is approaching equilibrium. In the next section an overview of some existing methods is presented, including the important Nesbet method and the motivation behind the new methodology being presented in this paper. The new method is presented in Section 3, followed by a discussion of the level of numerical precision required to perform population propagations in conjunction with an example calculation. Section 5 details another example calculation. The final section presents some conclusions.
(7)
The solution of the standard eigenproblem Axj = λj xj for rate matrices arising in this context is where problems start occurring. Typically a symmetrised version of the ME is used. As a consequence of detailed balance [3] the matrix B = SAS −1 is symmetric [1,9], where the diagonal matrix S is defined by its diagonal
T.J. Frankcombe, S.C. Smith / Computer Physics Communications 141 (2001) 39–54 −1/2
elements Sii = fi derived from the Boltzmann distribution f. The eigenvalue spectrum is invariant under this transformation and the eigenvectors xj of A can easily be obtained from the eigenvectors yj of B as xj = S −1 yj . The smallest eigenvalue of the rate matrix is typically well separated from the remainder of the spectrum. For rate matrices corresponding to low temperatures and pressures the smallest eigenvalue can be very small, and special methods must be employed to calculate this eigenvalue and the corresponding eigenvector accurately. The most successful of these methods for the smallest eigenvalue is the Nesbet method [10] as adapted by Gilbert and coworkers [11,12]. The Nesbet method can be derived in several ways, such as by considering the exact solution zeroing one element of the residual r = By − λy
(8)
or as an approximation to the matrix inversion required for the exact zero residual vector. Working with the symmetrised version of the rate matrix B, the resultant method iteratively takes an eigenvector estimate y(i) and corresponding smallest eigenvalue estimate λ(i) and calculates a new eigenvector estimate y(i+1) via an (i+1) (i) (i) element by element update yk = yk + σk , with the update given by rk (i) σk = − . (9) Bkk − λ(i) For the normal application of the Nesbet method to the smallest eigenvalue and corresponding eigenvector of the ME the residual is often calculated by non-standard approaches to minimise numerical error [11,12]. The update elements can be applied either sequentially for each element or simultaneously to all elements. The eigenvalue estimate λ(i) can them be updated from the eigenvector update. The Nesbet method has proved to be extremely successful in calculating the smallest eigenvalue and the corresponding eigenvector, which correspond to the long-time solution of the ME and hence the long-time population distribution. Other attempts at improving the efficiency and applicability of the uni-molecular master equation, such as the Equilibrium Finite Basis Set method [13] and the compound methods of Venkatesh and coworkers [14,15] also fail for difficult low temperature cases. While the Weighted Inner
41
Product Subspace Projection method [16] does operate at low temperatures, in its current form it works most effectively for the smallest eigenvalue and corresponding eigenvector at the expense of the higher eigenpairs needed for shorter times. The truncated steady state solutions of Green et al. [17] do not use a full eigendecomposition of the rate matrix, but are also limited in the time range to which they apply. A standard approach for determining the eigendecomposition of general large matrices is to use subspace projection methods [18], such as the Lanczos or Davidson methods. These methods project the matrix onto a subspace yielding a projected problem that is smaller, better conditioned, requires less core storage or has other beneficial properties. Particularly popular and robust methods are the Lanczos and closely related Arnoldi methods [19]. These project the matrix B onto a Krylov subspace K(m) (B, v) = span(v, Bv, B 2 v, . . . , B m−1 v), where v is a seed vector, and are effective in the presence of numerical error. These methods are particularly attractive for solving the unimolecular ME as eigenpairs with eigenvalues at the extremes of the spectrum converge the fastest in the Krylov subspace, meaning a set of eigenpairs to model the population evolution for medium to long times can potentially be calculated rapidly. An additional significant advantage is that the Lanczos method can be implemented using just a few vectors, an extremely important property when dealing with the large matrices that can arise in two-dimensional ME problems. To make more concrete the discussion of how the Krylov based methods perform for difficult low temperature ME matrices consider a ME for the decomposition of ethane at 300 K and 104 torr (1.3 MPa) with an energy transfer kernel derived from Ergodic Collision Theory [20] (ECT). The energy grain size was approximately 1.2 kJ/mol (100 cm−1 ) and an energy range of 718 kJ/mol was covered. We have used this model in past work to test low temperature ME eigensolution methods [16]. In order to accurately assess the performance of the method the 600 × 600 discretisation of the ME is small enough to allow the calculation to be repeated using a direct method implemented using very high levels of numerical precision. An adaptation of the EISPACK [21] QL procedure “rs” was used along with the MPFUN multi-precision system of Bailey [22] to directly diagonalise B. This approach
42
T.J. Frankcombe, S.C. Smith / Computer Physics Communications 141 (2001) 39–54
yields a numerical environment with a unit roundoff of around 10−200 (compared to standard double precision arithmetic with a unit roundoff of around 10−16 ) and requires around 400 megabytes of free core storage. Although the calculation time is very long, the resultant eigendecomposition yields demonstrably very small residuals and serves as an excellent basis with which to compare. Increasing the numerical precision is a technique that has been used in the past [23] to solve MEs at lower temperature than possible in double precision, though not usually to the 200 digit level used here. The Lanczos and Arnoldi methods are formally identical when applied to symmetric matrices, though numerical errors affect the two methods differently. While the Lanczos method in its basic form is only applicable to real symmetric matrices the Arnoldi method can be applied to any matrix, albeit at a higher computational cost. In principle the Arnoldi method can be applied just as easily to the original matrix A as to the symmetrised matrix B, or indeed to the matrix SBS −1 = AT which similarly has the same spectrum and easily transformed eigenvectors. In practice it is found that for the asymmetric cases of A and AT the Arnoldi method returns many complex eigenvalues, despite the fact that the spectrum is necessarily real. In this work the Arnoldi method has been used over the computationally less expensive Lanczos method for an observed improvement in numerical stability and faster convergence. When the projected eigenproblem is solved the eigenvalue estimates for all but the smallest eigenvalue converge giving a complete spectrum in little more than 600 iterations, the theoretical minimum number of iterations required to determine the 600 eigenpairs. Fig. 1 shows for selected eigenpairs the relative difference between the eigenvalue estimates produced by the Arnoldi method and the converged eigenvalues as a function of the number of Arnoldi iterations. The slowest to converge were the eigenvalues in the range λ225 to λ300 . Once the spectrum was converged the spurious eigenvalue estimates resulting from additional iterations were very small in magnitude. This is the well known ghosting effect [19] producing spurious copies of the smallest eigenvalue and corresponding eigenvector. Once the eigenvalues had converged there was at most 10−9 relative error between the eigenvalues determined by the dou-
Fig. 1. |λ(n) − λ(i) |/λ(n) , the relative error in successive eigenvalue estimates with respect to the converged value. 10th (—), 150th (− · · −), 275th (− · −), 450th (· · ·) and 600th (− − −) smallest eigenvalue. Sampled every 10 iterations.
ble precision Arnoldi method and those determined by the high precision direct method. Fig. 2 shows the eigenvectors corresponding to the 20th, 200th and 450th eigenvalues. Both the final eigenvectors from the Arnoldi method in double precision and those calculated directly in high precision are shown. This figure reveals the shape of the eigenvectors, which in general are not well known. As the eigenvalues get larger the eigenvectors maintain the same general shape with additional nodes (sign changes) appearing at low energies. The node structure never passes the 305th element, which is the energy grain corresponding to the reaction threshold. Towards the 305th eigenvalue the spacing between the eigenvalues decreases steadily, as does the magnitude of the low energy elements of the corresponding eigenvectors. For eigenvalues past the 305th the eigenvectors resemble delta function spikes, with exponentially decaying shoulders and some nodes at low energies. Beyond the 305th the eigenvalues increase in magnitude rapidly. A striking feature of Fig. 2 is that the double precision Arnoldi eigenvectors and those calculated in high precision agree where the magnitude of the vector is relatively large. However where the elements of the vector are small the double precision vector bears
T.J. Frankcombe, S.C. Smith / Computer Physics Communications 141 (2001) 39–54
43
Fig. 2. Absolute value of eigenvectors calculated with the Arnoldi Method in double precision (− − −) and with the high precision direct method (—). 20th (a), 200th (b), and 450th (c) eigenvector.
no resemblance to the accurate high precision vector, being comprised completely of numerical noise. From these results one can conclude that a double precision Arnoldi method can easily and accurately calculate the eigenvalue spectrum for all but the smallest eigenvalue — which can be quickly and easily calculated using the Nesbet method. The Arnoldi method can also converge certain regions of the eigenvectors, these being the regions in which the eigenvector elements are relatively large. However, significant portions of the eigenvectors remain completely inaccurate.
Consider then the propagation of an initial population through Eqs. (6) and (7). An accurate propagation requires the accurate determination of both the eigendecomposition and the projection coefficients αj . The standard method for obtaining a decomposition such as in Eq. (7) is through what is known as the projection theorem, which allows calculation of the coefficients as inner products: αj = yj , Sp0 .
(10)
Accurate projection coefficients cannot be calculated from the eigenvectors calculated in double precision
44
T.J. Frankcombe, S.C. Smith / Computer Physics Communications 141 (2001) 39–54
(such as those shown in Fig. 2), as all regions of the eigenvectors contribute to the coefficients calculated through Eq. (10). This is particularly the case for applications such as chemically or laser activated reactions or problems where the initial population is far from equilibrium. In these cases details of the initial population above the threshold energy (which are projected by the corresponding regions of the eigenvectors; regions which are generally composed entirely of numerical noise) become very important, and the corresponding erroneous contributions to the inner product in Eq. (10) can become large.
3. High-order Nesbet eigenvectors: The HONE method The discussion of the previous section suggests that the approximation to the eigendecomposition generated by an Arnoldi iteration would be a reasonable starting point for a method attempting to refine the higher approximate eigenvectors. Given that the spectrum can be obtained with reasonably high precision it is tempting to apply the Nesbet algorithm to refine the eigenvectors, with the shift λ(i) in Eq. (9) fixed at λ, the eigenvalue estimate from the Arnoldi method calculation. 3.1. Unmodified Nesbet method The result of applying the Nesbet method á la Eq. (9) to the 20th eigenpair with the eigenvalue fixed at that converged from the Arnoldi calculation is shown in part in Fig. 3. This figure shows only the low energy elements of the vector for the sake of clarity. The seed vector for the method was the vector from the Arnoldi calculation shown in Fig. 2(a) for the low energy elements with an exponentially decaying part resembling the true eigenvector pasted on from the 250th energy grain. The results are similar if the complete vector produced by the Arnoldi method are used and no such pre-emptive pasting was performed for the later calculations described in this paper. In the first few iterations of applying the Nesbet method the high energy tail of the eigenvector improves, though it does not converge to the accurate values. After the first few iterations the low energy elements of the vector — which had been accurately calculated by the Arnoldi
Fig. 3. Result of applying the Nesbet method to a partially converged vector with a fixed shift. Absolute value of eigenvector estimates (i) y20 corresponding to the 20th eigenvalue. Initial vector (—) along with the vectors from the 10th (· · ·) and 20th (− · −) iterations.
calculation — start to diverge. As can be seen from Fig. 3, after 20 iterations the resultant vector bears no resemblance to any eigenvector of B. After several hundred iterations the vector converges to a vector that, while being at first glance similar to that shown in Fig. 2(a) for the 20th eigenvector, does not represent any eigenvector of the matrix B. To quantify the quality of an eigenpair the residual norm is often used, being the norm of the residual vector r defined in Eq. (8). When the eigenvalues are large or cover a large range or when the elements of the eigenvector vary by many orders of magnitude, all of which apply to this case, a better measure is a relative residual norm. This can be calculated by taking the norm of a relative residual vector R calculated from the eigenpair λ and y by Rk = [(By)k − λyk ]/(By)k for element k, where (By)k represents the kth element of the vector By. A good eigenpair should return a small relative residual norm. For the application of the Nesbet method to the 20th eigenpair of B as previously described the relative residual norm was at its smallest with a value of approximately 0.001 after the 7th iteration, not particularly small when compared to the unit roundoff of around 10−16 . By the 13th iteration the relative residual norm was of order 100. A similar pattern was observed for all eigenvectors except the eigenvector corresponding to the smallest
T.J. Frankcombe, S.C. Smith / Computer Physics Communications 141 (2001) 39–54
45
eigenvector, for which the Nesbet method was developed. In the case of the eigenvectors beyond the 305th (those resembling delta functions) the elements corresponding to energies above the reaction threshold generally converged to the correct values except for the spike element, which was typically around twenty orders of magnitude too small. Elements corresponding to energies below the reaction threshold were typically exponentially increasing with decreasing energy rather than the correct exponentially decreasing behaviour. The remaining set of eigenvectors generally did not converge to the correct vector in any region. 3.2. Modified Nesbet: the HONE method A modification to this procedure yields a method that does converge. The method is comprised of two different applications of methods derived from the Nesbet method following an Arnoldi or Lanczos calculation. This is denoted the High-Order Nesbet Eigenvectors (HONE) method. The ‘High-Order’ component of the name represents the fact that the Nesbet method is being applied to eigenpairs with eigenvalues of higher order (higher in the spectrum) than simply the smallest for which it was developed. The unsuccessful calculation described above takes an eigenvector estimate y(i) , forms an update vector σ (i) using Eq. (9) then simply forms the next eigenvector estimate as y(i+1) = y(i) + σ (i) . In the successful variant of this procedure the converged elements of the vector are preserved by setting the corresponding elements of the update vector to zero. Figs. 4 and 5 show the results of repeating the calculation described above for the 20th vector with the first 150 elements of σ (i) set to zero. While the relative residual norm increases initially it soon starts decreasing exponentially before stagnating at less than 10−11 after 25 iterations. This relative residual norm indicates the resultant eigenpair is of high quality. The highest relative residual norm in Fig. 4 of around 1000 occurs at iteration 7. The eigenvector estimate corresponding to this iteration is shown in (7) Fig. 5, along with the product By20 and the raw (7) (7) residual vector By20 − λ20 y20 . Despite the fact that the eigenvector is quite close to the true eigenvector (7) (7) the product By20 varies significantly from λ20 y20 for the high energy elements, which leads to the relatively large raw residual vector elements. Also shown in
Fig. 4. Relative residual norm as a function of iteration from applying the HONE method to the 20th eigenpair resulting from an Arnoldi calculation.
(7)
Fig. 5. Absolute values of eigenvector estimate y20 (—), ma(7) (7) (7) trix–vector product By20 (− · −) and residual By20 − λ20 y20
(− − −) after the 7th iteration of the HONE method. Also shown is the residual after the 30th iteration (· · ·).
Fig. 5 is the raw residual vector arising from the eigenvector estimate from the 30th iteration, which can readily be seen to lead to the far lower relative residual norm of 10−11 . The residual vector after the 30th iteration shown in Fig. 5 exhibits a stark drop after the 150th element. This corresponds to the update cutoff — the first
46
T.J. Frankcombe, S.C. Smith / Computer Physics Communications 141 (2001) 39–54
150 elements of the eigenvector were not modified. No automatic measure for an appropriate modification cutoff was developed; each cutoff was set manually. If the cutoff was set too high then inaccurate elements from the Arnoldi calculation were preserved and the relative residual would not converge to a small norm vector. Setting the cutoff moderately too low would slow convergence. If the cutoff was set to a point before the eigenvector begins to drop into the exponentially decreasing tail the large magnitude elements of the eigenvector that had previously been converged by the Arnoldi method got modified and became inaccurate. This methodology can successfully be applied to the complete set of eigenpairs. For the delta function-like eigenvectors (those corresponding to the top half of the spectrum) generally only the element corresponding to the spike needed to be fixed, which was the j th element for the j th eigenvector. For the eigenvectors at the lower end of this region of the spectrum the low energy elements were problematic. These could either be converged using a damping of the Nesbet update or left unconverged to be dealt with in the second part of the HONE method, detailed next. It is apparent in Fig. 2(b) that the calculated eigenvectors corresponding to eigenvalues in the middle of the spectrum are also inaccurate in the small magnitude low energy elements. These inaccuracies show up in the relative residual vectors and get steadily worse as one moves up the spectrum, as shown in Fig. 6. The obvious remedy is to allow the Nesbet update to be applied to these elements. However when this is done, whether or not the high energy tail has been converged and/or fixed, the magnitude of the unconverged low energy elements grows rather than reducing towards the correct answer. The vector eventually converges at these increased magnitudes with no resemblance to the desired eigenvector in this low energy region. The low energy elements can be converged using a second method derived from the simple Nesbet method. The single vector update Nesbet method used so far is based on calculating an update vector σ (i) whose elements are found by solving each line of the equation
B (y + σ ) = r + B σ = 0
(11)
independently (where B = B − λI ), and assuming that the coupling between the elements of the raw
Fig. 6. Absolute value of relative residual vectors upon convergence of HONE method for high energy elements. 50th (···), 100th (· · ·), 150th (− − −), 200th (− · −), 250th (− · · −), and 300th (—) eigenpairs. Data smoothed for clarity.
residual vector (Eq. (8)) is small. This assumption breaks down for the low energy elements of the eigenvector, so that when one element is zeroed other elements become large. A more robust method is derived by partitioning the eigenvector y into two parts, y1 , (12) y= y2 where y1 contains the first q elements of y and y2 contains the remaining n–q elements. This splitting partitions equation (11) into blocks. It can then be expressed as B11 B12 σ1 r1 = − (13) σ2 r2 B21 B22 where σ and r have been partitioned in the same way as y and B partitioned into appropriate blocks (with being q × q). The eigenvector update becomes B11 (i+1)
y1
(i)
= y1 + σ 1
(14)
for the low energy part of the eigenvector and (i+1)
y2
(i)
= y2 + σ 2
(15)
for the high energy part of the eigenvector. In the present case we can restrict the method to update only elements of the eigenvector in the first block where
T.J. Frankcombe, S.C. Smith / Computer Physics Communications 141 (2001) 39–54
47
the relative residual vector is not sufficiently small by setting σ 2 to zero, effectively removing Eq. (15). This block Nesbet algorithm replaces Eq. (9) with the solution to the system of equations corresponding to the upper q × q block of B − λI , σ 1 = −r1 . B11
(16)
The assumption about the coupling within the solution is thus reduced from assuming that the entire vector is only loosely coupled to assuming that there is minimal coupling between the low energy and high energy parts of the vector (corresponding to y1 and y2 , respectively). The size of the corrected block taken is unrestricted. Taking a block size of one gives the traditional single vector update Nesbet method as a special case of this generalised block Nesbet method. Applying the second block component of the HONE method to the unconverged, low energy elements of the vectors produced by the first part of the HONE method (those producing the relative residual vectors shown in Fig. 6) successfully reduces the relative residual norms to less than 10−10 for all eigenpairs of the test matrix. Typically only one application of a correction from the block Nesbet component was needed to achieve this accuracy. No more than two iterations were required for any vector. The relative residual vectors resulting from applying all stages of the HONE method (including the initial application of the Arnoldi method) to the 200th eigenpair are shown in Fig. 7. Note that the Arnoldi and HONE first stage residuals are nearly coincident in the top-left of the plot and the HONE first and second stage residuals are quite similar in the bottomright of the plot. The cutoffs used for the two updates (the 330th element for the single vector update correction of the high energy tail and the 130th element for the block Nesbet correction of the low energy elements) show up clearly as discontinuities in the relative residual. These results are typical. The HONE method has achieved relative accuracy (a high level of accuracy relative to the precision with which the calculations were performed, in this case standard double precision, irrespective of the magnitudes of the eigenvalues and eigenvector elements involved). To conclude this section we present pseudocode for the complete HONE method. Bkk (as distinct from ) represents the kth element of the diagonal of B11 the matrix B and in general xk (as distinct from
Fig. 7. Absolute value of relative residual vectors at various stages for the 200th eigenpair. Calculated with the vector from the Arnoldi method after 900 iterations (· · ·) and that vector after applying the first, single vector update part (—) and the second block Nesbet part (− − −) of the HONE method. Data smoothed for clarity.
x(i) j ) represents the kth component of the vector x. last and ylast represent the residual vector r(i) and r j eigenvector estimate y(i) j , respectively, resulting from represents the last iteration of the previous i loop. B11 the leading qj × qj block of the matrix B − λj I . (i) σ (i) 1 and σ 2 represent the first qj and remaining n − qj elements of the vector σ (i) , respectively. HONE: Perform Lanczos/Arnoldi calculation to converge eigenvalues λj and large eigenvector elements y(1) j for j := 2, 3, . . . , n (1) r(1) := By(1) j − λj yj for k := 1, 2, . . . , n Rk := rk(1)/(By(1) j )k end select set of updatable elements Ij from R for i := 2, 3, . . . until convergence σ (i) := 0 for k := 1, 2, . . . , n if k ∈ Ij σk(i) := −rki−1 /(Bkk − λj ) end end (i) (i−1) yj := yj + σ (i)
48
T.J. Frankcombe, S.C. Smith / Computer Physics Communications 141 (2001) 39–54 (i) (i) y(i) j := yj /||yj ||
r(i)
:= By(i) j
− λj y(i) j
end (1) yj := ylast j r(1) := rlast for k := 1, 2, . . . , n Rk := rk(1) /(By(1) j )k end set cutoff index qj from R for i := 2, 3, . . . until convergence σ (i) = −r(i−1) Solve B11 1 1 (i) σ 2 := 0 (i) (i−1) yj := yj + σ (i) (i)
(i)
yj := yij /||yj || r(i) := By(i) j
− λj y(i) j
end end end HONE The set of updatable elements Ij is based on the relative residual vector R and the nature of the eigenvector in question. Generally it includes the indices in the region of unacceptably high relative residual vector elements corresponding to high energies for the lower half of the spectrum, and all but the index of the spike element (i.e. j ) for the upper half of the spectrum where the eigenvectors resemble delta functions. The update cutoff index qj is also determined by examining R. This should be set to the index of the upper extremity of the region of unacceptably high relative residual vector elements. An important property of a diagonalisation method for solving unimolecular MEs is scalability. Generally for a two-dimensional ME the matrix is too large to fit in core memory, meaning that only the action of the matrix on a vector is readily available. The HONE method presented here is comprised of generally scalable components. The Lanczos method can be implemented using only a matrix–vector product routine and a handful of vectors. The first, single vector update generalisation of the Nesbet method contained in the HONE method can also be performed with only a matrix–vector product routine and a couple of vectors. The second Nesbet component of the HONE method, the block Nesbet update, requires the solution of a linear system of equations. A two-dimensional imple-
mentation of the HONE method would likely involve a further generalisation of Eq. (13) into a greater number of blocks and the solution of larger systems of equations. While direct methods for this system solve have been used in this work, scalable options exist for such system solves, such as the MINRES [24] and GMRES [25] methods. While the coupling within the residual vector was found to be too strong to allow smaller blocks to be updated separately, a least squares solution of an over-determined system arising from a rectangular block of the matrix, also scalable, could potentially be used.
4. Required precision level Once a complete eigendecomposition has been determined, with a reasonable level of relative accuracy for the level of precision the calculations are implemented in, Eqs. (6) and (10) can be used to determine the time dependence of any particular initial population for all times. 4.1. Propagation in double precision The initial population used to test the propagation for this ethane test case was a distribution with all the population in the 350th energy grain. This grain corresponds to an energy of about 418 kJ/mol, a little way above the reaction threshold energy of 365 kJ/mol. This single spike microcanonical population may be regarded as similar to a population that could arise in laser excitation experiments, or crudely representative of a chemically activated population. When the propagation is performed in the symmetrised representation of the ME, the populations for all times are at first glance appropriate, with some seemingly insignificant noise in small values at low energies. These profiles are transformed back to the raw population by multiplication by f 1/2 . While higher energy features of the population profile remain well resolved when transformed to the raw population, the noise at low energies completely dominates. Some population elements are calculated to be negative. Any total population calculated by summing the energy resolved population becomes meaningless. Such a summed total population is the sort of variable
T.J. Frankcombe, S.C. Smith / Computer Physics Communications 141 (2001) 39–54
that may be experimentally observed, for example, by time-resolved absorption measurements. Further calculations reveal that even though the HONE method achieves reasonable relative accuracy, double precision calculations are insufficiently accurate to yield useful total populations. The highly accurate (200 digit) eigen-decomposition calculated with the MPFUN package was truncated to double precision to perform the projection and propagation. Though reduced, the numerical noise at low energies was still present and dominating. The fact that propagating the population in the high precision arithmetic (as opposed to truncating them to double precision) produces converged, noise free results indicates that even if perfect double precision relative accuracy could be achieved this would not be sufficient to solve the problem. Sample population distributions are shown in Fig. 8 for a time of 100 ns after initiation. While the numerical noise does become insignificant at times of the order of 500 ns, by this time the overall population is well into a single exponential decay which can easily be elucidated with the traditional Nesbet method, so little has been gained. Also shown in Fig. 8 is the population distribution calculated by truncating the high precision eigenvec-
Fig. 8. Modelled population distribution of ethane under an ECT kernel 10−7 s after initiation with a single spike population. Calculated from double precision HONE method eigenvectors (− − −), high precision eigenvectors truncated to double precision (· · ·) and quadruple precision (—). Negative values in ‘noisy’ region shown as absolute value.
49
tors to quadruple precision. Quadruple precision maintains around 32 decimal digits of accuracy, and is simulated on 64-bit processors in a similar fashion to the way double precision is simulated on 32-bit processors. It is obvious from Fig. 8 that the significance of the numerical noise at low energy has been greatly reduced, and agrees well with the population distribution calculated by doing the complete propagation in high precision. It is worth noting at this point that, as predicted, the eigenvectors produced by the Arnoldi method failed to reproduce the initial population as required by Eq. (7) everywhere bar the spike element, element 350. The reproduction at this element was exclusively due to the magnitude of the 350th element of the 350th eigenvector, which closely resembles the initial population. While such a decomposition could model the reactive decay of the population spike for a short period (which is characterised by a single exponential), the remainder of the population was completely unresolved with significant regions of erroneous negative population. 4.2. Quadruple precision The entire HONE method, including the initial Arnoldi calculation, was repeated in quadruple precision. At the end of the two Nesbet phases no eigenvector yielded a relative residual norm greater than 2 × 10−22 . Most relative residual norms were of the order of 10−28 . Results of propagating a population initially confined to the 350th element are shown in Fig. 9 for a selection of times after initiation. At times shorter than 10−12 s numerical noise is once more present in the low energy elements, though for times greater that 10−14 s this noise has no significant effect on the total summed population. For times greater than 10−14 s the transient dynamics that cannot be modelled with a single exponential decay are fully resolved. The HONE method, even though it involves several steps and the significant overhead involved in quadruple precision arithmetic, requires considerably less computational effort than the direct diagonalisation in a high precision environment. Some relative timings are presented in Table 1 for this 600 × 600 test case. It is worth noting that these timings are for a reasonably primitive implementation of the HONE method; little regard was taken for computation time
50
T.J. Frankcombe, S.C. Smith / Computer Physics Communications 141 (2001) 39–54
and several unnecessary quantities were calculated and written to disk at each iteration. The propagation times quoted are for calculating the distribution at 50 discrete times.
5. Exponential-down kernel example
Fig. 9. Modelled population distribution of ethane under an ECT kernel from eigenvectors calculated using the HONE method implemented in quadruple precision. Populations at 10−15 s (—), 10−13 s (− · · −), 10−10 s (− · −), 10−8 s (− − −), and 10−7 s (· · ·) after initiation. Successive curves shifted on x axis to show details of spike population reduction. Negative values in ‘noisy’ region shown as absolute value.
Table 1 Approximate relative timings for eigenproblem methods on 667 MHz Compaq XP1000. Includes I/O times Precision
Phase
Double
Arnoldi
Quadruple
High
Rel. time 1.0
HONE phase 1
0.5
HONE phase 2
0.4
Propagation
0.01
Total
1.9
Arnoldi
4.5
HONE phase 1
3.7
HONE phase 2
2.5
Propagation
0.08
Total
10.8
Direct method
38.4
(Confirm residuals) Propagation Total
(25) 0.3 38.7
While the ECT energy transfer kernel has a solid theoretical footing (describing the ergodic limit), a more commonly used kernel is the exponential-down or simply exponential model [26]. This model specifies the transitions to lower energy as an exponential dependent on the energy difference and a single parameter that can be equated to the quantity Edown , the average amount of energy transferred in collisions leading to a lower energy. Transitions upward in energy are specified by the detailed balance condition [3]. The exponential-down model kernel used in this work used the parameter Edown = 250 cm−1 (roughly 3 kJ/mol). This is slightly down from the Edown of the ECT derived kernel of around 3.63 kJ/mol, or 303 cm−1 . The behaviour of the Arnoldi method and the character of the eigenvectors, described below, was insensitive to changing this parameter to 150 or 350 cm−1 . An accurate eigendecomposition of the rate matrix using the high precision (200 digit) QL method could be obtained, though with significantly more effort than in the ECT case. The QL algorithm took longer to converge than in the ECT case by a factor of twenty five. This extended calculation time is indicative of the difficulty of finding the eigendecomposition of this matrix, a difficulty that prevented bringing the full power of the HONE method to bear on the problem. The HONE method was applied in quadruple precision to the ethane decomposition model described above. The application of the Arnoldi method was problematic. The eigenvectors found were considerably different to those found for the ECT derived rate matrix described in the preceding sections. While the general description of the eigenvectors corresponding to the smaller eigenvalues remained the same — an oscillating region with the number of nodes increasing with the position in the spectrum merging into an exponentially decaying region — the extent of the oscillating region grew much faster with the position in the
T.J. Frankcombe, S.C. Smith / Computer Physics Communications 141 (2001) 39–54
spectrum. The eigenvector corresponding to the fifth smallest eigenvalue had large elements (of roughly equal magnitude) up to the element corresponding to the reaction threshold energy; in the ECT kernel case this did not occur until around the 80th smallest eigenvalue. Unlike the eigenvectors of the ECT kernel derived matrix, none of the first 300 eigenvectors of the exponential-down kernel derived matrix showed a significant drop in magnitude of the elements corresponding to low energies. The upper end of the spectrum was more similar to the eigenvectors of the ECT derived matrix. Instead of a vector comprising two exponential decays meeting at a spike element many orders of magnitude larger, the two exponential decay parts of the vectors met at a spike element around a factor of 10 larger. While in the ECT case the eigenvectors retained the same basic shape right through the upper part of the spectrum, in the exponential-down case the large magnitude peak of the eigenvector broadens and gains structure as one looks further from the top of the spectrum. Where these behaviours met in the middle of the spectrum the eigenvectors exhibited mixed character. While the oscillatory part of the vector receded to lower energies and oscillated significantly less, a region of oscillation remained after a region of exponential decay. This oscillation region appears to be a precursor to the broad peaks in the next higher part of the spectrum. The various classes of eigenvectors are shown in Fig. 10. It is the eigenvectors in the region of the transition from these mixed character eigenvectors to the broad peak eigenvectors that cause trouble. Examination of the residuals and comparison to the 200 digit results show that the eigenvectors produced in this region of the spectrum by the Arnoldi method are spurious and the eigenvalues are inaccurate. Increasing the number of Arnoldi iterations from 900 to 2000 did not significantly alter the presence of these inaccurate ghosts in the spectrum, merely adding an additional 1100 ghosts of the smallest eigenpair to the bottom of the calculated spectrum. While reasonable approximations to the true eigenvector forms could be obtained through a shift-invert strategy, accuracy better than around 10−6 could not be obtained through any method investigated in quadruple precision. The Nesbet variations of the HONE method applied to these low accuracy vectors could not improve the accuracy in the critical region
51
Fig. 10. Absolute value of larger magnitude elements of representative eigenvectors of the rate matrix derived from the exponential-down model. Eigenvectors corresponding to the 20th (· · ·), 306th (—), 321st (− − −), and 500th (− · −) smallest eigenvalues.
just above the 300th energy grain. These difficulties left 25 eigenvectors missing from the spectrum. Without well converged regions of the vectors from which to start, the HONE method cannot be applied. This meant that a full spectrum could not be obtained for the exponential-down model derived rate matrix. However the HONE method was applied to those eigenpairs suitably converged by the Arnoldi method. The exponential-down kernel provided additional complications for the application of the first Nesbet part of the method to the low end of the spectrum. When correcting the parts of the vectors corresponding to unacceptably high relative residual vectors, the method would progress a little way, then stagnate leaving a high relative residual. To combat this the method was applied repeatedly, with each restart deeming more of the vector converged based on low relative residual vector elements. An example of the relative residual vectors that were achieved using this method is shown in Fig. 11. The successive update cutoffs are clearly visible as sudden drops in the relative residual vector. This approach was only required for the first twenty eigenpairs. The low energy parts of the eigenvectors in the first half of the spectrum do not drop off to small values in the exponential-down kernel case as they do in the ECT kernel case. The magnitude of these
52
T.J. Frankcombe, S.C. Smith / Computer Physics Communications 141 (2001) 39–54
Fig. 11. Absolute value of relative residual vector for eigenvector corresponding to the fifth smallest eigenvalue of the exponential-down derived rate matrix. Relative residual vector at the end of the Arnoldi method (· · ·), after stagnation of the HONE method (− − −) and after successive applications of the HONE method with adaptive cutoffs (—).
Fig. 12. Population profiles calculated for 10−6 seconds after initiation for the exponential-down kernel ME. Populations from the full high precision eigendecomposition (—) and from the eigendecomposition calculated using the quadruple precision HONE method, truncated to the first 200 (− − −) and 300 (· · ·) eigenpairs. Negative values in ‘noisy’ region shown as absolute value.
elements being similar to the magnitude of the largest elements allowed the Arnoldi method to determine the value of these elements accurately. Hence the block Nesbet update part of the HONE method was not needed to enhance these elements in this case. After the application of the single vector update part of the HONE method, all but two of the smallest 300 eigenpairs produced a relative residual norm less than 10−27 . The 300th eigenvalue, λ300 , was calculated to be approximately −7.373 × 107 s−1 . Recalling that the eigenpairs with the smallest m eigenvalues are sufficient to model times t 1/|λm |, using the first 300 eigenpairs gives this lower limit to be t 1.3 × 10−8 s. This lower limit turns out to be optimistic by around two orders of magnitude. Due to the density of the spectrum reducing the number of eigenpairs used to 200 raises the lower limit to just t 1.4 × 10−8 s. As the eigenpairs beyond a missing section of the spectrum cannot be simply used in a propagation of the form of Eqs. (6) and (7) and any such eigenpairs do not contribute significantly to the population at times significantly longer than given by the above limits, the upper part of the spectrum (beyond the missing 25 eigenvectors) was not fully converged using the HONE method.
Fig. 12 shows the effect of truncating the population expansion in Eq. (6) to 200 and 300 eigenpairs for t = 10−6 s. The deviation from the population determined using the complete eigendecomposition calculated in high precision is clearly much greater for the 200 eigenpair expansion than for the 300 eigenpair expansion, as expected. The level of noise in the 300 eigenpair case is around a factor of 10−7 smaller than the maximum population elements, indicating that at t = 10−6 s highly accurate total populations can be determined even in the presence of the numerical error. At t = 10−5.9 ≈ 1.26 × 10−6 s the calculated population shows no sign of numerical instability or noise. Fig. 12 shows no spike at the initiation energy (the 350th energy grain). Details of the destruction of the single energy character of the population occur at shorter times. The populations at these shorter times are subject to truncation error and are not revealed by the 300 eigenpair expansion being used. The calculated population for a range of times after initiation are shown in Fig. 13. These populations indicate that the truncated eigendecomposition can give the behaviour of the population well into the transient region before relaxing to a single exponential decay mode. While using the raw eigenvectors calculated using the Arnoldi method to project and propagate the
T.J. Frankcombe, S.C. Smith / Computer Physics Communications 141 (2001) 39–54
53
6. Conclusion
Fig. 13. Population profiles for the exponential-down kernel ME from a eigendecomposition truncated to 300 eigenpairs produced by the quadruple precision HONE method. 10−6 s (—, thicker), 10−5.8 s (- - -), 10−5.6 s (− − −), 10−5.4 s (− · −), 10−5.2 s (− · · −), 10−5 s (· · ·), and 10−4.5 s (—, thinner) after initiation. Negative values in ‘noisy’ region shown as absolute value.
population performs significantly better in this case than in the ECT case, significant errors still occur, especially in the low energy end of the population. These errors once again render the total summed populations unusable. Fig. 13 shows that with the exponential-down model kernel the population is approaching a single exponential decay 10−5 s after initiation. Though Fig. 9 does not show it, a similar population profile is reached around 10−7 s after initiation for the ECT model kernel. The total population is around a factor of 10 smaller in the ECT case, even at times a factor of 100 shorter. No significant reduction in population occurs after this long time solution has been obtained due to the very small unimolecular reaction rate constant (smallest eigenvalue) for this reaction. While admittedly these models are very different, the modelling shows that the two models are easily discriminated when comparing to potential experimental data for short lived transient populations. It is anticipated that more closely related models could be discriminated for similar difficult low temperature cases using the HONE method, as is common for less difficult, higher temperature cases [7,27–31].
The solution of the master equation for unimolecular and complex-forming bimolecular reactions reactions is notoriously difficult at low temperatures (i.e. around 500 K or below). However, a large amount of extant experimental data falls within this temperature range. Hence, both for atmospheric modelling and for validation of modelling parameters for use in combustion modelling at higher temperatures, it is crucial to develop accurate, robust and efficient methods for solution of the master equation in this low temperature regime. For simple (single isomer) unimolecular decay in the steady state regime, the Nesbet method [11,12] provides the most efficient solution. However, for systems of increasing complexity and in particular for complex-forming bimolecular reactions temporal evolution of the molecular population distributions becomes a central issue and necessitates more generalised methods. In this paper we have explored the spectral approach to time evolution in the ME problem. Using a onedimensional ME for ethane as a test system, it has been demonstrated that even though the higher eigenvalues of the rate matrix are accurately computed by standard methods in double precision, there is substantial numerical noise in the associated eigenvectors which renders a straightforward spectral approach infeasible. The numerical noise was shown to occur in the parts of the eigenvectors at or above the reaction threshold, which are very small in magnitude ( 10−16 ), in which case absolute accuracy to double precision results in pure noise. The next objective addressed was to develop a method capable of achieving relative accuracy at double precision, so that these problematic regions of the eigenvectors would be accurate to double precision relative to their own very small magnitude. This has been achieved for the first time in this work by a generalisation of the Nesbet method — the HONE (High-Order Nesbet Eigenvectors) method presented in detail above — used in conjunction with Lanczos or Arnoldi methods to compute starting approximations to the eigenvectors. We then showed that even with relative accuracy in the eigenvectors, double precision is insufficient for accurate spectral propagation in the ethane ME problem. Extension of the HONE method to quadruple precision (32 digits) allows reliable spectral propagation for an
54
T.J. Frankcombe, S.C. Smith / Computer Physics Communications 141 (2001) 39–54
ECT kernel in the rate matrix, but still displays some error for the more common (and physically more realistic) exponential kernel. Significantly, the failure of the HONE method in the latter case may be attributed to a failure of the prerequisite Arnoldi method (providing the initial approximation to the eigenvectors) rather than the generalised Nesbet method. We have also carried out ultra high precision calculations (200 digit precision) in all cases. While considerably slower than the HONE method and requiring substantial storage such that it could not be applied to more complex two-dimensional MEs, this approach has been crucial to provide a benchmark for comparison. The results presented have shed considerable light on the issue of developing discretised matrix solutions to the ME under conditions where temporal evolution is required rather than a steady state solution. Future directions in this work will involve the development of scalable iterative methods which are applicable to more complex one- and two-dimensional problems. It is now apparent that such methods will necessarily operate in a higher precision environment than is commonly used in scientific computations (in all likelihood beyond quadruple precision), and we are in the process of exploring these issues for a number of representative systems.
Acknowledgements Many thanks to Dr. Kevin Gates for helpful discussions and for providing microscopic rate constant and density of states information for the example calculation. We gratefully acknowledge ongoing support from the University of Queensland and the Australian Research Council (Large Grant No. A10027155) for this and related work. TJF is supported financially by an Australian Postgraduate Award.
References [1] R.G. Gilbert, S.C. Smith, Theory of Unimolecular and Recombination Reactions, Blackwell Scientific, Oxford, 1990.
[2] K.A. Holbrook, M.J. Pilling, S.H. Robertson, Unimolecular Reactions, 2nd edn., John Wiley & Sons, Chichester, 1996. [3] S. Nordholm, H.W. Schranz, in: J.R. Barker (Ed.), Advances in Chemical Kinetics and Dynamics, Vol. 2A, JAI, Connecticut, 1995, p. 245. [4] W. Forst, Theory of Unimolecular Reactions, Academic Press, New York, 1973. [5] B.J. Zwolinski, H. Eyring, J. Am. Chem. Soc. 69 (1947) 2702. [6] J.R. Barker, Chem. Phys. 77 (1983) 301. [7] J.R. Barker, K.D. King, J. Chem. Phys. 103 (1995) 4953. [8] L. Vereecken, G. Huyberechts, J. Peeters, J. Chem. Phys. 106 (1997) 6564. [9] E. Montroll, K. Shuler, Adv. Chem. Phys. 1 (1958) 361. [10] R. Nesbet, J. Chem. Phys. 43 (1965) 311. [11] B.J. Gaynor, R.G. Gilbert, K.D. King, Chem. Phys. Lett. 55 (1978) 40. [12] R.G. Gilbert, K. Luther, J. Troe, Ber. Bunsenges. Phys. Chem. 87 (1983) 169. [13] H.W. Schranz, S. Nordholm, Chem. Phys. 74 (1983) 365. [14] P.K. Venkatesh, A.M. Dean, M.H. Cohen, R.W. Carr, J. Chem. Phys. 107 (1997) 8904. [15] P.K. Venkatesh, A.M. Dean, M.H. Cohen, R.W. Carr, J. Chem. Phys. 111 (1999) 8313. [16] T.J. Frankcombe, S.C. Smith, J. Comput. Chem. 21 (2000) 592. [17] N.J.B. Green, P.J. Marchant, M.J. Perona, M.J. Pilling, S.H. Robertson, J. Chem. Phys. 96 (1992) 5896. [18] Y. Saad, Numerical Methods for Large Eigenvalue Problems, Manchester University Press, Manchester, 1992. [19] J.K. Cullum, R.A. Willoughby, Lanczos Algorithms for Large Symmetric Eigenvalue Computations, Vol. I, Birkhäuser, Boston, 1985. [20] S. Nordholm, B.C. Fresier, D.L. Jolly, Chem. Phys. 25 (1994) 433. [21] Available from Netlib, http://www.netlib.org. [22] D.H. Bailey, ACM Trans. Math. Software 21 (1995) 379. [23] M. Blitz, M.S. Beasley, M.J. Pilling, S.H. Robertson, Phys. Chem. Chem. Phys. 2 (2000) 805. [24] C.C. Paige, M.A. Saunders, SIAM J. Numer. Anal. 12 (1975) 617. [25] Y. Saad, M.H. Schultz, SIAM J. Sci. Stat. Comp. 7 (1986) 856. [26] J. Shi, J.R. Barker, Int. J. Chem. Kinet. 22 (1990) 187. [27] S.J. Klippenstein, D.L. Yang, T. Yu, S. Kristyan, M.C. Lin, S.H. Robertson, J. Phys. Chem. A 102 (1998) 6973. [28] P.W. Seakins, S.H. Robertson, M.J. Pilling, D.M. Wardlaw, F.L. Nesbitt, R.P. Thorn, W.A. Payne, L.J. Stief, J. Phys. Chem. A 101 (1997) 9974. [29] V.D. Knyazev, W. Tsang, J. Phys. Chem. A 104 (2000) 10 747. [30] S.J. Klippenstein, L.B. Harding, J. Phys. Chem. A 104 (2000) 2351. [31] L. Ming, T.D. Sewell, S. Nordholm, Chem. Phys. 199 (1995) 83.