0 ELSEVIER
Applied Numerical Mathematics 21 (1996) 335-352
APPLE ID ,,
A stiff ODE preconditioner based on Newton linearization Colin J. Aro a,b Computer Research Group, Lawrence Livermore National Laboratory, P.O. Box 808, L-50, Livermore, CA 94551, USA Department of Applied Science, University of California, Davis, CA, USA
Abstract
The equations of atmospheric chemistry are very stiff and represent the major computational obstacle in the simulation of the chemical and physical processes in the atmosphere. This paper compares two examples of "preconditioned time differencing", a technique for constructing cheap time stepping schemes for chemical kinetic ODE systems by eliminating much of the matrix algebra associated with the usual implicit calculation. These preconditioned schemes are also compared with a well known implementation of the popular backward differentiation formulas, and are shown to be a viable alternative to these fully implicit BDF methods applied to chemical kinetics systems. Tests are carried out on a small 2 species system, a box model problem, and a full three-dimensional chemical-radiative-transport model. Keywords: Preconditioning; Stiff ODEs; Atmospheric chemistry
1. Introduction
The field of atmospheric chemistry has recently seen a great deal of research into the development of stiff ordinary differential equation (ODE) solvers. The equations of atmospheric chemistry are usually very stiff and represent the major computational obstacle in the numerical simulation of the chemical and physical processes in the atmosphere [12,13,34,37]. Various techniques and methods for dealing with or reducing the stiffness of these chemical ODE systems include the lumping or family approach [6,7,11,15,41], hybrid methods [26], iterative methods [12,18,20,36,41], and preconditioned implicit methods [2-5,42-44]. The preconditioned implicit techniques have shown considerable promise and will be the focus in this study. In most cases, the iterative technique used as the preconditioner in these methods is a variant of Picard iteration. This note describes a high performance preconditioner suitable for problems in diurnal atmospheric chemistry. This preconditioner is compared to the Jacobi preconditioned method [4] and the implicit backward differentiation formulas (BDFs) used in the general ODE solver LSODE [30]. Comparisons are performed for a small 2 species system, a box model, and a full 3-D chemicalradiative-transport (CRT) model. The new method is shown to be a viable choice in terms of CPU speed and accuracy achieved in a full model. 0168-9274/96/$15.00 Copyright © 1996 Elsevier Science B.V. All rights reserved PII SO 168-9274(96)00017-7
336
C.J. Aro / Applied Numerical Mathematics 21 (1996) 335-352
1.1. P r e c o n d i t i o n e d time differencing
Preconditioned time differencing [4,31] is a technique for constructing cheap ODE time stepping methods by combining an iterative technique with an implicit ODE integration formula. It has been successfully applied to the equations of chemical kinetics in a variety of settings [3-5,42,43]. The basic idea behind preconditioned time differencing is illustrated by considering the first order nonlinear system of ordinary differential equations: dy d--t = I ( y ' t),
(1)
y(O) = Yo.
A k-step implicit multistep formula for solving these equations is [24] Vn+l = A t ~ k f (v,~+l,
tn+l ) _]_gn
(2)
(with At = step size, tn+l = ( n -~ 1)At, yn+l ~ y ( t n + l ) , flk = const, and gn a known vector). A function F ( u , v, t) is defined such that F ( v , v, t) = G ( v , t) = A t / 3 k f ( v , t) + gn. This function is commonly derived from an iterative technique for solving the nonlinear algebraic system defined by Eq. (2). A preconditioned time difference method takes a predetermined number of iterations (usually one) of this iterative technique and is given by:
Algorithm 1.1. 1. Set (y,~+J)(0) (the initial guess). 2. For s = 1 , 2 , . . . , l, compute (yn+l)(s) = F ( ( y n + l ) ( s ) , ( y n + l ) ( s - O , t n + l ) . 3. Set y,~+l = (yn+l)(t). Common choices for the preconditioner F are the Jacobi preconditioner [4]: F~(u, v, t) = A t f l k f i ( v l ,
v2, . ., . v i - .l , u~, . vi+l,
., VN, t) + gin,
and the Gauss-Seidel preconditioner [4,42,44]: F i ( u , v, t) = A t / 3 k f i ( u l , u2, . . . , u i - l , ui, vi+l , . . . , VN, t) + gn,
where N is the number of components in the vector y. The methods described by Algorithm 1.1 have proven to be successful, when compared to the implicit BDF techniques commonly used in treating chemical kinetics problems [3,44]. These preconditioned schemes produce explicitly computable methods with excellent stability characteristics. This allows them to take numerical time steps comparable with the typical full implicit techniques, while retaining the numerical efficiency of a purely explicit technique. 1.1.1. C H E M S O D E
This section describe CHEMSODE [2], a collection of FORTRAN subroutines implementing a preconditioned time differencing technique. The package uses the Jacobi preconditioner applied to the backward Euler method. This "preconditioned backward Euler method" is then used as the initial guess (Step 1 in Algorithm 1.1) for a Jacobi preconditioned trapezoid rule. In both cases, only a single iteration is used (in Step 2 of Algorithm 1.1). This technique has been used successfully in
C.J. Aro / Applied Numerical Mathematics 21 (1996) 335-352
337
mathematically idealized chemical kinetics problems [4], box model problems [3,4], and full 2-D and 3-D CRT model problems [3,5]. The CHEMSODE package uses a variable step size determined by the local error indicator: ~n+l
__
2
c + 1 (cYn+l - (1 + c)y n + y~-l)
(tn - t n - l )
with c - (~n+l
-
-
~;n)"
(3)
This indicator gives an estimate of At2y'(t~) + O(At3), which is a bit conservative, but has been used successfully [3,42,44]. The step size is adjusted by requiring that
n+l max( ]E;__[ i \ATOL(i) + RTOL(i)y~
)
fn+l
- II
II ~< 1,
where ATOL and RTOL are arrays containing the absolute and relative error tolerances respectively. That is to say, the local truncation error is forced to be less than ATOL(i) + RTOL(i)y~z for each i, at each step. Hence, ATOL specifies a portion of the error tolerance as an absolute parameter, while RTOL specifies the remaining portion as a percentage fraction of the species value. This method of specifying error tolerance is common in ODE solvers of this type [2,8,19,21,29,30,44]. Thus, if the weighted norm liE n+l II is less than one, the step is accepted. The next step size to be attempted is calculated via the expression
Atnew =
0.94
\
ll) 1/(p+l) ) Af°ld'
where p is the order of the integration scheme (taken to be one in this case, since the error estimate is first order). The step size is not allowed to be increased more than a factor of ten for any single adjustment. Whenever a step is rejected, the step size is subsequently not allowed to be increased by more than a factor of two in any single adjustment. When at least ten successive steps are accepted, the step size is again allowed to increase by as much as a factor of ten. This scheme prevents the method from increasing the step length too aggressively. The maximum and minimum step sizes are 1 hour and 1 second respectively. A non-zero minimum is needed because of the intolerably small step sizes forced by the approximate error indicator Eq. (3) at start-up. The initial step size attempted is the minimum step length. All implementation details may be found in [2]. CHEMSODE's major advantages are a simple code structure (less than 600 lines of FORTRAN code) and a high degree of memory efficiency.
1.1.2. LSODE The LSODE [30] package is also used for comparison in this study. LSODE has been used in many CRT model studies [23,28,46-48] and is representative of the state-of-the-art in the robust implementation of implicit multistep methods for the general ODE problem. For the treatment of stiff problems, LSODE uses the BDF techniques [16,17]. These methods are the most popular currently used for the general stiff problem [35]. LSODE implements these methods in a way that allows both the step size and the method order to dynamically vary throughout the course of the integration (with method order varying from one to five with the BDFs). In this study, LSODE will be called to use the BDFs, using a Newton iteration to solve the resulting nonlinear system. The Jacobian matrix is automatically generated using finite differences and is inverted as often as the code deems necessary. This is one of the code's optimizations, since it is
338
C.J. Aro /Applied Numerical Mathematics 21 (1996) 335-352
sometimes unnecessary to change the Jacobian at every step. LSODE (as with CHEMSODE) does not take any special vectorization advantage (as in [21]). The two solvers should therefore provide a fair and accurate comparison between the integration techniques under investigation regardless of the architectural platform used.
1.2. The Newton preconditioner Preconditioned time differencing begins with an implicit ODE integration method (Eq. (2) for example). It alleviates the problem of performing a full nonlinear solve on (2) by only approximately solving this nonlinear system with a small fixed number of iterations of an iterative solution technique. One normally selects the particular iterative technique to take optimal advantage of the specific characteristics of the particular class of problem under consideration. The advantages that the Jacobi and Gauss-Seidel preconditioners have on chemical kinetics problems are that they render the ODE method explicit and that the Jacobian matrix Of/Oy does not have to be constructed and inverted. These are large computational advantages [4,5]. A compromise between the "no Jacobian" and the "full Jacobian" methodologies is attempted here. This idea is shown to be a viable choice when compared to the Jacobi technique used in CHEMSODE. In [39], Stott and Harwood approximated the Jacobian matrix used in the Newton-Raphson solution of a backward Euler integration by its main diagonal. Similarly, the Kregel method (or K-method) [27] seeks to reduce the size of the Jacobian via a similar approximation. These innovations are most appropriate when the step size is very small or when the Jacobian is diagonally dominant [27]. Working along similar lines, we make the approximation: i~f~ f i ( Y n+l , /~n+') ~ f i ( y n , ~ n + l ) + ~yi (Yn, tn+') [ n+l
- v?)
+
O(Ay2).
One can then use this to define the preconditioner:
Fi(u, v, t) = At~k fi(v, t) + -~vi(V)iUi - vi)
gi.
(4)
This preconditioner is applied in Algorithm 1.1, taking only a single iteration. This is detailed as follows:
Algorithm 1.2. 1. Set (yn+!)o = ~-a+! (an appropriate "predictor"). 2. Set (fn+m)0__
fi((yn+l)O, tn+l)
and
(j~+l)O
0f~ ~yi((yn+l)O, tn+l).
3. Solve Yin+l ) 1 = At/3k[(f~+') ° nt- ( j ? + l ) O ( /
n + l \)l _ [ n+l ) o ) ] + g ,n,
i.e., set At/3k[(fn+l) ° _ ( j ni+ l ) 0 (y ni + l ) 0] + n+lXl Yi ) =1 -- At"Jk "kdi ) 4. Set yn+l =
(yn+ 1) 1.
gin
C.J. Aro /Applied Numerical Mathematics 21 (1996) 335-352
339
Note that Algorithm 1.2 only requires knowledge of the diagonal of the Jacobian matrix, and that solving for (yn+l)l in Step 2 only requires that a sequence of linear scalar equations be solved. This eliminates the need to solve a sequence of nonlinear (quadratic, in the case of chemical kinetics equations) scalar equations as with the Jacobi preconditioned method [2], so the proposed method is slightly more efficient than Jacobi in terms of the computational effort that must be expended to compute the next time advanced value. Additionally, the main diagonal of the Jacobian matrix is often easy to calculate for certain classes of problems. For the atmospheric problems discussed in Section 2, it is immediately available with no extra computation. One drawback that this technique will share with the Jacobi preconditioned method (and other types of preconditioned schemes) is that it fails to conserve the total number of atoms involved in the simulation [4]. This diagonal approximation is appropriate when the Jacobian matrix is diagonally dominant [27]. This situation will always guarantee stability, but is unfortunately not always the case in chemical kinetics (see the "five species" problem in [4]). Nonetheless, the numerical calculation can still be stable if this condition is violated. As to the numerical accuracy of the Jacobian calculation, we require only that the Jacobian's elements be calculated to at least as high a numerical accuracy as the species concentrations themselves. This will ensure that the calculations retain the proper order of numerical accuracy when the algorithm is implemented. 1.3. Goals Jbr this study
This study will quantify the benefits of eliminating the need for matrix algebra in the ODE time stepping method of a full 3-D atmospheric CRT model. The remainder of this study will develop a preconditioned time difference algorithm using the preconditioner described in Algorithm 1.2. The underlying implicit ODE integration techniques will be the backward Euler method and the trapezoid rule. Even though the trapezoid role is not the best choice for a very stiff problem, its one step nature makes it very memory efficient for larger problems (such as the 3-D CRT model used in this study). In addition, the trapezoid rule is the underlying method used in the Jacobi preconditioned algorithms implemented in CHEMSODE, so its use here will provide a fair and accurate comparison with those techniques.
2. Chemical kinetics
The equations of atmospheric chemistry can be cast in the nonlinear form [3,38]: dy~ = /; 2 - fi(y,t) P~(y,t) - [,~(y,t)y~ - -L~(y, )Yi, (5) dt where y is avector of chemical concentrations (y~ is the ith component of the vector), t is the time, and Pi, Li, Li are non-negative quantities representing chemical production and loss rates. Important to note is that the production and loss rates Pi, L~, Li, while functions of y, do not depend on the component Yi. Eq. (5) includes homogeneous, elementary chemical reaction mechanisms whose rates of reaction depend only on the reactants involved. This includes virtually all of the chemical reactions that have
C.J. Art / Applied Numerical Mathematics 21 (1996) 335-352
340
been studied experimentally [38]. Furthermore, the kinetic order of the reaction is not allowed to be more than two with respect to any of the reactants involved. The rates for third or higher order dependence are negligible due to the small collision cross sections for these reactions [38]. Reaction rates having kinetic order higher than three are possible, but have never been observed in the chemical literature [38]. Eq. (5) therefore includes a great deal of chemical kinetics and is certainly an appropriate assumption for the atmosphere, but is not all encompassing. Clearly, the main diagonal of the Jacobian has -([,~ + 2L~yi) as its ith element. These elements are readily accessible to any chemistry solver using an interface similar to CHEMSODE [2], that is, for any solver defining the ODE problem in terms of Pi, L,i and Li. Further, it is easily seen from the form of the production and loss rates (illustrated in [4], for example) that numerically approximating these diagonal entries by approximate values of the dependent variable is accurate to the same order with respect to the numerical steplength as the original numerical accuracy in the dependent variable. This guarantees that the numerical calculations will retain the appropriate order of numerical accuracy. Thus, the easy availability of these diagonal elements also guarantees sufficient numerical accuracy. The proposed technique is illustrated here. One iteration of the "Newton" preconditioned technique (Algorithm 1.2) is applied to Eq. (5) and is detailed in Algorithm 2.1: Algorithm 2.1. 1. Set (yn+~)o 2. Set
=
~+1
(again, an appropriate "predictor").
(p?+l)O = Zi((y.+,)o
t.+,),
=
t.+2) '
(pn+,)o _
(fn+,)o
_
and
(jn+l)0 = - ((Lin+')° + 3. Set
(yn+l)l \
4. Set
i
]
Atl3k[(f?+l) ° =
1 -
_
j n + l o y,~+l)o ( i ) ( i ]+gi °
yn+l = (yn+l)l.
Generalization to more than one iteration (in Step 2 of Algorithm 1.1) is done in the obvious fashion. The effect of increasing the number of iterations often does nothing other than consume additional CPU time with little additional benefit. In [4], a linearized version of a chemical kinetics system is analyzed and shown to be very slowly convergent. With this in mind, it is usually best to reap the stability advantage that a single iteration provides and not try to iterate to convergence. This study will therefore not investigate the effects of additional iterations of the preconditioner. In the remainder of this section, runs are carried out on the small Chapman mechanism, described in [4,9]. These comparison runs are performed at a constant step size in an effort to compare the accuracy of this technique to other comparable techniques (namely the BDFs) used in the treatment of chemical kinetics equations (see [3,4] for an analogous study of the Jacobi preconditioned algorithm).
C.J. Aro / Applied Numerical Mathematics 21 (1996) 335-352
341
The specific methods used in this study are the first order backward Euler method and the second order trapezoid rule. These two methods have previously been successful in developing the second order Jacobi technique used in CHEMSODE. This study duplicates that methodology with the Newton preconditioner (described by Algorithm 2.1) used in place of the Jacobi preconditioner. Applying Algorithm 2.1 to the backward Euler technique and using the result as the predictor in the preconditioned trapezoid rule results in the following algorithm. Algorithm 2.2 (Newton). 1. S e t ( y n + l ) 0 = yn (the identity predictor [4]). 2. Set
(P/n+') ° = Pi((yn+')O,~n+,) (~7+')° = ~((Un+')°,t,+,),
,
(~//+l)O=-Li((yn+l)O,[;n+l),
(f?+l)o= (p;,+l)o- (~7+,)O(y.+,)o_ (-~nii+')O((yn+l)O(yn+l)O),
and
(J['+l)°=-(([-,in+')° -t- 2(L7+1)°(yn+l)°). 3. Set
(y7+1)I/2= At[(f/n+l) 0 - ( ,an+l )(~ o yn+l )]+y~ o n 1 - At(J?+i) ° (Preconditioned backward Euler method). 4. Set (p/,~+ 1)l/2=pi((yn+l)l/2
l~n+l), tn+l) '
(~,7+1)1/2 :Zi((yn+l)l/2 (V ÷' )l/2=--Li((yn+l)l/2 tn+l) ' (f/n+ 1),/2= (r?+l)1/2 _ (~7+1)1/2 (y,~+l),t2 _ (~+l),/2((y~+l)1/2 (j~<+l ) 1/2 ~----
(yn+l)1/2),
and
((L7+1) 1/2 -t- 2(-~i+')'/2(yn+')1/2).
5. Set 1A/-[( len+ 1,"l1/2 - ( j ? + l ) l / 2 (y~+l)i = ~-vLwi
1 n) (y~+l)l/Z] + (Yin + ~Atf~
1 -- 1 A t ( j ? + l ) l / 2
(Preconditioned trapezoid rule). Algorithm 2.2 describes a method which is second order accurate in time and which is very stable for the class of equations described by Eq. (5). If the problem to be solved is a linear one, Algorithm 2.2 coincides with the Jacobi preconditioner. Thus, an analysis similar to that performed on the linearized Chapman mechanism in [4] would lead one to expect the Newton algorithm to have very similar stability characteristics to the Jacobi method. Jacobi has previously been demonstrated to be very
342
C.J. Aro /Applied Numerical Mathematics 21 (1996) 335-352 le+08
I
I
I
',4
~
{ ,t
""
\\ \:
,
g 6e+07
v
.~ ~,
O
5e+07
', '.. ~.. I
olD
g, O
- .... 'NEWTON' -+--
'CHEMSODE'
,,
,'
7e+07
E v
I
'BDF2'
, , ~ $ ,, ,'~" " , ¥
8e+07
"1
I
,+.+, .~" "+
9e+07
v
I
,
~', 4e+07
3e+07
i
2e+07
,,+,
1 e+07 t
i
!
I
6
8
v
0
2
4
10
v
v
v
v
v
v
v
v
12
,
14
Time (hours)
Fig. 1. Concentration of atomic oxygen from the Chapman problem computed with the Jacobi, BDF2 and Newton methods. All methods were run using a constant steplength of 50 seconds.
stable for this class of problem, and numerical experiments using large step sizes confirm these results for the Newton algorithm (even though the Jacobian may violate diagonal dominance). A comparison between the Newton algorithm (Algorithm 2.2), the Jacobi preconditioned trapezoid rule (used by CHEMSODE), and the second order BDF at a constant step size of 50 seconds is shown in Fig. 1. A constant step size was used to demonstrate that the Newton technique delivers accuracy comparable to the other techniques for like step sizes. The problem used was the simple 2 species Chapman problem, which has been used to test stiff solvers in several previous studies [4,9]. The step size of 50 seconds was chosen because it was an approximate average step size for several error controlled runs performed in [4]. The problem was run for a length of 12 hours (from sunrise to sunset). The figure shows a plot of the concentration of atomic oxygen for each method over the entire 12 hour run. The results compare favorably with those seen in [4], except that the Newton algorithm peaks out at a slightly higher value. The discrepancy stays well below ten percent, for the most part: thus, as with the Jacobi preconditioned technique, the Newton algorithm seems to deliver accuracy comparable to the BDF techniques when similar step sizes are used.
343
C.J. Aro / Applied Numerical Mathematics 21 (1996) 335-352
For the remainder of this study, the Newton algorithm will be implemented as a variable step size routine in the same manner as the other two solvers. This is relatively straightforward to do. CHEMSODE was modified to compute species concentrations using the Newton algorithm (Algorithm 2.2) while keeping all other routines intact. This involved the modification of only two subroutines [2], namely those that compute the component equations for the species concentrations at the next step. The error estimate (Eq. (3)) and all other routines used by CHEMSODE were retained intact. This solver will be referred to as NEWTON for the remainder of the study. 2.1. A box model
A "box model" [45] isolates the reaction equations in a single spatial zone from a full CRT model. It uses the same radiative and chemistry routines found in the full models [23,28,4648], but with no advection or diffusion. In this section, a set of twenty chemical species is used (03, N20, NO, NO2, NO3, N205, HONO, HNO3, HO2NO2, H20, OH, HO2, H202, H2, CH4, CH302 , CH3OOH, CH20, CO, CH302NO2 ) in addition to two that are assumed to be in equilibrium (O(1D) and O). No lumping [6,7,11,15,41] is used in this study. Temperatures and pressures are from U.S. Standard Atmospheres (1976), as is the radiation field. The radiation field uses clear sky, a ground albedo of 0.1, with only Rayleigh Scattering, and no aerosol/cloud chemistry. There are 47 thermal reactions and 21 photolytic reactions [2]. The model is run with temperature and pressure variables simulating mid-level stratospheric values (an altitude of 26 km). The time of year is July 1st (Summer) and the atmospheric location is 5 deg. North latitude. This corresponds to one of the tests used in [4, Test 1] and is run here to simply confirm the Newton preconditioner's suitability so that tests can be immediately carried out on the full 3-D CRT model. The drawback to this model is the proportion of run time that is taken up by the radiative calculations. In a CRT model, radiative calculations are done for each column of zones, while chemistry is done for each zone. Thus, for a full model, chemistry is the dominant computation. Since the box model is only a single zone, however, the overhead spent in doing the radiative calculations is going to limit the overall speedup seen in the model. A good rule of thumb is to assume a 50•50 split in run time for chemistry and radiative calculations [3,4]. 2.1.1. Timings and results The temperature and pressure values are obtained from the U.S. standard atmosphere and represent the desired altitude. The latitude value affects the radiative transfer calculations. The date (time of year) affects the length of day (depending on the latitude) and therefore affects the diurnal cycle and all radiative transfer calculations. The run begins at noon local time and integrates forward 24 hours. Table 1 shows the CPU time for each algorithm. Identical parameters were used for the test runs. LSODE was called with RTOL set at 10-4 while ATOL was set at 2.55 × 10 -1 (concentrations are scaled within the model so that units are
Table 1 CPU time for CHEMSODE, LSODE and NEWTON on the box model calculation LSODE CHEMSODE 427.6 seconds
328.1 seconds
NEWTON 286.8 seconds
344
C.J. Aro / Applied Numerical Mathematics 21 (1996) 335-352
Table 2 Summary of accuracies for box model run on selected species (NEWTON and CHEMSODE). Agreement to within machine accuracy is indicated by the entry "exact" Algorithm
03
CH4
N20
NO
NO2
NO3
CHEMSODE
4.03
exact
exact
2.29
2.38
2.58
NEWTON
2.62
exact
exact
1.77
1.94
2.27
Table 3 Summary of accuracies for box model run on all species (NEWTON only). Agreement to within machine accuracy is indicated by the entry "exact" Species
O3
N20
NO
NO2
NO3
Accuracy
2.62
exact
1.77
1.94
2.27
Species
N205
HONO
HNO3
HO2NO2
H20
Accuracy
2.13
1.85
2.83
1.85
exact
Species
OH
HO2
H202
H2
CH4
Accuracy
2.71
3.01
3.05
exact
exact
Species
CH302
CH3OOH
CH20
CO
CH3OzNO2
Accuracy
2.55
2.61
2.54
3.75
1.99
in molecules--thus the relatively high value of ATOL). Its method flag (MF) was set at 22, causing it to use the BDFs with numerically generated Jacobians. The maximum order was set at 2. The method order is restricted for these tests, since CHEMSODE and NEWTON use second order methods. Restricting LSODE puts them on equal footing with respect to LSODE method's order, and was necessary due to the relatively minor fraction of CPU time taken by the chemistry in these tests. C H E M S O D E and N E W T O N used the same error control values as LSODE but, as previously stated, were forced to take a minimum step size of 1.0, because the local error indicator Eq. (3) forced an intolerably small step size at start-up. This minimum step constraint was needed to get things started. During the comparison runs, CHEMSODE, NEWTON, and LSODE are called to integrate forward 15 minutes and record output. They are then "restarted" to integrated forward another 15 minutes, continuing this process until the full 24-hour run is complement. The term "restart" means that the routines begin each 15 minute part of the run as though it was the initial call. This is in contrast to the concept of a "continue" [30, see "SRCOM"] which implies that the integrator picks up where it left off and continues the computation as though it had never stopped. Restarting the routines is representative of the way in which they would be called in the full model. If the number of gridpoints is at all large, the memory required to facilitate a continue call would clearly be prohibitive in LSODE's case. The one-step nature of the other two algorithms makes a restart virtually identical to a continue call. This was a motivation in choosing the trapezoid rule as a basis for these methods. Table 2 shows the relative accuracy comparison of NEWTON with C H E M S O D E for the average value of selected chemical species concentrations over the entire 24 hour run. Table 3 shows NEWTON's accuracies for all of the species in the model. Here, error is defined to be the difference between the box model's output when run with LSODE and its output when run with C H E M S O D E
C.J. Aro / Applied Numerical Mathematics 21 (1096) 335-352
345
or NEWTON. That is, LSODE is used as an accuracy benchmark. The tables' entries represent the number of significant digits in the solution and are calculated by taking the log of this relative (percent) difference in the output values in question. Although NEWTON's error is notably higher, it is close to the one percent error level desired in most atmospheric models. Greater than one percent is generally considered redundant [3,42]: it is therefore safe to say that NEWTON is a viable alternative to the other two methods considered here. NEWTON is clearly fastest with CHEMSODE fairly close in performance and LSODE least desirable in this case. This result is completely consistent with the amount of effort required for each algorithm to take a numerical timestep: LSODE requires the most with NEWTON slightly favorable to CHEMSODE. NEWTON's elimination of the matrix algebra yields an approximate 40% increase in CPU speed for this model. 2.2. A 3-D model
IMPACT [5,32], the 3-D CRT model used in this study, is based on the operator splitting method [10,25,40], where each of the primary operators (advection, diffusion, convective mixing, photolysis, and chemistry (including sources)) is dealt with in an independent fashion. Advection in all three directions is currently treated using a second order van Leer method (an upstream-biased monotonic grid point scheme) [1]. The upstream nature of this method reduces phase errors to a minimum and the monotonicity control eliminates the need for a filling algorithm and the severe problems that would arise with negative values of the chemical species concentrations. Diffusion is currently included using a very simple constant coefficient multiplied by a grid-based second derivative. Photolysis implementation allows simulations using either complete diurnal calculations or diurnally averaged reaction rate coefficients. In this study, complete diurnal calculations will be used. The photolytic loss rate constants are calculated by integrating the product of absorption coefficient, quantum yield, and solar flux over wavelength (175 nm to 760 nm). To capture the spectral detail needed for photodissociation calculations, the two-stream, multiple-layer, UV-visible model uses 126 wavelength bins between 175 nm and 735 nm. The scattering of energy from the direct solar beam within each individual layer is treated using the delta-eddington algorithm [22]. The scattering of diffuse radiation (i.e., previously scattered radiation) from each individual layer is modeled using the simpler SaganPollack algorithm [33]. Both algorithms allow inclusion of the bulk optical properties of clouds and aerosols. Finally, the adding method is used to calculate irradiances throughout the vertically inhomogeneous atmosphere. The absorption cross sections and quantum yields include temperature and pressure dependence where appropriate and available. IMPACT can be run either in an interactively coupled mode or stand-alone (off line) using externally obtained meteorological fields (for example, derived from a general circulation model or from assimilated data sets). For the calculations performed in this study, done off line, the use of data assimilated meteorological fields (winds, temperature, etc.) from the Data Assimilation Program at NASA Goddard has been adopted [32]. Due to the large amount of CPU time needed at this early stage, a limited chemistry scenario will be used. There are eight chemically reacting species used in these tests (03, H20, OH, HO2, H202, CH4, CH20, CO), modeled by 15 thermal reactions and 6 photolytic reactions (listed in Table 5). Some of the reactions modeled deserve special explanation. For example, equation 13 in Table 5 is the net result of the following mechanism:
C.J. Aro / Applied Numerical Mathematics 21 (1996) 335-352
346
CO + OH --+ CO2 + H, H + 0 2 --+ HO2. Carbon dioxide and molecular oxygen are assumed to be constant in the model, while atomic hydrogen (along with molecular hydrogen) is assumed to be in instantaneous steady state. Thus, the net result of these two reactions operating in unison is equation 13 in Table 5. Using similar assumptions, equation 14 of Table 5 is derived from the following: 2CH4 + 2OH --+ 2H20 + 2CH3, 2CH3 + 202 + 2M --+ 2CH3OO, 2CH3OO --+ 2CH30 + 02, 2CH30 + 02 --+ 2HO2 + 2CH20, while equation 20 in the table represents the mechanism: CH20 + hu --+ H + HCO, H + HCO -+ H2 + CO, and equation 21 gives: CH20 + hv --+ H + HCO, H + HCO + 202 --+ 2HO2 + CO. The remaining reactions in Table 5 are modeled directly with reaction rates and photochemical data taken from the latest evaluation [14].
2.2.1. Timings and results IMPACT was started at noon on sidereal day two. The model time steps were five minutes long. Within the chemistry portion of the model, LSODE was called with RTOL set at 10 -4 and ATOL set at 10 -2°. CHEMSODE and NEWTON used the same parameters, with the exception of a minimum step size of one second. LSODE's method flag (MF) was set to 22, corresponding to variable order BDFs with a Newton iteration to solve the implicit system (using a numerically generated Jacobian matrix). The restriction on LSODE's method order is abandoned for these runs. During the runs, CHEMSODE, NEWTON, and LSODE are called within the chemistry portion of the model to integrate forward 5 minutes and record output (this was the length of the operator split model step). At the next model step, they are "restarted" to integrate forward another 5 minutes, in a manner similar to the box model runs. Test runs are performed on the CRAY C-90. Code is written to allow the compiler to vectorize when possible. No special attempts were made to increase vectorization as in [21] however, so that accurate comparisons between the preconditioned method with the backward differentiation formulas used by LSODE were attained. In Table 4, C-90 runtimes are shown for IMPACT using NEWTON, CHEMSODE and LSODE. NEWTON is superior for both one and ten model steps. Interesting to note is that the two preconditioned implicit solvers (CHEMSODE and NEWTON) are more favorable on longer runs. This behavior was pointed out in [5]. The reason for this behavior is the necessity to restart LSODE at each call. This forces LSODE to immediately calculate and invert a Jacobian matrix, something it
347
C.J. Aro / A p p l i e d Numerical Mathematics 21 (1996) 335-352
Table 4 Timings for the IMPACT model on the CRAY C-90 # of model steps
LSODE
CHEMSODE
NEWTON
1 step
2699 sec.
1947 sec.
1446 sec.
10 steps
13306 sec.
7191 sec.
5870 sec.
Table 5 Chemical reactions modeled in the IMPACT 3-D CRT model 12
H202 + OH -+ H20 + HO2
O + 03 --+ 202
13
CO + OH -+ HO2
O(ID) + N2 -+ O + N2
14
CH4 + OH --+ CH20 + HO2 + H20
O(1D) + 02 --+ O + 02
15
CH20 + OH -+ H20 + HO2 + CO
H20 + O(ID) --+ 2OH
16
02 + h u ~ 20
OH + 03 --+ HO2 + 02
17
03 + h u --+ 0 + 02
HO2 + O-+ OH + 02
18
03 + hu --+ O(ID) + O2
HO2 + 03 --+ OH + 202
19
H202 + h u --+ 2OH
HO2 + OH --+ H20 + 02
20
CH20+hv ~CO
10
HO2 + HO2 -+ H202 + 02
21
CH20 + h u --+ 2HO2 + CO
11
HO2 + HO2 + H20 -+ H202 + 02 + H20
1
0+02
2
--+ 03
may not have to do on a continue call. The one step explicit nature of C H E M S O D E and N E W T O N makes restarts a non-issue, since no Jacobian matrix inversion is necessary. Thus, the speedup shall characteristically grow for longer runs. Nonetheless, this study shows that again N E W T O N is fastest, with C H E M S O D E relatively close behind. The benefit of eliminating the matrix algebra in this case is much more significant, with N E W T O N running as much as twice as fast (or more) as LSODE. This is demonstrative o f the large amount of CPU effort required of the chemistry integrator in a full model. The difference between the numerical results obtained by L S O D E and N E W T O N is examined for a run of ten model steps (nearly an hour of simulation). The CPU time for a complete episode is still somewhat prohibitive. Even though the run is not quite long enough to represent a complete modeling episode, the chemistry integrators will nonetheless take thousands of numerical steps. The run will therefore sufficiently represent the chemistry integrator's accuracy. Fig. 2 shows the percent difference in IMPACT's computed ozone concentration between the N E W T O N run and the L S O D E run. That is, in a manner analogous to the box model run, L S O D E is used as the accuracy benchmark. Fig. 2(a) depicts the lower stratosphere (approximately 25 km altitude), Fig. 2(b) shows the middle stratosphere (30 km) and Fig. 2(c) shows the upper stratosphere (35 km). The accuracy is not quite as good as the seen from C H E M S O D E [3,5], however is certainly acceptable since it is well below the one percent error mark. Tropospheric regions are not examined, since the model is not equipped to model the troposphere in a meaningful way at this time. Given these results, we can conclude that N E W T O N is meeting its accuracy requirement at a reasonable savings in CPU time.
C.J. Aro / Applied Numerical Mathematics 21 (1996) 335-352
348
5(I 1.17e-02
9.02e - 03 6,38e
03
3.74e - 03 1.10e---03
0 --
1.54e....
03
,.d -4.18e-03 - - 6 . 8 2 c ... 0 3 - 9 , 4 6 e .-- 0 3 - 1.21e--.02
-50 - 1.47e- 02
0
100
200 Longitude (a)
300
Fig. 2. Percent difference between the IMPACT model's computed ozone concentration using NEWTON as the chemistry integrator and the model's computed ozone concentration using LSODE as the chemical integrator. (a) depicts the lower stratosphere (25 km), (b) depicts the middle stratosphere (30 km), and (c) depicts the upper stratosphere (35 km).
3. Conclusions This study has produced a simple and effective preconditioner which is superior to the Jacobi preconditioned technique implemented in CHEMSODE. It makes use of the diagonal entries of the Jacobian matrix which are readily available in the case of chemical kinetics problems and eliminates the need for matrix algebra by the ODE time stepping method. Run at constant step size, the Newton preconditioned method produced numerical results comparable with the traditional implicit BDF techniques commonly used for the numerical solution of chemical kinetics problems, while retaining the efficiency of an explicit time stepping scheme. The implementation of NEWTON, the variable step size, error controlling version of the method was then compared favorably to CHEMSODE and LSODE on a box model problem over a complete diurnal cycle. In this scenario, NEWTON was found to be approximately a ten percent improvement over CHEMSODE, and more than a forty percent improvement over LSODE (in its most common
349
C.J. Aro / Applied Numerical Mathematics 21 (1996) 335-352
50
5.32e--02 4.30e
02
3 . 2 9 e - 02 2.27c.- 0 2 1 . 2 6 c - 02 2 43e-03 --7.72e
03
- 1 . 7 9 e - 02 - 2.811c-
02
-3.82e-(}2
- 50
....4.,q3e • 02
0
1O0
2(}0
300
Longitude
(b) Fig. 2. (Continued.) configuration). This led to the IMPACT 3-D CRT model and another favorable comparison between NEWTON and the other two algorithms. Even in the relatively short modeling scenario considered here, NEWTON proved to be as much as two times as fast as LSODE and fifty percent faster than CHEMSODE with numerical discrepancies well below the one percent level. Further development in the direction of higher order methods and in the direction of more efficient implementations promise to bring large scale 3-D CRT calculations closer to computational feasibility.
Acknowledgements
Many thanks go to Garry Rodrigue of U.C. Davis for his guidance. Thanks also to the Global Climate Research Division at LLNL, in particular Doug Rotman and John Tannahill for their contributions to the work performed on the IMPACT CRT model used in this study. Special thanks to the anonymous reviewers for helpful criticism, to Alex Ganus and Craig Soldat for helpful discussion, and to John Feo of the Computer Research Group at LLNL for support during the revision stage. This work
350
C.J. Aro /Applied Numerical Mathematics 21 (1996) 335-352
50
1.26e - 0 I
1 . 0 5 e - 01 8.38e .... 02 6.28e - 02 O
4,18e - 0 2 2.08e - 02 •- 2 . 0 8 e - 04 -2.12e-02
--4.22e- 02 -6.32e-02
-50
-8,42e-02
0
100
200 Longitude (c)
300
Fig. 2. (Continued.) was performed under the auspices of the U.S. Department of Energy by the Lawrence Livermore National Laboratory under contract number W-7405-Eng-48 and was supported in part by the NASA Atmospheric Chemistry Modeling and Analysis Program and the Department of Energy's Office of Health and Environmental Research, Environmental Science Division.
References
[1] D.J. Allen, A.R. Douglass, R.B. Rood and ED. Guthrie, Application of a monotonic upstream-biased transport scheme to three-dimensional constituent transport calculations, Monthly Weather Rev. 119 (1991). [2] C.J. Aro, CHEMSODE: a stiff ODE solver for the equations of chemical kinetics, Report UCRL-ID-119557, Lawrence Livermore National Laboratory, Livermore, CA (1995). [3] C.J. Aro, On the numerical treatment of problems in atmospheric chemistry, Ph.D. Thesis, University of California, Davis, CA (1995). [4] C.J. Aro and G.H. Rodrigue, Preconditioned time differencing for stiff ODEs in diurnal atmospheric kinetics, Comput. Phys. Comm. 92 (1995).
C.J. Aro / Applied Numerical Mathematics 21 (1996) 335-352
351
[5] C.J. Aro, G.H. Rodrigue and D.A. Rotman, A high performance chemical kinetics algorithm for 3-D atmospheric models, Report UCRL-JC-122054, Lawrence Livermore National Laboratory, Livermore, CA (1995). [6] J. Austin, On the explicit versus family solution of the fully diurnal photochemical equations of the stratosphere, J. Geophys. Res. 96 (12) (1991). [7] G. Brasseur and S. Solomon, Aeronomy of the Middle Atmosphere (D. Reidel, Dordrecht, 1984). [8] P.N. Brown, G.D. Byrne and A.C. Hindmarsh, VODE, a variable coefficient ODE solver, SIAM J. Sci. Statist. Comput. 10 (1989). [9] G.D. Byrne and A.C. Hindmarsh, Stiff ODE solvers: a review of current and coming attractions, J. Comput. Phys. 70 (1) (1987). [10] M.G. Crandall and A. Majda, The method of fractional steps for conservation laws, Math. Comput. 34 (1980). [11] P.J. Crutzen, Ozone production rates in an oxygen-hydrogen-nitrogen oxide atmosphere, J. Geophys. Res. 76 (1971). [12] C.E Curtis and J.O. Hirschfelder, Integration of stiff equations, Proc. Nat. Acad. Sci. U.S.A. 38 (1952). [13] D. Dabdub and J.H. Seinfeld, Extrapolation techniques used in the solution of stiff ODEs associated with chemical kinetics of air quality models, Atmos. Environ. 29 (3) (1995). [14] W.B. DeMoor et al., Chemical kinetics and photochemical data for use in stratospheric modeling, Evaluation #11, Report JPL 94-26, Jet Propulsion Laboratory, Pasadena, CA (1994). [15] S. Elliott, R.P. Turco and M.Z. Jacobson, Tests on combined projection/forward differencing integration for stiff photochemical family systems at long time step, Comput. Chem. 17 (1993). [16] C.W. Gear, The automatic integration of stiff ordinary equations, in: Proceedings Int. Fed. Inform. Proc. Congress (Humanities Press, New York, 1968). [17] C.W. Gear, Numerical Initial Value Problems in Ordinary Differential Equations (Prentice-Hall, Englewood Cliffs, NJ, 1971). [18] E. Hesstvedt, O. Hov and I.S.A. Isaksen, Quasi-steady state approximations in air pollution modeling: comparison of two numerical schemes for oxidant prediction, Int. J. Chem. Kinet. 10 (1978). [19] A.C. Hindmarsh, ODEPACK, a systematized collection of ODE solvers, Report UCRL-88007, Lawrence Livermore National Laboratory, Livermore, CA (1982). [20] B.G. Hunt, Photochemistry of ozone in a moist atmosphere, J. Geophys. Res. 71 (1966). [21] M.Z. Jacobson and R.P. Turco, SMVGEAR: a sparse-matrix vectorized GEAR code for atmospheric models, Atmos. Environ. 28 (2) (1994). [22] J.H. Joseph, W.J. Wiscombe and J.A. Weinman, The delta eddington approximation for radiative flux transfer, J. Atmos. Sci. 33 (1976). [23] D.E. Kinnison, Effects of trace gases on global atmospheric chemical and physical processes, Ph.D. Thesis, University of California, Berkeley, CA (1989). [24] J.D. Lambert, Computational Methods in Ordinary Differential Equations (Wiley, New York, 1973). [25] R.J. LeVeque, Numerical Methods for Conservation Laws (Birkh~iuser, Basel, 1992). [26] G.J. McRae, W.R. Goodin and J.H. Seinfeld, Numerical solution of the atmospheric diffusion equation for chemically reacting flows, J. Comput. Phys. 45 (1982). [27] E.S. Oran and J.P. Boris, Numerical Simulation of Reactive Flow (Elsevier Science, Amsterdam, 1987). [28] K.O, Pattern Jr, P.S. Connell, D.E. Kinnison, D.J. Wuebbles, T.G. Slanger and L. Froidevaux, Effects of vibrationally excited oxygen on ozone production in the stratosphere, J. Geophys. Res. 99 (D1) (1994). [29] W.H. Press, S.A. Teukolsky, W.T. Vettering and B.P. Flannery, Numerical Recipes in FORTRAN: The Art of Scientific Computing (Cambridge University Press, New York, 2nd ed., 1992).
352
C.J. Aro / Applied Numerical Mathematics 21 (1996) 335-352
[30] K. Radakrishnan and A.C. Hindmarsh, Description and use of LSODE, the Livermore solver for ordinary differential equations, Report UCRL-ID-113855, Lawrence Livermore National Laboratory, Livermore, CA (1993). [31] G. Rodrigue and D. Wolitzer, Preconditioned time differencing for the parallel solution of the heat equation, in: Proceedings of the 4th SlAM Conference on Parallel Processing for Scientific Computing (SIAM, Philadelphia, PA, 1989). [32] D.A. Rotman, D.J. Wuebbles and J.E. Penner, Atmospheric chemistry using massively parallel computers, in: Proceedings AMS 5th Annual Symposium on Global Change Studies (1994). [33] C. Sagan and J.B. Pollack, An isotropic nonconservative scattering and the clouds of Venus, J. Geophys. Res. 72 (1967). [34] R.D. Saylor and R.I. Fernandes, On the parallelization of a comprehensive regional-scale air quality model, Atmos. Environ. 27A (1993). [35] L.F. Shampine, Stiffness and the automatic selection of ODE codes, J. Comput. Phys. 54 (1984). [36] T. Shimazaki and A.R. Laird, A model calculation of the diurnal variation in minor neutral constituents in the mesosphere and lower thermosphere including transport effects, J. Geophys. Res. 75 (1970). [37] W.C. Shin and G.R. Carmichael, Comprehensive air pollution modeling on a multiprocessor system, Comput. Chem. Engng. 16 (1992). [38] J.I. Steinfeld, J.S. Francisco and W.L. Hase, Chemical Kinetics and Dynamics (Prentice-Hall, Englewood Cliffs, NJ, 1989). [39] EA. Stott and R.S. Harwood, Am implicit time-stepping scheme for chemical species in a global atmospheric circulation model, Ann. Geophysicae 11 (1993). [40] G. Strang, On the construction and comparison of difference schemes, J. Numer. Anal. 5 (1968). [41] R.E Turco and R.C. Whitten, A comparison of several computational techniques for solving some common aeronomic problems, J. Geophys. Res. 79 (1974). [42] J.G. Verwer, Gauss-Seidel iteration for stiff ODEs from chemical kinetics, J. Sci. Comput. 15 (5) (1994). [43] J.G. Verwer, J.G. Blom and W. Hundsdorfer, An implicit-explicit approach for atmospheric transportchemistry problems, Report NM-R9501, CWI, Amsterdam (1995). [44] J.G. Verwer and D. Simpson, Explicit methods for stiff ODEs from atmospheric chemistry, Report NMR9409, CWI, Amsterdam (1994). [45] R.E Wayne, Chemistry of Atmospheres (Clarendon Press, Oxford, 2nd ed., 1991). [46] D.J. Wuebbles, ES. Connell, K.E. Grant, R. Tarp and K.E. Taylor, Initial results with the LLNL twodimensional chemical-radiative-transport model of the troposphere and stratosphere, Report UCID-21178, Lawrence Livermore National Laboratory, Livermore, CA (1987). [47] D.J. Wuebbles, K.E. Grant, ES. Connell and J.E. Penner, The role of atmospheric chemistry in climate change, APCA J. 39 (1) (1989). [48] D.J. Wuebbles, J.S. Tamaresis and K.O. Patten, Quantified estimates of total GWPs for greenhouse gases taking into account tropospheric chemistry, Report UCRL-ID-115850, Lawrence Livermore National Laboratory, Livermore, CA (1993).