Local Error Control in Codes for Ordinary Differential Equations* Lawrence F. Shampine Numerical Mathematics L?ivi-sion 5122 Sandia Labqratories Albuquerque, New Mexico 87115
Transmitted by M. Scott
ABSTRACT There are several schemes for the control of local error which are seen in differential equation solvers. The analysis attempts to explain how the selection of a scheme influences the behavior of global error seen in high quality production codes. Two rules of thumb for estimating global errors are given theoretical support when used in conjunction with suitable codes. Substantial numerical experiments support the analysis and conclusions.
1.
INTRODUCTION
When solving an initial value problem for an ordinary differential equation it is usually the global error, i.e., the difference between the true and computed solutions, that one would like to control. Since this appears difficult and expensive, most codes for the initial value problem control their local errors instead. The local error measures how closely the code follows a solution of the differential equation over one step. Controlling the local error controls the global error after a fashion. And so, how one chooses to control the local error is of fundamental importance. There are two criteria which are popular, error per step and error per unit step. It has not been previously recognized how important it is in this context whether or not one does local *This work was supported by the U.S. Energy Research and Development Administration (ERDA), Contract No. AT(29-l)-789. By acceptance of this article, the publisher and/or recipient acknowledges the U.S. Government’s right to retain a nonexclusive, royalty-free license in and to any copyright covering this paper. APPLIED MATHEMATICS
AND COMPUTATION 3,189-210
0 Elsevier North-Holland, Inc., 1977
(1977)
189
LAWRENCE
190
F. SHAMPINE
extrapolation, i.e., uses the local error estimate to “improve” the solution. Interestingly, all four combinations of error criterion and local extrapolation are to be found in widely available, good quality codes. We shall attempt to explain the effect of this choice on the behavior of the global error. Besides its importance in understanding existing codes, this investigation will help the writers of new codes to decide which possibility to implement. Users of codes for solving ordinary differential equations rarely appreciate the distinction between the local error which the code attempts to control and the global error which concerns them. In the author’s experience, users choose local error tolerances somewhat smaller than the global errors they want, and usually this proves quite satisfactory. When they wish more confidence in the results, they solve the problem for a succession of tolerances and evaluate the results by consistency. We shall provide theoretical support for these rules of thumb when applied with discretion to codes designed with them in mind. 2.
BASIC DEFINITIONS
AND ASSUMPTIONS
We wish to solve
(1)
on an interval [a, b] by a method which computes approximate solutions yj to y (x,) for a sequence of points xi with a = x0 < xi < . +. . The function f(x, y) is assumed to satisfy a Lipschitz condition with constant L for a < x < b, 1yI < co. Indeed, we shall assume f is as smooth as is required for the arguments, at least on subintervals of [a, b]. Of course, these assumptions are far stronger than are necessary. In particular, a familiar argument [l, Chapter l] shows that the smoothness off only need be assumed to hold in a strip containing the solution of (l), (2). The local error of the method in stepping from xi_ 1 to xi is defined to be lej=uj_,(ri)-
yj,
(3)
where the local solutions I+_ i(x) are defined for each index i - 1 by
(4) (5)
191
Local Error Control
There are two popular criteria for controlling the local error. The criterion of error per unit step (EPUS) is ]lei] < e(xj-- xi-,),
(6)
or in terms of the step size hi = xi - xj_ r, ]lei] < zhj. (Throughout the paper we suppose hi > 0.) The criterion of error per step (EPS) is ]leJ < z.
(7)
The limit arguments employed throughout this paper lead to results valid for “small” step sizes. Ordinarily, then, in the circumstances we study, error per unit step is a more stringent criterion than error per step. A basic assumption is that codes modeled in this paper are successful in achieving (6) or (7), whichever they attempt. Our aim is to describe the behavior of “real” or “production” codes, by which we mean those being used for routine solution of differential equations in a computing center as opposed to codes written merely to illustrate a point. After describing how such codes select their step sizes, it will become clear how reasonable the assumption is when the codes are applied to a problem with piecewise smooth f and slowly varying solution y(x) (what we consider to be the typical problem). The important role of local extrapolation in this context does not seem to have been recognized. By local extrapolation we mean that an estimate (est) of the local error lei is formed at each step and is used to improve the numerical result yi. We seek to approximate ui_ ,(xj) = yi + lej; hence if we have a good estimate of the local error, yi + est should be a better numerical solution than yi. We shall suppose that est is asymptotically correct, with the consequence that local extrapolation is equivalent to using another method of order (at least) one higher than the basic method. As a transparent example, consider the Fehlberg (4,5) Runge-Kutta pair which at each step generates two values yi and $ using formulas of orders 4 and 5 respectively. The local error estimate is est = ii - yj; hence the extrapolated value is yi + est = yi, the result of the fifth order formula. We shah consider the four possibilities of the two criteria with and without local extrapolation. They will be labeled for easy reference, and in each case at least one good quality code using this possibility will be cited. EPUS -
EPS -
error per unit step, no extrapolation: Stewart [3] classical RungeKutta, fixed order; RKJT(Shampine, Allen [4]), Runge-Kutta-Fehlberg, fixed order. error per step, no extrapolation: DIFSUB (Gear [5]), Adams, variable order.
LAWRENCE
192
XEPUS XEPS
-
F. SHAMPINE
error per unit step, extrapolation: Zonneveld [S] RungeKuttaZonneveld, fixed order. error per step, extrapolation: DVDQ (Krogh [7J), Adams, variable order; STEP (Shampine, Gordon [l]) Adams, variable order; RKF~~ (Watts, Shampine [19]), Runge-Kutta-Fehlberg, fixed order,
Local Error and Local Truncation Error The concept of local truncation error refers to the global solution y(x) rather than to the local solutions u/(r) associated with local error. Although it is clear from the data used that codes attempt to estimate the local error, many authors speak of estimating the local truncation error. This is permissible because almost all the formulas known to us result in the two being asymptotically equivalent. Since it is convenient to work with a fixed function y(x) rather than a family {t+(x)} in our analysis, we shall go into the matter. Local truncation error is a concept arising naturally in convergence proofs but appearing artificial when just stated. In stepping to xi one supposes the numerical method is used to generate an approximation yj* based on exact values y(x,,) and y’(x,), rather than computed values yn and f(~, y,). For example, an explicit Runge-Kutta method has no memorized values; hence yi is generated starting from the pair (xi-i, yi_J and yi* from the pair (+,,Y(+~)). The local truncation error is then ltei=y(rj)-yi*. (Some authors reverse the sign, and some take out a factor of the step size.) For typical one step methods the local truncation error can be expressed in terms of a smooth principal error function p)(x, y) [8], so that for a method of order k (8) and we shall always presume this to be the case. Clearly then,
=ltei+
O(hF+‘)
if we have a convergence theorem for the method. Methods with memory behave much the same way. The Adams-Moulton method, for example, has a
193
Local Ewor Control
principal error function of the form cy &+ “(xi_ 1) for a suitable constant c, when the step size is constant. The functions y(x) and ui_ 1(x) are solutions of the same differential equation which differ at xi_ 1 by the amount Y(+,)~~-1. G iven a convergence theorem to the effect that the global errors tend uniformly to zero, it is the case that u/!ll)(xi_ i) converges to Y k+l(r& [9, Pa 401 (remember we assume f is as smooth as necessary). Variable step sizes complicate the matter but do not alter the conclusion for the Adams methods [l]. Henceforth we shall only consider methods with a smooth principal error function for which convergence of the numerical solution implies that local error and local truncation error are asymptotically equal. All of the good codes we mention and nearly all others known to us use estimators of the local error which are asymptotically correct, and they presume the step size is small enough that a good estimate is obtained. In addition to this necessary assumption, it is presumed that the principal part of the error furnishes a good approximation to the error, so that (9) where we use the common notation “G” for “approximately equal to”. An implicit assumption is that cp(x, y(x)) does not vanish identically on a subinterval of [a,b], with the consequence that the method is actually of higher order than k. The correct asymptotic dependence of the step sizes for multi-step methods must take into account variation of the step size. In principle this is easy enough to do for the Adams methods, as is described in [l, p. 1111, but it is so expensive in practice that the code of Gabel [lo] is the only one known to us which does so. In particular, the codes DIFSUB, DVDQ, and STEP which we have cited do assume that the dependence on the step size is as in (9) and so pretend that locally a constant step size is used. All contain a strong bias in their complex algorithms towards slow variation of the step size to make the approximation of (9) as valid as possible subject to other algorithmic goals. Choosing the Step Size Efficient codes use about the largest step size which appears to satisfy the error criterion. We shall define Hi, the locally optimal step size, as the largest step size the method can use at z+__~and pass the test (6), or (7) as the case may be. Then with the assumptions we make
&i+-fik+l
Irp(xi_,,y(xi_,))]
‘-‘Hik+ll~(xi_l,y(~~_l))I
(error per unit step), (error per step).
194
LAWRENCE F. SHAMPINE
Using estimates of the local (truncation) error and (9), the appropriate relation just stated is employed to approximate Hi. Production codes do not actually use an estimated locally optimal step size. For one thing, the ratio of successive step sizes is bounded above so as to get a sensible value for Hi if 9, should vanish at xi-i [remember, we are excluding degenerate cases of ‘p(x, y ( r )) vanishing on an interval]. This provision means that we need not treat such occurrences as special cases: Problems with ‘p( x, y(x)) which vanishes can be split into subintervals on which it does not vanish and our arguments apply directly, plus very short connecting intervals involving a few steps where the restriction on step sizes is decisive. These connecting intervals are unimportant to our semi-quantitative study of the error, and we shall ignore them. Henceforth we shall presume that on the interval where we study the integration, ~(x, y(x)) is b ounded away from zero. This has the important consequence that small local error tolerances E are equivalent to using small step sizes. It may well happen that the step size is not sufficiently small to result in a very accurate estimate of the error or to make the approximation (9) a very good one. To discount these possibilities, codes use a step size somewhat smaller than what appears to be optimal. This is usually done in one of two equivalent ways. One is to use a fraction of the estimated optimal size Hi, and the other is to aim at a fraction p of the tolerance E. As examples, STEP aims at 0.56, and RFK uses 0.8Hi, which, since it is a fourth order code based on error per unit step, corresponds to aiming at (0.8)4e. Naturally, if it appears that a trial step does not pass the local error test (6) [or (7)], the step is rejected and the step size reduced until the inequality appears to hold. Effect of Extrapolation on Step Sizes and Local Error Our first observations clarify the role played by local extrapolation. The local error depends on the numerical solution value yi actually accepted, and hence depends on whether or not extrapolation is done. Because the step size is chosen using an estimate of the local error, it likewise depends on this decision. This dependence is very weak, for (9) shows that asymptotically the principal part of the local error does not depend on the numerical solution, and consequently neither does the principal part of the step size. Thus, for small tolerances e, the step sizes chosen when using EPUS (EPS) are virtually identical to those chosen when using XEPUS (XEPS). Since the work is proportional to the number of steps, we realize that for small tolerances local extrapolation does not affect the work. Yet, by assumption, local extrapolation raises the order of the scheme and so produces more accurate results asymptotically. One measure of efficiency is to compare the accuracy obtained by two procedures when they are allowed the same amount of work. In this measure we conclude that XEPUS is more efficient than EPUS
Local Error Contml
195
and XEPS is more efficient than EPS for all sufficiently stringent tolerances. There is an important relation between EPUS and XEPS based on local extrapolation and our assumption that codes relate the step size to the tolerance e in a reasonable way. When we extrapolate, we must distinguish between (i) the local error of the method which is used to determine the step size and (ii) the error actually made in the step by the method resulting from the combination of the basic method and extrapolation. Thus for XEPS we have E~HilC+ll’p(Xi_l,y(x~_~))l for the locally optimal step size Hi and
for the step size hi and formula actually used. The algorithms arrange that hi < Ht ; hence
for a suitable definition of y holding on [a, b]. Because this is the same as the error per unit step criterion with an altered tolerance, we see that XEPS and EPUS have the same qualitative behavior with respect to e. There is no useful quantitative relation, as y depends on the problem and might be 106, or as well 10 -‘. 3.
BOUNDS ON THE GLOBAL
ERROR
The behavior of the global error at x,,, e,, = y (x,,) - y,,, will depend on the local error tolerance e, the kind of control used, the differential equation, and how the step sizes are chosen. One reason for the popularity of the error per unit step control is a bound we shall now develop. The global error at xi can be split into ei=y(xj)-ui_l(xi)+lei. (LO) An elementary inequality [2, p. 1031 about solutions of (1) states that
LAWRENCE
196
F. SHAMPINE
Then lejl < lej_ ,lexpL+ + IleJ. If the method uses the given initial value, so that e, = 0, this implies
(e,l
5 (lei(. j-1
(12)
If one uses the error per unit step control (6), then
The inequality (13) is a very simple, elegant result for the error per unit step criterion which assumes nothing about the method being used, just that the error control is successful. Without further assumptions a bound like (13) cannot be true for error per step. To see this, suppose one solves y’ =O, y (0) = 1 by a method with lei = E at each xi = ih. The global error at x,, is ne:, so it is not bounded independently of the number of steps. Of course, this kind of thing does not happen in practice, because the number of steps is related to the tolerance. The key to understanding real codes is to account for the rules used by them for selecting the step size. Before examining other cases, let us consider how realistic the bound (13) for EPUS is for production codes. The only assumption related to the code is that the test (6) is passed at every step. It is clear that asymptotically the kind of step selection algorithm we have described will yield steps such that (6) is true. The experimental studies [ll, 121 h ave verified this in practice for a wide variety of codes. Our experience with a few codes we have subjected to such tests is in agreement. Thus on theoretical and experimental grounds we believe the basic assumption quite a realistic one. To derive bounds for the other cases we have to relate the step size used to the tolerance. The case of EPS will suffice to show the technique. From the definition of the locally optimal step size Hi for the tolerance c and from our assumption about the existence of a principal error function (8), we see there is a constant c, > 1 such that for all sufficiently small e, clc
>
Hikfl IcP(xi-l,Y(xj-l)))'
For Adams methods we recall that (9) is only an approximation when the
Local Error Control
197
step size is varied. Because of limits imposed on the rate of variation, the preceding inequality also holds for these methods, with the size of cr reflecting the permissible range of step sizes. The same arguments lead to
in which hi is the step size actually taken and so must satisfy hi < Hi. These inequalities imply that
From the inequality (12) we immiediately conclude a bound on the global error,
A more meaningful expression for Q emerges if we approximate the sum defining Q by a Riemann integral for small step sizes hi:
EPS:
Q&ck/(k+l)C;/(k+l)C
p~Ll’p(t,y(f))ll”k+l’dt
for smalle.
Similar arguments for XEPUS and XEPS lead to the same result for an appropriately altered definition of Q. Approximating these quantities by integrals, we find XEPUfj:
Q &
c(k+ l)/$$+
for small e,
WC4
For purposes of comparison we restate (13) as EPUS:
Q=ci%dt
for all e.
198
LAWRENCE
F. SHAMPINE
We have been considering the order k to be fixed. By splitting up the basic interval [a, b] into subintervals where a constant order is used, the arguments are easily modified to deal with variable order codes as well; there is no need to state a formal result. In each case we have found the global error 1y (x,) - yn,(to be bounded by an expression of the form ePK (x,). The power p is 1 for EPUS and XEPS; it depends on the order k for the other cases, but for higher orders it is not very different from 1. The function K(x) depends on the case and the problem, but it is always zero initially and grows at a rate dependent on the Lipschitz constant as the integration in x proceeds. This expression is very pessimistic, and much more insight will be furnished by the approximations of the global error developed in the next section. Still, our assumptions are quite weak, and important conclusions can be drawn. With realistic assumptions about how codes choose their step sizes, we find there is no accumulation of error using the error per step criterion as our contrived example showed was possible in general. The behavior of the bound on the global error with respect to the local error tolerance e is particularly regular for EPUS and XEPS. If the order k is moderately large or, in variable order codes, there is a strong bias towards using high orders, the behavior for EPS and XEPUS is not very different. If the Lipschitz constant is of modest size, the bounds say the global error will start off zero and can grow at a moderate rate as the integration proceeds. For such problems we conclude two rules of thumb. First, one can expect the global error to be comparable to the local error tolerance. Second, because the bounds are approximately proportional to the local error tolerance, we may hope that the global error will also be roughly proportional to the local error tolerance. Both of these rules will be sharpened greatly in the next section. We have written versions of RKF implementing all four cases and applied them to the test problems collected by Gogh [13]-more specifically, to those problems satisfying our hypotheses, which are (l)-(4), (6), and (8)-(10). The example y’ = y, y (0) = 1 solved on [0,50] with a pure relative error test is representative. The maximum global error is displayed in Table 1 along with ND, the number of derivative evaluations required. In this fixed order Runge-Kutta code the number of steps and the work are directly related to ND. As shown in the table, local extrapolation did not affect the number of steps, and closer examination showed the step sizes to be virtually identical. For some of the test problems the number of steps taken at crude tolerances differs between extrapolated and unextrapolated versions, but those taken at stringent tolerances never differ. These experimental results agree with our predictions. Likewise, when efficiency is measured by the global error obtained for the same work, local extrapolation clearly increases efficiency for small tolerances, though not necessarily at crude tolerances. All four
Local Em
199
Control TABLE 1 PERFORMANCEOFVERSIONSOFRKFUSINCALLFOURERRORCONTROL SCHEMES
TO SOLVE
y’ = y, RELATIVE
y(o)
= 1 ON [o, %]
ERROR
WITH
EPUS Max Error -1
l.~Eo
-2 -3 -4 -5 -6 -7 -8
9.32E-1 1.72~-2 1.033-3 1.6034 1.823-5 1.933-6 1.98&7
EPS ND
Max Error
ND
42 90 270 600 1158 2142 3893 7001
9.85E- 1 7.253-l 1.253-2 1.313-3 4.033-4 7.983-5 1.413-5 2.36~-6
78 114 282 510 864 1422 2298 3700
XEPUS -1 -2 -3 -4 -5 -6 -7 -8
A PURE
TEST
XEPS
1.00E0 9.183-l 3.663-2 9.993-4 4.403-5
42 90 270 600 1158
9.793-l 7.113-l 3.023-2 2.123-3 1.793-4
78 114 282 510 864
2.213-6 1.173-7 6.323-g
2142 3893 7001
1.643-5 1.573-6 1.523-7
1422 2298 3700
cases have much the same behavior of the global error with respect to z. This problem has a modest Lipschitz constant, and the global errors are comparable to e for all the cases. Extensive computations of this kind are reported for D~DQ(XEPS) in [13]. We have computed the same quantities using DIFSUB(EPS) and STEP (XEPS). Still more elaborate computations with STEP are reported in Chapter 7 of [l]. Very extensive computations with D~DQ, DIFSUB,STEP, RKF45, and other codes are reported in [19]. All these computations show experimental behavior consistent with our predictions and with the results exhibited in Table 1. 4.
ESTIMATES OF THE GLOBAL ERROR
In this section we shall develop estimates of the global error which prove much more revealing than the bounds studied in the last section, but which require a stronger assumption about the local error. To approximate the global error we must approximate the local error at each step. The algorithms we have described aim at a local error of Be for some constant p < 1
LAWRENCE
200
F. SHAMPINE
in the case of EPS, and /3ehi in the case of EPUS; asymptotically they will be successful. Our basic assumption in this section is that the local error is approximately /3e (or ,&hj) at each step. This is a good assumption for the Runge-Kutta codes cited; it is not so good, but is still a reasonable one, for the Adams codes cited. The Runge-Kutta codes adjust their step size at every step so that deviation of the local error above or below /?e on one step is compensated for on the next. To reduce their overhead and for other practical reasons, the variable order Adams codes do not attempt to adjust their step this precisely. On the other hand, they manipulate the order as well to help bring the local error into line. A poor adjustment of the step size will lead to a phenomenon we call “ratcheting”. By this we mean that a reduction in the tolerance sometimes leads to little if any reduction in the global error. The way this occurs is that for a given tolerance the code chooses step sizes smaller than necessary. When the tolerance is reduced, the code sees it is possible to meet the reduced tolerance with the same or nearly the same step sizes, and so the error produced at each step is nearly the same. To illustrate how serious this can be in an otherwise excellent code, we note that when using the code DESUB [14] to solve the problem A2 of the collection [II], we obtained computed results which were identical for tolerances lo-‘, 10e3 , . . . , 10V7. This is an extreme example even for this code, and with attention to the matter one can write codes for which our basic assumption of this section is quite good for small E. In the splitting (10) we must relate y(xi) - ul_ r(xi) to y(xi_ J - yi_ 1 by an approximation, rather than a bound as in (11). This is easily done using the equation of first variation, since ui_ r(x) and y(r) are solutions of the same equation which differ by the small amount ei_ 1 at xi_ 1. Thus
where U(X) is the solution of
u’=fy (x,y(x))u,
If we let g(x)=f,(x,y(x)),
U(Xj_l)= 1.
we can solve for u(x), and from (10) we find
Our assumption about the local truncation error implies it is of one sign on the interval, namely sgn(cp). For EPUS we are supposing that lej h H,,&sgn(cp),
Local Em
201
Control
where Hi is the locally optimal step size corresponding to /?e. Substituting this expression into (14), we find after a little manipulation that
As in the preceding section, for small e (small step sizes), this Riemann sum is approximated by an integral. Thus
EPUS:
e, GP~sgn(pl)q(
i”g(t)dt)/GexP(
- i’g(s)h)dt. a
The arguments are the same for all the cases, so we shall just state the results. EPS:
XEPUS:
XEPS:
By breaking the interval of integration into subintervals on which the order is held constant, the analysis is easily modified to describe variable order codes. In the case of one step methods it is possible to relate approximations of this kind to the standard asymptotic theory. Henrici [8, p. 83 ff.] considers a sequence of integrations with the step size at xi_ i being chosen by a step size function hl = 8 (x/_ i) h,,. The function 8 (x) satisfies 0 < 0 (x) Q 1, and h, is a constant for each integration. Assuming that the method is of order k and that there is a smooth principal error function q( x, Y), he proves the global
202
LAWRENCE
F. SHAMPINE
error at a fixed point x,, satisfies en =w(xJh~,+O(h~~‘), where w’=g(x)w+flk
(+To,y(x)),
w(a)=O.
Henrici assumes that the step sizes can be described by a step size function. This is a convenient assumption for deriving asymptotic results, and it seems to be standard. So far as we know, its relevance to actual computation has not been considered before. We are presuming that our codes implement an asymptotically correct estimate of the local truncation error. Consistent with this assumption, let us suppose here that the estimator exactly yields the principal part of the local truncation error, so that estimated hei= hF”q(xi_,,y(zi_l)). In the case of EPUS, we are now discussing codes which choose their step size by
because they aim at a tolerance of &hi. Because we consider problems with 1~ (x, y(x))] bounded away from zero, we may let
Such codes are easily seen to select hi = 0 (xi_ Jh,,.
+)=
where
k( ,,,,;(,))Jk*
Thus a way of automatically selecting the step size which is a close model of actual computation does result in step sizes described by a step size function.
203
Local En-or Control
If one carries through the evaluation of Henrici’s asymptotic expression for the global error using the 8 (x) just described, he easily finds the same result for EPUS as just obtained by our approximate analysis. Because our result proves to be asymptotically correct for a large class of methods, we are encouraged about the usefulness of the approach. The same justification for one step methods can be made for all the results of this section, and we shall refer no more to the matter. These approximations give a much more realistic appraisal of the global error than the bounds of the preceding section. In our experience the typical problem is rather stable, meaning that the local solutions t+(x) show little tendency to diverge from the global solution y(x). Such problems have a far slower rate of growth of the global error than the bounds would suggest. The role played by the stability of the equation is most clearly seen in (14). If g(r) is negative on (xi_ i, xi), so that the equation is stable over this interval, then propagated error is damped in the step rather than amplified as the bounds had to allow for. For problems which are moderately stable over most of [a, b] and never seriously unstable, the global error will be comparable to the local error tolerance. Very Stable Equatim It is illuminating to pursue the matter of stable equations, and we shall consider two situations. In one situation the equation is very stable but not exceedingly so, that is, not stiff. For such problems the g(t) in (14) is presumed so large that the exponential damping effectively removes the propagation of global error and we can write ei Glei. For EPUS we have
so that
For our later discussions we are only interested in how the global error depends on e. After similar manipulations we find in the various cases: EPUS
e, is proportional to
e(k+Wk
EPS
XEPUS
Q
p(L.+Wk
XEPS .#+Wk+l)
When the equation is stiff, the approximations we have made break down. This is because for the methods with finite stability regions which have
204
LAWRENCE
F. SHAMPINE
concerned us, the step size is not determined by the local error test, but instead is limited so as to remain in the region of absolute stability. We have argued elsewhere [1,15] that in this situation such codes will use an average step size corresponding to being on the boundary of the region of absolute stability. For this reason the main distinction to be made among the cases when stiff problems are solved by non-stiff methods is the size of the stability region. This reduces to the question of how local extrapolation (which alters the method) affects the stability of the basic method. Local extrapolation improves substantially the stability of several important methods and reduces substantially the stability of others [16]. The Runge-Kutta-Fehlberg (4,5) pair used in RKF has its stability region uniformly and substantially increased by local extrapolation. Krogh’s test problem 7 becomes stiff between x-30 and x = 50. On this interval extrapolation reduces the cost in the various versions of RKF by 27% at all tolerances (the error criterion is also affecting this result). The effect of local extrapolation on the Adams PECE formulas depends on the order. At the lowest orders, 1-3, the stability is greatly improved. The degree of improvement decreases up to order 11, where it is not clear which procedure one would prefer. At order 12, it is better not to extrapolate as regards stability. Variable order Adams codes use low orders when they encounter stability difficulties (this is even used as a way of detecting and informing the user of stiffness in the driver DE associated with STEP), and the stability properties of orders 14 determine how the codes perform; cf. [l, Chapter 9; 17; 18, p. 151. Extrapolation is very helpful for the Adams PECE formulas when implemented in a variable order code. Efficient y The work involved in integrating to a fixed point b is proportional to the number N of steps taken. Now
can be approximated just as we did the global error. For example, for both EPUS and XEPUS,
so
EPUS, XEPUS:
N’(
Be) -l~k~~/q+,
y(t))lr’kdt.
205
Local Error Cow01 Similarly, for both EPS and XEPS, EPS, XEPS:
iv&(/%)_ l/(k+l)~bl~(t,y(t))ll’[k+l)d~. (1
With approximations for the cost and the global error, we can draw some conclusions about efficiency. One measure of efficiency is to consider the cost as a function of the input parameter E. We have already noted that extrapolation does not affect this cost, but here we can say much more. Leaving aside the constants involved, we see that for EPUS and XEPUS the cost is proportional to e -1/k for small e; for EPS and XEPS it is proportional to e -l/(k+l). In contrast to Sec. 3, we now can compare all four cases to conclude that in this measure EPS and XEPS are preferred over EPUS and XEPUS. This measure does not consider accuracy. Another measure of efficiency is the cost as a function of the global error p attained. We have already noted that extrapolation increases efficiency in this sense. To see how the cost depends on the global error for EPUS, we recall that the cost is proportional to z - iik and the global error ,u is proportional to E, so that the cost must be proportional to p - ‘ik . Similar manipulations show that for EPUS and EPS the cost is proportional to p -Ilk for small global errors IL; for XEPUS and XEPS it is proportional to p-‘/@+‘). In this measure of efficiency XEPUS and XEPS are preferred over EPUS and EPS. Earlier we could say XEPUS (XEPS) is preferred over EPUS (EPS) on general grounds, but we were unable to make comparisons like that of XEPUS with EPS or XEPS with EPUS as we can now. We have also obtained estimates of the global error for very stable equations. For such problems the proportionality is altered to p -l/(k+l) for EPUS and EPS and to 1_1-i/(~+‘)for XEPUS and XEPS. Thus the preferences are not altered in this special case. To be sure, we have ignored constants in our study of efficiency which may be decisive for a given problem, and we are confident that there are problems for which any given case is the most efficient. Our object is to gain insight as to which case is the most practical and our approach furnishes that.
of Thumb Returning now to the size of the global error, we examine the role of the constant P. The authors of DVDQ, STEP, and RKF~~ have been quite explicit that P has been chosen so that when integrating standard sets of routine problems, the global errors resulting are comparable to the local error tolerance. From private communications with the authors of other recent codes we believe this has become common. Believing as we do that the typical routine problem has a smooth function and is moderately stable, it is The Rules
LAWRENCE
206
F. SHAMPINE
reasonable to “tune” a code so that our first rule of thumb that the global error is comparable to the local error tolerance will be as valid as possible. For carefully written and tuned codes, this rule is very useful for routine problems. The extensive results for DVDQ [13] and STEP [l] demonstrate its worth even when applied to unusually difficult problems (cf. Table 3 below). The results of Table 1 for RKF are not particularly favorable, as this problem has g(x) - 1 and the various versions of the code were not tuned, yet the maximum global errors are not conspicuously different from the local error tolerance. Our second rule of thumb is that the global error is roughly proportional to the local error tolerance. There are, however, two distinct ways to use this behavior to estimate global errors. For both we must integrate from a to b with two different tolerances er > ls to get approximate solutions yCIand I/,,. If the proportionality were very good, so that
we could use it to estimate the global error of the 17uzleaccurate result by
We do not know of any code which will furnish very reliable estimates in this way, though some Runge-Kutta codes do pretty well. With attention to the matter one can write codes which pretty reliably reduce the global error when the tolerance is reduced. Such codes can be used to assess their global errors as follows. To be specific, suppose we integrate with e1 = (Eand e2 = 0. le, which are suitable practical values. We can estimate the global error of the less accurate result by
If we merely hypothesize that the global error is reduced by lo%,
then we see that the ratio I of the estimate to the true error, YO.lr -
Yc
=r[Y(WYJ,
207
Local Error Control is bounded by 0.1 =Gr < 1.9,
which is to say the estimate is correct to within an order of magnitude. If we assume that the global error is reduced by 50%, we find the estimate is correct to within a factor of 2. In view of our analysis, a carefully written code, particularly one using EPUS and XEPS, will approach a reduction by 90% and so lead to estimates which are likely to be correct within a factor of 2. With much more confidence one can expect them correct within an order of magnitude. This is reasonably valid for all the codes cited, though some could be improved by attention to their design in this respect. Krogh’s test problem 10 [13] is the solution of a restricted three body problem over one period of length T. This problem involves an unusually large range of step sizes, and also of orders in a variable order code. At the end of the interval, one solution component is zero and the other 1.2. Because the problem is to be integrated with a pure absolute error test, it might appear very difficult to get even the correct sign for the error associated with the zero component. In Table 2 we display results computed with RKF, which is a fourth order Runge-Kutta code implementing EPUS. The estimated errors are computed by the scheme just described, with tolerances c and 0.1~ The estimates are remarkably good down to the tolerance of lo-“. We might expect the excellent estimates because the code TABLE 2 ESTIMATING
GLOBAL USING
ERRORS RKF AND
FOR A RESTRICTED THE
METHOD
THREE
y1(T)=1.2
‘o!hc -2 -3 -4 -5 -6 -7 -8 -9 - 10 -11 - 12
BODY
PROBLEM
OF CONSISTENCY
y2( T) = 0.0
Est.
True
Est.
True
Error
Error
Error
Error
- 6.83-3 1.3E-3 - 5.2~-5 - 1.4E-5 - 6.5~7 ~.OE-8 6.2~-Q 1.13-Q 6.5~-10
- 5.6~-3 1.23-3 - 6.7~-5 - 1.5E-5 - 6.2~-7 2.7~8 6.7x-9 5.63-10 - 5.93-10
1.3~-2 1.13-2 2.2~-3 2.03-3 2.3~-4 ~.OE-4 2.93-5 2.6~-5 2.23-6 2.03-6 1.1E-7 1.2~-7 6.8%9 7.63-g 8.53-10 1.73-10 - 4.8E-10 6.8~-10 1.23-g - l.l~-9 run terminated due to
1.2~-9
-
minimum step size
1.2~-9
208
LAWRENCE
F. SHAMPINE
TABLE 3 ESTIMATING
GLOBAL
ERRORS
STEP
AND y1(T)=
FOR A RESTRICTED THE
METHOD
THREE
BODY
PROBLEM
USING
OF CONSISTENCY
1.2
yz( T) = 0.0
Est.
True
Est.
True
log,,e
Error
Error
Error
Error
-2 -3 -4 -5 -6 -7 -8 -9 - 10 -11 -12a
8.9&1 1.7E-3 -2.1~-3 3.33-4 - 1.1E-5 - ~.OE-6 2.13-7 - 1.4~-8 - l.lE-9 -2.33-11 l.OE-11
8.9E-1 - 9.1E-5 - 1.8~-3 3.2~-4 - 1.3E-5 - 1.9E-6 ~.OE-7 - 1.6~-8 -1.13-g - 2.5~-12 2.03-11
- 8.6~-2
-
- 6.4~-3 6.1~-4 - 1.7E-4 - 1.9E-5 - ~.OE-7 - 4.03-7 1.2~-8 -2.1E-10 - l.OE-10 - 1.03-11
~.OE-3 4.13-4 - 1.9E-4 - ~.OE-5 - 5.83-7 - 3.8~-7 1.23-S -3.2~-10 - l.lE-10 - l.lE-11
9.2E-2
-
aLimiting precision detected by code.
adjusts its step size at every step. This is a very difficult problem for a fourth order code, and we believe those integrations with tolerances of 10-i” and below (affecting the estimate for tolerances lo-’ and below) are being affected by roundoff. In Table 3 we display results computed with STEP, which is a variable order Adams code implementing XEPS. Because the adjustment of the step size is not so precise as in RKF for reasons we have already mentioned, the estimates are not as good as those of Table 2, but they do support our contention that such estimates are quite likely to be correct to within an order of magnitude. In fact they are often correct to within a factor of 2. In one respect STEP performs markedly better. It is designed to yield high accuracy and contains devices to both reduce propagated roundoff error and detect impossible accuracies. The consequence here is that the estimates are good right down to limiting precision.
5.
CONCLUSIONS
We have studied two rules of thumb for estimating global errors and the associated question of the local error criterion used. It appears that the rules are quite useful if the code was designed with them in mind and they are applied with discretion. Let us summarize the guidelines which have emerged. For the global error to be comparable to the local error tolerance,
Local Error Control
209
the problem must be presumed moderately stable over the interval of integration and the tolerance “small”. EPUS and XEPS give the better qualitative behavior of the global error with respect to the tolerance, especially in variable order codes, but EPS and XEPUS are satisfactory if rather high orders are used most of the time. It is essential that the step size used be fairly close to optimal to get good qualitative behavior; this is easier to achieve with one step methods. The internal tolerance used by the code should be tuned on a large set of representative problems so as to make the quantitative relation between global error and tolerance as close as possible. The observations and analysis of this paper do not settle the matter of which case to use. The principal question is the code writer’s attitude toward the role played by the local error criterion and how he wishes to view the error of the solution. Some authors prefer to measure error by regarding the numerical solution as the exact solution of a problem close to the one posed and to measure how close the problems are-measure the residual of the computed solution. This is very attractive to the numerical analyst, and allows one to be rather precise when using EPUS and to draw similar but less precise conclusions for XEPS. Other authors, including the present one, feel that it is global error which most concerns the user and prefer to regard the step size selection as a way of efficiently and reliably integrating a problem, with accuracy to be assessed in the two ways described herein. There is an analogy to Gaussian elimination, with one viewpoint favoring a small residual as defining a “good” solution, and the other relying upon residual correction to estimate the “true” accuracy. Whether or not to use local extrapolation also depends on one’s attitude toward the error. Granted the view espoused in this paper and that the method considered has its stability improved by extrapolation, the arguments for XEPS are overwhelming. With a different view the matter is unclear, as each case has its drawbacks. The author has written production codes using the Fehlberg formulas and the Adams formulas in PECE form which employ EPUS, EPS, and XEPS. Besides the factors studied here, there are others clearly separating the cases. Among them are the important practical matters that with error per unit step one cannot deal in straightforward manner with discontinuities, integrable singularities, or starting a variable order Adams code. All these factors and a great deal of experience with variant codes has convinced the author that XEPS is very much to be preferred for these two methods. Anyone writing a code for the initial value problem should seriously consider this case. Our colleugue H. A. Watts has always influenced our thinking about the solution of ordinary differential equations, and his cornnwnts have been especiully valuable in the preparation of this paper.
210
LAWRENCE
F. SHAMPINE
REFERENCES
8 9 10 11
12
13 14
15
16 17
18 19
L. F. Shampine and M. K. Gordon, Computer Solution of Orclinuy LXffi?wntial Equations: the Initial Value Problem, Freeman, San Francisco, 1975. G. Birkhoff and G.-C. Rota, Ordinary Differential Equations, Ginn, New York, 1962. N. F. Stewart, An integration routine using a Runge-Kutta method, M. SC. Thesis, Dept. of Comput. Sci., Univ. of Toronto, 1965. L. F. Shampine and R. C. Allen, Jr., Numerical Computing: an introduction, Saunders, Philadelphia, 1973. C. W. Gear, Numerical Initial Value Probkm in Ordinary Dtffmential Equations, Prentice-Hall, Englewood Cliffs, N.J., 1971. J, A. Zonneveld, Automatic Numerical Integration, Math. Ceutre Tracts No. 8, Mathematisch Centrum, Amsterdam, 1964. F. T. Krogh, VODQ/SVDQ/DVDQ-variable order integrators for the numerical solution of ordinary differential equations, TU Dot. No. CP-2308, NPO-11643, Jet Propul. Lab., Pasadena, Calif., May 1969. P. Henrici, Discrete Variable Methods in Ordinay Dif$erential Equations, Wiley, New York, 1962. S. Lefschetz, Diffewntial Equations: Geometric Theory, Interscience, New York, 1957. G. F. Gabel, A predictor-corrector method using divided differences, M. SC. Thesis, Dept. of Comput. Sci., Univ. of Toronto, 1965. T. E. Hull, W. H. Enright, B. M. Fellen, and A. E. Sedgwick, Comparing numerical methods for ordinary differential equations, SIAM. 1. Numet. Anal. 9 (1972), 603-637. W. H. Enright, R. Bedet, I. Farkas, and T. E. Hull, Test results on initial value methods for non-stiff ordinary differential equations, Tech. Rept. No. 68, Dept. of Comput. Sci., Univ. of Toronto, 1974. F. T. Krogh, On testing a subroutine for the numerical integration of ordinary differential equations, JACM 20 (19731, 545-562. P. A. Fox, DESUB: integration of a first-order system of ordinary differential equations, in Mathematical Software (J. Rice, Ed.), Academic, New York, 1971, Chapter 9. L. F. Shampine, Stiffness and non-stiff differential equation solvers, in Nurnerische Behandlung mm Differentialgkkhungen, ISNM 27 (L. Collatz, Ed.), Birkhauser, Basel, 1975, pp. 287301. L. F. Shampine, Local extrapolation in the solution of ordinary differential equations, Math. Comput. 27 (1973), 91-97. F. T. Krogh, Changing step size in the integration of differential equations using modified divided differences, Proc. of the Couf on the Nume~. Solution of Ordinary Differential Equations, Lecture Notes in Math. No. 362, Springer, New York, 1974. F. T. Krogh, The numerical integration of stiff differential equations, TRW Rep. No. 99900..6573-ROOO, TRW Systems, Redondo Beach, Calif., 1968. L. F. Shampine, H. A. Watts, and S. M. Davenport, Solving non-stiff ordinary differential equations--the state of the art, SIAM Reu., 18 (1976), 376411.