
Computers and Mathematics with Applications 45 (2003) 481-501 www.elsevier.com/locate/camwa

Performance of Gauss Implicit Runge-Kutta Methods on Separable Hamiltonian Systems

V. ANTOHE AND I. GLADWELL
Department of Mathematics, Southern Methodist University, Dallas, TX 75275, U.S.A.

Abstract—We consider implementations of a variable step size (and, separately, constant step size), fourth-order symplectic Gauss implicit Runge-Kutta method for the solution of Hamiltonian systems. We test our implementations on Kepler's problem with the aim of judging the algorithms' qualitative behavior and efficiency. In particular, we introduce compensated summation as a method of controlling roundoff accumulation. Also, we show how the variable step size Gauss implicit Runge-Kutta method performs on Kepler's problem with solution orbits of high eccentricity, and compare its performance with that of two Runge-Kutta-Nyström codes. Finally, we discuss the calculation of efficient starting values for the associated iterations, measure the cost in iterations of our various predictors, and comment on the strategies for terminating the iteration. © 2003 Elsevier Science Ltd. All rights reserved.

Keywords—Implicit Runge-Kutta, Hamiltonian systems.

1. INTRODUCTION

Assume the autonomous Hamiltonian H(p, q) is a smooth real function, where p = (p_1, p_2, ..., p_d)^T are the generalized momenta, q = (q_1, q_2, ..., q_d)^T are the generalized coordinates, and d is the number of degrees of freedom. The Hamiltonian system of ordinary differential equations (ODEs) corresponding to H(p, q) is, for i = 1, 2, ..., d,

dp_i/dt = -∂H/∂q_i,    dq_i/dt = ∂H/∂p_i,    (1.1)

with which we need initial conditions (p(t_0), q(t_0)) = (p_0, q_0). Note that

H(p, q) = H_0 ≡ H(p_0, q_0)    (1.2)

(i.e., dH/dt = 0), and Hamiltonian ODE systems of the form (1.1) are symplectic [1]. Here, we consider using the two-stage, fourth-order Gauss implicit Runge-Kutta (IRK) methods for solving Hamiltonian systems (1.1). When implemented in infinite precision, these methods have desirable properties (such as symplecticness) which mirror the properties of the original

0898-1221/03/$ - see front matter © 2003 Elsevier Science Ltd. All rights reserved. PII: S0898-1221(02)00319-X

Typeset by AMS-TeX


Hamiltonian system and which ensure slow error growth in constant step size integration and a bounded error in the Hamiltonian [2]. A separable Hamiltonian has the structure

H(p, q) = T(p) + V(q).    (1.3)

In mechanics, T(p) = (1/2) p·(M^{-1}p) represents kinetic energy (M is the mass matrix) and V(q) represents potential energy. The Hamiltonian system for (1.3) has "partitioned form"

dp/dt = -∇_q V,    dq/dt = ∇_p T = M^{-1}p.    (1.4)

This system may be integrated using specially designed IRK methods, for example, the symplectic partitioned IRK methods (which are explicit for partitioned systems) and certain Runge-Kutta-Nyström methods; see [1,3]. Though implicit methods are not necessary for these problems, we consider their use, particularly the use of the IRK methods, because of their mathematical properties and because they have been widely recommended for the nonseparable case; see [1] and the references therein. Although our numerical example, Kepler's problem, is separable, we do not take advantage of separability. Our overall conclusions do not depend on it. In the scaled version of Kepler's problem widely used in numerical tests [4-6], we have M = I and V(q) = -1/(q·q)^{1/2}. So, the Hamiltonian (1.3) becomes

H(p, q) = p·p/2 - 1/(q·q)^{1/2},    (1.5)

and the Hamiltonian ODE system (1.4) becomes

dp/dt = -q/(q·q)^{3/2},    dq/dt = p.    (1.6)

Typically, Kepler's equations are integrated starting from the initial conditions

p = (0, ((1+e)/(1-e))^{1/2})^T,    q = (1-e, 0)^T,    (1.7)

where e, 0 < e < 1, is the eccentricity of the orbit. (The larger the value of e, the more eccentric the orbit and the more difficult the integration.) In Section 2, we study how the accuracy of the solution of the nonlinear IRK stage equations impacts the accuracy of the relative Hamiltonian errors in a long time integration of Kepler's equations. We consider both a direct constant step size integration and an "error controlled" integration employing an arclength reparametrization with constant step size integration of the reparametrized equations. Here, we assume that the IRK method is implemented to machine accuracy, that is, that the associated iterations are iterated to convergence. In particular, we introduce compensated summation as a method of controlling roundoff accumulation. We continue by showing how the reparametrized method performs in extreme circumstances and compare that performance with that of two Runge-Kutta-Nyström (RKN) codes. In Section 3, we derive explicit Runge-Kutta (ERK) methods which approximate the stage values of the IRK
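The scaled Kepler problem above is simple to set up in code. The following sketch (ours, not from the paper) evaluates the Hamiltonian (1.5), the right-hand side (1.6), and the initial conditions (1.7); note that for these initial data H_0 = H(p_0, q_0) = -1/2 for every eccentricity e, which gives a handy consistency check.

```python
import numpy as np

def hamiltonian(p, q):
    # H(p, q) = p.p/2 - 1/sqrt(q.q), the scaled Kepler Hamiltonian (1.5)
    return 0.5 * np.dot(p, p) - 1.0 / np.sqrt(np.dot(q, q))

def kepler_rhs(y):
    # y = (p, q); returns (dp/dt, dq/dt) from (1.6)
    p, q = y[:2], y[2:]
    dp = -q / np.dot(q, q) ** 1.5
    dq = p
    return np.concatenate([dp, dq])

def kepler_init(e):
    # initial conditions (1.7) for eccentricity e
    p0 = np.array([0.0, np.sqrt((1.0 + e) / (1.0 - e))])
    q0 = np.array([1.0 - e, 0.0])
    return np.concatenate([p0, q0])
```

A quick check: with e = 0.9, the orbit starts at perihelion distance 1 - e = 0.1 and the Hamiltonian evaluates to -1/2.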


methods. Using a fixed-point iteration to compute the stage values in the case when the ODEs being solved are nonstiff, as in the Hamiltonian system case, is suggested in [1] and in many other places. Our aim is to use good predictors to speed up the fixed-point iterations when computing the stage values.

2. ACCURACY OF THE IRK INTEGRATIONS

Here, our main concern is the behavior of the relative Hamiltonian error in integrations with the fourth-order two-stage Gauss IRK formula. Consider the autonomous ODE system y' = f(y).

(2.1)

For system (1.4), we choose y = (p, q)^T and f(y) = (-∇_q V(q), ∇_p T(p))^T.

2.1. IRK Formulas and Constant Step Size Integration

The two-stage fourth-order Gauss IRK formula may be represented in standard notation using the Butcher tableau [1,2]:

1/2 - √3/6 | 1/4          1/4 - √3/6
1/2 + √3/6 | 1/4 + √3/6   1/4            (2.2)
           | 1/2          1/2

This tableau is a shorthand for the integration formula

y_{n+1} = y_n + (1/2) Δt_{n+1} [f(Y_1) + f(Y_2)],    (2.3)

where the stages Y_1, Y_2 satisfy the equations

Y_1 = y_n + Δt_{n+1} [(1/4) f(Y_1) + (1/4 - √3/6) f(Y_2)],
Y_2 = y_n + Δt_{n+1} [(1/4 + √3/6) f(Y_1) + (1/4) f(Y_2)].    (2.4)

Here, y_n approximates the solution y(t_n) at time t_n, and Δt_{n+1} = t_{n+1} - t_n. (The values in the left column of tableau (2.2) are required only when solving nonautonomous systems.) Solving equations (2.3) and (2.4) exactly guarantees that the two-stage Gauss IRK method is fourth-order in Δt and is symplectic; see [4]. The latter property guarantees that the relative Hamiltonian error is bounded for all time. In the remainder of this section, we assume that these equations are solved precisely to machine precision. We achieve this using a fixed-point iteration on the stage increments Z_j = Y_j - y_n:

Z_1^{(i+1)} = Δt_{n+1} [(1/4) f(y_n + Z_1^{(i)}) + (1/4 - √3/6) f(y_n + Z_2^{(i)})],
Z_2^{(i+1)} = Δt_{n+1} [(1/4 + √3/6) f(y_n + Z_1^{(i)}) + (1/4) f(y_n + Z_2^{(i)})],    (2.5)

where Z_1^{(0)}, Z_2^{(0)} is the predicted starting value (for example, the trivial predictor Z_1^{(0)} = Z_2^{(0)} = 0).

484

V. ANTOHE AND I. GLADWELL

This iteration continues until there is no further change in the scaled iterates

T^{(i)} = Z^{(i)} / Δt_{n+1}

to machine precision. Of course, this does not give the exact solution, but it is about as close to the true solution as we can realistically hope to achieve in a finite fixed precision computation. The scaling by Δt is to remove the implicit factor of Δt_{n+1} in Z^{(i)}; that is, it removes the dependence on the step size of the iteration error control; see [1, Section 5.5]. For Δt_{n+1} small enough, the iteration is guaranteed to converge to the "true" answer, so the quality of the prediction Z^{(0)} of the stages is important only for efficiency. We return to this efficiency question in Section 3. For now, we observe that in all our experiments the step size Δt_{n+1} has been small enough so that the iteration has always converged. For large values of Δt_{n+1}, we lose accuracy in the integration of Kepler's problem (that is, a graph of the (p_1, p_2) phase plane of the computed solution is no longer a closed orbit), but still the iteration for T^{(i)} converges.
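One step of the scheme above can be sketched as follows. This is a minimal illustration (ours, not the authors' Fortran 90 or MATLAB code), assuming the trivial predictor Z^{(0)} = 0 and the absolute convergence test on the scaled iterates described in Section 2.3.

```python
import numpy as np

SQRT3 = np.sqrt(3.0)
# Butcher matrix of the two-stage Gauss method, tableau (2.2)
A = np.array([[0.25,               0.25 - SQRT3 / 6.0],
              [0.25 + SQRT3 / 6.0, 0.25]])

def gauss2_step(f, y, dt, K=4.5, max_iter=200):
    """One step of the two-stage Gauss IRK (2.3).  The stage increments
    Z_j = Y_j - y_n are computed by the fixed-point iteration (2.5),
    iterated until the scaled iterates T = Z/dt stop changing (cf. (2.13))."""
    eps = np.finfo(float).eps
    Z = np.zeros((2, y.size))            # trivial predictor Z^(0) = 0
    for _ in range(max_iter):
        F = np.array([f(y + Z[0]), f(y + Z[1])])
        Z_new = dt * (A @ F)
        done = np.max(np.abs(Z_new - Z)) <= K * eps * dt  # ||dT|| <= K*eps
        Z = Z_new
        if done:
            break
    F = np.array([f(y + Z[0]), f(y + Z[1])])
    return y + 0.5 * dt * (F[0] + F[1])  # update (2.3)
```

Because the method preserves quadratic invariants, applying it to a harmonic oscillator keeps the computed solution on the unit circle to roundoff, which is an easy sanity check.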

Figure 1. Integration of Kepler's problem with e = 0.9, Δt = 0.0005, S = 200.

In Figure 1, we show the relative Hamiltonian error in a double precision (DP) Fortran 90 (f90) integration of Kepler's problem with eccentricity e = 0.9 using a constant step size Δt = 0.0005 in the two-stage Gauss IRK method. In this and all later figures, we plot the error only once every S steps, so that the reader is able to observe any structures in the plot. With each figure, we give the value of Δt used in the integration and the value of S used for the plot. The range of integration is chosen to be the same as that used in [4]; that is, it is of length 1,025 periods. In [4], the global error in the variables p, q is analyzed on this range. The error in the solution grows linearly with time t; see [2]. However, the elliptical phase plot of the solution in the (q_1, q_2)-plane remains a closed orbit in all the results reported here. Indeed, though the numerical solution has significant global error, it traces the orbit the same number of times as the true solution. In Figure 1, we observe the regular (bounded) behavior of the error, which lies within an asymmetric band about the zero error line. The same type of regular behavior is obtained for integrations using a range of constant step sizes. The largest step size that we tried for this eccentricity which preserves the periodic nature of the solution is Δt = 0.001. In


Table 1, the maximum Hamiltonian errors observed for Δt = 0.001 and for successively smaller values of Δt are given. Down to Δt = 0.0002, these maximum Hamiltonian errors demonstrate clearly the fourth-order behavior of the method; subsequently, roundoff error impacts the order calculations. We also include the corresponding values for Fortran 90 extended precision (EP) calculations, which clearly indicate the correct order; roundoff error has no significant impact in these calculations. The regular fourth-order reduction in the error as the step size is halved that is observed in Table 1 breaks down when the step size Δt is sufficiently small. In Figure 2, we show the relative Hamiltonian errors from the same sequence of step sizes as in Table 1. Observe that the loss of fourth-order behavior in Table 1 is due to the presence of a phenomenon not observable for larger step sizes, where there are larger truncation errors. These results and those for the reparametrized equations discussed in the next section were initially produced using a MATLAB implementation; MATLAB uses double precision arithmetic. We have also used a Fortran 90 double precision implementation of the same algorithm on the same computer system, producing close to identical results. (MATLAB and Fortran 90 are syntactically close, and therefore, it is relatively easy to check that the algorithm implemented is the same.)

Table 1. Maximum relative Hamiltonian errors integrating Kepler's problem with e = 0.9 over 1,025 periods.

  Δt       Herrmax(DP)            Order        Herrmax(EP)            Order
0.0010   -3.90597350 x 10^-8      --         -3.90591643 x 10^-8      --
0.0005   -2.43982967 x 10^-9    4.00082982   -2.43862020 x 10^-9    4.00152409
0.0002   -6.41406928 x 10^-11   3.97101530   -6.24102202 x 10^-11   4.00032271
0.0001   -6.99174052 x 10^-12   3.19751639   -3.90047404 x 10^-12   4.00006093
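The order entries in tables like Table 1 follow from the usual relation |err| ≈ C Δt^p between errors at successive step sizes. A short sketch (ours, for illustration):

```python
import math

def observed_order(err1, err2, h1, h2):
    # p such that |err| ~ C*h^p, from two (error, step size) pairs:
    # p = log(|err1|/|err2|) / log(h1/h2)
    return math.log(abs(err1) / abs(err2)) / math.log(h1 / h2)

# DP entries of Table 1 for dt = 0.0010 and dt = 0.0005
p = observed_order(-3.90597350e-8, -2.43982967e-9, 0.0010, 0.0005)
```

Applied to the first two DP rows of Table 1, this reproduces the tabulated order estimate of about 4.0008.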

The linear error growth observed in Figure 2 is reminiscent in size and behavior of that predicted by the standard analysis of the effects of the accumulation of roundoff error for a double precision calculation. (Another, possibly clearer, example of linear error growth is shown in Figure 6.) This linear error growth is probably mainly explained by the accumulation of roundoff error. (Note that, although we solve the stage equations (2.4) almost exactly, there are still roundoff errors in the formation of the solution in (2.3).) We observe that during the linear error growth, the maximum size of the error is approximately the same (that is, of order 10^-12) in a double precision (16 decimal digit) calculation for all the step sizes where its effect is visible. Note, as the step size is halved, the number of steps doubles but the increment (the second term in (2.3)) to the solution is approximately halved. To check the effects of precision, we computed exactly the same quantities using extended precision in Fortran 90, on the same computer system. At this extended precision, the error growth phenomenon vanishes, confirming that the problem is indeed due to the precision of the arithmetic.

2.2. Error Control by Arclength Reparametrization

The short step sizes needed to compute a realistic solution of Kepler's problem when using a constant step size are entirely due to the difficulty of integrating accurately near the point of closest approach in the orbit (viewing Kepler's equations as describing a two-body problem with one body fixed at the origin). Much larger step sizes could be used elsewhere on the orbit. Since using a standard variable step size algorithm would violate the symplecticness of the Gauss IRK method, a number of authors [1,4,7-9] have suggested transforming the system to one where a constant step size method would produce approximately the same error on every step. In [4], an arclength reparametrization

dt/dτ = g(p, q)

is proposed.

Figure 2. Integration of Kepler's problem with e = 0.9: (a) Δt = 0.0010, S = 200; (b) Δt = 0.0005, S = 200; (c) Δt = 0.0002, S = 400; (d) Δt = 0.0001, S = 500.

This leads to the equations

dp/dτ = -g(p, q) ∇_q H(p, q),    dq/dτ = g(p, q) ∇_p H(p, q),    (2.6)

which for Kepler's problem reduce to

dp/dτ = -g(p, q) q/(q·q)^{3/2},    dq/dτ = g(p, q) p.    (2.7)

A variety of choices of g(p, q) have been proposed in the literature. Here, we use a standard arclength measure

g(p, q) = [p·(M^{-1}p) + ||∇_q V(q)||_2^2]^{-1/2}    (2.8)

recommended in [4]. For Kepler's problem, this gives

g(p, q) = [p·p + 1/(q·q)^2]^{-1/2}.    (2.9)


The resulting reparametrized equations are further modified by adding a term to preserve the Hamiltonian nature of the problem; see [7-9] for a justification. For Kepler's problem, this modification leads to the equations

dp/dτ = -g(p, q) q/(q·q)^{3/2} - [H(p, q) - H_0] ∇_q g(p, q),
dq/dτ = g(p, q) p + [H(p, q) - H_0] ∇_p g(p, q),    (2.10)

where

∇_p g(p, q) = -g(p, q)^3 p,    ∇_q g(p, q) = 2g(p, q)^3 q/(q·q)^3.    (2.11)

The idea is to integrate equations (2.10) with a constant step size Δτ in τ, hence preserving symplecticness of the system as a function of τ. Of course, for a constant value of Δτ, the step size Δt in the original independent variable, t, varies with arclength; that is, the step size is small where there are large variations in the solution with respect to unit variations in t and is large elsewhere. In the case of Kepler's problem, that means that the actual step size Δt is small near the point of closest approach of the two bodies. Although its provenance is an arclength reparametrization of time, Δτ is effectively being used as a local error control tolerance.
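The reparametrized right-hand side is straightforward to assemble from (2.9)-(2.11). The sketch below is ours, not the authors' code; the exponent inside g follows from inserting ∇_q V = q/(q·q)^{3/2} into (2.8), and the finite-difference check in the usage note confirms that the gradient formulas (2.11) are consistent with it.

```python
import numpy as np

def g(p, q):
    # arclength measure (2.9) for Kepler's problem (M = I)
    return (np.dot(p, p) + 1.0 / np.dot(q, q) ** 2) ** -0.5

def grad_g(p, q):
    # analytic gradients (2.11)
    g3 = g(p, q) ** 3
    return -g3 * p, 2.0 * g3 * q / np.dot(q, q) ** 3

def reparam_rhs(p, q, H0):
    # modified reparametrized Kepler system (2.10); H is the Hamiltonian (1.5)
    H = 0.5 * np.dot(p, p) - 1.0 / np.sqrt(np.dot(q, q))
    gv = g(p, q)
    gp, gq = grad_g(p, q)
    dp = -gv * q / np.dot(q, q) ** 1.5 - (H - H0) * gq
    dq = gv * p + (H - H0) * gp
    return dp, dq
```

When H(p, q) = H_0 holds exactly, the correction terms vanish and (2.10) collapses back to (2.7); comparing grad_g against central differences of g at an arbitrary point is a useful consistency test.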

Figure 3. Relative Hamiltonian error, DP integration of the reparametrized Kepler problem, e = 0.9: (a) Δτ = 0.05000, S = 10; (b) Δτ = 0.02500, S = 20; (c) Δτ = 0.01250, S = 20; (d) Δτ = 0.00625, S = 100.

Table 2. Relative Hamiltonian errors and estimated orders of accuracy for DP integration of the reparametrized Kepler problem with e = 0.9 over 1,025 periods.

  Δτ        Herrmax(DP)           Order      Herrmax(EP)           Order      Herrmax(DPCS)         Order
0.10000   2.072764 x 10^-7        --       2.072760 x 10^-7        --       2.072759 x 10^-7        --
0.05000   1.295087 x 10^-8      4.00043    1.295036 x 10^-8      4.00049    1.295038 x 10^-8      4.00049
0.02500   8.101368 x 10^-10     3.99874    8.093571 x 10^-10     4.00007    8.093659 x 10^-10     4.00006
0.01250   5.076650 x 10^-11     3.99622    5.058419 x 10^-11     4.00002    5.060841 x 10^-11     3.99934
0.00625   1.054801 x 10^-11     2.26691    3.161501 x 10^-12     4.00000    3.167244 x 10^-12     3.99808

Table 3. Step size ranges and numbers of steps (in thousands) for DP integration of the reparametrized Kepler problem with e = 0.9 over 1,025 periods.

  Δτ        Δt_min          Δt_max        Steps
0.10000   1.00 x 10^-3    2.77 x 10^-1    163
0.05000   5.00 x 10^-4    1.39 x 10^-1    327
0.02500   2.50 x 10^-4    6.95 x 10^-2    654
0.01250   1.25 x 10^-4    3.48 x 10^-2   1308
0.00625   6.24 x 10^-5    1.74 x 10^-2   2616

In Figure 3, the relative Hamiltonian error for equations (2.10) is given for step size Δτ = 0.05 and for successive halvings of that step size. In Table 2, the corresponding maximum relative Hamiltonian errors and, in Table 3, the range of values of the underlying step size Δt are given. A check of the data files used for plotting the figure reveals that, for each value of Δτ, not surprisingly the minimum step size Δt_min is taken near the point of closest approach of the two bodies, and the maximum step size is taken at a point distant from there. Also, a study of the estimated errors in Table 2 for the larger values of Δτ in double precision arithmetic reveals that the error behaves with the correct (fourth) order accuracy in the constant step size Δτ. In Figure 3, we observe that most of the error plot is contained in a band of regularly placed points which is reducing in width at the correct order with Δτ, but that a "chaotic" set of points becomes more dominant in the error as Δτ becomes small. As in the constant step size case, as the step size Δτ is halved, the number of steps approximately doubles and the minimum step size Δt_min is approximately halved. In contrast, in the extended precision calculation, the errors and the estimated order behave regularly, as there are no significant roundoff effects. (The columns in Table 2 headed DPCS correspond to double precision calculations with compensated summation, which are discussed in Section 2.4.) A study of the corresponding data files reveals that the chaotic values all correspond to points near the point of closest approach. The chaotic points are deceptively dense in the figures because there are so many short Δt steps (and hence, so many plotted values) in the vicinity of the point of closest approach in comparison with the number of Δt steps taken elsewhere. What is not clear from Figure 3 is how this chaotic error regime behaves in a longer integration.
In Figure 4b, we show a plot of the relative Hamiltonian error over a time range nearly ten times longer and using the smallest step size, Δτ = 0.00625, in Figure 3. Figure 4 reveals that the behavior of the chaotic part of the error over this longer range is similar to its behavior on the original range. However, it is clearer that the error is growing essentially linearly over this long time range and that, simultaneously, the regular part of the error remains qualitatively and quantitatively unchanged over this longer time span. Note that, over the ten times longer range, the width of the chaotic band increases by a factor of less than five; about ten times as many Δt steps are used. Figures 4a and 4c are intended as Fortran 90 double precision and MATLAB "snapshots" of the short (1,025 period) integration for direct comparison, and for comparison with their longer


Figure 4. Integration of the reparametrized Kepler problem for e = 0.9 and Δτ = 0.00625: (a) f90 DP, S = 50; (b) f90 DP, S = 500; (c) MATLAB, S = 50; (d) MATLAB, S = 500.
Figure 4. Integration of the reparametrized Kepler problem for e -- 0.9 and ~T = 0.00625. (10,000 period) integration range counterparts, Figure 4b and 4d, respectively. It is clear that, t h o u g h M A T L A B and F o r t r a n 90 double precision are b r o a d l y e q u i w l e n t , in an integration as sensitive to roundoff error accumulation as this, t h e small differences in i m p l e m e n t a t i o n have very significant effect. T h a t being said, qualitatively (and even q u a n t i t a t i v e l y t a k i n g a global view) t h e results from M A T L A B and F o r t r a n 90 double precision are very similar. In t h e F o r t r a n 90 double precision implementation, we also isolated the calculation of relative H a m i l t o n i a n error and c o m p u t e d it to extended precision from the double precision solution values. To double precision t h e values for the Hamiltonian error were t h e same as in the entirely double precision calculation. This confirms t h a t the double precision H a m i l t o n i a n error is app r o x i m a t e l y correct. Hence, this calculation is not a potential source of a c c u m u l a t e d error nor is it likely to significantly affect t h e integration t h r o u g h the introduction of the Hamiltonian error t e r m in equation (2.10). 2.3. C o n t r o l o f t h e I t e r a t i o n E r r o r A s t a n d a r d a p p r o a c h to controlling the error (see [1, Section 5.5]) is an absolute error test on the iterates A T (j) < tel, (2.12) where A T (j) = i/-(~+1) - T (~) and tel is an iteration error tolerance, or a similar relative error control. Here, usually tel > e where e is the largest number which when a d d e d to one gives


exactly one. To be able to say that the stage equations have been satisfied "exactly", setting, at least approximately, tol = ε would seem to be necessary. In [4], one aim is to study the effects of varying tol on the accuracy of the solution y = (p, q)^T for a set of choices of constant step sizes. It is shown that, for an eccentricity e = 0.9, small values of tol are needed for all the step sizes capable of producing an accurate solution over an integration range of 1,025 periods. Also, the smaller the step size used, the smaller the value of tol needed. For the constant step size numerical results presented in the previous section, we iterated equations (2.4) to convergence; that is, we required

ΔT^{(i)} ≤ tol = Kε,    (2.13)

where K ≈ 4.5. To avoid "unnecessary" iterations, it is often suggested (see, for example, [1]) that the convergence rate be estimated from the last three iterates, then used to determine whether the iteration is likely to have converged even though the iteration error test has not been satisfied. This technique is useful when constructing an efficient code, but here we are more concerned with the accuracy and phenomenology of the results; so we have not used it. We also iterate to convergence in the tests in the next section, where we study the efficiency of various predictors. The number of iterations that we report would be smaller if we used such a convergence rate estimation technique to implement a more efficient iteration. However, we believe that, relatively, the differences in the numbers of iterations would be unchanged. In the MATLAB environment implemented on a DEC Alpha, ε ≈ 2.22 x 10^-16. The following detailed observations are of integrations using Δt = 10^-4 and tol = 2.3 x 10^-16. Usually, when the iteration has converged, ΔT^{(i)} = 0. Occasionally, ΔT^{(i)} is instead an integer fraction of ε. Most commonly, ΔT^{(i)} is then ε/2 or ε/4, but occasionally it is a much smaller fraction of a power of 2 times ε. Even less frequently, it is a non-power-of-two integer fraction of ε. Most of the latter values occur in a relatively small time interval after around 1,000 periods. The smallest values observed are about four orders of magnitude smaller than ε, for which behavior we have no simple explanation.

2.4. Compensated Summation

For many years it has been understood that the effects of the accumulation of roundoff in the summation of a series might be reduced by using the compensated summation (CS) technique; see [10,11]. Using the notation in [12, Ch. 4], this algorithm for computing Σ_{i=0}^{N} x_i may be written as follows.

COMPENSATED SUMMATION ALGORITHM
sum = 0
err = 0
for i = 0:N
    temp = sum
    y = x_i + err
    sum = temp + y
    err = (temp - sum) + y
endfor    (2.14)

To achieve an optimal error bound on the accumulation of roundoff error using this algorithm, it is necessary that the correction y at any stage be smaller than the corresponding term temp, a condition satisfied almost always in our experiments. Kahan [13,14] has suggested a modification


designed to achieve the same error bound for a computer with processors without guard digits. This modification replaces the line err = (temp - sum) + y in (2.14) by

f = 0
if sign(temp) = sign(y) then f = (0.46 * sum - sum) + sum
err = (temp - f) - (sum - f) + y    (2.15)

It is well known that when summing the solution of an ODE, for example as in (2.3), the error in the solution, and even the solution itself, may be dominated by the effects of accumulated roundoff. (In the context of ODEs, Henrici [15] gave the first detailed statistical analysis and showed the results of some numerical experiments.) Each of [12,16-18] shows a "U-curve" figure, which represents the compensatory effects of truncation error and roundoff error for integration across a fixed range of integration with a variety of constant step sizes. For large step sizes, accumulated truncation error dominates and, for small step sizes, accumulated roundoff error dominates. The U-curve in [16,17] is a plot of the prediction from a convergence analysis, and the similar curves in [12,18] are the results of solving simple ODEs with constant step sizes and low order methods. The figure in [12] includes a plot of the same computations but with the compensated summation algorithm implemented as in (2.14). These results show almost no impact of roundoff error even for quite small step sizes, and they suggest strongly that compensated summation is an effective approach to the reduction of the effects of accumulated roundoff in integrations with many short steps. Compensated summation has not been widely used in accumulating the solution in numerical software for ODEs. As far as we are aware, the only use in publicly available software is in the Adams variable step/variable order code DE/STEP, written by Shampine and Gordon and described in [19, Ch. 9]. (The recent, and somewhat simpler, Fortran 90 version of DE/STEP in [20, Ch. 12] also uses compensated summation.) However, though compensated summation is used in the accumulation of the solution, the intention is to improve the accuracy and quality of the local error estimates, not to improve the accuracy of the solution per se.
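Algorithm (2.14) translates directly into code. The following sketch is ours, for illustration; the demonstration sums the floating-point value 0.1 a million times, a textbook case where naive accumulation drifts visibly while the compensated sum stays within a few ulps of the exact result.

```python
def compensated_sum(xs):
    """Compensated summation, following algorithm (2.14): err carries the
    roundoff lost when forming sum = temp + y, and feeds it back in."""
    s = 0.0
    err = 0.0
    for x in xs:
        temp = s
        y = x + err
        s = temp + y
        err = (temp - s) + y   # recovers the rounding error of temp + y
    return s

def naive_sum(xs):
    # plain left-to-right accumulation, for comparison
    s = 0.0
    for x in xs:
        s += x
    return s
```

In the paper's setting, xs plays the role of the per-step increments (the second term in (2.3)); the point is that the compensated accumulator keeps the error in the running solution from growing linearly with the number of steps.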
It is observed that when the error estimates are close to roundoff error level they can be sufficiently inaccurate as to seriously impact the algorithm for the choice of order and step size. Usually, the effect is that a lower than optimal order is chosen. The effect of using compensated summation is illustrated in [18, p. 9] in a favorable comparison of the efficiency of DE/STEP with the variable step/variable order Adams implementation in the well-known code DIFSUB [21], which does not use compensated summation. Notably, DE/STEP only permits local error tolerances larger than a few units of ε and only exploits compensated summation when the local absolute/relative error tolerance is small (that is, 100ε or less); then the error estimates must be small if the step size is appropriate. More recently, researchers in celestial mechanics [22-24] have observed the need to control the accumulation of roundoff error in their very long integrations. For example, Quinlan [22] simulates the evolution of the solar system over a period of three million years with a constant step size of 3/4 of an hour using a linear multistep (Störmer) method of order thirteen and, also, with a symmetric multistep method of order twelve; that is, he uses the equivalent of the highest-order methods implemented in DE/STEP and in DIFSUB but without local error estimation. Among the measures implemented and described in [22-24] is compensated summation, which dramatically improves the accuracy of the computed results. In the algorithm described in (2.14), for our two-stage Gauss IRK method (2.3), the variable sum represents the solution y_n, and the term x_i represents the increment (the second term in (2.3)) which involves the stages (2.4) at the current step. We have employed compensated summation to reduce the impact of roundoff errors.
It is sufficiently effective that, even over our longest range of integration, what was in the double precision computation a roundoff accumulation that dominated the solution is reduced to an unimportant effect; see Figure 5. See also Table 2,


where we observe that compensated summation has improved the double precision results to the point where the relative Hamiltonian errors behave essentially correctly; that is, the results are close to those produced by an extended precision calculation (see Figure 5d). When we add Kahan's improvement (2.15) to the compensated summation algorithm (2.14), we observe essentially no qualitative nor quantitative changes to the results.

Figure 5. Impact of compensated summation for the reparametrized Kepler problem; e = 0.9, Δτ = 0.00625: (a) f90 DP, S = 500; (b) MATLAB, S = 500; (c) f90 DPCS, S = 500; (d) f90 EP, S = 500.
2.5. Comparison of the Relative Efficiency of Reparametrization and Local Error Control

So far, all our tests are for the eccentricity e = 0.9. This is too small a value for a serious test of any adaptive step size integrator. In Table 4, we show results for this eccentricity and for an increasing set of reparametrization tolerances Δτ. (The values of the step size ratios are used in Section 3 below.) Observe that we are able to compute a closed orbit solution for a wide range of eccentricities for the relatively large tolerance Δτ = 0.05. Next, observe that as the eccentricity is increased, the maximum step size and the maximum step size ratio are almost constant while the minimum step size decreases rapidly, as is to be expected since the integration becomes significantly more difficult near the point of closest approach. So, how does the reparametrization approach compare in efficiency with a conventional local error control algorithm implemented in ODE software? In Table 5, we observe that we need a small value for Δτ to obtain an accurate Hamiltonian error when the eccentricity is large, the difficult case for integrators. On this basis, we choose to make a comparison with the results for Δτ = 0.00625 using compensated summation; then the roundoff error has a small impact.


Table 4. Maximum step size ratios and numbers of steps (in thousands) for DP integration of the reparametrized Kepler problem over 1,025 periods without compensated summation.

Δτ      | e      | Δt_min        | Δt_max        | Max. Ratio | Steps
0.00625 | 0.9    | 6.24 × 10^-5  | 1.74 × 10^-2  | 1.01       | 2616
0.01250 | 0.9    | 1.25 × 10^-4  | 3.48 × 10^-2  | 1.02       | 1308
0.02500 | 0.9    | 2.50 × 10^-4  | 6.95 × 10^-2  | 1.03       | 654
0.05000 | 0.9    | 5.00 × 10^-4  | 1.39 × 10^-1  | 1.06       | 327
0.10000 | 0.9    | 1.00 × 10^-3  | 2.77 × 10^-1  | 1.13       | 163
0.05000 | 0.99   | 5.00 × 10^-6  | 1.90 × 10^-1  | 1.088      | 940
0.05000 | 0.999  | 5.00 × 10^-8  | 1.99 × 10^-1  | 1.095      | 2906
0.05000 | 0.9999 | 5.00 × 10^-10 | 1.99 × 10^-1  | 1.096      | 9130

Table 5. Maximum step size ratios, numbers of steps (in thousands) and stage evaluations (in thousands) for DP integration of the reparametrized Kepler problem with e = 0.9999 over 10,000 periods with CS.

Δτ      | Herr_max        | Δt_min        | Δt_max        | Max. Ratio | Steps   | Stage Evals.
0.00625 | 2.3800 × 10^-7  | 6.25 × 10^-11 | 2.50 × 10^-2  | 1.01       | 712,917 | 7,325,113
0.01250 | -1.7543 × 10^-6 | 1.25 × 10^-10 | 5.00 × 10^-2  | 1.02       | 356,461 | 3,771,496
0.02500 | -2.2092 × 10^-6 | 2.50 × 10^-10 | 1.00 × 10^-1  | 1.05       | 178,229 | 1,967,294
0.05000 | 3.2902 × 10^-4  | 5.00 × 10^-10 | 2.00 × 10^-1  | 1.10       | 89,114  | 1,105,262

See, however, Figure 6, which demonstrates that even when using compensated summation, accumulated roundoff error can remain a problem. Compensated summation significantly reduces the size of the accumulated roundoff so that the maximum relative Hamiltonian error is of the correct order, but it is clear that for the larger eccentricities roundoff error is still very visible. So, for example, for the integration with e = 0.999, compensated summation reduces the effect of accumulated roundoff by an order of magnitude.

For comparison, we use two Runge-Kutta-Nyström (RKN) embedded pairs which were first implemented in a publicly available code [25] for special second-order systems (such as those arising in celestial mechanics) and, later, converted to the solution of separable Hamiltonian problems; see [26]. The RKN methods are
• RKL: a sixth-order formula with a fourth-order embedded local error estimate, using six stage evaluations per step;
• RKH: a twelfth-order formula with a tenth-order embedded local error estimate, using seventeen stage evaluations per step.
Both methods are implemented using local error control; that is, the user specifies a (relative) local error tolerance (l.e.t.) to control the error. (In [27], it is observed that these methods can be very competitive for solving Hamiltonian problems.)
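The l.e.t. mechanism in such codes follows the standard textbook step size update: the next step is scaled by (tol/err)^(1/(p+1)) with a safety factor and growth/shrink limits. The sketch below is a generic illustration, not the actual RKL/RKH heuristics, whose details differ; all names and the limit values are ours:

```python
def new_step_size(dt, err, tol, order, safety=0.9, grow=2.0, shrink=0.1):
    """Standard local-error-control step size update for a method of the
    given order: scale dt by safety * (tol/err)^(1/(order+1)), clipped so
    the step can grow by at most `grow` and shrink by at most `shrink`."""
    if err == 0.0:
        return dt * grow                       # error estimate vanished: grow maximally
    factor = safety * (tol / err) ** (1.0 / (order + 1))
    return dt * min(grow, max(shrink, factor))
```

For example, when the estimated error exactly equals the tolerance, the step is simply multiplied by the safety factor 0.9; a grossly failed step is cut by the full shrink factor.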
Since these methods are used with a standard local error control strategy, we expect them to deliver similar accuracy for the same local error tolerance but at different costs (measured in stage evaluations). In both cases, the method is not symplectic, nor Hamiltonian preserving in any other way, so we expect to see the Hamiltonian error grow linearly over long time. Indeed, this is just what we observe; though the error is highly oscillatory (especially for RKL and tol = 10^-14), the overall behavior is linear growth. In Table 6, we show statistics for runs with RKL and RKH corresponding to those in Table 5. First, we choose the l.e.t. tol = 10^-14, thus essentially asking for the maximum accuracy from the integrator. We see that, for both RKL and RKH, the accuracy delivered (measured by the Hamiltonian error) over the whole integration range is smaller than that delivered by the reparametrization approach with Δτ = 0.00625, and at a significantly lower cost, particularly for


[Figure 6: panels (a) e = 0.9 without CS; (b) e = 0.9 with CS; (c) e = 0.999 without CS; (d) e = 0.999 with CS; Herr versus period number.]
Figure 6. The reparametrized Kepler problem; DP integration, Δτ = 0.025, S = 100.

RKH. Indeed, we could integrate efficiently over a much longer range with this tolerance before we reach a point where the Hamiltonian error exceeds that for the reparametrization approach. Next, we show the results of integrations with l.e.t. tol = 10^-11, chosen because it produces about the same size Hamiltonian error as for the reparametrized approach. (The l.e.t. tol = 10^-10 would be too large for this purpose.) Clearly, this approach is significantly more efficient again. (RKL is more efficient than Gauss IRK integrating the reparametrized equations by a factor of over 20 and RKH by a factor of over 200. Of course, a part of this additional efficiency arises from the higher order.) A further advantage of the l.e.t. control approach is that far fewer steps are taken, so there is no need for compensated summation since roundoff error does not significantly impact the computation. Of course, compensated summation may be necessary for much longer integrations, just as in the celestial mechanics calculations described in [22-24].

Table 6. Maximum step size ratios, numbers of steps (in thousands) and stage evaluations (in thousands) for RKN methods in DP integration of the Kepler problem with e = 0.9999 over 10,000 periods.

Method | l.e.t. | Herr_max        | Δt_min        | Δt_max       | Max. Ratio | Steps   | Stage Evals.
RKL    | 10^-14 | -4.2983 × 10^-9 | 4.59 × 10^-10 | 6.00 × 10^-3 | 1.67       | 190,149 | 1,140,892
RKH    | 10^-14 | -3.1523 × 10^-9 | 1.02 × 10^-7  | 4.17 × 10^-1 | 1.85       | 3,763   | 63,977
RKL    | 10^-11 | 4.9981 × 10^-7  | 2.58 × 10^-9  | 2.39 × 10^-2 | 1.50       | 46,591  | 279,548
RKH    | 10^-11 | -2.9364 × 10^-7 | 1.94 × 10^-7  | 6.21 × 10^-1 | 1.70       | 2,043   | 34,736

3. PREDICTION OF THE IRK STAGE VECTORS

In (2.5), we describe a fixed-point iteration for computing the stages (Y1, Y2) in the Gauss IRK equations (2.4). Of course, as in the Hamiltonian case, the system must be nonstiff for there to be any prospect of the fixed-point iteration converging. But, even for nonstiff problems, convergence is only guaranteed for small enough step sizes Δt. Usually, the requirements of accuracy are sufficient to make Δt small enough that the iteration converges. That is the case in our experiments and, below, we simply assume that the iteration converges. This iteration has been discussed widely; see [28-30]. For stiff problems, see [31] and the references therein. Clearly, as long as evaluating the stages is the dominant expense in the iteration, minimizing the number of these evaluations is the route to efficiency. Ignoring the possibility that a faster convergent but more expensive iteration (such as a Newton iteration) may be more efficient, it remains to find inexpensive ways of reducing the number of iterations in the fixed-point method. One such way, considered in Sections 3.3 and 3.4, is to compute, inexpensively, an accurate prediction for the first iterate (Y1^(0), Y2^(0)), hence reducing the number of iterations needed. Another complementary approach is to avoid computing iterates solely to check for convergence when an extrapolant from previous iterates may provide the necessary check; here, we do not consider this complementary approach but, in a production code, it would be combined with accurate prediction.
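For concreteness, one step of the two-stage Gauss IRK method with this fixed-point iteration can be sketched as follows. This is our own minimal Python/NumPy rendering, not the authors' Fortran 90/MATLAB implementation; the function name, tolerance, and iteration cap are ours. The Butcher coefficients are the standard two-stage Gauss (order 4) ones:

```python
import numpy as np

# Two-stage Gauss IRK coefficients (classical, order 4).
s3 = np.sqrt(3.0)
A = np.array([[0.25, 0.25 - s3 / 6.0],
              [0.25 + s3 / 6.0, 0.25]])
b = np.array([0.5, 0.5])

def gauss2_step(f, y, dt, tol=1e-14, maxit=50):
    """One step of the two-stage Gauss IRK method, with the stage values
    computed by fixed-point iteration from the trivial predictor Y_i = y."""
    Y = np.array([y, y], dtype=float)            # trivial predictor
    for _ in range(maxit):
        F = np.array([f(Y[0]), f(Y[1])])         # the two stage evaluations
        Y_new = y + dt * (A @ F)                 # fixed-point map Y = y + dt*A*f(Y)
        converged = np.max(np.abs(Y_new - Y)) < tol
        Y = Y_new
        if converged:
            break
    F = np.array([f(Y[0]), f(Y[1])])
    return y + dt * (b @ F)                      # update with the weights b
```

On a harmonic oscillator, for example, this step conserves the quadratic invariant p² + q² to iteration tolerance and roundoff, as a symplectic Gauss method should.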

3.1. Prediction by Extrapolation

The simplest approach to providing values (Y1^(0), Y2^(0)) for the iteration (2.5) is to use the zeroth-order approximation (Y1^(0), Y2^(0)) = (y_n, y_n); see [1,32]. Then, each successive iteration increases the order of approximation by one. We observe that the first iterate (Y1^(1), Y2^(1)) of the fixed-point iteration (2.5) is what would be obtained by using the forward Euler method on equation (2.1), stepping in turn to t_{n,1} = t_n + (1/2 - √3/6) Δt_{n+1} and to t_{n,2} = t_n + (1/2 + √3/6) Δt_{n+1}, which are nominally the (time) locations of the stages when viewed as approximate solutions to the Hamiltonian ODEs.

To obtain higher-order approximations to the stage values for use as the prediction (Y1^(0), Y2^(0)), many authors have suggested extrapolation from the information computed on the previous step; that is, from the stage values on the step [t_{n-1}, t_n]; see [1,33-36] for starting values for the Gauss IRK methods and [6,37] for the IRK Nyström methods designed for solving special second-order systems of ODEs. Direct extrapolation using the stage values from the previous step can deliver second-order accurate approximations for the stage values on the current step. To achieve higher order, we need additional evaluations of the function f(y) at locations determined to maximize the accuracy. Laburta obtains appropriate formulas of this type delivering third- and fourth-order accurate approximations to the stage values. It turns out that these additional points are located in the current interval, so the method cannot be viewed as a pure extrapolation.

Of course, there are potential dangers associated with the process of extrapolation, particularly when the extrapolation is from data generated on a short step to a following much longer step. Then, clearly, the location of the extrapolated values may be very distant from most or all of the data. Since, in this paper, the extrapolants are polynomials, there is potential for very large errors even, or especially, when using high-order approximations. (Higher-order predictions require higher-degree polynomial extrapolants, and hence, have the potential for faster error growth at large distances from the data.) Even in the "mixed" cases such as those designed by Laburta [33,34], there remains a potential problem due to the possibility that the current step is much longer than the step on which the stage data is available. Indeed, this was precisely the reason why the second author chose to restrict step size increases to a factor of two in the NAG Library Runge-Kutta-Merson code D02PAF [38]. There, dense output at the same order as the local solution was produced using a two-step Hermite polynomial interpolant. Still, it was necessary to restrict step size increases to a factor of two so as to prevent large interpolation errors in the longer interval; see [38] for details.

Laburta [34] gives bounds on the sizes of the truncation error constants (see, for example, (3.5)) for the case where step size changes are limited to an increase by a factor of two and a decrease by a factor of ten. These are reasonable bounds on the size of the changes in the context of a conventional local error control method, but in the context of the variable step method of Section 2.2, there is no way to impose the restrictions thus implied. That is, the step size changes are an implicit result of the arclength reparametrization and they cannot be restricted without impacting the method and its (desirable) mathematical properties. So, in theory, this method may choose a step size sequence for which the relative size of successive steps may be very large. Hence, extrapolation for predicting stage values has the potential for large errors. However, our numerical tests do not reveal that this occurs for the particular choice of reparametrization (2.9) used here when integrating Kepler's problem with a variety of choices of Δτ and eccentricities e.
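The danger of polynomial extrapolation across a step size jump can be seen in a toy experiment of our own (not from the paper): fit a cubic through data on a short previous step of length h, then evaluate it at the end of a following step r times longer. The error grows rapidly with the ratio r:

```python
import numpy as np

# Data from a short previous step of length h: four samples of a smooth solution.
h = 0.1
nodes = np.linspace(0.0, h, 4)
coef = np.polyfit(nodes, np.sin(nodes), 3)   # cubic interpolant of the old data

# Evaluate the extrapolant at the end of a following step that is r times longer.
errs = []
for r in (1.0, 2.0, 10.0):
    t = h + r * h
    errs.append(abs(np.polyval(coef, t) - np.sin(t)))
    print(f"step ratio {r:5.1f}: extrapolation error {errs[-1]:.2e}")
```

The error at ratio 10 is several orders of magnitude larger than at ratio 1, which is exactly the behavior a step size increase restriction is designed to prevent.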
In Table 4, we show the number of steps (in thousands) and the maximum ratios of successive step sizes throughout the whole range of integration for a range of step sizes Δτ with a fixed eccentricity e = 0.9, and for a range of eccentricities with a constant step size Δτ = 0.05, the largest value of Δτ for which a closed elliptical orbit could be computed for each of the eccentricities. These, and other similar results, do not reveal a problem with step size ratios, as the largest increase in step size seen is less than 20%.

3.2. Explicit Runge-Kutta Predictors

In [39,40], it is suggested that predictors for the stage values be based on the use of explicit Runge-Kutta (ERK) formulas. Of course, for this approach to be viable, it is necessary that the ODEs be sufficiently nonstiff that the step taken with the ERK method from the solution y_n generated by the IRK method is not impacted by stability problems to the extent that the prediction is too inaccurate. This is the case in the application in this paper and has been assumed elsewhere [41]. We have already seen that the forward Euler method is effectively the predictor as a result of the natural first fixed-point iteration if we start with the prediction (Y1^(0), Y2^(0)) = (y_n, y_n). This prediction is very widely used, even for stiff problems. And, generally, other ERK methods have at least the stability of forward Euler. In [39,40], the prediction is based on using the internal stages of the well-known ERK method implemented in the software RKF45 [42]. RKF45 implements a six-stage ERK method which gives fifth- and fourth-order results which are used together for local error estimation, and the fifth-order result is used for integration in a local extrapolation mode. The idea in [39] is to use the local fifth-order solution of RKF45 in combination with the two-stage Gauss IRK local solution so as to provide a local error estimate for the latter. At the same time, the stages of RKF45 are to provide a prediction of the stage values for iteration to the solution, either using the fixed-point iteration assumed in this paper or a Newton-type iteration as proposed for stiff problems. The prediction is generated by matching a linear combination of the ERK stages to maximum order to the IRK stage values. Any remaining free variables are chosen by minimizing the sum of the squares of the discrepancies between the prediction and the stage values at the next higher order, so as to minimize the residual error at that order.


3.3. Second-Order Predictors

Here, we construct second-order approximations to the IRK stage values for use as predictors; that is, we find predictors Y_j^(0) such that

    Y_j^(0) = Y_j + O(Δt_{n+1}^3),    j = 1, 2.                          (3.1)

The approach is similar to that described in [39,40] except that here we choose the "locations" of the stages of the ERK method to maximize the accuracy. The main difference between this approach and the normal construction of an ERK method is that there are two "solutions" at different (stage) locations; normally, the aim is to produce one solution or two solutions at the same point (as for RKF45). To derive the methods, we write the predictions in the form

    Y1^(0) = y_n + Δt_{n+1} Σ_{i=1}^{p} b_{1i} f(V_i),
                                                                          (3.2)
    Y2^(0) = y_n + Δt_{n+1} Σ_{i=1}^{p} b_{2i} f(V_i),

where the p vectors V_i are the internal stages of the ERK method. Then, we expand both the predictions and the true stage values in Taylor series about y_n and match terms in the series in powers of Δt_{n+1}. Analysis shows that we need p = 3 to achieve a match to O(Δt_{n+1}^2), so the resulting Butcher tableau has the following form.

    0
    a21   0
    a31   a32   0
    -----------------
    b11   b12   b13                                                       (3.3)
    b21   b22   b23

Recall that the first iteration starting from the prediction y_n uses the (first-order) forward Euler method and so involves an evaluation of f(y_n); that is, the second-order method involves just two additional stage evaluations. There are many possible choices of the constants in tableau (3.3) which give a second-order method. One such choice, whose coefficients involve √3 (among them (-17+12√3)/24, -1/12, (13-8√3)/12, and (-1+4√3)/3), is the tableau (3.4) that we have used in our numerical experiments.

A potentially better choice of constants is one that minimizes some measure of the leading (third-) order terms in the discrepancy between the predictor and the true stage values. For the jth true stage, j = 1, 2, this discrepancy has the form

    Δt_{n+1}^{p+1} Σ_{i=1}^{N_{p+1}} C_{ij}^{(p)} T_i^{(p)}               (3.5)

for the method of order p, where the terms T_i^{(p)} are elementary differentials and the "truncation error constants" C_{ij}^{(p)} depend on the constants in the ERK tableau (such as (3.3) for second-order predictors); see [18] for details. The measure of the third-order error that we minimize is the sum of squares Σ_{j=1}^{2} Σ_{i=1}^{N_{p+1}} [C_{ij}^{(p)}]², as the constants C_{ij}^{(p)} are all that we can vary. So, effectively, we are seeking a better predictor averaged over the class of all ODEs which define the elementary differentials T_i^{(p)}. We have computed many local minima of this measure. One which gives a significantly smaller sum of squares Σ_{j=1}^{2} Σ_{i=1}^{N_{p+1}} [C_{ij}^{(p)}]² than for the constants in tableau (3.4) is given in the tableau below.

     0.0
    +4.981466201178692 × 10^-2    0.0
    -8.025575752711016 × 10^-1   +1.1525393342871244     0.0
    ----------------------------------------------------------------------------------
    -9.709498970272551 × 10^-1   +1.3040919725507836    -1.2181721011834133 × 10^-1
    +3.962514227957572           -4.7366618023191505    +1.5628227089563900
                                                                          (3.6)

The ratio of the square root of the sum of squares Σ_{j=1}^{2} Σ_{i=1}^{N_{p+1}} [C_{ij}^{(p)}]² of tableau (3.4) to that of tableau (3.6) is 120.26. (In tableau (3.6), we have listed sufficient digits for DP IEEE-754 calculations.)
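The second-order conditions behind these constructions are easy to check numerically. The sketch below (ours, in Python/NumPy) verifies that the coefficients of tableau (3.6) reproduce the Gauss stage abscissae c_j = 1/2 ∓ √3/6 through second order, i.e., Σ_i b_{ji} = c_j and Σ_i b_{ji} ĉ_i = c_j²/2, where ĉ_i are the abscissae (row sums) of the internal ERK stages:

```python
import numpy as np

s3 = np.sqrt(3.0)
c = np.array([0.5 - s3 / 6.0, 0.5 + s3 / 6.0])   # Gauss stage abscissae

# Tableau (3.6): internal-stage matrix a and the two rows of weights b.
a = np.array([[0.0, 0.0, 0.0],
              [4.981466201178692e-2, 0.0, 0.0],
              [-8.025575752711016e-1, 1.1525393342871244, 0.0]])
b = np.array([[-9.709498970272551e-1, 1.3040919725507836, -1.2181721011834133e-1],
              [3.962514227957572, -4.7366618023191505, 1.5628227089563900]])

chat = a.sum(axis=1)   # abscissae of the internal ERK stages

# Conditions for a second-order approximation to Y_j ~ y(t_n + c_j * dt):
for j in range(2):
    assert abs(b[j].sum() - c[j]) < 1e-8            # first-order condition
    assert abs(b[j] @ chat - c[j] ** 2 / 2) < 1e-8  # second-order condition
print("tableau (3.6) satisfies the second-order predictor conditions")
```

The tolerances are loose only because the tableau is quoted to finitely many digits; the conditions hold to roundoff.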

3.4. Third-Order Predictors

Here, we need p = 4 to match terms in the Taylor series to third order; that is, the additional cost of producing a third-order prediction in comparison with that of producing a second-order prediction is equivalent to half the cost of a single iteration. The corresponding tableau has the following form.

    0
    a21   0
    a31   a32   0
    a41   a42   a43   0
    -----------------------
    b11   b12   b13   b14                                                 (3.7)
    b21   b22   b23   b24

As in the second-order case, there are many solutions which are third order. When we minimized the sum of squares Σ_{j=1}^{2} Σ_{i=1}^{N_{p+1}} [C_{ij}^{(p)}]² of the truncation error constants in the expression for the fourth-order error term, we computed very many local solutions which give close to the same minimum. That is, we were unable to find a solution which gave a significantly smaller sum of squares, unlike in the second-order case. One such choice, whose coefficients involve 1/12 and multiples of √3, is the tableau (3.8) that we use in our tests below. (When deriving (3.8), we constrained the coefficients as follows: a21 ≤ 1/5, a31 + a32 ≤ 1/3, and a41 + a42 + a43 ≤ 1/2.)


3.5. Numerical Results

For the predictors in previous sections to be worthwhile, they must reduce the overall cost of the fixed-point iteration in comparison with using the "trivial" predictor y_n. Also, the higher the order, the more efficient the predictor should be, at least for short step sizes. In Table 7, we show the approximate average number of iterations and the approximate total number of evaluations (in thousands) of f(y) when integrating Kepler's problem with a variety of step sizes. We consider four predictors:
• the trivial predictor y_n (Trivial);
• the second-order predictor given by tableau (3.4) (Second Order);
• the second-order predictor given by tableau (3.6) (Improved Second Order);
• the third-order predictor given by tableau (3.8) (Third Order).

Generally, these results demonstrate the anticipated improvement in efficiency with the increase in order of the predictor. However, observe that the improved second-order predictor is overall more efficient than the third-order predictor for the larger step sizes.

Table 7. Costs of prediction: average numbers of iterations and stage evals. (in thousands) for the MATLAB implementation integrating the reparametrized Kepler problem with e = 0.9 over 10,000 periods.

        |       Trivial       |    Second Order     |  Imp. Second Order  |     Third Order
Δτ      |  Stage Ev.  Av. It. |  Stage Ev.  Av. It. |  Stage Ev.  Av. It. |  Stage Ev.  Av. It.
0.10000 |   32,602    10.22   |   31,103     8.25   |   28,200     7.34   |   29,540     7.26
0.05000 |   57,612     9.03   |   53,210     6.84   |   48,999     6.18   |   51,168     6.02
0.02500 |  102,337     8.02   |   95,446     5.98   |   86,897     5.31   |   88,938     4.97
0.01250 |  180,939     7.09   |  167,414     5.06   |  165,882     5.00   |  154,653     4.06
0.00625 |  322,067     6.31   |  308,286     4.54   |  283,276     4.05   |  277,662     3.44
4. CONCLUSIONS

We have shown that standard double precision Gauss IRK solutions of Kepler's equation are subject to significant roundoff error effects for both constant and variable step algorithms. A MATLAB (double precision) implementation gives qualitatively similar but quantitatively different results to the corresponding Fortran 90 implementation on the DEC Alpha with IEEE-754 double precision arithmetic used for our experiments. These effects occur even when the phase plane solution maintains its orbit, and for step sizes close to those used in [4]. In the case of the variable step integrator, the roundoff error effects are associated with the many short steps near the point of closest approach. Indeed, in this case, it seems that the solution and the error away from this point maintain some structure even when the structure is lost close to the point. Very long integrations reveal that this roundoff error accumulation (in the error) is approximately linear. Surprisingly, inclusion of a simple, inexpensive compensated summation algorithm almost eliminates the roundoff error accumulation, at least over the ranges of integration used here for the eccentricity e = 0.9. For larger eccentricities, the integration is significantly more difficult and roundoff error accumulates even after applying compensated summation. Indeed, for e = 0.9999, there is only a narrow "window" in the values of Δτ for which an adequate solution may be computed and for which roundoff error accumulation does not dominate the relative Hamiltonian error. When we use the explicit Runge-Kutta-Nyström codes, we observe a significantly more efficient integration for the same overall error. Future work will include implementations of sixth- and higher-order Gauss IRK methods for reparametrized Hamiltonian problems with the aim of determining whether the increase in order can close the efficiency gap with the explicit RKN codes.
We have produced second- and third-order ERK predictors for the stage values in the two-stage fourth-order Gauss IRK method. They behave essentially as expected; that is, generally


the higher the order, the more efficient the predictor. It remains to find higher-order predictors; the (fourth) order of the stage values does not preclude predictors of even higher order being useful.

REFERENCES

1. J.M. Sanz-Serna and M.P. Calvo, Numerical Hamiltonian Problems, Chapman & Hall, London, (1994).
2. A.M. Stuart and A.R. Humphries, Dynamical Systems and Numerical Analysis, Cambridge, (1996).
3. V. Antohe and I. Gladwell, Performance of two methods for solving separable Hamiltonian systems, J. Comp. Appl. Math. 125, 83-92, (2000).
4. M.P. Calvo, M.A. Lopez-Marcos and J.M. Sanz-Serna, Variable step implementation of geometric integrators, Appl. Numer. Math. 28, 1-16, (1998).
5. B. Cano and J.M. Sanz-Serna, Error growth in the numerical integration of periodic orbits, with application to Hamiltonian and reversible systems, SIAM J. Numer. Anal. 34, 1391-1417, (1997).
6. M.P. Laburta, Starting algorithms for implicit Runge-Kutta-Nyström methods, Appl. Numer. Math. 27, 233-251, (1998).
7. E. Hairer, Variable time step integration with symplectic methods, Appl. Numer. Math. 25, 219-227, (1997).
8. W. Huang and B. Leimkuhler, The adaptive Verlet method, SIAM J. Sci. Comput. 18, 239-256, (1997).
9. S. Reich, Backward error analysis for numerical integrators, Preprint SC 96-21, Konrad-Zuse-Zentrum, Berlin, (1996).
10. D.E. Knuth, The Art of Computer Programming, Volume 2, Seminumerical Algorithms, 2nd edition, Addison-Wesley, Reading, MA, (1981).
11. I.V. Viten'ko, Optimum algorithms for adding and multiplying on computers with floating point, USSR Comput. Math. Math. Phys. 8, 183-205, (1968).
12. N.J. Higham, Accuracy and Stability of Numerical Algorithms, SIAM, Philadelphia, PA, (1996).
13. W. Kahan, A survey of error analysis, In Proc. IFIP Congress, Ljubljana, pp. 1214-1239, North Holland, Amsterdam, (1972).
14. W. Kahan, Implementation of algorithms, Technical Report 20, Dept. Comp. Sc., UC Berkeley, (1973).
15. P. Henrici, Error Propagation for Difference Methods, Wiley, New York, (1963).
16. K.E. Atkinson, An Introduction to Numerical Analysis, 2nd edition, Wiley, New York, (1989).
17. E. Isaacson and H.B. Keller, Analysis of Numerical Methods, Wiley, New York, (1966).
18. L.F. Shampine, What everyone solving differential equations numerically should know, In Computational Techniques for Ordinary Differential Equations, Ch. 1, (Edited by I. Gladwell and D.K. Sayers), Academic Press, London, (1980).
19. L.F. Shampine and M.K. Gordon, Computer Solution of Ordinary Differential Equations: The Initial Value Problem, Freeman, San Francisco, CA, (1975).
20. J.R. Dormand, Numerical Methods for Differential Equations, CRC Press, Boca Raton, FL, (1996).
21. C.W. Gear, Algorithm 407, DIFSUB for solution of ordinary differential equations, Comm. ACM 14, 185-190, (1971).
22. G. Quinlan, Round-off error in long-term orbital integrations using multistep methods, Celest. Mech. and Dyn. Astron. 58, 339-351, (1994).
23. T.R. Quinn and S. Tremaine, Roundoff error in long-term planetary orbit integrations, Astron. J. 99, 1016-1023, (1990).
24. T.R. Quinn, S. Tremaine and M. Duncan, A three million year integration of the earth's orbit, Astron. J. 101, 2287-2305, (1991).
25. R.W. Brankin, J.R. Dormand, I. Gladwell, P.J. Prince and W.L. Seward, ALGORITHM 670: A Runge-Kutta-Nyström code, ACM Trans. Math. Softw. 15, 31-40, (1989).
26. I. Gladwell, K. Bouas-Dockery and R.W. Brankin, A Fortran 90 separable Hamiltonian system solver, Appl. Numer. Math. 25, 207-217, (1997).
27. V. Antohe, Numerical methods for Hamiltonian systems, SMU Math Rept. 99-6, (1999).
28. S. Gonzalez-Pinto, C. Gonzalez-Concepcion and J.I. Montijano, Iterative schemes for Gauss methods, Computers Math. Applic. 27 (7), 67-82, (1994).
29. S. Gonzalez-Pinto, J.I. Montijano and L. Randez, Iterative schemes for three-stage implicit Runge-Kutta methods, Appl. Numer. Math. 17, 363-382, (1995).
30. T. Roldan and I. Higueras, IRK methods for DAE: Starting algorithms, J. Comp. Appl. Math. 111, 77-92, (1999).
31. K.R. Jackson, A. Kvaerno and S.P. Norsett, An analysis of the order of Runge-Kutta methods that use an iterative method to compute their internal stage values, BIT 36, 713-765, (1996).
32. E. Hairer and G. Wanner, Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems, Springer-Verlag, Berlin, (1991).
33. M.P. Laburta, Starting algorithms for IRK methods, J. Comp. Appl. Math. 83, 269-288, (1997).
34. M.P. Laburta, Construction of starting algorithms for the RK-Gauss methods, J. Comp. Appl. Math. 90, 239-261, (1998).
35. J. Sand, RK-predictors: Extrapolation methods for implicit Runge-Kutta formulae, DIKU Report Nr. 88/18, (1988).
36. J. Sand, Methods for starting iteration schemes for implicit Runge-Kutta methods, In Computational Ordinary Differential Equations, (Edited by J.R. Cash and I. Gladwell), pp. 115-126, Oxford, (1992).
37. M.P. Laburta, Starting algorithms for the iterations of the RKN-Gauss methods, Appl. Numer. Math. 31, 81-101, (1999).
38. I. Gladwell, Initial value routines in the NAG library, ACM Trans. Math. Softw. 5, 386-400, (1979).
39. I. Gladwell, J. Wang and R.M. Thomas, Iterations and predictors for second order problems, In Proc. 14th IMACS World Conf. on Computational and Applied Mathematics, pp. 1267-1270, (1994).
40. J. Wang, Numerical methods for Hamiltonian systems, Ph.D. Thesis, Department of Mathematics, Southern Methodist University, (1993).
41. I. Gladwell and R.M. Thomas, Efficiency of methods for second-order problems, IMA J. Numer. Anal. 10, 181-207, (1990).
42. L.F. Shampine and H.A. Watts, Practical solution of ordinary differential equations by Runge-Kutta methods, Rept. SAND76-0585, Sandia National Laboratories, (1976).