Defect corrected averaging for highly oscillatory problems

Applied Mathematics and Computation 261 (2015) 90–103
A. Gerisch (b), A. Naumann (a,∗), J. Wensch (a)

a TU Dresden, Fachrichtung Mathematik, Institut für Wissenschaftliches Rechnen, Zellescher Weg 12-14, 01069 Dresden, Germany
b Technische Universität Darmstadt, Fachbereich Mathematik, Numerik und Wissenschaftliches Rechnen, Dolivostr. 25, 64293 Darmstadt, Germany
∗ Corresponding author.

Keywords: Ordinary differential equations; Oscillatory problems; Parabolic PDEs; Krylov methods; Preconditioning

Abstract: The accurate solution of partial differential equations with highly oscillatory source terms over long time scales constitutes a challenging problem. There exists a variety of methods dealing with problems where processes, equations or variables appear on fine and coarse scales. Multiscale methods have in common that they neither fully resolve the fine scale nor completely ignore it. On the one hand, these methods strive, without significantly sacrificing accuracy or essential properties of the system, to be much more efficient than methods that fully resolve the fine scale. On the other hand, they should be considerably more accurate than methods that completely ignore the fine scale. Our defect corrected averaging procedure is based on a modified coarse scale problem that approximates the solution of the fine scale problem at stroboscopic points. Nevertheless, our approximation process is clearly different from the stroboscopic averaging method. We give an error estimate for the solution of the modified problem. The computational efficiency of the approximation is furthermore improved by the application of preconditioning techniques. Tests on numerical examples show the efficiency and reliability of our approach.

1. Introduction

Partial differential equations (PDEs) with a highly oscillatory temporal source term constitute challenging problems for numerical time integrators. A variety of methods has been developed to treat these problems without resolving the microscale oscillations, among them the heterogeneous multiscale framework [1] as well as several averaging concepts like stroboscopic averaging [2] and the mollified impulse method [3]. In the methodology of the heterogeneous multiscale framework there is a microscopic model and a macroscopic model of the problem. Compression and reconstruction operators serve to switch between these two state spaces. The aim is to accurately approximate the macroscopic state of the system, see [1]. Thus, a macroscopic scheme is used to approximate the macroscopic state, whereas the effects on the microscopic scale have to be taken into account in order to update the macroscopic state. Our defect corrected averaging approach can be seen as a special case of this methodology, although we feel that it is more accurately summarized under the concept of stroboscopic averaging. Within the framework of stroboscopic averaging, the flow of the original highly oscillatory problem is approximated by the flow of an averaged problem, where the accuracy is high at so-called stroboscopic points. Each function evaluation during the





integration of the averaged macroscale problem is approximated by microscale simulations of the underlying highly oscillatory problem over some periods and an appropriate averaging. In the case of semidiscretized PDEs, the computational cost of the stroboscopic averaging approach increases strongly with the size of the semidiscretized system, rendering the method less useful. In this paper, we consider initial value problems (IVPs) for systems of linear ODEs containing a highly oscillatory periodic source term s(t) with small period size ε > 0,

y′(t) = Ly(t) + s(t),  s(t) = s(t + ε),  y(t0) = y0 ∈ R^m.   (1)

Such problems arise from the semidiscretization of parabolic PDEs with a highly oscillatory, periodic temporal source term such as considered in Section 3. A simple way to deal with the fine scale term is to average the source term over an interval of length ε,

s̄ = (1/ε) ∫_0^ε s(t) dt,

and solve the averaged ODE

ȳ′(t) = Lȳ + s̄,  ȳ(t0) = y0,   (2)

which is independent of the small scale parameter ε. Linear ODEs of this class can be solved with standard integrators or with specialized integrators like exponential integrators [4]. In this paper we present and investigate, in Section 2, a defect corrected averaging method for problem (1). The aim of this method is to reduce the error y(t) − ȳ(t) between the solutions of (1) and (2) at the stroboscopic time points t = nε. The method is further improved by the application of preconditioning techniques. After that, a model problem is discussed in Section 3 and the accuracy and computational efficiency of the new method with respect to standard stiff integrators is shown in Section 4. Finally, we discuss our results and present conclusions in Section 5.

2. The defect corrected averaging method

2.1. Exact solution of linear ODEs with periodic source at stroboscopic points

The exact solution of the IVP for a linear ODE system with time-dependent source f(t), that is

y′(t) = Ly(t) + f(t),  y(0) = y0,   (3)

is given by

y(t) = exp(tL)y0 + exp(tL) ∫_0^t exp(−τL) f(τ) dτ.   (4)

For a polynomial source term in (3), i.e. f(t) = Σ_{k=0}^K g_k t^k, there is, see [5], an explicit representation of the solution in the form

y(t) = φ0(tL)y0 + t Σ_{k=0}^K φ_{k+1}(tL) g_k t^k.   (5)

Here the analytic functions φk(z) are recursively defined by

φ0(z) = exp(z),  φ1(z) = (φ0(z) − 1)/z,   (6)

φ_{k+1}(z) = (φk(z) − 1/k!)/z,  k ≥ 1.   (7)

In the case of a constant source term f(t) = g0, the solution of (3) is thus given by

y(t) = φ0(tL)y0 + t φ1(tL) g0.   (8)
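For readers who want to experiment with these formulas, the following small Python sketch (our own illustration, not part of the original paper) evaluates φ0 and φ1 for a dense test matrix and checks the constant-source formula (8) against a direct ODE solve; the matrix L, the vectors y0 and g0 and the tolerances are arbitrary illustrative choices.

```python
# Minimal check of formula (8): y(t) = phi0(tL) y0 + t*phi1(tL) g0.
import numpy as np
from scipy.linalg import expm, solve
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)
m = 5
L = -2.0 * np.eye(m) + 0.3 * rng.standard_normal((m, m))   # stable test matrix (assumption)
y0 = rng.standard_normal(m)
g0 = rng.standard_normal(m)                                 # constant source term
t = 0.7

phi0 = expm(t * L)                                          # phi_0(tL) = exp(tL)
phi1 = solve(t * L, phi0 - np.eye(m))                       # phi_1(tL) = (tL)^{-1}(exp(tL) - I)

y_formula = phi0 @ y0 + t * (phi1 @ g0)                     # formula (8)

sol = solve_ivp(lambda s, y: L @ y + g0, (0.0, t), y0, rtol=1e-10, atol=1e-12)
print(np.linalg.norm(y_formula - sol.y[:, -1]))             # should be close to machine precision
```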

The representation of the exact solution of (3) with polynomial source term f(t) by the functions φk, k ≥ 0, is applied in the derivation of adaptive Runge–Kutta methods, where φ0 is chosen as a rational approximation to the exponential, see [5], as well as in the derivation of exponential integrators, where φ0 is assumed to be the exponential function itself, see [4]. Now we consider problem (3) with an ε-periodic source term f. We rewrite its solution (4) at the stroboscopic point t = nε in the form

y(t) = y(nε) = exp(tL)y0 + exp(tL) Σ_{k=0}^{n−1} exp(−kεL) ∫_0^ε exp(−τL) f(τ) dτ   (9)

= exp(tL)y0 + (exp(tL) − I)(I − exp(−εL))^{-1} ∫_0^ε exp(−τL) f(τ) dτ   (10)

= exp(tL)y0 + t (exp(tL) − I)/(tL) · [ε (exp(εL) − I)/(εL)]^{-1} exp(εL) ∫_0^ε exp(−τL) f(τ) dτ   (11)

= exp(tL)y0 + t φ1(tL) (ε φ1(εL))^{-1} exp(εL) ∫_0^ε exp(−τL) f(τ) dτ.   (12)

It has to be pointed out that for large scale problems the computation of the matrix functions exp and φ1 is costly and only possible by approximations. For an overview of methods to compute or approximate the matrix exponential we refer to the classic paper [6]. The approximation of matrix functions like φ1 is even more complicated. Thus, the representation (12) of the exact solution at stroboscopic points cannot be used directly. However, we will exploit its structure. In a more abstract setting the result above can be interpreted as follows. Let S(t0, t1, y0, f) denote the solution operator of the IVP (3) with source term f, i.e. we have y(t1) = S(t0, t1, y0, f) for y(t0) = y0. We only consider the case t0 = 0 in the following. Whenever the source term f(t) in (3) is a constant vector, i.e. f(t) = v, we have

S(0, t, 0, v) = t φ1(tL) v.   (13)

Furthermore, the following important properties of the solution operator will be used later. The solution y of (3) at time t1 can be written as

y(t1) = S(0, t1, y0, f) = S(0, t1, y0, 0) + S(0, t1, 0, f),   (14)

where the last term is linear in the source term, in particular

S(0, t1, 0, α1 f1 + α2 f2) = α1 S(0, t1, 0, f1) + α2 S(0, t1, 0, f2).   (15)

Finally, we formally define an inverse solution operator S(0, t, 0, ·)^{-1}, which maps a solution value y(t) = z at time t to the constant source term v that generates this solution at time t starting from t0 = 0 and y0 = 0, that is, using (8),

S(0, t, 0, ·)^{-1}(z) = v = (t φ1(tL))^{-1} z.   (16)

We now return to the solution (12) of problem (3) with ε-periodic source term f. Let v be the constant vector such that S(0, ε, 0, v) = S(0, ε, 0, f), that is, using (16),

v := S(0, ε, 0, ·)^{-1} ∘ S(0, ε, 0, f) = (ε φ1(εL))^{-1} S(0, ε, 0, f).   (17)

Now we rewrite (12) as

y(t) = y(nε) = S(0, t, y0, 0) + t φ1(tL) (ε φ1(εL))^{-1} S(0, ε, 0, f)   using (4) twice   (18)
= S(0, t, y0, 0) + t φ1(tL) v   using (17)   (19)
= S(0, t, y0, 0) + S(0, t, 0, v)   using (13)   (20)
= S(0, t, y0, v)   using (14)   (21)
= S(0, t, y0, ·) ∘ S(0, ε, 0, ·)^{-1} ∘ S(0, ε, 0, f).   (22)

Hence we can write the solution at the stroboscopic points as the iterated application of the solution operator and its formal inverse to the ε-periodic source f. Eq. (22) forms the basis for the definition of our defect corrected averaging method.

2.2. Definition of the defect corrected averaging method

Let, as before, f be ε-periodic and consider the exact solution of (3) at the stroboscopic points t = nε in the representation (22). This expression can be interpreted as follows: find a constant source term that generates, starting from y0 = 0, a flow that coincides at t = ε with the flow generated by the periodic source term; afterwards use the flow corresponding to that constant source term, starting at y0, to evaluate the exact solution of the original highly oscillatory problem at the stroboscopic point t = nε. A numerical evaluation of (22) involves

1. the computation of S(0, ε, 0, f), which requires the numerical solution of the ODE in (3) over a period ε, where a sufficient number of small steps of the time integrator of choice has to be applied in order to resolve the source term f;
2. the computation of the constant source term v = S(0, ε, 0, ·)^{-1} ∘ S(0, ε, 0, f); and
3. the computation of S(0, t, y0, v), which requires the numerical solution of the ODE in (3) with the constant source term v over the interval [0, t] and thus can be done with a larger step size.

A small numerical illustration of these three steps is sketched below.
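The following Python sketch runs the three steps on a toy problem; all data and solver settings are illustrative assumptions of ours (the paper's test problems appear in Section 3). The constant source v is obtained here from the small dense relation (17), and the resulting coarse flow is compared with a fully resolved reference at the stroboscopic point.

```python
# Toy demonstration of evaluating (22) at a stroboscopic point t = n*eps.
import numpy as np
from scipy.linalg import expm, solve
from scipy.integrate import solve_ivp

m, eps, n = 4, 1e-2, 50
rng = np.random.default_rng(1)
L = -np.diag(rng.uniform(1.0, 5.0, m))                 # simple dissipative test matrix
y0 = rng.standard_normal(m)
s_bar, a = rng.standard_normal(m), rng.standard_normal(m)
f = lambda t: s_bar + a * np.sin(2 * np.pi * t / eps)  # eps-periodic source (assumed form)

def S(t1, y0, src, max_step=np.inf):
    """Numerical solution operator S(0, t1, y0, src) for y' = L y + src(t)."""
    sol = solve_ivp(lambda t, y: L @ y + src(t), (0.0, t1), y0,
                    rtol=1e-11, atol=1e-12, max_step=max_step)
    return sol.y[:, -1]

# Step 1: S(0, eps, 0, f), resolving the source over one period.
b = S(eps, np.zeros(m), f, max_step=eps / 20)
# Step 2: constant source v with S(0, eps, 0, v) = b; for this small dense
#         example we use the exact relation v = (eps*phi1(eps*L))^{-1} b.
phi1 = solve(eps * L, expm(eps * L) - np.eye(m))
v = solve(eps * phi1, b)
# Step 3: propagate y0 with the constant source v up to the stroboscopic point.
y_strob = S(n * eps, y0, lambda t: v)

y_ref = S(n * eps, y0, f, max_step=eps / 20)           # fully resolved reference
print(np.linalg.norm(y_strob - y_ref))                 # small, up to integration error
```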


The method outlined above requires the evaluation of the inverse operator on S(0, ε, 0, f), where f in general has a large mean value. We aim to reduce the associated numerical difficulties and errors by splitting the source term f into its mean and its oscillatory part. Therefore, we apply the procedure above in combination with the simple averaging technique discussed in the introduction and arrive at a defect correction procedure. To this end let f̄ denote the time-average of the ε-periodic source f over an interval of length ε. Then, thanks to S(0, ε, 0, ·)^{-1} ∘ S(0, ε, 0, f̄) = f̄ and property (15), we can rewrite the solution (22) as

y(t) = y(nε) = S(0, t, y0, ·) ∘ (f̄ + S(0, ε, 0, ·)^{-1} ∘ S(0, ε, 0, f − f̄)).   (23)

From this formula it is seen that the defect f − f̄ is corrected by (an approximation of) fc := S(0, ε, 0, ·)^{-1} ∘ S(0, ε, 0, f − f̄). In other words, we have to find a vector fc fulfilling the condition

S(0, ε, 0, fc) = S(0, ε, 0, f − f̄).   (24)

Because of the linearity of S in its last argument, see property (15), this task is nothing more than solving a linear system of equations Av = b, where b := S(0, ε, 0, f − f̄), but the matrix A is not given explicitly. We are only able to compute the application of A to a vector v by solving the corresponding problem (3) with constant source f(t) = v and y0 = 0, i.e.

Av = S(0, ε, 0, v).   (25)
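A minimal matrix-free realization of (25) might look as follows, assuming SciPy's LinearOperator and gmres: the operator application integrates the ODE over one period, exactly as described in the text, while the concrete matrix, source and tolerances are illustrative placeholders of ours.

```python
# Matrix-free Krylov solve of S(0, eps, 0, v) = S(0, eps, 0, f - f_bar).
import numpy as np
from scipy.integrate import solve_ivp
from scipy.sparse.linalg import LinearOperator, gmres

m, eps = 6, 1e-2
rng = np.random.default_rng(2)
L = -np.diag(rng.uniform(1.0, 8.0, m))                  # small diagonal test matrix (assumption)
a = rng.standard_normal(m)
defect = lambda t: a * np.sin(2 * np.pi * t / eps)      # plays the role of f - f_bar

def S_const(v):
    """Apply A: integrate y' = L y + v over one period with y(0) = 0."""
    v = np.asarray(v).ravel()
    sol = solve_ivp(lambda t, y: L @ y + v, (0.0, eps), np.zeros(m),
                    rtol=1e-10, atol=1e-12)
    return sol.y[:, -1]

def S_defect():
    """Right hand side b = S(0, eps, 0, f - f_bar), resolving the oscillation."""
    sol = solve_ivp(lambda t, y: L @ y + defect(t), (0.0, eps), np.zeros(m),
                    rtol=1e-10, atol=1e-12, max_step=eps / 50)
    return sol.y[:, -1]

A = LinearOperator((m, m), matvec=S_const)              # A v = S(0, eps, 0, v)
b = S_defect()
fc, info = gmres(A, b, atol=1e-10)                      # defect correction f_c from (24)
print(info, np.linalg.norm(S_const(fc) - b))            # residual of the correction
```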

This is the standard situation in which to apply a Krylov subspace solver to the linear system Av = b. Here, we apply the GMRES method [7]. We now summarize the procedure of our defect corrected averaging method (DCA method) for the numerical solution of problem (3) with ε-periodic source f. Let t = te > 0 be the time point of interest (te is not necessarily a stroboscopic point). Then the following steps are performed in order to compute the numerical approximation ye ≈ y(te):

1. Compute the averaged source term f̄ either analytically or numerically.
2. Compute b := SF(0, ε, 0, f − f̄) with a suitable fine scale time integrator with fine scale stepsize hF.
3. Compute the correction fc as the solution of the linear system SG(0, ε, 0, v) = b with the GMRES method, where all necessary evaluations of SG(0, ε, 0, v) are obtained using the fine scale time integrator, whereas a larger stepsize hG > hF may be used.
4. Let n be as large as possible such that the stroboscopic point nε satisfies nε ≤ te. Compute yn := SC(0, nε, y0, f̄ + fc) with a suitable coarse scale time integrator with coarse scale stepsize hC.
5. If nε = te then ye := yn; otherwise compute ye := SF(0, te − nε, yn, f) using the fine scale time integrator in order to finally obtain the numerical solution at te.

Note that three different stepsize values are involved: the fine scale stepsize hF, the coarse scale stepsize hC and the stepsize hG used in the GMRES iteration. Obviously, step 5 of the procedure above, with the appropriate value of n, can be applied several times when the solution at several intermediate time points not equal to stroboscopic points is sought. However, to keep the method computationally efficient, the distance between these points should be substantially larger than ε.

2.3. Error analysis

In the defect corrected averaging method, we compute

b = SF(0, ε, 0, f − f̄),   (26)

EG + b = SF(0, ε, 0, fc),   (27)

using a suitable fine scale integrator SF with tolerance TOLF. The term EG denotes the GMRES residual after solving the system S(0, ε, 0, v) = b up to the absolute tolerance TOLG. After that, we compute the solution at the stroboscopic point t = nε using a suitable coarse scale integrator with tolerance TOLC. Therefore, the error at t = nε,

E = ynε − y(nε) = SC(0, nε, y0, fc + f̄) − S(0, nε, y0, f)
  = SC(0, nε, y0, fc + f̄) − S(0, nε, y0, fc + f̄)   (=: E1)
  + S(0, nε, y0, fc + f̄) − S(0, nε, y0, f),   (=: E2)

consists of the two parts E1 and E2. The term E1 corresponds to the error of the coarse scale integrator SC and we get ‖E1‖ = O(hC^{qC}) ≤ TOLC, where qC is the order of the coarse scale integrator. For the second part E2, we obtain

E2 = S(0, nε, 0, fc + f̄ − f) = Σ_{k=0}^{n−1} S(0, kε, S(0, ε, 0, fc + f̄ − f), 0).


Now, with (26) and (27) we have

EG = SF(0, ε, 0, fc + f̄ − f),

easily establishing the estimate

‖S(0, ε, 0, fc + f̄ − f)‖ ≤ TOLG + 2 TOLF.

Now, if the right hand side Ly + f obeys a Lipschitz condition with Lipschitz constant K, we have the classical sensitivity estimate ‖S(0, t, y, 0)‖ ≤ exp(tK) ‖y‖ and therefore

‖E2‖ ≤ Σ_{k=0}^{n−1} e^{kεK} (TOLG + 2 TOLF) ≤ n e^{nεK} (TOLG + 2 TOLF).

If the system is dissipative, we have

‖E2‖ ≤ n (TOLG + 2 TOLF).

If the right hand side Ly + f even satisfies a one-sided Lipschitz condition with one-sided Lipschitz constant l < 0, we get

‖E2‖ ≤ Σ_{k=0}^{n−1} e^{klε} (TOLG + 2 TOLF) = (1 − e^{nlε})/(1 − e^{lε}) · (TOLG + 2 TOLF).

Thus, for dissipative systems we have established the estimate

‖E‖ ≤ n (TOLG + 2 TOLF) + TOLC.   (28)

Therefore we propose to choose the tolerances as

TOLG ≈ TOLF ≈ TOLC / n.
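As an illustration of this rule (numbers of our own choosing, not taken from the experiments below): integrating up to te = 100ε gives n = 100, so a target accuracy TOLC = 10^{-4} at the stroboscopic end point suggests TOLG ≈ TOLF ≈ 10^{-6} in steps 2 and 3 of the DCA procedure.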

2.4. Preconditioning

The defect corrected averaging scheme depicted above requires the solution of the linear system S(0, ε, 0, fc) = S(0, ε, 0, f − f̄). The number of iterations needed by the iterative Krylov solver GMRES depends on the structure of the eigenvalues of the operator S(0, ε, 0, ·). For purely real eigenvalues, the residual reduction of GMRES is discussed in [8]. The speed of convergence 1/τ can be approximated via

τ ≈ γ + √(γ² − 1),  where γ = (λmax + λmin)/(λmax − λmin)

depends on the ratio of the largest and smallest eigenvalue. For example, applying the matrix function φ1(z) to the discrete Laplacian leads to the spectrum S = (−1/Δx², 0]. This means that the smallest eigenvalue tends to −∞ when the grid is refined. Furthermore, the error reduction τ tends to 1 and the GMRES solver loses efficiency. We seek preconditioners of the form P(L), where P is a rational function. The perfect preconditioner would provide the reciprocal of the matrix function φ1. Therefore, we approximate

φ1(z)^{-1} = z/(exp(z) − 1)

by a rational function P(z) and use P(L) as a preconditioner. The approximation P(z) has to coincide with φ1^{-1}(z) at the origin and to match at least the first derivative there. Therefore, we seek a rational function P(z) fulfilling the conditions

P(0) = 1,   (29)
P′(0) = −1/2,   (30)

and, for the rational preconditioner below, additionally the asymptote for z → −∞,

P(z) ≈ −z.

The two assumptions (29) and (30) lead to the linear preconditioner

P1(z) = 1 − 0.5 z,   (31)


Fig. 1. Preconditioners P1 and P2 on the negative real axis.

Fig. 2. Eigenvalues of the preconditioned systems.

whereas the additional asymptotic condition leads to the rational preconditioner

P2(z) = (z² − 2z + 2)/(−z + 2).

Both preconditioners P1 and P2 together with φ1(z)^{-1} along the negative real axis are sketched in Fig. 1. Applying the preconditioner P1 to the matrix function φ1(L), each eigenvalue λ ∈ σ(L) is mapped by the nonlinear function

P1φ1(λ) = (1 − λ/2) · (e^λ − 1)/λ.

This function has no extrema for λ < 0 and therefore maps the previous spectrum S to the narrower spectrum

σ(P1φ1) = ((0.5 λmax − 1)/λmax, 1) ≈ (0.5, 1)   (λmax → −∞)   ⇒ τ ≈ 3.

Putting those new eigenvalues into the expression for γ and the error reduction τ shows that the reduction is always larger than 3, even for small eigenvalues λ. Much more interesting is the behavior of the preconditioner P2. Here, the eigenvalues λ of P2(L)φ1(L) are

P2φ1(λ) = (λ² − 2λ + 2)/(−λ + 2) · (e^λ − 1)/λ.

The function P2φ1(z) has one maximum at z* = −2.261079369365234 with P2φ1(z*) = 1.081708259011712 and attains its minimum value 1 at the boundary z = 0, which can be seen in Fig. 2. In consequence, the error reduction τ of GMRES with the preconditioned system P2φ1 is near 50 for systems with small eigenvalues, or, in the case of the discrete Laplacian, for finely discretized meshes.

The function P2 φ 1 (z) has one maximum at z∗ = −2.261079369365234 with P2 φ 1 (z∗) = 1.081708259011712 and minimizes at the boundary z = 0 to 1, which can be seen in the Fig. 2. In consequence, the error reduction τ of GMRES with the preconditioned system P2 φ 1 is near 50 for systems with small eigenvalues, or in case of the discrete Laplacian for fine discretized meshes. The solution operator S(0,  , 0, fc ) corresponds to εφ 1 (ε L). To apply the preconditioners above on S(0, ε , 0, ·), we have to change the argument L to ε L:

P3 = I − 0.5 L P4 =

 2 L2 − 2 L + 2I . − L + 2I

All arguments above are applicable to the ODE case y

(32) (33)

= f(y), i.e. without any mass matrix. However, finite element discretization leads to an implicit ODE of the form MY = f(Y). For those types, we have to modify the preconditioners to

P3,FEM = M − 0.5 L P4,FEM =

( L)2 − 2M L + 2M2 . 2M −  L
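The following sketch, again with illustrative data rather than the paper's test setup, compares GMRES iteration counts without preconditioning and with the linear preconditioner P3 from (32) for a 1D finite difference Laplacian. Here the operator ε φ1(εL) is formed exactly as a dense matrix, which is feasible only for this small example; the tolerances are SciPy defaults plus a small absolute tolerance, so the counts are indicative only.

```python
# GMRES with and without the P3 preconditioner for A = eps*phi1(eps*L).
import numpy as np
from scipy.linalg import expm, solve
from scipy.sparse.linalg import LinearOperator, gmres

N, eps = 50, 1.0
dx = 1.0 / N
n = N - 1
L = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / dx**2            # 1D Laplacian, Dirichlet boundaries

A = solve(L, expm(eps * L) - np.eye(n))                # eps*phi1(eps*L) = L^{-1}(exp(eps*L) - I)
b = np.random.default_rng(3).standard_normal(n)        # arbitrary right hand side

iters = {"none": 0, "P3": 0}
def counter(key):
    def cb(_): iters[key] += 1
    return cb

gmres(A, b, atol=1e-8, restart=n, callback=counter("none"), callback_type="pr_norm")

def apply_P3(r):
    # P3(eps*L) = I - 0.5*eps*L approximates phi1(eps*L)^{-1}, so an approximate
    # inverse of A = eps*phi1(eps*L) is M r = (1/eps) * (r - 0.5*eps*(L @ r)).
    r = np.asarray(r).ravel()
    return (r - 0.5 * eps * (L @ r)) / eps

M = LinearOperator((n, n), matvec=apply_P3)
gmres(A, b, atol=1e-8, restart=n, M=M, callback=counter("P3"), callback_type="pr_norm")

print(iters)   # preconditioning should cut the iteration count noticeably
```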


3. A model problem

Let Ω = (0, W) × (0, H) with positive width W and height H be the spatial domain. In Ω we consider the temperature u(t, x) of a homogeneous material subject to a spatio-temporal influx along the upper boundary part Γ = {x = (x1, H)} of Ω and zero flux along the remaining three boundary edges, i.e.

∂t u(t, x) = −∇ · (−D∇u(t, x))   for x ∈ Ω and t > 0,   (34)
u(0, x) = 0   for x ∈ Ω,   (35)
−D∇u(t, x) · n(x) = −gf(t, x)   for x = (x1, H) ∈ Γ and t > 0,   (36)
−D∇u(t, x) · n(x) = 0   for x ∈ ∂Ω \ Γ and t > 0.   (37)

In the above, D denotes the heat conduction coefficient and gf(t, x) is the heat source on Γ. The heat stems from friction between the material at Γ and another piece of material P moving periodically along Γ. The shape of P is encoded in the shape function SP(ξ) ∈ {0, 1} for ξ ∈ R. The coordinate ξ = 0 denotes a reference position on P. The movement of P along Γ is described by the position X(t) of this reference position on Γ. The indicator function of P on Γ is thus given by

χP(t, x1) = SP(x1 − X(t))   for (x1, H) ∈ Γ and t > 0.

The heat flux into Γ due to friction is
• present only at positions where P is located;
• half the total heat flux due to friction (the other half goes into P) and proportional to the absolute velocity V(t) = |X′(t)| of P along Γ with coefficient of proportionality Ff, the friction force.

Taken together this yields

gf(t, x) = χP(t, x1) (Ff/2) V(t)   (38)
= SP(x1 − X(t)) (Ff/2) |X′(t)|.   (39)

3.0.1. Example

A piece P of length 2L is described by

SP(ξ) = χ[0,1](|ξ|/L).

A typical movement of P is a high-frequency periodic movement with frequency 1/ε along a part of Γ, i.e.

X(t) = W/2 + (W/4) cos(ωt),   ω = 2π/ε.

This yields

V(t) = |X′(t)| = |(ωW/4) sin(ωt)|.
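A direct transcription of the friction source (38) and (39) for this example movement could look as follows; ε and Ff loosely follow Table 1, while the width W and the half-length Lp of the piece are placeholder values of ours.

```python
# Friction heat flux g_f(t, x1) along the loaded boundary, cf. (38)-(39).
import numpy as np

W, Lp, eps, Ff = 4.0, 0.25, 5.0, 5.978e-3
omega = 2.0 * np.pi / eps

def S_P(xi):
    """Shape function of the piece P: 1 on [-Lp, Lp], 0 elsewhere."""
    return (np.abs(xi) <= Lp).astype(float)

def X(t):                        # position of the reference point of P
    return W / 2.0 + (W / 4.0) * np.cos(omega * t)

def V(t):                        # absolute velocity of P
    return np.abs(-(omega * W / 4.0) * np.sin(omega * t))

def g_f(t, x1):
    """Friction source: half the frictional heat, only where P is located."""
    return S_P(x1 - X(t)) * 0.5 * Ff * V(t)

x1 = np.linspace(0.0, W, 9)
print(g_f(0.3 * eps, x1))        # flux profile at one time instant
```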

Friction is not the only mechanism which leads to heat influx at the interface of Γ and P. A second mechanism is a heat flux due to temperature differences between the boundary Γ and the piece P or the surrounding medium (air). To add this to the model, first let θ(t, ξ) denote the temperature of P at time t and position ξ and θ0 denote the temperature of the surrounding medium. The exterior temperature uext(t, x) is then given by

uext(t, x) = θ(t, x1 − X(t))   if x ∈ Γ and χP(t, x1) = 1,
uext(t, x) = θ0   otherwise.

The new boundary condition on all of ∂Ω then reads, using Newton's law of cooling,

−D∇u(t, x) · n(x) = hht (u(t, x) − uext(t, x)) − gf(t, x)   if x ∈ Γ,
−D∇u(t, x) · n(x) = hht (u(t, x) − uext(t, x))   otherwise,

where hht is the heat transfer coefficient.


Fig. 3. The 2D geometry with boundary conditions; Γ(0) and Γ(τ) denote the friction boundary at times 0 and τ. (For interpretation of the references to colour in the text, the reader is referred to the web version of this article.)

Table 1. Parameter values for the 2D-problem.

Parameter | Value     | Parameter          | Value
ε         | 5         | N                  | 100
D         | 15.1e−6   | Steps per period   | 1600
Ff        | 5.978e−3  | Cpu time (ROS2)    | 4588 s

Finite element semidiscretization. Semidiscretization of the model problem (only friction, without Newton's law of cooling) on a uniform grid with bilinear finite elements yields the system of stiff ODEs

My′(t) = Ky(t) + f(t)   for t > t0 = 0

with initial condition y(t0) = y0 = 0. Here M and K are the mass and stiffness matrices of the FE discretization, respectively, and f(t) describes the effect of the boundary conditions (friction).

4. Numerical examples

In our numerical examples we solved the model problem described in Section 3 with two different geometries. The first test set consists of a 4:1 rectangle together with the boundary conditions depicted in Fig. 3 and the parameters shown in Table 1. The rectangle is discretized with equidistant finite differences. With this example, we show the efficiency of the preconditioners with different periods ε and different grid spacings 1/N. Furthermore, the accuracy, as analyzed in Section 2.3, and the gained speedup will be presented. Our second geometry is a 3D CAD geometry. It represents a machine tool, introduced in [9]. The boundary conditions are as depicted in Fig. 6, where the red surface represents the heat loaded rail. This load consists of many different, but periodic, fast movements. Those fast movements lead to periods in the range of a few seconds, which is small compared to the total simulation interval of several hours. Furthermore, the velocity of the moving part changes at each full hour, which leads to different periods ε. In this experiment we concentrate on the error reduction and speedup with respect to the ROS2 [10,11] time integrator.

4.1. A 2D-problem

Our 2D test case, a 4:1 rectangle, has the boundaries depicted in Fig. 3. The red and blue bars represent the friction boundary at t = 0 and at a second point t = τ, respectively. We are interested in the solution at te = 100ε, where the source has run 100 times along the whole left boundary. The 4:1 rectangle is discretized using an equidistant gridspacing h = 1/N with a finite difference stencil of order 2. We compare the efficiency of the preconditioners above at different gridsizes and different periods. After that we show the speedup with respect to the linear implicit solver ROS2 and the accuracy increase compared to the simple averaging at the end time.

Preconditioning. Our tests of the preconditioners were run with ε ranging from 0.625 to 48. The system Aw = SF(0, ε, 0, f − f̄) was solved using GMRES with a relative tolerance of TOLG = 10^{-12}. Table 2 contains the GMRES iterations for different grid spacings using the period size ε = 5. Without preconditioning, the number of GMRES iterations increases with the number of grid points.

Table 2. Number of GMRES iterations with and without preconditioning for the 2D-problem.

N    | None | P1 | P2 | P3 | P4
10   | 6    | 6  | 6  | 3  | 4
50   | 14   | 13 | 13 | 7  | 7
100  | 26   | 21 | 20 | 11 | 8
200  | 54   | 32 | 26 | 15 | 8

Table 3. Number of GMRES iterations for different period sizes ε and different numbers of gridpoints N.

      |       N = 100      |       N = 200
ε     | None | P3 | P4     | None | P3 | P4
48.0  | 86   | 17 | 8      | 166  | 18 | 9
24.0  | 61   | 16 | 8      | 119  | 17 | 8
12.0  | 43   | 14 | 8      | 84   | 16 | 8
5.0   | 26   | 11 | 8      | 54   | 15 | 8
2.5   | 18   | 9  | 7      | 38   | 13 | 7
1.25  | 13   | 7  | 6      | 26   | 11 | 7
0.625 | 10   | 5  | 6      | 18   | 9  | 7

Fig. 4. Solution of the 2D example using the fine scale integrator ROS2 at t = 100ε with 80,000 steps.

All four preconditioners have the effect that the number of GMRES iterations increases much more slowly than without preconditioning. The preconditioners P3 and P4 are more efficient than P1 and P2. Preconditioner P4 affects the number of iterations more than P3, but is also more costly in its application. The advantage of preconditioning increases with the number of grid points. The additional costs for the application of the preconditioners are small compared to the computation of the right hand side b = SF(0, ε, 0, f − f̄), which involves the solution of linear systems of the same size. We also tested the dependency of the number of GMRES iterations on the period size ε for the preconditioners P3 and P4, see Table 3. Without any preconditioner, the number of iterations increases linearly with the number of points (compare N = 100 and N = 200) and slightly less than linearly with the period size ε. Note that, using the preconditioner P4, the number of iterations is nearly constant, independent of ε and the number of gridpoints N.

Speedup and accuracy. Fig. 4 shows the solution at t = te computed using the implicit ROS2 method. The only recognizable change in temperature occurs near the left boundary. We expect the highest discretization errors in this region. After averaging the source, the solution becomes symmetric with respect to the centerline, so the largest differences to the original solution also occur on this boundary. Whereas the solution of the original problem takes nearly one and a half hours, the averaged system is solved with large timesteps in less than a minute, at the expense of accuracy. Using the defect corrected averaging method, we balance between accuracy and runtime. To show the efficiency of the method, we define the speedup S with respect to the fine scale integrator as S = cpu(error, finescale)/cpu(error, dca). For the fine scale integrator, we used 1600 steps per period to get a maximum error of efine = 10^{-5}. Our fine scale integrator is of order two, so we can scale the fine scale cpu time to the error edca of the defect corrected solution. Numerical experiments showed that the order decreases at stepsizes larger than ε/50, so the fine scale integrator should use at most this stepsize. Putting those assumptions together, we scale the cpu time of the fine scale integrator to the corresponding error of the defect corrected solution,

cpu(edca, finescale) = cpu(fine) / min(√(edca/efine), 32),

and get the speedup S(edca) = cpu(edca, finescale)/cpu(edca, dca).
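A small helper implementing this scaled speedup, under our reading of the capped scaling (the cap 32 = 1600/50 reflects that the fine scale integrator may not use a stepsize larger than ε/50); the default values follow the reference run of Table 1 and are otherwise placeholders.

```python
# Speedup of a DCA run against a fine scale run scaled to the same error.
import math

def speedup(e_dca, cpu_dca, e_fine=1e-5, cpu_fine=4588.0):
    """S(e_dca) = cpu(e_dca, finescale) / cpu(e_dca, dca), with capped scaling."""
    cpu_fine_scaled = cpu_fine / min(math.sqrt(e_dca / e_fine), 32.0)
    return cpu_fine_scaled / cpu_dca
```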


Table 4. Speedup obtained by defect corrected averaging using hG = ε/16 for different coarse scale stepsizes hC and different fine scale stepsizes hF = ε/50, . . . , ε/800.

hC \ hF | ε/50 | ε/100 | ε/200 | ε/400 | ε/800
1ε      | 7    | 12    | 29    | 33    | 34
2ε      | 11   | 25    | 38    | 52    | 39
10ε     | 20   | 28    | 30    | 20    | 8
20ε     | 24   | 20    | 17    | 9     | 5

Table 5. Errors ||yref − ycor||∞ with hC = ε for different fine scale step sizes hG = ε, . . . , ε/16 and hF = ε/50, . . . , ε/800.

hG \ hF | ε/50    | ε/100   | ε/200   | ε/400   | ε/800
ε/1     | 3.6e−02 | 2.5e−02 | 2.4e−02 | 2.4e−02 | 2.4e−02
ε/2     | 1.8e−02 | 6.1e−03 | 4.5e−03 | 4.5e−03 | 4.4e−03
ε/4     | 1.6e−02 | 3.0e−03 | 1.4e−03 | 1.1e−03 | 1.0e−03
ε/8     | 1.6e−02 | 2.3e−03 | 6.3e−04 | 3.3e−04 | 2.7e−04
ε/16    | 1.6e−02 | 2.1e−03 | 4.5e−04 | 1.7e−04 | 9.7e−05

Table 6. Errors ||yref − ycor||∞ with hG = ε/16 for different coarse scale stepsizes hC = ε, . . . , 20ε and different fine scale step sizes hF = ε/50, . . . , ε/800, as well as the accuracy ||yref − yav||∞ obtained by simple averaging.

hC \ hF | ε/50    | ε/100   | ε/200   | ε/400   | ε/800   | ||yref − yav||∞
1ε      | 1.6e−02 | 2.1e−03 | 4.5e−04 | 1.7e−04 | 9.7e−05 | 4.7e−01
2ε      | 1.6e−02 | 2.2e−03 | 5.1e−04 | 2.2e−04 | 1.6e−04 | 4.7e−01
4ε      | 1.5e−02 | 2.4e−03 | 7.5e−04 | 4.5e−04 | 3.9e−04 | 4.8e−01
10ε     | 1.5e−02 | 4.1e−03 | 2.4e−03 | 2.2e−03 | 2.1e−03 | 4.8e−01
20ε     | 2.1e−02 | 1.0e−02 | 9.0e−03 | 8.9e−03 | 8.9e−03 | 4.8e−01

The speedup S, shown in Table 4, depends on the times needed for the preprocessing steps

1. computation of b = SF(0, ε, 0, f − f̄),
2. solution of SG(0, ε, 0, fc) = b,

and in addition on the time for the solution of SC(0, t, y0, f̄ + fc) using a coarse scale solver. This last step is as fast as solving the averaged problem and is less significant. Most of the time is spent in step 1 and step 2. Having this in mind, using the preconditioner is necessary to reach a significant speedup. Most of the time is then spent in step 1, where the right hand side b is computed with high accuracy. Accuracy with respect to the original problem is controlled by the errors in the right hand side and the solution steps. To concentrate on the errors introduced by the approximation with the constant source, we solved the coarse scale problem with a fixed stepsize hC = ε. The errors are depicted in Table 5. Following the diagonal in Table 5 from top left to bottom right, the reduction by an order of two with respect to hF as well as hG is clearly visible. Additionally, we fixed the stepsize hG to the value ε/16, which corresponds to the last row in Table 5. Now we vary the stepsizes hC and hF. Comparing these solutions with the reference solution leads to the differences depicted in Table 6. Whenever the coarse scale integrator is applied with a sufficiently small stepsize hC, we expect the values on the main diagonal (from bottom left to top right) to behave similarly to second order. Here, the estimated order of convergence varies between 2.4 and 1.8 for the larger values of the stepsizes hF and hG, but drops down to 1.2 for the smallest values of the stepsizes hF and hG due to the coarse scale error.


Fig. 5. Accuracy increase with respect to the averaged solution when simultaneously halving hG and hF, using a fixed number of GMRES steps.

Table 7. Solution times in s using preconditioner P3.

ε (s) | Direct | Averaged | Corrected
48    | 2971   | 9        | 115
24    | 4611   | 9        | 110
12    | 9276   | 9        | 106

Table 8. Scaled errors ||err||∞ · 10^4 in the averaged and corrected solution at t = 3600 s.

ε (s) | Averaged | Corrected
48    | 33.2     | 0.377
24    | 96.8     | 2.18
12    | 251      | 8.91

Table 9. Number of GMRES iterations using no preconditioner or preconditioner P3.

ε (s) | None | P3
48    | 156  | 18
24    | 169  | 17
12    | 150  | 17

Note that a relatively large stepsize hG = ε/16 can be used in the GMRES iteration in order to achieve a sufficiently accurate solution, because of the constant source term. Our last experiment investigates how much the choice of TOLG affects the accuracy. In all experiments above a high accuracy GMRES solution has been computed with TOLG = 10^{-12}. Using the preconditioner (33), we needed eight GMRES steps to get the exact solution. As shown in the error analysis section, there is no need to do so. To demonstrate this, we ran GMRES with one, two and three steps, leading to tolerances of about 10^{-1}, 10^{-3} and 10^{-5}, respectively. The corresponding errors are shown in Fig. 5. Using one step, we got a tolerance of 0.1, so further halving the timesteps leads to no accuracy increase. With two and three steps, the residual in the correction problem is reduced by approximately 10^{-2} and 10^{-4}, so the error ||y(t) − yc(t)|| is reduced according to the order of the underlying fine scale integrators.

4.2. A real world 3D-example

In analogy to the two-dimensional case, the problem was solved using the second order Rosenbrock method ROS2. The solution at time t = 3600 s can be seen in Fig. 7. The time needed to solve the highly oscillatory problem increases linearly with decreasing period size ε, which can be seen in Table 7. The defect corrected averaging method has to resolve the periods only once, in the correction phase, and its total runtime is therefore independent of the period size ε. In analogy to the finite difference solution, we depict the errors of the corrected problem in Table 8 and the run times in Table 7. Even for this problem type, using the preconditioner reduces the number of GMRES iterations by a factor of about 10 (Table 9).


Fig. 6. The geometry of the 3D FEM example together with the red area marking the friction boundary condition. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

5. Discussion and conclusions

The numerical solution of PDEs with highly oscillatory source terms constitutes an ongoing computational challenge. Whereas for highly oscillatory mechanical systems with a low number of variables, i.e. represented by an ODE system of small dimension, averaging methods like stroboscopic averaging and the HMM perform satisfactorily, this is not the case for semi-discretized PDEs with oscillating sources, i.e. represented by an ODE system of typically very large dimension. The simple averaging technique, applied to such high-dimensional problems, yields a good speed-up but at the cost of significantly losing accuracy in the computed approximation, cf. Table 6. Such a loss in accuracy is typically not acceptable. In order to improve the accuracy of the simple averaging technique for linear ODE systems with highly oscillatory source terms, we developed, analyzed, and tested in this work a new defect corrected averaging method. In particular, this method computes a suitable correction to compensate for the defect between the original highly oscillatory and the time-averaged source term as used in the simple averaging technique. The formula underlying the defect corrected averaging method returns the exact solution of the highly oscillatory problem at stroboscopic time points as long as the time integration of the micro- and macroscale problems and the solution of a linear system of equations appearing in the method are performed exactly. In the presence of approximation errors for these problems, due to the use of numerical time integration schemes (fine scale integrator, coarse scale integrator) and the GMRES method for the iterative solution of the linear system, we have presented an error analysis of the defect corrected averaging method which, for dissipative systems, gives a bound on the global error at stroboscopic time points t = nε that depends on n, the tolerances required in the fine scale and coarse scale integrator, and the tolerance in the GMRES method. The analysis also hints at the optimal relative size of these


Fig. 7. The solution computed with defect corrected averaging at t = 1 h with ε = 12 s, hF = ε/100, hG = ε/8.

tolerances. Additionally, we derived two method-adjusted preconditioners in order to speed up the convergence of the GMRES iteration. Linear ODE systems arising from finite difference and finite element discretizations in space of two- and three-dimensional PDE problems with highly oscillatory (in time) source terms were used as numerical test cases. The expected error reduction as a function of the time step size (tolerance) in the fine scale and coarse scale integrator as well as of the number of GMRES iteration steps was observed. Additionally, using GMRES with the proposed preconditioners yields a substantial reduction of the required GMRES iteration steps and consequently a significant reduction of the overall required simulation time.

Acknowledgment

The authors gratefully acknowledge the funding of this work by the German Research Foundation (DFG) within the SFB/Transregio 96.

References

[1] E. Weinan, B. Engquist, The heterogeneous multiscale methods, Commun. Math. Sci. 1 (1) (2003) 87–132. http://projecteuclid.org/getRecord?id=euclid.cms/1118150402
[2] M.P. Calvo, P. Chartier, A. Murua, J.M. Sanz-Serna, Numerical stroboscopic averaging for ODEs and DAEs, Appl. Numer. Math. 61 (10) (2011) 1077–1095, doi:10.1016/j.apnum.2011.06.007.
[3] J.M. Sanz-Serna, Mollified impulse methods for highly oscillatory differential equations, SIAM J. Numer. Anal. 46 (2) (2008) 1040–1059, doi:10.1137/070681636.
[4] M. Hochbruck, A. Ostermann, Exponential integrators, Acta Numer. 19 (2010) 209–286, doi:10.1017/S0962492910000048.


[5] K. Strehmel, R. Weiner, Behandlung steifer Anfangswertprobleme gewöhnlicher Differentialgleichungen mit adaptiven Runge–Kutta-Methoden, Computing 29 (1982) 153–165.
[6] C. Moler, C. Van Loan, Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later, SIAM Rev. 45 (1) (2003) 3–49, doi:10.1137/S00361445024180.
[7] Y. Saad, M. Schultz, GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput. 7 (1986) 856–869.
[8] Y. Saad, Krylov subspace methods for solving large unsymmetric linear systems, Math. Comput. 37 (155) (1981) 105–126. http://www.jstor.org/stable/2007504
[9] K. Großmann, C. Städel, A. Galant, A. Mühl, Berechnung von Temperaturfeldern an Werkzeugmaschinen, Z. wirtschaftlichen Fabrikbetrieb 6 (2012) 452–456.
[10] J. Verwer, E. Spee, J. Blom, W. Hundsdorfer, A second-order Rosenbrock method applied to photochemical dispersion problems, SIAM J. Sci. Comput. 20 (4) (1999) 1456–1480.
[11] J. Lang, Adaptive Multilevel Solution of Nonlinear Parabolic PDE Systems: Theory, Algorithm, and Applications, vol. 16, Springer, Berlin, 2001.