
Residual Scaling Techniques in Multigrid, II: Practical Applications

Jun Zhang

Department of Mathematics, The George Washington University, Washington, DC 20052

ABSTRACT

This paper focuses on the practical applications of the multigrid residual scaling techniques and is the continuation of a companion paper: Residual scaling techniques in multigrid, I: Equivalence proof [Appl. Math. Comput. 86:283-303 (1997)]. We discuss the computational issues of some residual scaling techniques which have been proven mathematically equivalent. A heuristic residual analysis technique, based on the geometry of the grid points and the relaxation pattern, is introduced to estimate the optimal residual scaling factor for a high-order multigrid method. We compare the performance of a typical pre-optimization (pre-acceleration) technique with a typical post-optimization (post-acceleration) technique and show that pre-optimization is preferable in both convergence and efficiency. Our numerical results support the theoretical conclusions made in the companion paper and demonstrate the full advantage of the pre-optimization technique over the post-optimization technique. © Elsevier Science Inc., 1998

1. INTRODUCTION

In this paper, we study some effective techniques to accelerate the standard multigrid procedure for solving systems of linear equations of the form

A^h u^h = f^h,  (1)

which result from some discretized partial differential equations. The superscript h denotes the grid size of the discretization. (However, when



there is no confusion, we may drop the superscript.) The coefficient matrix A^h is usually sparse and of large dimension, but it may be nonsymmetric and non-positive definite. For many (but not all) symmetric positive definite problems, the multigrid method has been shown to be very efficient [1-3], but nonsymmetric problems frequently cause difficulties: the standard multigrid method converges slowly, or even diverges, when it is used to solve nonsymmetric problems. Hence, acceleration techniques are needed to obtain a solution of the required accuracy within reasonable computer time. Acceleration techniques of various kinds have been designed by several authors to accelerate the standard multigrid method in different situations; see [4-12] for some approaches and applications.

These acceleration techniques can be categorized into two groups. The first group contains the pre-acceleration (pre-optimization) techniques, which accelerate the multigrid process before the coarse grid procedure. One notable pre-optimization technique is the minimal residual smoothing, introduced into multigrid by Zhang [11], which minimizes the residual norm before the residual is projected to the coarse grid. Other heuristic pre-acceleration techniques are the pre-scaling techniques, which scale the residual vector by a pre-determined scaling factor α before it is projected to the coarse grid. Published pre-scaling techniques are the over-weighted residual technique of Brandt and Yavneh [4], the under-injection technique of Zhang [10], and the heuristic residual analysis of Zhang [10]. The essential differences among these pre-scaling techniques lie in the particular applications and in the heuristic methods used to determine the scaling factor α. None of the above-mentioned pre-acceleration techniques requires that the coefficient matrix A^h be symmetric positive definite; they are thus advantageous in terms of applicability.

The other group of multigrid acceleration techniques consists of the post-acceleration (post-optimization) techniques, which accelerate the multigrid process after the coarse grid procedure. The published techniques of this group are the steplength optimization technique of Reusken [6, 13], the damped multigrid iteration of Hackbusch and Reusken [14], and the over-correction technique of Míka and Vaněk [5, 8]. The philosophy behind these post-acceleration techniques is essentially the same: they minimize the error in the energy norm in each iteration during or after the coarse-grid-correction process. These techniques are also referred to by Zhang [12] as the post-scaling techniques because they scale the correction term by a scaling factor chosen to minimize the error in the energy norm. We have shown in [12] that the post-scaling techniques are generally more expensive than the pre-scaling ones. Also, most of the above-mentioned post-optimization techniques require that the coefficient matrix A^h be symmetric positive definite, and the application of these post-optimization techniques is severely limited by this requirement.


In [12] we proved that, regardless of how the scaling factor is determined, the pre-scaling and post-scaling techniques are mathematically equivalent if and only if their scaling factors are equal. Hence, we unified these techniques as the residual scaling techniques. Because of the equivalence theorem (Theorem 3.2 of Zhang [12]) of the residual scaling techniques, and because the pre-scaling techniques are generally cheaper (see the discussions in [12]) and have a wider application range, they are preferred in applications over the post-scaling techniques. The remaining issue is to compute or estimate the scaling factor, say α.

Following the theoretical investigation in the companion paper [12], the current paper continues the study of multigrid residual scaling techniques and focuses primarily on the issues of practical applications and numerical verification. We use the heuristic residual analysis to estimate the optimal scaling factor for the residual injection operator in a high-order multigrid method. Although we remarked that the post-optimization technique is more expensive than the pre-optimization technique, we do not know which one is more efficient in solving realistic problems. We will compare a typical pre-optimization technique, the minimal residual smoothing technique of Zhang [11], with a typical post-optimization technique, the over-correction technique of Vaněk [8], and declare the winner.

Although the theoretical results of [12] were proved for the two-level method, they can be generalized to the multigrid method straightforwardly. Assuming the scaling factor α has been determined, the multigrid method with pre-scaling acceleration techniques may be formulated as follows.

ALGORITHM 1.1. Multigrid method with pre-scaling acceleration scheme.

u^h <- Multigrid(A^h, u^h, f^h)

Given any initial guess u_0^h. For k = 0, 1, 2, ..., do:
  If Ω^h is the coarsest grid, then
    Solve u_k^h = (A^h)^{-1} f^h.
  Else
    Relax ν_1 times on A^h u_k^h = f^h with the given initial guess u_k^h.
    Compute r_k^h = f^h - A^h u_k^h.
    Scale r_k^h = α r_k^h.
    Restrict r_k^{2h} = R r_k^h.
    Set f^{2h} = r_k^{2h}. Set u_k^{2h} = 0.


    Do u_k^{2h} <- Multigrid(A^{2h}, u_k^{2h}, f^{2h}) μ times.
    Correct u_{k+1}^h = u_k^h + P u_k^{2h}.
    Relax ν_2 times on A^h u_{k+1}^h = f^h with the initial guess u_{k+1}^h.
  End if.

In Algorithm 1.1, R is the restriction operator and P is the interpolation operator; μ is the number of coarse-grid-correction processes. The multigrid method with post-scaling acceleration schemes can be formulated similarly (see [12]). A sketch of Algorithm 1.1 in code is given at the end of this section.

The rest of the paper is organized as follows. In Section 2 we analyze the multigrid residual transfer operators and give a justification for using a nontrivial residual injection operator as a substitute for the expensive full-weighting operator. We also discuss the relation between the pre-scaling factor and the successive over-relaxation (SOR) parameter. In Section 3, we introduce the minimal residual smoothing technique and use the heuristic residual analysis to derive an optimal scaling factor for the residual injection operator in a high-order multigrid method. In Section 4 we conduct several numerical experiments to show the advantages of the pre-scaling techniques and the optimal residual injection operator, and to compare the efficiency of the pre-optimization and post-optimization techniques. Conclusions and remarks are included in Section 5.
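As a concrete illustration, the following Python sketch implements one cycle of Algorithm 1.1. It is illustrative only (the experiments reported in this paper were coded in FORTRAN 77): the per-level operators A and the routines relax, restrict, and interpolate, standing for the smoother, R, and P, are hypothetical arguments supplied by the caller.

    import numpy as np

    def multigrid(A, u, f, alpha, level, relax, restrict, interpolate,
                  nu1=1, nu2=1, mu=1):
        # One cycle of Algorithm 1.1: multigrid with pre-scaled residuals.
        if level == 0:                      # coarsest grid: solve exactly
            return np.linalg.solve(A[0], f)
        for _ in range(nu1):                # pre-smoothing, nu_1 sweeps
            u = relax(A[level], u, f)
        r = f - A[level] @ u                # compute the residual
        r = alpha * r                       # pre-scaling: r <- alpha * r
        f2 = restrict(r)                    # project scaled residual to coarse grid
        u2 = np.zeros_like(f2)              # zero initial coarse-grid guess
        for _ in range(mu):                 # mu coarse-grid-correction processes
            u2 = multigrid(A, u2, f2, alpha, level - 1,
                           relax, restrict, interpolate, nu1, nu2, mu)
        u = u + interpolate(u2)             # coarse-grid correction
        for _ in range(nu2):                # post-smoothing, nu_2 sweeps
            u = relax(A[level], u, f)
        return u

With mu = 1 this is a V(ν_1, ν_2)-cycle and mu = 2 gives a W-cycle; setting alpha = 1 recovers the standard multigrid method.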

2. ANALYSIS OF RESIDUAL TRANSFER OPERATORS

2.1. Optimal Injection Operator

One standard way of carrying out the residual projection is to weight a linear combination of the neighboring residuals to the grid point of interest. Suppose the residuals at the nearest n + 1 grid points are to be weighted to form the transferred value r̂_0.

DEFINITION 2.1. A residual projection operator is linear if the residual is transferred as a linear combination of the residuals at the nearest neighboring grid points, that is,

r̂_0 = Σ_{i=0}^{n} w_i r_i,  (2)

where 0 is the point of interest, 1, 2, ..., n are the n nearest neighboring grid points, and r̂_0 is the quantity to be transferred to the coarse grid.

Residual Scaling Techniques

233

DEFINITION 2.2. A linear residual projection operator is consistent if the weights in (2) satisfy the consistency condition

Σ_{i=0}^{n} w_i = 1.  (3)

Most familiar residual projection operators, for example, full-weighting (n = 8), half-weighting (n = 4), and full-injection (n = 0), are consistent linear operators. Some examples of nonlinear projection operators are de Zeeuw's matrix-dependent restriction operators [15].

If the residual vector on the fine grid is sufficiently smooth, there is probably no need to weight residuals; that is, n = 0 in Definition 2.1 is sufficient. According to the multigrid philosophy, the residual is projected to the coarse grid only after it is smoothed by the pre-smoothing sweeps. If the residual vector is smooth, the magnitude of the residual at any one particular grid point should not differ significantly from those at its nearest neighboring grid points under the same relaxation pattern.

ASSUMPTION 2.1 (Smooth Condition). After sufficient relaxation sweeps, the residuals at the grid points subject to the same relaxation sweeps are locally equal.

From (2), it is clear that if the Smooth Condition (Assumption 2.1) is satisfied and the same relaxation is applied to all grid points, any consistent weighting scheme is equivalent to injection, which is computationally optimal. However, trivial injection may not be accurate in all cases because the particular relaxation method used may affect the residuals at neighboring grid points differently. Especially when a patterned relaxation such as red-black Gauss-Seidel (RBGS) is used, the residuals at the nearest neighboring grid points subject to different relaxation sweeps are obviously not equal. But their relation may be revealed by careful analysis of the relaxation pattern and the location of the grid points. This is the heuristic residual analysis technique that we will discuss later. Once we know how the relaxation sweeps affect the residuals at different grid points and derive the relations among the residuals at the nearest neighboring grid points, we will be able to derive a nontrivial residual injection operator which is competitive with the full-weighting operator in terms of computational efficiency.
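For concreteness, the weights of the three consistent operators just named can be written out and checked against the consistency condition (3); the following is a minimal illustrative sketch (the stencils are the standard ones, not reproduced from this paper).

    import numpy as np

    # Weights w_i of common consistent residual projection operators (Definition 2.1):
    full_weighting = np.array([[1, 2, 1],
                               [2, 4, 2],
                               [1, 2, 1]]) / 16.0   # n = 8 neighbors
    half_weighting = np.array([[0, 1, 0],
                               [1, 4, 1],
                               [0, 1, 0]]) / 8.0    # n = 4 neighbors
    full_injection = np.array([[0, 0, 0],
                               [0, 1, 0],
                               [0, 0, 0]])          # n = 0, trivial injection

    for w in (full_weighting, half_weighting, full_injection):
        assert abs(w.sum() - 1.0) < 1e-15           # consistency condition (3)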

2.2. Accuracy of Transfer Operator

The proof of the mesh-size-independent rate of convergence of multigrid assumes that the restriction and interpolation operators satisfy certain


conditions. These conditions have been laid down by Brandt [1] and Hackbusch [16]. In his monograph on multigrid, Hackbusch gave the following simple condition [16, p. 149]:

m_R + m_P > 2m,  (4)

where m_R is the order of the restriction operator R and m_P is the order of the interpolation operator P. The order of an operator is defined as the highest degree plus one of the polynomials that are interpolated exactly by P or ηR*, where R* is the adjoint of R (see [3]) and η is a scaling factor that can be chosen freely; 2m is the order of the underlying partial differential equation to be solved.

Typical examples of restriction operators are trivial injection and weighting schemes of some kind such as (2). Injection transfers the residual at the common points of the fine and coarse grids directly and has an order of 0 (see [16, p. 66]). A weighting scheme computes the residuals on the fine grid and projects a weighted average to each point on the coarse grid. Full-weighting has an order of 2, since it is adjoint to the bi-linear interpolation, which is an operator of the second order. Examples given by Wesseling [3] demonstrated that condition (4) may be necessary for some problems to work properly. For example, if we solve a Poisson equation using the bi-linear interpolation operator, then the restriction operator with the lowest possible order satisfying condition (4) is some kind of weighting scheme, for example, the second-order full-weighting scheme.

Condition (4) may be necessary for the convergence proof, where general cases are considered, but it may not be computationally optimal. As Hackbusch [16, p. 64] noted, the injection operator is optimal with respect to the computational work. In practice, when multigrid with RBGS relaxation is used to solve the Poisson equation with the five-point discretization, the half-injection (the injection scaled by a factor of 1/2) operator is much more efficient than full-weighting. This is a classical example that violates condition (4) but is preferred in practice for the sake of computational efficiency. (However, we have shown in [10] that the scaling factor 1/2 is not optimal; a more accurate scaling factor (8 - √2)/16 has been proposed.) This (and other) practical examples suggest that, although trivial injection may not be robust with respect to the energy norm [17], a carefully scaled non-trivial injection may be more cost-effective than full-weighting. The success of half-injection also suggests that the relaxation pattern be taken into consideration when we analyze the residual relation among different


grid points (to determine a non-trivial injection scaling factor). For a particular multigrid method, we always know the relaxation scheme a priori, and careful analysis can yield a more efficient restriction operator. Moreover, the geometry of the grid points should be considered in determining the correct weights in formula (2), and so should the extent to which the residual at a particular grid point influences the quantity to be projected to the coarse grid. This is the basis of the heuristic residual analysis [10], which has been used successfully to derive an optimal injection scaling factor resulting in an almost 40% reduction in the computational work with respect to the standard full-weighting operator, when a Poisson equation is solved by multigrid with the five-point RBGS relaxation.

The order of the interpolation operator P must be high enough to suppress the high frequency components that might pop up when the (coarse grid) error correction is interpolated back to the fine grid. However, it seems unnecessary to put such an abstract requirement on the restriction operator, except that it should be accurate enough to represent the residual features on the fine grid. This may usually be satisfied by using a (correctly scaled) non-trivial injection operation.
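As a worked check of condition (4) for the Poisson equation (2m = 2), the two pairings discussed above read:

m_R + m_P = 2 + 2 = 4 > 2 = 2m   (full-weighting with bi-linear interpolation),
m_R + m_P = 0 + 2 = 2, not > 2   (trivial injection with bi-linear interpolation),

so the injection-based operators studied in this paper trade the formal condition (4) for computational efficiency.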

2.3. Relation between Residual Scaling and SOR

The convergence of the multigrid method may usually be accelerated by employing a successive over-relaxation (SOR) step after the basic relaxation (smoothing) step. Suppose that the previous iterate is u_{k-1} and the current iterate after the pre-smoothing is ũ_k; a typical SOR step to obtain the new iterate u_k has the form

u_k = u_{k-1} + ω(ũ_k - u_{k-1}),  (5)

where ω is the SOR parameter. Suppose that the linear system to be solved is (1) (with the superscript dropped) and the residual at each iteration is defined as

r_k = f - Au_k;

then it is easy to verify that, corresponding to the SOR step (5), the following relation among the residuals at the previous and current steps holds:

r_k = r_{k-1} + ω(r̃_k - r_{k-1}),  (6)

where r̃_k = f - Aũ_k,


although this step is not carried out explicitly. Hence the residual iteration is also an SOR step with the same parameter. The SOR parameter ω actually scales the residual in some sense. If the residual is subsequently scaled by a scaling factor α, a change of the SOR parameter ω affects the effect of the residual scaling factor α. On the other hand, a change of α has no effect on the previous SOR step. Note that we can only vary ω in the interval (0, 2) to ensure convergence, but our numerical experiments showed that α can have a much larger interval of convergence.

So far, our numerical experiments showed that, given a residual scaling factor α, a suitable choice of the SOR parameter ω may accelerate the convergence, particularly when the basic relaxation step is not effective. However, the acceleration rate depends on the optimality of α. If α has been chosen optimally, the acceleration due to SOR is insignificant. On the other hand, if α is not properly chosen, the acceleration due to SOR may be significant. However, as with the application of SOR in the classical iterative methods, the issue of estimating the optimal ω for many practically interesting problems is still an open question. Some limited attempts at estimating ω in the context of multigrid with RBGS smoothing have been published by Yavneh [18, 19] for anisotropic problems and by Zhang [9] for isotropic problems.
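The relation (6) follows from (5) by linearity and is easy to confirm numerically; the following short sketch (purely illustrative, with arbitrary data) applies the SOR step (5) to an iterate and checks (6).

    import numpy as np

    rng = np.random.default_rng(0)
    n = 50
    A = rng.standard_normal((n, n)) + n * np.eye(n)  # an arbitrary nonsingular matrix
    f = rng.standard_normal(n)
    u_prev = rng.standard_normal(n)                  # previous iterate u_{k-1}
    u_tilde = rng.standard_normal(n)                 # smoothed iterate before SOR
    omega = 1.3                                      # SOR parameter

    u_new = u_prev + omega * (u_tilde - u_prev)      # SOR step (5)

    r_prev = f - A @ u_prev
    r_tilde = f - A @ u_tilde
    r_new = f - A @ u_new
    # relation (6): r_k = r_{k-1} + omega * (r~_k - r_{k-1})
    assert np.allclose(r_new, r_prev + omega * (r_tilde - r_prev))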

3. PRACTICAL APPLICATIONS OF RESIDUAL SCALING TECHNIQUES

We have studied the theoretical aspects of the residual scaling techniques in the companion paper [12] and discussed some issues concerning the residual transfer operators in Section 2 of this paper. We remarked that the pre-scaling (pre-optimization) technique (Algorithm 1.1) is more attractive than the post-scaling (post-optimization) technique (Algorithm 3.2 in [12]) in terms of the range of potential applications and efficiency. The computational issue of the pre-scaling techniques is, of course, to find some method to estimate the optimal residual scaling parameter α. As we remarked in Section 2, the optimal α is usually problem-dependent (at least dependent on the relaxation method and the residual restriction operator employed). However, for a particular method, we may be able to estimate α by some means. On the other hand, by studying the relation between the pre-scaling technique and the SOR acceleration, we know that the smoothing and the scaling may be linked. Hence, the idea of computing the SOR parameter ω and the pre-scaling parameter α according to some prescribed rule looks very attractive. However, we should always keep the application advantages in mind, which means that a useful method employed to compute or estimate these parameters should not rely on good properties (symmetry and positive definiteness) of the coefficient matrix.


In this section, we study two pre-scaling techniques. The first is the so-called minimal residual smoothing technique, which, by our earlier remark, is actually an indirect residual scaling technique. The second is a heuristic residual analysis method which determines the optimal residual scaling parameter based on the relaxation pattern and the geometry of the grid points. This method was used successfully in [10] to derive an optimal scaling parameter for the multigrid residual injection operator with the five-point second-order RBGS relaxation for solving the isotropic operators (the Poisson-type equations). In this section, we will use a similar idea to derive an optimal residual injection scaling factor for the nine-point fourth-order RBGS relaxation for solving the isotropic (diffusion-dominated) operators.

3.1. The Minimal Residual Smoothing Technique

Assume that a sequence of iterates {u_k} is generated by a multigrid procedure and that the residual sequence corresponding to {u_k} is denoted by {r_k}. There is a technique that generates from {u_k} and {r_k} a new sequence {v_k} that has a "smoothed" residual sequence {s_k}. This is the so-called minimal residual smoothing (MRS) technique, introduced by Schönauer [20] to smooth and stabilize the iterates generated by conjugate gradient-type methods. The MRS technique was extensively analyzed by Weiss [21] in the context of the generalized conjugate gradient methods. In a recent paper [11], we introduced the minimal residual smoothing technique to accelerate the standard multigrid procedure and showed that the acceleration rate can be very attractive when the standard multigrid iteration converges poorly (see [11]). We formulate the minimal residual smoothing technique of [20, 21] as follows:

ALGORITHM 3.1. Minimal residual smoothing technique [20, 21]

Initialize s_0 = r_0 and v_0 = u_0.
For k = 1, 2, ..., do:
  Compute u_k and r_k.
  Compute λ_k = -s_{k-1}^T (r_k - s_{k-1}) / ‖r_k - s_{k-1}‖_2^2.
  Set v_k = v_{k-1} + λ_k(u_k - v_{k-1}) and s_k = s_{k-1} + λ_k(r_k - s_{k-1}).

In Algorithm 3.1, ‖·‖_2 is the Euclidean (or other proper) norm. The last step of Algorithm 3.1 is reminiscent of the SOR step (5) and the implicit residual step (6). The MRS parameter λ_k plays a role like that of the SOR parameter ω. Hence, the MRS technique may be viewed as an


SOR-like procedure with a prescribed rule to compute the SOR parameter ω (= λ_k). Note that in Algorithm 3.1, λ_k is chosen to minimize

‖f - A[(1 - λ)v_{k-1} + λu_k]‖_2

over all λ ∈ R, where R is the set of all real numbers. The resulting residuals s_k obviously have a non-increasing Euclidean norm, that is, ‖s_k‖_2 ≤ ‖s_{k-1}‖_2 and ‖s_k‖_2 ≤ ‖r_k‖_2 for each k. The cost of the minimal residual smoothing technique is 10 operations per grid point if the Euclidean norm is employed, and in this case the cost is independent of the coefficient matrix of the underlying linear system.

In [11], we employed the MRS procedure to accelerate the standard multigrid method by inserting the MRS procedure (Algorithm 3.1) just after the residuals on the finest grid are computed and before they are projected to the coarse grid. At each multigrid iteration, we replace the original multigrid iterate u_k and its residual iterate r_k by the MRS iterate v_k and the corresponding "smoothed" residual iterate s_k. We then project the "smoothed" residual s_k to the coarse grid to form a coarse grid sub-problem. (Note that we must replace both the multigrid iterate u_k and its residual r_k at the same time; otherwise the coarse grid sub-problem would provide a wrong correction to the fine grid.) In this way, we give the coarse grid "smoothed" residuals, which are essential for the coarse grid to provide a good coarse-grid-correction to the fine grid [1]. The numerical experiments carried out in [11] showed that the MRS accelerated multigrid is almost 45% faster than the standard multigrid for solving some convection-diffusion equations with a large Reynolds number. We [22] further developed some inexact versions of the minimal residual smoothing technique that have been shown to accelerate the standard multigrid convergence by as much as 88%. A complete description of the MRS accelerated multigrid method and some numerical experiments can be found in [11].

We include this acceleration scheme here as a practical technique to determine the relaxation parameter or the pre-scaling factor, because the SOR and MRS procedures can be viewed as indirect (implicit) pre-scaling techniques, which scale the residual by taking a (prescribed) linear combination of the current and some previous residuals. This is a perfect example of a pre-optimization technique, which minimizes the residual norm before the coarse-grid-correction process. The minimal residual smoothing technique should be compared with the over-correction technique discussed in our earlier paper [12], which is a perfect example of a post-optimization technique and which minimizes the error in the energy norm after the coarse-grid-correction process. We leave the


comparison of the respective acceleration rates to the section on numerical experiments. However, by simple observation we note that, depending on the sparsity of the coefficient matrix of the underlying linear system, the over-correction technique may be many times more expensive than the minimal residual smoothing technique. The MRS technique with the Euclidean norm may be applied to any problem, but the over-correction technique requires that the coefficient matrix be symmetric positive definite.
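In code, Algorithm 3.1 wraps any basic iteration that produces the pair (u_k, r_k); the following Python sketch is illustrative (the routine iterate, standing for one multigrid cycle, is a hypothetical argument).

    import numpy as np

    def mrs(A, f, u0, iterate, max_iter=100, tol=1e-10):
        # Minimal residual smoothing (Algorithm 3.1) wrapped around a basic
        # iteration u_k = iterate(u_{k-1}); returns the smoothed iterate v_k.
        u = u0.copy()
        v = u0.copy()                      # v_0 = u_0
        s = f - A @ u0                     # s_0 = r_0
        for _ in range(max_iter):
            u = iterate(u)                 # compute u_k ...
            r = f - A @ u                  # ... and its residual r_k
            d = r - s                      # r_k - s_{k-1}
            lam = -(s @ d) / (d @ d)       # lambda_k minimizing ||s + lam * d||_2
            v = v + lam * (u - v)          # v_k = v_{k-1} + lam_k (u_k - v_{k-1})
            s = s + lam * d                # s_k = s_{k-1} + lam_k (r_k - s_{k-1})
            if np.linalg.norm(s) < tol:
                break
        return v

In the multigrid setting of [11], v_k and s_k replace u_k and r_k on the finest grid before the residual is restricted, so the coarse grid receives the "smoothed" residual.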

3.2. The Heuristic Residual Analysis

As we remarked above, the optimal residual scaling factor α is usually problem-dependent. However, for a particular implementation of a multigrid method, we may be able to estimate α by some means. The following is a heuristic residual analysis method which determines the optimal residual scaling parameter for the residual injection operator in a high-order multigrid method, based on the RBGS relaxation and the geometry of the grid points. We consider a fourth-order finite difference discretization scheme for the Laplace (isotropic) operator

∂²/∂x² + ∂²/∂y².  (7)

The discretization scheme for (7) is a compact nine-point formula called Mehrstellen (see [23]) with the computational stencil

[ 1    4   1
  4  -20   4
  1    4   1 ].  (8)

This scheme has been used by Schaffer [24] to develop one of his higher order multigrid methods. Stüben and Trottenberg [17] also extensively analyzed the high-order formula (8) by Fourier mode analysis. Recently, Gupta et al. [25] showed that, for the high-order formula (8), the multigrid method with RBGS relaxation and the full-weighting operator is most efficient on both serial and vector machines. It has been shown that full-weighting is more accurate and more robust, but more expensive, than the injection operator [25]. The real reason for looking for a non-trivial injection operator is to solve the convection-diffusion equation with stagnation points in its domain; see Section 4.2. For the fourth-order formula (8) with RBGS relaxation, however, full-injection is divergent for many problems and half-injection converges very slowly.
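For reference, one RBGS sweep for the stencil (8) can be sketched as follows. The sketch is illustrative only; it assumes the Mehrstellen normalization 1/(6h²) for the stencil when discretizing the Poisson equation u_xx + u_yy = f, which is an assumption of the sketch rather than a statement taken from this paper.

    def rbgs_sweep(u, f, h):
        # One red-black Gauss-Seidel sweep for the nine-point stencil (8),
        # assuming the stencil is scaled by 1/(6 h^2).  Red points have even
        # index-sum; u holds Dirichlet boundary values on its border.
        n = u.shape[0]
        for color in (0, 1):               # 0: red half sweep, 1: black half sweep
            for i in range(1, n - 1):
                for j in range(1, n - 1):
                    if (i + j) % 2 != color:
                        continue
                    edges = u[i-1, j] + u[i+1, j] + u[i, j-1] + u[i, j+1]
                    corners = (u[i-1, j-1] + u[i-1, j+1]
                               + u[i+1, j-1] + u[i+1, j+1])
                    u[i, j] = (4.0 * edges + corners - 6.0 * h * h * f[i, j]) / 20.0
        return u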


We note that the multigrid method solves the residual equations on the coarse grids. For the high-order RBGS relaxation, the grid space is not completely de-coupled. Half-injection is not accurate because the scaling factor 1/2 does not reflect the correct scale of the residual transferred to the coarse grid. The full-weighting scheme weights the residuals at the nearest nine points by the formula

r̂_{i/2, j/2} = (1/16)[4r_{i,j} + 2(r_{i+1,j} + r_{i-1,j} + r_{i,j+1} + r_{i,j-1}) + (r_{i+1,j+1} + r_{i+1,j-1} + r_{i-1,j+1} + r_{i-1,j-1})].  (9)

Here r_{i,j} is the residual on the fine grid at the reference point (i, j), where i and j are even numbers (i.e., r_{i,j} is the residual at a red point whose index-sum is even and which has a corresponding coarse grid image point), and r̂_{i/2, j/2} is the quantity to be transferred to the coarse grid point (i/2, j/2).

For the full-weighting formula (9), we evaluate the residuals at all fine grid points and weight them to the corresponding coarse grid points. On the other hand, the injection operator needs only the residuals at the common points of the two grids; there is no cost for weighting the residuals at the neighboring grid points. The cost of evaluating the residuals on a grid space is equivalent to one full relaxation sweep on that grid. Hence, the computational cost of using the injection operator is less than a quarter of the cost of using the full-weighting operator.

We intend to find a residual scaling parameter for the residual injection operator with the nine-point RBGS relaxation that simulates the full-weighting scheme (9). In formula (9), the weight assigned to each point is determined by the number of coarse grid point computations in which that point is involved. For example, r_{i+1,j+1} is weighted into the calculation of the residuals of four coarse grid points, at (i/2, j/2), (i/2, j/2 + 1), (i/2 + 1, j/2), and (i/2 + 1, j/2 + 1). The weights in (9) correctly reflect these correlations, but they do not reflect the geometric correlation of r_{i+1,j+1} with respect to the reference point (i, j), nor the nine-point red-black relaxation pattern. To find the optimal injection operator is to find the optimal residual scaling parameter α that represents r̂_{i/2, j/2} in terms of r_{i,j} as accurately as possible, that is,

r̂_{i/2, j/2} = α r_{i,j}.  (10)
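Both transfer operators are straightforward to express in code; the following sketch is illustrative, assuming a fine grid whose coarse points sit at the even fine-grid indices and zero residuals on the boundary.

    import numpy as np

    def full_weighting_restrict(r):
        # Full-weighting (9): weighted average of the nine nearest residuals.
        nc = (r.shape[0] - 1) // 2 + 1
        rc = np.zeros((nc, nc))
        for I in range(1, nc - 1):
            for J in range(1, nc - 1):
                i, j = 2 * I, 2 * J
                rc[I, J] = (4 * r[i, j]
                            + 2 * (r[i+1, j] + r[i-1, j] + r[i, j+1] + r[i, j-1])
                            + (r[i+1, j+1] + r[i+1, j-1]
                               + r[i-1, j+1] + r[i-1, j-1])) / 16.0
        return rc

    def scaled_injection_restrict(r, alpha):
        # Scaled injection (10): take the residual at the common (even, even)
        # points only and multiply by alpha; alpha = 1 is full-injection,
        # alpha = 0.5 is half-injection.
        return alpha * r[::2, ::2]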


It is difficult to give a precise representation of the residual on each grid space. However, with some assumptions, a reasonable estimate may be given. We assume that the exact solution is smooth in the domain of concern and that the high frequency error components have been removed before the residual is injected to the coarse grid. (Otherwise, several relaxation sweeps may be needed to smooth out the high frequency error components.) One RBGS relaxation sweep can be viewed as two Jacobi half sweeps, each carried out on roughly half of the grid points. Hence, by Assumption 2.1 (Smooth Condition), the residual at each grid point is not supposed to differ by a large magnitude from those at its nearest neighboring grid points subject to the same relaxation pattern. We assume that the residuals are locally equal among the red points or the black points in a local nine-point stencil (see Figure 1).

FIG. 1. Red-black ordered grid points in a compact nine-point stencil.

A half Jacobi sweep is first carried out on the red points without updating the black points (whose index-sums are odd). Except for points close to the boundaries, a red point is updated using two previously updated (new) red points, two un-updated (old) red points, and four un-updated (old) black points (see Figure 1 for the geometric relations of the grid points). The update of a red point thus uses new values at only two (red) points, and the residual at that point is supposed to be large (relative to those at the neighboring black points). A subsequent half Jacobi sweep on the black points updates each black point using four new red points, two new black points, and two old black points. Hence, the residual at each black point is presumably smaller than those at the nearest neighboring red points after a full RBGS sweep. If we assume that, after one RBGS sweep on the entire grid, the residuals at all the black points are zero, that is,

r_{i+1,j} = r_{i-1,j} = r_{i,j+1} = r_{i,j-1} = 0,  (11)


and at all the red points are equal (and not zero, otherwise we would have reached convergence and would have no need to transfer the residuals),

r_{i-1,j-1} = r_{i+1,j-1} = r_{i-1,j+1} = r_{i+1,j+1} = r_{i,j}.  (12)

Substituting (10), (11), and (12) into (9), we obtain α = 1/2. This is the so-called half-injection. Since this α results from the assumption that the residuals at the black points are zero, it should be the lower bound of α. We denote α_lower = 1/2.

In practice, the residuals at the black points may not be zero. Since updating a black point uses three times as much new information as updating a red point, we may assume that, after one red-black full sweep on the entire grid, the residual r_{i,j} at any particular red point (i, j) is three times as large as the residuals at its four immediate neighboring black points (i, j - 1), (i, j + 1), (i - 1, j), (i + 1, j), that is,

r_{i+1,j} = r_{i-1,j} = r_{i,j+1} = r_{i,j-1} = (1/3) r_{i,j}.  (13)

We also assume that the other four immediate neighboring red points (i - 1, j - 1), (i + 1, j - 1), (i - 1, j + 1), and (i + 1, j + 1) have residuals of the same magnitude as r_{i,j}. But their geometric positions are √2 units away from the reference point (i, j) (see Figure 1), so their influence on the weighting scheme (9) should be scaled by 1/√2, that is,

r_{i-1,j-1} = r_{i+1,j-1} = r_{i-1,j+1} = r_{i+1,j+1} = (1/√2) r_{i,j}.  (14)

Substituting (10), (13), and (14) into (9), we have

α = (1/16)(4 + 8/3 + 4/√2) = (10 + 3√2)/24 ≈ 0.5934.  (15)

This would be the upper bound of the residual injection factor; we denote it by α_upper.

The optimal scaling factor α_optimal lies between α_lower and α_upper: there exists some ξ ∈ (0, 1) such that

α_optimal = ξ α_upper + (1 - ξ) α_lower = 1/2 + ξ (3√2 - 2)/24.  (16)


In the absence of further information to justify any preference in choosing ξ, we may simply choose ξ = 1/2 to balance α_lower and α_upper, which yields a very good scaling factor α = 0.5467. However, our numerical experiments showed that a slightly smaller ξ = 0.4548 yields the best α: α_optimal = 0.5425.
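The arithmetic of this derivation is compact enough to summarize in a few lines of illustrative Python:

    import numpy as np

    w_center, w_edge, w_corner = 4/16, 2/16, 1/16   # full-weighting weights in (9)

    # Lower bound, from (11) and (12): black residuals zero, red corners = r_ij.
    alpha_lower = w_center + 4 * w_corner           # = 1/2, half-injection

    # Upper bound, from (13) and (14): black residuals r_ij / 3,
    # red corners r_ij / sqrt(2).
    alpha_upper = (w_center + 4 * w_edge / 3.0
                   + 4 * w_corner / np.sqrt(2.0))   # = (10 + 3 * sqrt(2)) / 24

    def alpha_optimal(xi):
        # Convex combination (16) of the two bounds.
        return xi * alpha_upper + (1.0 - xi) * alpha_lower

    print(alpha_lower, alpha_upper)    # 0.5  0.5934...
    print(alpha_optimal(0.5))          # 0.5467 (balanced choice)
    print(alpha_optimal(0.4548))       # 0.5425 (best value found in Section 4)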

4. NUMERICAL EXPERIMENTS

We test our pre-scaling (pre-optimization) techniques in several numerical experiments and compare the pre-optimization technique with the post-optimization technique whenever this comparison is possible. All problems are solved on the unit square [0, 1] x [0, 1] with Dirichlet boundary conditions; exact solutions are given that satisfy the boundary conditions. The bi-linear interpolation and a residual projection operator of some kind are employed in the multigrid V(1, 1)-cycle algorithm (one pre-smoothing and one post-smoothing sweep). The optimal residual injection factor used here is α = 0.5425, and the corresponding residual injection operator is referred to as the (optimal) residual-scaling operator in order to distinguish it from half-injection. Test Problems 1 and 3 are solved on a SUN SPARCstation 1+, and Test Problem 2 is solved on an SGI (Silicon Graphics Indy) workstation; all use the FORTRAN 77 programming language in double precision. Hence the CPU comparisons are only relevant within each test problem. Unless otherwise indicated explicitly, the computations terminate when the dynamic residual in the discrete L2 norm is reduced by a factor of 10^10. We remark that the standard multigrid with point Gauss-Seidel relaxation may not be the most efficient solution process for the convection-diffusion equation and the anisotropic Poisson equation; we use it for the sole purpose of testing and comparing our acceleration techniques.

4.1. Test Problem 1: The Poisson Equation

We first test the optimal residual scaling factor α = 0.5425 and compare the optimal residual injection with full-weighting and half-injection for solving the Poisson equation

∂²u(x,y)/∂x² + ∂²u(x,y)/∂y² = x²(1 - x²)(2 - 12y²) + y²(1 - y²)(2 - 12x²),  (17)


with the exact solution u(x, y) = x²y²(1 - x²)(1 - y²). The initial guess is u(x, y) = 0. Equation (17) is discretized by the nine-point formula (8). The grid points are ordered in a checker-board fashion (the red-black order) and the nine-point Gauss-Seidel (RBGS) is used as the smoother. Since we proved that the post-scaling is mathematically equivalent to the pre-scaling, our implementation uses the pre-scaling algorithm (Algorithm 1.1).

We first show the overall influence of the residual scaling parameter α of the residual injection operator on the convergence of the multigrid V-cycle. We fix h = 1/64, vary α in [0, 1], and compute the corresponding number of V(1, 1)-cycles required to reduce the residual norm by 10^10. The maximum number of V-cycles allowed is 100; requiring more than 100 is considered slow convergence or divergence. Figure 2 depicts the relationship. It is clear from Figure 2 that using a residual scaling parameter α larger than 0.5 in some range accelerates the convergence. Note that full-injection (α = 1) is divergent.

Next, for different h and different residual transfer operators, we test and record the number of V-cycles and the corresponding CPU time in seconds in Table 1. It is clear that the (optimal) residual-scaling requires less CPU time than full-weighting to converge. Hence residual-scaling is computationally more efficient.

FIG. 2. Test Problem 1 with h = 1/64: the number of multigrid V(1, 1)-cycles as a function of the residual scaling factor 0 <= α <= 1 for the residual injection operator.
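The experiment behind Figure 2 is a simple loop over α; the following illustrative sketch (reusing the multigrid sketch from Section 1, with hypothetical arguments) records the cycle counts.

    import numpy as np

    def cycles_to_converge(vcycle, A, u0, f, alpha, max_cycles=100):
        # vcycle(u, f, alpha) performs one V(1, 1)-cycle of Algorithm 1.1;
        # returns the number of cycles needed to reduce the residual norm
        # by a factor of 1e10, or max_cycles if that is not achieved.
        u = u0.copy()
        r0 = np.linalg.norm(f - A @ u)
        for k in range(1, max_cycles + 1):
            u = vcycle(u, f, alpha)
            if np.linalg.norm(f - A @ u) <= 1e-10 * r0:
                return k
        return max_cycles

    # for alpha in np.linspace(0.0, 1.0, 51):
    #     record (alpha, cycles_to_converge(vcycle, A, u0, f, alpha))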


TABLE 1
TEST PROBLEM 1 WITH DIFFERENT MESHSIZE h: PERFORMANCE COMPARISON OF DIFFERENT RESIDUAL TRANSFER OPERATORS.

           Residual-Scaling        Full-Weighting         Half-Injection
  h        V-cycles  CPU(sec)      V-cycles  CPU(sec)     V-cycles  CPU(sec)
  1/16        10        0.13           9        0.14         12        0.15
  1/32        11        0.37          10        0.44         15        0.47
  1/64        11        1.30          10        1.49         19        2.22
  1/128       11        5.32          10        5.89         23       10.95
  1/256       12       23.48          10       23.50         27       51.40

Our primary intention is to compare residual-scaling with half-injection, because full-weighting diverges when we solve our second test problem (the convection-diffusion equation). We note that residual-scaling is twice as efficient as half-injection. Hence, as far as the residual injection operator is concerned, residual-scaling with the factor α = 0.5425 is a clear winner. We also ran similar experiments with different solutions and forcing terms, and the results were similar to those of Test Problem 1.

4.2. Test Problem 2: The Convection-Diffusion Equation

We now consider a two-dimensional convection-diffusion equation with variable coefficients

ε(∂²u(x,y)/∂x² + ∂²u(x,y)/∂y²) + (1/2 - x) ∂u(x,y)/∂x + (1/2 - y) ∂u(x,y)/∂y = f(x,y),  (18)

with the exact solution u(x, y) = xy(1 - x)(1 - y)exp(x + y). Here ε ∈ (0, 1] is chosen to reflect the relative magnitude of the convection to the diffusion. The initial guess for this test problem consists of random numbers in (0, 1). Equation (18) is discretized by a fourth-order finite difference scheme introduced in [26]; the resulting system of linear equations is a nine-point compact formula. Since the discretization scheme is rather complicated, readers are referred to [26] for a complete description. We use this scheme because it is stable with


respect to the variation of ε. The grid points are also red-black ordered, and the nine-point Gauss-Seidel (RBGS) is the smoother. The multigrid solution of this scheme has been investigated by Zhang [27]. We note that (1/2, 1/2) is a stagnation point where both convection coefficients vanish, and equation (18) simulates a recirculating flow.

For large ε, Equation (18) is diffusion-dominated and behaves like an (isotropic) Poisson equation; hence our optimal residual scaling factor should be applicable. For small ε, Equation (18) becomes convection-dominated (anisotropic), and many numerical methods become increasingly difficult (they converge slowly or even diverge).

We choose h = 1/64 and vary the magnitude of ε to compare the performance of the different residual transfer operators: residual-scaling, full-weighting, and half-injection. The results are given in Table 2.

TABLE 2
TEST PROBLEM 2 WITH h = 1/64 AND DIFFERENT ε: PERFORMANCE COMPARISON OF DIFFERENT RESIDUAL TRANSFER OPERATORS.

           Residual-Scaling        Full-Weighting         Half-Injection
  ε        V-cycles  CPU(sec)      V-cycles  CPU(sec)     V-cycles  CPU(sec)
  10^0        11        1.27          11        1.70         15        1.71
  10^-1       11        1.27          11        1.70         15        1.70
  10^-2       20        2.30          24        3.73         30        3.42
  10^-3      478       54.70       diverge   diverge        580       66.20

From Table 2, we can see that the optimal residual-scaling injection operator is more cost-effective than both the full-weighting and the half-injection operator. For highly convection-dominated problems with vanishing ε, injection is necessary to ensure convergence, while full-weighting diverges. In both the diffusion-dominated (large ε) and convection-dominated (small ε) cases, optimal residual-scaling converges faster than half-injection. Therefore, the optimal residual injection operator is more cost-effective and more robust than the full-weighting and half-injection operators for this test problem.

Although our optimal residual scaling factor α = 0.5425 was derived for an isotropic operator, it works properly for the convection-diffusion equation (an anisotropic operator). However, for the convection-diffusion equation, we believe a better residual scaling factor may be derived along the lines of the heuristic residual analysis by taking the convection coefficients into consideration. This is a subject of future research.

4.2.1. Acceleration by minimal residual smoothing. Since the coefficient matrix of Test Problem 2 is nonsymmetric and non-positive definite, we cannot apply the post-optimization technique to accelerate the convergence.



On the other hand, the minimal residual smoothing technique (with the Euclidean norm) places no requirement on the properties of the coefficient matrix. We use the minimal residual smoothing technique to accelerate the convergence of Test Problem 2 with ε = 10^-3 and h = 1/64. In particular, we use the optimal residual injection operator with α = 0.5425. The convergence history of the minimal residual smoothing acceleration is depicted in Figure 3. The graph shows the reduction histories of the logarithm of the residual in the Euclidean norm against the number of iterations (V(1, 1)-cycles), with and without MRS acceleration, for the first 2000 iterations. It is clear that the minimal residual smoothing technique is able to accelerate the convergence noticeably.

FIG. 3. Test Problem 2: Convergence histories of the logarithm of the residual in the Euclidean norm. The solid line is for the optimal residual scaling without MRS acceleration; the dashed line is for the minimal residual smoothing acceleration.

4.3. Test Problem 3: The Anisotropic Poisson Equation

We consider the anisotropic Poisson equation

∂²u(x,y)/∂x² + ε ∂²u(x,y)/∂y² = (εx² + y²)exp(xy),  (19)


where u(x, y) = exp(xy) is the exact solution, and ε ∈ (0, 1] is chosen to reflect the degree of anisotropy. The initial guess is u(x, y) = 0 for this test problem. Equation (19) is discretized by the usual five-point second-order central difference scheme, and the resulting linear system is symmetric positive definite; hence, the post-optimization techniques can be applied. We use the standard multigrid V(1, 1)-cycle algorithm with full-weighting and bi-linear interpolation. The grid points of this problem are naturally (lexicographically) ordered, and the five-point Gauss-Seidel is used as the smoother.

We compare the minimal residual smoothing technique (pre-optimization; see [11] for details of the implementation) and the over-correction technique (post-optimization; see [8] for details of the implementation) in terms of convergence and CPU time in seconds. For simplicity, we apply both acceleration techniques only at the finest level. For the minimal residual smoothing implementation, we use the Euclidean norm.

The over-correction algorithm deserves some explanation; a generic sketch of this step is given below. Since we perform only one pre-smoothing and one post-smoothing sweep at each level, the post-smoothing process is incorporated into the over-correction procedure (see [8]); in particular, we do not perform any post-smoothing after the over-correction process at the finest level. This implementation is different from that proposed by Reusken [6], where he computed the steplength (over-correction) parameter at the coarse grid and corrected the approximate solution before the interpolation procedure; after the correction, some post-smoothing sweeps were applied at the finer level. Since the ideas behind the post-optimization are similar, we will be able to explain the difference between their results and ours. (Reusken [13] was not able to fully explain his numerical results.)

We use h = 1/64 on the finest grid and compare the performance of the different acceleration schemes. Test results are given in Table 3.
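For reference, the post-scaling (over-correction) step being compared has the following generic steplength-optimization form, which minimizes the error in the energy norm (a sketch following the description in [12], not the exact implementation of [8]):

    def over_correct(A, u, f, d):
        # Post-scaling step: scale the interpolated correction d = P u^{2h}
        # by the steplength lam that minimizes the error in the energy norm;
        # for symmetric positive definite A this gives
        #     lam = (d^T r) / (d^T A d),  with  r = f - A u.
        r = f - A @ u
        lam = (d @ r) / (d @ (A @ d))
        return u + lam * d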

TABLE 3
TEST PROBLEM 3 WITH FULL-WEIGHTING AND h = 1/64: PERFORMANCE COMPARISON OF DIFFERENT ACCELERATION SCHEMES.

           Standard Multigrid      MRS Acceleration       Over-Correction
  ε        V-cycles  CPU(sec)      V-cycles  CPU(sec)     V-cycles  CPU(sec)
  10^0        13        2.03          13        2.41         13        2.47
  10^-1       57        8.58          38        7.06         57       10.20
  10^-2      414       62.33         220       40.69        414       73.88
  10^-3     1902      287.89         775      142.60       1900      339.81
  10^-4     3175      478.48        1403      258.40       3174      566.08


Table 3 shows that the minimal residual smoothing technique is an efficient acceleration method. When the standard multigrid method converges fast (with ε close to 1), the MRS acceleration rates are modest; when it converges slowly (as ε approaches 0), the MRS acceleration rates are very attractive: almost 60% for ε = 10^-3. These results agree with our analytic prediction of the behavior of the minimal residual smoothing acceleration [28].

One unexpected result is that, in addition to increasing the computational cost, the over-correction procedure does not provide any acceleration for this test problem. The very slight decreases in the V-cycle numbers for the last two cases were likely caused by the computational process, for example, by rounding errors. Our test results look very bad for the post-optimization technique. Our explanation is that, together with the implementation of full-weighting and bi-linear interpolation, a post-optimization (over-correction) process is just a post-smoothing process, but a more expensive one. The results of Vaněk [8] were not suitable for comparison because his technique was implemented in an algebraic multigrid method and no comparison with a non-accelerated case was given. The results of Reusken [13] (a multi-level implementation) can be explained readily. Because Reusken computed the steplength parameter at the coarser level, which is equivalent to adding one more post-smoothing sweep at the coarser level, his scheme in effect changes a V-cycle algorithm into a super V-cycle algorithm (with one post-smoothing sweep on the finest grid, but two post-smoothing sweeps on the coarse grids), with a performance close to, but a cost slightly cheaper than, a W-cycle algorithm. This essentially explains why the numerical results of his post-optimized V-cycle algorithm are very close to those of the W-cycle algorithm. Note that Reusken [13] was not comfortable with the computational cost of his algorithm when it is implemented on the finest grid.

We plot the convergence histories of the standard and the MRS accelerated multigrid algorithms in Figure 4 for the case of ε = 10^-3. We remark that the convergence behavior of the over-correction technique (not shown in Figure 4) is almost the same as that of the standard multigrid (solid line).

FIG. 4. Test Problem 3: Convergence histories of the logarithm of the residual in the Euclidean norm. The solid line is for the standard multigrid method; the dashed line is for the minimal residual smoothing acceleration.

In summary, our numerical results demonstrate that the minimal residual smoothing is an efficient pre-optimization acceleration technique, but the over-correction (post-optimization), under our test conditions, does not seem to be competitive.

We would like to point out an interesting phenomenon that we observed in our numerical experiments with the over-correction technique. Under our implementation with the full-weighting operator, the over-correction acts exactly like a post-smoothing sweep at the finest level. However, if the scale of the original residual is not correct with the full-weighting, for example, if we intentionally scale the residual by a scaling factor (which is equivalent to


scale the correction term by the same factor, due to Theorem 3.2 of Zhang [12]), the over-correction technique can "correct" the scale of the correction term back to that which would be obtained from the standard full-weighting operator. This suggests that the post-optimization technique is related to the Galerkin condition

A^{2h} = R A^h P,

where R and P are the restriction and interpolation operators, respectively. Note that for full-weighting and bi-linear interpolation, we have R* = ηP. Since full-weighting is optimal with respect to the energy norm, and the post-optimization techniques essentially minimize the error in the energy norm in each iteration (see Algorithm 3.2 of Zhang [12]), it seems that the post-optimization techniques can only "optimize" the multigrid process up to the performance of the standard full-weighting operator. Since a post-optimization step is usually more expensive than the full-weighting operator, it is a more expensive substitute with no gain in convergence. However, if the multigrid method, for some reason (e.g., in the algebraic multigrid), does not use full-weighting and the scale of the coarse grid (residual) equation becomes incorrect in the computational process, the post-optimization, if applicable, may be a potential technique for correcting the scale of the coarse-grid-corrections.

5. CONCLUSIONS AND REMARKS

We have analyzed the residual transfer operators and the relation between the SOR parameter and the residual scaling factor. We introduced two practical pre-acceleration techniques: the minimal residual smoothing and the heuristic residual analysis. We tested these techniques in several practical applications and compared the performance of the pre-optimization and post-optimization techniques. From our numerical experiments, we are convinced that the pre-acceleration (pre-optimization) technique is more efficient and has a wider range of applications than the post-acceleration (post-optimization) technique. We also gave some heuristic arguments to explain the inefficiency and the limitations of the post-optimization technique.

The results of this paper strongly support the theoretical conclusions of the companion paper [12], which favor the pre-acceleration techniques. The pre-acceleration techniques developed in this paper may be used in other practical applications to accelerate the convergence of the standard multigrid method, especially when the standard multigrid method converges slowly. The beauty of the minimal residual smoothing acceleration is that its application does not necessarily require that the coefficient matrix be symmetric positive definite.

The conclusion of this paper establishes the full advantage of the pre-acceleration techniques over the post-acceleration techniques. We essentially argued that the post-optimization techniques should not be used to accelerate the standard multigrid method with the full-weighting operator.

REFERENCES

1 A. Brandt, Multi-level adaptive solutions to boundary-value problems. Math. Comp. 31:333-390 (1977).
2 W. L. Briggs, A Multigrid Tutorial, SIAM, Philadelphia, 1987.
3 P. Wesseling, An Introduction to Multigrid Methods, John Wiley & Sons, Chichester, 1992.
4 A. Brandt and I. Yavneh, Accelerated multigrid convergence and high-Reynolds recirculating flows. SIAM J. Sci. Comput. 14:607-626 (1993).
5 S. Míka and P. Vaněk, A modification of the two-level algorithm with overcorrection. Appl. Math. 37:13-28 (1992).
6 A. Reusken, Steplength optimization and linear multigrid methods. Numer. Math. 58:819-838 (1991).
7 A. Reusken, Fourier analysis of a robust multigrid method for convection-diffusion equations. Numer. Math. 71:365-391 (1995).
8 P. Vaněk, Fast multigrid solver. Appl. Math. 40:1-20 (1995).
9 J. Zhang, Acceleration of five-point red-black Gauss-Seidel in multigrid for Poisson equation. Appl. Math. Comput. 80:73-93 (1996).


10 J. Zhang, A cost-effective multigrid projection operator. J. Comput. Appl. Math. 76:325-333 (1996).
11 J. Zhang, Minimal residual smoothing in multi-level iterative method. Appl. Math. Comput. 84:1-25 (1997).
12 J. Zhang, Residual scaling techniques in multigrid, I: equivalence proof. Appl. Math. Comput. 86:283-303 (1997).
13 A. Reusken, Convergence Analysis of Nonlinear Multigrid Methods, Ph.D. Thesis, University of Utrecht, The Netherlands, 1988.
14 W. Hackbusch and A. Reusken, Analysis of a damped nonlinear multilevel method. Numer. Math. 55:225-246 (1989).
15 P. M. de Zeeuw, Matrix-dependent prolongations and restrictions in a blackbox multigrid solver. J. Comput. Appl. Math. 33:1-27 (1990).
16 W. Hackbusch, Multi-Grid Methods and Applications, Springer-Verlag, Berlin, 1985.
17 K. Stüben and U. Trottenberg, Multigrid methods: fundamental algorithms, model problem analysis and applications, monograph, GMD-Studien Nr. 96, Postfach 1240, Schloss Birlinghoven, D-5205 St. Augustin 1, 1984.
18 I. Yavneh, Multigrid smoothing factors for red-black Gauss-Seidel relaxation applied to a class of elliptic operators. SIAM J. Numer. Anal. 32:1126-1138 (1995).
19 I. Yavneh, On red-black SOR smoothing in multigrid. SIAM J. Sci. Comput. 17:180-192 (1996).
20 W. Schönauer, Scientific Computing on Vector Computers, North-Holland, Amsterdam, 1987.
21 R. Weiss, Convergence Behavior of Generalized Conjugate Gradient Methods, Ph.D. Thesis, University of Karlsruhe, Germany, 1990.
22 J. Zhang, Multigrid with inexact minimal residual smoothing. Appl. Numer. Math. 24:501-512 (1997).
23 L. Collatz, The Numerical Treatment of Differential Equations, Springer-Verlag, Berlin, 1960.
24 S. Schaffer, Higher order multi-grid methods. Math. Comp. 43:89-115 (1984).
25 M. M. Gupta, J. Kouatchou, and J. Zhang, Comparison of second and fourth order discretizations for multigrid Poisson solvers. J. Comput. Phys. 132:226-232 (1997).
26 M. M. Gupta, R. Manohar, and J. W. Stephenson, A single cell high order scheme for the convection-diffusion equation with variable coefficients. Int. J. Numer. Methods Fluids 4:641-651 (1984).
27 J. Zhang, Multigrid solution of the convection-diffusion equation with high-Reynolds number. In Preliminary Proceedings of the 1996 Copper Mountain Conference on Iterative Methods, Copper Mountain, Colorado, April 9-13, 1996, Vol. II, pp. 1-9.
28 J. Zhang, Multigrid Acceleration Techniques and Applications to the Numerical Solution of Partial Differential Equations, Ph.D. Thesis, The George Washington University, Washington, D.C., 1997.