Applied Numerical North-Holland
Mathematics
443
8 (1991) 443-455
Solution of nonlinear equations * B.M. Averick
* * and J.M.
Poisson-type
Ortega
Institute for Parallel Computation, School of Engineering University of Virginia, Charlottesville, VA 22901, USA
and Applied Science, Thornton Hall,
1. Introduction We consider V.
nonlinear
Poisson-type
equations
of the form
[K(u)vu] =f,
O-1)
where K is a given positive differentiable function. For simplicity, we assume Dirichlet boundary conditions on the unit square in two dimensions and unit cube in three dimensions. If we discretize (1.1) by finite-difference methods (see Section 2) we obtain a discrete system of the form F(u)=A(u)u
- b(u)
= 0,
(1.2)
where A(U) is symmetric positive-definite for all U, and the vector b(u) is also, in general, a function of u. We are concerned in this paper with Newton-type methods for the solution of (1.2). For Newton’s method itself, we need to solve the linear systems F’(Uk)Gk+’
= -F(Uk),
k = 0, 1,2,3 )...)
(1.3)
where F’( uk) is the Jacobian matrix at uk and Sk+’ is the Newton correction vector. The Jacobian matrix will be large and sparse, especially when (1.1) is in three dimensions. Thus, iterative methods, such as the preconditioned conjugate gradient method, are natural candidates to solve (1.3) approximately. However, in general, the Jacobian matrix of (1.3) is not symmetric so that extensions of the conjugate gradient method for nonsymmetric systems, for example, GMRES [8] would be necessary. An attractive alternative, which we explore in this paper, is to use an approximate Newton method in which A(u) is taken as an approximation to the Jacobian matrix. Then, since A(U) is symmetric and positive-definite, conjugate gradient methods can be used to solve approximately the linear systems A(Uk)Gk+’ at each Newton
= -F(Uk),
k=O,
1,2,3
,...
(1.4
step.
* Research supported in part by the National Aeronautics and Space Administration under NASA Grants NAG-l-242, NAG-l-1112 and NAG-1-1050. * * Present address: Mathematics and Computer Science, Argonne National Laboratory, Argonne, IL 60439, USA. 016%9274/91/$03.50
0 1991 - Elsevier
Science Publishers
B.V. All rights reserved
B.M. Averick, J.M. Ortega / Solution of nonlinear Poisson-type
444
equations
The approximate Newton method based on (1.4) loses the quadratic convergence property of Newton’s method. However, the numerical results for our model problem show that the approximate Newton method sometimes converges in fewer outer iterations than Newton’s method itself. We also show that as the problem size increases, A(u) becomes a better approximation to the Jacobian matrix, so that the convergence properties of Newton’s method should be more closely attained. In particular, we obtain at least a fast linear rate of convergence. One advantage of using an iterative method to solve the Newton linear systems approximately is that the systems do not need to be solved too accurately in the early stages of the Newton iteration. Convergence criteria due, for example, to Dembo et al. [4] are useful in this regard. The same strategy can be applied to the approximate Newton method (1.4) and we call this a truncated approximate Newton method. For our model problem, truncation results in a savings of almost a factor of two in computing time. In Section 2, we give the discretization of (1.1) and properties of the Jacobian matrix of the function (1.2). In Section 3, we define our truncated approximate Newton method, and in Section 4 we give numerical results on a SUN 3/60 and a Cray 2 for a particular model problem.
2. The nonlinear system We discretize (1.1) by second-order finite differences, assuming a constant grid spacing h in all dimensions. The approximation to the x derivative at the (i, j, k) grid point is
where
(2.4 The derivatives in the y and z directions are approximated equation at the (i, j, k) grid point has the form -Ki~l/2,j,kU,+1/2,j,k
=
-
K~,~*1/2,kUi,jil/2,k
-
K*,],kfl/2Ui,j,k+l/2
similarly,
and the discretized
+
cl,j,kui,j,k
(2.3)
h%%j,k.
Here, the notation Ki * 1,2j,k~i * 1,2, j,k denotes the sum Ki+l/2,j,kUi+l/2,j,k
‘Ki-1/2,j,kUi-1/2,j,k,
(2
4
and similarly for the other terms. The quantity Cj,,,k is the sum of the six K terms in (2.3). If there are N interior grid points in each direction, then (2.3) is a system of n = N3 equations. For the corresponding two-dimensional problem, there will be N2 equations. If u is the vector of unknowns, the system (2.3) can be written in the form &$+A(u)u
- b(u) = 0,
(2.5)
where A(u) has the structure of the discretized Poisson matrix. The vector b(u) contains the values of the forcing function f and boundary values, as usual, but it is also a function of u since
B.M. Averick, J.M. Ortega / Solution of nonlinear Poisson-type
equations
445
a boundary value will now be multiplied by a corresponding value of K. For example, in the equation for node (l,l,l), the approximation (2.1) contains the term K(~(u,,,,, + u~~~,~))u~,~,~ is a known boundary value but z+,,~ is an unknown. which will be in 6(u). Here, z+,~,~ We next compute the Jacobian matrix F’(u) of (2.5). The term A(u)u may be written in terms of the columns a,( U) of A(u) as A(u)24 = i u;a,(u). 1=l By differentiating F’(u)
(24
this, we obtain =/4(u)
+ B(u),
(2.7)
U#z~(U) -b’(u).
(2.8)
where B(u)
= i i=l
The matrix B(U) has the same nonzero structure B(u) are of the form (see example below) :(‘i,j,k
-
Ui+l,j.k)K’(+(U,,j,k
+
as A( u). The nonzero
off-diagonal
elements
of
(2.9)
‘i+l,j.k))r
with similar terms corresponding to u,_~, j,k, ~~,~+r,~,~, and u~,~,~f1,2. The diagonal elements of B(u) are just the sum of the off-diagonal elements in the corresponding row. Note that the elements of b’(u) are automatically incorporated into terms of the form (2.9) due to the way b(u) is formed. To clarify the above discussion, we give an example of F(U) and F’(u) in one dimension for N = 3. We set Ki
f
=K(:(u,*,
l/2
Then the difference
+
equations
-Ki_r,2~i_1
ui)),
K/+,/2=K’(;(ui*,
+
are
+ (Ki-,/,
+ Ki+1/2)Ui_Ki+1/2U;+r
=h2h,
so that G/2
A(u)
=
+
-
K3,2
0
Thus
K3,2
ui>>-
K3,2
-
0
K3,2 +
K5,2
KS,2
-K5,2 KS,2
+
I )
G/2
i= 1,
,3
UoK1,2
b(u)
=
+
h2fl
h2f2 KJG,,
+
h=f3
I-
I
3.M. Auerick, J. M. Ortega / Solution of nonlinear Poisson-type
446
K;,z
K;,z
a;(u) = f
0
Therefore,
0 - K;,z K&z+ K&2 G/2 , - K;/z - K&z
0
0
0
-K&z
0
G/2
B(u)=;
Yl
-Yl
[
0
I
0
-K&z . G/2 + G/2 I
if we set y, = ( ui - u,+r) K/+l,Z, - Yo+
equations
we have from (2.8) that
0
Yl
-Y1 + Y2 - Y2
Y2 -Y2+Y3
1 .
(2.10)
In general, the matrix A(u) is symmetric and, if K is positive, as is assumed, A(u) is also positive-definite. This follows from the fact that A(u) is irreducibly diagonally dominant. The matrix B(u), however, is not symmetric unless K is a constant. In particular, the off-diagonal part of B(u), which we denote by B,(u), is skew-symmetric as illustrated by (2.10). Moreover, the diagonal part B,(u) of B(U) is not, in general, positive-definite, again illustrated by (2.10). Indeed, the symmetric part of F’(U), which is A( U) + B,(U), is not necessarily positive-definite.
3. Truncated approximate Newton methods Newton’s method in conjunction gradient method to solve the Newton linear (1.3). However, previous section, Jacobian F’(u) is not symmetric. attempt to use conjugate gradient type such as GMRES for the nonsymmetric systems, or we could consider systems in the normal equation approach has well-known disadvantages and experiments methods we will use. We have not the use of such as GMRES. Instead, consider an approximate ~@k)@+’
= -p(&),
Uk+l = &+
Sk+‘,
k=O, 1, 2,...
(3 .I)
in which the Jacobian matrix is approximated by A( u). The rationale for this approximation is as follows. First, A(u) is always positive-definite and thus we can apply the conjugate gradient method to the linear systems of (3.1). Secondly, from the form (2.9) of the elements of B(u), we on expect that B(u) --, 0 as h + 0 and N -+ 00. Indeed, under suitable smoothness assumptions one can show [2] that (1B(u) 11a, = O(h) and the solution of the differential equation, 11B,(u) 11o. = 0( h2). Table 1 illustrates this for the particular function K used in our experiments of the next section. In this table, the matrices are evaluated at the exact solution of the differential equation at the grid points. We note that a major computational advantage of the approximation of the Jacobian by A(U) is that B(U) does not need to be computed. Only A( uk) is needed which is required in any case to obtain F(u~). The iteration (3.1) will not, of course, retain the quadratic convergence properties of Newton’s method but, based on the approximation properties illustrated in Table 1, we do expect fast
B.M. Averick, J.M. Ortega / Solution of nonlinear Poisson-type Table 1 Approximation N
I 15 31 63
properties
equations
447
of A(u)
IIA(u) IIm
IIB(u)
61.3 68.7 72.1 74.9
0.62 0.34 0.18 0.09
IIoo
11 BD(“) 0.047 0.012 0.0031 0.0008
iI m
II B(u) IIco/II F’(u) IIca 0.010 0.0050 0.0024 0.0012
linear convergence. This is borne out by the experiments of the next section. It is also indicated by the spectral radius in Table 2 for the corresponding one- and two-dimensional problems as well as the small three-dimensional ones. It is the spectral radius of the matrix A(u*)-‘B(u*)
=1-A@*)-‘F’(u*),
(3.2)
where u * is the solution of F(u) = 0, that determines the linear asymptotic rate of convergence of the iteration (3.1). (See, e.g., Ortega and Rheinboldt [7].) The linear systems of (3.1) will not be solved exactly, of course, by the conjugate gradient method. Indeed, we do not wish to solve them very accurately until close to the solution. This suggests a truncated approximate Newton method in the sense of Dembo et al. [4]. In this case, we wish to solve (3.1) accurately enough to satisfy ]IL4(Uk)Gkf’
+ F(Uk) I] G
vk
iI F(iLk)
for some positive parameters qk. Dembo analyzed further by Averick [2].
II
et al. [4] have considered
(3.3)
this possibility
and it is
4. Numerical results We first give some preliminary results for the corresponding two-dimensional problem. The purpose of these results, which were obtained on a SUN 3/60, is to explore several possible methods before solving the three-dimensional problem on a Cray 2. The methods that we consider first are: (I) Newton’s method with Gaussian elimination to solve the Newton (II) Newton with CG to solve the normal equations;
systems;
Table 2 Spectral radii of A-‘(u*)B(u*) N
3 7 15 31 63
Spatial dimensions 1
2
3
0.0383 0.0469 0.0478 0.0484 0.0485
0.0353 0.0418 0.0438 0.0443 _
0.0319 0.0387 _ _
B.M. Averick, J.M. Ortega / Solution of nonlinear Poisson-type
448
equations
(III)
approximate Newton with the Jacobian matrix approximated by A(U): (a) linear systems solved by Cholesky, (b) linear systems solved by ICCG(0); (IV) approximate Newton with Jacobian approximated by its symmetric part: (a) linear systems solved by Cholesky, (b) linear systems solved by ICCG(0).
The convergence tests for the conjugate gradient methods are very stringent, so as to obtain the solution to the linear systems “exactly”; this is in contrast to the truncated methods for which results will be presented shortly. We have chosen the forcing function so that the exact solution of the differential equation is U(X, y) = x2 +y2. If ri is the vector of the exact solution evaluated at the grid points, and if we obtain an iterate uk so that I] F(uk) 11< II F(ri) II, we have obtained an approximate solution of the discrete system commensurate with the discretization error. Thus, we adopt a convergence criterion for the outer iteration of 11F(Uk) 11; G
cNL
=
11F(s)
11;.
(4-U
In practical problems, of course, other criteria would have to be used to estimate the discretization error. The values of cNL for the different problem sizes with N = 7, 15, 31, and 63 are 9.7 x 10F6, 1.7 x lo-‘, 2.9 x 10P9, and 4.7 X 10-l’. The convergence test for the inner conjugate gradient iteration is chosen to be
II rk II < c,_= 6NLx 10-2,
(4.2)
where rk is the final residual vector of the conjugate gradient iteration for the linear equations to be solved at the kth outer iteration. The function K used in the three-dimensional problem has been suggested by Zang [lo] as useful in certain fluids problems and is
K(u) =
lO(1 + 0.27~)~‘~ 3+0.27u .
(4.3)
This function is close to linear for values of u of interest, sional results we use the linear approximation
and for the preliminary
K(U) = 3.33 + 0.91U.
two-dimen-
(4.4)
Tables 3-8 give outer iterations, inner iterations when appropriate, and operation counts for the six methods in (I)-(IV) above. For those methods that solve the linear systems directly, N = 63 is omitted due to memory limitations. Results are given for two initial approximations: a linear interpolation (UT’ = xi + y,‘) of the boundary values at the left and right sides of the
Table 3 Newton-Gaussian N 7 15 31
elimination Linear approximation
Random approximation
Outer
Operations
Outer
Operations
2 2 3
21,000 280,000 6,400,OOO
4 4 4
40,000 553,000 8,500,OOO
B.M. Averick, J.M. Ortega / Solution of nonlinear Poisson-type Table 4 Newton-CG N
7 15 31 63
on normal
Operations
Outer
Inner
Operations
2 2 3 3
75 331 1570 7644
100,000 2,000,000 40,000,000 820,000,000
4 4 4 5
174 759 3141 13806
230,000 4,560,OOO 80,900,000 1,470,000,000
Newton-Cholesky
with A(u) Random
Linear approximation
Table 6 Approximate
approximation
Outer
Operations
Outer
Operations
2 3 3
15,000 265,000 3,640,OOO
3 3 4
23,000 265,000 4,840,OOO
Newton-ICCG(0)
with A(u) Random
Linear approximation
approximation
Outer
Inner
Operations
Outer
Inner
Operations
2 3 3 4
18 38 80 196
30,000 270,000 2,300,OOO 22,600,OOO
3 3 4 4
28 52 114 235
45,000 360,000 3,260,OOO 26,940,OOO
Table 7 Approximate
Newton-Cholesky
N
with symmetric
part
Linear approximation
7 15 31
Table 8 Approximate
7 15 31 63
approximation
Inner
7 15 31
N
Random
Outer
N
7 15 31 63
449
equations
Linear approximation
Table 5 Approximate
h’
equations
Random
approximation
Outer
Operations
Outer
Operations
2 3 3
17,000 276,000 3,679,OOO
5 7
41,000 637,000
Newton-ICCG(0)
with symmetric
Diverges
part
Linear approximation
Random
Outer
Inner
Operations
Outer
2 3 3 4
18 40 82 198
32,000 297,000 2,389,OOO 23,076,OOO
5 7
approximation Inner 51 143 Diverges Diverges
Operations 85,otj 992,000
B.M. Averick, J.M. Ortega / Solution of nonlinear Poisson-type
450 Table 9 TANICCG, cy
0.001 0.0005 0.0001
Table 10 TANICCG, a
0.005 0.001 0.0005
Table 11 TANICCG,
N = 7 Linear approximation Outer
Inner
Operations
3 2 2
14 11 13
27,200 20,700 23,400
Outer
Inner
4 3 3
26 21 24
.
Outer
Inner
Operations
0.001 0.0005 0.0001
4 3 3
19 16 19
36,200 30,000 33,900
Operations 212,000 168,000 187,000
approximation
0.005 0.0005 0.000005
Random
approximation
Outer
Inner
Operations
5 4 3
26 33 32
223,000 290,000 237,000
N = 31
Outer
Inner
Operations
0.001 0.0005 0.0001
4 3 3
52 45 48
1,600,OOO 1,374,ooo 1,444,ooo
0.01 0.005 0.001 0.0005
Random
OL
Linear approximation
Linear approximation
a
a
N = 15
(Y
Table 12 TANICCG,
equations
CX
0.001 0.0005 0.0001
Random
approximation
Outer
Inner
Operations
5 5 4
61 66 66
1,893,OOO 2,032,OOO 1,989,OOO
N = 63 (Y
Linear approximation Outer
Inner
Operations
6 6 4 4
102 112 97 105
12,733,OOO 13,870,000 11,738,OOO 12,536,OOO
0.01 0.005 0.0005 0.0000001
Random
approximation
Outer
Inner
Operations
8 7 6 4
136 132 1.58 167
17,071,000 16,396,OOO 18,849,OOO 19,418,OOO
domain, and one with random components in the interval [0,4]. (All components of the solution lie in [0,2].) The first is a much better approximation to the solution than the second. Of the methods used in Tables 3-8, the simple direct methods are not competitive for larger problem sizes, as expected, nor is conjugate gradient on the normal equations. Use of A(u) and the symmetric part of the Jacobian give comparable results with ICCG for the linear initial approximations (Tables 6 and 8) but the symmetric part of the Jacobian is not always positive-definite and this manifests itself by the “diverges” in the random approximations in
B.M. Averick, J.M. Ortega / Solution of nonlinear Poisson-type
equations
451
Tables 7 and 8. On the basis of these limited experiments, there seems to be no reason to perform the extra work to obtain the symmetric part of the Jacobian. Therefore, in the sequel, we pursue only the approximate Newton method with the Jacobian approximated by A(U) and the linear systems solved approximately by ICCG(0). Tables 9-12 give results for the truncated approximate Newton method, discussed in Section 3, in which the Jacobian matrix is approximated by A(u) and the linear systems are solved approximately by ICCG(0). We denote this method by TANICCG. The convergence criterion is based on (3.3) and is
The value of the parameter a in (4.5) is critical and the tables give results for several values of (Y, including one which is able to reduce the number of outer iterations to that of the approximate Newton method with the “exact” linear equation solution. For both starting approximations in Table 9, the value (Y= 0.0005 gives the minimum operation count. (Several other values of (II were also run but are not shown; in general, the larger (Y, the more outer iterations, as expected.) For the optimum value of (Y, the number of outer iterations is the same as shown in Table 6, but the number of inner iterations is almost halved, leading to operation counts that are approximately 50% lower for the truncated method. For the larger problem sizes, the optimum (Y depends on the initial approximation, as Tables lo-12 show. We make a few observations on the results in Tables 10-12. For the linear approximation, the minimum operation count is achieved for an (Ythat gives the same number of outer iterations as shown in Table 6. Decreasing (Y still further requires more inner iterations and has no benefit. The situation is quite different for the random approximation. Here, the minimum operation count is achieved for an a: that is not yet small enough to give the same number of outer iterations as in Table 6. This is not undesirable. An outer iteration requires fewer operations than an inner iteration so that sacrificing a few outer iterations to obtain a decrease in the number of inner iterations will usually reduce the overall operation count. Table 13 summarizes the optimum CL Note that N = 31 gives somewhat smaller values for than the overall trend would suggest. We have no a!optr for both initial approximations, explanation for this. The results in Tables 9-12 show that the number of operations depends strongly on (Y,and we have no theory to predict its optimum value. However, it seems to be the case that the use of the optimum (Y for smaller problem sizes works reasonably well for larger problems. If (~,r~( N) denotes the optimum (Y for problem size N, Tables 14 and 15 show the use of a,,,(7) and
Table 13 Optimum (Y for TANICCG N
7 15 31 63
Linear approximation
Random approximation
%pt
%pt
0.0005 0.001 0.0005 0.001
0.0005 0.005 0.001 0.005
B.M. Averick, J.M. Ortega / Solution of nonlinear Poisson-type
452 Table 14 Operations
using a,,,(7)
N
Linear approximation
15 31 63
equations
versus CI,_,~(N) Random
approximation
(Y= 0.0005
a,,,(N)
%
a = 0.0005
S,,(N)
%
187,000 1,374,ooo 12,536,OOO
162,000 1,374,ooo 11,738,OOO
11.6 0 6.8
290,000 2,032,OOO 18,849,OOO
237,000 1,893,OOO 16,396,OOO
15.1 7.3 15.0
a,,,(15) on the larger problems. The operation counts in the aopt columns are those from the corresponding Tables 9-12 and the % column gives the percentage difference in the operation counts. The use of the optimum (Y’S for N = 7 gives reasonably good results for the larger problem sizes, with a maximum increase in operations of 15%. The use of the optimum (Yfor N = 15 gives perfect results for N = 63, since the optimum (Y’S are the same for both problem sizes, but is somewhat inferior for N = 31 for the linear initial approximation. In any case, the N = 63 problem requires 50 to 60 times the amount of work as the N = 15 problem so that several runs of the N = 15 problem could be justified to determine a reasonable value of cy to use for the larger problem. We now consider the three-dimensional problem on a single processor of the Cray 2. The results of the preliminary runs for the two-dimensional problem showed that the truncated approximate Newton method (TANICCG) with the Jacobian approximation A( U) was the most satisfactory of all those considered. Hence, we will concentrate on this method for the three-dimensional problem but give comparisons with method (IIIb) for which the linear systems are solved “exactly”. We denote this method in the tables by ANICCG. The right-hand side of (1.1) is chosen so that the exact solution of the differential equation is x2 + y2 + z2. If ri is the vector of the exact solution evaluated at the grid points, we again use the convergence test (4.1) for the outer iteration. In this case, the values of cNL for the different problem sizes with N = 7, 15, 31, and 63 are 1.9 X 10P4, 7.4 X 10P6, 2.5 X 10P7, and 8.3 X 10P9. These values of N correspond to 343, 3375, 29791, and 250,047 unknowns, respectively. Again, results are given for two initial approximations to the solution, one with random components in the interval [0,4] (the exact solution of the differential equations lies in [0,3]), and one which linearly interpolates the boundary conditions in one direction so that
UP,k’ x,+Jf+z;. 1
Table 15 Operations
using c+(l5)
N
Linear approximation
31 63
(4.6)
1
versus aopt Random
approximation
CY = 0.001
a,,,(N)
%
(Y= 0.005
aopt(N)
%
1,600,OOO 11,738,OOO
1,374,ooo 11,730,000
16.5 0
1,983,OOO 16,396,OOO
1,893,OOO 16,396,OOO
4.7 0
B.M. Averick, J.M. Ortega / Solution of nonlinear Poisson-type
453
equations
Table 16 ANICCG,
Cray 2, linear approximation
N
Iterations Outer
Inner
Outer
Inner
Outer
Inner
Total
1 15 31 63
2 3 3 4
16 46 106 289
30 62 78 82
41 83 91 93
0.0019 0.011 0.068 0.65
0.0045 0.071 1.29 29.2
0.0064 0.082 1.36 29.9
Table 17 ANICCG,
Cray 2, random
N
Iterations
approximation Times (seconds)
Mflops Inner
Outer 7 15 31 63
Times (seconds)
Mflops
27 62 152 344
3 3 4 4
Outer
Inner
Outer
Inner
Total
31 62 77 81
47 82 89 92
0.0026 0.011 0.086 0.65
0.0074 0.095 1.89 34.7
0.010 0.11 1.98 35.4
For the previous two-dimensional results, the natural ordering of the grid points was used. To implement the ICCG iteration on the Cray 2 we use the red-black ordering to obtain long vector lengths (see, e.g., Ortega [6]). It is known (see, e.g., Duff and Meurant [5]) that this ordering may degrade the rate of convergence of the ICCG iterates (it does not effect the outer iteration), and it is quite likely that on the Cray 2 lower overall computation times would be obtained by using the natural or diagonal orderings as in Ashcraft and Grimes [l] and van der Vorst [9]. However, on highly parallel machines, the red-black ordering may be better (Chan et al. [3]). In any case, the particular ordering used will not affect the overall conclusions regarding the approximate Newton iteration. We first give results in Tables 16 and 17 for the ANICCG method in which the convergence test for the inner iteration is (4.2). These tables give the number of inner and outer iterations, as well as Mflop rates and times, for both the inner and outer iterations. Note that the Mflop rates are slightly higher for the inner iteration although the amount of time spent in the inner iteration increases much more rapidly as the problem size increases.
Table 18 TANICCG, N
7 15 31 63
Cray 2, linear approximation Iterations
Mflops
Time (seconds)
%pt
Outer
Inner
Outer
Inner
Outer
Inner
Total
2 3 4 5
10 24 64 135
31 62 76 81
47 82 88 92
0.0019 0.010 0.088 0.79
0.0029 0.040 0.85 14.2
0.0048 0.050 0.94 15.0
0.0005 0.005 0.005 0.005
B.M. Averick, J.M. Ortega / Solution of nonlinear Poisson-type
454 Table 19 TANICCG, N
7 15 31 63
Cray 2, random
7 15 31 63
approximation
Iterations
Mflops
Time (seconds)
“opt
Outer
Inner
Outer
Inner
Outer
Inner
Total
3 4 5 7
17 33 77 174
31 62 77 81
47 81 89 92
0.0026 0.013 0.10 1.0
0.0049 0.055 1.0 18.3
0.0075 0.068 1.1 19.3
Table 20 Comparison N
equations
0.0005 0.001 0.001 0.005
of times Random
Linear approximation ANICCG
TANICCG
Ratio
ANICCG
0.0064 0.082 1.36 29.9
0.0048 0.050 0.94 15.0
0.74 0.61 0.69 0.50
0.010 0.11 1.98 35.4
approximation TANICCG
Ratio
0.0075 0.068 1.1 20.8
0.75 0.62 0.56 0.59
In Tables 18-19, we give corresponding results for the TANICCG method using the convergence test (4.5). Results are given for only the optimum (Y’S,which are listed in the final columns of the tables. The total times for the two methods are then compared in Table 20. Table 20 shows that truncation allows a savings in time of up to a factor of 2, with generally greater savings for the larger problem sizes. Comparing the number of iterations in Tables 18 and 19 to Tables 16 and 17, we see that TANICCG required more outer iterations in most cases, but considerably reduced the number of inner iterations, which results in lower times. Table 21 shows the sensitivity to (Y for the N = 63 problem. Note that for the random approximation, an extremely small value of cx is required to reduce the number of outer iterations to that of the ANICCG method. As with the two-dimensional problem, we would like to determine suitable values of (Yfor the larger problems by means of the smaller problems. Tables 22 and 23, which correspond to Tables 14 and 15, show the effect of using the optimum (Yfor the N = 7 and N = 15 problems as the cy’s for the larger problems.
Table 21 TANICCG a
0.01 0.005 0.001 0.0005
iterations,
different
01, N = 63 (Y
Linear approximation Outer
Inner
Time
6 5 4 4
152 13.5 146 152
17.1 15.0 15.8 16.6
0.01 0.005 0.0001 0.0000001
Random
approximation
Outer
Inner
Time
8 7 5 4
177 174 195 250
20.4 19.3 20.8 26.13
B.M. Averick, J.M. Ortega / Solution of nonlinear Poisson-type
equations
455
Table 22 Times using (~“~~(7) versus Q(N) ‘V
15 31 63
Linear approximation
31 63
approximation
(Y= 0.0005
‘Y&N)
%
(Y= 0.0005
Q,,(N)
%
0.062 1.12 16.6
0.050 0.94 15.0
24 19 11
0.075 1.28 23.9
0.068 1.1 19.3
10 16 24
Table 23 Times using a,r,(l5) N
Random
versus LY,+(N)
Linear approximation
Random
approximation
1y= 0.005
“opt(N)
%
(Y= 0.001
%,t(N)
%
0.94 15.0
0.94 15.0
0 0
1.1 21.2
1.1 19.3
0 10
Use of a,,,(7) on the larger problems is only fair, resulting in increases of times of lo-24%. Use of q,,,(15) on the larger problems is unusually good and cannot be expected to happen too often. However, a reasonable strategy is to expend a modest effort to obtain q,rt for a smaller problem size and then use this value for larger problems. In summary, we can say that the truncated approximate Newton method has good potential for nonlinear Poisson-type problems of the type we have considered. Acknowledgement We are indebted paper.
to the referees
for suggestions
that helped
to improve
the presentation
of the
References [1] C. Ashcraft and R. Grimes, On vectorizing incomplete factorization and SSOR preconditioners, SIAM J. Sci. Statist. Comput. 9 (1988) 122-151. [2] B. Averick, Solution of nonlinear Poisson-type equations on vector computers, Ph.D. Dissertation, Department of Applied Mathematics, University of Virginia, Charlottesville, VA (1991). [3] T. Chan, C. Kuo and C. Tong, Parallel elliptic preconditioners: Fourier analysis and performance on the connection machine, Comput. Phys. Comm. 53 (1989) 237-252. [4] R. Dembo, S. Eisenstat and T. Steihaug, Inexact Newton methods, SIAM J. Numer. Anal. 19 (1982) 400-408. [5] I. Duff and G. Meurant, The effect of ordering on preconditioned conjugate gradients, BIT 29 (1989) 635-657. [6] J. Ortega, Introduction to Parallel and Vector Solution of Linear Systems (Plenum, New York, 1988). [7] J. Ortega and W. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables (Academic Press, New York, 1970). [8] Y. Saad and M. Schultz, GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Statist. Comput. 7 (1986) 856-869. [9] H. van der Vorst, ICCG and related methods for 3D problems on vector computers, Comput. Phys. Comm. 53 (1989) 223-235. [lo] T. Zang, Private communication (1989).