Preconditioned Newton Methods using Incremental Unknowns Methods for the Resolution of a Steady-State Navier-Stokes-like Problem

Olivier Goyon
Laboratoire d'Analyse Numérique d'Orsay, Université Paris XI, 91405 Orsay Cedex, France

and

Pascal Poullet
UFR Sciences, Département de Mathématiques, Université des Antilles et de la Guyane, 97159 Pointe-à-Pitre Cedex, Guadeloupe, French West Indies
ABSTRACT

In a previous work, one of the authors has studied a numerical treatment (by fully implicit discretizations) of a two-dimensional Navier-Stokes-like problem and has proved existence and convergence results for the resulting discretized systems with homogeneous Dirichlet boundary conditions. In this work, we propose some new preconditioned multilevel versions of inexact-Newton algorithms to solve these equations. We also develop another multilevel preconditioner for a nonlinear GMRES algorithm. All of the preconditioners are based on incremental unknowns formulations. © Elsevier Science Inc., 1997
1. INTRODUCTION

Recent advances in computational fluid dynamics techniques and computer technologies have led to the development of very powerful tools for the simulation and analysis of fluid flow. In particular, incremental unknowns are a tool arising from the mathematical theory of the Navier-Stokes equations (NSE) [2]. Although the primary motivation of their introduction was to hierarchize the solution of the evolutive NSE in finite differences, Chen and Temam proved that incremental unknowns are a powerful and simple technique to accelerate the resolution of elliptic-like problems [3, 4]. In order to extend this result, we are interested in this study in the implicit resolution of a steady-state nonlinear system of equations introduced by R. Temam in its evolutive form [5] and used by J. Laminie et al. [6]. For this purpose, we implement two of the most common methods: the Newton method [7] (in fact, a truncated iterative Newton method [8, 9]) and the nonlinear GMRES method [10] (also called the Newton-GMRES method). As will be shown, such methods lead to solving several non-symmetric linear systems. The use of incremental unknowns formulations allows us to define two easy ways of multilevel preconditioning. The first one (applied to the truncated iterative Newton method) is to solve the linear systems in the incremental unknowns basis by the Bi-cgstab [11] and TF-QMR [12] methods. The second one consists of using incremental unknowns in the scaled preconditioned nonlinear GMRES algorithm. Even if some of our methods can be considered as quasi-Newton methods [13-15], we set our study apart from this concept.

The paper is organized as follows. In section 2 we present the Newton methods. In section 3 we introduce the incremental unknowns formulation. Section 4 deals with the model problem and its discretization. After verifying the properties of iterative Newton methods, we detail the algorithms of two variants of Newton's method in section 5. The incremental unknowns preconditioner for the inner iterations is presented in section 6, and the incremental unknowns GMRES method is introduced in section 7. A numerical discussion is presented in section 8. The last section is devoted to concluding remarks and future extensions.
2. CONVERGENCE PROPERTIES OF NEWTON, ITERATIVE NEWTON AND INEXACT NEWTON METHODS

Let

    Fu = 0    (1)
be a system of m nonlinear equations where the Jacobian matrix F’ is sparse and non-symmetric. We denote by u* the solution (if it exists) of the system. The Newton method can be derived by assuming that the following linear mapping
    L_k(u) = F u^k + F'(u^k)(u - u^k)    (2)
is a "good" approximation to F, where u^k is an approximation to u*. In this case, a probably better approximation u^{k+1} to u* can be obtained by solving

    L_k(u) = 0.    (3)

This is equivalent to solving

    F'(u^k)(u - u^k) = -F u^k    (4)

or

    F'(u^k) s^{k+1} = -F u^k,    (5)
    u = s^{k+1} + u^k,    (6)

and then u becomes a new approximation to u*, and can be denoted by u^{k+1}.
The sequence of iterates (u^k) = u^0, u^1, ..., u^k is called the overall iteration of the Newton method. Under suitable assumptions, the sequence is well defined and converges quadratically to u*. The resolution of (5) to obtain s^{k+1} will be called the inner iteration. If one decided to factorize F'(u^k) (this must be done at each overall iteration), this would be inefficient computationally, and a large amount of storage (O(m log(m)) locations) would be required for each factorization (see Sherman [16]). Clearly, the cost of the Newton method is essentially linked to the cost of each inner iteration of the method. Thus, a good way to reduce the cost of the method is to develop fast solvers for (5). If these solvers are iterative methods, one speaks of Newton-iterative methods. In this case, an iteration of the iterative method used to solve (5) is called a minor iteration. The main disadvantage of Newton-iterative methods (compared to the classical Newton method) is that the quality of the approximate solution s^{k+1} of (5) depends on the condition number of the matrix F'(u^k), as underlined in the following proposition.
PROPOSITION 1. Let Y be an approximate solution of AX = B, with A a full rank matrix and ||B|| ≠ 0, and let R be the residual R = AY - B. Then

    ||Y - X|| / ||X|| ≤ κ(A) ||R|| / ||B||    (7)

where κ(A) is the condition number ||A^{-1}|| × ||A||.
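A tiny numerical illustration of Proposition 1 (our own toy example, not taken from the paper): with an ill-conditioned A, an approximate solution Y can be visibly wrong while its residual is almost negligible.

    import numpy as np

    # Toy example for Proposition 1: an ill-conditioned 2x2 system.
    A = np.array([[1.0, 1.0],
                  [1.0, 1.0 + 1e-10]])
    X = np.array([1.0, 1.0])           # exact solution of A X = B
    B = A @ X
    Y = X + np.array([1e-3, -1e-3])    # a perturbed "approximate solution"
    R = A @ Y - B                      # its residual

    print(np.linalg.cond(A))                          # kappa(A) ~ 4e10
    print(np.linalg.norm(R) / np.linalg.norm(B))      # ~ 3.5e-14: tiny residual
    print(np.linalg.norm(Y - X) / np.linalg.norm(X))  # 1e-3: error 10 orders larger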
Owing to (7), it is obvious that if the matrix A is ill conditioned (κ(A) >> 1), then the residual R may be small while Y is not close to X. In other words, if F'(u^k) is ill conditioned, the residual of the iterative method (used to solve (5)) may be small and s^{k+1} still not be a "good" approximation of -[F'(u^k)]^{-1}(F u^k). Consequently (5) is only approximately solved. Let F be a nonlinear mapping from R^m into R^m with the following properties:

(i) F is continuously differentiable in an open convex set D,
(ii) there exists u* in D such that Fu* = 0 and F'(u*) is non-singular.
THEOREM 2. The equation Fu = 0 has a locally unique solution u*, and there is an open set D which contains u* such that for all u^0 in D, the Newton iterates (5)-(6) are well defined, remain in D and converge to u*. Moreover, there is a sequence {α_k} → 0 such that

    ||u^{k+1} - u*|| ≤ α_k ||u^k - u*||    (8)

(the convergence is super-linear). If F' also satisfies a Lipschitz condition at u*: there exists K such that

    ||F'(u) - F'(u*)|| ≤ K ||u - u*||,    (9)

then there exists a constant β such that

    ||u^{k+1} - u*|| ≤ β ||u^k - u*||^2    (10)

(the convergence becomes quadratic).
The proof can be found in Ortega and Rheinboldt [7]. A disadvantage of the Newton method is that u^0 must be close to u*. This is a consequence of the set D being likely small in Theorem 2. Some techniques have been used to overcome this (see Powell [17] or Brown and Saad [10]). The class of inexact Newton methods has been studied by Dembo, Eisenstat and Steihaug in [18]. They have denoted by r^{k+1} the residual of (5), where

    r^{k+1} = F'(u^k) s^{k+1} + F u^k    (11)
and proved a local convergence theorem for the inexact Newton method. They also characterized the convergence order of such a method in terms of the residual's rate of convergence:

THEOREM 3. If ||r^{k+1}|| ≤ η ||F_h u^k||^2 (where η is any nonnegative constant), then the sequence {u^k} of inexact Newton iterates converges quadratically to u*.

In the class of inexact Newton methods, we can include truncated Newton methods [8, 9], which perform only a limited number of iterations for the resolution of (5) by an iterative solver. As (5) is not exactly solved, we are still in the case of inexact Newton methods. For the stopping criterion, we use a simple form of truncated Newton strategy: at the kth overall iteration, we impose

    ||r^{k+1}|| ≤ C × 10^{-k} ||F_h u^k||^2    (12)

where C depends on the system (1) to solve. By Theorem 3, (12) ensures better than quadratic convergence.
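As an illustration, a minimal sketch of the resulting overall/inner structure with the test (12); the function names and the stand-in direct inner solver are ours, not the paper's.

    import numpy as np

    def truncated_newton(F, Fprime, u0, C=1e-3, tol=1e-10, max_outer=50):
        """Sketch of a truncated Newton iteration: F(u) returns the residual
        vector and Fprime(u) the Jacobian matrix; tolerances are illustrative."""
        u = u0.copy()
        for k in range(max_outer):
            Fu = F(u)
            if np.linalg.norm(Fu) < tol:      # overall stopping criterion
                break
            # Inner tolerance from (12): ||r^{k+1}|| <= C 10^{-k} ||F_h u^k||^2
            eta = C * 10.0**(-k) * np.linalg.norm(Fu)**2
            s = truncated_inner_solve(Fprime(u), -Fu, eta)  # solve (5) inexactly
            u = u + s                                       # update (6)
        return u

    def truncated_inner_solve(J, b, eta):
        """Stand-in for the truncated Bi-cgstab/TF-QMR inner solver: a direct
        solve here; an iterative solver would instead stop as soon as the
        minor-iteration residual ||J s - b|| drops below eta."""
        return np.linalg.solve(J, b)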
3. THE INCREMENTAL UNKNOWNS FORMULATION
Let us present the incremental unknowns formulation in the case of two grids in two dimensions. The coarse grid consists of the nodes of type (2i, 2j) and the fine grid consists of the nodes of type (2i+1, 2j), (2i, 2j+1) and (2i+1, 2j+1). If we denote by U an unknown of the initial system, by Y an unknown of the coarse grid and by Z an unknown of the fine grid, the Z are called incremental unknowns. Following Chen and Temam [3], the formulas which define Y and Z are
    Y_{2i,2j} = U_{2i,2j}
    Z_{2i+1,2j} = U_{2i+1,2j} - (1/2)[U_{2i,2j} + U_{2i+2,2j}]
    Z_{2i,2j+1} = U_{2i,2j+1} - (1/2)[U_{2i,2j} + U_{2i,2j+2}]
    Z_{2i+1,2j+1} = U_{2i+1,2j+1} - (1/4)[U_{2i,2j} + U_{2i+2,2j} + U_{2i,2j+2} + U_{2i+2,2j+2}]    (13)
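For concreteness, a sketch of this two-grid transform acting on a nodal field with (2n + 1) nodes per direction; the function name and array conventions are ours.

    import numpy as np

    def nodal_to_iu(U):
        """Two-grid incremental unknowns transform (13). U is a (2n+1, 2n+1)
        nodal array; the result holds Y at even-even nodes and Z elsewhere."""
        W = U.astype(float).copy()
        # Z at (2i+1, 2j): subtract the mean of the two horizontal neighbors
        W[1::2, 0::2] = U[1::2, 0::2] - 0.5 * (U[0:-1:2, 0::2] + U[2::2, 0::2])
        # Z at (2i, 2j+1): subtract the mean of the two vertical neighbors
        W[0::2, 1::2] = U[0::2, 1::2] - 0.5 * (U[0::2, 0:-1:2] + U[0::2, 2::2])
        # Z at (2i+1, 2j+1): subtract the mean of the four diagonal neighbors
        W[1::2, 1::2] = U[1::2, 1::2] - 0.25 * (U[0:-1:2, 0:-1:2] + U[2::2, 0:-1:2]
                                                + U[0:-1:2, 2::2] + U[2::2, 2::2])
        return W    # Y = U at the (2i, 2j) nodes is left unchanged

    W = nodal_to_iu(np.random.rand(9, 9))   # 9 = 2*4 + 1 nodes per direction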
Using recursively the last formulas on several grids, we are able to define multilevel schemes in incremental unknowns. Let A be the matrix of the initial linear problem where the unknowns are ordered grid by grid, and let S be the transfer matrix of (13). The initial nodal system is rewritten as

    S^T A S ũ = S^T b    (14)

where ũ is the reordered vector of incremental unknowns and b the reordered right-hand side. In the case of some elliptic problems (Poisson equation), it is demonstrated in [4] that if κ(A) is the condition number of A, we have

    κ(S^T A S) << κ(A).    (15)
Moreover, the condition number behaves like O(N^2) for A, while it is O(log^2(N)) for S^T A S in the two-dimensional case. Although the structure of S^T A S is complex in dimensions greater than one, numerical tests ([19] or [20]) proved that we can execute the matrix-vector product S^T A S x̃ efficiently by a splitting. A study of the algebra of incremental unknowns in dimension three can be found in [19, 20].

4. THE MODEL PROBLEM AND TEST FUNCTIONS
In this section we present the model problem and the test functions we use: let us introduce an algebraic system of nonlinear equations which comes from a steady-state Navier-Stokes-like problem [5]. This problem is so called because it presents the main part of the computational difficulties of the original NSE:

    -ν Δu + (u · ∇)u + (1/2)(∇ · u) u = f    in Ω    (16)
    u = 0    on ∂Ω    (17)

where ν is a strictly positive (small) number.
By using centered second-order finite difference schemes, we look for the discrete solution u_h = (u_{i,j}, v_{i,j}) of F_h u_h = 0, namely

    -ν [ (u_{i+1,j} - 2u_{i,j} + u_{i-1,j})/h_x^2 + (u_{i,j+1} - 2u_{i,j} + u_{i,j-1})/h_y^2 ]
      + u_{i,j} (u_{i+1,j} - u_{i-1,j})/(2h_x) + v_{i,j} (u_{i,j+1} - u_{i,j-1})/(2h_y)
      + (1/2) u_{i,j} [ (u_{i+1,j} - u_{i-1,j})/(2h_x) + (v_{i,j+1} - v_{i,j-1})/(2h_y) ] - f_{i,j} = 0,    (18)

    -ν [ (v_{i+1,j} - 2v_{i,j} + v_{i-1,j})/h_x^2 + (v_{i,j+1} - 2v_{i,j} + v_{i,j-1})/h_y^2 ]
      + u_{i,j} (v_{i+1,j} - v_{i-1,j})/(2h_x) + v_{i,j} (v_{i,j+1} - v_{i,j-1})/(2h_y)
      + (1/2) v_{i,j} [ (u_{i+1,j} - u_{i-1,j})/(2h_x) + (v_{i,j+1} - v_{i,j-1})/(2h_y) ] - g_{i,j} = 0,    (19)

for 2 ≤ i ≤ N_x - 1 and 2 ≤ j ≤ N_y - 1, together with the boundary conditions

    u_{i,j} = 0    if i or j ∈ {1, N_x, N_y},    (20)
    v_{i,j} = 0    if i or j ∈ {1, N_x, N_y}.    (21)
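For illustration, a vectorized sketch of the evaluation of F_h u_h on the interior nodes, assuming fields stored as (N_x, N_y) arrays with boundary values already imposed; the names are ours.

    import numpy as np

    def residual(u, v, f, g, nu, hx, hy):
        """Sketch of F_h u_h for (18)-(19); the boundary rows of the result,
        which correspond to (20)-(21), are simply left at zero."""
        Fu, Fv = np.zeros_like(u), np.zeros_like(v)
        ux = (u[2:, 1:-1] - u[:-2, 1:-1]) / (2 * hx)    # centered du/dx
        uy = (u[1:-1, 2:] - u[1:-1, :-2]) / (2 * hy)    # centered du/dy
        vx = (v[2:, 1:-1] - v[:-2, 1:-1]) / (2 * hx)
        vy = (v[1:-1, 2:] - v[1:-1, :-2]) / (2 * hy)
        lap_u = ((u[2:, 1:-1] - 2 * u[1:-1, 1:-1] + u[:-2, 1:-1]) / hx**2
                 + (u[1:-1, 2:] - 2 * u[1:-1, 1:-1] + u[1:-1, :-2]) / hy**2)
        lap_v = ((v[2:, 1:-1] - 2 * v[1:-1, 1:-1] + v[:-2, 1:-1]) / hx**2
                 + (v[1:-1, 2:] - 2 * v[1:-1, 1:-1] + v[1:-1, :-2]) / hy**2)
        uc, vc = u[1:-1, 1:-1], v[1:-1, 1:-1]
        div = ux + vy                                   # discrete div(u, v)
        Fu[1:-1, 1:-1] = (-nu * lap_u + uc * ux + vc * uy
                          + 0.5 * div * uc - f[1:-1, 1:-1])
        Fv[1:-1, 1:-1] = (-nu * lap_v + uc * vx + vc * vy
                          + 0.5 * div * vc - g[1:-1, 1:-1])
        return Fu, Fv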
It is easy to verify that the Jacobian matrix F_h'(u_h^k) of (18)-(19) at the kth overall iteration is sparse and non-symmetric. In order to study the properties of the algorithm, first tests are performed with exact solutions (called the first test function): we determine f_{i,j} and g_{i,j} such that

    u(x, y) = sin(2πx) sin(2πy)    in Ω = ]0, 1[^2,    (22)
    v(x, y) = u(x, y),    (23)
    u(x, y) = v(x, y) = 0    on the boundaries,

are solutions (the discrete approach has also been developed in [20]).
The exact solutions (22)-(23) are denoted by u* and the computed solutions (at the kth overall iteration) are denoted by u^k. In the theoretical part of this work ([1]), we have studied a finite difference discretization applied to Navier-Stokes-like equations with homogeneous Dirichlet boundary conditions. Since Navier-Stokes equations can generate boundary layers, our aim is also to choose numerical tests which generate boundary layers. We obtain this behaviour by setting the right-hand side of the equations to zero and prescribing non-homogeneous Dirichlet boundary conditions, i.e. by modifying conditions (20)-(21). A possible choice (already used for 2D Burgers equations by Jain and Holla and by Goyon [21, 19]) is given by the following initial conditions (called the second test function):

    u^0(x, y) = sin(πx) + cos(πy)    in Ω̄ = [0, 0.5]^2,    (24)
    v^0(x, y) = x + y    in Ω̄.    (25)

The boundary conditions follow from the initial conditions. This type of solution is interesting because of the growth of gradients along the boundaries (see Fig. 1).
5. EXPERIMENTAL ANALYSIS OF NEWTON AND TRUNCATED NEWTON ALGORITHMS
In this section, we first study the convergence properties of the iterative Newton method. Then, we present and justify our choice of truncated strategy. We also discuss the optimization of Newton algorithms on vectorial computers.

5.1. Rate of Convergence

All of the following computations were carried out on a SUN Sparc 20 in double precision. For the stopping criterion of the overall iteration, we have chosen ||F_h u_h^k||_2 < 10^{-…}. Each inner iteration is solved by the Bi-cgstab method [11]. We recall that:

• Linear convergence is achieved if there exists a constant α such that

    ||u^{k+1} - u*|| ≤ α ||u^k - u*||.    (26)
• Super-linear convergence is achieved if there exists a sequence {α_k} decreasing to zero such that

    ||u^{k+1} - u*|| ≤ α_k ||u^k - u*||.    (27)
FIG. 1. Second test function (129 × 129 nodes, ν = 0.01).
• Quadratic convergence is achieved if there exists a constant β such that

    ||u^{k+1} - u*|| ≤ β ||u^k - u*||^2.    (28)
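For illustration, these ratios can be estimated from stored iterates and the exact solution as follows (a sketch, with names of our choosing).

    import numpy as np

    def convergence_ratios(iterates, u_star):
        """Estimate the ratios in (26)-(28) from a list of Newton iterates;
        fully converged (zero-error) iterates should be excluded beforehand."""
        e = [np.linalg.norm(u - u_star) for u in iterates]
        alpha = [e[k + 1] / e[k] for k in range(len(e) - 1)]     # (26)-(27)
        beta = [e[k + 1] / e[k]**2 for k in range(len(e) - 1)]   # (28)
        return alpha, beta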
In order to verify the rate of convergence of our algorithm, we study (thanks to the exact solutions) the sequence α_k and the constant β. Fig. 2 shows the decrease of the sequence α_k for various values of ν. Nevertheless, we notice, at ν = 0.02, a peculiar last value of the sequence. This can be explained by rounding errors in operations with non-significant small numbers (< 10^{-15}) when the residual of the Newton iteration is close to zero. By referring to Table 1, we can notice bounded variations, with ν, of the value of β: the algorithm seems to have a quadratic rate of convergence.

5.2. A Truncated Strategy

The idea of truncated Newton methods is to control the number of iterations of the iterative method used to solve (5) and to stop the resolution early by using (12). If s^{k+1} determines the descent direction of the Newton algorithm, a truncated strategy gives an approximation of this descent direction. In order to preserve the fast convergence property of the standard Newton method, the approximate descent direction obtained by a truncated strategy must be close to the Newton descent direction. In [9], Dembo and Steihaug showed that, as one approaches the solution, the truncated Newton method gets closer to the standard Newton method. It seems obvious that preconditioning techniques can be used in conjunction with truncated strategies. In Tables 2-4, we compare the CPU time, the number of overall iterations and the cumulative number of minor iterations of the two approaches. We set the parameter C = 1/1000 in (12) to define a test smaller than the residual of the first overall iteration. From Table 2 to Table 4, we show experimentally that truncated Newton methods always reduce the cumulative number of minor iterations (without changing the number of overall iterations) and hence also reduce the CPU cost of the resolutions.

5.3. Computational Aspect

In this section, we discuss the ability of the algorithm to be implemented on vectorial computers. For this purpose, the following numerical tests have been performed on the Cray C94/C98 of the CNRS I.D.R.I.S. center. By using the High Performance Monitor Cray software (HPM), we could determine the number of millions of floating-point operations per second (M-flops) achieved by our program. A large number of M-flops indicates a high level of vectorization (the peak performance of the C94/C98 has been estimated at 960 M-flops per processor).
FIG. 2. The sequence α_k: residual versus overall iteration, first test function, 65 × 65 and 129 × 129 nodes, ν = 0.1, 0.05, 0.02.
TABLE 1
FIRST TEST FUNCTION, DIFFERENT VALUES OF β, 65² AND 129² NODES

Nodes              ν = 0.1          ν = 0.05         ν = 0.02
65 × 65 nodes      0.2 < β < 0.4    0.4 < β < 0.9    0.1 < β < 1.7
129 × 129 nodes    0.2 < β < 0.4    0.4 < β < 0.9    0.1 < β < 1.7
TABLE 2
FIRST TEST FUNCTION, 129 × 129 NODES, ν = 0.1

                            Number of            Cumulative number      CPU time
                            overall iterations   of minor iterations    in seconds
Iterative Newton method     5                    1328                   255s
Truncated Newton method     5                    888                    175s
TABLE 3
FIRST TEST FUNCTION, 129 × 129 NODES, ν = 0.05

                            Number of            Cumulative number      CPU time
                            overall iterations   of minor iterations    in seconds
Iterative Newton method     6                    1586                   303s
Truncated Newton method     6                    1027                   200s
TABLE 4
FIRST TEST FUNCTION, 129 × 129 NODES, ν = 0.02

                            Number of            Cumulative number      CPU time
                            overall iterations   of minor iterations    in seconds
Iterative Newton method     11                   3396                   640s
Truncated Newton method     11                   2075                   395s
In Table 5, we summarize the M-flops performance for various discretizations. These tests have been achieved with ν = 0.1 and the first test function.

TABLE 5
M-FLOPS PERFORMANCE

Number of nodes    65 × 65    129 × 129    257 × 257    513 × 513
M-flops            440        548          563          572
From Table 5, we notice that the basic operations (especially in the matrix-vector products) are executed in a vectorial manner in most cases.

6. THE INCREMENTAL UNKNOWNS METHOD AS A STANDARD PRECONDITIONER FOR THE INNER ITERATIONS
In this section, we present two solvers, using incremental unknowns formulations, for the resolution of (5). They are based on Van der Vorst's Bi-cgstab method [11] and on the TF-QMR method [12] proposed by R. W. Freund. It is well known nowadays that these methods are among the most popular ways of solving the large sparse non-symmetric systems arising from partial differential equations.

6.1. The Incremental Unknowns Bi-cgstab (I.U.-Bi-cgstab) Method
We present a Bi-cgstab algorithm in the incremental unknowns basis, for the resolution of (5):

The I.U.-Bi-cgstab algorithm

(0) • Choose s̃_0, set r̃_0 = -S^T F_h u_h^k - S^T F_h'(u_h^k) S s̃_0,
    • choose r̄_0 such that (r̄_0, r̃_0) ≠ 0 (for example r̄_0 = r̃_0),
    • ρ_0 = α = ω_0 = 1 (three scalars),
    • v_0 = p_0 = 0 (two vectors).
(1) For i = 1, ... do:
    • ρ_i = (r̄_0, r̃_{i-1}),
    • β = (ρ_i / ρ_{i-1})(α / ω_{i-1}),
    • p_i = r̃_{i-1} + β(p_{i-1} - ω_{i-1} v_{i-1}),
    • v_i = S^T F_h'(u_h^k) S p_i,
    • α = ρ_i / (r̄_0, v_i),
    • x = r̃_{i-1} - α v_i,
    • t = S^T F_h'(u_h^k) S x,
    • ω_i = (t, x) / (t, t),
    • s̃_i = s̃_{i-1} + α p_i + ω_i x,
    • if s̃_i is accurate enough then goto (2),
    • r̃_i = x - ω_i t.
(2) End of the resolution.
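To make the role of S and S^T concrete, here is a minimal matrix-free sketch of this inner solve using SciPy's bicgstab; apply_J, apply_S and apply_St (the actions of F_h'(u_h^k), S and S^T) are assumed user-supplied, and all names are ours.

    from scipy.sparse.linalg import LinearOperator, bicgstab

    def iu_bicgstab_step(apply_J, apply_S, apply_St, F_u, n):
        """One inner Newton solve, S^T F'(u) S s̃ = -S^T F(u), in the
        incremental unknowns basis; a sketch with illustrative names."""
        A = LinearOperator((n, n),
                           matvec=lambda z: apply_St(apply_J(apply_S(z))))
        s_tilde, info = bicgstab(A, -apply_St(F_u))
        return apply_S(s_tilde), info     # Newton correction, nodal basis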
We notice that all matrix-vector products involve the transfer matrix S and its transpose. This produces a change of basis and hence a preconditioning-like behavior.

6.2. The Incremental Unknowns TF-QMR (I.U.-TF-QMR) Method

Recently, Freund introduced TF-QMR, a variant of QMR, as a new method for solving non-symmetric linear systems. As we will see later, it works well in the incremental unknowns basis.

The I.U.-TF-QMR algorithm
(0) • Choose s̃_0, set p_0 = x_0 = r_0 = -S^T F_h u_h^k - S^T F_h'(u_h^k) S s̃_0,
    • set v_0 = S^T F_h'(u_h^k) S p_0, set d_0 = 0,
    • set τ_0 = ||r_0||, θ_0 = 0, η_0 = 0,
    • choose r̄_0 such that ρ_0 = (r̄_0, r_0) ≠ 0 (for example r̄_0 = r_0).
(1) For i = 1, ... do:
    • set σ_{i-1} = (r̄_0, v_{i-1}),
    • set α_{i-1} = ρ_{i-1} / σ_{i-1},
    • q_i = x_{i-1} - α_{i-1} v_{i-1},
    • r_i = r_{i-1} - α_{i-1} S^T F_h'(u_h^k) S (x_{i-1} + q_i).
(2) For j = 2i - 1, 2i do:
    • set θ_j = ||r_i|| / τ_{j-1},
    • set c_j = 1 / sqrt(1 + θ_j^2),
    • set τ_j = τ_{j-1} θ_j c_j and η_j = c_j^2 α_{i-1},
    • set d_j = y_j + (θ_{j-1}^2 η_{j-1} / α_{i-1}) d_{j-1}, where y_j = x_{i-1} if j = 2i - 1 and y_j = q_i if j = 2i,
    • s̃_j = s̃_{j-1} + η_j d_j,
    • if s̃_j is accurate enough then goto (4).
(3) • Set ρ_i = (r̄_0, r_i),
    • β_i = ρ_i / ρ_{i-1},
    • x_i = r_i + β_i q_i,
    • p_i = x_i + β_i (q_i + β_i p_{i-1}),
    • v_i = S^T F_h'(u_h^k) S p_i.
(4) End of the resolution.

One can refer to section 8 for the discussion of the numerical results.
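The same matrix-free pattern applies to TF-QMR; a sketch using SciPy's tfqmr (available from SciPy 1.8), under the same assumptions as the Bi-cgstab sketch above.

    from scipy.sparse.linalg import LinearOperator, tfqmr

    def iu_tfqmr_step(apply_J, apply_S, apply_St, F_u, n):
        """Same inner solve as in the Bi-cgstab sketch; only the Krylov
        kernel changes."""
        A = LinearOperator((n, n),
                           matvec=lambda z: apply_St(apply_J(apply_S(z))))
        s_tilde, info = tfqmr(A, -apply_St(F_u))
        return apply_S(s_tilde), info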
7. AN INCREMENTAL UNKNOWNS PRECONDITIONER FOR THE NONLINEAR GMRES METHOD
Brown and Saad [10] have proposed the nonlinear GMRES method, which roughly consists of an inexact Newton algorithm where the inner iterations are solved by the GMRES method. Even if Krylov methods can work with an approximate action of the Jacobian, in our finite differences case we obtain a Jacobian matrix which does not need matrix storage. Moreover, since its evaluation cost is very similar to that of the approximate one, we decide to use the exact discrete Jacobian matrix. One of the advantages of our GMRES implementation is to improve global convergence properties. Indeed, with the local quadratic model applied to the mapping u → ||F_h u||_2 (where ||.||_2 is the Euclidean norm), the basin of attraction of Newton-like methods is well extended [23, 10]. The notations below follow Brown and Saad [10]. Let us denote by K_d (d_max = 50) the classical Krylov subspace spanned by {r^(0), J r^(0), ..., J^{d-1} r^(0)}. To form the approximate solution (by GMRES) we need to find the solution of the following least squares problem:

    min_{z ∈ K_d} ||r^(0) - J z|| = min_{y ∈ R^d} ||β e_1 - H̄_d y||.

The classical way to perform this is to factorize H̄_d into Q_d R_d with plane rotations, where Q_d is a unitary (d+1) × (d+1) matrix and R_d an upper triangular (d+1) × d matrix whose last row is null [22]. The implementation is quite simple, because we update the factorization progressively as each column of H̄_d appears. After d steps, the preceding decomposition is achieved and, since Q_d H̄_d = R_d,

    ||β e_1 - H̄_d y|| = ||g_d - R_d y||

where g_d = Q_d β e_1.
Hence the approximate solution is updated by projecting y (the solution of the minimization problem over R^d) onto V_d, the orthonormal basis of K_d. It is worth pointing out that it is not necessary to compute the approximate solution to obtain ρ_j in the algorithm. The line-search backtracking procedure is based upon the ideas of Dennis and Schnabel [23], [10]. Given an approximate solution u^k, let p be a descent direction. We want to find a step in this direction that yields an acceptable u^{k+1}. This choice is led by the Goldstein-Armijo α-condition [23, 10]. The detailed algorithm is given in [10]. In order to improve the efficiency of their algorithm, Saad and Schultz have suggested the use of preconditioning for solving the linear systems. This leads to transforming all systems into equivalent ones, thanks to a change of basis of the type

    J̃ = Q^{-1} J(u_h^k) P^{-1}.    (29)
For further details, see [22]. It is of prime importance to ensure that the preconditioned problem will be easier to solve than the original one. We propose to use the incremental unknowns basis as a preconditioner, with Q^{-1} = I and P^{-1} = S. The algorithm becomes:

The I.U.-GMRES algorithm

(0) Choose u^0.
(1) Arnoldi process: choose a tolerance ε, let s̃^(0) be an initial guess and r̃^(0) = -(F + J̃ s̃^(0)), where F = F_h u_h^k and J̃ = F_h'(u_h^k) × S;
    (a) compute β = ||r̃^(0)||_2 and v_1 = r̃^(0)/β;
    for each j = 1, 2, ... do:
    (b) compute J̃ v_j,
    (c) h_{i,j} = (J̃ v_j, v_i) for i = 1, ..., j,
    (d) v̂_{j+1} = J̃ v_j - Σ_{i=1}^{j} h_{i,j} v_i,
    (e) h_{j+1,j} = ||v̂_{j+1}||_2 and v_{j+1} = v̂_{j+1}/h_{j+1,j},
    (f) compute the residual norm ρ_j = ||F + J̃ s̃^(j)||_2 of the solution s̃^(j) we would obtain if we stopped at this step,
    (g) if ρ_j < ε, then set d = j and goto (2).
(2) Form the approximate solution (by GMRES):
    • let H̄_d be the Hessenberg (d+1) × d matrix whose nonzero entries are the h_{i,j}, and let V_d be formed by [v_1, ..., v_d],
    • find ȳ_d, the solution of the minimization problem min ||β e_1 - H̄_d y|| over all y in R^d (e_1 = (1, 0, ..., 0)^T),
    • compute s̃^(d) = s̃^(0) + z^(d) where z^(d) = V_d ȳ_d,
    • owing to the line-search backtracking procedure, form u^{k+1} = u^k + S s̃^(d). If u^{k+1} is accurate enough then goto (3), else goto (1).
(3) End of the resolution.
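A minimal sketch of the corresponding inner solve with SciPy's gmres, i.e. the right preconditioning (29) with Q^{-1} = I and P^{-1} = S; the line-search backtracking step is omitted, and apply_J, apply_S are assumed user-supplied operators.

    from scipy.sparse.linalg import LinearOperator, gmres

    def iu_gmres_step(apply_J, apply_S, F_u, n, eps=1e-6, d_max=50):
        """Inner solve of the I.U.-GMRES method: GMRES applied to
        J̃ = F_h'(u) S, with the step mapped back through S."""
        J_tilde = LinearOperator((n, n),
                                 matvec=lambda z: apply_J(apply_S(z)))
        s_tilde, info = gmres(J_tilde, -F_u, restart=d_max, atol=eps)
        return apply_S(s_tilde), info     # Newton correction, nodal basis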
In Figure 3, we observe that the residual decreases quickly in the I.U. basis. More details will be given in the next section.

FIG. 3. Convergence of the nonlinear GMRES method: residual of the nonlinear GMRES iterations versus overall iteration, second test function, 513 × 513 nodes.
8. NUMERICAL RESULTS
It is well known that even if the Bi-cgstab method works well in many cases, it can exhibit irregular convergence behaviour and breakdowns due to possible divisions by zero in the determination of the coefficients α and β. To eliminate most of the breakdowns, Freund has proposed, with TF-QMR, a transpose-free quasi-minimal residual approach [12].
Nevertheless, in some cases, previous works (by Goyon [19] and C. Linder, private communication) have exhibited very large zones (a huge number of iterations) over which the TF-QMR residual is almost constant. However, as can be seen in the tables, these troubles only appear in the nodal basis and produce relatively large CPU times. To prevent this erratic convergence, we limit the number of minor iterations to iter_max = 1000 for each overall iteration (but there is no way to determine iter_max a priori). In one case, for a small viscosity (Table 9), this limitation produces a non-convergent algorithm (noted N.C. in the table). Moreover, an automatic restart has also been implemented, to prevent occurrences of division by zero.

In order to study our multilevel algorithms, we develop a series of numerical tests (using the second test function) by varying the finest mesh size and the viscosity parameter. It is well known that in such an equation, the smaller the viscosity parameter is, the smaller the mesh size has to be to enable us to compute the solution. Tests made with the finest mesh (513 × 513 nodes) have been performed on the Cray C94/C98 (with a stopping criterion ε = 10^{-…} for the overall iteration). If we detail the blocks of the Jacobian matrix of (18)-(19), we observe that the viscous terms are dominant for

    ν >> sup_{i,j}(u_{i,j}, v_{i,j}) × h,    where h = h_x or h_y,    (30)

and then the Jacobian matrix is close (in a certain sense) to the Laplacian matrix. Previous works by Chen and Temam [3, 4] and by the authors [19, 20] have already proved the interest of incremental unknowns in this case. In other words, the incremental unknowns preconditioner should be very effective for large viscosity, with decreasing effectiveness as the viscosity diminishes. For simplicity, we group our tests in two classes, although the limit is not easily defined:

• The first class corresponds to a dominant viscous term for the problem. We illustrate this by setting 0.01 ≤ ν ≤ 1.0 and h = 1/128 (Table 6), and by setting 0.005 ≤ ν ≤ 0.1 and h = 1/512 (Table 7). In Table 6, the computations have been made with a SUN Sparc 20.
• For the second one, we can feel the more convective character of the problem by the choice of viscosity and mesh. This corresponds to Table 9 for ν = 0.001 and h = 1/512, but also almost exactly to Table 8 for ν = 0.005 and h = 1/512.
TABLE 6
EFFICIENCY OF THE I.U. PRECONDITIONER, VARIOUS VALUES OF THE VISCOSITY PARAMETER, 129 × 129 NODES, SECOND TEST FUNCTION

                                       ν = 1.0   ν = 0.5   ν = 0.1   ν = 0.05   ν = 0.01
Bi-cgstab method
  Cumulative iterations                  1103       790      1187       1526       1072
  CPU time                               206s      150s      225s       287s       212s
  Multilevel: cumulative iterations       128        97       139        173       1018
  Multilevel: CPU time                    43s       34s       47s        59s       275s
TF-QMR method
  Cumulative iterations                  1306      1811      2347       2186       5052
  CPU time                               325s      447s      594s       550s      1238s
  Multilevel: cumulative iterations        94        79        85        158        402
  Multilevel: CPU time                    38s       33s       35s        60s       140s
Nonlinear GMRES method
  Cumulative iterations                  1911      1670      1061        922        725
  CPU time                              1234s     1063s      623s       525s       384s
  Multilevel: cumulative iterations        83        81        97        120        312
  Multilevel: CPU time                    24s       23s       29s        39s       131s
TABLE 7
EFFICIENCY OF THE I.U. PRECONDITIONER, VARIOUS VALUES OF THE VISCOSITY PARAMETER, 513 × 513 NODES, SECOND TEST FUNCTION

                                       ν = 0.1   ν = 0.05   ν = 0.01   ν = 5·10⁻³
Bi-cgstab method
  Cumulative iterations                  6065       6866       4978        4762
  CPU time                               452s       517s       369s        350s
  Multilevel: cumulative iterations       167        204       1303        3404
  Multilevel: CPU time                    23s        28s       175s        466s
TF-QMR method
  Cumulative iterations                  8810       7464      13013       11011
  CPU time                               680s       568s       967s        828s
  Multilevel: cumulative iterations       197        226        807        1493
  Multilevel: CPU time                    28s        31s       110s        201s
Nonlinear GMRES method
  Cumulative iterations                  9754       7932       3509        3312
  CPU time                              1556s      1250s       538s        508s
  Multilevel: cumulative iterations       171        151        387         650
  Multilevel: CPU time                    15s        15s        48s         92s
TABLE 8
EFFICIENCY OF THE I.U. PRECONDITIONER, VARIOUS NUMBERS OF GRIDS, ν = 5·10⁻³, 513 × 513 NODES, SECOND TEST FUNCTION

                          1 grid     2 grids  3 grids  4 grids  5 grids  6 grids  8 grids  9 grids
                          standard   I.U.     I.U.     I.U.     I.U.     I.U.     I.U.     I.U.
Bi-cgstab
  Cumulative iterations     4762      2054     1046      839      955     1688     3580     3404
  CPU time                  350s      479s     191s     125s     134s     226s     488s     466s
TF-QMR
  Cumulative iterations    11011      7484     4815      974      603      705     1100     1493
  CPU time                  828s     1790s     884s     147s      83s      96s     147s     201s
Nonlinear GMRES
  Cumulative iterations     3312      1244      661      404      337      369      532      641
  CPU time                  523s      193s      93s      51s      41s      46s      73s      94s
In the context of (30), by means of unpublished computational tests, we first verified that truncated strategies are always useful with multilevel formulations using the Bi-cgstab and TF-QMR methods. We can also observe that, for a fixed viscosity parameter (Tables 6 and 7), the fastest decrease of the cumulative number of minor iterations is obtained with the maximum number of grids; this decrease is accompanied by a decrease of the CPU time: the preconditioner is very effective. We connect this last result with the remark above about the Jacobian structure.

TABLE 9
EFFICIENCY OF THE I.U. PRECONDITIONER, VARIOUS NUMBERS OF GRIDS, ν = 1·10⁻³, 513 × 513 NODES, SECOND TEST FUNCTION

                          1 grid     2 grids  3 grids  4 grids  5 grids  9 grids
                          standard   I.U.     I.U.     I.U.     I.U.     I.U.
Bi-cgstab
  Cumulative iterations     7111      7030     7028     8520    14338    Breakdown
  CPU time                  525s     1660s    1297s    1316s    2050s       -
TF-QMR
  Cumulative iterations     N.C.     10424     4446     2086     2167     5917
  CPU time                   -       2471s     811s     313s     300s     808s
Nonlinear GMRES
  Cumulative iterations     1495      1006      916     1104     1802     3766
  CPU time                  239s      155s     144s     179s     311s     685s
In these tables, the incremental unknowns multilevel methods consist of one node on the coarse grid and the maximum number of grids. In other words, when the viscous terms are dominant, we have shown that I.U. formulations reduce the cumulative number of minor iterations of the Newton method. Nevertheless, incremental unknowns are more expensive (in the CPU time cost of the matrix-vector products) than the standard one-level formulations (see [19]). By referring to Tables 6 and 7, we notice that this disadvantage does not seem to be prohibitive (except for two cases with the Bi-cgstab method: 129² nodes and ν = 0.01, 513² nodes and ν = 0.005). As a general rule, multilevel resolutions spend less CPU time than the standard ones. When (30) does not clearly hold, the best strategy is not to apply the maximum number of grids: there is always a better configuration of grids. This point of view is illustrated in Tables 8 and 9, in which we study all possible configurations of grids. The efficiency of the preconditioner is the consequence of two main factors: the I.U. preconditioner effect (which can be evaluated by the cumulative number of minor iterations) and the relative cost of each matrix-vector product (which depends on the number of grids, see [19]). Even if the best preconditioner effect is given by n_1 grids and the lowest CPU-time matrix-vector cost is always given by one grid, the conjunction of the two effects can produce the most interesting result for n_2 grids. For example, in Table 9 with the Bi-cgstab method, n_1 = 3 and n_2 = 1. In this table, we notice that the Bi-cgstab method generates breakdowns (divisions by zero). A comparative study exhibits that the most robust method with the lowest CPU time cost is the I.U.-GMRES method. Nevertheless, our algorithmic choices (line-search backtracking procedure, maximal size of the Krylov subspace) require more core storage than the other methods. To reduce the storage cost, one can use restarting, but restarting slows down the convergence. For this reason, I.U.-TF-QMR may be another good choice.
9. CONCLUDING REMARKS AND EXTENSIONS
In this work we have presented new strategies for inexact Newton methods. In particular, we have used an incremental unknowns multilevel formulation combined with the Bi-cgstab, TF-QMR and GMRES methods as a preconditioner of the Jacobian matrix. In many cases, numerical experiments have shown that our algorithms are at least as effective as the standard ones. As we have seen, the resolution methods for the linear systems do not require symmetric matrices. It is not necessary to apply the transforming operators involving the incremental unknowns formulation
in the same way as for linear symmetric elliptic problems (see section 3). If we consider the operators P and Q as algebraic transformations in the scaled preconditioned nonlinear GMRES algorithm, we can set P = S^{-1} and Q = I in order to accelerate the resolution of the nonlinear steady-state equations. We proved that this is the best way of preconditioning the nonlinear GMRES algorithm for our test problem. In other words, I.U.-GMRES is the most efficient procedure, in terms of CPU time, among our preconditioned hybrid Krylov methods. By analogy with classical multigrid solvers, our present idea is to develop V-cycle strategies using incremental unknowns. These cycles can be seen as a sequence of approximations of the Jacobian matrix: we can speak of quasi-Newton strategies. Furthermore, in this framework, several nonlinear multilevel approximations can be derived. Future works will go in this direction.

The authors are pleased to acknowledge Prof. R. Temam, the Laboratoire d'Analyse Numérique d'Orsay and the I.D.R.I.S. center where the computational tests have been performed.
REFERENCES
1. O. Goyon, Theoretical study of a finite difference scheme applied to steady-state Navier-Stokes-like equations, Preprint 95-86, Orsay, France (1995).
2. R. Temam, Inertial manifolds and multigrid methods, SIAM J. Math. Anal. 21(1):154-178 (1990).
3. M. Chen and R. Temam, Incremental unknowns for solving partial differential equations, Numer. Math. 59:255-271 (1991).
4. M. Chen and R. Temam, Incremental unknowns in finite differences: condition number of the matrix, SIAM J. Matrix Anal. Appl. 14(2):432-455 (1993).
5. R. Temam, Sur l'approximation des équations de Navier-Stokes, C. R. Acad. Sci. Paris, Série A 262:219-221 (1966).
6. J. Laminie, F. Pascal and R. Temam, Implementation and numerical analysis of the nonlinear Galerkin methods with finite elements discretization, Appl. Numer. Math. 14(2):219-246 (1994).
7. J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, 1970.
8. S. G. Nash, Truncated Newton Methods, PhD thesis, Stanford University, Stanford, CA, 1982.
9. R. S. Dembo and T. Steihaug, Truncated-Newton algorithms for large-scale unconstrained optimization, Math. Prog. 26:190-212 (1983).
10. P. N. Brown and Y. Saad, Hybrid Krylov methods for nonlinear systems of equations, SIAM J. Sci. Stat. Comput. 11(3):450-481 (1990).
11. H. A. Van der Vorst, Bi-CGSTAB: a fast and smoothly converging variant of Bi-CG for the solution of non-symmetric linear systems, SIAM J. Sci. Stat. Comput. 13(2):631-644 (1992).
12. R. W. Freund, A transpose-free quasi-minimal residual algorithm for non-Hermitian linear systems, SIAM J. Sci. Comput. 14(2):470-482 (1993).
13. J. E. Dennis Jr., On Newton-like methods, Numer. Math. 11:324-330 (1968).
14. J. E. Dennis and J. J. Moré, A characterization of superlinear convergence and its application to quasi-Newton methods, Math. Comp. 28:549-560 (1974).
15. J. E. Dennis and J. J. Moré, Quasi-Newton methods, motivation and theory, SIAM Review 19:46-89 (1977).
16. A. H. Sherman, On Newton-iterative methods for the solution of systems of nonlinear equations, SIAM J. Numer. Anal. 15(4):755-771 (1978).
17. M. J. D. Powell, A hybrid method for nonlinear equations, in Numerical Methods for Nonlinear Algebraic Equations (P. Rabinowitz, Ed.), Gordon and Breach, New York, 1970.
18. R. S. Dembo, S. C. Eisenstat and T. Steihaug, Inexact Newton methods, SIAM J. Numer. Anal. 19(2):400-408 (1982).
19. O. Goyon, Résolution numérique de problèmes stationnaires et évolutifs non linéaires par la méthode des Inconnues Incrémentales, Thesis, Université Paris XI, Orsay, France, 1994.
20. P. Poullet, Résolution d'équations de la mécanique des fluides par la méthode des Inconnues Incrémentales: simulation de la turbulence par la méthode des grandes échelles, Thesis, Université Paris XI, Orsay, France, 1996.
21. P. C. Jain and D. N. Holla, Numerical solutions of coupled Burgers equations, Int. J. Nonlinear Mech. 13:213-222 (1978).
22. Y. Saad and M. H. Schultz, GMRES: a generalized minimal residual algorithm for solving non-symmetric linear systems, SIAM J. Sci. Stat. Comput. 7:856-869 (1986).
23. J. E. Dennis and R. B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice-Hall, 1983.