A domain decomposition preconditioner for an advection–diffusion problem

A domain decomposition preconditioner for an advection–diffusion problem

Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170 www.elsevier.com/locate/cma A domain decomposition preconditioner for an advection±di€usion pr...

567KB Sizes 2 Downloads 91 Views

Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

www.elsevier.com/locate/cma

A domain decomposition preconditioner for an advection±di€usion problem Yves Achdou a,*, Patrick Le Tallec b, Frederic Nataf c, Marina Vidrascu d a

c

INSA Rennes, 20 Av des Buttes de Coesmes, 35043 Rennes, France b Universit e Paris Dauphine, 75 775 Paris Cedex 16, France CMAP UMR7641 CNRS, Ecole Polytechnique, 91128, Palaiseau Cedex, France d INRIA Rocquencourt, BP 105, 78 153 Le Chesnay Cedex, France Received 1 February 1999

Abstract We propose a generalization of the Neumann±Neumann algorithm for advection±di€usion problems: the Neumann conditions are replaced by suitable Robin conditions. The method is ®rst introduced and analysed in the case where the domain is partitioned into vertical strips, then generalized to an arbitrary decomposition. A coarse space procedure is proposed. Finally, numerical results for bidimensional and tridimensional tests are given. Ó 2000 Elsevier Science S.A. All rights reserved.

1. Introduction The primary objective of domain decomposition methods is the ecient solution on parallel machines of problems in Computational Mechanics set on complex geometries and approximated on very ®ne grids. Most methods refer to iterative substructuring techniques (the so-called Schur domain decomposition method) and use non-overlapping domain partitions which split the original domain into small disjoint subdomains and reduce the problem to an interface problem solved by an iterative method. The main issue conditioning the success and parallel eciency of such techniques is the choice of the preconditioner which must have nice parallel properties, must be able to handle arbitrary elliptic operators and discretization grids, and its performance must be insensitive to the discretization step h and to the number of subdomains. Many such preconditioners have appeared in the literature, following the early work of Bramble, Pasciak and Schatz [4] ([8,9,11,15,24]). For three-dimensional elasticity, ef®cient results have been obtained using either the wire-basket algorithm of Smith [21] or the Neumann±Neumann preconditioner of [23,22]. For advection±diffusion or Helmholtz problems, one can also choose arti®cial boundary conditions as interface conditions to improve the condition number [10,12,13,19]. For the analysis of overlapping domain decomposition methods for non-symmetric problems, see [5±7]. We propose herein a generalization of the Neumann±Neumann preconditioner for the Schur domain decomposition method applied to an advection±di€usion equation, by replacing the local Neumann boundary conditions by suitable Robin boundary conditions which take into account the non-symmetry of the operator. The choice of these conditions comes from a Fourier analysis, which is given in Section 2. When the operator is symmetric the proposed Robin boundary conditions reduce to Neumann boundary

*

Corresponding author.

0045-7825/00/$ - see front matter Ó 2000 Elsevier Science S.A. All rights reserved. PII: S 0 0 4 5 - 7 8 2 5 ( 9 9 ) 0 0 2 2 7 - 3

146

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

conditions. Also, as in the symmetric case, the proposed preconditioner is exact for two subdomains and a uniform velocity. The paper is organized as follows. In Section 2, the method is de®ned and analyzed for a decomposition into strips at the continuous level. In particular, it is shown that the preconditioned system is close to an idempotent operator. Then, the proposed preconditioner is constructed directly at the algebraic level. Next, a variational writing is introduced in Section 3 which makes it possible to generalize the proposed approach to general partitions and to include a coarse space. Numerical results illustrating the performances of the proposed technique conclude the paper.

2. The vertical strip case The general case of an arbitrary domain decomposition is treated in Section 3. 2.1. The continuous case ± de®nition of the algorithm We consider a rectangular domain X ˆŠ0; H ‰Š ÿ g; g‰, and the advection±di€usion boundary value problem in X L…u†  cu ‡ ~ a  ru ÿ mDu ˆ f in X; u ˆ 0 on oX:

…1†

The positive constant c may arise from a time discretization by an Euler implicit scheme of the time dependent equation. The equation is solved by a primal Schur method. We focus on the case when the domain is decomposed into non-overlapping vertical strips Xk ˆ …lk ; lk‡1 †Š ÿ g; g‰; 1 6 k 6 N . Let nk . Ck;k‡1 ˆ flk‡1 gŠ ÿ g; g‰: The unit outward normal vector to domain Xk is denoted by ~ We introduce  N ÿ1 1=2 N ÿ1  L2 …X† ! …H ÿ1=2 …Š ÿ g; g‰†† S : H00 …Š ÿ g; g‰† !   1 ovk ovk‡1 m ……uk †1 6 k 6 N ÿ1; f † 7! ‡ 2 onk onk‡1 Ck;k‡1

;

…2†

1 6 k 6 N ÿ1

where vk satis®es L…vk † ˆ f

in Xk ;

vk ˆ uk on Ck;k‡1 for 1 6 k 6 N ÿ 1; vk ˆ ukÿ1 on Ckÿ1;k for 2 6 k 6 N ; vk ˆ 0

…3†

on oX \ oXk :

It is clear that U ˆ …uCk;k‡1 †1 6 k 6 N ÿ1 satis®es S…U ; 0† ˆ ÿS…0; f †: At the continuous level, we propose an approximate inverse of S…:; 0† de®ned by 1=2 …Š ÿ g; g‰††N ÿ1 ; T : …H ÿ1=2 …Š ÿ g; g‰††N ÿ1 ! …H00   1 …vk ‡ vk‡1 †Ck;k‡1 ; …gk †1 6 k 6 N ÿ1 7! 2 1 6 k 6 N ÿ1

…4†

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

147

where vk satis®es

m

~ a ~ nk o ÿ 2 onk

~ a ~ nk o ÿ m 2 onk

L…vk † ˆ 0 in Xk ; ! …vk † ˆ gk

on Ck;k‡1 for 1 6 k 6 N ÿ 1;

!

…5† …vk † ˆ gkÿ1

on Ckÿ1;k for 2 6 k 6 N ;

vk ˆ 0 on oX \ oXk : Remark 1. In Section 3, the boundary value problem (5) is proved to be well-posed (see also [1]) under the usual condition c ÿ 1=2 div ~ a > 0. Remark 2. Our approach is different from that used in [19] or [12], where the interface conditions a ~ nk ; 0† are used in the framework of Schwarz algorithms. m…o=onk † ÿ min…~ In Section 2.2, we analyze the preconditioner in the special case where g ˆ 1, for simplicity. 2.2. Theoretical study of the algorithm ± uniform velocity Let us ®rst recall the Proposition 2.1 (see [1]). In the case where the plane R2 is decomposed into the left …X1 ˆŠ ÿ 1; 0‰R† and a is uniform, we have that right …X2 ˆŠ0; 1‰R† half-planes, and where the velocity ~ T  S…:; 0† ˆ Id: This result shows that, as it is the case for the Neumann±Neumann algorithm applied to symmetric operators, the Robin±Robin preconditioner is exact for a two domain decomposition. For a more general decomposition into an arbitrary number of vertical strips, the vector ~ a is denoted ~ a ˆ …ax ; ay †. We assume without loss of generality that the component of the velocity normal to the inof the subdomains. The main result of terface is positive ax P 0. Let H denote the minimum of the widths p the section is Theorem 3 below: it states that for expfÿ…ax ‡ a2x ‡ 4mc†H =mg small enough, the preconditioned system is close to an idempotent operator of order ‰N =2Š, where ‰xŠ denotes the integer part of x. This has an important effect on the convergence of GMRES applied to the preconditioned system, as it will be seen from Corollary 6 and Fig. 2. QN ÿ1 For Theorem 3, we introduce HsN ˆ kˆ1 H s …Ck;k‡1 †; s 2 R the space of H s functions on the N ÿ 1 interfaces. Let U 2 HsN be denoted by U ˆ …uk †1 6 k 6 N ÿ1 . The space HsN is endowed with the norm kU kHs ˆ sup1 6 k 6 N ÿ1 kuk kH s …Ck ;k‡1† . N;1

Theorem 3. Assume that the velocity field ~ a is uniform. Let X ˆ …0; HX †  R be decomposed into N nonoverlapping vertical strips Xk ˆ …lk ; lk‡1 †  R; 0 6 k 6 N ÿ 1 and Ck;k‡1 ˆ flk‡1 g  R. Let H be the size of the smallest subdomain, H ˆ mink …lk‡1 ÿ lk †. Assume that q …6†   expfÿ…ax ‡ a2x ‡ 4mc†H =mg < 1; and that q  3N =…1 ÿ †

N ‡1

< 1:

Then, for n P ‰N =2Š, we have n

k…T  S…:; 0† ÿ Id† kL…Hs † 6 N

N 1 q‰n=‰N =2ŠŠÿ1 ; 2…1 ÿ q† …1 ÿ †N ÿ2

where [x] denotes the integer part of x.

…7†

148

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

Remark 4. The above result explains why the GMRES algorithm stagnates initially during the [N/2] first iterations and then converges fast, see Corollary 6 and Figs. 2 and 3. This behavior is classical for convection± diffusion type problems, see for instance [19] or [13]. Remark 5. In fact, as it can be seen from formulas (13) below, as soon as q expf…ax ÿ a2x ‡ 4mc†H =mg  1; the operator T  S is already very close to identity and a better estimate can easily be obtained: p 6 expf…ax ÿ a2x ‡ 4mc†H =mg : …8† k…T  S…:; 0† ÿ Id†kL…Hs † 6 N …1 ÿ †3 p 2 The above theorem is thus interesting when e…ax ÿ ax ‡4mc†H =m is close to 1. This corresponds for instance to the case of a very large time step with a strong laminar flow. On the other hand, the above results cannot be applied when the operator is symmetric and very close to a Laplace operator (c  1†. Indeed, in order to have good convergence rates, it is then necessary to modify the Robin±Robin preconditioner (which amounts to the Neumann±Neumann preconditioner) by adding a coarse space and a pseudo inverse for the Neumann problem (5). Proof. The width of subdomain Xk is denoted by Hk ; Hk ˆ lk‡1 ÿ lk . The proof is based on the Fourier transform in the y direction and the Fourier variable is denoted by n. The inverse Fourier transform is denoted by Fÿ1 . The action of S…:; 0† can be expressed in terms of Fourier transform S…n†U^ …n††; S…:; 0†…U † ˆ Fÿ1 …c

U 2 HsN ;

where c S is a tridiagonal …N ÿ 1†  …N ÿ 1† symbol-valued matrix. For instance, the value of c S k‡1;k‡1‡j ; j ˆ ÿ1; 0; 1 is computed in the following way. Let um 2 H s …Cm;m‡1 †, m ˆ k ÿ 1; k; k ‡ 1. Let vk satisfy L…vk † ˆ 0 in Xk ;

…9†

with the boundary conditions vk ˆ uk on Ckÿ1;k ; vk ˆ uk‡1 on Ck;k‡1 ;

…10†

and L…vk‡1 † ˆ 0 in Xk‡1 ; vk‡1 ˆ uk‡1 on Ck;k‡1 ; vk‡1 ˆ uk‡2

on Ck‡1;k‡2 :

We shall compute the Fourier transform of 1=2m……ovk =onk † ‡ …ovk‡1 =onk‡1 ††Ck;k‡1 . The Fourier transform of (9) w.r.t. y yields vk …x; n†† ˆ 0; …c ‡ ax ox ‡ ay in ÿ moxx ‡ mn2 †…^ where i2 ˆ ÿ1. For a given n, this equation is an ordinary di€erential equation in x whose solutions have the form a…n† expfkÿ …n†…x ÿ lk †g ‡ b…n† expfk‡ …n†…x ÿ lk‡1 †g, where q a  4mc ‡ a2x ‡ 4iay nm ‡ 4n2 m2 x  : …11† k …n† ˆ 2m

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

149

We have, Re…k …n†† > 0, where Re…z† denotes the real part of a complex number z. The Dirichlet boundary < conditions (10) yield the values of a and b: ‡

u^k …n† ÿ eÿk …n†Hk u^k‡1 …n† ; a…n† ˆ ÿ ‡ 1 ÿ e…k …n†ÿk …n††Hk ÿ u^k‡1 …n† ÿ ek …n†Hk u^k …n† : b…n† ˆ ÿ ‡ 1 ÿ e…k …n†ÿk …n††Hk Hence, o^ vk o^ vk Ck;k‡1 Ck;k‡1 ˆ onk ox ÿ ÿ ‡ m…kÿ ÿ k‡ †ek Hk mk‡ ÿ mkÿ e…k ÿk †Hk ^ ˆ u^k ‡ u : k‡1 ÿ ‡ ÿ ‡ 1 ÿ e…k ÿk †Hk 1 ÿ e…k ÿk †Hk The expression for …o^ vk‡1 †=…onk‡1 Ck;k‡1 † can be obtained in the same manner. Then, we obtain !  ‡  ÿ ÿ ‡ ÿ ‡ m o^ vk o^ vk‡1 m…kÿ ÿ k‡ †ek Hk mk ÿ mkÿ e…k ÿk †Hk ÿmkÿ ‡ mk‡ e…k ‡k †Hk‡1 ‡ u^k‡1 ˆ u^k ‡ ‡ ÿ ‡ ÿ ‡ ÿ ‡ 2 onk onk‡1 2…1 ÿ e…k ÿk †Hk † 2…1 ÿ e…k ÿk †Hk † 2…1 ÿ e…k ÿk †Hk‡1 † Ck;k‡1

‡

‡ u^k‡2

m…kÿ ÿ k‡ †eÿk Hk‡1 : ÿ ‡ 2…1 ÿ e…k ÿk †Hk‡1 †

This yields the following formulas for c S ÿ

m…kÿ ÿ k‡ †ek Hk c ; S k‡1;k ˆ ÿ ‡ 2…1 ÿ e…k ÿk †Hk † ÿ

‡

ÿ

‡

mk‡ ÿ mkÿ e…k ÿk †Hk ÿmkÿ ‡ mk‡ e…k ‡k †Hk‡1 c ‡ ; S k‡1;k‡1 ˆ ÿ ‡ ÿ ‡ 2…1 ÿ e…k ÿk †Hk † 2…1 ÿ e…k ÿk †Hk‡1 †

…12†

‡

m…kÿ ÿ k‡ †eÿk Hk‡1 c : S k‡1;k‡2 ˆ ÿ ‡ 2…1 ÿ e…k ÿk †Hk‡1 † Similarly, we can compute the symbol-valued matrix c T: c ^ T…n†G…n††; T…G† ˆ Fÿ1 … c

G 2 HsN :

Hence, the preconditioned system reads Tc S…n†U^ …n††; TS…:; 0†…U † ˆ Fÿ1 … c

U 2 HsN ;

where c Tc S is a pentadiagonal matrix. After a tedious computation, we obtain that its non-zero entries are given by the following formulas: …c Tc S†k;k‡2 ˆ ÿ …c Tc S†k;k‡1 ˆ

‡

eÿk ÿ

…1 ÿ e…k

…1 ÿ e…k

…c Tc S†k;k ˆ 1 ‡ …c Tc S†k;kÿ1 ˆ

ÿk †Hk‡2 †…1 ÿ

ÿ

ÿk

‡

ÿ

ÿ e…k

ÿk‡ †Hk‡1 †

‡

ÿ

;

‡

e…k ÿk †Hk‡2 ÿ e…k ÿk †Hk ‡ eÿk Hk‡1 ; †Hk †…1 ÿ e…kÿ ÿk‡ †Hk‡1 †…1 ÿ e…kÿ ÿk‡ †Hk‡2 †

ÿ

‡

ÿ

‡

e…k ÿk †Hk ‡ e…k ÿk †Hk‡1 ; ÿ ‡ ÿ ‡ …1 ÿ e…k ÿk †Hk †…1 ÿ e…k ÿk †Hk‡1 † ÿ

e…k …1 ÿ

…c Tc S†k;kÿ2 ˆ ÿ

…Hk‡1 ‡Hk‡2 †

‡

ÿ

ÿk‡ †Hkÿ1

‡

e…k ÿk †Hkÿ1 †…1 ek ÿ

…1 ÿ e…k

‡

ÿ

ÿ

ÿ

ÿ e…k ÿ

‡

ÿk‡ †Hk‡1

e…k ÿk †Hk †…1 ÿ

ÿ e…k

ÿ

ÿ

…Hkÿ1 ‡Hkÿ2 †

ÿk †Hkÿ2 †…1

…13†

ÿk‡ †Hkÿ1 †

ÿ ‡ e…k ÿk †Hk‡1 †

:

ek

Hk

;

150

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

The above expressions are valid for any k as long as the indices make sense and except for the following entries of the matrix c Tc S, where the Dirichlet boundary conditions (5) in the Robin±Robin preconditioner yield some minor modi®cations: …c Tc S†1;1 ˆ 1 ‡

ÿ

‡

ÿ

‡

e2…k ÿk †H1 ‡ e…k ÿk †H2 ; ÿ ‡ ÿ ‡ …1 ÿ e2…k ÿk †H1 †…1 ÿ e…k ÿk †H2 † ÿ

‡

ÿ

‡

e…k ÿk †H3 ‡ e…k ÿk †H1 ‡ eÿk H2 ; ÿ ‡ ÿ ‡ ÿ ‡ …k ÿk †H …k ÿk †H …k ÿk †H 3 †…1 ÿ e 2 †…1 ‡ e 1† …1 ÿ e

…c Tc S†1;2 ˆ

…c Tc S†N ÿ1;N ÿ1 ˆ 1 ‡

ÿ

‡

ÿ

‡

e2…k ÿk †HN ‡ e…k ÿk †HNÿ1 ; ÿ ‡ ÿ ‡ …1 ÿ e2…k ÿk †HN †…1 ÿ e…k ÿk †HNÿ1 † ÿ

…e…k

ÿk‡ †HN ÿ2

ÿ

‡

ÿ

‡ e…k ÿk †HN †ek HN ÿ1 : ÿ ‡ ÿ ‡ ÿ ‡ …1 ÿ e…k ÿk †HN ÿ2 †…1 ÿ e…k ÿk †HNÿ1 †…1 ‡ e…k ÿk †HN †

…c Tc S†N ÿ1;N ÿ2 ˆ

Tc S is very close to the identity. But, for For large values of n, it is clear from the expressions of k …n† that c n close to zero and c  1 (quasi steady state problems), the extra diagonal entries … c Tc S†k;kÿ2 are close to ÿ1 while the other extra-diagonal entries remain small. This is the reason why a theoretical study of the convergence has to be done. For this, matrix c Tc S is written as the sum of three matrices: the Identity, a c b nilpotent matrix of order ‰N =2Š … A† and a small perturbation … B†: c b Tc S ˆ Id ‡ c A ‡ B;

…14†

where c A ˆ …c A k;l †1 6 k;l 6 N ÿ1 ˆ



…c Tc S†k;l 0

for l ˆ k ÿ 2; 3 6 k 6 N ÿ 1; otherwise:

b is given by (14). From the structure of c The value of B A, it is clear that c A ‰N =2Š ˆ 0: In order to estimate the norms of the matrices, we ®rst have to de®ne the norms that will be used. Let x ˆ …x1 ; . . . ; xN ÿ1 † be a vector in RN ÿ1 and kxk1 denote the maximum norm, kxk1 ˆ sup1 6 k 6 N ÿ1 jxk j. For a matrix D ˆ …Dij †1 6 i;j 6 N ÿ1 2 R…N ÿ1†…N ÿ1† , the corresponding operator norm is kDk1 ˆ supi Rj j Dij j: kDx k1 6 kDk1 kxk1 . By applying the result of [17] (see also [18]), we have the following estimate. Proposition 2.2. If b q…n†  k B…n†k 1

‰NX =2Šÿ1 iˆ0

kc A…n†ki1 < 1:

…15†

Then, define C…n† by C…n† 

‰NX =2Šÿ1 1 kc A…n†ki1 1 ÿ q…n† iˆ0

…16†

we have for n P ‰N =2Š, b n …n†k ˆ k… c Tc S ÿ Id†n …n†k1 6 C…n†q…n†‰n=‰N =2ŠŠÿ1 : k… c A ‡ B† 1

…17†

Going back to the proof of Theorem 3, we shall estimate q…n† independently of n. Let  be as in Theorem 3, ‡

jeÿk

…n†Hk

j6

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

151

b for any n. Therefore, the norm of c A…n† is bounded by 1=…1 ÿ †2 and that of B…n† by 5=…1 ÿ †3 . Then, by i N ÿ2 for i 6 ‰N =2Š we get estimating k c A…n†k1 by 1=…1 ÿ † ‰NX =2Šÿ1 iˆ0

kc Ak1 6 i

N 1 : 2 …1 ÿ †N ÿ2 N ‡1

Hence, q…n† may be roughly estimated by 3N =…1 ÿ † . By using this estimate and Parseval±Plancherel theorem, we can conclude. We now consider the implications of Theorem 3 for the GMRES algorithm (see [20]) applied to the problem …T  S…:; 0††…U † ˆ ÿT  S…0; f †:

…18†

Since T  S…:; 0† is a bounded operator from HsN into itself, it is possible to de®ne a ``continuous'' GMRES algorithm. For this, we use a norm induced by a scalar product. The space HsN is endowed with the norm v u N ÿ1 uX kuk k2H s …Ck;k‡1 † : kU kHs ˆ t N ;2

We have kU kHs

N ;1

kˆ1

p 6 kU kHs 6 N ÿ 1 kU kHs : N;1

N ;2

0 ˆ 0. Hence, the For the sake of simplicity, the GMRES algorithm is considered with an initial guess, Ugmres initial residual is

r0 ˆ T  S…0; f †: The Krylov space Kn …r0 † is given by Kn ˆ Spanfr0 ; …T  S…:; 0††…r0 †; . . . ; …T  S…:; 0††nÿ1 …r0 †g: n is de®ned by The GMRES iterate Ugmres n † ‡ r0 kHs ˆ minn kT  S…:; 0††…U † ‡ r0 k: k…T  S…:; 0††…Ugmres U 2K

N ;2

…19†

Since the k:kHs norm is induced by a scalar product, the above minimization problem amounts to the N ;2 solving of a n  n square linear system. We have the n be defined as in (19) and the residual at Corollary 6. Let r0 ˆ T  S…0; f † be the initial residual. Let Ugmres n n 0 step n be defined by r ˆ …T  S…:; 0††…Ugmres † ‡ r . Then, under the assumptions of Theorem 3, we have the following estimate for n P ‰N =2Š

krn kHs 6 Cqn=‰N =2Šÿ1 kr0 kHs ; N ;2

N ;2

where p C ˆ N ÿ1 1‡

5 3

…1 ÿ †

‡

!

1 2

…1 ÿ †

N 2

2…1 ÿ q† …1 ÿ †N ÿ2

and q and  are defined in Theorem 3. Remark 7. Since this result is stated at the continuous level, it is a clear indication that the same kind of estimate should remain valid as the mesh size tends to zero.

152

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

Proof. The proof is based on a comparison between a ®xed point algorithm and the GMRES method de®ned above. Indeed, we introduce the following algorithm 0 Ufixed

pt

n‡1 Ufixed

pt

ˆ 0;

  n ˆ …Id ÿ T  S…:; 0†† Ufixed pt ÿ T  S…0; f †:

n It can easily be checked that Ufixed

pt

2 Kn . Then, by (19), we have

n n 0 † ‡ r0 kHs 6 k…T  S…:; 0††…Ufixed k…T  S…:; 0††…Ugmres pt † ‡ r kHs : N ;2

N;2

We now have to estimate the right-hand side of the above inequality. We have n Ufixed

pt

ˆÿ

nÿ1 X i …Id ÿ T  S…:; 0†† T  S…0; f †: iˆ0

Under the assumptions of Theorem 3, this series converges to the solution U of (18), U ˆÿ

1 X i …Id ÿ T  S…:; 0†† T  S…0; f †: iˆ0

Hence, n 0 n …T  S…:; 0††…Ufixed pt † ‡ r ˆ …T  S…:; 0††…Ufixed

pt

ÿ U†

i ˆ ÿT  S…:; 0†R1 iˆn …Id ÿ T  S…:; 0†† T  S…0; f † i

0 ˆ ÿT  S…:; 0†R1 iˆn …Id ÿ T  S…:; 0†† r :

This gives i

n 0 1 0 k…T  S…:; 0††…Ufixed pt † ‡ r kHs 6 kT  S…:; 0†Riˆn …Id ÿ T  S…:; 0†† r kHs N ;2

N ;2

i

S…:; 0†R1 iˆn …Id

6 kT  ÿ T  S…:; 0†† kHs kr0 kHs N ;2 N;2 p i 1 6 N ÿ 1kT  S…:; 0†Riˆn …Id ÿ T  S…:; 0†† kHs kr0 kHs N ;1 N ;2 p i 1 6 N ÿ 1kT  S…:; 0†kHs Riˆn k…Id ÿ T  S…:; 0†† kHs kr0 kHs : N;1

3

N ;1

2

N ;2

i

The norm of T  S…:; 0† is estimated by 1 ‡ 5=…1 ÿ † ‡ 1=…1 ÿ † and that of …Id ÿ T  S…:; 0†† (for i P ‰N =2Š) by using Theorem 3. Remark 8. From formulas (12), it appears that under the assumptions of Theorem 3, the symbol c S…n† of the unpreconditioned system S…:; 0† has basically the following structure   m b 0 …n† ; c A 0 …n† ‡ B S…n† ˆ …kÿ ÿ k‡ †…n† Id ‡ c 2 b 0 is small. The factor m=2…k‡ ÿ kÿ †…n† which corresponds where c A 0 is a nilpotent matrix of order N ÿ 1 and B in the physical space to an unbounded operator may lead to an ill-conditioned linear system and then to a poor convergence of the GMRES algorithm, see Section 4. 2.3. The discrete case We suppose for simplicity that the computational domain is R2 discretized by a Cartesian grid. Let us problem. We denote A ˆ …Akl  a discretization of the advection±di€usion  ij †i;j;k;l2Z the matrix resulting from suppose that the stencil is a 9-point stencil Akl ˆ 0 for ji ÿ kj P 2 or jj ÿ lj P 2 . This is the case, for ij instance, for a Q1-SUPG method or for a classical ®nite di€erence or ®nite volume scheme. We have to solve AU ˆ F , where U ˆ …uij †i;j2Z is the vector of the unknowns. The computational domain is decom-

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

153

posed into two half planes x1 and x2 . We introduce a discretized form Sh of the operator S (we adopt the summation convention of Einstein over all repeated indices) Sh : RZ  RZZ ! RZ 1 1l 1 1l 2 2l 2 ……u0j †j2Z ; …Fij †i;j2Z † 7! …Aÿ1l 0j vÿ1l ‡ Bj v0l ‡ A0j v1l ‡ Bj v0l ÿ F0j †j2Z ;

where …vmij † satisfy m Akl ij vkl ˆ Fij

vm0j

ˆ u0j

for i < 0 if m ˆ 1

and

i > 0 if m ˆ 2; j 2 Z;

for j 2 Z:

0l 1l 2l are the contributions of the domain xm to A0l The coecients Bml j 0j ; A0j ˆ Bj ‡ Bj . For example, if 1l 2l 0l ~ a ˆ 0; Bj ˆ Bj ˆ A0j =2. For example, for a 1D case with a uniform grid and an upwind ®nite di€erence scheme …a > 0† and c ˆ 0,

Aÿ1 0 ˆ ÿ

m a ÿ ; h2 h

A00 ˆ

2m a ‡ ; h2 h

A10 ˆ ÿ

m h2

and B1 ˆ

m a ‡ h2 h

and

B2 ˆ

m : h2

The vector Sh …u0 ; F † is the residual of the equation on the interface. It is clear that U0 ˆ …u0j †j2Z satis®es Sh …U0 ; 0† ˆ ÿSh …0; F †:

…20†

We propose for an approximate inverse of Sh …:; 0†; Th de®ned by Th : RZ ! RZ 1 …gj †j2Z 7! …v10j ‡ v20j †j2Z ; 2 where …v1ij †i 6 0;j2Z and …v2ij †i P 0;j2Z satisfy 1 Akl ij vkl ˆ 0;

i < 0; j 2 Z;

ÿ1l 1 A0j vÿ1l ‡

2 Akl ij vkl ˆ 0;

i > 0; j 2 Z;

2 A1l 0j v1l ‡

A0l 0j 1 v ˆ gj 2 0l

…21†

and A0l 0j 2 v ˆ gj : 2 0l

…22†

Remark 9. For a Neumann±Neumann preconditioner, in place of (21) and (22), we would have 1 1l 1 1l 2 2l 2 Aÿ1l 0j vÿ1l ‡ Bj v0l ˆ gj and A0j v1l ‡ Bj v0l ˆ gj . Remark 10. For a constant coefficient operator L and a uniform grid, a discrete Fourier analysis can be performed similarly to that of Section 2. It can then be proved that Th  Sh …:; 0† ˆ Idh : Remark 11. The last equations of (21) and (22) correspond to a discretization of the Robin boundary condition m…o=on† ÿ ~ a ~ n=2. Considering the previous 1D example, we have   A00 1 v1 ÿ v1ÿ1 a 1 v1 ÿ v1ÿ1 a 1 1 v ÿ vÿ1 ˆ m 0 ÿ …2vÿ1 ÿ v10 † v ‡ ˆ …m ‡ ah=2† 0 h Aÿ1 0 ÿ1 0 2 h h 2 2

154

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

and   A00 2 v2 ÿ v21 a 2 2 v v ‡ v0 : h A1l ‡ ˆm 0 j 1 0 2 2 h

…23†

Another discretization of m…o=on† ÿ ~ a ~ n=2 would not give (23). This is the reason why the approximate inverse is directly defined at the algebraic level. The discretization of the Robin boundary condition is in some sense adaptive with respect to the discretization of the operator (SUPG, upwind finite difference scheme or finite volume scheme). In the previous 1D example, a straight forward discretization of the Robin boundary condition would give for the first domain m

v10 ÿ v1ÿ1 a 1 ÿ v0 : h 2

When ah  m, which is usually the case, it is quite different from (21). In a ®nite element framework, the variational generalization of the next section gives automatically the good discretization of the Robin±Robin boundary condition.

3. Variational generalization 3.1. The continuous problem Let us now consider general partitions of our original domain X into non-overlapping subdomains Xi ; i ˆ 1; . . . ; N (Fig. 1), with interfaces Ci ˆ oXi n oX;

C ˆ [i Ci :

Fig. 1. Three-dimensional triangulation and automatic decomposition into 45 subdomains.

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

155

On X, we wish to solve the general advection±di€usion problem ÿ div…m…x†ru† ‡ ~ a…x†  ru ‡ c…x†u ˆ f

on X;

…24†

u ˆ 0 on oXD ; ou m…x† ˆ g on oXN : on

…25† …26†

We will restrict ourselves to well-posed problems, where the velocity ®eld ~ a 2 W 1;1 …X† is of bounded divergence 1 a† P a > 0 c ÿ div…~ 2 for some a 2 R and where the Neumann boundary conditions are only applied on a subset oXN of the domain boundary oX where we have out¯ow conditions (with ~ n denoting as usual the outwards pointing unit normal vector to oX) ~ a ~ nP0

8x 2 oXN :

The variational framework is better suited for extending the previous sub-structuring technique to such a general formulation and partition. The key point is then to identify the correct bilinear form describing the action of the above non-symmetric advection±di€usion operator on any given subdomain Xi of X. A naive choice would be to use Z mru  rv ‡ …~ a  ru†v ‡ cuv: a^i …u; v† ˆ Xi

But in fact, this choice is rather inappropriate, because of its lack of positiveness. A better choice is to integrate by parts the advection term 1=2…~ a…x†  ru†v and to introduce the local symmetrized form  Z Z  1 1 1 1 ~ a  ru†v ÿ …~ a  rv†u ‡ …c ÿ div…~ a††uv ‡ a ~ nuv mru  rv ‡ …~ ai …u; v† ˆ 2 2 2 Xi oX\oXi 2 Z 1 ~ a ~ ni uv: …27† ˆ a^i …u; v† ÿ Ci 2 With this choice, a direct multiplication and integration of (24) yields the variational problem N X

Find u 2 H…X† :

fai …u; v† ÿ Li …v†g ˆ 0 8v 2 H…X†;

…28†

iˆ1

with the notation  H…X† ˆ v 2 H 1 …X†; v ˆ 0 on oXD ; Z Z fv ‡ gv: Li …v† ˆ Xi

oXN \oXi

R a ~ ni uv added locally to each local form a^i cancel each other In this formulation, the interface terms ÿ Ci 1=2~ by summation. Their presence nevertheless guarantee that the local bilinear form ai is positive on the space of restrictions H…Xi † ˆ fvi ˆ vjXi ;

v 2 H…X†g

since we have by construction   Z Z 1 1 ~ a† u2 ‡ a ~ nu2 mjruj2 ‡ c ÿ div…~ ai …u; u† ˆ 2 Xi oXN \oXi 2

8u 2 H…Xi †:

Moreover, one can easily observe that the natural interface boundary condition which appears when we integrate this bilinear form in (27) by parts is precisely the Robin condition

156

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

m…x†

ou 1 ÿ ~ a ~ ni u ˆ . . . on Ci : on 2

used in Section 2. 3.2. Finite element discretization In view of its numerical solution, the above variational formulation (28) must ®rst be approximated by ®nite element methods, that is by replacing the space H…X† by a ®nite element space Hh …X†. For example, this space is constructed with second order isoparametric ®nite elements de®ned on regular ``triangulations'' of X (see Fig. 1). They are a good compromise between accuracy and cost-eciency. Other choices of ®nite element are of course possible, and some of the following numerical tests will use simpler triangular ®rst order elements. In any case, the triangulations to be introduced will respect the subdomain partitions used in domain decomposition: the interfaces Ci coincide with inter-elements boundaries, which mean that each subdomain is the union of a given subset of elements of the original triangulation. When used on advection dominated problems, these ®nite element techniques must be stabilized. In Galerkin least-squares techniques, this is done by adding element residuals Z    se …hl † ÿ div…mru† ‡ ~ a  ru ‡ cu ÿ f ÿ div…mrv† ‡ ~ a  rv ‡ cv Tl

to the original variational formulation with an adequate choice of the local positive stabilization parameters se …hl †. The stabilized ®nite element formulation is then N X

Find uh 2 Hh …X† :

faih …uh ; vh † ÿ Lih …vh †g ˆ 0

8vh 2 Hh …X†

…29†

iˆ1

with the notation aih …u; v† ˆ ai …u; v† ‡

XZ Tl Xi

ˆ Li …v† ‡

XZ Tl Xi

Tl

Tl

  se …hl †…ÿdiv…mru† ‡ ~ a  ru ‡ cu† ÿ div…mrv† ‡ ~ a  rv ‡ cv ; Lih …v†

se …hl †f …ÿdiv…mrv† ‡ ~ a  rv ‡ cv†:

We observe that the variational structures of the original problem and of its ®nite element discretization are very similar. This will also be true for the numerical domain decomposition technique to be introduced next. Therefore, from now on, we will use the same notation for both the continuous and the stabilized discrete problem and omit the subscript h in all ®nite element formulations. The reader will just have to remember that when dealing with ®nite elements, the bilinear and linear forms ai and Li , and the space H…X† should be replaced by their discrete counterparts aih ; Lih and Hh …X† as de®ned in the present section. 3.3. Substructuring Due to their variational structure, problems (28) or (29) can be reduced to interface problems by standard substructuring techniques. For this purpose, we introduce the local spaces of restrictions H…Xi † ˆ fvi ˆ v jXi ; v 2 H…X†g; 

H…Xi † ˆ fvi 2 H…X†; vi ˆ 0 in X n Xi g; ˆ space of functions of H…Xi † with zero continuous extension in X n Xi ; the global trace space V ˆ TrH…X† jC , the local trace spaces vi ˆ Trvi jCi ; Vi ˆ f

vi 2 H…Xi †g ˆ f vi ˆ Trv jCi ;

v 2 H…X†g;

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

157

the ai -harmonic extension Trÿ1 i : Vi ! H…Xi † de®ned by i ; vi † ˆ 0 ai …Trÿ1 i u



8vi 2 H…Xi †;

i † jCi ˆ ui ; Tr…Trÿ1 i u

i 2 H…Xi † Trÿ1 i u

…30†

given by and its adjoint Trÿ i i † ˆ 0 ai …vi ; Trÿ i u



i † jCi ˆ ui ; Trÿ i 2 H…Xi †: 8vi 2 H…Xi †; Tr…Trÿ i u i u

…31†



By construction, the bilinear form ai is elliptic on H…Xi †. Hence, the problems (30) or (31) are well-posed. We then can de®ne the local Schur complement operator Si : Vi ! V0i by i † 8 i ; Trÿ ui ; vi 2 Vi : hSi ui ; vi i ˆ ai …Trÿ1 i u i v In matrix form, by decomposing the local degrees of freedom Ui of ui ˆ Ri u (with Ri …resp: Ri † denoting the restriction operator from H…X† into H…Xi † (resp. from V into Vi )) into internal degrees of freedom  Ui and interface degrees of freedom Ui , and the matrix Ai associated to the bilinear form ai into " # i Bi A Ai ˆ ~T  ; …32† B Ai i

we simply have Trÿ1 i

ÿ1

 ÿA i I

ˆ

! Bi

;

ÿ1 Bi †Ui : Si Ui ˆ …Ai ÿ B~Ti A i With this notation, we can decompose each restriction of the solution u and test function v into    ÿ    Ri u ˆ ui ‡ Trÿ1 i …Ri u† and Ri v ˆ vi ‡ Tri …Ri v†, then eliminate the local component ui because it is solution of the local well posed problem 

ai …ui ; vi † ˆ Li …vi †







8vi 2 H…Xi † ui 2 H…Xi †:

…33†

Therefore, we can reduce the original problems (28) or (29) to the interface problem S u ˆ F

in V:

Here, we have introduced the global Schur complement operator Sˆ

N X iˆ1

RTi Si Ri ;

and the interface right-hand side F de®ned by X  † Li …Trÿ hF ; vi ˆ i Ri v i

ˆ

X i

ˆ

Xn i

ˆ

 †g fLi …vi † ÿ Li …vi ÿ Trÿ i Ri v

Xn i

o     i v Li …vi † ÿ ai ui ; vi ÿ Trÿ construction ofu R i i   o ÿ  Li …vi † ÿ ai ui ; vi construction of Trÿ ; i

with vi denoting any function of H…Xi † such that vi ˆ v on Ci .

…34†

158

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

ÿ i used in the definition of Si can be replaced by any Remark 3.1. By construction of Trÿ1 i , the function Tri v  function vi whose trace on Ci is equal to vi , yielding X  ; vi † 8 ai …Trÿ1 u; v 2 V: hS ui ; vi ˆ i Ri u i

3.4. Basic preconditioner The speci®c preconditioner detailed in this paper is a natural generalization of the so-called Neumann± Neumann algorithm proposed by Agoskhov P [2], Bourgat et al. [3], Morice [16], among others. The idea is to precondition the interface operator S ˆ i RTi Si Ri by a weighted sum of inverses [22]: X ÿ1 Di …Si † DTi : …35† M ÿ1 ˆ T ˆ i

Above, Di are diagonal weighting matrices satisfying X Di Ri ˆ Id: i

For implementation reasons (¯exibility and parallelism), the map Di must be as local as possible. The generic choice consists in de®ning Di on each interface degree of freedom v…Pl † by: q Di v…Pl † ˆ i v…Pk † q if the l degree of freedom of V corresponds to the k degree of freedom of Vi , Di v…Pl † ˆ 0; if not. Here P qi is a local measure of the sti€ness of subdomain Xi (for example the local diagonal value of Ai ) and q ˆ Pl 2Xj qj is the sum of qj on all subdomains Xj containing Pl . Remark 3.2. For any Fi 2 V0i ; …Si †ÿ1 Fi is equal to the trace on Ci of the solution wi of the local variational problem ai …wi ; vi † ˆ hFi ; Tr…vi †jCi i

8vi 2 H…Xi †; wi 2 H…Xi †:

…36†

By construction of our bilinear form ai , this local problem is associated to the operator ÿdiv…mrw† ‡ ~ a  rw ‡ cw and to the Robin boundary condition on the interface m

ow 1 ÿ ~ a ~ ni w ˆ F i on 2

on Ci :

By construction we also have (omitting the positive stabilization term XZ se …hl †…ÿdiv…mrw† ‡ ~ a  rw ‡ cw†2 Tl Xi

Tl

at the finite element level),   Z Z 1 1 2 ~ a† w2 ‡ a ~ nw2 ; mjrwj ‡ c ÿ div…~ ai …w; w† ˆ 2 Xi oXN \oXi 2 which means that the bilinear form ai is strictly elliptic on H…Xi †. Therefore, the variational problem (36) is always well posed.

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

159

3.5. Coarse solver When used on elliptic symmetric operators, it is observed that the basic preconditioner introduced in Section 3.4 behaves poorly when the number of subdomains is large. This is due to the existence of constant like functions which scale badly in energy norm when expanded from a local subdomain to the full domain. The traditional remedy is to build a small coarse global space which includes all these constant like functions and to solve the interface problem by a direct solver on this coarse space, and by a Neumann± Neumann preconditioned iterative solver on its orthogonal. This strategy can also be used for advection problems, although the optimal coarse space is not quite known yet in such problems. For this purpose, we introduce a given coarse space VG . Its construction will be done later. Its S orthogonal space Vf is then de®ned by Vf ˆ fvf 2 V; hSvf ; v^G i ˆ 0

8^ vG 2 V G g

…37†

and the action of the projector I ÿ PG form V into Vf parallel to VG on any given function v 2 V amounts to the solution of the global coarse variational problem v; v^G i hSvG ; v^G i ˆ hS v ˆ v ÿ vG : … I ÿ PG †

8^ vG 2 V G ; vG 2 V G ;

…38†

Again, the problem (38) is well-posed, since by construction the operator S is elliptic on VG . Indeed,  computing S as proposed in Remark 3.1 with vi ˆ Trÿ1 i Ri vG , and omitting for simplicity the positive stabilization terms, we have hSvG ; vG i ˆ

XZ i

Xi

 2 mjrTrÿ1 i Ri v G j

 Z 1 1 2 i vG †2 ‡  ~ a† …Trÿ1 R a ~ n …Trÿ1 ‡ c ÿ div…~ i i Ri v G † : 2 oXN \oXi 2 

Because of the presence of a coarse space, the solution u of the interface problem can now be uniquely decomposed in u ˆ uG ‡ uf ;

uG 2 V G ;

uf 2 V f ;

the coarse component uG being solution of the global coarse problem v G 2 V G ; uG 2 V G ; hSuG ; v^G i ˆ hF ; v^G i 8^

…39†

to be solved by a direct solver, and the orthogonal component uf being the solution of the remaining problem hSuf ; v^f i ˆ hF ; v^f i ÿ hSuG ; v^f i

8^ v f 2 V f ; uf 2 V f ;

…40†

to be obtained by a GMRES type iterative algorithm with updated preconditioner T

ÿ1

T

M ÿ1 ˆ …I ÿ PG †  T  …I ÿ PG † ˆ …I ÿ PG †  …Ri Di …Si † DTi †  …I ÿ PG † :

…41†

We can observe that the basic preconditioner is now only used to obtain the component of u orthogonal to the coarse space, and therefore avoids as much as possible the bad elements of the coarse space. Remark 3.3. The action of …I ÿ PG †T is often very easy to compute. Indeed, for any vf in Vf , we have T

vi ˆ hSvf ; v^ ÿ v^G i ˆ hSvf ; v^i 8^ v 2 V: h…I ÿ PG † Svf ; v^i ˆ hSvf ; …I ÿ PG †^ Similarly, by construction of uG , we also have …I ÿ PG †T …F ÿ SuG † ˆ F ÿ SuG :

160

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

3.6. Detailed algorithm 3.6.1. Preparation step The algorithm involves at each step the solution of Dirichlet problems (30) and Robin problems (36). For each subdomain, these local problems will be solved by a direct method. As the corresponding matrices do not change from one iteration to another, it is possible to compute and factorize them once for ever. Then the solution of the local linear systems will be a simple forward±backward substitution. Similarly, the matrix of the projector (38) can be computed and factorized once for all. The preparation step therefore executes the following tasks: i ) of the Robin problems (36) 1. For each subdomain compute and factorize the matrices Ai (respectively A (respectively the Dirichlet problems (30)). 2. Compute the interface weighting coecients Di . 3. For the coarse problem, for each subdomain (the loop on subdomains has to be replaced by a loop on cross-points when one uses the BPS piecewise linear coarse space): · Compute the local functions zia to be used to construct the basis …Djzja †j; a of the coarse space VG . zja by solving the local · Compute the vectors S T Djzja . By construction, this means computing Trÿ i Dj  Dirichlet problems 

zja ; vi † ˆ 0 8vi 2 H…Xi †; aTi …Trÿ i Dj  ÿ zja 2 H…Xi †; Tr…Tri Djzja †jCi ˆ Djzja ; Trÿ i Dj  i †T (Using there the available factorization of A i † and multiple right-hand sides Djzj , with matrix …A a and setting X ati …Trÿ zja ; vi † hS T Djzja ; vi ˆ i Dj  i

where vi is any consistent extension of v. If NEIGH is the number of neighboring subdomains of domain i, if Na is the number of local coarse functions, the number of right-hand sides to solve for each subdomain i is equal to Na  (NEIGH + 1). · Compute and factor the coarse matrix Cjb;ia ˆ hDizia ; S T Djzjb i…Na  (NEIGH + 1) dot products).  i † 4. Compute the local components ui of the solution by solving the Dirichlet problem(with matrix A 



ai …ui ; vi † ˆ Li …vi †





8vi 2 H…Xi † ˆ ui 2 H…Xi †:

5. Compute the coarse component uG of the solution by solving the coarse problem (with matrix C) X  fLi …vi † ÿ ai …ui ; vi †g 8j 8b; huG ; S T Djzjb i ˆ i

where vi is any consistent extension of Djzjb , that is any function of Vi such that vi ˆ Djzjb on Ci . Most of these computations are obviously independent on each subdomain and thus can be run in parallel. 3.6.2. GMRES initialization step 0 0 (i) Compute the local extension Trÿ1 i …uG ‡ uf †…uf 2 Vf is the initial value of uf and is usually taken to be 0 i , zero right-hand side, and interface data uf ˆ 0† by solving the Dirichlet problem (30) with matrix A 0 uG ‡ uf . This step is not necessary in the symmetric case because when Ai is symmetric, we have already zja ˆ Trÿ zja . computed Trÿ1 i Dj  i Dj  (ii) Compute the initial residual X  0 fLi …vi † ÿ ai …ui ‡ Trÿ1 hRes0 ; vi ˆ i …uG ‡ uf †; vi †g: i

P (iii) Precondition this residual by ®rst computing w0 ˆ i Di …Si †ÿ1 DTi Res0 by solving the local Robin problems (with matrix Ai )

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

161

ai …wi ; vi † ˆ hRes0 ; Di Tr…vi †jCi i 8vi 2 H…Xi †; wi 2 H…Xi †; P and by setting w0 ˆ i Di Tr…wi †. (iv) Compute ®nally the preconditional residual …I ÿ PG †w0 ˆ w0 ÿ w0G by solving the coarse problem (with matrix C) hw0G ; S T Djzjb i ˆ hw0 ; S T Djzjb i 8j

8b:

Most of these computations are obviously independent on each subdomain and thus can be run in parallel. 3.6.3. Solution step The practical implementation of a preconditioned GMRES type iterative loop for computing uf requires then the calculation of products X

…I ÿ PG † 

! Di …Si †

ÿ1

i

DTi

T

 …I ÿ PG † Svf

for a series of function vf 2 Vf . Computing such products amounts to the following steps:  (i) Compute the solution Trÿ1 v of the Dirichlet problem (30) with matrix A and interface data vf . i f i (ii) Compute dF ˆ …I ÿ PG †T Svf ˆ Svf by hdF ; vi ˆ

X i

^i †; ai …Trÿ1 i vf ; v

extension of v inside Xi . with v^i denoting any Pconsistent ÿ1 T  (iii) Compute w ˆ i Di …Si † Di dF by solving the local Robin problems (with matrix Ai ) ai …wi ; vi † ˆ hdF ; Di Tr…vi †jCi i

8vi 2 H…Xi †; wi 2 H…Xi †;

P and by setting w ˆ i Di Tr…wi †. (iv) Compute ®nally …I ÿ PG †w ˆ w ÿ wG by solving the coarse problem (with matrix C)  S T Djzj i 8j hwG ; S T Djzjb i ˆ hw; b

8b:

Again, the most memory and time consuming steps in this algorithm are independent on each subdomain and thus can be run in parallel. 3.7. Construction of the coarse space There are two main strategies for constructing coarse spaces. A multigrid based strategy will take as coarse space the trace of low order ®nite element functions de®ned on a much coarser triangulation, namely the triangulation constructed by assimilating every subdomain to a coarse ®nite element. This BPS technique (named after the celebrated paper where it was ®rst analysed) is rather ecient for 2D problems, and functions of the corresponding coarse space are linear on each edge. A subdomain based strategy will try to identify the possible local pathologies of the Robin preconditioner. We want in particular to deal with positive de®nite preconditioned operators, that is we would like to have hSvf ; M ÿ1 Svf i P Ckvf k2

8vf 2 Vf ;

162

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

for a proper choice of norm k  k2 . But a direct calculation yields T

hSvf ; M ÿ1 Svf i ˆ hSvf ; …I ÿ PG †  T  …I ÿ PG † Svf i T

T

ˆ h…I ÿ PG † Svf ; T  …I ÿ PG † Svf i ˆ hSvf ; TSvf i …construction of Vf † X ÿ1 hSvf ; Di …Si † DTi Svf i …construction of T † ˆ i

ˆ

X i

ˆ

hSvf ; Di TrwijCi i …construction of Siÿ1 †

X

ai …wi ; wi † …construction of wi †;

i

where wi denotes the solution of the local Robin problem (36) solved in the preconditioning step. By construction of our bilinear form, we also have (omitting the stabilization terms)   Z Z 1 1 2 ~ a† w2i ‡ a ~ nw2i : mjrwi j ‡ c ÿ div…~ ai …wi ; wi † ˆ 2 Xi oXN \oXi 2 RThe situation is therefore bad on internal subdomains (where oX \ oXi ˆ ;), when wi is a constant, and …c ÿ …1=2†div…~ a†† is small. To avoid such bad situations, we must forbid nonzero constant solutions in Xi the local Robin problems, which we achieve if we impose that all solutions of the local Robin problems be of zero average. To do this, we take as coarse space the space ) ( X i 0 i c Di Tr…vi †jCi ; …c †i arbitrary …42† V G ˆ vG ; vG ˆ i

generated by the weighted traces Di Tr…v0i †jCi of the solutions v0i of the local adjoint Dirichlet problems Z v^i 8^ ati …v0i ; v^i † ˆ vi 2 H…Xi †; v0i 2 H…Xi †: Xi

Then, by construction of the orthogonal space Vf , we have hSvf ; v^G i ˆ 0

8^ vG 2 VG

8vf 2 Vf :

By taking v^G ˆ Di Tr…v0i †jCi , this implies in particular that the local solution wi of the local Robin problems satisfy Z wi ˆ ati …v0i ; wi †…construction of v0i † ˆ ai …wi ; v0i † ˆ hSvf ; Di Tr…v0i †jCi i†…construction of wi † Xi

ˆ 0…definition of Vf †; which is the desired e€ect. Remark 3.4. In the symmetric case, (~ a ˆ 0, c positive arbitrary), the local coarse functions v0i are simply the 0 constant vi ˆ 1=c.

4. Numerical results in two-dimensions The advection±di€usion is discretized on a Cartesian grid by a Q1-streamline-di€usion method ([14]). The interface problem (20) is solved by a preconditioned GMRES algorithm and the stopping criterion is to reduce the initial residual by a factor 10ÿ10 . The preconditioners are either of the type Robin±Robin (R±R), Neumann±Neumann (N±N) or the identity (±).

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

163

4.1. A ®rst comparison between the preconditioners The ®rst series of tests will be to compare the performances of the Robin±Robin (R±R), Neumann± Neumann (N±N) preconditioners, and the non-preconditioned method, in some typical situations. No coarse space solver is used here. In these tests, the domain is the rectangle ‰0; 1Š  ‰0; 0:2Š partitioned into ®ve square vertical strips of sizes 0:2  0:2. In each subdomain, there is a uniform grid of 60  60 quadrangular elements. We choose c ˆ 1 and m ˆ 0:001 or m ˆ 1, and four velocities: (i) ~ a ˆ~ e1 . In this case, the velocity is perpendicular to the interfaces between the subdomains. (ii) ~ a ˆ~ ep 2 .In this case, the velocity is parallel to the interfaces. e2 †. We refer to this convecting ®eld as oblique. (iii) ~ a ˆ 2=2…~ e1 ‡~ e2 ÿ …x2 ÿ 0:1†~ e1 †. Here, the convecting ®eld can be seen as a vortex. (iv) ~ a ˆ 2p……x1 ÿ 0:5†~ Based on these results, several remarks can be made: · The Robin±Robin preconditioner performs much better than the Neumann±Neumann preconditioner when the viscosity is small while the performances are equivalent for large viscosity. · For small viscosities, when the velocity is not parallel to the interface, the Neumann±Neumann preconditioner gives very poor results (poorer than when no preconditioner is used). On the contrary, when the velocity is parallel to the interfaces, both the Neumann±Neumann and the Robin±Robin preconditioners work very well (2 iterations), (note that they are equivalent in this case). These results are in complete agreement with the Fourier analysis above. · When the velocity is normal or oblique, the Robin±Robin preconditioner has nilpotency properties which will be better illustrated in the next series of tests: this explains why the number of iterations is only 3 and 5 for these cases. · Thus, the Robin±Robin adapts smoothly to the di€erent situations presented in Table 1. · In our implementation, one iteration of the Robin±Robin or Neumann±Neumann method costs twice as much as when no preconditioner is used. Even though, the Robin±Robin preconditioner remains always cheaper. 4.2. In¯uence of the number of subdomains: case of strips The purpose of this series of tests is to assess the nilpotency properties of the algorithm. To illustrate the theory above, the domain is partitioned into vertical strips, the velocity is uniform and normal to the interfaces. The dependence of the number of iterations with respect to the number of strips will be tested. Here, the domain is X ˆ …0; Nsd =4†  …0; 1†, where Nsd . It is partitioned into Nsd square subdomains, which are rectangles of sizes 0:25  1. In each subdomain, there is a uniform grid of size 20  40. !; c ˆ 1, and the viscosity is m ˆ 0:001. The velocity is ~ a ˆ 3e 1 Here again, the three methods are tested (see Table 2). In Fig. 2, the convergence of the GMRES algorithms with and without the Robin±Robin preconditioner are plotted. It is very clear on the Fig. 2 that for the Robin±Robin preconditioner, the residual does not change signi®cantly for Nsd =2 ˆ 18 iterations, then drops quickly. For GMRES without preconditioner, the same kind of phenomenon is observed, but the decay of the residual is obtained only after Nsd ˆ 36 iterations. This con®rms very well the theory above on the idempotency of the preconditioned or not Schur Table 1 Number of iterations for di€erent velocity ®elds and ®ve subdomains Viscosity

Precond.nvelocity

Normal

Parallel

Oblique

Rotating

m ˆ 0:001

R±R N±N ±

3 52 14

2 2 34

5 42 13

36 >100 71

mˆ1

R±R N±N ±

9 9 30

9 9 38

10 10 41

10 11 41

164

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

Table 2 In¯uence of the number of subdomains Partition Grid c ˆ 1, ~ a ˆ 3~ e1 , m ˆ 0:001

R±R ± N±N

41 20  40

81 20  40

12  1 20  40

24  1 20  40

36  1 20  40

5 13 43

8 18 >100

12 22 >100

23 33 >100

30 44 >100

Fig. 2. The convergence with and without preconditioner: 36 subdomains

complement matrix. Note that, since the number of grid nodes is not large enough, the rates of convergence after the idempotency threshold are of the same order. This would not be the case for a ®ner grid, as we can see on Fig. 3 and in Section 4.3. On Fig. 3, the convergence plots correspond to the same experiment except that the grid has been re®ned with a geometrical progression of ratio 0.9 in the x2 direction. Therefore, the grid is very ®ne near the boundary x2 ˆ 0. We see on Fig. 3 that the idempotency properties of the preconditioned operator are conserved, although they appear less clearly. On the contrary, the non-preconditioned operator yields a very poor convergence. 4.3. In¯uence of the grid anisotropy and boundary layers We carry out exactly the same test as in 4.2, but the velocity has a boundary layer: 2 ~ e1 a ˆ 3 ÿ …300  …x2 ÿ 0:1† †~ ~ a ˆ 3~ e1

if x2 < 0:1 if x2 P 0:1

…43†

To capture the boundary layer, the mesh is re®ned in the x2 -direction, near the boundary x2 ˆ 0 with a geometric progression of ratio 0.9 (see Table 3). In comparison with the tests of Section 4.2, we see that the performances deteriorate due to the change of velocity and grid, but is very clear that the Robin±Robin method is the less severely a€ected. The nilpotency threshold is also observed here. Table 4 completes these results. Here, the subdomains are squares of sizes 0:25  0:25 and the velocity can either be uniform ~ a ˆ 3~ e1 or have a boundary layer, see (43). The meshes can either be uniform, or be progressively re®ned with a geometrical progression of ratio 0.9 in the x2 direction. Here again, we see that preconditioning becomes particularly useful when the grid is dramatically re®ned, as in the case of a boundary layer for instance.

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

165

Fig. 3. The convergence with and without preconditioner: 36 subdomains Table 3 Anisotropic grids Partition Grid cˆ1 m ˆ 0:001

R±R ± N±N

41 20  40

81 20  40

12  1 20  40

24  1 20  40

36  1 20  40

11 51 49

18 75 > 100

25 91 > 100

39 > 100 > 100

51 > 100 > 100

Table 4 In¯uence of anisotropic mesh re®nement and boundary layers Velocity

Re®nement

Uniform

Geometrical Progression Uniform

B. Layer

Geometrical Progression Uniform

Partition

31 40  40

31 60  60

31 60  60

R±R ± R±R ±

7 37 5 12

7 48 5 10

8 69 6 12

R±R ± R±R ±

9 40 8 18

8 57 7 20

10 73 10 24

4.4. In¯uence of the number of unknowns Here, we test the in¯uence of the number of grid points. The domain is the unit square, which is partitioned into 4  4 subdomains. In each subdomain, the grid varies from 20  20 up to 60  60. The velocity x ÿ~ x0 †, …x0 is the center of X† the viscosity is m ˆ 0:001 and the constant c equals to 10ÿ7 . Thus, is ~ a ˆ~ e3  …~ this case is almost a steady state computation. A coarse space solver of BPS type is used here, but it will be discussed later. Note that this test is reputed dif®cult, since although convection effects dominate, diffusion is the only effect that carries information from the boundary of the domain to its center. The convergence is not affected by the grid size (see Table 5).

166

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

Table 5 In¯uence of the number of grid points

c ˆ 10ÿ7 , m ˆ 0:001, ~ a ˆ~ e3  …~ x ÿ~ x0 †

Partition Grid

44 20  20

44 30  30

44 40  40

44 60  60

R±R ±

34 80

34 82

34 89

34 ±

Table 6 Coarse Space Grid

Coarse sp.

22 30  30

33 30  30

55 30  30

77 30  30

10  10 30  30

c ˆ 10ÿ4 ~ x ÿ~ x0 † a ˆ~ e3  …~ c ˆ 10ÿ4 ~ a ˆ 3~ e1

Yes No Yes No

20 22 7 7

26 31 13 12

36 51 18 15

42 68 22 18

48 91 25 23

4.5. Coarse space Here, we investigate the possibility of adding a coarse space solver to our preconditioner: we have chosen the space of functions which are the harmonic liftings of piecewise linear functions on the skeleton of the domain decomposition. This coarse space is similar to that used by Bramble, Paschiak and Schatz. In addition, for the Robin problems, we ®x one internal value to 0 in each subdomain which does not touch the physical boundary. The domain is the unit square, the viscosity is m ˆ 0:01 and the parameter c equals a ˆ~ e3  …~ x ÿ~ x0 † (vortex) or ~ a ˆ 3~ e1 . to 10ÿ4 (the problem is almost stationary). The velocity can either be ~ The number of subdomains varies from 2  2 up to 10  10, and the grid in each subdomain is 30  30. Note that the coarse space solver is really useful in the case of the rotating velocity: the convergence still varies when the diameter H of the subdomains decreases but the number of necessary iteration seems to grow sub-linearly with 1=H , while the growth is linear when no coarse space is used. For the uniform velocity, the coarse space solver does not seem useful. Remark that the viscous effects are more important for the rotating velocity. Further investigations are clearly needed at this level (see Table 6). 5. Preliminary results in three-dimensions The three-dimensional cases described here deal with domains of arbitrary shapes, decomposed into hexaedra or tetrahedra. The problem (24) is discretized by the stabilized Galerkin Least-Squares technique described in Section 3.2 using second order elements. As in the two-dimensional case, the interface problem is solved by a GMRES algorithm using the Robin±Robin preconditioner. The algorithm stops when the residual is less than 10ÿ6 . 5.1. A model problem The ®rst experiments deal with a regular discretization of the unit cube ‰0; 1Š  ‰0; 1Š  ‰0; 1Š on which a problem with a known analytical solution …u ˆ x2 ‡ y 2 ‡ z2 † is solved. We chose c ˆ 0:1 and m ˆ 1. and a uniform oblique velocity ~ a ˆ …1; 7; 9†. The total number of ®nite elements (NE), the corresponding number degrees of freedom (NDOF) and the number of iterations are given in Table 7. For the decomposition in nine subdomains we have also considered two types of boundary conditions: · Dirichlet condition on all the boundaries (18 iterations); · Dirichlet condition on ®ve faces of the cube and Robin boundary condition on the bottom face (40 iterations).

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

167

Table 7 A model three-dimensional problem Partition NE NDOF R±R

211 216 1232 3

221 256 1409 10

331 576 3021 11

333 1728 8294 18

No coarse space is used here. The results are sensitive to the number of subdomains, but the number of iterations remains very reasonable in all cases.

5.2. In¯uence of the velocity ®eld We next study the in¯uence of the velocity ®eld on the behavior of the preconditioner for an advection dominated case. For this we consider the following three decompositions (note that the domain is chosen such as each subdomain has a nice aspect ratio, and that each subdomain is made of 64 ®nite elements): 2 Case 1: The cube ‰0; 1Š  ‰0; 0:5Š split in 2  2  1 subdomains Case 2: The cube ‰0; 1Š2  ‰0; 0:33Š split in 3  3  1 subdomains Case 3: The cube ‰0; 1Š3 split in 3  3  3 subdomains The results are summarized in Table 8, where ~ v1 is de®ned by: ~ e1 v1 ˆ 300  x22~ ~ e1 v1 ˆ 3~

if x2 < 0:1 if x2 P 0:1

…44†

and ~ v2 denotes the velocity ®eld given by equation (43). These results show that the Robin±Robin preconditioner is far less sensitive than the original Neumann± Neumann one to the choice of velocity ®elds. This was expected since the original Neumann±Neumann preconditioner ignores all ~ a ~ n boundary coecients. Finally we solved the same problem, with ~ a ˆ …y=2 ÿ 0:5; ÿx=2 ‡ 0:5; 0† on the unstructured decomposition of Fig. 1. Here the unit cube contains 24 576 tetrahedric second order ®nite elements and is split into 45 subdomains by an automatic mesh partitioner. This is why the boundaries between subdomains are less regular than for the other computations. For this decomposition the algorithm converges in 48 iterations with the R±R preconditioner and in 45 iterations with the addition of the coarse space.

Table 8 In¯uence of the velocity ®eld Case 1

Case 2

12 24

21 51

35 165

R±R N±N

9 23

13 65

19 92

~ a ˆ~ v2 ; m ˆ 10ÿ2 ; c ˆ 10ÿ4 R±R N±N

9 37

16 88

25 >200

~ a ˆ …y=2 ÿ 0:5; ÿx=2 ‡ 0:5; 0†; m ˆ 10ÿ3 ; c ˆ 10ÿ4 R±R N±N

Case 3

~ a ˆ~ v1 ; m ˆ 10ÿ2 ; c ˆ 10ÿ4

168

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

Fig. 4. The decomposition of the tube into subdomains

5.3. In¯uence of the coarse space In this section we consider a long tube (cf. Fig. 4) ‰ÿ2; 2Š2  ‰0; 2Š decomposed into 32 subdomains …2  2  8†. Each subdomain contains 64 second order hexaedra (425 nodes). The coarse space is the domain based space described in Section 3. We solve the following problem: ÿ div …m…x†ru† ‡ ~ a…x†  ru ‡ c…x†u ˆ 0 u ˆ 2 ‡ x ‡ y on oXB ;

on X;

u ˆ 0 on oXT ; ou m…x† ˆ 0 on oXN ; on

…45† …46† …47† …48†

where XB denotes the bottom of the beam, XT the top and XN the lateral boundary of the beam. We consider two velocity ®elds: ~ e1 ÿ 0:1  x  …y 2 ÿ 4†~ e2 ‡ 4~ e3 ; a1 ˆ 0:1  y  …x2 ÿ 4†~ ~ e1 ÿ 0:1  x  …y 2 ÿ 4†~ e2 ‡ …4 ‡ 0:1  z  …y=10 ‡ x=10 ÿ 0:5††~ e3 : a2 ˆ 0:1  y  …x2 ÿ 4†~ The case is dicult because of the aspect ratio of the domain, and because of our choice of coecients imposing very little di€usion and almost no stabilization c. The results, obtained for different values of m and c are summarized in Table 9. The coarse space slightly improves the eciency of the preconditioner, but this is not really spectacular. The next test is a three-dimensional simulation of the two-dimensional problem solved in the previous section, with the rotating velocities. Starting from a uniform discretization of the unit square in 4  4 subdomains we obtain a three-dimensional mesh by transforming each quadrangle into a hexaedron. In order to obtain a good aspect ratio, the third-dimension is equal to the discretization step. Dirichlet boundary conditions are considered on the lateral boundary. The top and bottom faces are subjected to homogeneous Neumann boundary conditions. The results obtained are described in Table 10 for di€erent discretization steps.

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

169

Table 9 In¯uence of the coarse space Velocity

m

c

R±R

R±R + coarse

~ a1 ~ a2 ~ a2

10ÿ2 10ÿ2 10ÿ3

10ÿ3 10ÿ3 10ÿ7

30 36 57

28 31 51

Table 10 Three-dimensional simulation of a two-dimensional problem Grid

m

c

R±R

R±R + coarse

44 20  20 20  20

10ÿ3 10ÿ3 10ÿ3

10ÿ7 10ÿ7 10ÿ4

19 20 20

18 18 18

The results are very similar to the bidimensional case and completely insensitive to the chosen grid. Moreover, for such a small number of subdomains, the proposed preconditioner is already very ecient without coarse space.

6. Conclusion The proposed preconditioner is a generalization of the Neumann±Neumann algorithm. Its main advantages are · Its robustness, assessed ®rst theoretically by a Fourier analysis in the special case of domain decomposition into strips, then by various numerical tests in two and three-dimensions. In the tests above, the Robin±Robin preconditioner has proved fairly insensitive to the choice of the grid, the velocity, the viscosity and the constant c. It is sensitive to the sizes of the subdomains, but this seems unavoidable when convections dominates. Another nice feature is that the Robin±Robin preconditioner transforms continuously to the Neumann±Neumann preconditioner as the convection effects vanish. · It has the same algebraic structure as the Neumann±Neumann preconditioner. This means that when a software contains the Neumann±Neumann preconditioner, constructing the Robin±Robin preconditioner requires almost no modi®cations. Therefore, it has been easily implemented in the most general case i.e. a three-dimensional case with unstructured and automatically partitioned mesh. · A theoretical framework for adding a coarse space solver is now available. However, our knowledge on this preconditioner is not complete. In particular, the analysis of convergence in the most general case seems dicult, since energy considerations are clearly not sucient. Further works need to be done, in particular, · the algorithm will be tested on less academic situations. · The coarse space to be used will be speci®ed better and validated. References [1] Y. Achdou, F. Nataf, A Robin±Robin preconditioner for an advection±di€usion problem, C.R. Acad. Sciences Paris, series I 325 (1997) 1211±1216. [2] V. Aghoskov, Poincare±Steklov's operators and domain decomposition methods in ®nite dimensional spaces, in: Proceedings of DDM 1. [3] J.F. Bourgat, R. Glowinski, P. Le Tallec, M. Vidrascu, Variational formulation and algorithm for trace operator in domain decomposition calculations, in: Proceedings of DDM 2, AMS, 1988, pp. 3±16. [4] J. Bramble, J. Pasciak, A. Schatz, The construction of preconditioners for elliptic problems by substructuring, Math. Comp. 47 (1986) 103±134.

170

Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170

[5] Xiao-Chuan Cai, Some Domain Decomposition Algorithms for Nonselfadjoint Elliptic and Parabolic Partial Di€erential Equations. PhD thesis, Courant Institute of Mathematical Sciences, September 1989, Tech. Rep. 461, Department of Computer Science, Courant Institute. [6] Xiao-Chuan Cai, An additive Schwarz algorithm for nonselfadjoint elliptic equations, in: T. Chan, R. Glowinski, J. Periaux, O. Widlund (Eds.), Third International Symposium on Domain Decomposition Methods for Partial Di€erential Equations, SIAM, Philadelphia, PA, 1990, pp. 232±244. [7] Xiao-Chuan Cai, O. Widlund, Domain decomposition algorithms for inde®nite elliptic problems, SIAM J. Sci. Statist. Comput. 13 (1) (1992) 243±258. [8] T. Chan, T. Mathew, Domain decomposition algorithms, in: Acta Numerica, Cambridge University Press, Cambridge, 1994, pp. 13±51. [9] L. Cowsar, J. Mandel, M. Wheeler, Balancing domain decomposition for mixed ®nite elements, Technical Report 93±08, Department of Mathematical Sciences, Rice University, 1993. [10] B. Despres, Domain decomposition method and the Helmholtz problem ii, in: K. Ralph et al. (Eds.), Mathematical and Numerical Aspects of Wave Propagation, SIAM, Philadelphia 1993, pp. 197±206. [11] M. Dryja, O. Widlund, Additive Schwarz methods for elliptic ®nite element problems in three-dimensions, in: Proceedings of DDM 5. [12] F. Gastaldi, L. Gastaldi, A. Quarteroni, Adaptative domain decomposition methods for advection dominated equations, East± West J. Numer. Math. 4 (1996) 165±206. [13] C. Japhet, Optimized Krylov±Ventcell method application to convection±di€usion problems. in: Ninth International Symposium on Domain Decomposition Methods for Partial Di€erential Equations, Wiley, New York, 1996. [14] C. Johnson, U. Navert, J. Pitkaranta, Finite element method for linear hyperbolic problems, Comp. Meth. Appl. Engrg. 45 (1984) 285±312. [15] J. Mandel, Two-level domain decomposition preconditioning for the p-version ®nite element method in three-dimensions, Int. J. Num. Meth. Eng. 29 (1990) 1095±1108. [16] P. Morice, Transonic computations by a multidomain technique with potential and Euler solvers, in: J. Zienep, H. Oertel (Eds.), Symposium Transsonicum III, IUTAM, G ottingen, 24±27 May 1988, 1989, Springer, Berlin. [17] F. Nataf, F. Nier, Convergence rate of Schwarz type methods for an arbitrary number of subdomains, in: Ninth International Symposium on Domain Decomposition Methods for Partial Di€erential Equations, Wiley, New York, 1996. [18] F. Nataf, F. Nier, Convergence rate of some domain decomposition methods for overlapping and non-overlapping subdomains, Numerische Mathematik 75 (1997) 357±377. [19] F. Nataf, F. Rogier, Factorization of the convection-di€usion operator and the Schwarz algorithm, Math. Models Appl. Sci. 5 (1995) 67±93. [20] Y. Saad, H. Schultz, Gmres: a generalized minimal residual algorithm for solving non-symmetric linear systems, SIAM J. Sci. Stat. Comput 7 (1986) 856±869. [21] B. Smith, An optimal domain decomposition preconditioner for the ®nite element solution of linear elasticity problems, SIAM J. Sci Stat. Comp. 13 (1992) 364±378. [22] P. Le Tallec, Domain decomposition methods in computational mechanics, Comput. Mech. Advances 1 (2) (1994) 121±220. [23] P. Le Tallec, Y.H. De Roeck, M. Vidrascu, Domain decomposition methods for large linearly elliptic three-dimensional problems, J. Comp. Appl. Math. 34 (1991) 93±117. [24] O. Widlund, Iterative substructuring methods: algorithms and theory for elliptic problems in the plane, in: G. Golub, R. Glowinski, J. Periaux (Eds.), Proceedings of the First International Symposium on Domain Decomposition Methods for Partial Di€erential Equations, SIAM, Philadelphia, 1988.