Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
www.elsevier.com/locate/cma
A domain decomposition preconditioner for an advection±diusion problem Yves Achdou a,*, Patrick Le Tallec b, Frederic Nataf c, Marina Vidrascu d a
c
INSA Rennes, 20 Av des Buttes de Coesmes, 35043 Rennes, France b Universit e Paris Dauphine, 75 775 Paris Cedex 16, France CMAP UMR7641 CNRS, Ecole Polytechnique, 91128, Palaiseau Cedex, France d INRIA Rocquencourt, BP 105, 78 153 Le Chesnay Cedex, France Received 1 February 1999
Abstract We propose a generalization of the Neumann±Neumann algorithm for advection±diusion problems: the Neumann conditions are replaced by suitable Robin conditions. The method is ®rst introduced and analysed in the case where the domain is partitioned into vertical strips, then generalized to an arbitrary decomposition. A coarse space procedure is proposed. Finally, numerical results for bidimensional and tridimensional tests are given. Ó 2000 Elsevier Science S.A. All rights reserved.
1. Introduction The primary objective of domain decomposition methods is the ecient solution on parallel machines of problems in Computational Mechanics set on complex geometries and approximated on very ®ne grids. Most methods refer to iterative substructuring techniques (the so-called Schur domain decomposition method) and use non-overlapping domain partitions which split the original domain into small disjoint subdomains and reduce the problem to an interface problem solved by an iterative method. The main issue conditioning the success and parallel eciency of such techniques is the choice of the preconditioner which must have nice parallel properties, must be able to handle arbitrary elliptic operators and discretization grids, and its performance must be insensitive to the discretization step h and to the number of subdomains. Many such preconditioners have appeared in the literature, following the early work of Bramble, Pasciak and Schatz [4] ([8,9,11,15,24]). For three-dimensional elasticity, ef®cient results have been obtained using either the wire-basket algorithm of Smith [21] or the Neumann±Neumann preconditioner of [23,22]. For advection±diffusion or Helmholtz problems, one can also choose arti®cial boundary conditions as interface conditions to improve the condition number [10,12,13,19]. For the analysis of overlapping domain decomposition methods for non-symmetric problems, see [5±7]. We propose herein a generalization of the Neumann±Neumann preconditioner for the Schur domain decomposition method applied to an advection±diusion equation, by replacing the local Neumann boundary conditions by suitable Robin boundary conditions which take into account the non-symmetry of the operator. The choice of these conditions comes from a Fourier analysis, which is given in Section 2. When the operator is symmetric the proposed Robin boundary conditions reduce to Neumann boundary
*
Corresponding author.
0045-7825/00/$ - see front matter Ó 2000 Elsevier Science S.A. All rights reserved. PII: S 0 0 4 5 - 7 8 2 5 ( 9 9 ) 0 0 2 2 7 - 3
146
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
conditions. Also, as in the symmetric case, the proposed preconditioner is exact for two subdomains and a uniform velocity. The paper is organized as follows. In Section 2, the method is de®ned and analyzed for a decomposition into strips at the continuous level. In particular, it is shown that the preconditioned system is close to an idempotent operator. Then, the proposed preconditioner is constructed directly at the algebraic level. Next, a variational writing is introduced in Section 3 which makes it possible to generalize the proposed approach to general partitions and to include a coarse space. Numerical results illustrating the performances of the proposed technique conclude the paper.
2. The vertical strip case The general case of an arbitrary domain decomposition is treated in Section 3. 2.1. The continuous case ± de®nition of the algorithm We consider a rectangular domain X 0; H ÿ g; g, and the advection±diusion boundary value problem in X L
u cu ~ a ru ÿ mDu f in X; u 0 on oX:
1
The positive constant c may arise from a time discretization by an Euler implicit scheme of the time dependent equation. The equation is solved by a primal Schur method. We focus on the case when the domain is decomposed into non-overlapping vertical strips Xk
lk ; lk1 ÿ g; g; 1 6 k 6 N . Let nk . Ck;k1 flk1 g ÿ g; g: The unit outward normal vector to domain Xk is denoted by ~ We introduce N ÿ1 1=2 N ÿ1 L2
X !
H ÿ1=2
ÿ g; g S : H00
ÿ g; g ! 1 ovk ovk1 m
uk 1 6 k 6 N ÿ1; f 7! 2 onk onk1 Ck;k1
;
2
1 6 k 6 N ÿ1
where vk satis®es L
vk f
in Xk ;
vk uk on Ck;k1 for 1 6 k 6 N ÿ 1; vk ukÿ1 on Ckÿ1;k for 2 6 k 6 N ; vk 0
3
on oX \ oXk :
It is clear that U
uCk;k1 1 6 k 6 N ÿ1 satis®es S
U ; 0 ÿS
0; f : At the continuous level, we propose an approximate inverse of S
:; 0 de®ned by 1=2
ÿ g; gN ÿ1 ; T :
H ÿ1=2
ÿ g; gN ÿ1 !
H00 1
vk vk1 Ck;k1 ;
gk 1 6 k 6 N ÿ1 7! 2 1 6 k 6 N ÿ1
4
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
147
where vk satis®es
m
~ a ~ nk o ÿ 2 onk
~ a ~ nk o ÿ m 2 onk
L
vk 0 in Xk ; !
vk gk
on Ck;k1 for 1 6 k 6 N ÿ 1;
!
5
vk gkÿ1
on Ckÿ1;k for 2 6 k 6 N ;
vk 0 on oX \ oXk : Remark 1. In Section 3, the boundary value problem (5) is proved to be well-posed (see also [1]) under the usual condition c ÿ 1=2 div ~ a > 0. Remark 2. Our approach is different from that used in [19] or [12], where the interface conditions a ~ nk ; 0 are used in the framework of Schwarz algorithms. m
o=onk ÿ min
~ In Section 2.2, we analyze the preconditioner in the special case where g 1, for simplicity. 2.2. Theoretical study of the algorithm ± uniform velocity Let us ®rst recall the Proposition 2.1 (see [1]). In the case where the plane R2 is decomposed into the left
X1 ÿ 1; 0R and a is uniform, we have that right
X2 0; 1R half-planes, and where the velocity ~ T S
:; 0 Id: This result shows that, as it is the case for the Neumann±Neumann algorithm applied to symmetric operators, the Robin±Robin preconditioner is exact for a two domain decomposition. For a more general decomposition into an arbitrary number of vertical strips, the vector ~ a is denoted ~ a
ax ; ay . We assume without loss of generality that the component of the velocity normal to the inof the subdomains. The main result of terface is positive ax P 0. Let H denote the minimum of the widths p the section is Theorem 3 below: it states that for expfÿ
ax a2x 4mcH =mg small enough, the preconditioned system is close to an idempotent operator of order N =2, where x denotes the integer part of x. This has an important effect on the convergence of GMRES applied to the preconditioned system, as it will be seen from Corollary 6 and Fig. 2. QN ÿ1 For Theorem 3, we introduce HsN k1 H s
Ck;k1 ; s 2 R the space of H s functions on the N ÿ 1 interfaces. Let U 2 HsN be denoted by U
uk 1 6 k 6 N ÿ1 . The space HsN is endowed with the norm kU kHs sup1 6 k 6 N ÿ1 kuk kH s
Ck ;k1 . N;1
Theorem 3. Assume that the velocity field ~ a is uniform. Let X
0; HX R be decomposed into N nonoverlapping vertical strips Xk
lk ; lk1 R; 0 6 k 6 N ÿ 1 and Ck;k1 flk1 g R. Let H be the size of the smallest subdomain, H mink
lk1 ÿ lk . Assume that q
6 expfÿ
ax a2x 4mcH =mg < 1; and that q 3N =
1 ÿ
N 1
< 1:
Then, for n P N =2, we have n
k
T S
:; 0 ÿ Id kL
Hs 6 N
N 1 qn=N =2ÿ1 ; 2
1 ÿ q
1 ÿ N ÿ2
where [x] denotes the integer part of x.
7
148
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
Remark 4. The above result explains why the GMRES algorithm stagnates initially during the [N/2] first iterations and then converges fast, see Corollary 6 and Figs. 2 and 3. This behavior is classical for convection± diffusion type problems, see for instance [19] or [13]. Remark 5. In fact, as it can be seen from formulas (13) below, as soon as q expf
ax ÿ a2x 4mcH =mg 1; the operator T S is already very close to identity and a better estimate can easily be obtained: p 6 expf
ax ÿ a2x 4mcH =mg :
8 k
T S
:; 0 ÿ IdkL
Hs 6 N
1 ÿ 3 p 2 The above theorem is thus interesting when e
ax ÿ ax 4mcH =m is close to 1. This corresponds for instance to the case of a very large time step with a strong laminar flow. On the other hand, the above results cannot be applied when the operator is symmetric and very close to a Laplace operator (c 1. Indeed, in order to have good convergence rates, it is then necessary to modify the Robin±Robin preconditioner (which amounts to the Neumann±Neumann preconditioner) by adding a coarse space and a pseudo inverse for the Neumann problem (5). Proof. The width of subdomain Xk is denoted by Hk ; Hk lk1 ÿ lk . The proof is based on the Fourier transform in the y direction and the Fourier variable is denoted by n. The inverse Fourier transform is denoted by Fÿ1 . The action of S
:; 0 can be expressed in terms of Fourier transform S
nU^
n; S
:; 0
U Fÿ1
c
U 2 HsN ;
where c S is a tridiagonal
N ÿ 1
N ÿ 1 symbol-valued matrix. For instance, the value of c S k1;k1j ; j ÿ1; 0; 1 is computed in the following way. Let um 2 H s
Cm;m1 , m k ÿ 1; k; k 1. Let vk satisfy L
vk 0 in Xk ;
9
with the boundary conditions vk uk on Ckÿ1;k ; vk uk1 on Ck;k1 ;
10
and L
vk1 0 in Xk1 ; vk1 uk1 on Ck;k1 ; vk1 uk2
on Ck1;k2 :
We shall compute the Fourier transform of 1=2m
ovk =onk
ovk1 =onk1 Ck;k1 . The Fourier transform of (9) w.r.t. y yields vk
x; n 0;
c ax ox ay in ÿ moxx mn2
^ where i2 ÿ1. For a given n, this equation is an ordinary dierential equation in x whose solutions have the form a
n expfkÿ
n
x ÿ lk g b
n expfk
n
x ÿ lk1 g, where q a 4mc a2x 4iay nm 4n2 m2 x :
11 k
n 2m
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
149
We have, Re
k
n > 0, where Re
z denotes the real part of a complex number z. The Dirichlet boundary < conditions (10) yield the values of a and b:
u^k
n ÿ eÿk
nHk u^k1
n ; a
n ÿ 1 ÿ e
k
nÿk
nHk ÿ u^k1
n ÿ ek
nHk u^k
n : b
n ÿ 1 ÿ e
k
nÿk
nHk Hence, o^ vk o^ vk Ck;k1 Ck;k1 onk ox ÿ ÿ m
kÿ ÿ k ek Hk mk ÿ mkÿ e
k ÿk Hk ^ u^k u : k1 ÿ ÿ 1 ÿ e
k ÿk Hk 1 ÿ e
k ÿk Hk The expression for
o^ vk1 =
onk1 Ck;k1 can be obtained in the same manner. Then, we obtain ! ÿ ÿ ÿ m o^ vk o^ vk1 m
kÿ ÿ k ek Hk mk ÿ mkÿ e
k ÿk Hk ÿmkÿ mk e
k k Hk1 u^k1 u^k ÿ ÿ ÿ 2 onk onk1 2
1 ÿ e
k ÿk Hk 2
1 ÿ e
k ÿk Hk 2
1 ÿ e
k ÿk Hk1 Ck;k1
u^k2
m
kÿ ÿ k eÿk Hk1 : ÿ 2
1 ÿ e
k ÿk Hk1
This yields the following formulas for c S ÿ
m
kÿ ÿ k ek Hk c ; S k1;k ÿ 2
1 ÿ e
k ÿk Hk ÿ
ÿ
mk ÿ mkÿ e
k ÿk Hk ÿmkÿ mk e
k k Hk1 c ; S k1;k1 ÿ ÿ 2
1 ÿ e
k ÿk Hk 2
1 ÿ e
k ÿk Hk1
12
m
kÿ ÿ k eÿk Hk1 c : S k1;k2 ÿ 2
1 ÿ e
k ÿk Hk1 Similarly, we can compute the symbol-valued matrix c T: c ^ T
nG
n; T
G Fÿ1
c
G 2 HsN :
Hence, the preconditioned system reads Tc S
nU^
n; TS
:; 0
U Fÿ1
c
U 2 HsN ;
where c Tc S is a pentadiagonal matrix. After a tedious computation, we obtain that its non-zero entries are given by the following formulas:
c Tc Sk;k2 ÿ
c Tc Sk;k1
eÿk ÿ
1 ÿ e
k
1 ÿ e
k
c Tc Sk;k 1
c Tc Sk;kÿ1
ÿk Hk2
1 ÿ
ÿ
ÿk
ÿ
ÿ e
k
ÿk Hk1
ÿ
;
e
k ÿk Hk2 ÿ e
k ÿk Hk eÿk Hk1 ; Hk
1 ÿ e
kÿ ÿk Hk1
1 ÿ e
kÿ ÿk Hk2
ÿ
ÿ
e
k ÿk Hk e
k ÿk Hk1 ; ÿ ÿ
1 ÿ e
k ÿk Hk
1 ÿ e
k ÿk Hk1 ÿ
e
k
1 ÿ
c Tc Sk;kÿ2 ÿ
Hk1 Hk2
ÿ
ÿk Hkÿ1
e
k ÿk Hkÿ1
1 ek ÿ
1 ÿ e
k
ÿ
ÿ
ÿ
ÿ e
k ÿ
ÿk Hk1
e
k ÿk Hk
1 ÿ
ÿ e
k
ÿ
ÿ
Hkÿ1 Hkÿ2
ÿk Hkÿ2
1
13
ÿk Hkÿ1
ÿ e
k ÿk Hk1
:
ek
Hk
;
150
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
The above expressions are valid for any k as long as the indices make sense and except for the following entries of the matrix c Tc S, where the Dirichlet boundary conditions (5) in the Robin±Robin preconditioner yield some minor modi®cations:
c Tc S1;1 1
ÿ
ÿ
e2
k ÿk H1 e
k ÿk H2 ; ÿ ÿ
1 ÿ e2
k ÿk H1
1 ÿ e
k ÿk H2 ÿ
ÿ
e
k ÿk H3 e
k ÿk H1 eÿk H2 ; ÿ ÿ ÿ
k ÿk H
k ÿk H
k ÿk H 3
1 ÿ e 2
1 e 1
1 ÿ e
c Tc S1;2
c Tc SN ÿ1;N ÿ1 1
ÿ
ÿ
e2
k ÿk HN e
k ÿk HNÿ1 ; ÿ ÿ
1 ÿ e2
k ÿk HN
1 ÿ e
k ÿk HNÿ1 ÿ
e
k
ÿk HN ÿ2
ÿ
ÿ
e
k ÿk HN ek HN ÿ1 : ÿ ÿ ÿ
1 ÿ e
k ÿk HN ÿ2
1 ÿ e
k ÿk HNÿ1
1 e
k ÿk HN
c Tc SN ÿ1;N ÿ2
Tc S is very close to the identity. But, for For large values of n, it is clear from the expressions of k
n that c n close to zero and c 1 (quasi steady state problems), the extra diagonal entries
c Tc Sk;kÿ2 are close to ÿ1 while the other extra-diagonal entries remain small. This is the reason why a theoretical study of the convergence has to be done. For this, matrix c Tc S is written as the sum of three matrices: the Identity, a c b nilpotent matrix of order N =2
A and a small perturbation
B: c b Tc S Id c A B;
14
where c A
c A k;l 1 6 k;l 6 N ÿ1
c Tc Sk;l 0
for l k ÿ 2; 3 6 k 6 N ÿ 1; otherwise:
b is given by (14). From the structure of c The value of B A, it is clear that c A N =2 0: In order to estimate the norms of the matrices, we ®rst have to de®ne the norms that will be used. Let x
x1 ; . . . ; xN ÿ1 be a vector in RN ÿ1 and kxk1 denote the maximum norm, kxk1 sup1 6 k 6 N ÿ1 jxk j. For a matrix D
Dij 1 6 i;j 6 N ÿ1 2 R
N ÿ1
N ÿ1 , the corresponding operator norm is kDk1 supi Rj j Dij j: kDx k1 6 kDk1 kxk1 . By applying the result of [17] (see also [18]), we have the following estimate. Proposition 2.2. If b q
n k B
nk 1
NX =2ÿ1 i0
kc A
nki1 < 1:
15
Then, define C
n by C
n
NX =2ÿ1 1 kc A
nki1 1 ÿ q
n i0
16
we have for n P N =2, b n
nk k
c Tc S ÿ Idn
nk1 6 C
nq
nn=N =2ÿ1 : k
c A B 1
17
Going back to the proof of Theorem 3, we shall estimate q
n independently of n. Let be as in Theorem 3,
jeÿk
nHk
j6
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
151
b for any n. Therefore, the norm of c A
n is bounded by 1=
1 ÿ 2 and that of B
n by 5=
1 ÿ 3 . Then, by i N ÿ2 for i 6 N =2 we get estimating k c A
nk1 by 1=
1 ÿ NX =2ÿ1 i0
kc Ak1 6 i
N 1 : 2
1 ÿ N ÿ2 N 1
Hence, q
n may be roughly estimated by 3N =
1 ÿ . By using this estimate and Parseval±Plancherel theorem, we can conclude. We now consider the implications of Theorem 3 for the GMRES algorithm (see [20]) applied to the problem
T S
:; 0
U ÿT S
0; f :
18
Since T S
:; 0 is a bounded operator from HsN into itself, it is possible to de®ne a ``continuous'' GMRES algorithm. For this, we use a norm induced by a scalar product. The space HsN is endowed with the norm v u N ÿ1 uX kuk k2H s
Ck;k1 : kU kHs t N ;2
We have kU kHs
N ;1
k1
p 6 kU kHs 6 N ÿ 1 kU kHs : N;1
N ;2
0 0. Hence, the For the sake of simplicity, the GMRES algorithm is considered with an initial guess, Ugmres initial residual is
r0 T S
0; f : The Krylov space Kn
r0 is given by Kn Spanfr0 ;
T S
:; 0
r0 ; . . . ;
T S
:; 0nÿ1
r0 g: n is de®ned by The GMRES iterate Ugmres n r0 kHs minn kT S
:; 0
U r0 k: k
T S
:; 0
Ugmres U 2K
N ;2
19
Since the k:kHs norm is induced by a scalar product, the above minimization problem amounts to the N ;2 solving of a n n square linear system. We have the n be defined as in (19) and the residual at Corollary 6. Let r0 T S
0; f be the initial residual. Let Ugmres n n 0 step n be defined by r
T S
:; 0
Ugmres r . Then, under the assumptions of Theorem 3, we have the following estimate for n P N =2
krn kHs 6 Cqn=N =2ÿ1 kr0 kHs ; N ;2
N ;2
where p C N ÿ1 1
5 3
1 ÿ
!
1 2
1 ÿ
N 2
2
1 ÿ q
1 ÿ N ÿ2
and q and are defined in Theorem 3. Remark 7. Since this result is stated at the continuous level, it is a clear indication that the same kind of estimate should remain valid as the mesh size tends to zero.
152
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
Proof. The proof is based on a comparison between a ®xed point algorithm and the GMRES method de®ned above. Indeed, we introduce the following algorithm 0 Ufixed
pt
n1 Ufixed
pt
0;
n
Id ÿ T S
:; 0 Ufixed pt ÿ T S
0; f :
n It can easily be checked that Ufixed
pt
2 Kn . Then, by (19), we have
n n 0 r0 kHs 6 k
T S
:; 0
Ufixed k
T S
:; 0
Ugmres pt r kHs : N ;2
N;2
We now have to estimate the right-hand side of the above inequality. We have n Ufixed
pt
ÿ
nÿ1 X i
Id ÿ T S
:; 0 T S
0; f : i0
Under the assumptions of Theorem 3, this series converges to the solution U of (18), U ÿ
1 X i
Id ÿ T S
:; 0 T S
0; f : i0
Hence, n 0 n
T S
:; 0
Ufixed pt r
T S
:; 0
Ufixed
pt
ÿ U
i ÿT S
:; 0R1 in
Id ÿ T S
:; 0 T S
0; f i
0 ÿT S
:; 0R1 in
Id ÿ T S
:; 0 r :
This gives i
n 0 1 0 k
T S
:; 0
Ufixed pt r kHs 6 kT S
:; 0Rin
Id ÿ T S
:; 0 r kHs N ;2
N ;2
i
S
:; 0R1 in
Id
6 kT ÿ T S
:; 0 kHs kr0 kHs N ;2 N;2 p i 1 6 N ÿ 1kT S
:; 0Rin
Id ÿ T S
:; 0 kHs kr0 kHs N ;1 N ;2 p i 1 6 N ÿ 1kT S
:; 0kHs Rin k
Id ÿ T S
:; 0 kHs kr0 kHs : N;1
3
N ;1
2
N ;2
i
The norm of T S
:; 0 is estimated by 1 5=
1 ÿ 1=
1 ÿ and that of
Id ÿ T S
:; 0 (for i P N =2) by using Theorem 3. Remark 8. From formulas (12), it appears that under the assumptions of Theorem 3, the symbol c S
n of the unpreconditioned system S
:; 0 has basically the following structure m b 0
n ; c A 0
n B S
n
kÿ ÿ k
n Id c 2 b 0 is small. The factor m=2
k ÿ kÿ
n which corresponds where c A 0 is a nilpotent matrix of order N ÿ 1 and B in the physical space to an unbounded operator may lead to an ill-conditioned linear system and then to a poor convergence of the GMRES algorithm, see Section 4. 2.3. The discrete case We suppose for simplicity that the computational domain is R2 discretized by a Cartesian grid. Let us problem. We denote A
Akl a discretization of the advection±diusion ij i;j;k;l2Z the matrix resulting from suppose that the stencil is a 9-point stencil Akl 0 for ji ÿ kj P 2 or jj ÿ lj P 2 . This is the case, for ij instance, for a Q1-SUPG method or for a classical ®nite dierence or ®nite volume scheme. We have to solve AU F , where U
uij i;j2Z is the vector of the unknowns. The computational domain is decom-
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
153
posed into two half planes x1 and x2 . We introduce a discretized form Sh of the operator S (we adopt the summation convention of Einstein over all repeated indices) Sh : RZ RZZ ! RZ 1 1l 1 1l 2 2l 2
u0j j2Z ;
Fij i;j2Z 7!
Aÿ1l 0j vÿ1l Bj v0l A0j v1l Bj v0l ÿ F0j j2Z ;
where
vmij satisfy m Akl ij vkl Fij
vm0j
u0j
for i < 0 if m 1
and
i > 0 if m 2; j 2 Z;
for j 2 Z:
0l 1l 2l are the contributions of the domain xm to A0l The coecients Bml j 0j ; A0j Bj Bj . For example, if 1l 2l 0l ~ a 0; Bj Bj A0j =2. For example, for a 1D case with a uniform grid and an upwind ®nite dierence scheme
a > 0 and c 0,
Aÿ1 0 ÿ
m a ÿ ; h2 h
A00
2m a ; h2 h
A10 ÿ
m h2
and B1
m a h2 h
and
B2
m : h2
The vector Sh
u0 ; F is the residual of the equation on the interface. It is clear that U0
u0j j2Z satis®es Sh
U0 ; 0 ÿSh
0; F :
20
We propose for an approximate inverse of Sh
:; 0; Th de®ned by Th : RZ ! RZ 1
gj j2Z 7!
v10j v20j j2Z ; 2 where
v1ij i 6 0;j2Z and
v2ij i P 0;j2Z satisfy 1 Akl ij vkl 0;
i < 0; j 2 Z;
ÿ1l 1 A0j vÿ1l
2 Akl ij vkl 0;
i > 0; j 2 Z;
2 A1l 0j v1l
A0l 0j 1 v gj 2 0l
21
and A0l 0j 2 v gj : 2 0l
22
Remark 9. For a Neumann±Neumann preconditioner, in place of (21) and (22), we would have 1 1l 1 1l 2 2l 2 Aÿ1l 0j vÿ1l Bj v0l gj and A0j v1l Bj v0l gj . Remark 10. For a constant coefficient operator L and a uniform grid, a discrete Fourier analysis can be performed similarly to that of Section 2. It can then be proved that Th Sh
:; 0 Idh : Remark 11. The last equations of (21) and (22) correspond to a discretization of the Robin boundary condition m
o=on ÿ ~ a ~ n=2. Considering the previous 1D example, we have A00 1 v1 ÿ v1ÿ1 a 1 v1 ÿ v1ÿ1 a 1 1 v ÿ vÿ1 m 0 ÿ
2vÿ1 ÿ v10 v
m ah=2 0 h Aÿ1 0 ÿ1 0 2 h h 2 2
154
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
and A00 2 v2 ÿ v21 a 2 2 v v v0 : h A1l m 0 j 1 0 2 2 h
23
Another discretization of m
o=on ÿ ~ a ~ n=2 would not give (23). This is the reason why the approximate inverse is directly defined at the algebraic level. The discretization of the Robin boundary condition is in some sense adaptive with respect to the discretization of the operator (SUPG, upwind finite difference scheme or finite volume scheme). In the previous 1D example, a straight forward discretization of the Robin boundary condition would give for the first domain m
v10 ÿ v1ÿ1 a 1 ÿ v0 : h 2
When ah m, which is usually the case, it is quite different from (21). In a ®nite element framework, the variational generalization of the next section gives automatically the good discretization of the Robin±Robin boundary condition.
3. Variational generalization 3.1. The continuous problem Let us now consider general partitions of our original domain X into non-overlapping subdomains Xi ; i 1; . . . ; N (Fig. 1), with interfaces Ci oXi n oX;
C [i Ci :
Fig. 1. Three-dimensional triangulation and automatic decomposition into 45 subdomains.
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
155
On X, we wish to solve the general advection±diusion problem ÿ div
m
xru ~ a
x ru c
xu f
on X;
24
u 0 on oXD ; ou m
x g on oXN : on
25
26
We will restrict ourselves to well-posed problems, where the velocity ®eld ~ a 2 W 1;1
X is of bounded divergence 1 a P a > 0 c ÿ div
~ 2 for some a 2 R and where the Neumann boundary conditions are only applied on a subset oXN of the domain boundary oX where we have out¯ow conditions (with ~ n denoting as usual the outwards pointing unit normal vector to oX) ~ a ~ nP0
8x 2 oXN :
The variational framework is better suited for extending the previous sub-structuring technique to such a general formulation and partition. The key point is then to identify the correct bilinear form describing the action of the above non-symmetric advection±diusion operator on any given subdomain Xi of X. A naive choice would be to use Z mru rv
~ a ruv cuv: a^i
u; v Xi
But in fact, this choice is rather inappropriate, because of its lack of positiveness. A better choice is to integrate by parts the advection term 1=2
~ a
x ruv and to introduce the local symmetrized form Z Z 1 1 1 1 ~ a ruv ÿ
~ a rvu
c ÿ div
~ auv a ~ nuv mru rv
~ ai
u; v 2 2 2 Xi oX\oXi 2 Z 1 ~ a ~ ni uv:
27 a^i
u; v ÿ Ci 2 With this choice, a direct multiplication and integration of (24) yields the variational problem N X
Find u 2 H
X :
fai
u; v ÿ Li
vg 0 8v 2 H
X;
28
i1
with the notation H
X v 2 H 1
X; v 0 on oXD ; Z Z fv gv: Li
v Xi
oXN \oXi
R a ~ ni uv added locally to each local form a^i cancel each other In this formulation, the interface terms ÿ Ci 1=2~ by summation. Their presence nevertheless guarantee that the local bilinear form ai is positive on the space of restrictions H
Xi fvi vjXi ;
v 2 H
Xg
since we have by construction Z Z 1 1 ~ a u2 a ~ nu2 mjruj2 c ÿ div
~ ai
u; u 2 Xi oXN \oXi 2
8u 2 H
Xi :
Moreover, one can easily observe that the natural interface boundary condition which appears when we integrate this bilinear form in (27) by parts is precisely the Robin condition
156
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
m
x
ou 1 ÿ ~ a ~ ni u . . . on Ci : on 2
used in Section 2. 3.2. Finite element discretization In view of its numerical solution, the above variational formulation (28) must ®rst be approximated by ®nite element methods, that is by replacing the space H
X by a ®nite element space Hh
X. For example, this space is constructed with second order isoparametric ®nite elements de®ned on regular ``triangulations'' of X (see Fig. 1). They are a good compromise between accuracy and cost-eciency. Other choices of ®nite element are of course possible, and some of the following numerical tests will use simpler triangular ®rst order elements. In any case, the triangulations to be introduced will respect the subdomain partitions used in domain decomposition: the interfaces Ci coincide with inter-elements boundaries, which mean that each subdomain is the union of a given subset of elements of the original triangulation. When used on advection dominated problems, these ®nite element techniques must be stabilized. In Galerkin least-squares techniques, this is done by adding element residuals Z se
hl ÿ div
mru ~ a ru cu ÿ f ÿ div
mrv ~ a rv cv Tl
to the original variational formulation with an adequate choice of the local positive stabilization parameters se
hl . The stabilized ®nite element formulation is then N X
Find uh 2 Hh
X :
faih
uh ; vh ÿ Lih
vh g 0
8vh 2 Hh
X
29
i1
with the notation aih
u; v ai
u; v
XZ Tl Xi
Li
v
XZ Tl Xi
Tl
Tl
se
hl
ÿdiv
mru ~ a ru cu ÿ div
mrv ~ a rv cv ; Lih
v
se
hl f
ÿdiv
mrv ~ a rv cv:
We observe that the variational structures of the original problem and of its ®nite element discretization are very similar. This will also be true for the numerical domain decomposition technique to be introduced next. Therefore, from now on, we will use the same notation for both the continuous and the stabilized discrete problem and omit the subscript h in all ®nite element formulations. The reader will just have to remember that when dealing with ®nite elements, the bilinear and linear forms ai and Li , and the space H
X should be replaced by their discrete counterparts aih ; Lih and Hh
X as de®ned in the present section. 3.3. Substructuring Due to their variational structure, problems (28) or (29) can be reduced to interface problems by standard substructuring techniques. For this purpose, we introduce the local spaces of restrictions H
Xi fvi v jXi ; v 2 H
Xg;
H
Xi fvi 2 H
X; vi 0 in X n Xi g; space of functions of H
Xi with zero continuous extension in X n Xi ; the global trace space V TrH
X jC , the local trace spaces vi Trvi jCi ; Vi f
vi 2 H
Xi g f vi Trv jCi ;
v 2 H
Xg;
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
157
the ai -harmonic extension Trÿ1 i : Vi ! H
Xi de®ned by i ; vi 0 ai
Trÿ1 i u
8vi 2 H
Xi ;
i jCi ui ; Tr
Trÿ1 i u
i 2 H
Xi Trÿ1 i u
30
given by and its adjoint Trÿ i i 0 ai
vi ; Trÿ i u
i jCi ui ; Trÿ i 2 H
Xi : 8vi 2 H
Xi ; Tr
Trÿ i u i u
31
By construction, the bilinear form ai is elliptic on H
Xi . Hence, the problems (30) or (31) are well-posed. We then can de®ne the local Schur complement operator Si : Vi ! V0i by i 8 i ; Trÿ ui ; vi 2 Vi : hSi ui ; vi i ai
Trÿ1 i u i v In matrix form, by decomposing the local degrees of freedom Ui of ui Ri u (with Ri
resp: Ri denoting the restriction operator from H
X into H
Xi (resp. from V into Vi )) into internal degrees of freedom Ui and interface degrees of freedom Ui , and the matrix Ai associated to the bilinear form ai into " # i Bi A Ai ~T ;
32 B Ai i
we simply have Trÿ1 i
ÿ1
ÿA i I
! Bi
;
ÿ1 Bi Ui : Si Ui
Ai ÿ B~Ti A i With this notation, we can decompose each restriction of the solution u and test function v into ÿ Ri u ui Trÿ1 i
Ri u and Ri v vi Tri
Ri v, then eliminate the local component ui because it is solution of the local well posed problem
ai
ui ; vi Li
vi
8vi 2 H
Xi ui 2 H
Xi :
33
Therefore, we can reduce the original problems (28) or (29) to the interface problem S u F
in V:
Here, we have introduced the global Schur complement operator S
N X i1
RTi Si Ri ;
and the interface right-hand side F de®ned by X Li
Trÿ hF ; vi i Ri v i
X i
Xn i
g fLi
vi ÿ Li
vi ÿ Trÿ i Ri v
Xn i
o i v Li
vi ÿ ai ui ; vi ÿ Trÿ construction ofu R i i o ÿ Li
vi ÿ ai ui ; vi construction of Trÿ ; i
with vi denoting any function of H
Xi such that vi v on Ci .
34
158
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
ÿ i used in the definition of Si can be replaced by any Remark 3.1. By construction of Trÿ1 i , the function Tri v function vi whose trace on Ci is equal to vi , yielding X ; vi 8 ai
Trÿ1 u; v 2 V: hS ui ; vi i Ri u i
3.4. Basic preconditioner The speci®c preconditioner detailed in this paper is a natural generalization of the so-called Neumann± Neumann algorithm proposed by Agoskhov P [2], Bourgat et al. [3], Morice [16], among others. The idea is to precondition the interface operator S i RTi Si Ri by a weighted sum of inverses [22]: X ÿ1 Di
Si DTi :
35 M ÿ1 T i
Above, Di are diagonal weighting matrices satisfying X Di Ri Id: i
For implementation reasons (¯exibility and parallelism), the map Di must be as local as possible. The generic choice consists in de®ning Di on each interface degree of freedom v
Pl by: q Di v
Pl i v
Pk q if the l degree of freedom of V corresponds to the k degree of freedom of Vi , Di v
Pl 0; if not. Here P qi is a local measure of the stiness of subdomain Xi (for example the local diagonal value of Ai ) and q Pl 2Xj qj is the sum of qj on all subdomains Xj containing Pl . Remark 3.2. For any Fi 2 V0i ;
Si ÿ1 Fi is equal to the trace on Ci of the solution wi of the local variational problem ai
wi ; vi hFi ; Tr
vi jCi i
8vi 2 H
Xi ; wi 2 H
Xi :
36
By construction of our bilinear form ai , this local problem is associated to the operator ÿdiv
mrw ~ a rw cw and to the Robin boundary condition on the interface m
ow 1 ÿ ~ a ~ ni w F i on 2
on Ci :
By construction we also have (omitting the positive stabilization term XZ se
hl
ÿdiv
mrw ~ a rw cw2 Tl Xi
Tl
at the finite element level), Z Z 1 1 2 ~ a w2 a ~ nw2 ; mjrwj c ÿ div
~ ai
w; w 2 Xi oXN \oXi 2 which means that the bilinear form ai is strictly elliptic on H
Xi . Therefore, the variational problem (36) is always well posed.
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
159
3.5. Coarse solver When used on elliptic symmetric operators, it is observed that the basic preconditioner introduced in Section 3.4 behaves poorly when the number of subdomains is large. This is due to the existence of constant like functions which scale badly in energy norm when expanded from a local subdomain to the full domain. The traditional remedy is to build a small coarse global space which includes all these constant like functions and to solve the interface problem by a direct solver on this coarse space, and by a Neumann± Neumann preconditioned iterative solver on its orthogonal. This strategy can also be used for advection problems, although the optimal coarse space is not quite known yet in such problems. For this purpose, we introduce a given coarse space VG . Its construction will be done later. Its S orthogonal space Vf is then de®ned by Vf fvf 2 V; hSvf ; v^G i 0
8^ vG 2 V G g
37
and the action of the projector I ÿ PG form V into Vf parallel to VG on any given function v 2 V amounts to the solution of the global coarse variational problem v; v^G i hSvG ; v^G i hS v v ÿ vG :
I ÿ PG
8^ vG 2 V G ; vG 2 V G ;
38
Again, the problem (38) is well-posed, since by construction the operator S is elliptic on VG . Indeed, computing S as proposed in Remark 3.1 with vi Trÿ1 i Ri vG , and omitting for simplicity the positive stabilization terms, we have hSvG ; vG i
XZ i
Xi
2 mjrTrÿ1 i Ri v G j
Z 1 1 2 i vG 2 ~ a
Trÿ1 R a ~ n
Trÿ1 c ÿ div
~ i i Ri v G : 2 oXN \oXi 2
Because of the presence of a coarse space, the solution u of the interface problem can now be uniquely decomposed in u uG uf ;
uG 2 V G ;
uf 2 V f ;
the coarse component uG being solution of the global coarse problem v G 2 V G ; uG 2 V G ; hSuG ; v^G i hF ; v^G i 8^
39
to be solved by a direct solver, and the orthogonal component uf being the solution of the remaining problem hSuf ; v^f i hF ; v^f i ÿ hSuG ; v^f i
8^ v f 2 V f ; uf 2 V f ;
40
to be obtained by a GMRES type iterative algorithm with updated preconditioner T
ÿ1
T
M ÿ1
I ÿ PG T
I ÿ PG
I ÿ PG
Ri Di
Si DTi
I ÿ PG :
41
We can observe that the basic preconditioner is now only used to obtain the component of u orthogonal to the coarse space, and therefore avoids as much as possible the bad elements of the coarse space. Remark 3.3. The action of
I ÿ PG T is often very easy to compute. Indeed, for any vf in Vf , we have T
vi hSvf ; v^ ÿ v^G i hSvf ; v^i 8^ v 2 V: h
I ÿ PG Svf ; v^i hSvf ;
I ÿ PG ^ Similarly, by construction of uG , we also have
I ÿ PG T
F ÿ SuG F ÿ SuG :
160
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
3.6. Detailed algorithm 3.6.1. Preparation step The algorithm involves at each step the solution of Dirichlet problems (30) and Robin problems (36). For each subdomain, these local problems will be solved by a direct method. As the corresponding matrices do not change from one iteration to another, it is possible to compute and factorize them once for ever. Then the solution of the local linear systems will be a simple forward±backward substitution. Similarly, the matrix of the projector (38) can be computed and factorized once for all. The preparation step therefore executes the following tasks: i ) of the Robin problems (36) 1. For each subdomain compute and factorize the matrices Ai (respectively A (respectively the Dirichlet problems (30)). 2. Compute the interface weighting coecients Di . 3. For the coarse problem, for each subdomain (the loop on subdomains has to be replaced by a loop on cross-points when one uses the BPS piecewise linear coarse space): · Compute the local functions zia to be used to construct the basis
Djzja j; a of the coarse space VG . zja by solving the local · Compute the vectors S T Djzja . By construction, this means computing Trÿ i Dj Dirichlet problems
zja ; vi 0 8vi 2 H
Xi ; aTi
Trÿ i Dj ÿ zja 2 H
Xi ; Tr
Tri Djzja jCi Djzja ; Trÿ i Dj i T (Using there the available factorization of A i and multiple right-hand sides Djzj , with matrix
A a and setting X ati
Trÿ zja ; vi hS T Djzja ; vi i Dj i
where vi is any consistent extension of v. If NEIGH is the number of neighboring subdomains of domain i, if Na is the number of local coarse functions, the number of right-hand sides to solve for each subdomain i is equal to Na (NEIGH + 1). · Compute and factor the coarse matrix Cjb;ia hDizia ; S T Djzjb i
Na (NEIGH + 1) dot products). i 4. Compute the local components ui of the solution by solving the Dirichlet problem(with matrix A
ai
ui ; vi Li
vi
8vi 2 H
Xi ui 2 H
Xi :
5. Compute the coarse component uG of the solution by solving the coarse problem (with matrix C) X fLi
vi ÿ ai
ui ; vi g 8j 8b; huG ; S T Djzjb i i
where vi is any consistent extension of Djzjb , that is any function of Vi such that vi Djzjb on Ci . Most of these computations are obviously independent on each subdomain and thus can be run in parallel. 3.6.2. GMRES initialization step 0 0 (i) Compute the local extension Trÿ1 i
uG uf
uf 2 Vf is the initial value of uf and is usually taken to be 0 i , zero right-hand side, and interface data uf 0 by solving the Dirichlet problem (30) with matrix A 0 uG uf . This step is not necessary in the symmetric case because when Ai is symmetric, we have already zja Trÿ zja . computed Trÿ1 i Dj i Dj (ii) Compute the initial residual X 0 fLi
vi ÿ ai
ui Trÿ1 hRes0 ; vi i
uG uf ; vi g: i
P (iii) Precondition this residual by ®rst computing w0 i Di
Si ÿ1 DTi Res0 by solving the local Robin problems (with matrix Ai )
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
161
ai
wi ; vi hRes0 ; Di Tr
vi jCi i 8vi 2 H
Xi ; wi 2 H
Xi ; P and by setting w0 i Di Tr
wi . (iv) Compute ®nally the preconditional residual
I ÿ PG w0 w0 ÿ w0G by solving the coarse problem (with matrix C) hw0G ; S T Djzjb i hw0 ; S T Djzjb i 8j
8b:
Most of these computations are obviously independent on each subdomain and thus can be run in parallel. 3.6.3. Solution step The practical implementation of a preconditioned GMRES type iterative loop for computing uf requires then the calculation of products X
I ÿ PG
! Di
Si
ÿ1
i
DTi
T
I ÿ PG Svf
for a series of function vf 2 Vf . Computing such products amounts to the following steps: (i) Compute the solution Trÿ1 v of the Dirichlet problem (30) with matrix A and interface data vf . i f i (ii) Compute dF
I ÿ PG T Svf Svf by hdF ; vi
X i
^i ; ai
Trÿ1 i vf ; v
extension of v inside Xi . with v^i denoting any Pconsistent ÿ1 T (iii) Compute w i Di
Si Di dF by solving the local Robin problems (with matrix Ai ) ai
wi ; vi hdF ; Di Tr
vi jCi i
8vi 2 H
Xi ; wi 2 H
Xi ;
P and by setting w i Di Tr
wi . (iv) Compute ®nally
I ÿ PG w w ÿ wG by solving the coarse problem (with matrix C) S T Djzj i 8j hwG ; S T Djzjb i hw; b
8b:
Again, the most memory and time consuming steps in this algorithm are independent on each subdomain and thus can be run in parallel. 3.7. Construction of the coarse space There are two main strategies for constructing coarse spaces. A multigrid based strategy will take as coarse space the trace of low order ®nite element functions de®ned on a much coarser triangulation, namely the triangulation constructed by assimilating every subdomain to a coarse ®nite element. This BPS technique (named after the celebrated paper where it was ®rst analysed) is rather ecient for 2D problems, and functions of the corresponding coarse space are linear on each edge. A subdomain based strategy will try to identify the possible local pathologies of the Robin preconditioner. We want in particular to deal with positive de®nite preconditioned operators, that is we would like to have hSvf ; M ÿ1 Svf i P Ckvf k2
8vf 2 Vf ;
162
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
for a proper choice of norm k k2 . But a direct calculation yields T
hSvf ; M ÿ1 Svf i hSvf ;
I ÿ PG T
I ÿ PG Svf i T
T
h
I ÿ PG Svf ; T
I ÿ PG Svf i hSvf ; TSvf i
construction of Vf X ÿ1 hSvf ; Di
Si DTi Svf i
construction of T i
X i
hSvf ; Di TrwijCi i
construction of Siÿ1
X
ai
wi ; wi
construction of wi ;
i
where wi denotes the solution of the local Robin problem (36) solved in the preconditioning step. By construction of our bilinear form, we also have (omitting the stabilization terms) Z Z 1 1 2 ~ a w2i a ~ nw2i : mjrwi j c ÿ div
~ ai
wi ; wi 2 Xi oXN \oXi 2 RThe situation is therefore bad on internal subdomains (where oX \ oXi ;), when wi is a constant, and
c ÿ
1=2div
~ a is small. To avoid such bad situations, we must forbid nonzero constant solutions in Xi the local Robin problems, which we achieve if we impose that all solutions of the local Robin problems be of zero average. To do this, we take as coarse space the space ) ( X i 0 i c Di Tr
vi jCi ;
c i arbitrary
42 V G vG ; vG i
generated by the weighted traces Di Tr
v0i jCi of the solutions v0i of the local adjoint Dirichlet problems Z v^i 8^ ati
v0i ; v^i vi 2 H
Xi ; v0i 2 H
Xi : Xi
Then, by construction of the orthogonal space Vf , we have hSvf ; v^G i 0
8^ vG 2 VG
8vf 2 Vf :
By taking v^G Di Tr
v0i jCi , this implies in particular that the local solution wi of the local Robin problems satisfy Z wi ati
v0i ; wi
construction of v0i ai
wi ; v0i hSvf ; Di Tr
v0i jCi i
construction of wi Xi
0
definition of Vf ; which is the desired eect. Remark 3.4. In the symmetric case, (~ a 0, c positive arbitrary), the local coarse functions v0i are simply the 0 constant vi 1=c.
4. Numerical results in two-dimensions The advection±diusion is discretized on a Cartesian grid by a Q1-streamline-diusion method ([14]). The interface problem (20) is solved by a preconditioned GMRES algorithm and the stopping criterion is to reduce the initial residual by a factor 10ÿ10 . The preconditioners are either of the type Robin±Robin (R±R), Neumann±Neumann (N±N) or the identity (±).
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
163
4.1. A ®rst comparison between the preconditioners The ®rst series of tests will be to compare the performances of the Robin±Robin (R±R), Neumann± Neumann (N±N) preconditioners, and the non-preconditioned method, in some typical situations. No coarse space solver is used here. In these tests, the domain is the rectangle 0; 1 0; 0:2 partitioned into ®ve square vertical strips of sizes 0:2 0:2. In each subdomain, there is a uniform grid of 60 60 quadrangular elements. We choose c 1 and m 0:001 or m 1, and four velocities: (i) ~ a ~ e1 . In this case, the velocity is perpendicular to the interfaces between the subdomains. (ii) ~ a ~ ep 2 .In this case, the velocity is parallel to the interfaces. e2 . We refer to this convecting ®eld as oblique. (iii) ~ a 2=2
~ e1 ~ e2 ÿ
x2 ÿ 0:1~ e1 . Here, the convecting ®eld can be seen as a vortex. (iv) ~ a 2p
x1 ÿ 0:5~ Based on these results, several remarks can be made: · The Robin±Robin preconditioner performs much better than the Neumann±Neumann preconditioner when the viscosity is small while the performances are equivalent for large viscosity. · For small viscosities, when the velocity is not parallel to the interface, the Neumann±Neumann preconditioner gives very poor results (poorer than when no preconditioner is used). On the contrary, when the velocity is parallel to the interfaces, both the Neumann±Neumann and the Robin±Robin preconditioners work very well (2 iterations), (note that they are equivalent in this case). These results are in complete agreement with the Fourier analysis above. · When the velocity is normal or oblique, the Robin±Robin preconditioner has nilpotency properties which will be better illustrated in the next series of tests: this explains why the number of iterations is only 3 and 5 for these cases. · Thus, the Robin±Robin adapts smoothly to the dierent situations presented in Table 1. · In our implementation, one iteration of the Robin±Robin or Neumann±Neumann method costs twice as much as when no preconditioner is used. Even though, the Robin±Robin preconditioner remains always cheaper. 4.2. In¯uence of the number of subdomains: case of strips The purpose of this series of tests is to assess the nilpotency properties of the algorithm. To illustrate the theory above, the domain is partitioned into vertical strips, the velocity is uniform and normal to the interfaces. The dependence of the number of iterations with respect to the number of strips will be tested. Here, the domain is X
0; Nsd =4
0; 1, where Nsd . It is partitioned into Nsd square subdomains, which are rectangles of sizes 0:25 1. In each subdomain, there is a uniform grid of size 20 40. !; c 1, and the viscosity is m 0:001. The velocity is ~ a 3e 1 Here again, the three methods are tested (see Table 2). In Fig. 2, the convergence of the GMRES algorithms with and without the Robin±Robin preconditioner are plotted. It is very clear on the Fig. 2 that for the Robin±Robin preconditioner, the residual does not change signi®cantly for Nsd =2 18 iterations, then drops quickly. For GMRES without preconditioner, the same kind of phenomenon is observed, but the decay of the residual is obtained only after Nsd 36 iterations. This con®rms very well the theory above on the idempotency of the preconditioned or not Schur Table 1 Number of iterations for dierent velocity ®elds and ®ve subdomains Viscosity
Precond.nvelocity
Normal
Parallel
Oblique
Rotating
m 0:001
R±R N±N ±
3 52 14
2 2 34
5 42 13
36 >100 71
m1
R±R N±N ±
9 9 30
9 9 38
10 10 41
10 11 41
164
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
Table 2 In¯uence of the number of subdomains Partition Grid c 1, ~ a 3~ e1 , m 0:001
R±R ± N±N
41 20 40
81 20 40
12 1 20 40
24 1 20 40
36 1 20 40
5 13 43
8 18 >100
12 22 >100
23 33 >100
30 44 >100
Fig. 2. The convergence with and without preconditioner: 36 subdomains
complement matrix. Note that, since the number of grid nodes is not large enough, the rates of convergence after the idempotency threshold are of the same order. This would not be the case for a ®ner grid, as we can see on Fig. 3 and in Section 4.3. On Fig. 3, the convergence plots correspond to the same experiment except that the grid has been re®ned with a geometrical progression of ratio 0.9 in the x2 direction. Therefore, the grid is very ®ne near the boundary x2 0. We see on Fig. 3 that the idempotency properties of the preconditioned operator are conserved, although they appear less clearly. On the contrary, the non-preconditioned operator yields a very poor convergence. 4.3. In¯uence of the grid anisotropy and boundary layers We carry out exactly the same test as in 4.2, but the velocity has a boundary layer: 2 ~ e1 a 3 ÿ
300
x2 ÿ 0:1 ~ ~ a 3~ e1
if x2 < 0:1 if x2 P 0:1
43
To capture the boundary layer, the mesh is re®ned in the x2 -direction, near the boundary x2 0 with a geometric progression of ratio 0.9 (see Table 3). In comparison with the tests of Section 4.2, we see that the performances deteriorate due to the change of velocity and grid, but is very clear that the Robin±Robin method is the less severely aected. The nilpotency threshold is also observed here. Table 4 completes these results. Here, the subdomains are squares of sizes 0:25 0:25 and the velocity can either be uniform ~ a 3~ e1 or have a boundary layer, see (43). The meshes can either be uniform, or be progressively re®ned with a geometrical progression of ratio 0.9 in the x2 direction. Here again, we see that preconditioning becomes particularly useful when the grid is dramatically re®ned, as in the case of a boundary layer for instance.
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
165
Fig. 3. The convergence with and without preconditioner: 36 subdomains Table 3 Anisotropic grids Partition Grid c1 m 0:001
R±R ± N±N
41 20 40
81 20 40
12 1 20 40
24 1 20 40
36 1 20 40
11 51 49
18 75 > 100
25 91 > 100
39 > 100 > 100
51 > 100 > 100
Table 4 In¯uence of anisotropic mesh re®nement and boundary layers Velocity
Re®nement
Uniform
Geometrical Progression Uniform
B. Layer
Geometrical Progression Uniform
Partition
31 40 40
31 60 60
31 60 60
R±R ± R±R ±
7 37 5 12
7 48 5 10
8 69 6 12
R±R ± R±R ±
9 40 8 18
8 57 7 20
10 73 10 24
4.4. In¯uence of the number of unknowns Here, we test the in¯uence of the number of grid points. The domain is the unit square, which is partitioned into 4 4 subdomains. In each subdomain, the grid varies from 20 20 up to 60 60. The velocity x ÿ~ x0 ,
x0 is the center of X the viscosity is m 0:001 and the constant c equals to 10ÿ7 . Thus, is ~ a ~ e3
~ this case is almost a steady state computation. A coarse space solver of BPS type is used here, but it will be discussed later. Note that this test is reputed dif®cult, since although convection effects dominate, diffusion is the only effect that carries information from the boundary of the domain to its center. The convergence is not affected by the grid size (see Table 5).
166
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
Table 5 In¯uence of the number of grid points
c 10ÿ7 , m 0:001, ~ a ~ e3
~ x ÿ~ x0
Partition Grid
44 20 20
44 30 30
44 40 40
44 60 60
R±R ±
34 80
34 82
34 89
34 ±
Table 6 Coarse Space Grid
Coarse sp.
22 30 30
33 30 30
55 30 30
77 30 30
10 10 30 30
c 10ÿ4 ~ x ÿ~ x0 a ~ e3
~ c 10ÿ4 ~ a 3~ e1
Yes No Yes No
20 22 7 7
26 31 13 12
36 51 18 15
42 68 22 18
48 91 25 23
4.5. Coarse space Here, we investigate the possibility of adding a coarse space solver to our preconditioner: we have chosen the space of functions which are the harmonic liftings of piecewise linear functions on the skeleton of the domain decomposition. This coarse space is similar to that used by Bramble, Paschiak and Schatz. In addition, for the Robin problems, we ®x one internal value to 0 in each subdomain which does not touch the physical boundary. The domain is the unit square, the viscosity is m 0:01 and the parameter c equals a ~ e3
~ x ÿ~ x0 (vortex) or ~ a 3~ e1 . to 10ÿ4 (the problem is almost stationary). The velocity can either be ~ The number of subdomains varies from 2 2 up to 10 10, and the grid in each subdomain is 30 30. Note that the coarse space solver is really useful in the case of the rotating velocity: the convergence still varies when the diameter H of the subdomains decreases but the number of necessary iteration seems to grow sub-linearly with 1=H , while the growth is linear when no coarse space is used. For the uniform velocity, the coarse space solver does not seem useful. Remark that the viscous effects are more important for the rotating velocity. Further investigations are clearly needed at this level (see Table 6). 5. Preliminary results in three-dimensions The three-dimensional cases described here deal with domains of arbitrary shapes, decomposed into hexaedra or tetrahedra. The problem (24) is discretized by the stabilized Galerkin Least-Squares technique described in Section 3.2 using second order elements. As in the two-dimensional case, the interface problem is solved by a GMRES algorithm using the Robin±Robin preconditioner. The algorithm stops when the residual is less than 10ÿ6 . 5.1. A model problem The ®rst experiments deal with a regular discretization of the unit cube 0; 1 0; 1 0; 1 on which a problem with a known analytical solution
u x2 y 2 z2 is solved. We chose c 0:1 and m 1. and a uniform oblique velocity ~ a
1; 7; 9. The total number of ®nite elements (NE), the corresponding number degrees of freedom (NDOF) and the number of iterations are given in Table 7. For the decomposition in nine subdomains we have also considered two types of boundary conditions: · Dirichlet condition on all the boundaries (18 iterations); · Dirichlet condition on ®ve faces of the cube and Robin boundary condition on the bottom face (40 iterations).
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
167
Table 7 A model three-dimensional problem Partition NE NDOF R±R
211 216 1232 3
221 256 1409 10
331 576 3021 11
333 1728 8294 18
No coarse space is used here. The results are sensitive to the number of subdomains, but the number of iterations remains very reasonable in all cases.
5.2. In¯uence of the velocity ®eld We next study the in¯uence of the velocity ®eld on the behavior of the preconditioner for an advection dominated case. For this we consider the following three decompositions (note that the domain is chosen such as each subdomain has a nice aspect ratio, and that each subdomain is made of 64 ®nite elements): 2 Case 1: The cube 0; 1 0; 0:5 split in 2 2 1 subdomains Case 2: The cube 0; 12 0; 0:33 split in 3 3 1 subdomains Case 3: The cube 0; 13 split in 3 3 3 subdomains The results are summarized in Table 8, where ~ v1 is de®ned by: ~ e1 v1 300 x22~ ~ e1 v1 3~
if x2 < 0:1 if x2 P 0:1
44
and ~ v2 denotes the velocity ®eld given by equation (43). These results show that the Robin±Robin preconditioner is far less sensitive than the original Neumann± Neumann one to the choice of velocity ®elds. This was expected since the original Neumann±Neumann preconditioner ignores all ~ a ~ n boundary coecients. Finally we solved the same problem, with ~ a
y=2 ÿ 0:5; ÿx=2 0:5; 0 on the unstructured decomposition of Fig. 1. Here the unit cube contains 24 576 tetrahedric second order ®nite elements and is split into 45 subdomains by an automatic mesh partitioner. This is why the boundaries between subdomains are less regular than for the other computations. For this decomposition the algorithm converges in 48 iterations with the R±R preconditioner and in 45 iterations with the addition of the coarse space.
Table 8 In¯uence of the velocity ®eld Case 1
Case 2
12 24
21 51
35 165
R±R N±N
9 23
13 65
19 92
~ a ~ v2 ; m 10ÿ2 ; c 10ÿ4 R±R N±N
9 37
16 88
25 >200
~ a
y=2 ÿ 0:5; ÿx=2 0:5; 0; m 10ÿ3 ; c 10ÿ4 R±R N±N
Case 3
~ a ~ v1 ; m 10ÿ2 ; c 10ÿ4
168
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
Fig. 4. The decomposition of the tube into subdomains
5.3. In¯uence of the coarse space In this section we consider a long tube (cf. Fig. 4) ÿ2; 22 0; 2 decomposed into 32 subdomains
2 2 8. Each subdomain contains 64 second order hexaedra (425 nodes). The coarse space is the domain based space described in Section 3. We solve the following problem: ÿ div
m
xru ~ a
x ru c
xu 0 u 2 x y on oXB ;
on X;
u 0 on oXT ; ou m
x 0 on oXN ; on
45
46
47
48
where XB denotes the bottom of the beam, XT the top and XN the lateral boundary of the beam. We consider two velocity ®elds: ~ e1 ÿ 0:1 x
y 2 ÿ 4~ e2 4~ e3 ; a1 0:1 y
x2 ÿ 4~ ~ e1 ÿ 0:1 x
y 2 ÿ 4~ e2
4 0:1 z
y=10 x=10 ÿ 0:5~ e3 : a2 0:1 y
x2 ÿ 4~ The case is dicult because of the aspect ratio of the domain, and because of our choice of coecients imposing very little diusion and almost no stabilization c. The results, obtained for different values of m and c are summarized in Table 9. The coarse space slightly improves the eciency of the preconditioner, but this is not really spectacular. The next test is a three-dimensional simulation of the two-dimensional problem solved in the previous section, with the rotating velocities. Starting from a uniform discretization of the unit square in 4 4 subdomains we obtain a three-dimensional mesh by transforming each quadrangle into a hexaedron. In order to obtain a good aspect ratio, the third-dimension is equal to the discretization step. Dirichlet boundary conditions are considered on the lateral boundary. The top and bottom faces are subjected to homogeneous Neumann boundary conditions. The results obtained are described in Table 10 for dierent discretization steps.
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
169
Table 9 In¯uence of the coarse space Velocity
m
c
R±R
R±R + coarse
~ a1 ~ a2 ~ a2
10ÿ2 10ÿ2 10ÿ3
10ÿ3 10ÿ3 10ÿ7
30 36 57
28 31 51
Table 10 Three-dimensional simulation of a two-dimensional problem Grid
m
c
R±R
R±R + coarse
44 20 20 20 20
10ÿ3 10ÿ3 10ÿ3
10ÿ7 10ÿ7 10ÿ4
19 20 20
18 18 18
The results are very similar to the bidimensional case and completely insensitive to the chosen grid. Moreover, for such a small number of subdomains, the proposed preconditioner is already very ecient without coarse space.
6. Conclusion The proposed preconditioner is a generalization of the Neumann±Neumann algorithm. Its main advantages are · Its robustness, assessed ®rst theoretically by a Fourier analysis in the special case of domain decomposition into strips, then by various numerical tests in two and three-dimensions. In the tests above, the Robin±Robin preconditioner has proved fairly insensitive to the choice of the grid, the velocity, the viscosity and the constant c. It is sensitive to the sizes of the subdomains, but this seems unavoidable when convections dominates. Another nice feature is that the Robin±Robin preconditioner transforms continuously to the Neumann±Neumann preconditioner as the convection effects vanish. · It has the same algebraic structure as the Neumann±Neumann preconditioner. This means that when a software contains the Neumann±Neumann preconditioner, constructing the Robin±Robin preconditioner requires almost no modi®cations. Therefore, it has been easily implemented in the most general case i.e. a three-dimensional case with unstructured and automatically partitioned mesh. · A theoretical framework for adding a coarse space solver is now available. However, our knowledge on this preconditioner is not complete. In particular, the analysis of convergence in the most general case seems dicult, since energy considerations are clearly not sucient. Further works need to be done, in particular, · the algorithm will be tested on less academic situations. · The coarse space to be used will be speci®ed better and validated. References [1] Y. Achdou, F. Nataf, A Robin±Robin preconditioner for an advection±diusion problem, C.R. Acad. Sciences Paris, series I 325 (1997) 1211±1216. [2] V. Aghoskov, Poincare±Steklov's operators and domain decomposition methods in ®nite dimensional spaces, in: Proceedings of DDM 1. [3] J.F. Bourgat, R. Glowinski, P. Le Tallec, M. Vidrascu, Variational formulation and algorithm for trace operator in domain decomposition calculations, in: Proceedings of DDM 2, AMS, 1988, pp. 3±16. [4] J. Bramble, J. Pasciak, A. Schatz, The construction of preconditioners for elliptic problems by substructuring, Math. Comp. 47 (1986) 103±134.
170
Y. Achdou et al. / Comput. Methods Appl. Mech. Engrg. 184 (2000) 145±170
[5] Xiao-Chuan Cai, Some Domain Decomposition Algorithms for Nonselfadjoint Elliptic and Parabolic Partial Dierential Equations. PhD thesis, Courant Institute of Mathematical Sciences, September 1989, Tech. Rep. 461, Department of Computer Science, Courant Institute. [6] Xiao-Chuan Cai, An additive Schwarz algorithm for nonselfadjoint elliptic equations, in: T. Chan, R. Glowinski, J. Periaux, O. Widlund (Eds.), Third International Symposium on Domain Decomposition Methods for Partial Dierential Equations, SIAM, Philadelphia, PA, 1990, pp. 232±244. [7] Xiao-Chuan Cai, O. Widlund, Domain decomposition algorithms for inde®nite elliptic problems, SIAM J. Sci. Statist. Comput. 13 (1) (1992) 243±258. [8] T. Chan, T. Mathew, Domain decomposition algorithms, in: Acta Numerica, Cambridge University Press, Cambridge, 1994, pp. 13±51. [9] L. Cowsar, J. Mandel, M. Wheeler, Balancing domain decomposition for mixed ®nite elements, Technical Report 93±08, Department of Mathematical Sciences, Rice University, 1993. [10] B. Despres, Domain decomposition method and the Helmholtz problem ii, in: K. Ralph et al. (Eds.), Mathematical and Numerical Aspects of Wave Propagation, SIAM, Philadelphia 1993, pp. 197±206. [11] M. Dryja, O. Widlund, Additive Schwarz methods for elliptic ®nite element problems in three-dimensions, in: Proceedings of DDM 5. [12] F. Gastaldi, L. Gastaldi, A. Quarteroni, Adaptative domain decomposition methods for advection dominated equations, East± West J. Numer. Math. 4 (1996) 165±206. [13] C. Japhet, Optimized Krylov±Ventcell method application to convection±diusion problems. in: Ninth International Symposium on Domain Decomposition Methods for Partial Dierential Equations, Wiley, New York, 1996. [14] C. Johnson, U. Navert, J. Pitkaranta, Finite element method for linear hyperbolic problems, Comp. Meth. Appl. Engrg. 45 (1984) 285±312. [15] J. Mandel, Two-level domain decomposition preconditioning for the p-version ®nite element method in three-dimensions, Int. J. Num. Meth. Eng. 29 (1990) 1095±1108. [16] P. Morice, Transonic computations by a multidomain technique with potential and Euler solvers, in: J. Zienep, H. Oertel (Eds.), Symposium Transsonicum III, IUTAM, G ottingen, 24±27 May 1988, 1989, Springer, Berlin. [17] F. Nataf, F. Nier, Convergence rate of Schwarz type methods for an arbitrary number of subdomains, in: Ninth International Symposium on Domain Decomposition Methods for Partial Dierential Equations, Wiley, New York, 1996. [18] F. Nataf, F. Nier, Convergence rate of some domain decomposition methods for overlapping and non-overlapping subdomains, Numerische Mathematik 75 (1997) 357±377. [19] F. Nataf, F. Rogier, Factorization of the convection-diusion operator and the Schwarz algorithm, Math. Models Appl. Sci. 5 (1995) 67±93. [20] Y. Saad, H. Schultz, Gmres: a generalized minimal residual algorithm for solving non-symmetric linear systems, SIAM J. Sci. Stat. Comput 7 (1986) 856±869. [21] B. Smith, An optimal domain decomposition preconditioner for the ®nite element solution of linear elasticity problems, SIAM J. Sci Stat. Comp. 13 (1992) 364±378. [22] P. Le Tallec, Domain decomposition methods in computational mechanics, Comput. Mech. Advances 1 (2) (1994) 121±220. [23] P. Le Tallec, Y.H. De Roeck, M. Vidrascu, Domain decomposition methods for large linearly elliptic three-dimensional problems, J. Comp. Appl. Math. 34 (1991) 93±117. [24] O. Widlund, Iterative substructuring methods: algorithms and theory for elliptic problems in the plane, in: G. Golub, R. Glowinski, J. Periaux (Eds.), Proceedings of the First International Symposium on Domain Decomposition Methods for Partial Dierential Equations, SIAM, Philadelphia, 1988.