Parallel Computing 20 (1994) 1055-1064

A parallel additive preconditioner for the conjugate gradient method for AX + XB = C

D.J. Evans a,*, E. Galligani b

a Parallel Algorithms Research Centre, University of Technology, Loughborough, Leics LE11 3TU, UK
b Dept. of Mathematics, University of Modena, Modena, Italy

Received 30 July 1993; revised 24 September 1993
Abstract

This paper is concerned with the solution of linear systems of equations of the form AX + XB = C by the preconditioned conjugate gradient (PCG) method. Such systems arise from finite difference discretization of separable elliptic boundary value problems on rectangular domains. A parallel additive preconditioner for the conjugate gradient (CG) method for solving such systems is presented. Some numerical experiments on the Poisson model problem are carried out on the Sequent Balance 8000 to evaluate the effectiveness of the additive preconditioner with respect to block diagonal preconditioners.

Keywords: CG method; Sylvester form; Poisson model problem; Additive block preconditioner
1. The PCG method for AX + XB = C

We consider the preconditioned conjugate gradient method for solving problems arising from separable, second order, linear elliptic partial differential equations. If an elliptic problem is separable, then the linear system corresponding to the discrete problem can be expressed in the well-known Sylvester form, which means that it can be reduced to a form corresponding to a simpler, lower dimensional problem.
* Corresponding author.

0167-8191/94/$07.00 © 1994 Elsevier Science B.V. All rights reserved
SSDI 0167-8191(94)00013-Z
Let A and B be matrices of order m and n respectively; the Sylvester matrix equation is

    AX + XB = C.   (1)

The system (1) can be written in the larger form

    \hat{A}\underline{x} = \underline{c},   (2)

where

    \underline{x} = (X_{11}, \ldots, X_{1n}, X_{21}, \ldots, X_{2n}, \ldots, X_{m1}, \ldots, X_{mn})^T,
    \underline{c} = (C_{11}, \ldots, C_{1n}, C_{21}, \ldots, C_{2n}, \ldots, C_{m1}, \ldots, C_{mn})^T,

and \hat{A} \in \mathbb{R}^{mn \times mn} can be expressed in terms of Kronecker products of A and B:

    \hat{A} = A \otimes I_n + I_m \otimes B^T =
    \begin{pmatrix}
    a_{11}I_n + B^T & a_{12}I_n        & \cdots & a_{1m}I_n \\
    a_{21}I_n       & a_{22}I_n + B^T  & \cdots & a_{2m}I_n \\
    \vdots          &                  & \ddots & \vdots \\
    a_{m1}I_n       & a_{m2}I_n        & \cdots & a_{mm}I_n + B^T
    \end{pmatrix}.
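As a concreteness check, the Kronecker identity behind (2) can be verified numerically. The sketch below (ours, not from the paper) builds \hat{A} for small random A and B and compares \hat{A}\underline{x} with the row-major stacking of AX + XB:

```python
import numpy as np

# Sketch: verify that the row-major stacking x = (X11,...,X1n, X21,...)
# turns AX + XB = C into (A kron I_n + I_m kron B^T) x = c, as in (1)-(2).
rng = np.random.default_rng(0)
m, n = 4, 3
A = rng.standard_normal((m, m))
B = rng.standard_normal((n, n))
X = rng.standard_normal((m, n))

A_hat = np.kron(A, np.eye(n)) + np.kron(np.eye(m), B.T)

lhs = A_hat @ X.ravel()          # row-major vec of X
rhs = (A @ X + X @ B).ravel()    # row-major vec of AX + XB
assert np.allclose(lhs, rhs)
```

Note that the row-major stacking pairs A with the *left* Kronecker factor and B^T with the right one; with column-major stacking the roles would be reversed.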
The Kronecker product of two matrices U \in \mathbb{R}^{m \times n} and V \in \mathbb{R}^{p \times q} is the block matrix whose (i, j)th block is u_{ij}V. When \hat{A} is a symmetric positive definite and large sparse matrix, the preconditioned conjugate gradient method is an effective means for solving (2) [1]. Since \hat{A} has a special structure we can formulate the PCG method for the compact form (1). Given an initial guess X = 0 and a nonsingular matrix M of order mn, we generate a sequence of matrices {X} approximating the solution X* of (1) as follows:

    R = C
    solve M\underline{d} = \underline{r}
    P = D
    \rho_0 = \|R\|_F^2 = \sum_{i=1}^{m}\sum_{j=1}^{n} R_{ij}^2
    \tau_0 = \sum_{i=1}^{m}\sum_{j=1}^{n} D_{ij}R_{ij}
    \rho_1 = 0
    for k = 1, \ldots, k_{MAX} do
        W = AP + PB
        \rho_1 = \sum_{i=1}^{m}\sum_{j=1}^{n} P_{ij}W_{ij}
        \alpha = \tau_0/\rho_1
        \rho_1 = \tau_0
        X = X + \alpha P
        R = R - \alpha W
        solve M\underline{d} = \underline{r}
        \rho_0 = \|R\|_F^2
        \tau_0 = \sum_{i=1}^{m}\sum_{j=1}^{n} D_{ij}R_{ij}
        \beta = \tau_0/\rho_1
        P = D + \beta P
        if \rho_0 < \varepsilon \|C\|_F stop

where \underline{d} = (d_{11}, \ldots, d_{1n}, d_{21}, \ldots, d_{2n}, \ldots, d_{m1}, \ldots, d_{mn})^T and \underline{r} = (r_{11}, \ldots, r_{1n}, r_{21}, \ldots, r_{2n}, \ldots, r_{m1}, \ldots, r_{mn})^T.
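The loop above translates almost line for line into code. The following numpy sketch (ours; the paper's implementation was parallel Fortran) implements the compact-form PCG, with the preconditioner solve passed in as a callable; the names pcg_sylvester and apply_Minv are our own:

```python
import numpy as np

def pcg_sylvester(A, B, C, apply_Minv, eps=1e-4, k_max=500):
    """Sketch of the compact-form PCG iteration for AX + XB = C.

    apply_Minv(R) returns the matrix D solving M d = r;
    apply_Minv = lambda R: R gives the plain CG method (M = I_mn)."""
    X = np.zeros_like(C)
    R = C.copy()
    D = apply_Minv(R)
    P = D.copy()
    tau = np.sum(D * R)                       # tau_0 = sum_ij D_ij R_ij
    tol = eps * np.linalg.norm(C, "fro")
    for k in range(k_max):
        W = A @ P + P @ B
        alpha = tau / np.sum(P * W)
        X += alpha * P
        R -= alpha * W
        if np.linalg.norm(R, "fro") ** 2 < tol:   # rho_0 < eps ||C||_F
            break
        D = apply_Minv(R)
        tau_new = np.sum(D * R)
        P = D + (tau_new / tau) * P           # beta = tau_new / tau_old
        tau = tau_new
    return X
```

The only operations on full matrices are the compact product W = AP + PB, Frobenius inner products, and the preconditioner solve, which is where the block structure of M is exploited.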
When M = I_{mn}, the conjugate gradient method is obtained. The properties required of the preconditioner M [2] are:
(a) M is symmetric positive definite;
(b) M is sparse;
(c) M is easy to construct;
(d) M allows a fast solution of M\underline{d} = \underline{r} on a vector or parallel computer;
(e) M gives a 'good' distribution of the eigenvalues of M^{-1}\hat{A}, or a small condition number.
For the linear system in the Sylvester form we also require:
(f) M allows a compact matrix form for solving M\underline{d} = \underline{r}.

Consider the coefficient matrix \hat{A} in (2) arising from solving the Poisson equation on a rectangular domain by the five point or nine point discretization formula. Such a matrix is a block tridiagonal or block pentadiagonal matrix, respectively, with diagonal off-diagonal blocks. In these cases the matrices A and B in (1) are the tridiagonal matrices of order m and n respectively,

    A, B = \begin{pmatrix}
     2 & -1 &        & 0 \\
    -1 &  2 & \ddots &   \\
       & \ddots & \ddots & -1 \\
     0 &    & -1     &  2
    \end{pmatrix}

for the five point formula, and the pentadiagonal matrices

    A, B = \begin{pmatrix}
     30 & -16 &   1 &        &     & 0 \\
    -16 &  30 & -16 &   1    &     &   \\
      1 & -16 &  30 & \ddots & \ddots &   \\
        &   1 & \ddots & \ddots & -16 & 1 \\
        &     & \ddots & -16 &  30 & -16 \\
      0 &     &     &   1 & -16 &  30
    \end{pmatrix}

for the nine point formula.
Now if we choose

    M \equiv M_1 = \begin{pmatrix}
    a_{11}I_n + B^T &        & 0 \\
                    & \ddots &   \\
    0               &        & a_{mm}I_n + B^T
    \end{pmatrix},

we have the 1-line Jacobi preconditioner for the CG method. For both the five point formula and the nine point formula the preconditioner has identical diagonal blocks, and it is possible to represent M_1\underline{d} = \underline{r} in compact form [3]. In fact, setting T = a_{ii}I_n + B^T, we have

    M_1 = \begin{pmatrix}
    T &        & 0 \\
      & \ddots &   \\
    0 &        & T
    \end{pmatrix},   (3)

and

    \underline{d} = M_1^{-1}\underline{r} = (I_m \otimes T^{-1})\underline{r},

so that in compact form D = RT^{-1}.
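In code, the compact solve D = RT^{-1} amounts to one solve with T per row of R; a minimal numpy sketch (ours, assuming T symmetric, as it is for the model problems):

```python
import numpy as np

def apply_M1inv(R, T):
    """1-line Jacobi preconditioner in compact form: return D = R T^{-1}.

    D T = R, and since T is symmetric this is T D^T = R^T:
    one solve with T per row of R, all m solves independent."""
    return np.linalg.solve(T, R.T).T
```

In the actual implementation T is tridiagonal (five point) or pentadiagonal (nine point), so each row solve is a short banded back-substitution and the m rows can be distributed over processors; np.linalg.solve stands in for that banded solver here.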
Otherwise, if we choose the block diagonal matrix with 2 \times 2 blocks

    M \equiv M_2 = \begin{pmatrix}
    a_{11}I_n + B^T & a_{12}I_n      &        &                     & 0 \\
    a_{21}I_n       & a_{22}I_n + B^T &        &                     &   \\
                    &                & \ddots &                     &   \\
                    &                &        & a_{m-1,m-1}I_n + B^T & a_{m-1,m}I_n \\
    0               &                &        & a_{m,m-1}I_n         & a_{mm}I_n + B^T
    \end{pmatrix},

then we have the 2-line Jacobi preconditioner, which for the five point formula can be written [3]

    M_2 = \begin{pmatrix}
     T & -I &        &    & 0 \\
    -I &  T &        &    &   \\
       &    & \ddots &    &   \\
       &    &        &  T & -I \\
     0 &    &        & -I &  T
    \end{pmatrix}.   (4)
The inverse is given by

    M_2^{-1} = \mathrm{diag}\!\left\{
    \begin{pmatrix} T & I \\ I & T \end{pmatrix}, \ldots,
    \begin{pmatrix} T & I \\ I & T \end{pmatrix}
    \right\}(I_m \otimes \Delta^{-1}),

with \Delta^{-1} = (T^2 - I)^{-1} = (T - I)^{-1}(T + I)^{-1}. By introducing the m \times m block diagonal matrix

    \Lambda = \mathrm{diag}\!\left\{
    \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \ldots,
    \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}
    \right\},

we obtain

    M_2^{-1} = (\Lambda \otimes I_n + I_m \otimes B)(I_m \otimes \Delta^{-1});

therefore the system M_2\underline{d} = \underline{r} can be solved as

    \underline{d} = M_2^{-1}\underline{r} = (\Lambda \otimes I_n + I_m \otimes B)(I_m \otimes \Delta^{-1})\underline{r},

which in compact form is

    D = \Lambda(R\Delta^{-1}) + (R\Delta^{-1})B.
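The 2 \times 2 block inverse underlying this can be checked directly on a small example; the sketch below (ours) verifies [T, -I; -I, T]^{-1} = [T, I; I, T](I_2 \otimes \Delta^{-1}) for the model T = 2I_n + B:

```python
import numpy as np

# Sketch: check the 2x2 block inverse used by the 2-line Jacobi
# preconditioner, with Delta = T^2 - I, on a small tridiagonal T.
n = 5
B = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # model-problem B
T = 2*np.eye(n) + B
I = np.eye(n)

M2_block = np.block([[T, -I], [-I, T]])
Delta_inv = np.linalg.inv(T @ T - I)
M2_block_inv = np.block([[T, I], [I, T]]) @ np.kron(np.eye(2), Delta_inv)

assert np.allclose(M2_block @ M2_block_inv, np.eye(2 * n))
```

The identity holds because T and \Delta^{-1} are polynomials in T and therefore commute.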
A similar formulation can be obtained for the nine point difference formula.

The 3-line Jacobi preconditioner can be obtained for the five point formula with only a little extra computational work. In this case we have the block diagonal matrix with 3 \times 3 blocks

    M_3 = \mathrm{diag}\{S, \ldots, S\},
    \quad
    S = \begin{pmatrix}
     T & -I &  0 \\
    -I &  T & -I \\
     0 & -I &  T
    \end{pmatrix},   (5)

and the inverse of each diagonal block is given by

    S^{-1} = \begin{pmatrix}
    T^2 - I & T   & I \\
    T       & T^2 & T \\
    I       & T   & T^2 - I
    \end{pmatrix}(I_3 \otimes \Delta^{-1}),

where instead of \Delta = T^2 - I we now compute

    \Delta = T(T^2 - I) - T;

this increases the convergence rate asymptotically by \sqrt{2} [4]. The computation of

    \Delta^{-1} = (T - \sqrt{2}I)^{-1}(T + \sqrt{2}I)^{-1}T^{-1}

can easily be carried out because all three factors are tridiagonal back solvers.
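The factorization of \Delta into three tridiagonal factors, and the resulting three-solve application of \Delta^{-1}, can be checked numerically; a small sketch (ours) for the model T:

```python
import numpy as np

# Sketch: Delta = T(T^2 - I) - T factors as (T - sqrt(2) I)(T + sqrt(2) I) T,
# so Delta^{-1} r costs three tridiagonal solves (dense solves stand in here).
n = 6
B = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
T = 2*np.eye(n) + B
I = np.eye(n)
s = np.sqrt(2.0)

Delta = T @ (T @ T - I) - T
assert np.allclose(Delta, (T - s*I) @ (T + s*I) @ T)

r = np.arange(1.0, n + 1)
d = np.linalg.solve(T, r)          # solve with T
d = np.linalg.solve(T + s*I, d)    # solve with T + sqrt(2) I
d = np.linalg.solve(T - s*I, d)    # solve with T - sqrt(2) I
assert np.allclose(Delta @ d, r)
```

All three factors are shifted copies of the same tridiagonal T, so the three solves are cheap and can be pipelined.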
2. An additive preconditioner

Let us now consider the following splitting of the matrix \hat{A} [5]:

    \hat{A} = H_1 + K_1 = H_2 + K_2,

where

    H_1 = \begin{pmatrix}
    a_{11}I_n + B^T & a_{12}I_n       &        &                      & 0 \\
    a_{21}I_n       & a_{22}I_n + B^T &        &                      &   \\
                    &                 & \ddots &                      &   \\
                    &                 &        & a_{m-1,m-1}I_n + B^T & a_{m-1,m}I_n \\
    0               &                 &        & a_{m,m-1}I_n         & a_{mm}I_n + B^T
    \end{pmatrix},   (6)

    K_1 = \hat{A} - H_1,

    H_2 = \begin{pmatrix}
    a_{11}I_n + B^T &                 &                 &        & 0 \\
                    & a_{22}I_n + B^T & a_{23}I_n       &        &   \\
                    & a_{32}I_n       & a_{33}I_n + B^T &        &   \\
                    &                 &                 & \ddots &   \\
    0               &                 &                 &        & a_{mm}I_n + B^T
    \end{pmatrix},   (7)

    K_2 = \hat{A} - H_2,

where we assume that m is even. If m is odd we can proceed in a similar way.
Now we define

    \bar{M}_2^{-1} = \theta H_1^{-1} + (1 - \theta)H_2^{-1},   (8)

where 0 < \theta < 1 is a chosen parameter. If N_2 = \bar{M}_2 - \hat{A}, the iteration matrix is

    T = \bar{M}_2^{-1}N_2 = \bar{M}_2^{-1}(\bar{M}_2 - \hat{A}) = I - \bar{M}_2^{-1}\hat{A}
      = I - \theta H_1^{-1}\hat{A} - (1 - \theta)H_2^{-1}\hat{A}
      = -\theta H_1^{-1}K_1 - (1 - \theta)H_2^{-1}K_2.

For \theta = 1/2, T becomes the iteration matrix of the Arithmetic Mean Method [6]. Now we must give conditions for \bar{M}_2^{-1} to be symmetric and positive definite for the compact form of the model problems.

Theorem 1. Let \hat{A} be a block tridiagonal symmetric positive definite matrix; then for 0 < \theta < 1 the matrix \bar{M}_2^{-1} in (8) is symmetric positive definite.
Proof. Let U_1 be the block diagonal matrix U_1 = diag{-I_1, I_3, \ldots, (-1)^{m/2}I_{m-1}} corresponding to the partitioning (6) of H_1, and let U_2 be the block diagonal matrix U_2 = diag{-I_0, I_2, \ldots, (-1)^{m/2+1}I_m} corresponding to the partitioning (7) of H_2, where I_i is the identity matrix of order 2n for i = 1, 2, \ldots, m-1, and I_0 and I_m are identity matrices of order n. Then

    U_1(H_1 + K_1)U_1^{-1} = H_1 - K_1   and   U_2(H_2 + K_2)U_2^{-1} = H_2 - K_2.

The symmetric matrices H_1 - K_1 and H_2 - K_2 have the same eigenvalues as \hat{A} and consequently are symmetric positive definite. Since the matrices \theta(H_1 - K_1) + (1 - \theta)(H_1 - K_1)^T = H_1 - K_1 and \theta(H_2 - K_2) + (1 - \theta)(H_2 - K_2)^T = H_2 - K_2 are symmetric positive definite, the iteration matrix T = \bar{M}_2^{-1}N_2 is convergent [7]. By hypothesis, the matrices H_1 and H_2 are symmetric positive definite; consequently the matrix \bar{M}_2^{-1} of formula (8) is symmetric positive definite. □

Theorem 2. Let \hat{A} be a symmetric strictly diagonally dominant or irreducibly diagonally dominant matrix with positive diagonal entries; then for 0 < \theta < 1 the matrix \bar{M}_2^{-1} in (8) is symmetric positive definite.

Proof. By hypothesis, H_1 - K_1 and H_2 - K_2 are symmetric positive definite matrices. Thus the last part of the proof of Theorem 1 completes the proof. □
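Structurally, applying (8) is just two independent block diagonal solves followed by a weighted sum; a minimal sketch (function names ours):

```python
import numpy as np

def apply_additive(solve_H1, solve_H2, r, theta=0.5):
    """Additive preconditioner (8): d = theta*H1^{-1} r + (1-theta)*H2^{-1} r.

    solve_H1 and solve_H2 are the two block diagonal solvers; they do not
    depend on each other, so each can run on its own group of processors."""
    return theta * solve_H1(r) + (1.0 - theta) * solve_H2(r)
```

With theta = 0.5 this is the Arithmetic Mean Method's preconditioner; the independence of the two solves is the source of the extra parallelism reported in Section 3.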
Now if \hat{A} is the matrix arising from solving the Poisson equation on a rectangular domain by the five point or nine point discretization formula, it is possible to write the system \bar{M}_2\underline{d} = \underline{r} of the PCG algorithm in compact form.
For the five point formula we have

    H_1 = \mathrm{diag}\!\left\{
    \begin{pmatrix} T & -I \\ -I & T \end{pmatrix}, \ldots,
    \begin{pmatrix} T & -I \\ -I & T \end{pmatrix}
    \right\},
    \qquad
    H_2 = \mathrm{diag}\!\left\{T,
    \begin{pmatrix} T & -I \\ -I & T \end{pmatrix}, \ldots,
    \begin{pmatrix} T & -I \\ -I & T \end{pmatrix}, T\right\};

then the inverse matrices are given by

    H_1^{-1} = \mathrm{diag}\!\left\{
    \begin{pmatrix} T & I \\ I & T \end{pmatrix}(I_2 \otimes \Delta^{-1}), \ldots,
    \begin{pmatrix} T & I \\ I & T \end{pmatrix}(I_2 \otimes \Delta^{-1})
    \right\},

    H_2^{-1} = \mathrm{diag}\!\left\{T\Delta_1^{-1},
    \begin{pmatrix} T & I \\ I & T \end{pmatrix}(I_2 \otimes \Delta^{-1}), \ldots,
    \begin{pmatrix} T & I \\ I & T \end{pmatrix}(I_2 \otimes \Delta^{-1}),
    T\Delta_1^{-1}\right\},

with \Delta^{-1} = (T^2 - I)^{-1} and \Delta_1^{-1} = (T^2)^{-1}. Now set the matrices of order m

    \Lambda_1 = \mathrm{diag}\!\left\{
    \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \ldots,
    \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}
    \right\}
    \quad \text{and} \quad
    \Lambda_2 = \mathrm{diag}\!\left\{2,
    \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \ldots,
    \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, 2\right\};

then the equation \underline{d} = \bar{M}_2^{-1}\underline{r} becomes

    D = \theta[\Lambda_1(R\Delta^{-1}) + (R\Delta^{-1})B] + (1 - \theta)[\Lambda_2(R\Delta_1^{-1}) + (R\Delta_1^{-1})B],
where the (i, j)th element z_{ij} of the m \times n matrix R\Delta_1^{-1} can be formulated in pseudo-code as follows:

    for j = 1, ..., n
        z_{ij} = z_{ij} + r_{ik} * v_{kj},   k = 1, ..., n;  i = 2, ..., m-1
        z_{1j} = z_{1j} + r_{1k} * u_{kj},   k = 1, ..., n
        z_{mj} = z_{mj} + r_{mk} * u_{kj},   k = 1, ..., n

where u_{pq} and v_{pq} are the (p, q)th elements of the matrices \Delta_1^{-1} and \Delta^{-1} respectively. If we consider the nine point formula, we can proceed in a similar way by using Theorem 2.
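The loop above, vectorized (our sketch; U and V hold dense copies of (T^2)^{-1} and (T^2 - I)^{-1} purely for illustration):

```python
import numpy as np

def r_delta1_inv(R, U, V):
    """Form Z row by row as in the pseudo-code: rows 2..m-1 use
    V = (T^2 - I)^{-1}, the first and last rows use U = (T^2)^{-1}.
    Each row is independent, so the rows can be computed in parallel."""
    Z = R @ V                  # z_ij = sum_k r_ik v_kj for all rows
    Z[0, :] = R[0, :] @ U      # first row uses u_kj instead
    Z[-1, :] = R[-1, :] @ U    # last row uses u_kj instead
    return Z
```

In practice one would not form the inverses explicitly but would replace each row product by the corresponding tridiagonal solves.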
3. Numerical experiments

In order to evaluate the effectiveness of the preconditioner (8) for the conjugate gradient method, some computational experiments have been carried out and a parallel Fortran implementation has been developed on the Sequent Balance 8000. The CG method and the PCG method with the different preconditioners are applied to the Poisson model problem. The convergence parameter \varepsilon in the PCG algorithm is taken equal to 10^{-4}, and \theta in formula (8) is chosen to be 0.5. In Table 1 and Table 2, results for the five point formula and the nine point formula respectively are given as the number of iterations of the algorithms tested on system (1), for sizes from 4 \times 4 to 32 \times 32. The number of processors used is p; the PCG method with the preconditioner (8) requires twice as many, and all its results are obtained using 2p processors.
Table 1
Number of iterations for the CG and PCG methods for the 5 point difference formula

m×n      4×4    8×8    12×12  16×16  20×20  24×24  28×28  32×32
         (p=1)  (p=1)  (p=2)  (p=2)  (p=4)  (p=4)  (p=4)  (p=4)
CG        8     17     25     34     41     49     55     62
PCG M1    8     14     19     24     27     32     36     41
PCG M2    6     10     14     18     21     25     28     32
PCG M3    5      9     11     14     16     19     22     24
Table 2
Number of iterations for the CG and PCG methods for the 9 point difference formula

m×n      4×4    8×8    12×12  16×16  20×20  24×24  28×28  32×32
         (p=1)  (p=1)  (p=2)  (p=2)  (p=4)  (p=4)  (p=4)  (p=4)
CG        8     19     28     38     47     56     63     71
PCG M1    8     16     21     27     32     37     42     47
PCG M2    6     10     14     18     21     25     28     32
PCG M̄2    6     10     11     15     17     20     24     26
Table 3
Number of iterations and error for the CG and PCG methods for the 5 point difference formula

                  CG      PCG M1   PCG M2   PCG M3   PCG M̄2
Processors         4        4        4        4        8
24×24  Iter.      15       11       10        8        8
       Err.     5.7e-3   5.7e-3   5.9e-3   5.7e-3   5.7e-3
36×36  Iter.      22       16       13       12       11
       Err.     5.7e-3   5.2e-3   5.5e-3   5.6e-3   5.6e-3
48×48  Iter.      32       20       18       16       12
       Err.     5.7e-3   9.9e-3   5.8e-3   5.7e-3   9.6e-3
60×60  Iter.      37       22       19       17       16
       Err.     5.7e-3   7.4e-3   5.8e-3   5.7e-3   5.8e-3
Table 4
Computer time in seconds for the various preconditioners

m×n                 24×24   36×36   48×48   60×60
PCG M2 (4 proc.)     7.6    32.8    92.8   198.0
PCG M3 (4 proc.)     6.6    31.0    83.9   179.8
PCG M̄2 (8 proc.)     8.3    30.5    60.5   149.5
In Tables 1 and 2 the error in the solution is about 10^{-4}-10^{-5}, and the function in the Poisson equation is u(x, y) = e^{-x}e^{-y}. Tables 3 and 4 show that the rate of convergence of the PCG method can be improved considerably if we choose the preconditioner of formula (8) instead of the 2-line or 3-line Jacobi preconditioner. The number of iterations, the error in the solution and the computer time in seconds are reported. The Poisson equation is solved by the five point discretization formula and the solution is taken to be u(x, y) = \sin(\frac{1}{2}n\pi x)\cos(\frac{1}{2}n\pi y).
References

[1] J.M. Ortega, Introduction to Parallel and Vector Solution of Linear Systems (Plenum Press, New York, 1985).
[2] D.J. Evans, ed., Preconditioned iterative methods, Int. J. Comp. Math. 44 (1992) 1-364.
[3] D.J. Evans and C.R. Wan, A preconditioned conjugate gradient method for AX + XB = C, Int. Rep. 778, Comp. Stud. (L.U.T., 1993).
[4] R.S. Varga, Matrix Iterative Analysis (Prentice Hall, Englewood Cliffs, NJ, 1962).
[5] V. Ruggiero and E. Galligani, A parallel algorithm for solving block tridiagonal linear systems, Comput. Math. Applic. 24 (4) (1992) 15-21.
[6] V. Ruggiero and E. Galligani, An iterative method for large sparse linear systems on a vector computer, Comput. Math. Applic. 20 (1) (1990) 25-28.
[7] D.P. O'Leary and R.E. White, Multisplitting of matrices and parallel solution of linear systems, SIAM J. Alg. Disc. Math. 6 (4) (1985) 630-640.