MECH. RES. COMM. Vol. 5(5), 239-249, 1978. Pergamon Press. Printed in USA.

ON PRECONDITIONING A NONSINGULAR MATRIX

D. W. Nicholson
Goodyear Research, The Goodyear Tire and Rubber Company, Akron, Ohio*

(Received 9 March 1978; accepted for print 28 September 1978)

*Currently, Explosion Effects Branch, Naval Surface Weapons Center, Silver Spring, Maryland
Summary

For a nonsingular n by n matrix A, a diagonal matrix D* is derived which minimizes an upper bound on the spectral condition number of DA. Replacement of the linear system Ax = c with the prescaled system D*Ax = D*c requires about 3n² operations for dense matrices, and fewer for sparse, banded matrices, and is recommended for the conjugate gradient and other methods of solution. Examples are given showing the advantageous effect of prescaling on the condition number, and a simple computational algorithm is presented. The extension to nondiagonal scaling matrices is discussed.

Introduction

In vector and matrix notation we consider the linear system
    Ax = c.                                                            (1)

Such equations arise routinely in finite element analysis. For computing x by the conjugate gradient and other methods it is useful to introduce the prescaled system

    D'Ax = D'c                                                         (2)

where the diagonal matrix D' minimizes the spectral condition number of DA. However, D' is difficult to determine [4]. Instead, in this work we derive a matrix D* minimizing an upper bound on the spectral condition number of DA [3, p.81], and we illustrate its application with some simple examples.
A computational algorithm is provided. Prescaling is common practice in the conjugate gradient method and is reported to promote efficiency [1]. But the prescaling matrix seems to be chosen primarily on grounds of intuition or experience. One contribution of this work is to derive a scaling matrix rigorously by an error minimization argument. Our results appear to be new and lead naturally to further developments such as scaling with a lower bidiagonal matrix.

Notation and Background

For any n by 1 vector z, ‖z‖ will denote the Euclidean norm [4, p.26]. If Z is an n by n nonsingular matrix, ‖Z‖ will denote the matrix norm of Z induced by the Euclidean vector norm [4, p.26]. The eigenvalues of Z will be denoted by λ_i(Z), ordered such that |λ_1| ≥ |λ_2| ≥ ... ≥ |λ_n|, where the vertical lines denote the modulus. Z^H and Z^{-1} will denote the Hermitian transpose and inverse of Z, respectively, while τ(Z) and Δ(Z) will denote the trace and determinant of Z, respectively. For the matrices used here, the spectral condition number is given by [3, p.81]

    k(Z) = ‖Z‖ ‖Z^{-1}‖ = σ_1(Z)/σ_n(Z)

where σ_i(Z) = λ_i^{1/2}(Z^H Z) are the Hilbert singular values, ordered such that σ_1 ≥ σ_2 ≥ ... ≥ σ_n > 0.
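As an illustration, the spectral condition number defined above can be evaluated directly from the singular values; the short NumPy sketch below does so for an arbitrary nonsingular test matrix (the matrix and its size are assumptions made only for the example, and are not part of the derivation).

import numpy as np

def spectral_condition_number(Z):
    # k(Z) = sigma_1(Z) / sigma_n(Z), the ratio of extreme singular values
    s = np.linalg.svd(Z, compute_uv=False)   # singular values, descending order
    return s[0] / s[-1]

Z = np.array([[4.0, 1.0],
              [2.0, 3.0]])                   # arbitrary nonsingular example
print(spectral_condition_number(Z))          # equals ||Z|| * ||Z^-1|| in the 2-norm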
Let x̄ be the computed and x = x̄ + δx the exact solution of (1), and let δA denote the perturbation of A (from roundoff, for example). Then, to first order,

    ‖δx‖/‖x‖ ≤ k(A) ‖δA‖/‖A‖.                                          (3)

Clearly k is involved in bounding the computational error, and a small value for it is desirable. The spectral condition number not only has the foregoing error interpretation, but it also controls the efficiency of the conjugate gradient and other computational methods [1,3], and its reduction by scaling is reported to be beneficial [1]. Various ad hoc scaling methods have been used [1]. As previously stated, our contribution is to derive a prescaling matrix using a rigorous error minimization argument.

The exact solution of the equation

    DAx = Dc                                                           (4)

is the same as that of (1) for any nonsingular diagonal matrix D. We seek a particular choice D* minimizing an upper bound on k(DA). Now, from the definition, k(DA) = σ_1(DA)/σ_n(DA) = k^{1/2}(DAA^H D^H). Since the matrix DAA^H D^H is Hermitian and positive definite, it is subject to the bound [5]

    k(DAA^H D^H) ≤ (4/n^n) τ^n(DAA^H D^H)/Δ(DAA^H D^H) ≡ (4/n^n) Φ(DAA^H D^H).     (5)

For completeness, in the next section we briefly derive this important bound [5], and thereafter we derive a matrix D* minimizing Φ(DAA^H D^H).
Proof of Bound

Let C be a positive definite Hermitian matrix with eigenvalues c_1 ≥ c_2 ≥ ... ≥ c_n > 0. Then, since the trace and determinant are invariants,

    τ(C) = c_1 + c_2 + ... + c_n = Σ_{i=1}^n c_i

    Δ(C) = c_1 c_2 ··· c_n = Π_{i=1}^n c_i.

The well known inequality between the arithmetic and geometric means asserts that for any n positive numbers the geometric mean does not exceed the arithmetic mean; applied to the eigenvalues of C it gives

    Δ^{1/n}(C) ≤ τ(C)/n.

Now

    k(C) Δ(C) = (c_1/c_n) c_1 c_2 ··· c_n = 4 (c_1/2)(c_1/2) c_2 ··· c_{n-1},

and since the n numbers c_1/2, c_1/2, c_2, ..., c_{n-1} have sum τ(C) − c_n ≤ τ(C), the foregoing inequality gives

    (c_1/2)(c_1/2) c_2 ··· c_{n-1} ≤ τ^n(C)/n^n.

Hence

    k(C) = c_1/c_n ≤ (4/n^n) τ^n(C)/Δ(C) = (4/n^n) Φ(C),

proving the bound.
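The bound is easily checked numerically; the following NumPy sketch is offered only as an illustration, with an arbitrary random symmetric positive definite test matrix.

import numpy as np

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
C = M @ M.T + n * np.eye(n)          # positive definite symmetric test matrix

eig = np.linalg.eigvalsh(C)          # real eigenvalues, ascending
k_C = eig[-1] / eig[0]               # spectral condition number c_1/c_n

tau = np.trace(C)
delta = np.linalg.det(C)
bound = 4.0 / n**n * tau**n / delta  # (4/n^n) tau^n(C) / Delta(C)

print(k_C, bound, k_C <= bound)      # the bound should hold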
Choice of Preconditioning Matrix

Let B = AA^H; B is positive definite and Hermitian. In terms of B, we will show that Φ(D*AA^H D*^H) ≤ Φ(DAA^H D^H) for all complex n by n diagonal matrices D, where D* = diag{1/b_1, ..., 1/b_n} and b_i = √(b_ii). From the properties of the trace and determinant,

    τ(DAA^H D^H) = Σ_{i=1}^n |d_ii|² b_ii                              (6)

    Δ(DAA^H D^H) = Π_{i=1}^n |d_ii|² Δ(B).                             (7)

Hereafter let d_i = |d_ii|². Now, for a minimum,

    ∂ ln Φ/∂d_i = n b_ii/τ(DAA^H D^H) − 1/d_i = 0.

This is equivalent to the linear system

    Σ_{q=1}^n b_qq d_q − n b_ii d_i = 0                                (8)

whose matrix is singular and of rank n−1. As a minimizing parameter we choose

    d_i* = 1/b_ii = 1/b_i²,

which clearly satisfies (8). To prove that the choice d_i = d_i*, i = 1, ..., n, gives a minimum we must show that the Hessian matrix T with entries

    T_ij = ∂² ln Φ/∂d_i ∂d_j,

evaluated at the d_p*, is positive definite. But for an arbitrary vector q with elements q_i,

    q^H T q = Σ_{i=1}^n (q_i b_ii)² − (1/n) (Σ_{i=1}^n q_i b_ii)² ≥ 0,

with equality only when the q_i b_ii are all equal, a direction which corresponds to a common rescaling of the d_i and leaves Φ unchanged. Hence T is positive definite on all remaining directions, and the minimizing property of D* is proved.

Since τ(D*AA^H D*^H) = Σ_{i=1}^n b_ii/b_ii = n and Δ(D*AA^H D*^H) = Δ(B)/Π_{i=1}^n b_ii, the bound (5) gives k(D*A) ≤ 2 [Π_{i=1}^n b_ii/Δ(B)]^{1/2}. Substituting into (3) then gives

    ‖δx‖/‖x‖ ≤ k(D*A) ‖δA‖/‖A‖ ≤ 2 [Π_{i=1}^n b_ii/Δ(B)]^{1/2} ‖δA‖/‖A‖.

(Note that Π_{i=1}^n b_ii ≥ Δ(B).)
Applications

1. Algorithm

We discuss the use of the foregoing results in computation. To avoid introducing rounding errors, b_i is replaced by the nearest integral power of the base β of the computer, similar to the procedure used in Reference [6]. The operation round(b_i) = β^s will mean b_i − β/2 ≤ β^s ≤ b_i + β/2. The algorithm is as follows.

1. Calculate

       b_ii = Σ_{q=1}^n a_iq ā_iq.

2. Calculate

       d_ii = 1/b_i = round(1/√(b_ii)).

3. Calculate D*A.

4. Solve the linear problem, say by the conjugate gradient method.

Note that the diagonal elements of D*BD*^H are all unity if the round operation is not performed.
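A compact NumPy realization of these four steps is sketched below. Several details are assumptions made only for the sketch: the machine base is taken as β = 2 and the rounding is applied to the exponent of 1/√(b_ii), which simplifies the rounding rule stated above; and since D*A need not be symmetric, step 4 is carried out here by a conjugate gradient iteration on the normal equations rather than on D*A itself. The test matrix is arbitrary.

import numpy as np

def prescale(A, c, base=2.0):
    # Steps 1-3: form D*A and D*c, with d_ii rounded to a power of the base
    b = np.sum(np.abs(A)**2, axis=1)                               # b_ii = sum_q |a_iq|^2
    d = base ** np.rint(np.log(1.0 / np.sqrt(b)) / np.log(base))   # nearest power of the base
    return d[:, None] * A, d * c                                   # D*A, D*c

def cg_normal_equations(M, f, tol=1e-12, max_iter=1000):
    # Conjugate gradients applied to M^T M x = M^T f (M need not be symmetric)
    x = np.zeros(M.shape[1])
    r = M.T @ (f - M @ x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Mp = M @ p
        alpha = rs / (Mp @ Mp)
        x += alpha * p
        r -= alpha * (M.T @ Mp)
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

A = np.array([[10.0, 1.0, 0.0],
              [ 1.0, 0.2, 0.1],
              [ 0.0, 0.1, 5.0]])
c = np.array([1.0, 2.0, 3.0])

DA, Dc = prescale(A, c)
print(np.linalg.cond(A), np.linalg.cond(DA))    # condition number before and after prescaling
print(cg_normal_equations(DA, Dc))              # approximates the solution of Ax = c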
2. Simple Example

Consider the nearly singular matrix

    A = [ x      1      ]
        [ x²   x + 2ε   ]

where ε/x << 1.
To first order in ε, we find

    k(A) = σ_1(A)/σ_n(A) = (x² + 1)²/(2xε),

    k(D*A) = (x² + 1)/ε,

and hence

    k(D*A)/k(A) = 2x/(x² + 1).

So the advantage of preconditioning in this example is greater as the range of the diagonal elements is greater. No advantage is seen for equal diagonal elements, no matter how nearly singular the matrix is.
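A short numerical check of these first-order expressions is sketched below; the values x = 10 and ε = 10⁻⁶ are arbitrary choices made only for the illustration, and the matrix is the one written above.

import numpy as np

x, eps = 10.0, 1e-6
A = np.array([[x,      1.0        ],
              [x**2,   x + 2 * eps]])

b = np.sum(A**2, axis=1)                    # b_ii = row sums of squares
D_star = np.diag(1.0 / np.sqrt(b))          # D* = diag(1/sqrt(b_ii))

kA  = np.linalg.cond(A)
kDA = np.linalg.cond(D_star @ A)

print(kA,  (x**2 + 1)**2 / (2 * x * eps))   # k(A)   vs first-order estimate
print(kDA, (x**2 + 1) / eps)                # k(D*A) vs first-order estimate
print(kDA / kA, 2 * x / (x**2 + 1))         # ratio  vs 2x/(x^2+1)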
3. Tridiagonal Stiffness Matrix

The finite element method frequently leads to sparse, banded, Hermitian positive definite matrices. We consider a simple example with tridiagonal elements, such as might arise from using line elements. Extension to higher order problems follows a similar path.

Consider 2n numbers α_i and β_i, where α_1 = 0. We introduce the tridiagonal matrix K with elements

    k_ii = α_i² + β_i²,    k_{i,i+1} = α_{i+1} β_i,    k_{i,i-1} = α_i β_{i-1},    k_ij = 0 otherwise.

The matrix is Hermitian and positive definite since it can be expressed as the product K = LL^H, where

    l_ii = β_i,    l_{i,i-1} = α_i,    l_ij = 0 otherwise.

There exists a lower bidiagonal matrix E* such that E*KE*^H = I. Clearly, E* = L^{-1} and

    Φ(E*KE*^H) = n^n.                                                  (9)

By the inequality between the arithmetic and geometric means, it follows that E* minimizes Φ(EKE^H) for E any bidiagonal matrix. In fact, E* is given by

    e_ii = 1/β_i,    e_{i,i-1} = −α_i/(β_i β_{i-1}),    e_ij = 0 otherwise.

Some effort shows that these relations result from the same derivation leading to D*. In contrast to E*, we have

    d*_ii = 1/√(α_i² + β_i²),    d*_ij = 0 otherwise,

and

    Φ(D*KD*^H) = n^n Π_{i=1}^n (α_i² + β_i²) / Π_{i=1}^n β_i² = n^n Π_{i=1}^n [1 + (α_i/β_i)²],     (10)

in contrast to (9). Finally, note that

    Φ(K) = [Σ_{i=1}^n (α_i² + β_i²)]^n / Π_{i=1}^n β_i²,

and therefore

    Φ(D*KD*^H)/Φ(K) = n^n Π_{i=1}^n (α_i² + β_i²) / [Σ_{i=1}^n (α_i² + β_i²)]^n ≤ 1

by the inequality between the arithmetic and geometric means.
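Relation (10) and the comparison with Φ(K) can be verified numerically. The sketch below is an illustration only: it assembles K = LL^H from arbitrary values of the α_i and β_i and evaluates both sides of (10).

import numpy as np

def phi(C):
    # Phi(C) = tau^n(C) / Delta(C)
    n = C.shape[0]
    return np.trace(C)**n / np.linalg.det(C)

n = 5
rng = np.random.default_rng(2)
alpha = rng.uniform(0.5, 1.5, n)
alpha[0] = 0.0                                       # alpha_1 = 0 by convention
beta = rng.uniform(1.0, 2.0, n)

L = np.diag(beta) + np.diag(alpha[1:], -1)           # lower bidiagonal factor
K = L @ L.T                                          # tridiagonal stiffness matrix

D_star = np.diag(1.0 / np.sqrt(alpha**2 + beta**2))  # d*_ii = 1/sqrt(alpha_i^2 + beta_i^2)

lhs = phi(D_star @ K @ D_star)
rhs = n**n * np.prod(1.0 + (alpha / beta)**2)        # right side of (10)

print(lhs, rhs)                                      # the two should agree
print(lhs <= phi(K))                                 # Phi(D*KD*^H) <= Phi(K)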
4. Extension of the Scaling Results

The results of the foregoing sections can be extended to scaling matrices G_k with the main diagonal and first k−1 subdiagonals populated, all other entries being zero. The mathematics is involved and will be reported elsewhere. Here we simply state the result. For a positive definite Hermitian matrix P, let G_k be the scaling matrix, and let P_k be a matrix formed from P by setting to zero all entries which lie below the (k−1)st subdiagonal and above the (k−1)st superdiagonal. We assume that P_k is positive definite; certainly P_1 and P_2 are, by virtue of Hadamard's inequalities. Then our result is

    min_{G_k} τ^n(G_k P G_k^H) / [n^n Δ(G_k P G_k^H)] = Δ(P_k)/Δ(P) ≥ 1,

and in fact the minimizing matrix G_k^+ is such that G_k^+ P_k G_k^{+H} = I. For a diagonal scaling matrix, we see that P_k = P_1 = diag{p_11, p_22, ..., p_nn}. Continuing, G_1^+ P_1 G_1^{+H} = I, so that G_1^+ = diag{p_11^{-1/2}, p_22^{-1/2}, ..., p_nn^{-1/2}}. Also, Δ(P_1) = Π_{i=1}^n p_ii. Hence

    (1/n^n) Φ(G_1^+ P G_1^{+H}) = Π_{i=1}^n p_ii / Δ(P).
With the identification P = B = AA^H, it readily follows that G_1^+ = D*.

Discussion

In the foregoing sections, an error minimization argument has been used to derive a diagonal matrix for preconditioning the linear system of (1). Several examples have been given illustrating the advantages of prescaling. A computational algorithm has been presented, and the extension of the results to more general scaling matrices has been described. To determine the quantities d_i in the diagonal scaling matrix, about 3n² operations are needed, compared to about n³ for elimination. For large matrices and the conjugate gradient and other methods, the present prescaling method involves only a small fraction of the total computational effort while promoting efficiency.

Acknowledgment

The author is most grateful to the reviewer for his extensive suggestions, particularly regarding the computational algorithm.
References

1. Luenberger, D. G., Introduction to Linear and Nonlinear Programming, Addison-Wesley, London, 1973.
2. Forsythe, G. E. and Moler, C., Computer Solution of Linear Algebraic Systems, John Wiley and Sons, New York, 1964.
3. Householder, A. S., The Theory of Matrices in Numerical Analysis, Blaisdell Publishing Co., New York, 1964.
4. Young, D. M., Iterative Solution of Large Linear Systems, Academic Press, New York, 1971.
5. Kato, T., "Estimation of Iterated Matrices, with Application to the von Neumann Condition," Numer. Math. 2, 22-29, 1960.
6. Parlett, B. N. and Reinsch, C., "Balancing a Matrix for Calculation of Eigenvalues and Eigenvectors," in Handbook for Automatic Computation, Vol. II: Linear Algebra, J. H. Wilkinson and C. Reinsch (eds.), Springer-Verlag, Berlin, 1971.