On preconditioning a nonsingular matrix

On preconditioning a nonsingular matrix

MECH. RES. COMM. Vol. 5(5), 239-249, 1978. ON PRECONDITIONING Pergamon Press. Printed in USA. A NONSINGULAR MATRIX D. W. Nicholson Goodyear Resear...

336KB Sizes 2 Downloads 76 Views

MECH. RES. COMM. Vol. 5(5), 239-249, 1978.

ON PRECONDITIONING

Pergamon Press. Printed in USA.

A NONSINGULAR MATRIX

D. W. Nicholson Goodyear Research, The Goodyear Tire and Rubber Company, Akron, Ohio* (Received 9 March 1978; accepted for print 28 September 1978)

Summary For a nonsingular n by n matrix A, a diagonal matrix D* is derived which minimizes an upper bound on the spectral condition number of DA. Replacement of the linear system Ax=c with the prescaled system D*Ax=D*c requires about 3n 2 operations for dense matrices and fewer for sparse, banded matrices and is recommended for the conjugate gradient and other methods of solution. Examples are given showing the advantageous effect of prescaling on condition number, and a simple computational algorith~ is presented. The extension to nondiagonal scaling matrices is discussed. Introduction In vector and matrix notation we consider the linear system

Ax =

c.

(i)

Such equations arise routinely in finite element analysis.

For

computing x by the conjugate gradient and other methods it is b"

useful to introduce the prescaled

-I

system hlJ

D'Ax = D'c

(2)

where the diagonal matrix D' minimizes the spectral condition *Currently, Explosion Effects Branch, Naval Surface Weapons Center, Silver Spring, Maryland

oo93-6~13/78/o5o259-115o2.oo/o

Copyright

(c) 1978 Pergamon P r e s s

240

D. W.

NICHOLSON

[4 However D' is difficult to determine | ~ .

number of DA.

Instead,

in this work we derive a matrix D* minimizing an upper bound on F

the spectral condition number of DA 13, p . 8 ~ , and we illustrate its application with some simple examples.

A computational

algorithm is provided. Prescaling is common practice in the conjugate gradient method and is reported to promote efficiency Ii~.

But the prescaling

matrix seems to be primarily chosen on qrounds of intuition or experience. One contribution is to derive a scaling matrix rigorously by an error minimization argument.

Our results

appear to be new and naturally lead to further developments such as scaling with a lower bidiagonal matrix. Notation and Background For any D by 1 vector z, IIzll will denote the Euclidean norm ! h4, p.26~.

If Z is an n by n nonsingular matrix,

IIZII will

denote the matrix norm of Z, induced by the Euclidean vector norm

~, p.26J.

The eigenvaluesof Z will be denoted by

ordered such that lllI>ll21>...~ll I _I _ n' now denote the modulus.

li(Z)'

where the vertical lines

ZH and Z -I will denote the Hermitian

transpose and inverse of Z, respectively, while T(Z) and ~(Z) will denote the trace and determinant of Z, respectively. For the matrix being used here, the spectral condition number is given by [3, p.81]

kcz) : llzll llz-lll =

lCZ)/OnCZ)

where gi(Z) = I~/21(zHz) are the Hilbert singular Olin2! ..... ~On>0.

values ordered such that

ON PRECONDITIONING A NONSINGULAR MATRIX

241

Let x be the computed and x = ~ + ~x the exact solution of (i), where I16 I I<
_J

l l xl I/I Ixl I
I I~AI I<
(3)

Clearly k is involved in hounding

computational error and a small value for it is desirable. The spectral condition number not only has the foregoing error interpretation,

but it controls the efficiency of the conjugate

gradient and other computational methods [I,3], anditem7 reduction by scaling is reported to be r beneficial Ii|. Various -I LA ad hoc scaling methods have been used 111. As previously stated, our contribution is to derive a prescaling matrix using a rigorous error minimization argument. The exact solution of the equation DAx = Dc

(4)

is the same as that of (i) for any nonsingular diagonal matrix D.

We seek a particular choice D* minimizing an upper bound on

k(DA).

Now from the definition k(DA) = c I(DA)/c n(DA) = k I/2(DAAHD H).

Since the matrix DAAHD H is Hermitian and positive definite, is subject to the bound

[5]

k(DAAHD H) < 4 nn - -

it

Tn(DAAHDH) ___ 4__ ~(DAAHD H) A (DAAHD H) nn

For completeness, in the next section we briefly derive this important bound [5], and thereafter we derive a matrix D* minimizing ~(DAAHDH).

(5)

242

D. W. NICHOLSON

Proof

of B o u n d

L e t C be a p o s i t i v e Cl~C2~..-~Cn>0.

definite

Then,

since

Hemitian the

matrix

trace

with

eigenvalues

and determinant

are

invariants n T(C)

= cI + c2 +

~(C)

The well means

known

asserts

"'" + c n = [ i=l

= C l - C 2 ..... c

inequality

between

n = n n i=l

c. 1

c.. l

the alithmetic

and geometric

that

1 n(c)

>

n-~ ~ - -

i.

_

But

c c

and

since

c

< cI n --

1 n

the

c2c 3 -.- C n C n

1

c .c 2 c 1 ...... n

~

c

2

.c

foregoing

3

......

c

n

inequality

< c 2 c 3 ..... C n . C n . ¢ l

c

n

gives

= A(C)

_< T n ( c ) / n n.

Hence

k(C) proving

Choice

= Cl/C n > ~

4 Tn(c) ~

_

4 nn

¢(C),

the bound.

of P r e c o n d i t i o n i n g

L e t B = /~.H

!

Matrix

and B i s p o s i t i v e

definite

and H e r m i t i a n



In t e r m s

ON PRECONDITIONING A NONSINGULAR MATRIX

243

of B, we will show that $(D*AAHD *H) < $(DAAHD H) for all complex n by n diagonal matrices D, where D* = diag Ii/bl,''',

i/bn}

and b i = /~-bii" From the properties of the trace and determinant n

T(DAAHDH)

=

A (DAAHD H) =

2

~IdiiI i=l

b.. 11

(6)

dii 12

A (B) .

(7)

2

Hereafter

let d i = Idiil



Now

for a minimum

Bln~/~d i = nbii/ T(DAAHD H) - i/d i.

This is equivalent

to the linear system n b q=l

d - nbiid i = 0 qq q

whose matrix is singular and of rank n-1. parameter we choose

(8)

As a minimizing

di*-- 1/b~iii -- i/h which clearly satisfies (8). To prove that D* diag Id[, d~,...,d*In gives a minimum we must show that the Hessian matrix ~ with entries T..

I~

= a21n¢

0d i 0 ~ J

244

D. W. NICHOLSON

evaluated at dp, * is positive definite. vector q with elements qi' ~T_ ~ = ~ inii

1

But for an arbitrary

(qibii)2 - n7 qibii 21 > 0, i=l

and hence ~ is positive definite and the minimizing property of D* is proved. Substituting gives

II~ x 1{/11 x I{ <_ ~ *(D*~HD *R) <__ 2

biilA (B)

II~AII/IIAII

llaAllllIAII.

i

n

i~l bii >- A(B) .) Applications i.

Algorithm

We discuss the use of the foregoing results in computation. To avoid introducing rounding errors, b i is replaced by the nearest integral power to the base 8 of the computer, similar to the procedure used in Reference (6). The operation round (bi) = 8 s will mean b i - 8/2 ~ 8 s ~ b i + 8/2. The algorithm is as follows. i.

Calculate: n

bii = ~ aiq a. q=l lq 2.

dii = i/ b i = round (I/~bii)

ON PRECONDITIONING A NONSINGULAR MATRIX

3.

Calculate

4.

Solve the linear problem,

245

D*A. say by the conjugate gradient

method. Note that the diagonal round operation

2.

elements of D*BD are all unity if the

is not performed.

Simple Example

Consider the nearly singular matrix

xI +e

where £/x << i.

x 2.

To first order in e, we find k(A) = o l(A)/sn(A) k(D*A)

= (x2+l)2/2xe

= (x2+l)/c

and hence k(D*A)/k(A)

= 2x/[x2+l].

So the advantage of preconditioning

in this example

as the range of the diagonal elements is seen for equal diagonal

is greater.

is greater No advantage

elements no matter how nearly

singular the matrix is. 3.

Tridia@onal

Stiffness Matrix

The finite element method Hermitian

positive definite matrices.

tridiagonal elements.

frequently

example Extension

similar path.

leads to sparse banded, We consider a simple

such as might arise from using line to higher order problems

follows a

246

D. W. NICHOLSON

Consider 2 n n u m b e r s

ei

We introduce

and B i where e I = 0.

the

t r i d i a g o n a l m a t r i x K w i t h elements 2 kii = ui

ki , i + 1 The m a t r i x expressed

2 + 8i

ki,i-i

= ei+ 1 8 i

is H e r m i t i a n

= uiSi-i

kij = 0 otherwise.

and p osi ti ve

definite

since it can be

as the p r o d u c t K = LL H

where

lii = 8 i

li,i- 1

lij = 0 There exists Clearly,

a lower b i d i a g o n a l

otherwise.

m a t r i x E* such that E*KE *H = I.

E* = L -I and (E*KE *H)

By the i n e q u a l i t y follows

= ai

between

that E* m i n i m i z e s

= n n.

the arithmetic

(9)

and g e o m e t r i c means

@(EKE H) for E any b i d i a g o n a l

matrix.

In fact, E* is given by

eii = i/8 i

ei,i_ 1 = _ e i / S i S i + 1

e.. = 0 13

Some effort derivation

otherwise.

shows that these relations leading

to D*.

In co nt ra st

result

d*. = i/ /~i2+8i 2 ii

from the same

to E*, we have

it

ON PRECONDITIONING A NONSINGULAR MATRIX

dij = 0

247

otherwise

and n (~i2+8i2)/ n i=l i=l

¢(D*KD *H) = n n =

nn

2

n [1 + (ui/8 i) 21 i=l

(i0)

in contrast to (9). Finally, note that (~) =

{~

2)2

Ca.2 + 13

l

n 2 n/ n i=lSi

and therefore 2/n n 2 2 (E)/~ (D*ED *H) =~.! (ui2 + 8 2 ) ~ n n ~ (~i + 8i ) 1 i=l

< 1 --

by the inequality between the arithmetic and geometric means. 4.

Extension of the Scaling Results

The results of the foregoing sections can be extended to scaling matrices G k with the main diagonal and first k-1 subdiagonals populated, all other entries being zeroes. The mathematics is involved and will be reported elsewhere. Here we simply state the result. For a positive definite Hermitian scaling matrix, and let Pk be the setting all entries to zero which subdiagonal and above the (k-l) st

matrix P, let G k be the a matrix formed from P by lie below the (k-l) st superdiagonal. We assume that

Pk is positive definite; certainly P1 and P2 are by virtue of Hadamard's inequalities. Then our result is

Tn (GkPGk) min

in A (GkPGk)

= ~(Pk )/~(P)

>_ i.

248

D.W. NICHOLSON

+ and in fact the minimizing matrix G k is such that

G~kPk Gk~H = I. For a diagonal scaling matrix, we see that Pk=Pl=diag ~ _ ~ P 2 2 , Continuin g, G + .... , Pn n J". ~= , -1/2 -1/2 -1/,2 . IPll

.

G +I and G+P I iG+ 1 = I, so that'G[ = diag . . .

' P22

''''' Pnn } . n Also, d(Pl ) = i=iE Pii" Hence

! ,(G nn

PG ) =

~ Pii /4(P). i=l

With the identification P = B = AA H, it readily follows that G~ = D*~ Discussion In the foregoing sections, an error minimization argement has been used to derive a diagonal matrix for preconditioning the linear system of (i).

Several examples have been given

illustrating the advantages of prescaling. algorithm has been presented,

A computational

and the extension of the results

to more general scaling matrices has been described.

To

determine the quantities d i in the diagonal scaling matrix, about 3n 2 operations are needed, compared to about n 3 for elimination.

For large matrices and the conjugate gradient

and other methods,

the present prescaling method involves only

a small fraction of the total computational effort while promoting efficiency. Acknowledgment The author is most grateful to the reviewer

for his extensive

suggestions,particularly of the computational algorithm~

ON PRECONDITIONING A NONSINGULAR MATRIX

249

References i. 2. 3. 4. 5. 6.

Luenberger, D. C., An Introduction to Linear and Nonlinear programming, Adison-Wesley, London, 1973. Forsythe, G. E. and Moler, C., Computer Solutions of Linear Al~ebraic Systems, John Wiley and Sons, New York, 1964. Householder, A. S., The Theory of Matrices in Numerical Analysis, Blaisdell Publishing Co., New York, 1964. Young, D. M., Iterative Solutions of Lar@e Linear Systems, Academic Press, New York, 1971. Kato, T., "Estimation of Iterated Matrices with Application to the von Neumann Condition," Numer. Math, 2 22-29, 1960. Parlett, B. N., and Reinsch, C., "Balancing ~ Matrix for Calculation of Eigenvalues and Eigenvectors," Handbook for Automatic Computation Vol. II Linear Algebra," J. H. Wilkinson and C. Reinsch, Springer-Verlag, Berlin, 1971.