Chapter 9

Iterative Methods of Solution

Iterative or indirect methods are based on reformulating Ax = b as

x^(n+1) = Mx^(n) + c,

where n refers to the iteration number, M is known as the iteration matrix and c is a new vector formed by the reformulation process. This approach requires that x^(n) converges to x, the exact solution, i.e.

x = lim_{n→∞} x^(n).

Convergence depends on the choice of M and should not depend critically upon the choice of x^(0). Ideally, the choice of x^(0) should be based on a priori information on x but, in practice, we can set x^(0) = 0.

Advantages of Iterative Methods

(i) The iteration matrix is virtually unaltered by most methods.
(ii) The methods are generally self-correcting.
(iii) The methods are generally simple to program and require little additional storage.
(iv) The methods lend themselves to the application of parallel processing.

Disadvantages of Iterative Methods

(i) In general, there is no guarantee of convergence.
(ii) Even when the method converges, the number of iterations can be large.
(iii) The computing time can be large if many iterations are required and the system of equations is large.

As a general rule of thumb, iterative methods are recommended only for large linear systems of equations Ax = b in which A is a large sparse matrix. Systems of this type frequently occur in numerical solutions to partial differential equations using finite difference analysis and (to a lesser extent) finite element analysis. They also occur in cases when the impulse response function of a system is characterized by a matrix which is banded and sparse. The types of iterative techniques discussed in this chapter are: Jacobi's method, the Gauss-Seidel method, the relaxation method and the conjugate gradient method, which is a semi-iterative technique. The conditions for convergence (which are discussed later on in this chapter) are as follows:

(i) A sufficient condition for convergence is that A is strictly diagonally dominant, i.e.

| a_ii | > Σ_{j=1, j≠i}^n | a_ij |.

(ii) A necessary and sufficient condition for convergence is that

p(M) < 1 where p(M) is the 'spectral radius', given by

p(M) = max_i | λ_i |.

Here, λ_i are the eigenvalues of M. The eigenvalue of largest modulus of the iteration matrix must therefore be less than unity to ensure convergence.
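This condition is easy to check numerically. The following Python fragment is a minimal sketch (the function name and the use of NumPy's eigvals routine are illustrative choices, not part of the text above): it computes p(M) for a given iteration matrix and tests whether it is less than unity.

    import numpy as np

    def spectral_radius(M):
        """Return p(M) = max_i |lambda_i|, the largest eigenvalue modulus of M."""
        return max(abs(np.linalg.eigvals(M)))

    # Example: a small iteration matrix; the process converges only if p(M) < 1.
    M = np.array([[0.0, 0.5],
                  [0.25, 0.0]])
    print(spectral_radius(M), spectral_radius(M) < 1)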

9.1 Basic Ideas

Consider the following system of equations:

( a_11  a_12  ...  a_1n ) ( x_1 )   ( b_1 )
( a_21  a_22  ...  a_2n ) ( x_2 ) = ( b_2 )
(  ...   ...        ... ) ( ... )   ( ... )
( a_n1  a_n2  ...  a_nn ) ( x_n )   ( b_n )

We can rewrite this set of equations as

x_1 = (1/a_11)( b_1 - a_12 x_2 - ... - a_1n x_n ),
x_2 = (1/a_22)( b_2 - a_21 x_1 - ... - a_2n x_n ),
...
x_n = (1/a_nn)( b_n - a_n1 x_1 - ... - a_n,n-1 x_n-1 ).

Clearly this requires that a_ii ≠ 0, i.e. the diagonal elements are non-zero.

9.1.1 Jacobi's Method

Jacobi's method involves the most basic iteration process of the type

x_i^(k+1) = (1/a_ii)( b_i - Σ_{j=1, j≠i}^n a_ij x_j^(k) ),   i = 1, 2, ..., n.

9.1.2 The Gauss-Seidel Method

The Gauss-Seidel method involves using the updated elements of x (those multiplying the sub-diagonal elements of A) as the computation proceeds. The iteration process is

x_i^(k+1) = (1/a_ii)( b_i - Σ_{j=1}^{i-1} a_ij x_j^(k+1) - Σ_{j=i+1}^{n} a_ij x_j^(k) ).

9.1.3 The Relaxation or Chebyshev Method

The relaxation method involves using a 'relaxation parameter' to 'tune' the system in order to reduce the number of iterations required. The iteration process is

x_i^(k+1) = (1 - w)x_i^(k) + (w/a_ii)( b_i - Σ_{j=1}^{i-1} a_ij x_j^(k+1) - Σ_{j=i+1}^{n} a_ij x_j^(k) ),

where w is called the 'relaxation parameter'. The value of this parameter affects (and can increase) the rate of convergence. There are two cases: 1 < w < 2, giving the so-called 'Successive-Over-Relaxation' or SOR method, and 0 < w < 1, known as the 'Successive-Under-Relaxation' or SUR method.


9.2 Iterative Methods

In the previous section the basic ideas were introduced. In this section, we develop the approach further. We start with b = Ax, where A is an n x n matrix, and write this equation in the form

b_i = Σ_{j=1}^n a_ij x_j = Σ_{j=1, j≠i}^n a_ij x_j + a_ii x_i.

Rearranging,

x_i = (1/a_ii)( b_i - Σ_{j=1, j≠i}^n a_ij x_j ),

and iterating,

x_i^(k+1) = (1/a_ii)( b_i - Σ_{j=1, j≠i}^n a_ij x_j^(k) ),

where

x_i^(1) = b_i / a_ii,

which is equivalent to setting x_i^(0) = 0 ∀i. The above iterative scheme is Jacobi's method. Gauss-Seidel iteration is given by

x_i^(k+1) = (1/a_ii)( b_i - Σ_{j=1}^{i-1} a_ij x_j^(k+1) - Σ_{j=i+1}^{n} a_ij x_j^(k) )

and the relaxation method is compounded in the result

x_i^(k+1) = (1 - w)x_i^(k) + (w/a_ii)( b_i - Σ_{j=1}^{i-1} a_ij x_j^(k+1) - Σ_{j=i+1}^{n} a_ij x_j^(k) ).

It is easy to translate the formulae above into code, which requires a loop to compute the sums, a loop over i and a loop over k. Note that if w = 1, then the relaxation method reduces to Gauss-Seidel iteration. For w ≠ 1, the relaxation method can increase the rate of convergence, depending on the characteristics of the system.
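As a rough illustration of these loops, the following Python sketch implements Jacobi iteration and a relaxation sweep (which reduces to Gauss-Seidel iteration when w = 1); the function names, the stopping test on the change between sweeps and the default tolerance are illustrative choices rather than part of the text above.

    import numpy as np

    def jacobi(A, b, x0=None, tol=1e-6, max_iter=500):
        """Jacobi iteration: every update uses only values from the previous sweep."""
        n = len(b)
        x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float).copy()
        for k in range(max_iter):
            x_new = np.empty(n)
            for i in range(n):
                s = sum(A[i, j] * x[j] for j in range(n) if j != i)
                x_new[i] = (b[i] - s) / A[i, i]
            if np.max(np.abs(x_new - x)) < tol:
                return x_new, k + 1
            x = x_new
        return x, max_iter

    def relaxation(A, b, w=1.0, x0=None, tol=1e-6, max_iter=500):
        """Relaxation sweep: w = 1 gives Gauss-Seidel, 1 < w < 2 SOR, 0 < w < 1 SUR."""
        n = len(b)
        x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float).copy()
        for k in range(max_iter):
            x_old = x.copy()
            for i in range(n):
                # x[j] already holds the updated values for j < i (Gauss-Seidel style).
                s = sum(A[i, j] * x[j] for j in range(n) if j != i)
                x[i] = (1 - w) * x[i] + w * (b[i] - s) / A[i, i]
            if np.max(np.abs(x - x_old)) < tol:
                return x, k + 1
        return x, max_iter

Here A is assumed to be a NumPy array; a practical code for large sparse systems would store and sweep over only the non-zero entries.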

9.3 Example of the Iteration Method

Suppose we are required to solve

10x_1 - x_2 + x_3 = 20,
-x_1 + 20x_2 - 2x_3 = 16,
2x_1 - x_2 + 20x_3 = 23.


We re-write this system as

x_1 = (20 + x_2 - x_3)/10,
x_2 = (16 + x_1 + 2x_3)/20,
x_3 = (23 - 2x_1 + x_2)/20.

Note that the characteristic matrix A of this system is diagonally dominant and that convergence is likely because the factors 1/a_ii are small. Applying Jacobi iteration, we have

x_1^(n+1) = (20 + x_2^(n) - x_3^(n))/10,
x_2^(n+1) = (16 + x_1^(n) + 2x_3^(n))/20,
x_3^(n+1) = (23 - 2x_1^(n) + x_2^(n))/20,

and taking x^(0) = 0 gives

x^(0) = (0, 0, 0)^T,
x^(1) = (2.0, 0.8, 1.15)^T,
x^(2) = (1.965, 1.015, 0.990)^T,
x^(3) = (2.0025, 0.99725, 1.003)^T.

Applying Gauss-Seidel iteration, we get

x_1^(n+1) = (20 + x_2^(n) - x_3^(n))/10,
x_2^(n+1) = (16 + x_1^(n+1) + 2x_3^(n))/20,
x_3^(n+1) = (23 - 2x_1^(n+1) + x_2^(n+1))/20,

and with x^(0) = 0, we have

x^(0) = (0, 0, 0)^T,
x^(1) = (2.0, 0.9, 0.995)^T,
x^(2) = (1.995, 0.999475, 1.000024)^T,
x^(3) = (2.000055, 1.000050, 0.999995)^T.

The exact solution vector is x = (2, 1, 1)^T.
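The iterates above can be checked with a few lines of Python. The sketch below assumes the 3 x 3 system as written out above and performs three plain Gauss-Seidel sweeps from x^(0) = 0; the first sweep should reproduce (2.0, 0.9, 0.995).

    import numpy as np

    # The 3x3 system of the example above.
    A = np.array([[10.0, -1.0,  1.0],
                  [-1.0, 20.0, -2.0],
                  [ 2.0, -1.0, 20.0]])
    b = np.array([20.0, 16.0, 23.0])

    x = np.zeros(3)
    for n in range(3):                      # three Gauss-Seidel sweeps
        for i in range(3):
            s = sum(A[i, j] * x[j] for j in range(3) if j != i)
            x[i] = (b[i] - s) / A[i, i]
        print(n + 1, x)                     # x^(1), x^(2), x^(3)

    print("residual:", np.linalg.norm(b - A @ x))   # small: x is close to (2, 1, 1)^T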


9.4 General Formalism

Given Ax = b, we decompose the characteristic matrix A into lower triangular L, diagonal D and upper triangular U matrices, i.e.

A = L + D + U

or

Characteristic Matrix = Lower triangular + Diagonal + Upper triangular.

Then,

(L + D + U)x = b,
Dx = -Lx - Ux + b,

and

x = -D^(-1)(L + U)x + D^(-1)b,

where

D^(-1) = diag( 1/a_11, 1/a_22, ..., 1/a_nn ).

The iteration methods can then be written as follows:

Jacobi iteration:

x^(n+1) = -D^(-1)(L + U)x^(n) + D^(-1)b.

Gauss-Seidel iteration:

(D + L)x^(n+1) = -Ux^(n) + b,

or after rearranging,

x^(n+1) = -(D + L)^(-1)Ux^(n) + (D + L)^(-1)b.

Relaxation iteration:

(D + wL)x^(n+1) = -[(w - 1)D + wU]x^(n) + wb,

which after rearranging gives

x^(n+1) = -(D + wL)^(-1)[(w - 1)D + wU]x^(n) + w(D + wL)^(-1)b.

These results are compounded in the table below.

Method          Iteration Matrix M                     Vector c
Jacobi          -D^(-1)(L + U)                         D^(-1)b
Gauss-Seidel    -(D + L)^(-1)U                         (D + L)^(-1)b
Relaxation      -(D + wL)^(-1)[(w - 1)D + wU]          w(D + wL)^(-1)b
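In a matrix-oriented language the entries of this table can be formed directly from the splitting A = L + D + U. The following Python sketch is illustrative only: the function name is an arbitrary choice and it forms explicit inverses, which a practical code for large systems would avoid.

    import numpy as np

    def iteration_matrix(A, b, method="jacobi", w=1.5):
        """Return (M, c) for the splitting A = L + D + U, following the table above."""
        D = np.diag(np.diag(A))
        L = np.tril(A, -1)
        U = np.triu(A, 1)
        if method == "jacobi":
            N = np.linalg.inv(D)
            return -N @ (L + U), N @ b
        if method == "gauss-seidel":
            N = np.linalg.inv(D + L)
            return -N @ U, N @ b
        # relaxation (SOR/SUR) with parameter w
        N = np.linalg.inv(D + w * L)
        return -N @ ((w - 1) * D + w * U), w * (N @ b)

    # One iteration step is then simply x_new = M @ x + c.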


SOR v. SUR

When do we use under (w < 1) and over (w > 1) relaxation? Experimental results suggest the following:

(i) Use w > 1 when Gauss-Seidel iteration converges monotonically (the most common case).
(ii) Use w < 1 when Gauss-Seidel iteration is oscillatory.

Note that, as a general rule of thumb, if A is diagonally dominant then SOR or SUR iteration is only slightly faster than Gauss-Seidel iteration. However, if A is nearly or not diagonally dominant, then SOR or SUR can provide a noticeable improvement.

9.5 Criterion for Convergence of Iterative Methods

There are two criteria for convergence:

(i) Sufficient condition: A is strictly diagonally dominant.
(ii) Necessary and sufficient condition: the iteration matrix M has eigenvalues λ_i such that max_i | λ_i | < 1.

9.5.1 Diagonal Dominance

The matrix

A = ( 5  2  -1 )
    ( 1  6  -3 )
    ( 2  1   4 )

is diagonally dominant because

5 > 2 + |-1|,   6 > 1 + |-3|,   4 > 2 + 1,

and a matrix such as

( 1  2  -2 )
( 1  1   1 )
( 2  2   1 )

is not diagonally dominant because

1 < 2 + |-2|,   1 < 1 + 1,   1 < 2 + 2.


In general, A is diagonally dominant if

| a_ii | > Σ_{j=1, j≠i}^n | a_ij |   ∀i.

9.5.2 Sufficient Condition for Convergence

Let us define the error e^(n) at each iteration n as

e^(n) = x^(n) - x,

where x is the exact solution. Now, since

x^(n+1) = Mx^(n) + c,

we have

x + e^(n+1) = M(x + e^(n)) + c.

If this process is appropriate to obtain a solution, then we must have

x = Mx + c.

From the equations above, we get

e^(n+1) = Me^(n),

i.e.

e^(1) = Me^(0),
e^(2) = Me^(1) = M(Me^(0)) = M^2 e^(0),
e^(3) = Me^(2) = M(M^2 e^(0)) = M^3 e^(0),

and so on, giving e^(n) = M^n e^(0). Now, for global convergence, we require that e^(n) → 0 as n → ∞, or

lim_{n→∞} M^n e^(0) = 0   ∀ e^(0).

This will occur if

|| M^n || → 0 as n → ∞.

But

|| M^n || ≤ || M ||^n

and hence, we require that

|| M ||^n → 0 as n → ∞

or

|| M || < 1.

Consider Jacobi iteration, where

M = -D^(-1)(L + U).

For the ℓ_∞ norm,

|| M ||_∞ = max_i Σ_{j=1, j≠i}^n | a_ij | / | a_ii | < 1,

and therefore

| a_ii | > Σ_{j=1, j≠i}^n | a_ij |.

Diagonal dominance is a sufficient but not a necessary condition for convergence. However, in general, the greater the diagonal dominance, the greater the rate of convergence.
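A row-by-row test for strict diagonal dominance is easily coded; the following Python fragment is a minimal sketch (the function name is an illustrative choice):

    import numpy as np

    def is_strictly_diagonally_dominant(A):
        """True if |a_ii| > sum_{j != i} |a_ij| for every row i."""
        A = np.asarray(A, dtype=float)
        off_diag = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
        return bool(np.all(np.abs(np.diag(A)) > off_diag))

    print(is_strictly_diagonally_dominant([[5, 2, -1], [1, 6, -3], [2, 1, 4]]))  # True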

9.5.3 Proof of the Necessary Condition for Convergence

For convergence, we require that

lim_{n→∞} e^(n) = 0.

Let M be the iteration matrix associated with a given process and consider the eigenvalue equation

Mv_i = λ_i v_i,

where λ_i are the eigenvalues of M and v_i are the eigenvectors of M. (N.B. Eigenvalues and eigenvectors are discussed in the following chapter.) Let us now write the error vector e^(0) as

e^(0) = Σ_{i=1}^n a_i v_i

where the a_i are appropriate scalars. From previous results we have

e^(1) = Me^(0) = Σ_{i=1}^n a_i Mv_i.

Hence,

e^(1) = Σ_{i=1}^n a_i λ_i v_i

since

Mv_i = λ_i v_i.

Further,

e^(2) = Me^(1) = Σ_{i=1}^n a_i λ_i Mv_i = Σ_{i=1}^n a_i λ_i^2 v_i,

and by induction,

e^(n) = Σ_{i=1}^n a_i λ_i^n v_i.

Now, we want e^(n) → 0 as n → ∞. But this will only occur if

λ_i^n → 0 as n → ∞   ∀i,

a condition which requires that

p(M) = max_i | λ_i | < 1.

The parameter p(M) is called the 'spectral radius'. This result leads to the following recipe for establishing the convergence of an iterative process for solving Ax = b: Given

x^(n+1) = Mx^(n) + c,

(i) find the largest eigenvalue of M;
(ii) if this eigenvalue is < 1 in modulus, then the process converges;
(iii) if this eigenvalue is > 1 in modulus, then the process diverges.

Example Find the spectral radii for M_J and M_GS given that

A = ( -4   1   0 )
    (  1  -4   1 )
    (  0   1  -4 ).

The iteration matrix for Jacobi's method is

M_J = (  0    1/4    0  )
      ( 1/4    0    1/4 )
      (  0    1/4    0  ).

The eigenvalues of M_J can be found by solving the equation (see Chapter 10)

| M_J - λI | = 0

for λ, i.e.

| -λ    1/4    0  |
| 1/4   -λ    1/4 | = 0,
|  0    1/4   -λ  |

or, expanding along the first row,

-λ( λ^2 - 1/16 ) - (1/4)( (1/4)(-λ) - 0 ) = 0.

Thus,

λ( λ^2 - 1/16 ) - λ/16 = 0

and

λ = 0 or λ = ±1/√8.

Hence,

p(M_J) = max_i | λ_i | = 1/√8 ≈ 0.3536.

The Gauss-Seidel iteration matrix is given by

M_GS = -(D + L)^(-1)U.

Using Jordan's method (see Chapter 7) to find (D + L)^(-1), we get

(D + L)^(-1) = -( 1/4    0     0  )
                ( 1/16  1/4    0  )
                ( 1/64  1/16  1/4 )

so that

M_GS = ( 0   1/4    0   )
       ( 0   1/16   1/4 )
       ( 0   1/64   1/16 ).

The eigenvalues of M_GS are then obtained by solving

| -λ     1/4        0     |
|  0    1/16 - λ   1/4    | = 0,
|  0    1/64      1/16 - λ |

giving

λ = 0, 0 or 1/8.

Therefore

p(M_GS) = 0.125.
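Both spectral radii can be verified numerically. The following Python sketch (an illustration using NumPy, not part of the example above) builds M_J and M_GS from the splitting of A and prints values of approximately 0.3536 and 0.125:

    import numpy as np

    A = np.array([[-4.0,  1.0,  0.0],
                  [ 1.0, -4.0,  1.0],
                  [ 0.0,  1.0, -4.0]])
    D = np.diag(np.diag(A))
    L = np.tril(A, -1)
    U = np.triu(A, 1)

    M_J = -np.linalg.inv(D) @ (L + U)        # Jacobi iteration matrix
    M_GS = -np.linalg.inv(D + L) @ U         # Gauss-Seidel iteration matrix

    rho = lambda M: max(abs(np.linalg.eigvals(M)))
    print(rho(M_J), rho(M_GS))               # approximately 0.3536 and 0.125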

9.5.4 Estimating the Number of Iterations

The smaller | λ |_max, the faster e^(n) → 0 as n → ∞. Hence, the magnitude of p(M) indicates how fast the iteration process converges. This observation can be used to estimate the number of iterations. If n is the number of iterations required to obtain a solution vector accurate to d decimal places, then

p^n ≈ 10^(-d),

so that n is given by

n ≈ -d / log_10 p.

In the SOR method, the iteration matrix is a function of the relaxation parameter w. Hence the spectral radius p is also a function of w. The optimum value of w is the one for which p is a minimum. In some simple cases, the optimum value of w can be computed analytically. However, in practice, it is common to sweep through a range of values of w (in steps of 0.1 for example) using an appropriate computer program and study the number of iterations required to generate a given solution. The relaxation parameter w is then set at the value which gives the least number of iterations required to produce a solution with a given accuracy.
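The estimate n ≈ -d / log_10 p and the sweep over w can be coded as follows; the test system, the step size of 0.1 and the routine names below are illustrative choices rather than part of the text above.

    import numpy as np
    from math import ceil, log10

    def iterations_needed(rho, d):
        """Estimate of the iteration count for d decimal places: n ~ -d / log10(rho)."""
        return ceil(-d / log10(rho))

    print(iterations_needed(0.3536, 6), iterations_needed(0.125, 6))  # Jacobi vs Gauss-Seidel

    def sor_iterations(A, b, w, tol=1e-6, max_iter=1000):
        """Count the relaxation sweeps needed to reach the tolerance."""
        n = len(b)
        x = np.zeros(n)
        for k in range(max_iter):
            x_old = x.copy()
            for i in range(n):
                s = sum(A[i, j] * x[j] for j in range(n) if j != i)
                x[i] = (1 - w) * x[i] + w * (b[i] - s) / A[i, i]
            if np.max(np.abs(x - x_old)) < tol:
                return k + 1
        return max_iter

    # Sweep w in steps of 0.1 and report the cheapest value for a sample system.
    A = np.array([[-4.0, 1.0, 0.0], [1.0, -4.0, 1.0], [0.0, 1.0, -4.0]])
    b = np.array([1.0, 1.0, 1.0])
    counts = {round(w, 1): sor_iterations(A, b, w) for w in np.arange(0.5, 2.0, 0.1)}
    print(min(counts, key=counts.get), counts)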


9.5.5 The Stein-Rosenberg Theorem

In general, p(M_GS) ≤ p(M_J). Therefore Gauss-Seidel iteration usually converges faster than Jacobi iteration for those cases in which both methods of iteration can be applied. The comparisons between Jacobi and Gauss-Seidel iteration are compounded in the Stein-Rosenberg theorem: if M_J contains no negative elements, then there are the following four possibilities:

Convergence
(i) p(M_J) = p(M_GS) = 0;
(ii) 0 < p(M_GS) < p(M_J) < 1.

Divergence
(iii) p(M_J) = p(M_GS) = 1;
(iv) 1 < p(M_J) < p(M_GS).

9.6 The Conjugate Gradient Method

The conjugate gradient method can be used to solve Ax = b when A is symmetric positive definite. We start by defining the term 'conjugate'. Two vectors v and w are conjugate if v^T Aw = 0, where v^T Av > 0 ∀v. The basic idea is to find a set of linearly independent vectors (v_1, v_2, ..., v_n) which are conjugate with respect to A, i.e.

v_i^T A v_j = 0,   i ≠ j.

If we expand x as a linear combination of these basis vectors,

x = Σ_{j=1}^n a_j v_j,

then the problem is reduced to finding the coefficients a_j. Now, the equation Ax = b becomes

Σ_{j=1}^n a_j A v_j = b.

Multiplying both sides of this equation by v_i^T we get

Σ_{j=1}^n a_j v_i^T A v_j = v_i^T b.

Now, since v_i^T A v_j = 0 if i ≠ j, we have

Σ_{j=1}^n a_j v_i^T A v_j = a_i v_i^T A v_i = v_i^T b.


Thus,

a_i = v_i^T b / ( v_i^T A v_i ),

and the solution becomes

x = Σ_{i=1}^n ( v_i^T b / v_i^T A v_i ) v_i.

This solution only requires the computation of Av_i and the scalar products. Therefore, the computation is easy once the conjugate directions v_i are known. In principle, the conjugate gradient method is a direct method. However, in practice, an iterative solution is required to find the conjugate directions using a 'Gram-Schmidt' type process.
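Given a complete set of conjugate directions, the solution formula above is only a few lines of code. The following Python sketch is illustrative (the directions are assumed to be supplied by the caller as the columns of a matrix V):

    import numpy as np

    def solve_with_conjugate_directions(A, b, V):
        """x = sum_i (v_i^T b / v_i^T A v_i) v_i, the columns of V being A-conjugate."""
        x = np.zeros(len(b))
        for i in range(V.shape[1]):
            v = V[:, i]
            x += (v @ b) / (v @ A @ v) * v
        return x

For a symmetric positive definite A, any n linearly independent, mutually conjugate directions will do; the practical algorithm discussed below generates them as the iteration proceeds.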

9.6.1 The Gram-Schmidt Process

Our problem is, given a basis (v_1, v_2, ..., v_n) of the vector space R^n with inner product denoted by (·,·), to find an orthonormal basis (w_1, w_2, ..., w_n) of R^n. The Gram-Schmidt process is based on the following procedure:

w_1 = v_1,
w_2 = v_2 - [ (v_2, w_1) / ||w_1||^2 ] w_1,
w_3 = v_3 - [ (v_3, w_2) / ||w_2||^2 ] w_2 - [ (v_3, w_1) / ||w_1||^2 ] w_1,
...

where

(x, y) ≡ x^T y,   ||x|| ≡ √(x, x).

The orthonormal basis is then given by

( w_1/||w_1||, w_2/||w_2||, ..., w_n/||w_n|| ),

i.e. each w_i is normalized to unit length.
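The procedure translates directly into code. The following Python sketch orthogonalises a list of vectors and normalises them as it goes; the two input vectors at the end are simply illustrative choices.

    import numpy as np

    def gram_schmidt(vectors):
        """Gram-Schmidt: returns an orthonormal basis spanning the input vectors."""
        basis = []
        for v in vectors:
            w = np.asarray(v, dtype=float).copy()
            for u in basis:                  # subtract the projections onto earlier w's
                w -= (w @ u) * u             # u is already of unit length
            basis.append(w / np.linalg.norm(w))
        return basis

    # Illustrative input: two vectors in R^3.
    for w in gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]):
        print(w)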

9.6.2 Example of the Gram-Schmidt Process

Consider two linearly independent vectors v_1 and v_2. Then, applying the procedure above, w_1 = v_1 and

w_2 = v_2 - [ (v_2, w_1) / ||w_1||^2 ] w_1.

Therefore

w_1^T w_2 = 0,   w_2^T w_1 = 0,

i.e. w_1 and w_2 are orthogonal, and the orthonormal basis is

( w_1/||w_1||, w_2/||w_2|| ).

9.6.3 Practical Algorithm for the Conjugate Gradient Method

To solve Ax = b, we begin with an arbitrary starting vector, x_0 = (1, 1, ..., 1)^T say, and compute the residual r_0 = b - Ax_0. We then set v_0 = r_0 and feed the result into the following iterative loop (translated into code in the sketch after the list of important points below):

(i) compute the coefficients of the conjugate gradients;

(ii) update x;

(iii) compute the new residual;

(iv) compute the next conjugate gradient direction using the Gram-Schmidt process.

Important points

1. The conjugate directions are v_0, v_1, v_2, ..., v_n.

2. The matrix A is only used to compute Av_i.

3. The process is stopped when ||r_i|| ≤ ε, i.e. the required tolerance.

4. The algorithm works well when A is large and sparse.
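A minimal version of steps (i)-(iv) is sketched below in Python; the particular formulae used for the step length alpha and for the Gram-Schmidt coefficient beta are one standard variant of the method and are given here as an illustration rather than as a transcription of the steps above.

    import numpy as np

    def conjugate_gradient(A, b, tol=1e-8, max_iter=None):
        """Conjugate gradient method for Ax = b with A symmetric positive definite."""
        n = len(b)
        max_iter = n if max_iter is None else max_iter
        x = np.ones(n)                    # arbitrary starting vector x_0 = (1, 1, ..., 1)^T
        r = b - A @ x                     # initial residual
        v = r.copy()                      # first conjugate direction
        for _ in range(max_iter):
            if np.linalg.norm(r) <= tol:  # stop when ||r_i|| <= epsilon
                break
            Av = A @ v                    # A is only used through the product Av
            alpha = (r @ r) / (v @ Av)    # (i) coefficient of the conjugate direction
            x = x + alpha * v             # (ii) update x
            r_new = r - alpha * Av        # (iii) new residual
            beta = (r_new @ r_new) / (r @ r)  # (iv) Gram-Schmidt-type coefficient
            v = r_new + beta * v          # next conjugate direction
            r = r_new
        return x

    A = np.array([[4.0, 1.0], [1.0, 3.0]])
    b = np.array([1.0, 2.0])
    print(conjugate_gradient(A, b))       # approximately [0.0909, 0.6364]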


We conclude this chapter with a schematic diagram (Diagram 9.1) illustrating the principal iterative approaches to solving Ax = b, as given below.

Diagram 9.1: Schematic diagram illustrating the iterative methods. For Ax = b where A is sparse, symmetric and positive definite, the conjugate gradient method yields x. For Ax = b where A is sparse and symmetric or non-symmetric, with p(M) < 1, start with SOR at w = 1 (i.e. Gauss-Seidel iteration); if the convergence is monotonic, continue with SOR, and if it is oscillatory, continue with SUR, to obtain x.

9.7 Summary of Important Results

Iterative Method

x^(n+1) = Mx^(n) + c,   with || M || < 1

Jacobi Iteration

M = -D^(-1)(L + U),   c = D^(-1)b

Gauss-Seidel Iteration

M = -(D + L)^(-1)U,   c = (D + L)^(-1)b

Relaxation Iteration

M = -(D + wL)^(-1)[(w - 1)D + wU],   c = w(D + wL)^(-1)b

SOR Method: 1 < w < 2;   SUR Method: 0 < w < 1


Sufficient Condition for Convergence

| a_ii | > Σ_{j=1, j≠i}^n | a_ij |

Necessary Condition for Convergence

p(M) = max_i | λ_i | < 1

Conjugate Gradient Solution

x = Σ_{i=1}^n ( v_i^T b / v_i^T A v_i ) v_i,   where v_i^T A v_j = 0 if i ≠ j

9.8 Further Reading

• Traub J F, Iterative Methods for the Solution of Equations, Prentice-Hall, 1964.
• Smith G D, Numerical Solution of Partial Differential Equations, Oxford University Press, 1978.
• Fogiel M, The Linear Algebra Problem Solver, Research and Education Association, 1986.
• Gerald C F and Wheatley P O, Applied Numerical Analysis, Addison-Wesley, 1989.
• Ciarlet P G, Introduction to Numerical Linear Algebra and Optimization, Cambridge University Press, 1989.

9.9 Problems

9.1 (i) Solve the following system of equations using Gauss-Seidel iteration:

-2x_1 + x_2 = -1,
x_1 - 2x_2 + x_3 = 0,
x_2 - 2x_3 + x_4 = 0,
x_3 - 2x_4 = 0.

Observe the number of iterations that are required for the process to converge, giving a solution that is correct to 2 decimal places.


(ii) Compare the number of iterations required to solve Ax = (2, 4, 6, 8)^T using the Jacobi and then the Gauss-Seidel method, where

A = ( 4  1  0  0 )
    ( 1  4  1  0 )
    ( 0  1  4  1 )
    ( 0  0  1  4 ).

9.2 Consider the following equations:

x_1 + 6x_2 - 3x_3 = 4,
2x_1 + x_2 + 4x_3 = 7.

State whether or not the characteristic matrix of this system is diagonally dominant. Compute the Jacobi and Gauss-Seidel iteration matrices for this system and solve these equations using Jacobi iteration and then Gauss-Seidel iteration. Use a 'stopping criterion' given by

max_i | x_i^(n+1) - x_i^(n) | < (1/2) × 10^(-2).

If we wanted to improve the rate of convergence of the Gauss-Seidel process, should we use SOR or SUR? By choosing a suitable value for the relaxation parameter, solve the system of equations above using the relaxation method. Is the rate of convergence significantly improved? Comment on your result.

9.3 Compare the convergence rates of Gauss-Seidel and SOR iteration using a relaxation parameter value of 1.6 to solve the equations:

x_1 + 6x_2 + x_3 = 1,
x_2 + 6x_3 + x_4 = 1,
3x_1 + x_3 - 3x_4 = 1.

Use a 'stopping criterion' of 5 × 10^(-3). Comment on the result.

9.4 For which of the following systems of equations does the (i) Jacobi method and (ii) Gauss-Seidel method converge? Start with the zero vector and use a tolerance of 0.001:

(a)
-4x_1 + x_2 = 1,
x_1 - 4x_2 + x_3 = 1,
x_2 - 4x_3 + x_4 = 1,
x_3 - 4x_4 = 1.


(b)
2x_1 + x_2 + x_3 = 4,
x_1 + 2x_2 + x_3 = 4,
x_1 + x_2 + 2x_3 = 4.

(c)
x_1 + x_2 + x_3 = 0,

9.5 Find the spectral radii of the Jacobi and Gauss-Seidel iteration matrices corresponding to each system given in question 9.4. Hence, comment on the convergence or divergence in each case.

9.6 Which, if any, of the coefficient matrices in question 9.4 are diagonally dominant? Is diagonal dominance a necessary condition for convergence of the Jacobi and Gauss-Seidel methods?

9.7 ||v|| denotes any norm on R^n and ||A|| is the associated norm for an n x n matrix A. Prove that if p(A) denotes the spectral radius of A, then

p(A) ≤ ||A||.

9.8 Find a set of vectors which are mutually conjugate with respect to the matrix

A = (  .    .   -1 )
    (  .    .    . )
    ( -1    1    3 ).

Hence, solve the equation Ax = (4, 6, 4)^T.

9.9 Use the Gram-Schmidt process to solve the equation Ax = (4, 4, 4)^T where