A fast algorithm for solving the tensor product collocation equations

A fast algorithm for solving the tensor product collocation equations

A Fast Algorithm for Solving the Tensor Product Collocation Equations by WEIWEI SUN and N. G. ZAMANI Department Canada of Mathematics and Statistic...

509KB Sizes 1 Downloads 103 Views

A Fast Algorithm for Solving the Tensor Product Collocation Equations by WEIWEI SUN and N. G. ZAMANI

Department Canada

of Mathematics

and Statistics,

University of Windsor, Windsor, Ontario,

ABSTRACT :

A fast algorithm is presentedfor solving the tensor product collocation equations (A, @ B,+B, @ A,)u = b, obtained from the discretization of the Poisson equation in a rectangular region by the collocation method. The Fast Fourier Transformation (FFT) algorithm is employed to achieve the above objective. The operation count is shown to be O(N* log, N) which makes the overall calculations very economical.

I. Introduction Over the past twenty years, fast algorithms in terms of FFT have been used to solve the Poisson equation in a rectangular region (l-3). These algorithms are designed for a system of linear equations by the finite difference method. When the collocation method with the Hermite bicubic interpolation polynomials is applied to solving a Poisson equation in a rectangular region, the system of linear equations can be written in the tensor product form as (A, 0 BY+ B, @ A,)u = RHS.

(1.1)

Since the matrix A, @ BY + B, @ A, is nonsymmetric, standard iteration methods do not converge for such a system (4,5). Most investigators have used the Gauss elimination method for solving Eq. (1.1). Recently, Dyksen (6) has used a tensor product generalized AD1 method for separable elliptic problems to tackle the above equation. He has shown the operation count to be 328 N’ on an N x N grid. This is lower than the operation count 32 N4 of the Gauss elimination method. In this paper, we present a new algorithm. FFT algorithm is used to solve Eq. (1.1). Using a matrix transformation, the two-dimensional problem is reduced to a series of one-dimensional problems. Hence, only lower order matrices are solved in this algorithm and, therefore, the operation count is 32 N2 log, N for an N x N grid. In Section II, we present the algorithm by a similarity transformation Q and a series of simpler one-dimensional problems equivalent to (1.1) are given. In Section III, the eigenvectors of matrix B; ‘A, are found explicitly and matrix Q is constructed. Section IV contains a computational procedure in which FFT algorithm is used and finally, in Section V, a numerical example is considered.

0 The Franklin lnstltute 00164032/89

%3.00+0.00

295

Weiwei Sun and N. G. Zamani II. The Algorithm For the sake of convenience, we consider a Poisson boundary conditions in a rectangular region R, i.e.

equation

with Dirichlet

(X,Y> Efl

v2u = f(X,Y) 4an = 9. Using the Hermite

bicubic

(2.1)

interpolation,

u is approximated

as

W%Y) = : % ~,jri(x)r,(Y)> i= 1 j=]

where ti(x) and u],(y) are one-dimensional Hermite cubic interpolation The collocation method forces U(x, v) to satisfy (2.1) at the Gaussian result is Eq. (1 .l), i.e.

functions. points. The

(A, 0 By +B, 0 A,)u = RHS,

(2.2)

where A 0 B denotes the tensor product (Kronecker product) of A and B. Many properties of tensor products are given in Halmos (7). It has been proven by Dyksen (6) that the eigenvalues of matrix BJ: ‘A, are real, negative and distinct. Hence, there exists a non-singular matrix Q, such that B>:‘A, = Q- ‘AQ where the column vectors of matrix Q are eigenvectors of matrix B,: ‘A,, A is a diagonal matrix, A = diag (A,,, I,,, . , A,,- ,. AN, A_ ,, . . . , A-,, ,) and lli are eigenvalues of B,; ‘A,. . Multiplying both sides of (2.2) by Z@ Q- ’ Bj: ‘, results in (A, 0 I+ B, @ A)21 = RHS,

(2.3)

U = (Z@ Q-‘)u

(2.4)

where

RHS = (Z@ Q-‘B,-‘)RHS. After interchanging

certain

rows and columns

(2.5)

u* = 1

of (2.3), it can be rewritten

A,OI+B,O& A,OI+B,Ol,

as

RHS*.

A,OZ+B,OA-N+,

I

(2.6)

Here u*=B.u

(2.7)

RHS* = R.RHS, and R is the products 296

of some permutation

(2.8)

matrices. Journal

of the Frankhn Pergamon

lnst~tute Press plc

Solving Tensor Product Collocation Thus problem (2.1) is reduced to solving Eq. (2.6), and therefore, of lower order problems, i.e. (A,O1+B,O/Zi)u,*=RHS,*, III. The Eigenvectors

of Matrix

i=O,fl,...,

+(N-l),N.

Equations

solving a series

(2.9)

B; ‘A,

The algorithm in Section II relies on the explicit knowledge of eigenvalues and eigenvectors of matrix B.,: ‘A,, i.e. the solution of a generalized matrix eigenvalue problem A,,v = l_Byv. In Dyksen’s paper (6), the eigenvalues are shown to be

(3.1) where p = h4[(16r4-16r2+3)p-8r2+2] v = h’[(-128r2+48)p+48] w = 192~ and p = tan’ (lhn/2), In this section, we derive a formula problem of differential equation,

1 z = __ 24’

for eigenvectors.

Let us consider

an eigenvalue

p”(X) = &?-p(x) x E (0, L) p(0) = p(L) = 0.

(3.2)

When the collocation method with the Hermite cubic functions (3.2), we obtain a generalized matrix eigenvalue problem (8) : Au = IuBv.

is applied

to

(3.3)

Let p be the Hermite cubic collocation approximation of the eigenvalue problem (3.2) corresponding to eigenvalue 2 and h = L/N, p can be expressed as a piecewise polynomial : pk(x) =a,thk(x-x,)+ck(X-:)2 defined Then

+dkelx,

in [xk, xki ,I, where Xk is midpoint

Vol. 326, No. 2, pp. 29S307, Pnnted in Great Britain

k=O,l,...,N-1, of interval

(3.4)

(xk, xki ,) and xk is node.

1989

297

Weiwei Sun and N. G. Zamani

PkMl) = w*k+

19

PkcT&)= (Bv)*k+2, where xt, and x& are two Gaussian points is [xk, xk+ ,I. Let U, be eigenvector of problem (3.3) corresponding to eigenvalue P=

(Bv,,Bv

,,...,

Bv,_,,Bv,,Bv_

,,...,

A, and

Bv_,+,),

i.e. P=BQ.

(3.5)

Now, we find pp(x). First, let pk(x) satisfy Eq. (3.2) at Gaussian We obtain 2

2

ck+dkTh = ;1 akibkzh+ckT ( Adding

and subtracting ck =

fdk6

point Xkfzh.

r3h3 (3.6)

)

(3.6), gives dk = A(,,,?),

,(,+bkF),

and therefore, ak=

=

bk=

(i-T)dk.

(3.7) into (3.4), pk can be rewritten

Substituting Pk(X)

(k--T)ck,

(3.7)

as

+ (x-22kJ2 )ck+((;-~).(x-%k)+(X-;)3)dk.

;-T

(3.8) By the continuity and Pi(xk+ I) = pi+

conditions ,@k+

1),

we

ofp and p’ at node xk+ ,, i.e. pk(xk+ ,) = pk+ ,(xk+ ,) obtain

r(-ck+Ck+l)

ck+ck+l

=

=

(3.9)

@k+dk+,),

(3.10)

+dk+dk+,),

where 1 z2h2 r=n-2+8,

h2

s=

1 z2h2 h h3 --6 2+48 ( I. )

and

t=--

From (3.9) and (3.10), one can arrive at rt(&,-2dk+dk+,) rt(ck_1_2ck+ck+I)

=s(dk_,+2dk+dk+,)

(3.11)

=s(ck_I+2ck+ck+I),

(3.12)

i.e. 298

Journal of the Franklin Institute Pergamon Press plc

Solving Tensor Product Collocation Equations (rt-s)dk_,--2(rt+s)dk+(rt-s)dk+,

=O,

(3.13)

(rt-s)ck_,

= 0.

(3.14)

and -2(rt+s)c,+(rt-s)ck+,

Hence, ck and dksatisfy the same difference equation. Since r, t and s are independent of k, (3.13) and (3.14) are two difference equations with constant coefficients. The general solution of (3.13) and (3.14) can be expressed by ck = acosk0+fisinkt),

(3.15)

dk = ycoske+6sinktI

(3.16)

and

where rt+s case = ~ rt-s’

sindz2*

rt-s



(3.17)

and a, p, y and 6, are independent of k and only dependent on eigenvalue il. In order to determine ~1,B, y and 6, we now consider the boundary conditions. Since p(O) = p(L) = 0, one can write

i.e. p,(O) = rcO-sd,

= 0.

(3.18)

Similarly, P,,-

, (L)

= rc,,- , +sd,_

, = 0

(3.19)

and from (3.9) and (3.10), we have = s(d,+d,)

r(-cO+cJ

(3.20)

and co+c, which are simplified

(3.21)

= t(-d,+d,),

to tr(-co+c,)-s(co+c,)

= 2std,

(3.22)

and 2rco = rt(-d,+d,)-s(do+d,). The coefficients OC,jI, y and 6 are determined by the boundary conditions (3.19), (3.22) and (3.23). Therefore, by (3.15) and (3.16), we obtain Vol. 326, No. 2, pp. 295-307, Printed in Great Britam

(3.23) (3.18),

1989

299

Weiwei Sun and N. G. Zamani cg = a

do =

Y

c, = acos6+/?sin6 d, = ycos8+6sin8 c,,_ r = CIcos (N-

1)6+/I sin (N-

l)d

and d,,_, = ycos(N-1)8+6sin(N-1)8. Substituting m-q

(3.24)

(3.24) into (3.18), (3.19), (3.22) and (3.23) respectively,

we have

= 0,

~[~c0s(N-l)~+psin(N-i)~]+~[~c0s(N-l)8+6sin(N-1)8] -(tr+s)a+(tr--)(acos@+fisin8)

=0,

= 2s~

and 2ra = -(rt+s)y+(rt-.s)(ycos8+6sinQ, and from these formulae,

(3.25)

the following

p -

are obtained

:

--J-try r

and (&jsrry rt

(3.26)



which satisfy y 2scos(N-

1>e+ -&z-

Since p(x) is an eigenfunction,

e)sin(N-l)Q]

= 0.

(3.27)

y # 0. Then by (3.17), we have

2~~0s (N- l)e+2sc0t

8sin (N-

i)e = 0,

i.e. sin NB = 0, and therefore 300

Journal of the Franklm Institute Pergamon Press plc

Solving Tensor Product Collocation

B,=$

z=o,+1,...,

&-(N-l),N,

Equations

(3.28)

satisfying rtsin’i

8

+~cos*~

6 = 0.

Equation (3.29) is just (18) in (6) and the eigenvalues of problem determined by solving (3.29). Thus the solution of Eqs (3.11) and (3.12) can be expressed by ck =

scoskO+ r

-JsinkB r

y

(3.3) can be

(3.30)

>

and (3.31) By (3.29), we have s

=

Substituting

2’

t.tan-

r

8

2’

into (3.30) and (3.31) results in c _ k-

fsin(k+i)e

_t.tan

2 dk =

Substituting given by PkcX)

J-““.=

_t.tan*!

r

cos@

cos(k+:)O

1

T2h2

j-T+

(x-.&)2 2

(3.33)

Y.

0 cos ~ 2

(3.32) and (3.33) into (3.8), the piecewise

=

(3.32)

Y

eigenfunction

p(x) can be

)e,.((;-$+,)+v’),,

L

(3.34) therefore

Pkki!,)

=

;

(ck

-

thd,)

(3.35)

and Pk(Xk*2)= Vol.326,No.2.pp.295-307.1989 Printed in Great Britain

; (Ck+ z&z).

(3.36)

301

Weiwei Sun and N. G. Zamani Since p(x) is an eigenfunction and for eigenvalue AI, let

and y is an arbitrary

constant

for k, let y = A cos2 40

(3.37)

From

(3.35) and (3.36), we get P!$ =pk(xf,)

= -t,sin:sin(k+i)tl,rhcos:cos(k+i)0,

(3.38)

and #J

= pk(x&) = - tl sin ; sin (k + 4) O1+ zh cos z cos (k + 2) 8,.

Let p=

PV)P'1"...Pi,, p$Z) p\2’ . . . Pg:,

Pjy) Pcl), . . . P!!)N+, P$‘Pcz),

Hence, by (3.5), matrix P can be determined

. . . P”’ N+I

1

(3.39)

.

by

P = R,p, where R, is the product

of some permutation

(3.40) matrices,

and Q is expressed

as

Q = B- ‘RIP.

(3.41)

IV. Use of FFT Algorithm In Section III, Matrix Q has been constructed. If we use the Gauss elimination method to solve Eq. (2.5), the algorithm shown in Section III can be achieved with operation count 18: N3 less than those of the tensor product generalized AD1 method and Gauss elimination method. But our purpose is to find an algorithm with lower order operation count rather than 0(N3). In this section, we shall modify the expression of matrix Q shown in Section III such that FFT algorithm can be applied to (2.4) and (2.5) and the operation count is only 32 N2 log, N. We consider matrix P. By (3.38), we obtain pi;’ -pjj’

= - 2zh cos f! cos (k + 4) 01, 2

(4.1)

0, . p&j)+ph:) = - 2t, sin - sm (k + 4) 0,. 2

(4.2)

Let

302

Journal

of the Franklin Pergamon

Institute Press plc

Solving Tensor Product Collocation

Equations

Q, H&f) = 2 sin - sin (k + 1) el. 2

HI,” = - 2 cos f! cos (k + ‘2)or, 2 For I # 0, 1 # N, we obtain H#) = -cot;

[sin (k + 1)8, -sin

k&l,

(4.3)

Hj$ = -tan:[sin(k+l)B,+sinkB,],

(4.4)

and furthermore jcO~j,j) = -cot:sin(k+l)A,

(4.5)

and iiO(-l)‘P’@;)=

-tan~sin(k+l)8,.

(4.6)

By (4.3) and (4.4), we have, for 1 # 0, 1 # N, Hi:’. = Hi2:,

Hi:’ = Hi”,I7

(4.7)

and Hj$=

-2 3 H&j=0

Hi? = 0,

H#

= -2sin(k+i)z

For the sake of convenience of numerical computation, operations as some matrix transformations. By (4.1)-(4.4), HD = 7. P,

(4.8) we express we obtain

the above

(4.9)

where

where D,,

= diag(-rh,

-zh ,...,

D,2 = diag(0,

-th,.

D2, = diag(0,

-t,,

D22 = diag(-tt,,

-Th),

. . , -zh), -t2 ,..., -t_

--tN_,),

,,.. ., -t_N+l),

and H(l) = (H’!‘),.,, lJ Similarly,

H’*’ = (Hj;))NxN.

by (4.5) and (4.6), we obtain

Vol. 326, No. 2, pp. 295-307, Printed in Great Britain

1989

303

Weiwei Sun and N. G. Zamani

EH = H,y [

D, 01 0 D,

(4.10)

where

-1 0, --anI

D2 =

..

where Z, = (2,4,. . . ,2(N- I))‘, Z2 = (2, -4,. . . , (- 1)N-22(Nz2 = (- l)N-’ 2N and Fs is Fourier sine transformation, i.e. sin8,

sin8,

...

sin Bx_,

sin 28,

sin282

...

sin 2dN_,

l))‘,

zI = 2N,

F, = [ sin (N-i l)O, Thus, a new expression

sin (N-

l)O,

...

sin (N-

1)8,,_,

of matrix P can be given by P = R,I-‘E-‘H,&

(4.11)

where 6=

and matrix

304

Q is also rewritten

DI [ 0

0 D, 02 1

as Journal of the Franklin Institute Pergamon Press

plc

Solving Tensor Product Collocation

(4.12)

Q = B-‘R,i-‘E-‘H,D.

Since F, is a Fourier sine transformation, algorithm proceeds as follows :

Equations

FFT algorithm

can be used. The overall

Step 1

Calculate

and

RHS = (Z@ Q- ‘B_; ‘)RHS

by

1.1

solve R;‘iEwj”

1.2

solve H,Tw\2)= wj*)

1.3

solve bwi3) = wj*),

RHS = wc3),

= RHS,

i=

1,1,...,2N

RHS* = RRHS.

Step 2

Solve equation (A, @ 1+ B, 0 A&,+ = RHS:, and U = R- ‘IA*, where h, are calculated

i=

1,2,...,2N

by (3.1).

Step 3

Calculate

u = (I @ Q) U by 3.1

calculate

zt ‘) = bUi

3.2

calculate

zj” = Hszj’)

3.3

calculate

zj3) = RF ‘iEzj*)

3.4

solve equation

Byzi4) = zj”.

Finally, we obtain ui = zf4), i = 1,2,. . . ,2N. In the above computational procedure, Step 1.2 and Step 3.2 can be achieved by FFT algorithm (1, 2). TABLE I

Comparison in operation count and storage

Method BGl (TP ordering) BG2 (SP ordering) TPGADI FIG* FFT

Operation count

Storage

128 N4 32 N4 328 N3 18: N’ 32 N2 log, N

48 N3

24 N3 12 N2 8N2 8NZ

* Gauss elimination method is applied to solving Eq. (2.4) and calculating (2.5) rather than FFT. From the theoretical results in Table I, FFT is fastest. Vol. 326, No. 2, pp. 295-307, Prmted in Great Britain

1989

305

Weiwei

Sun and N. G. Zamani TABLE II

Comparison in CPU for FFT, FIG and BGl

N = 2/h

CPU

Number of unknowns

FFT

FIG

64 256 576 1296

0.106 0.505 1.136 3.149

0.089 0.493 1.476 4.846

4

8 12 18

time (s) BGl (TP ordering) 0.589 9.239 41.866

Now we consider the performance of this algorithm. The operation count and storage of this algorithm are presented in Table I, and are compared with band Gauss elimination method with partial pivoting, band Gauss elimination method by using some special order of the collocation point (5, 9, lo), and tensor product generalized AD1 methods. V. Numerical Result and Conclusion In order to test our algorithm, we consider a simple example. Here a Poisson equation in region R = {(x, y) IO d x d 2,0 < y ,< 2) is solved by the algorithm in which S is chosen such that u(x, y) = x(2 - x)y(2 - y). The band Gauss elimination method with partial pivoting [BGI (TP ordering)] is also applied to this problem. The results are summarized in Table II. These numerical results were computed on an IBM-438 1. We have derived an algorithm of O(N* log, N) operation which employs FFT algorithm to solve a tensor product collocation equation for the Poisson problem in a rectangular region. We have shown that the operation count of this algorithm is less than any previously published results and less CPU time is needed. Because FFT algorithm is employed, the algorithm is stable. hence it can be used to solve some large obtained.

problems.

For

some

other

boundary

condition,

similar

results

can be

References (1) F. Door, “The direct solution of the discrete Poisson equation on a rectangular”, SIAM Rev., Vol. 12, pp. 248-263, 1970. (2) R. W. Hackney, “A fast direct solution of Poisson’s equation using Fourier analysis”, J. Assoc. Comp. Math., Vol. 12, pp. 95-113, 1965. (3) P. N. Swartztrauber, “The method of Cyclic reduction, Fourier analysis, and the FACR algorithm for the discrete solution of Poisson’s equation on a rectangle”, SIAM Review, Vol. 19, pp. 490-501, 1977. “On the iteration solution of (4) R. Balart, E. N. Houstis and T. S. Papatheodorou, 10th IMACS Congress, pp. 988100. collocation method equations”, (5) W. R. Dyksen and J. R. Rice, “A new ordering scheme for the Hermite bicubic collocation equations”, in “Elliptic Problem Solvers III” (Edited by G. Birkhoff and A. Schoenstadt), pp. 467480, Academic Press, New York, 1984.

306

Journalof the

Franklin lnstrtute Pergamon Press plc

Solving Tensor Product Collocation (6) W. R. Dyksen, “Tensor product generalized ADI methods for separable blems”, SIAM J. Numer. Anal., Vol. 24, pp. 59-76, 1987.

Equations elliptic pro-

(7) P. R. Halmos, “Finite-Dimensional Vector Spaces”, 2nd edn, D. Van Nostrand Company, Inc., Princeton, N.J., 1958. “A collection based finite element method for eigenvalue cal(8) N. G. Zamani, culations”, J. Franklin Inst., Vol. 324, No. 2, pp. 2055217, 1987. (9) W. R. Dyksen, R. E. Lynch, J. R. Rice and E. N. Houstis, “The performance of the collocation and Galerkin methods with Hermite bi-cubic?, SIAM J. Numer. Anal., Vol. 21, pp. 695-715, 1984. (10)P. M. Prenter and R. D. Russell, “Orthogonal collocation for elliptic partial differential equations”, SIAM J. Numer. Anal., Vol. 13, pp. 923-939, 1976.

Vol. 326, No. 2, pp. 295-307, Printed in Great Britam

1989

307