Mathematics and Computers in Simulation XX (1978) 83-92 0 North-Holland Publishing Company
ORTHOGONAL
COLLOCATION
ON FINITE
ELEElENTS FOR ELLIPTIC
EQUATIONS
P.W. Chang, B.A. Finlayson
Department of Chemical Engineering University of Washington Seattle, Washington 98195
In the I&-th element the unknown T is approximated by
Abstract The method of orthogonal collocation on finite elements (OCFE) combines the features of orthogonal collocation with those of the finite The method is illustrated for element method. a Poisson equation (heat conduction with source term) in a rectangular domain. Two different basis functions are employed: either Hermite or Lagrange polynomials (with first derivative continuity imposed to ensure equivalence to the Hermite basis). Cubic or higher degree polynomials are used. The equations are solved using an L&decomposition for the Hermite basis and an alternating direction implicit (ADI) method for the Lagrange basis.
T@(x,y) x-x
u=
=I:
YE: Li(u) Lj(v) CE
.
k
.
,
(3)
Y - Ye (4)
v=
yL+l - ye
xk+l - 'k
ti Here C is the value of T at the collocation ij The colare the zeroes of the Legendre polynomial = 0, UNPX = 1. Li(u) is 1 the Lagrange interpolation function
on o 6 u 6 1 and u
Introduction We solve the problem a2T a2T V2T = 3 + ay2 = f(x,y) ‘I = g(x,y)
on
NPX -
in A
II
c
(2)
for a rectangular domain xc(O,l), yc(O,l). Problem I uses f = -4, g = 0 and corresponds to fully developed flow of a Newtonian fluid in a rectangular duct, or heat transfer with zero wall temperature and uniform heat generation, Problem II uses f = 0, g = 0, except on y = 0 where g = 1. This problem corresponds to heat transfer with one surface maintained at unit temperature, and the solution has a singularity at the corners (x,y) = (O,O), (1,O). This problem is linear and easily solved using separation of variables, fast Fourier transforms, etc. . However, we use it as a prototype problem to test methods than can be applied to nonlinear problems. The problem has a variational principle, so that the Ritz method is applicable, but we concentrate on the collocation and Galerkin methods which are applicable to all problems. Lagrange Interpolation and AD1 The domain is discretized lar elements with the knots
into rectangu-
x=x
1’
X2’
***,
Xk’
. . . .
SEX
Y=Y
1’
Y2.
*--9
Ye,
-**,
YNEY
+
+
1
1
where NEX, NEY are the numbers of elements in the x- and y- directions respectively.
L,(u) = j=l, j#i("-"j) 1 NPX
(5)
Similar definitions apply in the y(v) direction. Orthogonal collocation' is applied at each interior collocation point of each element (ke). NPX
NPY
i = 2, .... NPX-1; j = 2, .. .. NPY-1, k = 1, . ... NEX; e = 1, .. .. NEY NPX and NPY are the degree + 1 of the polynoCubic polynomial in the x and y direction. mials have NP = 4. In addition to these equations we require the solution and first derivative be continuous across element The boundary conditions are boundaries2. satisfied at the collocation points on the boundary, e.g. at (x,y)=(O,vjAyk+yk), k=l, . ... NEY; j = 2, . ... NPY-1. In addition the function and first tangential derivative are made continuous on the boundary, and the boundary conditions are satisfied at the four corners of the domain. The approximation is in C2 at least on each element, and is in C1 globally.
84
P.W. Chang, B.A. Finlayson/Orthogonal
collocation on finite elements for elliptic equations
The algebraic equations (5) plus the boundary and continuity conditions are solved using ADI:
is known, and M = y, the maximum errors at the s-th iteration=decFease as Error = p2'
(7.2)
(12)
Thus knowledge of p permits calculation of the number of iterations s needed to make the iterate error less than a specified value. Sequences of iteration parameters, as, are not considered here. It is essential to use a p There is no which is as small as possible. point, of course, in reducing the iterate error (to the solution of the algebraic equations) much below the approximation error (to the solution of the differential equations).
The iteration parameter, w, strongly affects iteration number, the rate of convergence with Eq. (7.1) is solved line by line at con.... i;ant y (i.e. for j=2, .... NPY-1, &l, NEY). Since the matrix is the same for all j, e an LU decomposition is performed only once per problem. The matrix is block diagonal and the LU decomposition does not require pivoting. Each block is of size NPX x NPX with one line overlap between the elements. After onehalf iteration the TF' '+'I2 is known for j=2 , ..-, NPY-1 but not iJ for j=l or NPY. These are obtained by smoothing the solution in the y-direction at each x value. This process chooses the value at j=l and NPY so that the first derivatives are continuous across the elements. The other half iteration is performed similarly, line by line at constant x, followed by smoothing in the x-direction. The iteration error is governed by
(8)
I
50 where ES=TS-T, the difference between the solution a? t% -th iteration and the exact soluEX is used to tion to the algebraic equations. eliminate the values on element boundaries (which do not affect the iteration error) and M and N are matrices derived from those in Eqs. 3). iSee Ref. 3 for details). We thus have
Ig+1-&I/‘5
ItgIl I~s-~sII II gTII
The reduction in error from one iteration to another is thus dependent on the spectral radii of g and zT (actually 5,
a portion of M and g, =-
a portion of ET.]
Ig+l-gs+l II 5 1~11”1~11”1!g*-EX* II
(10)
:Ell Ells
(11)
l&11" = 0 if p(MR)p(NR) < 1
Clearly, the error decreases faster each iteration if 0 is smaller. p < 1 is necessary for convergence, and the value of p depends on the iteration parameter, w. If w is specified, P
I
I
60
w Fig.
1.
I
1 7(
Spectral Radius Versus Iteration Parameter
The dependence of p(g) on w is shown in Figure 1 as a function of NP (the degree of polynomial is NP-1) and NE (the number of The optimal parameters (minimum p) elements). are shown in Fig. 2. Note that p increases (requiring more iterations) as NP is increased, as NE is increased, or as the total number of points in one direction, NT = (NP-1) NE+l, is increased. The spectral radius depends also on the element sizes, Ax and Aye. We consider a given number of e! ements, say NEX, and specify and element distribution given by H = Axk+lfAxk
'
(13)
70
0.6
60
i
Wmtn
0.4
50
03
40
0.2
30
0.1
20
0
IO 45
IO
15
20
25
NT
Fig. 2.
Optimum Parameters
85
collocation on finite elements for elliptic equations
P.W. Chanq, B.A. Finlayson/Orthoqonal
Thus, the elements are not all the same size. Fig. 3 shows the dramatic effect on p. For example, for NE=3, NP=5, changing H from 1 to 2 to 10 changes p from 0.817 to 0.88 to 0.992. If the same element distribution is used in the x and y directions and the iterate error is 0.001, this increases the number of iterations from 17 to 27 to 430. Clearly the AD1 method is not as suitable for very non-uniform element sizes. The iterate error is plotted versus iteration number in Fig. 4 for problem II and the error decreases at a rate slightly faster than the theoretical upper bound given by Eq. (12). For problems which require widely different element sizes, the ADI method is less suitable. Then we must go to a direct solution of all the equations (7) together. Unfortunately, the bandwidth of the matrix is too large, being roughly twice as wide as the bandwidth for This causes a fourfold Hermite polynomials. increase in decomposition time, so the Hermitian interpolation is used in these cases.
P 0.85 -
0.80 -
0.75-
H=l --w-m -my
NE = 2 NE = 3 NE = 4 5
IO W
Fig. 3.
Effect of Non-uniform Elements
15
20
P.W. Chang, B.A. Finlayson/Orthogonal
86
collocation on finite elements for elliptic equations For cubic polynomials the x-derivatives at the four collocation points are given by a 4x16 matrix multiplying a 16x1 vector (Ctf).
,
2
3
4
5
6
7
s
9
lo II I2 13
ITERATION NUMBER. S
Fig. 4.
Bate of Convergence
The differential equation (1) is satisfied at the collocation points interior to each element (see Figure 5b) and the boundary conditions va1ue for are Tkytisfied on the boundary. The at an element edge is shared by in one element being the same variable as ti the corresponding C. in the adjacent element. ij Thus the approximation is automatically in Cl. The resulting matrix (of size NTX.NTY x NTX.NTY) can be decomposed (LU) using the same block diagonal version used for ADI, except that pivoting is necessary. 'Here NTX = NEX (NPX-2)+2 and NTY = NEY (NPY-2)+2. We perform partial pivoting within a block. The numbering scheme for the variables is given in Fig. 5a. The extraneous zeros introduced by using the block diagonal decomposition cause additional, unnecessary multiplications, but they are minimized by checking for zeros in the decomposition. A banded decomposition with a variable bandwidth and pivoting is also applicable to this matrix.
HerrjliteInterpolation and Direct Solution The Hermite interpolation gives
13
23
37
47
61
(14)
II
22
35
46
59
where the Hermite polynomia s are defined on UE & are values of T, [O,l] and the parameters C. ij ST/ax. aT/av, or a'T/axay at the element corners (for NPX=NPY=4, cubic polynomials).
7
20
31
44
55
5
19
43
53
I
17
Tti =N;X T' ix1 j=l
Hi(u) Hj(v) Cz
29
H;(u) = (1-u)' (1+2")
25
41
. 49
H;(u) = (1-u)' uhk Fiq. 5(a) H:(u) = (l-u)2 u
i-l
, i=3
,...,NP-2 NP 3 5
Numbering of Unknowns, NP = 5
(15) Hk NP_l(u) = u2 (3-2")
H;p(“,x=_“,’ (u-l)hk “=
k __ xk+l - Xk
(16) hk = xk+l - Xk Within an element ti the derivatives of T can be related to the derivatives of H, e.g. (17) ;i:
$bm)Hj
h,$
Fig. 5(b).
X BC Collation Points, o DE
P.W. Chang, B.A. Finlayson/Orthogonal Approximation
87
collocation on finite elements for elliptic equations
Error
For one-dimensional problems deBoor and Swartz4 showed that the error of the collocation method depends on the number of elements as follows: error
a
(l,,NE)NP, NE -t m, NP fixed. (18)
For global polynomials (NE=l) in one direction Ciarlet, Schultz and Varga' show that the error of a Galerkin method follows: error
(I
(1/NP-2)NP-2.
6
(19)
A similar dependence was found empirically for the collocation method.3 For two-dimensional problems Prenter and Russell6 have shown that error
CL (11~~)~~~
(20)
where the 3 characterizes the continuity of the exact solution. We find for problems I and II the approximation errors follow
,8 2
I o(Np-Z)(_l)min (NP,k) , for NP 3 5(21) NE where NEX=NEY=NE and NPX=NPY=NP. The parameter k characterizes the continuity of solution and C and a are independent of NP and NE.
3
4
NEY ilg. 6.
Errors for Problem I.
For Problem I Fig. 6 shows the error as a function of NP and NE. Here 6 k=3 and the curves for different NP > 4 all have the same slope, since min (NP,k)=3. Fig. 7 shows the same data plotted versus (NP-2) log (NP-2) and the straight lines are evident. For this problem the constants in Eq. (20) are C=O.O059, a=0.53, k=3. For Problem II the k=l since the solution has a singularity. The results are shown in Fig. 8, and here C=O.O19, (x=0.26, k=l. This problem has discontinuous first derivatives at the corners, yet we are approximating it with a basis function in Cl. Adequate results can be achieved, however, and improvement results 'from using a graded mesh. The boundary conditions must be carefully applied if the correspondence between Lagrange and Hermite basis functions is to hold. Comparison of Lagrange-AD1 with Hermite-Direct The basis function for Lagrange-AD1 is the same as that for Hermite-Direct, and the collocation points are the same. Thus the solutions are the same. To compare the methods we must compare the relative cost of the two calculations. The equations are linear and the Hermite-direct method solves them exactly (within computer round-off error). The Lagrange-AD1 method solves them iteratively, and the cost of calculation depends on the number of iterations.
-61 0
I
I
I
2
(NP-2)
Fig. 7.
log(NP-21
Errors for Problem I.
3
88
P.W. Chang, B.A. Finlayson/Ort.hogorlalcollocation on finite elements for elliptic equations sirable for problems with smooth solutions (high k in Eqn. 21), while low order polynomials (and more elements) are suitable for problems with non-smooth or discontinuous solutions (low k in Eqn. 21).
I
2
34 NEY
Fig. 8.
Errors for Problem II.
Choose an iterate error of 10% of the approximation error for the discretization. Then use Eqn. (12) to find how many iterations are necessary. The results in Table 1 show that when the elements are uniform the Lagrange-AD1 method uses only about one-fourth to one-sixth of the operation counts (multiplications) used by the Her-mite-direct method. When the elements are non-uniform however, approaches one, s increases and the direct 'min methods are more competitive. For the OCFE-Hermitian, it is interesting to compare whether it is better to increase the number of elements or the degree of polynomial. Using the actual approximation error for Problem I (Fig. 6) and decomposition cost in Table 1, we can construct Figure 9 giving error as a function of computation cost. For equivalent accuracy (ER = 3x10e4), cubits (NP = 4) requires 46,000 multiplications (NE -3.5) as compared to 26,000 for NP = 5 (NE -1.5). In this case a 40% computational savings accrues from using higher order polynomials rather than Similar comparisons are appamore elements. rent in Figure 9. For Problem II, with accuracy O(Ax) the lower order polynomials are as efficient as higher order ones. With a slight extrapolation we can conclude that high order polynomials (and few elements) are de-
P.W. Chang, B.A. Finlayson/Orthogonal
collocation on finite elements for elliptic equations
DECOMPOSITION
MULTIPLICATION
89
COUNTS ,6
0 NP=4 A NP=5 -2. NE=2 k-3. 0
0
NP=6
v
NP=7
1
-
NE* x
0
NE=4
-4. -
Fig.
9.
Error versus Decomposition
Comparison of Collocation with Galerkin Methods we next compare computation time of OCFE with that of the Galerkin and Ritz methods. The calculation time is due to the formulation of the equations, decomposition of the matrix, and forward and backward sweeps of the right hand sides. Using the direct method, OCFE has at most (NPX)(NPY) non-zero entries in a row where each entry is composed of some functions evaluated at the collocation point. Galerkin/Ritz method can have [2(NPX-2)+2] non-zero entries in a row. The Galerkin/Ritz method thus has approximately 4 times as many non-zero entries as the Each entry in the Galerkin collocation method. method is the integral of some function over For non-linear problems, Galerkin/ the element. Ritz method requires even more time since more quadrature points than collocations points are generally used to approximate the integral. Therefore, the formulation cost of Galerkin/ Ritz is much higher than that of OCFE (by a factor greater than 4 for Galerkin, and greater than 2 for Ritz, with the factors increased by the ratio of quadrature to collocation points). The decomposition cost depends on the method used. Prenter and Russell6 compared the decomposition cost of OCFE and Ritz method at NPX = NPY = 4 and NEX = NEY = n. The unknowns on the boundary are eliminated by the appropriate variation of the Hermite shape functions.
Cost
The number of the unknowns left is (2n)'. When n is large the half band width is about 2" for OCFE and 4n for Galerkin/Ritz methods. The decomposition costs are then (1/2)(4n)2(2n)2 = 32n4 multiplications for Ritz (symmetric matrix, no pivoting) (411)~(2n)~ = 6411~multiplications for Galerkin (unsymmetric matrix, no pivoting) 2(2n)2(2n)2 = 32n4 multiplications for OCFE (unsymmetric matrix with pivoting) For the decomposition cost OCFE is then equivilant to the Ritz. method and is twice as fast as the Galerkin method (assuming an unsymmetric matrix). These counts do not consider the zeroes in the matrix. These zeroes can be skipped in the actual calculation, and this improves OCFE drastically (sometimes by 2 times) so that it is even better than the Ritz method, which has more non-zero entries. We have found it not convenient to adjust the shape functions such that the unknowns on the boundary can be eliminated. If those unknowns are included, the total number of unknowns is (2n+2)2 (Figs. 10, 11). The halfbandwidth for OCFE is increased to 4n+ll (Fig. 12) but remains as 4n+ll for Galerkin/Ritz methods (Fig. 13).
P.W.
90
Chang, B.A. Finlayson/Orthogonal
collocation on finite elements for elliptic equations
51 52 49 50
55 56 53 54
59 60 57 58
63.64 61 62
35 36 33 34
39 40 37 38
43 44 41 42
47 40 45 46
19 20
23 24
27 28
31 32
I7 I8
21 22
25 26
29 30
II I2 9 IO
I5 I6 I3 I4
ii’,:/ 41
42
43
50
51
56
57
58
2 8
29
30
33
34
38
39
40
25
26
27
31
32
35
36
37
8
9
I4
I5
22
23
24
5
6
I2
I3
I9
20
2 I
2
3
IO
II
I6
I7
I8
7
44 78 56
34 I 2
Fig. 10a.
Illustration for NPX = NPY = 4, NEX = NEY = 3 Sequence of Unknowns
I
Fig. lob.
m -NEW----J ,
JO
20
30
50
40
NBW -NEW-
I
60 64
2NBW
-NNBW-
Fig. 11.
Illustration for NPX = NPY = 4, liEX = NEY = 3 Sequence of Equations
Non-zero Entries of Jacobian Matrix by OCFE.
Fig. 12.
OCFE Matrix after decomposition with L and U denote the row pivoting. actual decomposed matrix.
P.W. Chang, B.A. Finlayson/Orthogonal
collocation on finite elements for elliptic equations
91
The decomposed matrix looks at worst like the one shown in Fig. 15. Compared with Fig. 12, it has slightly fewer non-zero entries. Therefore, the computation cost can be less at the probable expense of not finding the optimum pivot.
-I
Fig= 13.
Non-zero Entries of Jacobian Matrix from Calerkin FEM
The decomposition cost without pivoting for large n is then about (1/'2)(4n+11)2(2n+2)2
32n4 multiplications for Ritz
Fig. 14.
NBS
-
OCFE Matrix Regarded as a BlockDiagonal Matrix
-NEW-
64n4 multiplications for Galerkin 128n4multiplications for OCFE Since pivoting cannot be avoided, we have used computer routines that skip operations on zeroes. It is fortunate that the band widths of the decomposed matrix for OCFE with pivoting do not exceed the original band width 4n (Fig. 12). Therefore the cost is at worst (2n+2)2(4n)2 = 64n2 which is equivalent to that Houstis' reported that for Galerkin method. with a routine that skips zeroes, the decomposition time for OCFE is between those for Galerkin and Ritz methods in the test cases. This is in agreement with the analysis here. The block-diagonal solver with rowpivoting and skipping zeroes differs from the corresponding banded matrix solver only in the numbers of rows available for pivot-searching. The search is limited within the blocks such that the block structure is unaffected in the The matrix of OCFE course of decomposition. can be regarded as block-diagonal (Fig. 14).
L
-NBS-
Fig. 15.
OCFE Matrix After Decomposition Pivoting within Blocks
with
92
P.W. Chang, B.A. Finlayson/Orthogonal
collocation on finite elements for elliptic equations
The actual decomuosition cost deuends on the particular problem. The block-diagonal solver used in this work solves the sample problems successfully and is presumed to be less expensive than the corresponding banded matrix solThe detailed estimates of the decomposiver. tion cost are given in Tables 1 and 2.
Conclusion OCFE solves the elliptic boundary value problems with the approximation error in the form of a(NP-2) ER = C(&)
__-_
(k)min(NP,
W
(22)
where cx, k depend on the problem. Among the direct methods, the decomposition cost of OCFE is less than or equal to that The total cost of OCFE of the Galerkin method. If the is less than that of the Ritz method. boundary conditions are incorporated into the basis functions the OCFE is superior to all It is preferable with OCFE to inmethods. crease NP rather than NE for problems with smooth solutions (high k).
N
k 2
OCFE-Lagrange-AD1 is more efficient than the direct methods when the elements are uniform but can be less efficient when the elements are non-uniform. Acknowledgment This research was supported by NSF Grant 75-02563. References
J
A fair comparison of the computation cost should include both the formulation and decomposition cost. Houstis et al* reported that the total cost favors OCFE over the Ritz method at the same accuracy mainly because the small advantage of Ritz method in decomposition cost is offset by the large difference in formulation
cost.
1.
Finlayson, B. A., The Method of Weighted Residuals and Variational Principles, Chap. 5, Academic Press, 1972.
2.
Carey, G. F. and B. A. Finlayson, Chem. Eng. Sci. 30 587 (1975).
3.
Chang, P. W., M.S. thesis, U. of WA (1975).
4.
deBoor, C. and B. Swartz, SIAM .I.Nurser. Anal. g 582 (1973).
5.
Ciarlet, P. G., Schultz, M. H., and Varga, R. S., Num. Math. 2 394 (1967).
6.
Prenter, P. M. and R. D. Russell, SIAM J. Numer. Anal. -13 923 (1976).
7.
Houstis, E. N., The Structure and Direct Solution of Linear Finite Element Methods, paper presented at ITPACK/ELLPACK Workshop, July, 1977.
8.
Houstis, E. N., Lynch, R. E. Papatheodorous, and Rice, .I. R., to be published, J. Comp. Phys. (1977).