Computer methods in applied mechanics end engineering Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64
ELSEVIER
A full 3D finite element analysis using adaptive refinement and PCG solver with back interpolation C.K. Lee 1 S.H. Lo* Department of Civil and Structural Engineering, The University7 of Hong Kong, Hong Kong, China Received 10 September 1997; revised 7 May 1998
Abstract In this paper, adaptive refinement finite element analyses were carried out for full 3D problems. In order to achieve an optimal computation cost and to eliminate the effects of singular points, the adaptive refinement procedure was applied in conjunction with a back interpolation for the construction of a good initial guess for the solution of the linear equations system. The combined use of the adaptive refinement procedure, the back interpolation scheme and the preconditioned conjugate gradient (PCG) solver lead to a significant reduction in the operations for the solution of the simultaneous linear equations in the adaptive refinement analysis. In many cases the number of iterations needed by the PCG solver to reach a converged solution is independent of the number of equations in the global system. The numerical results obtained indicated that for a series of carefully designed adaptive meshes, the computational cost required to solve the linear system could be made only proportional to the number of degrees of freedom in the mesh. To our knowledge, this is the first time that the global stiffness equations in 3D stress analysis have been solved with this efficieacy. The same set of numerical results also showed that, in general, the total computational cost of the adaptive refinement procedure can usually be minimized by gradually reducing the target relative error of the solution during successive refinements rather than employing a constant target relative error throughout the whole adaptive analysis. © 1999 Elsevier Science S.A. All rights reserved.
1. Introduction Due to the c o n t i n u o u s l y rapid increase in the capacity and speed of digital computers and the advancements o n both the theory and application algorithms of adaptive refinement techniques, it is n o w m u c h easier than before, to carry out highly efficient adaptive finite e l e m e n t analysis on a wide range of physical problems. As a result, the use o f adaptive refinement procedure in full 3D finite e l e m e n t formulation has received m u c h attention [ 1 - 3 ] for the obvious reason that it embraces nearly all practical problems in solid mechanics. In addition, the m a n y recent advances in automatic 3D adaptive mesh generation [ 4 - 6 ] also facilitate the use of full 3D model in day-to-day e n g i n e e r i n g works. W h e n e m p l o y i n g the full 3D finite element models for the solution of realistic problems, it will always result in a system of linear equations of the form Kt~ = f
(1)
where K is the stiffness matrix (usually symmetric), f is the equivalent nodal load vector and li is the nodal displacement vector. The solution of Eq. (1) can be obtained by using either: (i) a direct solver based on Gaussian elimination such as those using the frontal method [7,811 or the skyline L U factorization approach [ 9 - 1 1 ] ; or (ii) an indirect solver such as those based on the preconditioned conjugate gradient ( P C G ) approach [ 1 2 - 1 5 ] . It is generally accepted that while the direct solver can easily outperform the iterative solvers in 2D * Corresponding author. E-mail:
[email protected] 'Current address: Nanyang Technological University, School of Civil and Strttctural Engineering, Block K N1 #1A-37, Nanyang Avenue, Singapore 639798. E-mail:
[email protected] 0045-7825[99/ - see front matter © 1999 Elsevier Science S.A. All rights reserced. PII: S0045-7825(98)00188-1
40
C.K. Lee, S.H. Lo / Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64
applications due to their robustness and straightforward implementation, the iterative approach is more suitable for the solution of large-scale global stiffness equations system due to its high computational efficiency and the storage limit of the computer. In this paper, some recent numerical experiences on the aspecl of using an adaptive refinement procedure in full 3D finite element analysis will be reported. In addition to the effect of using an adaptive refinement scheme on the global convergence rate of the finite element solution, attention will be paid on the following two important aspects of the computational efficiency of the adaptive scheme. (1) The efficient solution of the global system using a PC(; iterative solver for the adaptive refinement analysis. Preliminary studies in [1] indicated that with a proper use of initial guess for the PCG solver, the computational cost needed for solving Eq. (1) may be reduced. As a further development, in this study, a more thorough investigation on this aspect, especially on tlae combined effects of different element types and meshes to the total solution cost, will be carried out. (2) In addition to the effects of elements types and mesh, in this study, the effect of using different adaptive refinement strategies on the total computational cost for achieving a prescribed accuracy of the finite element solution will first time be investigated. In particular, the significance of specifying a proper relative error for different adaptive refinement steps will be studied in details.
2. Basic notation used, the a priori and a posteriori error estimations Since the displacement formulation for the full 3D finite-element analysis is well known [16] and both the a priori and a posteriori error estimations used are exactly the same as those used in the previously studies [1,3], only a concise summary of the notation used is given below. (1) The exact displacement u and stress o-: U : [U, V, W]T
and
o" =
(2)
DSu
where S is the usual strain operator and D is the elastic matrix. (2) Finite element approximation, a, the global stiffness matrix K and the load vector f : = NK, K =
f,~
(SN)TD(SN)
dO
and
f NvqdS~
f =
(3)
where ~ is the problem domain, N is the interpolation function and q is the loading term. (3) Finite element stress 6-, error in displacement e u and stress e~: O" = D S N t i ,
eu = u - t~
and
e,, = o- - 6"
(4)
(4) The energy norm, Ilulla the energy norm of the errors, lle.l]a and the relative error, r/:
=
(Se~) D ( S e . ) d O =
liege ,1--Ilull~
ea TD
- -
1
e,~dO
(5b)
(5c)
(5) A priori error estimation for 3D problems If h-version refinement is carried out, the convergence rate of the finite element solution will be given by [17] [le, l[.o ~ C(Nf) -min(p')t)/3
(6)
where Ny is the total number of degrees of freedom and A < 1 is the strength of singularities of the
C.K. Lee, S.H. Lo I Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64
41
problem, p is the order of the interpolation function N and C is a constant independent of Nf. In case that an optimal mesh [17,18] is used the convergence rate willt be improved to
Ile.[lo ~< C'(Nf) -p/3
(7)
with C' being another constant independent of Nf (6) A posteriori error estimation by the Zienkiewicz and Zhu (Z 2) error estimator and the superconvergent patch recovery technique (SPR) The well-known Z e error estimator [18] is used in this study. It is one of the simplest yet robust [19] error estimator available today. The estimated error norm of the linite element solution, Ile, l]a, is defined as
ILe.ll~ =~/fr (o* - ~)'D-'(o,* - ~) dg2
(8)
where o-* is the smoothed stress field constructed by using the SPR. It is interpolated over a given patch by a polynomial as (9)
o'* = Pa
where P contains polynomial terms of the local spatial co-ordinates (x, y, z) [1,3]. a contains recovery stress coefficients obtained by performing patch-by-patch least-squares fits over the mesh which is equivalent to minimizing the following functional /7 with respect to a. u,
r / = ~ ( o - * , - ~^): +/7A
(10)
i=l
Np is the number of superconvergent sampling points inside the patch [20]. or* and ~ are, respectively, the values of the smoothed stresses and finite element stresses evaluated at the ith sampling points. /7A is an additional functional used for additional accuracy and further stability. Different schemes of the SPR [21-24] may employ different forms of/7A: A more detailed account on the different forms of/7A suggested and the implementation used in this study can be found in Appendix A. No matter which variant of the SPR is used, a can be obtained by setting O/7/Oa = 0 and solving the resulting recovery equation of the form
(11)
Ra = F
3. Solution of linear equations system by the PCG solver In full 3D finite element analysis, the solution of Eq. (1) by direct linear equation solvers will soon become impractical since the number of float point operations (flops), Nf~op, and the storage space needed, N~., by the most efficient skyline or frontal solver is of the order [12]
Nnop=O(Nf/3) and
N~.=O '(Nf )5/3
(12)
and will soon become prohibitively large in realistic applications. Indirect solvers based on the PCG technique [12-15,25,26] offer a much more efficient solution approach. Since in general for these methods [1,12]
N. op = O(N N,) = O(N/V, )
and
N~ = O(N~:) = O(NI)
(13)
as for a general 3D mesh [1]
(14) where N~ is the number of non-zero terms in the global stiffness matrix K and Ni is number of iterations needed by the PCG solver to obtain a solution satisfying the convergence tolerance. From Eq. (13), the computational cost needed by a PCG solver will be proportional to the number of iterations needed. Let a and ~ik be the exact solution to Eq. (1) and the approximate solution after k iterations respectively. ~ is considered adequately accurate if it can satisfy the following convergence requirement
42
C.K. Lee, S.H. Lo / Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64
~/(Kti k - f)r(Kak - f ) <~ 6 f ~ f
(15)
for a small constant 6 << 1.0, then it can be proved that the nun~ber of iterations needed by the PCG solver is given by [ 12] _
~
2
N/= ~ ( l n ( - ~ )
p)
+ C /
(16)
where K(/~) is the spectral condition number of the preconditioned stiffness matrix /~. C and C' are two constants independent of ~ and K(/~). For a preconditioned global stiffness matrix, the general relationship between K(/~) and ~ is
K(/t~) = O(N)./3 )
(17a)
and by Eq. (16)
N/= O(NT '6 )
(17b)
Therefore, the number of flops needed to solve Eq. (1) will be of the order
Unop = O ( U ; / 6
)
(18)
which is obviously much more efficient than any direct solver for a large value of N/. From Eqs. (15) and (16), it can be seen that the computational cost needed by an iterative solver can be reduced by at least two approaches: (1) Constructing a good preconditioned global stiffness matrix so that the value of K(/~) is minimized, or by (2) Constructing a good initial guess ti o which is close to the exact solution ti so that the number of iterations needed can be reduced. The first approach is more general than the second approach and can be applied to any problems without having any a priori knowledge about the behavior of tL Furthermore, once the preconditioned matrix is constructed it can be applied to multiple loading cases involving more than one load vector f Many previous researches have been done on the construction of an effective preconditioning matrix: details of these can be found in [12-15,25,26] and the references therein. When comparing with the first approach, the second method usually requires some special solution strategies and more than a single finite element analysis are needed. Examples of this approaches included the multigrid method which had been used in both solid mechanics [27,28] and large-scale CFD applications [29]. A simpler method involving direct back interpolation from the last finite element solution had also to be used by the authors [1,3]. A good initial guess which is close to ti not only increases the efficiency but also the robustness of the PCG solver. Such a consequence is especially important in applications involving ill-conditioned matrices caused by a huge volume or aspect ratio of elements or regions with very different material properties [14]. However, it should be mentioned that for problems involving multiple loading cases the construction procedures have to be repeated for each load vector. If the second approach is used, the performance of the back intelpolation method can be improved by refining the mesh adaptively rather than uniformly since after a few cycles of adaptive refinement, the adaptive meshes will become stable [30]. That is, successive refinement meshes will become similar to one another and extensive refinement will often be confined at where the stress gradient is high. For this reason, the simple interpolation scheme used in [1] and [3] for the construction of initial guess will be reconsidered more deeply in this study. (Details of the interpolation procedure employed in the current study is given in Appendix B.)
4. Optimal adaptive refinement strategy for 3D finite element analysis The target of an adaptive refinement procedure is to achieve the optimal rate of convergence stated such that the condition 7/~< 7/t
(19)
43
C.K. Lee, S.H. Lo I Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64
where r/t is the user specified target relative error, is satisfied with the least total computational cost. In the design of a refinement scheme, the following two factors significantly affect the overall efficiency of the adaptive procedure.
1. Effectiveness of eliminating the effect of singular points This means that whether the adaptive refinement procedure can achieve the theoretical convergence rate stated in Eq. (7) with the effect of singularities being eliminated. A pe:ffectly effective adaptive refinement strategy should completely eliminate the effect of singularities and maintair,s the optimal theoretical convergence rate for every refinements. However, in the actual adaptive analysis, such ideal condition can seldom be achieved. The reason is simply that in any refinement step, the adaptive mesh generated can only provide an approximate solution to the model problem and the error estimator will only be asymptotically exact when the degrees of freedom of the mesh approach infinity. As a result, even with the best automatic mesh generator which can generate a highly compatible adaptive mesh, the convergence rate of the refinement process will be lower than the optimal value. A huge increase in the degrees of freedom in z, single refinement step will usually lead to a lower convergence rate for that step and vice versa. Hence, ~:he effectiveness of an adaptive refinement procedure can be increased by reaching the target relative error in a large number of refinements so that the increase in the degrees of freedom in each refinement is small. Very often, with a sufficiently large number of refinements, the actual convergence rate achieved will be very close to the optimal theoretical convergence rate (Fig. 1).
2. Number of refinements needed to achieve the target accuracy In order to achieve a high efficiency, the adaptive refinement procedure should be able to achieve the target accuracy in a reasonable number of refinements. The use of a large number of refinements to maintain the optimal convergence rate may not always result in a minimum computational cost. Hence, it is often necessary to limit the total number of refinements to a reasonable value by allowing a larger increase of the degrees of freedom in the first few refinements of the adaptive analysis even the convergence rate may be slightly reduced (Fig. 2). Furthermore, another advantage of allowing a faster increase in the degrees of freedom in the first few refinements is that, by doing so, the relative error of the finite element solution in the last few refinements will be closer to the target value (Fig. 2). In case that the error estimator is asymptotically exact, this will also increase the reliability of the error estimator. However, it should be mentioned that if there is a too rapid increase in the degrees of freedom in the first few refinements when the meshes are relative coarse and the error estimator is not that accurate, the overall efficiency of the whole adaptive procedure sometimes may also be reduced. In practice, in order to minimize the total computational cost of the adaptive refinement procedure, a balance must be found between the effectiveness of the adaptive procedare and the number of refinements needed to achieve the target accuracy. An adaptive refinement strategy which lead to too many refinements will certainly not be an optimal one with respect to the total computational cos~: needed as too many unnecessary refinements
log(.q)
qo
~
p
r
qo
~
~_,~.~'/3
I
P>P' 1
,qo
i. No
N, |og(N0
Fig. 1. The effects of singularities are completely eliminated by adopting a large number of refinements in the adaptive analysis.
--
p No
N. log(N~)
Fig. 2. The effect of reducing the number of refinements to achieve the target accuracy.
44
C.K. Lee, S.H. Lo / Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64
are carried out in the initial stage of analysis. On the other hand, if the degrees of freedom is increased too rapidly, the convergence rate of the adaptive procedure may be considerably lower than the optimal value. The adverse effect will be much more pronounced when: (i) a low order element is used in the analysis; (ii) the relative error of the initial mesh is much higher than the target value; and (iii) for a full 3D analysis where the number of degrees of freedom of the mesh, Nr is proportional to 1 [ h 3 (h is the element size) such that any drop in the convergence rate of the adaptive scheme will result in a huge increase in the value of N / i n the final mesh. When using the adaptive refinement procedure in engineering applications, it is quite often that the target relative error, r/,, is fixed and a given (usually coarse) initial mesh is used to start the analysis. Therefore, the two most important parameters that the user can adjust to achieve an cptimal computational efficiency will be: (i) the number of refinements carried out to achieve the target accuracy; and (ii) the rate of increase in the degrees of freedom in each refinement step. Note that the second parameter is equivalent to specifying the relative error achieved for each refinement step since it is much easier to specify the relative error rather than to control the number of degrees of freedom. In general, as the total computational cost will be simultaneoasly affected by these two factors, it is rather difficult to predict, before carrying out the actual adaptive anal~ sis, which combination of the values of them will result in a most efficient adaptive scheme. In practice, it will be more convenient to let the users specify the number of refinements they wish to carry out. Eventually, in the numerical examples given in the next section, attention will be focused on the performance and sensitivity of the refinement strategy to the relative error set in each refinement step. 4.1. The adaptive refinement strategy
In the adaptive analysis, it is assumed that the target relative error r/t is much smaller than 70, the relative error of the initial mesh. Therefore, instead of setting r/, as the target relative error in every refinements, the total computation cost may be reduced by allowing a gradual decrease of the relative error in each refinement (Fig. 3). Assumed that in the last refinement step the target relative error will be set to r/t and a total of n refinements is to be carried out. Let ce be the relative error reduction in the. last refinement step, i.e. the relative error is reduced from a + r/, to r/, in the last refinement. It is further as,;umed that as the mesh is refined, the relative error is reduced geometrically so that the relative error reduction :ratio between successive refinement is denoted by r 1> 1.0 (Fig. 3). Since it is hoped that, in the ideal case, after n refinements the relative error of the finite element solution will be reduced from r/o to r/, = r/,, the relationship between a, r, r/t and r/0 will be (see Fig. 3) n-I
(rio - zlt) = ~
(20a)
a r "-i
i=0
after summing up the series (r/°-r/')=a
(r"-l) (r-l)
or
(r-l) a=(r/o-rl,),--_.~-- ~ tr
(20b)
l)
llq
log(~)
~ 11., q,
11s=~S No
N5
Jos(~O
Fig. 3. Adaptive refinement with different relatiw: error in each refinement.
C.K. Lee, S.H. Lo / Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64
45
From Fig. 3 and Eq. (20b), in the ith refinement step with 1 <~ i :~< n, the relative error, ri~ equals to ,,_~ rii = r i i - l -
ar
. (r= rii- 1 -
(rio -
1)
r i , ) (r-TQ~_ l )
. -i r
(21)
In case that r = 1, since lim r--+l
r
r - 1 1 . -1 n
(7/o - ri,) a - - n
and
(22)
Eq. (21) will reduce to the trivial case of uniform reduction in relative error, i.e. 1
(23)
r/i = rio - (rio - r/t) n On the other hand, r = ~, since ( r - 1) ~-~ =1"1 ~__,~ r l ( ri- ~m _ 1) ~0
for i = l fori>l
(24)
Then, Eq. (21) will become ri~ = ri,
for i . . . . . n
(25)
That is, the target relative error ri, is used in all refinement steps. By adopting Eq. (21), for any given value of n, the rate of increase in the degrees of freedom of the mesh can be controlled by using different values of r. In general, the effects of adopting a higher value of r are (1) The rate of increase of the degrees of freedom in the first few refinements will be increased. This also means that the error of the finite element solution will be reduced more rapidly in the lirst few refinements (but may be at a lower convergence rate). (2) The rate of increase of the degrees of freedom and equivalently the differences among the relative error achieved in the last few refinements will be reduced. (3) The target relative error r/, may be reached by using a fewer number of refinements. Once the values of n and r are selected, the relative error at each refinement step, ri~ can be computed by using Eqs. (21 ), (23) and (25). The implicit refinement strategy developed in [31 ] is then used to define the new element size for the next cycle of adaptive analysis. The whole adaptive refinement procedure can be summarized as follows: (i) Carry out finite element analysis on the initial mesh and set the number of refinement, i = 0. (ii) Construct the smoothed stress using the SPR and then compute the estimated error norm Ile~lla. The estimated energy norm of the current finite element solution, IlulL and the relative error of the finite element solution, ~ are then defined as
=
(ltull
+[leul[,±)
and
~i---
Ile.lL,, rlulr.
(26)
(iii) If the accuracy requirement (27)
~, ~
is not satisfied, increase the refinement counter i to i + ] and compute the relative error ri, using Eqs. (21), (23) and (25). Otherwise, the adaptive refinement procedure will be terminated. (iv) The expected number of element, NE e required to achieve ri~ is computed as
\
~, /
where NE c is the current number of element in the meslh. (v) Definition of local allowable error norm e, and local refinement indicator ~ for the j t h element
(28)
46
C.K, Lee, S.H. Lo / Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64
JluJl e.=r/i~
fle.lr , and
~-
(29)
e
where ~ and Ile,,lla~ are respectively the domain and the estimated error norm of the jth element. (vi) Definition of new element size hj for next cycle of analysis
h; hi= .
.~+~5~
(30)
where hi is the current element size and/3 is the local rate of convergence of the element and will be equal to h if the element locates near a singularity, otherwise ¢1 = p. (vii) An adaptive new mesh will be generated [32], the next cycle of finite element will be carried out and goto step (ii). Two remarks about the above adaptive refinement strategy should be noted. (1) In practice, the actual number of refinements needed to reach the target accuracy may be less than the input value n and it is common when a higher value of r is used (e.g. r = m). The reason is that the error estimator may overestimate the error of the finite element solution and leads to an over-refined mesh. (2) In many previous studies [1-3,18,20-24,30,31], the adaptive refinement strategies used are corresponding to the case of r = ~ (i.e. 7/,.-----r/, in all refinement steps) and hence the refinement strategy described in this section is a more sophisticated approach. It should be pointed out that when carrying out adaptive analysis for 2D (or plate bending) problems, the, efficiency of the adaptive procedure would be less sensitive to the value of r since in such a case the total computational cost of the adaptive procedure is proportional to 1/h e (rather than 1/h 3 in 3D analysis). Thus, a small drop in the efficiency of the adaptive scheme may not largely increase the computational cost of the whole procedure.
5. Numerical studies
5.1. Main objectives The two main objectives of the present numerical studies are: (1) To assess the effect of using back interpolation for the con struction of initial guesses for the PCG solver. In the numerical studies, the PCG solver using the symmetric successive overrelaxation (SSOR) preconditioning method [1] will be used for solving the global stiffness equations system for both uniform and adaptive refinements. Two different ways of constructing the initial guess for the PCG solver, Uo will be tested. The first way is to use simply a null vector (NV) as the initial guess, i.e. ti 0 = 0 and the second way is to obtain the initial guess by back interpolation (BI) from the last converged solution of the previous refinement (Appendix B). It is hoped that through the numerical studies involving model problems with different smoothness, different order of elements and different adaptive refinement strategies, the effect of using the BI scheme on the overall computational efficiency of the PCG solver can be assessed. (2) To evaluate the effect of using different values of r on the total computational cost of the adaptive refinement procedure. The refinement strategy described in Section 4.1 will be employed, different values of r are used to carry out the adaptive refinement and the total computational cost needed to achieve a solution with the prescribed accuracy will be logged. By doing so, the sensitivity of the adaptive refinement procedure as well as the effectiveness of the BI scheme with respect to tl:Levalues of r can be studied. In this paper, the computational cost is measured in unit of flop and only the computational cost used in the solution of the global stiffness matrix is considered since for significantly large 3D problems, the solution of Eq. (1) will often be the most computational expensive step in the whole., adaptive procedure. As mentioned in [3], if a PCG solver is used, the actual computational cost in terms of flops can be accurately represented by Nnop~ = 4NzN~
(31)
47
C.K. Lee, S.H. Lo / Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64
Hence, the sensitivity of the total computational cost of the adaptive refinement procedure with respect to r can be studied by plotting a graph of the actual relative error (r/) achieved against the cumulative value of NzN~ for different values of r used. 5.2. The model problems and their reference solutions 5.2.1. Problem 1: Spherical container under internal pressure The spherical container with internal and external radii of 5 and 20 units respectively under a unit internal pressure as shown in Fig. 4 is used as the first example. Owing to symmetry, only one-eighth of the container was modeled and the exact solution of this problem is known [33]. The exact solution of stress is smooth inside the entire problem domain. This example is employed to investiga1:e the effectiveness of the BI scheme during uniform refinement for problems with smooth exact solutions. 5.2.2. Problem 2: Short cantilever under end shear The short cantilever shown in Fig. 5 is used as the second model problem. The dimensions, loading and supported conditions are also shown in Fig. 5. Owing to symmetry, only half of the beam will be analyzed. The solution of this problem involve three singular lines along AB, BC and CO and stress along there are unbounded. The exact value of A for these singular lines are known and is equal to 0.7111. 5.2.3. Problem 3: L-shaped domain under tension In this problem, an L-shaped domain under tension at one face is considered. Fig. 6 shows the dimensions of the problem domain and the loading and boundary conditions applied. For this problem, only one singular line AB exist with a value of A = 0.5445. Again, owing to symmetry, only half of the problem domain will be analyzed. 5.2.4. Problem 4: Biaxial bending o f a column footing In the last example a more difficult problem involving the biaxi~d bending of a column footing is considered. A uniform shear and a uniform pressure are acted at the top of the column and near the base of the footing respectively as shown in Fig. 7. The footing is fixed at the base and totally eight singular lines are present in this problem. They are located at the base of the footing (lines AB, BC, CD and DA) with A = 0.7111 and at the junction between the column and the footing (lines EF, FG, GH and HE) with A = 0.5445. When compared with Problems 2 and 3, this problem is more difficult and more degrees of freedom are required to achieve a solution with the same relative error. For all these four model problems, only Problem 1 has analytical solution and hence the performance of the adaptive scheme for the other three problems have to be assessed by using some accurate 'reference' solutions. These reference solutions were constructed by analyzing the problems using some highly-refined adaptive
~_
b
•
~ V , , \p~
•
)
Young'smodulus=l.O, Poisson'sratio--0.3, a=5.0,b=20.O.
Fig. 4. Problem 1. Spherical container under uniform internal pressure.
(fixed
end).
~
-.
+ [I ]
Fig 5. Problem 2. Short cantilever under end shear.
C.K. Lee, S.H. Lo / Comput. Methods Appl. Mec, h. Engrg. 170 (1999) 3 9 - 6 4
48
~. ~.z F~r..e z= 1, vr-O.
Young'smodulus=lO0.O, I / ~ Poisson's nitio---0.3. ~ F~y=O,i~i.o.~
, ~-~" I I
Jlo.5
2.0
/
LY
/
Facex=l.5, J-
Vouns'smodulus=1.0,
i>oi,,<>.,,,.io--~.3.
~ Face2-=3.0, .~T~I~=I.0.
~
~
110,5
F
~-~
1.0
\-%...
i L...~c
U=~'~v--O. F i g . 6. P r o b l e m
3.
L-shaped domain subjected to horizontal force.
Fig 7. Problem 4. Biaxial bending of a column footing.
meshes of TIO elements (Fig. 8). These meshes were constructed by refining the mesh to the limit of the machine used and the properties of the reference mesh are listed in Table 1. As the reference meshes are highly refined, the estimated energy norm of the reference solutions (usi:ag Eq. (26)) are taken as the exact energy norm of the corresponding problems. In the numerical studies, the e~act error norm is obtained by the well-known adjoint law [18] as
Ile.ll
(32)
_ II'ill
(a) Problem 2
(b)Problem3
(d) Problem 4 F i g . 8.
References meshes used.
49
C.K. Lee, S.H. Lo / Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64 Table 1 Properties of reference meshes for Problems 2 to 4 Problem
NN
NE
NF
II.ll,,
~(%)
2 3 4
77117 76851 77003
46324 50057 49587
231351 230553 231009
1.3169828769 0.8805987230 0.7416015153
0,32 0.50 2,67
where IM,II~ is the exact error norm of the finite element solution while Ilull,~ and Ilalb the energy norm of the estimated exact solution (from the reference solutions) and the, finite element solution respectively.
5.3. Numerical experiments done A total of three types of element, namely the T4, T10 and T20 I.,agrangian tetrahedral elements were used in the numerical study. While the element stiffness matrix for the T4 element is available in closed form, the 5-point and 15-point numerical integration rules were used for the computation of the element stiffness matrices for the TI0 and T20 elements, respectively. The PCG solver based on SSOR preconditioning used in [1] was used for the solution of the global linear equations system and double precision words were used to store all the real variables in the analysis. In all the analyses, the value of S (Eq. (15)) used was set to 1 x 10 -5. For each model problem, uniform refinement was carried out by using four uniformly refined meshes for each element type tested. Hence, a total of 48 finite element analyses (4 x problems x 4 meshes X 3 element types) were carried out for uniform refinement. The uniform meshes used in Problem 1 are shown in Fig. 9 while those corresponding to Problems 2 to 4 are shown in Figs. 10-12, respectively. For the solution of the global stiffness equations, except for the initial mesh U1 to which a null vector was used, during subsequent refinements, both a NV and the BI scheme were applied to construct the initial guess for the PCG solver. Adaptive refinement was also carried out for all the model problems except Problem l in which the exact solution is smooth. All three elements were used and the adaptive refinement strategy described in Section 4.1 was applied. The meshes U1 shown in Figs. 10-12 were used as initial meshes. A value of n = 4 was used for all the model problems and elements tested, while five different values of r corresponding to r = 1.0, 2.0, :3.0 4.0
(a) Mesh U 1
Co) Mesh U2
(¢) Mesh U3
(d) Mesh U4
Fig. 9. Uniform refinement for Problem 1.
(a) Mesh UI
(b) Mesh U2
(e) Mesh 133
Fig. 10. Uniform refinement for Problem 2.
(d) Mesh U4
C.K. Lee, S.H. Lo / Comput. Methods Appl. Me~h. Engrg. 170 (1999) 39-64
50
(a) Mesh U1
(¢) MeshU3
(b) Mesh U2
(d) Mesh U4
Fig. 11. Uniform refinement for Problem 3.
(a) MeshU1
(b) MeshU2
(e) MeshU3
(d) MeshU4
Fig. 12. Uniform refinement for Problem 4.
and ~ were used to test the sensitivity of the refinement strategy Hence, for Problems 2 to 4, a total of 15 sets (3 elements × 5 values of r) of adaptive refinements had been carried out. In all the adaptive analyses, the value o f A was set to 0.5 regardless of the actual strength of singularity. As in the case of uniform refinement, except for the initial mesh to which a NV was applied, the initial guesses for the PCG solver in all other adaptive refinement steps had been constructed by using both a NV and the BI scheme. Finally, due to the differences in the difficulty of the test problems and the relative accuracy of the elements tested, different target relative errors r/, were selected for the three model problems and their values are shown in Table 2. 5.4. Numerical results 5.4.1. General behavior o f the error estimator, the adaptive procedure and efficien O, o f different elements Since quite a large number of analyses were done (48 and 162 analyses for uniform and adaptive refinements, respectively), it will be too tedious and clumsy to list all the properties of the finite element meshes generated and the results obtained. In fact, in the previous studies [1,3], the behavior of the error estimator and the adaptive procedure as well as the relative efficiency of different elements have been studied in detail. A concise summary of the results obtained is given below. Basically, the results obtained in the current study reconfirm the conclusions obtained in [1,3]. It is found that (1) For problems with stress singularities, the adaptive refinement procedure described in Section 4.1 can effectively eliminate the effect of singularities and the c3nvergence rates were much improved when comparing with uniform refinement. Such result is obtained for all the values of r and elements tested. (2) The Z 2 error estimator using the SPR for the construction of smoothed stress can provide reliable error Table 2 Target relative error, r/, for different elements in adaptive refinement Element
Problem 2
Problem 3
Problem 4
T4 TI0 T20
5% 1% 1%
5% 1% 1%
15% 5% 5%
C.K. Lee, S.H. Lo / Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64
51
estimates in both uniform and adaptive refinements for problem without singularities and in adaptive refinement for problems with singularities. (3) In terms of total computational effort and storage requirement, the T10 element outperform the linear T4 and the Cubic T20 elements. All the last adaptive meshes generated for Problems 2 to 4 for different values of r and elements used are shown in Figs. 13-15, respectively. 5.4.2. The general performance o f the SSOR PCG solver Fig. 16 shows the relationship between N~ and ,A]I/6 for all the problems during uniform refinements when ,f different initial guesses are used with the PCG solver. The results obtained during adaptive refinements are shown in Figs. 17-19 for Problems 2 to 4, respectively. The relationship between Nz and NI for all the uniform and adaptive meshes generated is shown in Fig. 20. From the results shown in Figs. 16-20, the following conclusions can be drawn on the performance of the PCG solver. (1) For a given element type, the value of N z is directly proportion to Ni and Eq. (14) is strictly hold. For the finite element meshes tested in this study, the relationship between N. and ~ can be written as
(33)
N z = CppNj
where Cp is a constant dependent only on the type of elements in the mesh. In the current study, it is found that Cp -~ 20 for the T4 and T10 elements while for the T20 element, Cp ~ 22. (2) In case that a NV is used, the value of N, is directly proportional to N}/6 and Eq. (17b) is validated. (3) The PCG solver used in the current study is highly efficient in solving the linear equation systems in 3D adaptive analysis in which the volume ratio of elements can be very large. While in the most extreme case the volume ratio between the largest and the smallest elements in the mesh is greater than 1 × 107, the numbers of iterations required by the solver to achieve a converged solution for all the last adaptive meshes are only a small fraction of the number of degrees of freedom in the meshes, especially when the BI scheme was employed. Table 3 shows the results obtained from different model problems when r = 3.0.
it (a) r=1.0
(b) 1"=2.0.
(¢) r=3.0
(d) r--4.0
(e) r---~
Fig. 13. Last meshes of adaptive refinement for Problem 2. (Note: Top row: T4 elements, middle row: T l 0 elements, bottom row: T20 elements.)
C.K, Lee, S.H. Lo / Comput. Methods Appl. Meets. Engrg. 170 (1999) 39-64
52
(a) r=l.0
Co) r'=2.0
(c) 1"=3.0
(d) r=4.0
(e) r'----co
Fig. 14. Last meshes of adaptive refinement for Problem 3. (Note: Top row: T4 elements, middle row: TI0 elements, bottom row: T20 elements.)
4k k (a) r=l.O
(b) r=2.0
(c) r~3.0
(c) r----4.0
(e) r-----oo
Fig. 15. Last meshes of adaptive refinement for Problem 4. (Note: Top row: T4 elements, middle row: TI0 elements, bottom row: T20 elements.)
C.K. Lee, S.H. Lo / Comput. Methods Appl. Mech. Engrg. 170 (1999) 3 9 - 6 4 Ni
53
NI 6C0 )< -/~- . . . . . . /
5C0 . . . . . . . . . . . . . . . . . . . . . . . . .
iit
4(]0 . . . . . . . . . . . . . . . . . . . . / - / - "
2(]0 .i . . 3. . . . . . . .(.~. . . . ] ~
~~ . . . . . o. . . . . . . 1 ~ - - ~~-~ . . . . . . . . . . 0
1.8
i
i
i
2.3
2.8
3.3
i
3.8 NFU~
/
~0
- .......
Y 1-- ~ /~/ - -0- " . . . 0.". . . . . . .
-
i
i
i
4.3
4..8
5.3
]
0
'
25
3
'
3.5
(a) Problem 1
i
. . 5 . 5.5 . .
4.5 NFJ~
6
6.5
7
(c) Problem 3
NI
NI
600
6C0 v
500 . . . . . . . . . . . . . . . . . . . . . . . . . . . .
-~ / /
5O0
--
x
,oo
3O0 20O ............................
100
02
i
2.5
i
3
i
3.5
i
4
~
i
4.5 5 NFI,~
i
5.5
i
6
i
6.5
0
7
i
h
i
i
i
I
i
~
i
2.5
3
3.5
4
4.5 NF~,
5
5.5
6
6.5
(b) Problem 2 Legends:
7
(d) Problem 4
-~-T4-NV -=-T4-BI ~ T10-NV tI-T10-BI ~ T 2 0 - N V
Fig. 16. G r a p h s of NI against N F 'I~ for uniform refinement.
~-T20-BI
(Note: N V = N a l l vector as initial guess, BI = Initial guess by b a c k
interpolation.)
5.4.3. The effect of using the BI scheme to construct an initial guess during uniform and adaptive refinements From Figs. 16-19, the following conclusions can be drawn on tile effect of using the BI scheme with the PCG solver. (1) In all the cases, by using the BI scheme, the number of iterations N~ required to achieve a converged solution was reduced. (2) In terms of different elements, the use of the BI scheme re~;ulted in most remarkable improvement when the T20 element was used and it was the least for the T4 element. (3) From Fig. 16, when the meshes were uniformly refined, even when the BI scheme was used, the value of N~ still increased proportional to N s~/6 so that the order of computational cost required to solve the global Table 3 Ratio o f N,/NI (%) for different model problems during adaptive refinement, last adaptive meshes, r = 0
Problems
2 3 4
A N V is used as initial guess
The BI scheme is used
T4
T10
T20
T4
TI0
T20
0.37 0.28 0.65
0.35 0.39 0.74
0.47 0.58 0.54
0.29 0.22 0.58
0.19 0.30 0.30
0.25 0.32 0.33
C.K. Lee, S.H. Lo
54
--[
Comput. Methods Appl. Mech Engrg. 170 (1999) 39-64
i I
i NI
I ¸
g 9
,d I
"8
I r~
,,4 I
I
e'~
I ~4 I
t~
I
`6
II
etO
,4 I
`6
.-=
t'q
i
\
i
I I
I
I
X I
ca
=
] ~4
Q
z
t~
~q
t~
~o t~ ,,4
-el
Z
~4
,4
r~ i< ,4
N
o
t~
C.K. Lee, S.H. Lo / Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64
55
t.-.
! aa
,,4
0
"6 I
I
;tr~
1I r,i o
¢,1 ,4
:'-2."
t-,
.~
,A
z PI
b-,
z z
e4 I
i
I
,,4
E
,,4
"~
I
1 1 I
,,4
1
e,i ,4
ool
-7
I I
o
56
C.K. Lee, S.H, Lo / Comput. Methods Appl. Mech. Engrg, 170 (1999) 39-64
"2".
L~ ,fi
I
e'~ ~4
I
w~ v4
I
.4 I
el
I
:=,
e-,
I
"¢
H
Z J
ca)
w~ ¢q
m w~ ~4
Z II >, Z
z
t~
TI + "o I
~
I
,
,
r
~
~
~
i
I
t
I
,,.41
4
~
•
Z
M
o e'l
,5
C.K. Lee, S.H. Lo / Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64
57
N Z (Millions) 10,00 I
g.oo
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
/
-
.
.
.
.
.
.
.
.
.
,,n/
N F (Thousands)
+I"4 ~
TIO ~
T20
Fig. 20. NZ against NF for all meshes generated. stiffness equations remained the same. When comparing the effect of the BI scheme on different model problems for uniform refinement, the same figure indicates that the greatest reduction in N/is obtained for Problem 1 (up to 60% reduction) in which the exact solution is smooth. For Problems 2 to 4 only at most a 10-20% reduction was achieved. (4) From Figs. 17-19 it can be seen that, similar to the case of uniform refinement, for adaptive refinement, the BI scheme is more effective for the T10 and T20 ele~aaents than for the T4 element. However, in general, the use of the BI scheme in adaptive refinement can more effectively reduce the value of N~, especially in the last few refinement steps and the reason for this is not difficult to understand. Since in adaptive analysis the mesh patterns in the last few refinements will become rather similar, the performance of the BI scheme is further enhanced. From the results shown in Figs. 17-19, when the BI scheme was used, if the value of r used was greater than a certain minimum value ?'rain, the value of N i needed in the last few refinement steps of the adaptive analysis will increase at a rate lower than N)/6 or even become independent of N/. Hence, by using Eqs. (31) and (33), the total computational cost needed in terms of flops can be written as Nflop
~-
4CI, p N ) +~ = O ( N ) ~ ' )
(34)
where 0 ~< 3' < 1 ] 6 and from the numerical results obtained, in many cases with a suitable value of r, y--0. From Figs. 17-19, Eq. (34) is at least true using T10 and T20 elements for the model problems tested. In fact, from the model problems used in this study, it appears that the value of rmi. for a given element will increase as the value of A decreases. For example, in Problem 2 in which ,~ -- 0.7111, the value of rmin is approximately equal to 2.0 for the T10 element while for Problems 3 and 4, r,, m is approximately equal to 3.0 corresponding to a value of ,,/= 0.5445. It should be remarked that such a decrease in the order of operations for the PCG solver will have a significant effect on the total computational cost in full 3D finite element analysis. Since in realistic 3D applications, finite element meshes with a large amount of degrees of freedom are commonly used and the solution of the global system will be one of the most computational expensive modules in the adaptive analysis, a reduction of the order of operation from O(N~/6 ) to O(Nr) will lead to considerable savings in the total CPU time. 5.4.4. The effect o f using different values o f r on the total solution cost In order to study the effect of using different values of r on the total computational cost by the PCG solver, the total computational cost in terms of cumulative NzN~ was logged and Figs. 2 1 - 2 3 shows the graphs of relative error against the computational cost for different values of r (in logarithm scale) for Problems 2 to 4, respectively. Note that Figs. 2 1 - 2 3 are all corresponding to the recruits obtained when the BI scheme was used since it is more efficient than using a NV. From these figures, it can be seen that (1) In all model problems, by using r = ~ clearly did not lead to a most efficient adaptive refinement procedure in terms of computational cost for the solution of the global system. In fact, a considerable
C.K. Lee, S.H. Lo I Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64
58
----'~'"l
I
I
I
I r I I
t I I I
/ ~
I~
I I I I
I / I/
I//
li
I
I
)
--
II
I I
;
I I
I I
II II
I
II i ~
,/',Z : '/
I I
I I
i
',i i72
,/ Ir I I
i ,.~ "~
II I
l
/,
I
+
I111
I I
I
I
Illl
I
I
4 H
~
4
Illi
14~i
Iil~l l I~! 1 ;7 iI~1 Illl
I
m-
I
~ "}
I
e4
.~
,.
I
IIIII
I I
I
I
IIIII
I I
I
I
i I
I
II
0~
I I
z
III Ill
I I
[11
I
III III III IIJ Ill
I
l
+
IIll
I I I
I
IIIl~l
I I ~I"
I I I ! i I
~ I I I I
I III IIIII III liI II I I
I I I I
i
I
I
I
T
II
1 I I I i i
I J I I I I
I ] I I
II I I I
I
I
I
Ii
I
..
~N
~
-~
II III
I
Illll
I I
I
I
I
Idlll
I
I
I
I
I
fill
I I
II
rll
II II II
III Ill II~
t:
';fP Jill Ill III Ill Ili
li'
r-
!
I I ;
IlllJ PIIII I ill lill
I
llJl
1
IIIitl
I
~ i '-~
i I
i
[i PI
I
IP J~LLC~
I
I I
II Ili I[11 Illi Ilrl iPll ;Ill Olll ill] illl lilll
N
I
I
+
II
~s
~
I
•
IIIII
Q
I
i ~
IIIIII
l]]]t Illll IIIII IIPII IIIII lilll Illll Idll I Illll IIIII Illil IIIli Illll I
I
I IXI
o
t'M
I
l
I
'
I
I
,~
I
IIII frill
I IIIII I ,I ,,~/ IIHII', lilt I I
I I I
~ ~ ~
IIIFI
I 4
~_ r I
PIIII
I
I
i,~ e~
Illll
I
I
I
I
I
.=
C.K. Lee. S.H. Lo / Comput.
Methods
Appl.
Me&.
Engrg.
170 (1999)
39-64
59
60
C.K. Lee, S.H. Lo / Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64
amount of computational cost can be saved by using an optimal value of r < ~. Figs. 2 1 - 2 3 also indicate that whenever the relative error of the initial mesh is much greater than the target relative error r/,, it is preferable to use a value of r < ~. (2) When a value of 1.0 ~< r ~< 4.0 is used, the efficiencies of the adaptive refinement scheme are quite similar and it is not possible to conclude that which vaiue of r will always result in a most efficiency refinement procedure. It seems that the optimal values of r will depend on both the element used and the real problem in hand (probably affected by the value of A of the model problem). (3) In general, the final relative errors achieved by the adaptive refinement procedure are all close to the target value r/,. Moreover, it was found that a lower final relative error were usually achieved by adopting a higher value of r. In fact, from Figs. 21-23, it can be seen that when a value of r = ~ was used, the final relative error obtained is usually lower than r/, (but in the expense of more computational cost). As the total computational cost is rather insensitive to the value of r taken in the range 1.0 ~< r ~< 4.0, it appears that in practice a higher value of r (say r = 3.0 o= 4.0) should be preferred. In fact, for all the model problems and elements tested in this study, a nearly optimal computational cost can be achieved by setting r = 3.0. By doing so, the number of iterations needed to solve the global linear systems in the last few adaptive refinement steps is virtually independent of 1:he value of Nr.
6. Conclusions and future investigations In this paper, several aspects of efficient solution of full 3D finite element analysis problems have been studied. From the results obtained in the numerical experiments, the following conclusions can be drawn. (1) For large scale 3D problems involving stress singularities, adaptive refinement analysis appears to be the only feasible approach to achieve a finite element solution with a high prescribed accuracy. Due to the effect of singular points, solution by uniform refinement is virtually impractical in terms of storage and computational effort needed. Furthermore, the T l 0 element is again found to be the most effective element tested in the Lagrangian tetrahedral family. (2) For both uniform and adaptive refinements, when comparing with the usual practice of using a null vector (NV), the use of the back interpolation (BI) scheme for the construction of the initial guesses can reduce the number of iterations needed by the PCG solver to reach a converged solution. More importantly, when adaptive refinement is used, with a series of properly constructed adaptive meshes, the number of iterations needed for the solution of the global system will increase at a lower rate than expected. In fact the numerical results obtained in this study shows that for an optimal case it will even become independent of the number of degrees of freedom in the mesh. As a result, for a large scale full 3D analysis, if a suitable adaptive refinement strategy and the BI scheme are used, the order of the total computational cost required to solve the global linear equations system will be only proportional to the number of degrees of freedom in the mesh and this implies a significant speed up of the whole analysis procedure. Table 5 summaries the order of computational cost required for the solution of the global system by using different approaches. (3) In the numerical studies, it has also been fi~und that, tor 3D adaptive analysis, a minimum total computational cost is usually n o t achieved by adopting a constant target relative error during every refinement steps. The total computational cost needed to achieve a solution with the prescribed accuracy can be minimized by gradually reducing the target relative error during the adaptive refinement. The numerical results obtained indicate that the total computational cost can be nearly minimized over quite a Table 5 Order of computational cost needed by different solution approaches Methods
Order of computational cost
Direct methods (Gaussian elimination) [12l Conjugate gradient method [ l 2] Precondition conjugate gradient method using a NV as initial guess [12] Precondition conjugate gradient method using adaptive refinement and the BI scheme to construct the initial guesses (this study)
O(N~'3) O(N~1~) O(N] ~ )
O(Nj)
C.K. Lee, S.H. Lo / Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64
61
large range of reduction rate of the relative error characterized by the parameter r defined in the last section. Numerical results obtained in this study also lead to the following future research directions. (1) Use of other PCG solvers In this study, due to the limitation of computational resource and time to carry out the numerical study, only the PCG solver based on the symmetric successive over relaxation preconditioning method was used. It will be very valuable if the performance of other preconditioning methods such as the element-by-element technique [ 13] and the incomplete Choleski factorization method [ 12,14,15] can be tested in the future studies. (2) Effect of node ordering on PCG methods. It should be pointed out that in this study, node and element renumbering has not been carried out for all the finite element meshes generated (in both uniform and adaptive refinements) and the original numbering by the mesh generator is retained. As it is well-known that [34] the ordering of the nodes in the finite elements mesh will affect the performance of a PCG solver, a systematic study on the effect of those commonly used node renumbering schemes will be undoubtedly useful to further increase the efficiency of the PCG solvers. (3) Applications to other types of finite element formulation Another topic in structural analysis in which the PCG solver is frequently used is the analysis of shell structures using degenerated solid elements [35]. In principle, the BI scheme used in this study can also be used to increase the computational efficiency in such formulation. However, care should be taken during the degenerated solid shell analysis as the unknowrs in the global equations system will involve both nodal transitions and rotations. The nodal rotations are usually defined with respect to some local co-ordinate systems which are not constant for the elements in the mesh. Furthermore, the order of magnitude of the rotation terms will be different from the nodal translations. Hence, special precautions should be needed when using the BI scheme to obtain a good initial guess for the PCG solver for such applications.
Acknowledgment The financial support from the Crouched Foundation and the U.G.C. Grant Council for the project 'Adaptive Refinement Finite Element Analysis For Three-dimensional Problems' is gratefully acknowledged.
Appendix A. The SPR for 3D adaptive analysis In the original form of the SPR proposed by Zienkiewicz and Zhu [20], no addition functional was used so that //A = 0
(A.1)
However, this original form sometimes may become unstable when the sampling points in the patch are badly aligned [36]. Hence, some researchers [21,22] suggested that a sl:abilizing functional which is derived from the governing differential equation of the problem should be added. For 3D finite element analysis, such functional can be written as
f~ (SXo"* - q)T(sTo'* - q) dO
(A.2a)
so that S T o - -- q = 0
(A.2b)
is the governing differential equation for 3D problem. Note that another form o f / / a based on the virtual work principle has also to be used [23,24] such that
C.K. Lee, S.H. Lo / Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64
62
1
I IA = -~ (Ha - G)T (Ha - G) H =~ =
BxPdO e
and
(A.3a) G=
BTo" d O i=1
(A.3b)
i
where NEP is the number of elements in the patch and B is the strain-displacement matrix of the element. In general, the performance of the SPR (especially the convergence rate of the recovered stress) is very much dependent on the accuracy of the finite element stress over the sampling points (Eq. (10)). The main function of //A is tO increase the stability of the recovery matrix (Eq. ( 11 )) arid in most cases it can only slightly increase the accuracy of the recovered stress and has very minor effect on the convergence rate. For this reason, in this study the addition functional corresponding to Eq. (A.2) was used since it requires only a little extra computational effort. Another point worth mentioning is that, in [23] and [24], it was suggested that if the additional functional corresponding to Eq. (A.3) is used, it is possible to improve the accuracy (or sometimes even the convergence rate) of the recovered stress by increasing the order of polynomial terms used in P (Eq. (9)) in the stress recovery. However, such practice is not recommended in 3D analysis due to the following reasons. (1) Some preliminary results obtained in this study indicate that the improvements in the accuracy obtained by raising the order of P is relatively minor comparin~, to the cases of 2D [37] and plate-bending problems [3811. (2) More importantly, a p t h order 3D complete polynomial will consist of (p + 1)(p + 2)(p + 3 ) / 6 terms so that the computational cost of raising the order of P will increase in an expensive rate of O(p3). (3) For some elements, raising the order of P in the 3D SP1;', will lead to an unstable recovery matrix. Hence, in all the analyses carried out in this study, the order o5 polynomial terms used in the SPR is equal to the order of the element used.
Appendix B. Construction of initial guesses by back interpolation during 3D adaptive analysis In 3D adaptive analysis, the computational efficiency of the PCG solver can be increased by constructing a good initial guess using back interpolation (BI) from the finite element displacements obtained from the previous analysis. In the BI scheme, the following steps will be applied to every nodal point p in the current finite element mesh. (i) Find out the element E in the previous finite element mesh which contains the point. (ii) Compute the element local co-ordinates of p with respect to the element E. (iii) Compute the initial guess at point p using the local co-ordinates, the element shape function and the finite element displacements from the previous analysis. Two points about the computational efficiency for the above ]3I scheme should be noted:
1. The overall order of operations needed A 3D adaptive finite element mesh can easily consist of a huge number of elements, hence if in step (i) the element E is located by a linear search (i.e. check all the elements in the previous mesh one by one), the overall order of operations for the BI scheme will become O(NN, x NEp)
(B.1)
where N N and NEp are the number of nodes in the current mesh and the number of elements in the previous mesh respectively and the speed of the BI scheme will become unacceptably slow for large values of N N and NEp. In practice, a more efficient way is to store the extents of the elements in the previous mesh in an alternating digital tree structure [39] such that the searching procedure can be accomplished in an order of O(NN,. X log(NEp)) steps.
2. Computation of local co-ordinates In step (ii), for a linear T4 element, the local co-ordinates of the pointp can be easily obtained by considering
C.K. Lee, S.H. Lo / Comput. Methods Appl. Mech..Engrg. 170 (1999) 39-64
63
the volumes of the tetrahedrons formed by p and the four faces cf element E. However, to compute the local co-ordinates of the point for the higher order T10 and T20 elements is not a trivial task. In general, it will involve the solution of a high order polynomial equation [40] and is obviously not desirable. In the current implementation, in case that higher order elements are employed, the element will first be broken down into a number of subelements (10 and 20 subelements for the T10 and T20 elements, respectively). The subelement which contains the point is then located and the local co-ordinates o f p with respect to that subelement is found. Finally, the estimated local co-ordinates of p with respect to the parent element is computed by linear interpolation from the local co-ordinates of the nodal points of the subelement. This method is adopted because it is simple and fast. Furthermore, the local co-ordinates obtained will be exact if all the edges and faces of the elements are fiat surfaces and straight lines respectively. In practice, such conditions are true for most of the elements in the mesh except those lying on the domain boundary.
References [1] C.K. Lee and S.H. Lo, Automatic adaptive 3-D finite element refinement using different-order tetrahedral elements, Int. J. Numer. Methods. Engrg. 40 (1997) 2195-2226. [2] K.T. Schietze, P.S. Shiakolas, S.N. Muthukrishnan, R.V. Nambiar and K.L. Lawrence, A study of adaptively remeshed finite element problems using higher order tetrahedra, Comput. Struct. 54 (1995) 279-288. [3] C.K. Lee and S.H. Lo, Automatic adaptive refinement finite element procedJre for 3D stress analysis, Finite Elem. Anal. Des. 25 (1997) 135-166. [4] O. Hassan, K. Morgan, E.J. Probert and J. Peraire, Unstructured tetrahedral mesh generation for three-dimensional viscous flow, Int. J. Numer. Methods. Engrg. 39 (1996) 549-567. [5] R. Lohner, Automatic unstructured grid generators, Finite Elem. Anal. Des. 25 (1997) 111-134. [6] P.L. George, Improvements on Delaunay-based three dimensional automatic mesh generator, Finite Elem. Anal. Des. 25 (1997) 297-317. [7] B.M. Irons, A frontal solution program for finite element analysis, Int. J. Numer. Methods Engrg. 2 (1970) 5-32. [8] .P. Hood, Frontal solution program for unsymmetric matrices, Int. J. Numer. Methods Engrg. 10 (1976) 379-399. [9] D.P. Mondkar and G.H. Powell, Towards optimal in-core equation solving, Comput. Struct. 4 (1974) 531--548. [10] C.A. Felippa, Solution of linear equations with skyline-stored symmetric matrix, Comput. Struct. 5 (1975) 13-29. [1 I] E. Mendelssohn and M. Baruch, Solution of linear equations with a symmetrically skyline-stored nonsymmetric matrix, Comput. Struct. 18 (1984) 215-246. [12] O. Axelsson and V.A. Baker, Finite Element Solution of Boundary Value Problems (Academic Press Inc. 1984). [13] T.J.R. Hughes, R.M. Ferencz and J.O. Hallquist, Large-scale vectorized implicit calculations in solid mechanics on a cray ~-mp/38 utilizing ebe preconditioned conjugate gradients, Comput. Methods Appl. Mech. Engrg. 61 (1987) 215-248. [14] M. Papadrakakis and N. Bitoulas, Accuracy and effectived of preconditioned conjugate gradient algorithms for large and illconditioned problems, Comput. Methods Ap.pl. Mech. Engrg. 109 (1993) 219-232. [15] P. Saint-Georges, G. Warzee, R. Beauwens and Y. Notay, High-performance PCG solvers for FEM structural analysis, Int. J Numer. Methods Engrg. 39 (1996) 1313-1340. [16] O.C. Zienkiewicz and R.L. Taylor, The Finite Element Method, Vol. 1: Basic ]-~ormulation and Linear Problems, Chapt. 5, 4th edition (McGraw-Hill Book Co. 1989). [17] I. Babuska and B. Szabo, On the rates of convergence of the finite element method, Int. J. Numer. Methods Engrg. 18 (1982) 323-341. [18] O.C. Zienkiewicz and J.Z. Zhu, A simple error estimator and adaptive procedure for practical engineering analysis, Int. J. Numer. Methods Engrg. 24 (1987) 337-357. [19] I. Babuska, T. Strouboulis and C.S. Upadhyay, A model study of the quality of a posteriori error estimators for finite element solutions of linear elliptic problems, with particular reference to the behavior near the boundary, Int. J. Numer. Methods Engrg. 40 (1997) 2521-2577. [20] O.C. Zienkiewicz and J.Z. Zhu, The superconvergent patch recovery and a po~;teriori error estimates. Part 1: The recovery technique, Int. J. Numer. Methods Engrg. 33 (1992) 1331-1364. [21] T. Blacker and T. Belytschko, Superconvergent patch recovery with equilibriura and conjoint interpolant enhancements, Int. J. Numer. Methods Engrg. 37 (1994) 517-536. [22] N.E. Wiberg and F. Abulwahab, Patch recovery based on superconvergent derivatives and equilibrium, Int. J. Numer. Methods Engrg. 36 (1993) 2703-2724. [23] B. Boroomand and O.C. Zienkiewicz, Recovery by equilibrium in patches (REP), Int. J. Numer. Methods Engrg. 40 (1997) 137-164. [24] T. Lee, H.C. Park and S.W. Lee, A superconvergent stress recovery technique with equilibrium constraint, Int. J. Numer. Methods Engrg. 40 (1997) 1139-1160. [25] M. Surajana and K.H. Law, A robust incomplete factorization based on value and space constraints, Int. J. Numer. Methods Engrg. 38 (1995) 1703-1719. [26] O. Axelsson, Iterative Solution Methods (Cambridge University Press, Cambridge, 1994).
64
C.K. Lee, S.H. Lo / Comput. Methods Appl. Mech. Engrg. 170 (1999) 39-64
[27] I.D. Parsons and J. F. Hall, The multigrid method in solid mechanics: Part I: Algorithm description and behaviour, Int. J. Numer. Methods Engrg. 29 (1990) 719-737. [28] I.D. Parsons and J.F. Hall, The multigrid method in solid mechanics: Part If: Practical applications, Int. J. Numer. Methods Engrg. 29 (1990) 739-753. [29] K. Morgan, J. Peraire, J Peiro, O. Hassan and E.J. Probert, Unstructured grid methods for compressible flows, in: Y.K. Cheung, J.H.W. Lee and A.Y.T. Leung, eds., Proc. of the First Asian-Pacific Conference on Computational Mechanics, Hong Kong, (A.A. Balkema, 1991) 1489-1497. [30] J.E De, S.R. Gago, D.W. Kelly and O.C. Zienkiewicz, A posteriori error analysis and adaptive processes in the finite element method: Part II--Adaptive mesh refinement, Int. J. Numer. methods Engrg. 19 (1983) 1621-1656. [31] C.K. Lee and R.E. Hobbs, On using different finite elements with an automatic adaptive refinement procedure for the solution of general 2D stress analysis problems, Int. J. Numer. Methods Engrg. 40 0997) 4547-4576. [32] S.H. Lo, 3D mesh refinement in compliance with a specified node spacing function, Engrg. Fract. Mech. 21 (1998) 11-19. [33] S.E Timoshenko and J.N. Goodier, Theory of Elasticity, Chap. 4, 3rd edition (McGraw-Hill, New York, 1970). [34] I.S. Duff and G.A. Meurant, The effect of ordering on preconditioned conjugate gradient, BIT 29 (1989) 635-657. [35] C.K. Lee and R.E. Hobbs, Automatic adaptive refinement for shell analysis using nine node assumed strain element, Int. J. Numer. Methods Engrg. 40 (1997) 3601-3638. [36] C.K. Lee and S.H. Lo, Robust implementation of the superconvergent patch recovery technique, Pruc. Sec. Asian-Pac. Conf. Computations. Mech., Sydney, Australia, S. Valliappan, V.A. Pulmano and F. Tin-Loi, eds. (A.A. Balkema, 1993, 1275-1280). [37] C.K. Lee and S.H. Lo, On using different recovery procedures for the construction of smoothed stress in finite element method, Int. J. Numer. Methods Engrg. [38] C.K. Lee and S.H. Lo, On constructing accurate recovered stress fields for the finite element solution of Reissner-Mindlin plate bending problems, Comput. Methods Appl. Mech. Engrg., to appear. [39] J. Bonet and J. Peraire, An alternating digital tree (ADT) algorithm for 3]) geometric searching and intersection problems, Int. J. Numer. Methods Engrg. 31 (1991) 1-17. [40] R.H. Crawford, D.C. Anderson and W.N. Waggenspack, Mesh rezoning of 2D isoparametric elements by inversions, Int. J. Numer. Methods Engrg. 28 (1989) 523-531.