Computers and Structures 84 (2006) 2306–2315 www.elsevier.com/locate/compstruc
An element-based displacement preconditioner for linear elasticity problems C.E. Augarde a
a,*
, A. Ramage b, J. Staudacher
a
School of Engineering, Durham University, South Road, Durham DH1 3LE, UK Department of Mathematics, University of Strathclyde, Glasgow G1 1XH, UK
b
Received 20 March 2006; accepted 6 August 2006 Available online 27 October 2006
Abstract Finite element analysis of problems in structural and geotechnical engineering results in linear systems where the unknowns are displacements and rotations at nodes. Although the solution of these systems can be carried out using either direct or iterative methods, in practice the matrices involved are usually very large and sparse (particularly for 3D problems) so an iterative approach is often advantageous in terms of both computational time and memory requirements. This memory saving can be further enhanced if the method used does not require assembly of the full coefficient matrix during the solution procedure. One disadvantage of iterative methods is the need to apply preconditioning to improve convergence. In this paper, we review a range of established element-based preconditioning methods for linear elastic problems and compare their performance with a new method based on preconditioning with element displacement components. This new method appears to offer a significant improvement in performance. 2006 Elsevier Ltd. All rights reserved. Keywords: Finite elements; Iterative solvers; Preconditioning; Elasticity; Geotechnics; Structures
1. Introduction Users of finite element software naturally want their calculations to run faster, both to speed up suites of twodimensional (2D) solutions and to make three-dimensional (3D) analyses feasible. In geotechnical engineering, nonlinear material models are routinely used, so a single problem involves a succession of linear system solves (one for each load step) which adds considerably to the time needed to reach a final solution. With very large problems in mind, much interest has therefore been shown in iterative solution techniques. Particularly attractive in this context are methods that do not require that the full system be assembled during the solution process, but work directly with contributions from individual elements. Potentially, these approaches make feasible problems for which direct solu*
Corresponding author. Tel.: +44 0191 334 2504; fax: +44 0191 334 2390. E-mail address:
[email protected] (C.E. Augarde). 0045-7949/$ - see front matter 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.compstruc.2006.08.057
tion would be almost impossible, due to the memory requirements. Furthermore, such iterative methods are ideal for implementation with vector and parallel architectures. However, the success of iterative solvers critically depends on how the linear system can be adapted to improve convergence, or preconditioned. This paper reviews a range of preconditioning methods for systems arising from linear elastic material models. Linear elasticity is at the heart of almost all continuum-based constitutive models used in structural and geotechnical engineering, and therefore it is reasonable to concentrate efforts (initially at least) on improvement of solvers for these cases. All of the preconditioners discussed here are element-based, that is, they can be computed element-wise and do not need to be stored in assembled form. The methods presented are drawn mainly from the mathematical literature and some may not be familiar to engineers. In addition, we introduce a new preconditioner based on factorising the displacement component of the element matrices. Following a brief explanation of these methods, we
C.E. Augarde et al. / Computers and Structures 84 (2006) 2306–2315
provide comparisons of their performance on a variety of test problems, using structured and unstructured meshes. Although one of the main advantages of element-based methods is that they have high potential for vectorisation and parallelisation, here we demonstrate the capability of each method using a single processor machine with serial implementation. The main conclusions are however transferable to a parallel environment: this will be demonstrated in a further paper which will present our results for parallel implementations [1]. 2. Background 2.1. Finite elements for linear elasticity The displacement-based finite element method uses a weak formulation of the equations of equilibrium, compatibility and material behaviour to arrive at the linear system Ku ¼ f;
ð2:1Þ
where u is the vector of unknown nodal displacements, f is the vector of nodal forces and K, the structure stiffness matrix, is found from Z K¼ BT EBdX; ð2:2Þ X
with integration over the problem domain X. For simplicity, we consider a 2D plane strain problem discretised with elements having nodes without rotational freedoms, and linear elastic material behaviour. Then the components of the right-hand side of (2.2) are 2 o/ 3 1 2 0 o/ 0 o/oxM 0 ox ox 6 7 o/1 2 0 o/ 0 o/oyM 7 B¼6 ð2:3Þ oy 4 0 oy 5; o/1 oy
o/1 ox
o/2 oy
o/2 ox
o/M oy
o/M ox
where /1, . . . , /M are shape functions associated with the M nodes in the mesh, and 2 3 1m m 0 E 6 m 1m 0 7 ð2:4Þ E¼ 4 5; ð1 2mÞð1 þ mÞ 1 0 0 m 2 where E is Young’s modulus and m is Poisson’s ratio (see, for example, [2]). The structure stiffness matrix in (2.2) relates to the whole problem domain and is of dimension n · n where n is the total number of degrees of freedom (n = 2M for our 2D problem). However, it is common practice to calculate element stiffness matrices Ke, which are of dimension ne · ne, where ne is the number of degrees of freedom on an element. Each Ke is formed using (2.2) with the domain of integration being the element volume. The global matrix K can then be constructed from the Ke using a Boolean connectivity matrix Ce (of dimension ne · n) by first assembling Ke ¼ CTe Ke Ce ;
ð2:5Þ
2307
(which distributes the terms in an element stiffness matrix to their associated global degrees of freedom) then summing over the elements to get K¼
E X
Ke :
e¼1
For linear elasticity, the structure stiffness matrix is symmetric positive definite (SPD). The symmetry is present because the Galerkin procedure by which the weak form is generated is self-adjoint, that is, the basis and test functions are the same. The Galerkin method is equivalent to minimisation of potential energy, and the non-negativity of the strain energy in the domain leads to positive definiteness of the structure stiffness matrix. 2.2. Preconditioners for linear elasticity One iterative solution method which is appropriate for SPD systems is the conjugate gradient (CG) method [3], implementation of which is described in detail in many texts for a standard finite element code (see for example [4]). The convergence rate of CG depends on the eigenvalue spectrum of the system coefficient matrix. In particular, it can be shown that the number of CG iterations required for convergence to within a specified tolerance (for instance, to reduce the residual to a small proportion of the original force vector) is proportional to the square root of the condition number j = kmax/kmin, where kmax and kmin are the maximum and minimum eigenvalues of the coefficient matrix, respectively. The aim of preconditioning linear system (2.1) is to reduce the condition number of the structure stiffness matrix, and hence reduce the number of iterations required for convergence. Conceptually, basic left preconditioning involves multiplying both sides of (2.1) by the inverse of a preconditioning matrix P, say, to obtain P1 Ku ¼ P1 f: An alternative is to apply the preconditioner in a symmetric way, for example, solving 1
1
1
1
P2 KP2 ðP2 uÞ ¼ P2 f
or
^ u ¼ ^f: K^
ð2:6Þ
^ is symmetThis ensures that the modified system matrix K ric. In practice, applying CG to (2.6) gives exactly the same iterates as the standard implementation of the preconditioned CG algorithm [5], which requires only the solution of a system with P as coefficient matrix (that is, the explicit 1 calculation of P2 is not required). The result of these facts is that although the aim is to choose P so that P1K has a reduced condition number, the action of the matrix P1 must also be available at a reasonable cost. The art of developing preconditioners is largely about balancing these two issues, namely their ability to improve convergence and the cost of their use. Considerable research on suitable preconditioners for SPD systems has been published within the mathematics
2308
C.E. Augarde et al. / Computers and Structures 84 (2006) 2306–2315
community. However, surprisingly little of this deals specifically with linear elasticity. Eqs. (2.2) and (2.4) show that, of the two elastic parameters E and m, the former will scale all terms in the stiffness matrix, while the latter will affect the diagonal dominance of E. Therefore m plays a significant role in the condition of the element stiffness matrices and hence K. As we will show below, when m becomes close to its theoretical maximum of 0.5, the condition number of the linear system rises dramatically. It is well known (in the mathematical literature at least) that, for a second order elliptic boundary value problem with smoothly varying coefficients (as we have here with linear elasticity) on a grid with no highly irregular elements jðKÞ ¼ Oðh2 Þ;
ð2:7Þ
where h is the discretisation parameter [6]. For a uniform mesh on a unit square with N elements along each side, we have h = 1/N so as the mesh becomes finer, the condition number increases rapidly. The simplest choice for the preconditioner P is the diagonal of the system matrix (this is known as Jacobi preconditioning). While it is clearly easy to compute the action of P1 in this case, the resulting improvement in convergence it is not necessarily very significant. In fact, the preconditioned system is spectrally equivalent to the original system (two methods using preconditioners P1 and P2, say, are said to be spectrally 1 equivalent if jðP1 1 KÞ ¼ ajðP2 KÞ for some constant a independent of the discretisation parameter h [6, p. 338]). Some examples of preconditioners specifically for linear elasticity have appeared in engineering journals. Diagonal preconditioning is promoted in many references (e.g. [2,4]), primarily because it is simple to implement and, in many cases, gives noticeable improvements in convergence over unpreconditioned systems. A second very successful class of methods for preconditioning SPD systems involves incomplete factorisation techniques. These have been used for a wide variety of problems and many variations of the basic algorithm exist: mathematical details of such techniques can be found in a number of texts (see for example, [7–9]). The underlying idea of Incomplete Cholesky (IC) factorisation (which is appropriate for SPD problems) is the formation of a sparse lower triangular matrix L such that LLT ’ K. The factor L can then be used in the same 1 sense as P2 in (2.6) above. The calculation of this factor, which is the major expense of this type of preconditioner, can be done in different ways. For some variants, L is forced to have a sparsity pattern determined in advance, often the original sparsity pattern of the lower triangular part of A itself: any potential elements of L that are created during the factorisation which are outside this sparsity pattern are simply dropped. Alternatively, a rule based on the size of the created entry (a dynamic drop-tolerance approach) may be used. Other modifications of this procedure include, for example, the Modified IC (MIC) method of Gustafsson [10] which adds all or part of a rejected entry to the associated diagonal. Examples of IC-based precondi-
tioners specifically for linear elasticity can be found in [7,11–13]. The main problem with IC and MIC factorisations is that existence of the factor L cannot be guaranteed for a general SPD matrix. One class of matrices for which IC factorisations are guaranteed to exist are so-called Mmatrices (a nonsingular matrix A is an M-matrix if aij 6 0 for i,j = 1 . . . n, i 5 j and A1 P 0). Unfortunately, while most linear systems arising in elastostatics are SPD, they are rarely M-matrices. As a result, various different approaches have been tried to devise robust IC preconditioners with these systems in mind. Axelsson [7] includes an intermediate stage in the MIC procedure where K is converted to an M-matrix before being factorised to provide the preconditioner. Alternatively, the Robust IC approach (RIC) alters the selection criteria for dropped fill-in entries and adds part or all of rejected entries into diagonals during factorisation [7,11]. Beauwens and coworkers in Belgium have made many contributions in this area and conclude in a recent study that a combination of RIC on a reduced version of A works well for large irregular grid problems in linear elasticity, while MIC is generally less well-suited for these situations [12]. One common feature of the methods described above is that they work entirely at the assembled structure stiffness level. In many cases, it is assumed that the material properties of the elements in the domain are uniform. This may be a reasonable assumption in, for instance, the analysis of steel structures. However, in geotechnical engineering, soil is commonly assumed to have increasing stiffness with depth from the ground surface, based on physical observations. In such cases the element stiffness matrices will vary across the domain. Some advanced, but widely used, soil models (see for example [14]) are based on nonlinear elasticity (that is, the properties vary with stress level) so additionally, element stiffness will vary during loading. If the material behaviour is elastoplastic (a situation not covered in the examples below but crucial to most geotechnical problems), element stiffness matrices may change considerably during an analysis. Much previously published research is based on the properties of the assembled structure stiffness matrix. While all element stiffnesses are obviously taken into account in this approach, it nevertheless ignores local element-level information that might lead to a better preconditioner. For these reasons it is important to examine preconditioning methods that do not require the structure stiffness matrix to be formed. In the next section we examine some element-based methods which fall into this category. 3. Element-based preconditioners An element-based iterative solution method can be defined as one in which the element stiffness matrices Ke and connectivity arrays are stored but are never combined together to form K. This allows relatively straightforward parallel implementation of a solver, because groups of
C.E. Augarde et al. / Computers and Structures 84 (2006) 2306–2315 Table 1 List of element-based preconditioners studied Key
Title
Reference
DIAG EBE SGS EMF FEP EDP
Diagonal scaling Element-by-element method Element-by-element symmetric Gauss–Seidel Element matrix factorization Finite element preconditioner Separate displacement preconditioner
[17] [19] [23] [21]
element stiffness matrices (with their connectivity information) can be sent to different processors. An element-based form of the CG algorithm is described in [4]: in this paper, we consider the incorporation of preconditioning into an algorithm of this type. We will describe five preconditioners that do not require the storage of any assembled matrices and present numerical results comparing their performance. The methods studied are listed in Table 1. Note that most of these techniques were originally developed and analysed for application to Poisson-type problems: we indicate in the text what modifications (if any) are required to enable their application to linear elasticity problems. We will assume throughout the following descriptions that K is an n · n SPD matrix arising from discretisation of a second order elliptic boundary value problem (as is the case in (2.1)). We also note here that a numbering of unknowns always exists such that the assembly of the lower (upper) triangular parts of the element matrices leads to a lower (upper) triangular global matrix. This fact is used in the construction of a number of the preconditioners described below. 3.1. Diagonal scaling (DIAG)
where D = diag(K) and De ¼ diagðKe Þ. This forces the ~ e to be the identity, and ensures that K ~ e is diagonal of K ~ positive definite. Each Ke can now be factorised as ~ e ¼ Le De LT ; K ð3:9Þ e
a process which is very inexpensive as it can be done by factoring the individual element matrices and taking advantage of the observation on local/global numbering above. The EBE preconditioner is then " #" #" # E E 1 Y Y Y 1=2 T PEBE ¼ D Le De Le D1=2 ; e¼1
PDIAG ¼ diagðKÞ ¼
E X
CTe diagðKe ÞCe
e¼1
for any problem [15]. As stated previously, this does not change the condition number of K so 2 jðP1 DIAG KÞ ¼ Oðh Þ:
In fact, it is shown in P[16] that this is true for any preconditioner of the form Ee¼1 CTe Qe Ce for some ne · ne matrices Q e.
e¼E
e¼1
which is easy to invert using forward and back substitutions. The analysis in [18,15] indicates that this method is spectrally equivalent to diagonal scaling, that is, 2 jðP1 EBE KÞ ¼ Oðh Þ:
Note that EBE preconditioning can be applied directly to elasticity problems without modification. 3.3. Element-based symmetric Gauss–Seidel (SGS) One disadvantage of the EBE method of the previous section is that additional storage is required for the element factorisations (3.9). To avoid this, we may instead use a symmetric Gauss–Seidel type splitting on an element level ~ e in (see e.g. [19, p. 378]). Here the regularised matrices K (3.8) are simply split as ~ e ¼ In Le LT ; K e
~ e . The where Le is the strictly lower triangular part of K resulting preconditioner is " #" # E 1 Y Y 1=2 1=2 T PSGS ¼ D ðIn Le Þ In Le D ; e¼1
Diagonal scaling can be considered as an element-based preconditioner where, following (2.5),
2309
e¼E
where D = diag(K). As above, inverting this preconditioner is straightforward and we expect 2 jðP1 SGS KÞ ¼ Oðh Þ:
Some examples of how other matrix splittings can be applied at an element level to produce element-based preconditioners can be found in [20]. 3.4. Finite element preconditioner (FEP) A different type of element-based factorisation technique is presented by Kaasschieter in [21]. He uses the decomposition
3.2. Element-by-element methods (EBE)
Ke ¼ ðLe þ De ÞDþ e ðLe þ De Þ ;
Many different variants of EBE preconditioners have been developed. Here we use the Cholesky factorisation method of Hughes, Levit and Winget [17]. This involves applying so-called Winget regularisation to the assembly of each element, Ke in (2.5), to obtain ~ e ¼ In þ D1=2 Ke De D1=2 ; ð3:8Þ K
where Le is strictly lower triangular, De P 0 is diagonal and Dþ e is the pseudo-inverse of De. The resulting preconditioner is given by
T
T
PFEP ¼ ðL þ DÞD1 ðL þ DÞ ;
ð3:10Þ
where L and D are the lower triangular and diagonal matrices arising from assembling the lower triangular and
2310
C.E. Augarde et al. / Computers and Structures 84 (2006) 2306–2315
diagonal parts of Ke, respectively. This requires a particular global and local numbering of unknowns. In addition, this preconditioner will only exist if D > 0. In [21], this condition is met by application of a variant of Reverse Cuthill–McKee (RCM) [22] numbering of the unknowns. This essentially ensures that the zero entries on the diagonal of each Dþ e (one per element in the bottom right position due to the singularity of the element stiffness matrix for the Poisson problem) do not all assemble to the same node, giving a zero on the diagonal of D. The inversion of (3.10) can be achieved using element contributions without assembling the lower triangular matrix L: see [23] for details. It is shown in [21] that the minimum eigenvalue of the preconditioned system is always 1, and that for Poisson-type problems the method appears to satisfy
idea of working with the displacement component of element matrices. In the first of a trilogy of papers on the parallel solution of linear elasticity problems, Gustafsson and Lindskog [24] consider a preconditioner based on assembling the so-called separate displacement component (SDC) part of the elasticity operator to obtain a preconditioner PSDC which is block diagonal (and therefore relatively cheap to invert). Here we use a similar idea in an element-based setting. We begin by explaining what is meant by the SDC part of the elasticity operator. As an example, consider the simplest 2D finite element (the constant strain triangle). Here the vector of nodal displacements can be written as
1 jðP1 FEP KÞ ¼ Oðh Þ:
where u and v are the displacement components in the x and y directions, respectively. If we reorder u, grouping x and y components together to obtain
Unfortunately, the application of this method to linear elasticity problems is not straightforward. The main difficulty is that the element matrices Ke (in 2D) now have a three-dimensional nullspace (corresponding to the three rigid-body modes) leading to the appearance of three zeros on the diagonal of each Dþ e . As a result, application of RCM numbering is not sufficient to guarantee that D > 0 and, as yet, we have been unable to identify a replacement numbering strategy which does guarantee this condition. In the numerical tests which follow, we have gone ahead and applied the FEP with node numbering as described in [21], but on some occasions this method breaks down. 3.5. Element matrix factorisation (EMF) This method, proposed by Gustafsson and Lindskog in [23], is similar to FEP but involves performing a Cholesky factorisation, Ke ¼ Le LTe say, of each element matrix. For Poisson problems, this factorisation always exists because the nullspace of Ke is one-dimensional. Writing Le ¼ Le þ De where Le is strictly lower triangular and De is diagonal with one zero diagonal entry, the factors Le and De can be assembled to form a global lower triangular matrix L and diagonal matrix D, respectively. Again this requires a particular global and local numbering of unknowns. The preconditioner PEMF is defined as 1
PEMF ¼ ½Lð1 þ ghÞ
þ Dð1 þ ghÞ½Lð1 þ ghÞ
1
T
þ Dð1 þ ghÞ ; ð3:11Þ
for some g > 0, although here we only consider the case g = 0 so the method becomes parameter free. As with FEP, for linear elasticity problems this method can break down, as the three-dimensional nullspace means that nodes cannot be numbered to ensure that PEMF is nonsingular. 3.6. Element displacement preconditioner (EDP)
T
u ¼ ½u1 ; v1 ; u2 ; v2 ; u3 ; v3 ;
T
u ¼ ½u1 ; u2 ; u3 ; v1 ; v2 ; v3 ; then the strain–displacement matrix Be for the element becomes 2 o/ o/ o/3 3 1 2 0 0 0 ox ox ox 6 o/3 7 o/2 1 7 0 0 o/ Be ¼ 6 oy oy oy 5 4 0 o/1 oy
o/2 oy
o/3 oy
o/1 ox
o/2 ox
o/3 ox
(cf. (2.3)). Writing the resulting element stiffness matrix in block form gives Z K11 K12 Ke ¼ BTe EBe de ¼ ; ð3:12Þ KT12 K22 e where K11 and K22 involve products of either x-derivatives, or y-derivatives, but not cross-products (involving both) and K12 involves the cross-products. The SDC part is obtained by ignoring the cross-product terms in (3.12), that is, the SDC element matrix is the block diagonal part of Ke. The method proposed in [24] is to take PSDC as the SDC part of the assembled structure stiffness matrix, which is also block diagonal, so is readily inverted. The authors show that the matrices PSDC and K are spectrally equivalent, leading to the fact that jðP1 SDC KÞ ¼ Oð1Þ: Clearly this appears to present an advantage in asymptotic terms over any of the preconditioners described above. The new element-based preconditioner which we propose (EDP) involves applying the FEP method described above to the SDC part of each element stiffness matrix Ke. That is, we form K11 0 KSDC ¼ e 0 K22 and assemble the resulting factors
In this section, we present a new element-based preconditioner which combines the FEP outlined above with the
KSDC ¼ ðLSDC þ DSDC ÞðDSDC Þþ ðLSDC þ DSDC ÞT e e e e e e
C.E. Augarde et al. / Computers and Structures 84 (2006) 2306–2315
via
2311
Table 3 Iteration counts for Problem A using constant strain triangles (CSTs)
PSDC ¼ ðLSDC þ DSDC ÞðDSDC Þ1 ðLSDC þ DSDC ÞT
ð3:13Þ
as before (cf. (3.10)). This removes the difficulties associated with establishing appropriate node-numbering strategies for FEP and has a number of additional advantages. Firstly, the decoupled blocks of the SDC part of the element matrices closely resemble Laplacian matrices, so the RCM-based numbering cited above is enough to ensure that the assembled diagonal matrix is positive. This guarantees the existence of an EDP preconditioner of the type (3.10) for elastic problems. In addition, EDP requires less work per iteration than FEP in the back substitution due to the increased sparsity of the resulting factor L. Finally, due to the spectral equivalence of K and its SDC part, we would hope to retain the improved performance of Kaasschieter’s original FEP (for Poisson problems) in terms of iteration counts. That is, we would hope that
CST
N
8
m = 0.0
DIAG EBE SGS EMF FEP EDP
24 11 11 28 19 14
49 20 21 78 48 21
m = 0.25
DIAG EBE SGS EMF FEP EDP
23 10 11 29 18 15
m = 0.4
DIAG EBE SGS EMF FEP EDP
m = 0.49
DIAG EBE SGS EMF FEP EDP
1 jðP1 EDP KÞ ¼ Oðh Þ
pffiffiffi leading to iteration counts k / j ¼ Oðh0:5 Þ. We note that a similar element displacement method could be obtained by using the EMF of Gustafsson and Lindskog [23] rather than FEP. We also tested this technique, but do not present the results here as they were not significantly different to those for FEP.
16
32
64
128
98 38 40 229 130 32
196 75 80 623 346 48
397 147 161 1559 856 72
48 19 22 107 56 23
97 38 43 1403 428 35
197 69 85 * * 54
396 138 169 * * 83
30 13 14 29 19 19
63 26 28 132 69 30
127 50 54 4525 799 46
256 99 105 * * 70
516 199 209 * * 107
45 22 24 45 31 41
107 42 46 235 124 71
216 82 90 * 2055 114
440 164 178 * * 181
886 327 357 * * 286
An asterisk indicates failure to converge in 5000 iterations.
4. Experiments with element-based approaches 4.1. Remarks on implementation
Table 4 Iteration counts for Problem A using linear strain triangles (LSTs)
In this section, we present the results of some numerical experiments carried out to test the element-based preconditioning techniques described above on a range of linear elastic model problems. Clearly, the efficiency of the methods is critically dependent on the architecture of the computer used and the effectiveness of the implementation. As stated in the introduction, all tests here were carried out on a single processor machine, that is, we have not used any form of vector or parallel architecture of the type for which element-based methods are best suited. Further-
LST
N
8
m = 0.0
DIAG EBE SGS EMF FEP EDP
20 13 14 31 18 14
m = 0.25
DIAG EBE SGS EMF FEP EDP
m = 0.4
Table 2 Condition numbers of linear elastic matrices m
N
16
32
64
128
47 24 25 77 42 22
103 47 51 282 109 33
215 94 103 1836 354 50
438 192 205 * 1702 76
21 13 15 29 17 16
52 25 28 60 30 24
113 49 53 107 47 38
235 100 106 153 69 57
486 200 215 224 106 85
DIAG EBE SGS EMF FEP EDP
24 17 19 27 16 21
68 31 36 47 29 33
145 60 69 81 45 49
301 121 138 127 71 73
615 244 276 191 113 110
DIAG EBE SGS EMF FEP EDP
27 37 42 32 25 46
142 78 90 64 49 79
357 160 181 134 100 127
741 326 363 403 223 192
1511 658 725 2552 574 284
8
16
32
64
128
CST 0.0 0.25 0.4 0.49
3.8780e+01 2.8838e+01 2.6181e+01 5.2101e+01
1.5733e+02 1.1787e+02 1.0687e+02 2.0623e+02
6.3247e+02 4.7544e+02 4.3214e+02 8.2915e+02
2.5348e+03 1.9084e+03 1.7383e+03 3.3315e+03
1.0148e+04 7.6460e+03 6.9729e+03 1.3362e+04
m = 0.49
LST 0.0 0.25 0.4 0.49
1.9100e+02 2.2217e+02 3.1853e+02 2.6425e+03
7.7095e+02 8.9679e+02 1.2888e+03 1.0831e+04
3.0908e+03 3.5952e+03 5.1677e+03 4.3500e+04
1.2370e+04 1.4389e+04 2.0682e+04 1.7413e+05
4.9487e+04 5.7563e+04 8.2741e+04 6.9663e+05
An asterisk indicates failure to converge in 5000 iterations.
2312
C.E. Augarde et al. / Computers and Structures 84 (2006) 2306–2315
more, we have implemented each method in fully assembled form. Although this is obviously not the way in which the algorithms would be applied in a practical setting, it enables us to present a clear comparison in terms of iteration counts and asymptotic behaviour. We have not, therefore, tabulated any CPU times as these would not be informative in this context. Issues of implementation and efficiency in a multi-processor environment will be discussed in [1]. 4.2. Test problems For each of the test problems described below, we present iteration counts for the solution of the linear system arising at the first load step. As the response is elastic, this is fully representative of the systems constructed at other load steps. Each problem was solved with four sets of material properties, namely, Poisson’s ratio m = 0, 0.25, 0.4 and 0.49 (only Poisson’s ratio was varied for the rea-
sons outlined above). In all cases the stopping criterion used was kf Kuk2 =kfk2 6 106
ð4:14Þ
where k Æ k2 represents the Euclidean norm of a vector. 4.2.1. Structured meshes—problem A The first suite of tests modelled the linear elastic response of a square plane strain grid to in-plane loading, using two different sets of structured meshes of triangular elements. The first was a set of five discretisations of constant strain triangles, where the number of elements per side of the domain N was varied from N = 8 to N = 128, doubling each time. The resulting linear systems were therefore of dimensions ranging from n = 162 (N = 8) to n = 33,282 (N = 128). A second set of five discretisations of linear strain triangles of matching sizes (that is, with the same number of nodes as the constant strain triangle meshes) was also used. The condition numbers of the
Fig. 1. (a) Problem B: cantilever with end point load. (b) Problem C: smooth rigid footing. Due to symmetry only one half of the domain is meshed. (c) Typical unstructured mesh used for Problem C.
C.E. Augarde et al. / Computers and Structures 84 (2006) 2306–2315
resulting linear systems (with diagonal scaling applied) are given in Table 2. The numbers of iterations required to satisfy criterion (4.14) in each case are given in Tables 3 and 4. An asterisk indicates failure to converge in 5000 iterations. 4.2.2. Unstructured meshes—problems B and C The five preconditioning methods were also tested on two more physically interesting problems, a cantilever with an end point load (problem B) and a smooth rigid footing (problem C). Details are displayed in Fig. 1. Here the discretisation was carried out on unstructured grids of linear strain triangles, with two values of n for each problem (n = 9004, 18,776 for the cantilever and n = 9818, 22,518 for the footing problem). The smaller grid for problem C is shown in Fig. 1c. The resulting iteration counts are shown in Table 5. An asterisk again indicates failure to converge in 8000 iterations. 4.3. Discussion of results 4.3.1. Problem A The numbers of iterations k required to reach convergence for various values of N and m with constant strain triangles (CSTs) and linear strain triangles (LSTs) are tabulated in Tables 3 and 4. The same information is displayed graphically in Figs. 2 and 3, where log2(k) is plotted Table 5 Iteration counts for problems B and C using n
B 18,776
DIAG EBE SGS EMF FEP D-FEP
1164 536 534 * * 179
m = 0.25
DIAG EBE SGS EMF FEP D-FEP
m = 0.4
m = 0.49
m = 0.0
12
12
11
11
10
10
9
9
8
8
7
7
6
6
5
5
4
4
ν = 0 .0
3 3
4
5
6
7
3
12
12
11
11
10
10
9
9
8
8
7
7
6
6
5
5
1658 763 742 * * 227
405 194 192 * * 86
641 289 289 * * 116
1312 598 600 * * 176
1869 849 824 * * 228
445 212 212 * * 95
684 304 303 * * 132
DIAG EBE SGS EMF FEP D-FEP
1763 797 833 * * 227
2503 1120 1125 * * 290
545 257 265 * * 145
875 387 395 * * 202
DIAG EBE SGS EMF FEP D-FEP
5011 2404 2541 * * 1266
7115 3382 3425 * * 1705
1113 622 586 * * 952
2002 939 952 * * 1438
Entries marked with an asterisk failed to converge in 8000 iterations.
5
6
7
ν = 0.49
3 3
22,518
4
4
ν = 0.40
3
9818
ν = 0.25
3
4
C
9004
2313
4 DIAG
5
6 EBE
7 SGS
3 FEN
4
5
6
7
EDN
Fig. 2. Structured mesh problem with CSTs. Vertical axis = log2(k) horizontal axis = log2(N).
against log2(N). This is helpful when studying the asymptotic behaviour of each method. (The results for the EMF preconditioner are omitted from the plots to improve clarity). As stated above, for DIAG, EBE and SGS we have that the condition number of the preconditioned system j = O(h2) = O(N2) so we expect that the number of iterations required will satisfy k = O(h1) = O(N). This is confirmed by the results in Figs. 2 and 3, where the lines corresponding to these methods are seen to have slope 1, indicating that, in the asymptotic limit, the methods are essentially the same. The differing height of the lines gives an indication of the constant of proportionality involved (which is usually hidden in the O order notation but is important in practice). For all values of m, DIAG consistently requires more iterations than EBE and SGS, with the latter two methods showing very similar behaviour due to their comparable structures.
2314
C.E. Augarde et al. / Computers and Structures 84 (2006) 2306–2315
11
11
11
10
10
10
9
9
8
8
7
7
6
6
DIAG CST
7
9 8
5
5
5
4
4
5
ν = 0 .0
3
4
5
6
11
10
10
9
9
8
8
7
7
6
6
0 0.25 0.4 0.49
4
5
6
4
0.4 0.49
ν = 0.25
3
11
0 0.25
3 7
6
7 6
3
EDP CST
8
7
4
3
3
4
5
6
7
3
4
5
6
7
11 DIAG LST
EDP LST
8
10
7
9 8
6
7
5
5
4
4
5
4
5 DIAG
6
0.49
ν = 0.49
EBE
7
4 3
SGS
4
0.4
5
3 3
0 0.25 0.4 0.49
0.25
ν = 0.40
3
0
6
4 FEN
5
6
7
EDN
Fig. 3. Structured mesh problem with LSTs. Vertical axis = log2(k) horizontal axis = log2(N).
The results for the methods based directly on element factorisations (FEP, EMF and EDP) exhibit different tendencies. Firstly, the performance of FEP and EMF is very poor on the constant strain triangle meshes, exhibiting a rapid growth in k with increasing problem size. Although they work markedly better with linear strain triangles, the performance varies erratically with m and, for the hardest problem (m = 0.49), growth of k with N is again rapid. These unappealing features appear to be alleviated, however, by EDP which factorises only the element displacement component. For both sets of meshes, the EDP iteration counts show growth at a slower rate than the O(N) of diagonal scaling: fitting the iteration counts to a power law suggests a rate of O(N2/3) for CSTs and O(N3/5) for LSTs. Although this is not quite the O(N1/2) rate suggested by the observations in 3.6, it still means that there is a problem size at which EDP will outperform all of the other methods: for
3
4
5
6
7
3 3
4
5
6
7
Fig. 4. Comparison of performance at varying Poisson’s ratio for problem A. Vertical axis = log2(k) horizontal axis = log2(N).
the examples here, this occurs at or below N = 64 in every case (and much sooner in most of them). We note in passing that a comparison across the four plots within each figure shows that it is the constant of proportionality mentioned above which grows with respect to m for each method rather than the asymptotic convergence rate. In addition, the deterioration as m ! 0.5 is more pronounced for the linear strain triangle case. Both of these features are illustrated in Fig. 4, in which selected data from Figs. 2 and 3 are replotted for the DIAG and EDP preconditioners only. The differences in height of the lines on each plot show the increasing rate of deterioration in condition as m ! 0.5. The rate of deterioration appears to be worse for the EDP preconditioner. 4.3.2. Problems B and C The numbers of iterations k required to reach convergence for various values of m for the unstructured mesh cantilever and footing problems described above are tabulated in Table 5. For all preconditioning methods, the
C.E. Augarde et al. / Computers and Structures 84 (2006) 2306–2315
cantilever problem requires considerably more iterations than the footing problem of comparable size. This is probably due to the latter’s higher proportion of prescribed boundary conditions. It is clearly much more difficult to draw any general conclusions about asymptotic behaviour in these cases. However, we can identify some trends similar to those exhibited by the structured mesh problems. Firstly, the behaviour of EBE and SGS is very much alike in all runs, with just over twice as many iterations required for DIAG. The performance of FEP and EMF is very poor: these methods are clearly not at all suitable for this type of unstructured problem. Again, however, the closely-related EDP preconditioner performs well, converging in the smallest number of iterations in all but two cases, those of the footing problems with m = 0.49. 5. Summary In this paper, we have described a range of elementbased preconditioners and have reviewed their performance on a set of linear elastic test problems. Such methods are important in the search for increasing feasibility and efficiency of large-scale simulations in solid mechanics. All of the methods described here can be implemented without the assembly of the full global coefficient and preconditioning matrices. Not only does this facilitate savings in terms of storage requirements, but it also renders the methods amenable to efficient implementation in a vector and/or parallel environment. Five of the methods described (DIAG, EBE, SGS, FEP, EMF) were drawn from previous work on Poisson problems in the mathematical literature. The fifth (EDP) is a new method based on the factorisation of the displacement component of the element matrices. The aim of the numerical experiments presented here was to highlight some features of the asymptotic convergence rate of these methods with a view to practical applications. The FEP method proved to be very poor on almost all problems. Our experiments on structured meshes illustrated the expected asymptotic dependence of the condition number of O(h2) for DIAG, EBE and SGS. The results for the new preconditioner, EDP, appear very promising, suggesting an improved asymptotic convergence rate for the preconditioned system. This represents a significant advance in terms of element-based methods for elasticity problems. Acknowledgement This work was supported by the Engineering and Physical Sciences Research Council under Grants GR/R85037/ 01 and GR/R85044/01.
2315
References [1] Augarde CE, Lazarov B, Ramage A, Staudacher J. Parallel performance of element-based preconditioning for linear elasticity. in preparation. [2] Potts DM, Zdravkovic´ L. Finite element analysis in geotechnical engineering: theory. London: Thomas Telford; 1999. [3] Hestenes MR, Stiefel E. Methods of conjugate gradients for solving linear systems. J Res Nat Bur Stand 1952;49:409–36. [4] Smith IM, Griffiths DV. Programming the finite element method. John Wiley and Sons; 2004. [5] Concus P, Golub GH, O’Leary DP. A generalized conjugate gradient method for the numerical solution of elliptic partial differential equations. In: Bunch J, Rose DJ, editors. Sparse matrix computations. New York: Academic Press; 1976. p. 309–32. [6] Axelsson O, Barker VA. Finite element solution of boundary value problems. Academic Press; 1984. [7] Axelsson O. Iterative solution methods. Cambridge Univ. Press; 1994. [8] Greenbaum A. Iterative methods for solving linear systems. Philadelphia: SIAM; 1997. [9] Meijerink JA, van der Vorst HA. An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix. Math Comp 1977;31:148–62. [10] Gustafsson I. A class of first order factorization methods. BIT 1978; 18:142–56. [11] Hladik I, Reed MB, Swoboda G. Robust preconditioners for linear elasticity fem analyses. Int J Numer Meth Eng 1997;40:2109–27. [12] Saint-Georges P, Warzee G, Beauwens R, Notay Y. High-performance pcg solvers for fem structural analysis. Int J Numer Meth Eng 1996;39:1313–40. [13] Saint-Georges P, Warzee G, Notay Y, Beauwens R. Problemdependent preconditioners for iterative solvers in fe elastostatics. Compos Struct 1999;73:33–43. [14] Jardine RJ, Potts DM, Fourie AB, Burland JB. Studies of the influence of non-linear stress–strain characteristics in soil-structure interaction. Geotechnique 1986;36:377–96. [15] Wathen AJ. An analysis of some element-by-element techniques. Comput Meth Appl Mech Eng 1989;74:271–87. [16] Ramage A, Wathen AJ. On preconditioning for finite element equations on irregular grids. SIAM J Matrix Anal Appl 1994;15(3): 909–21. [17] Hughes TJR, Levit I, Winget J. An element-by-element solution algorithm for problems of structural and solid mechanics. Comput Meth Appl Mech Eng 1983;36:241–54. [18] Lee H-C, Wathen AJ. On element-by-element preconditioning for general elliptic problems. Comput Meth Appl Mech Eng 1991;92: 215–29. [19] Saad Y. Iterative methods for sparse linear systems. Boston: PWS Pub. Co.; 1996. [20] Nour-Omid B, Parlett BN. Finite element preconditioning using splitting techniques. SIAM J Sci Stat Comput 1985;6:761–70. [21] Kaasschieter EF. A general finite element preconditioning for the conjugate gradient method. BIT 1989;29:824–49. [22] Liu WH, Sherman AH. Comparative analysis of the Cuthill–McKee and Reverse Cuthill–McKee ordering algorithms for sparse matrices. SIAM J Numer Anal 1976;13:198–213. [23] Gustafsson I, Lindskog G. A preconditioning technique based on element matrix factorisations. Comput Meth Appl Mech Eng 1986; 55:201–20. [24] Gustafsson I, Lindskog G. On parallel solution of linear elasticity problems. Part I: Theory. Numer Linear Algebra Appl 1998;5: 123–39.