Comput. Methods Appl. Mech. Engrg. 190 (2001) 3543–3563
www.elsevier.com/locate/cma
An integrated Davidson and multigrid solution approach for very large scale symmetric eigenvalue problems

Y.T. Feng
Department of Civil Engineering, University of Wales Swansea, Singleton Park, Swansea, Wales, SA2 8PP, UK
E-mail address: [email protected]

Received 11 October 1999
Abstract

The Davidson method has recently emerged as one of the most popular solution approaches for large scale eigenvalue problems. Its performance is, however, dominated by the choice of the correction equation and the way in which the equation is solved. In this paper, the most popular choices of correction equations are evaluated against three selection criteria. The Jacobi–Davidson and (modified) Inflated Newton schemes are identified as the two best candidates. A highly effective solution approach – multi-grid (MG) – is considered for the solution of the selected correction equations, and the issues associated with these two equations are discussed. In addition, a two-level integrated MG and Davidson solution strategy is presented to further enhance the performance of the Davidson method. Finally, the behaviour of the two correction equations is assessed numerically over a set of large scale 3D examples with up to one million d.o.f. © 2001 Elsevier Science B.V. All rights reserved.

Keywords: Large scale symmetric eigenvalue problem; Davidson's method; Rayleigh quotient; Multigrid method; Jacobi–Davidson; Newton correction equation
1. Introduction

The finite element analysis of structural dynamic problems and other engineering applications often requires the evaluation of a small number of the smallest eigenvalues and associated eigenvectors of the following generalized symmetric eigenvalue problem

$A\phi = \lambda B\phi \qquad (1)$

in which $A$ and $B$ are, respectively, the global stiffness and mass matrices of order $n$ and are assumed to be symmetric (semi-)positive definite; $\lambda$ is an eigenvalue and $\phi$ is the corresponding eigenvector. If $B = I$ (the identity matrix), Eq. (1) becomes a standard eigenvalue problem. For realistic modelling of 3D problems, the resulting eigenvalue problem can be very large and sparse. The problems defined to be `large' are those that require substantial computer resources, in terms of memory and/or CPU time, for the particular computer system. The typical size of problems under current consideration is around one million for a normal desktop/workstation computer.

Traditional solution methods include the Lanczos method and subspace iteration, along with shift-invert transformations [1]. The key operation involves the numerical solution of a sequence of linear systems, typically achieved by use of a direct solver. Although many efficient direct sparse matrix solution techniques have been developed [2] for large problems, these traditional methods can prove expensive in terms of CPU time and memory requirements. The substantial memory requirements may be alleviated by
using mixed schemes where the linear equations are solved by iterative methods instead of sparse direct ones. However, the efficiency of the resulting algorithms may be low, because iterative methods are not efficient at solving the same linear equations with different right-hand sides and, even worse, it may not always be possible for iterative solvers to obtain solutions with very high accuracy. Consequently, it appears necessary to develop new methodologies that have inherent mechanisms for incorporating approximate solutions of linear equations into their algorithmic frameworks.

In recent years considerable effort has been devoted to the development of such methods for solving large scale problems, and a number of effective and reliable algorithms are available which have been successfully applied to a wide range of practical problems. See [3] for an extensive survey on preconditioned eigensolvers and the references therein. Among these newly developed methods, the Davidson method [4], together with its variants [5–7], is considered the most popular one, especially in the field of computational chemistry. The fundamental difference from the traditional methods is that the search subspace in Davidson's method can be expanded in a very flexible way. The vector used to expand the space is obtained as an approximate solution of the so-called correction equation. As the approximate solution can normally be obtained by iterative solvers, Davidson's method has the capability of solving large scale eigenvalue problems. However, the real performance that Davidson's method can achieve depends on several computational issues, and the aspects related to the correction equation play the essential role in the success of the method. In particular, the following issues are the most important for achieving high performance: (1) how to choose an appropriate correction equation; (2) how to effectively solve this equation. These two issues constitute the main objectives of the present work.

In the last decade, multigrid (MG) methods have emerged as an effective iterative approach for the solution of large scale linear systems of equations. Multigrid strategies have also been used to evaluate the partial eigenspectrum of standard/generalized eigenvalue problems in two ways. The first approach simply employs multigrid as an inner loop solver for an outer loop of any established inverse-iteration based eigensolver (see e.g. [8]). The other approach directly involves eigenvalue solution procedures at each stage of multigrid; the Rayleigh quotient multigrid (RQMG) method [9] is an example of this latter approach. In this work, we pursue the first approach, where the multigrid method is adopted to solve the correction equation in Davidson's method. In addition, the corresponding eigenpairs of coarser meshes are also computed to serve as initial guesses for the next finer meshes. The combination of these two aspects gives rise to a two-level integrated multigrid and Davidson eigensolution procedure. The ultimate goal of this work is therefore to discuss the major practical aspects that significantly enhance the performance of Davidson's method for the solution of large scale eigenvalue problems arising in structural dynamics.

The outline of the current work is as follows: In Section 2, the classic Rayleigh quotient iteration (RQI) method is reviewed and a modification is proposed. The next section presents Davidson's algorithm along with currently used correction equations.
Section 4 is devoted to a discussion of how to solve the selected correction equations by the multigrid methodology and how to deal with the additional computational issues associated with them. In the subsequent section, a fully integrated multigrid and Davidson solution approach is proposed in order to achieve high performance for very large scale problems. Finally, three numerical examples are provided to illustrate the proposed solution strategies and to investigate some issues numerically. For simplicity, a standard eigenvalue problem is assumed for most of our discussion. Additional issues arising in the generalized eigenvalue case will be treated briefly in Section 5.3.

Box 1. Classic Rayleigh quotient iterations

· Given a unit vector $q_0$ and its Rayleigh quotient $\rho_0 = RQ(q_0)$;
· DO $i = 0, 1, \ldots$ until convergence
    Shift and invert: $(A - \rho_i I)\,\tilde q_i = q_i$;
    One-dimensional Rayleigh–Ritz analysis: $q_{i+1} = \tilde q_i/\|\tilde q_i\|$, $\rho_{i+1} = RQ(q_{i+1})$.
2. Classic Rayleigh quotient iterations (RQI) and its modification

2.1. RQI algorithm

The classic Rayleigh quotient iteration algorithm [10] is a special version of shift-invert based power methods. For a non-zero vector $x$, its Rayleigh quotient, denoted by $RQ(x)$, is defined as

$RQ(x) = \dfrac{x^T A x}{x^T x} \quad (x \neq 0) \qquad (2)$

which is equivalent to the one-dimensional Rayleigh–Ritz analysis applied to $x$. Box 1 outlines the basic RQI algorithm. Given an initial guess $q_0$, the main loop of the RQI consists of two steps: the first step is to solve the following shift-invert equation

$(A - \rho_i I)\,\tilde q_i = q_i \qquad (3)$

to get a new approximate solution $\tilde q_i$, which is followed by the one-dimensional Rayleigh–Ritz analysis in the second step to improve the solution. Here the Rayleigh quotient $\rho_i$ is also called the Ritz value and $q_i$ the corresponding Ritz vector. Convergence is checked by monitoring the residual $r_i$ of the computed Ritz pair $(q_i, \rho_i)$, defined by

$r_i = (A - \rho_i I)\, q_i. \qquad (4)$
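To make the procedure concrete, the following minimal numpy sketch implements the loop of Box 1 for a dense symmetric A. A direct solve stands in for the iterative solvers discussed later, and all names are illustrative rather than taken from any actual implementation.

```python
import numpy as np

def rayleigh_quotient_iteration(A, q0, tol=1e-10, max_it=50):
    """Classic RQI of Box 1 for a dense symmetric A (illustrative sketch)."""
    q = q0 / np.linalg.norm(q0)
    rho = q @ A @ q                        # Rayleigh quotient, Eq. (2)
    for _ in range(max_it):
        r = A @ q - rho * q                # residual of the Ritz pair, Eq. (4)
        if np.linalg.norm(r) < tol:
            break
        # Shift-invert step, Eq. (3); note that (A - rho*I) becomes nearly
        # singular close to convergence, as discussed in the text.
        qt = np.linalg.solve(A - rho * np.eye(len(q)), q)
        q = qt / np.linalg.norm(qt)        # one-dimensional Rayleigh-Ritz step
        rho = q @ A @ q
    return rho, q
```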
It is well known that RQI is more costly than the other traditional methods, as a new linear system of equations must be solved at each iteration. Moreover, these equations are more difficult to solve because they may become nearly singular as the Ritz pair approaches the exact solution. On the other hand, RQI has a very attractive property: its convergence is cubic. Also note that the RQI algorithm generally requires a fairly good initial vector; otherwise it may converge to other, undesired eigenpairs. The shift-invert equation (3) will be referred to as the RQI correction equation in what follows.

2.2. Modification

One modification to the classic RQI is to enlarge the Rayleigh–Ritz analysis space in the second step from the one-dimensional space $\{\tilde q_i\}$ to the two-dimensional space spanned by $\{q_i, \tilde q_i\}$. The improved approximate solution $(q_{i+1}, \rho_{i+1})$ is chosen from the two Ritz pairs obtained from the corresponding two-dimensional Rayleigh–Ritz analysis. No significant improvement in the convergence rate should be expected from this modified RQI scheme, since a cubic convergence rate is already present. However, if the RQI correction equation cannot be solved accurately, the original RQI may not be able to produce a sequence of solutions $(q_i, \rho_i)$ of increasing accuracy, especially when the accuracy of the correction solution $\tilde q_i$ is poor; the modified version, in contrast, is guaranteed to obtain a new solution with improved accuracy at each iteration, no matter how poor the linear solution may be.

The more important feature of this modified version is that any linear system of equations that is mathematically equivalent to the RQI correction equation (refer to Section 3.3 for the definition of equivalence) can replace the correction equation, and an identical solution will be obtained in exact arithmetic. Therefore, if some of the equivalent correction equations can be solved more effectively by iterative solvers, it is possible to derive RQI-type algorithms that, on the one hand, naturally incorporate iterative solvers into their algorithmic frameworks in order to significantly reduce memory requirements and, on the other hand, retain the high convergence rate of the original RQI. In the next section, more forms of equations equivalent to the RQI correction equation will be exploited and, consequently, several different forms of modified RQI algorithms are established.
Finally, it is also possible to expand the search subspace further in the RQI algorithm by including, for instance, a certain number of the intermediate approximate solutions, $q_j$, $j = \max(1, i-k), \ldots, i$, where $k$ is a prescribed number, as bases of the space. In this case, however, Davidson's method generally offers more features.
3. Davidson's method

3.1. The algorithm outline

The Davidson method can be considered as a subspace projection scheme, but it differs from the classic subspace iteration scheme in that, theoretically, any non-zero vector could be adopted to expand its search subspace. The outline of the algorithm is presented in Box 2. Given an initial $p$-dimensional subspace $V_0$, the main loop of the Davidson algorithm includes the following steps. The first step applies the Rayleigh–Ritz analysis on the subspace $V_i$ to obtain the current Ritz pairs $(Q_i, \Lambda_i)$, from which the desired pair $(q_i, \rho_i)$ is selected. Convergence is established if the residual norm of the desired Ritz pair is below the prescribed tolerance. The second step requires an inexact solution of the so-called correction equation

$C_i(A, \rho_i, q_i, \ldots)\,\tilde q_i = b_i(r_i, q_i, \ldots) \qquad (5)$

to be found. The correction solution $\tilde q_i$ is then used to expand the search subspace in the next step. Generally speaking, the coefficient matrix $C_i$ and the right-hand side $b_i$ of the correction equation can be functions of $A$, $q_i$, $\rho_i$, $r_i$ and/or other variables/vectors, and the equation is generally solved by an iterative approach. If the dimension of the search subspace becomes large, restarting techniques may be required. Finally, additional eigenpairs can be found by making each subspace $V_i$ orthogonal to all the eigenvectors already found.

Box 2. Algorithm outline of Davidson's method

· Given an initial subspace $V_0 = \{v_1^0, v_2^0, \ldots, v_p^0\}$ with $\|v_j^0\| = 1$, $j = 1, \ldots, p$,
· DO $i = 0, 1, \ldots$ until convergence
    1. Rayleigh–Ritz analysis on $V_i$:
       Reduced eigenvalue problem: $A_i U_i = U_i \Lambda_i$ with $A_i = V_i^T A V_i$;
       Computed Ritz pairs $(Q_i, \Lambda_i)$: $Q_i = V_i U_i$;
       Desired Ritz pair $(q_i, \rho_i)$ and its residual $r_i = (A - \rho_i I)\, q_i$.
    2. Computation of the approximate solution $\tilde q_i$ of the correction equation $C_i x_i = b_i$.
    3. Subspace expansion: $V_{i+1} = \mathrm{Ortho}\{Q_i, \tilde q_i\}$.
    4. Restart, if necessary.

The real performance that Davidson's method can achieve depends on many factors, including:
· the quality of the initial guess subspace $V_0$;
· the actual form of the correction equation;
· the effective solvability of the equation by iterative solvers;
· the restarting technique adopted;
· orthogonalization.

The quality of the initial guess and the restarting technique adopted usually have limited influence on the convergence of the Davidson algorithm. However, it appears that the solution from a coarse mesh provides a possibly good initial guess; further discussion will be presented in Section 5. We remark that no attempt is made to discuss restarting techniques in detail; the relevant issues have been extensively covered in [11,12]. On the other hand, the form and properties of the correction equation are crucial to the success of Davidson's method. Therefore, the aspects regarding the correction equation will be the main theme of the remaining part of this section. In the above notation, the subscript $i$ indicates the current Davidson iteration number. For simplicity, the subscript will be omitted below whenever possible.

3.2. Correction equations

Since Davidson's original version of the correction equation [4], several different forms of correction equations have been proposed, each leading to a different variant of Davidson's method. We now review several of these approaches.

3.2.1. Basic versions

1. Davidson's version [4]. When Davidson proposed his method, he suggested the following equation as the correction equation:

$\mathrm{Diag}(A - \rho I)\,\tilde q = r. \qquad (6)$
The above equation can be solved very easily, but the convergence of the resulting Davidson method is poor except when the matrix $A$ is strongly diagonally dominant.

2. Morgan and Scott's version (MS) [5]. Morgan and Scott suggested obtaining the correction solution $\tilde q$ by applying preconditioned iterative procedures to the following equation:

$(A - \rho I)\,\tilde q = r. \qquad (7)$
It is clear that Davidson's version can be viewed as applying a diagonal preconditioning technique to the above equation. Although Morgan and Scott's choice can improve the convergence, other problems are introduced. For instance, the equation cannot be solved accurately, otherwise the correction solution $\tilde q$ becomes $q$ and the iteration stagnates.

3. The RQI version. We list here for the first time the RQI correction equation (3) as the third basic choice, although historically this choice may not have been practised in Davidson's method. The reason for listing the RQI correction equation here is that, with this choice, Davidson's method would exhibit a cubic convergence rate, provided that the equation can be solved with a high level of accuracy by iterative techniques. Unfortunately, this is not the case in practice, due to the strongly ill-conditioned behaviour of $A - \rho I$ as convergence is approached.

3.2.2. The Jacobi–Davidson (JD) scheme

Based on the observation that, in the correction vector $\tilde q$, only the component orthogonal to $q$ makes an effective contribution to the expansion of the subspace, Sleijpen and van der Vorst [7] impose the orthogonality condition

$q^T \tilde q = 0 \qquad (8)$

on Eq. (7), which results in the so-called Jacobi–Davidson correction equation

$C_{JD}\,\tilde q = r, \qquad (9)$
where

$C_{JD} = (I - qq^T)(A - \rho I)(I - qq^T).$

It has been shown that the Jacobi–Davidson correction equation significantly enhances the original Davidson method in terms of efficiency, stability and robustness. In fact, this is the first proposed correction equation with which, if solved accurately, the Davidson method can theoretically achieve a cubic convergence rate. Further evaluation of the scheme will be conducted in Section 3.3.

The idea of the Jacobi–Davidson scheme has also been adopted and extended in the development of other iterative-solver based eigenvalue solution methods. In Sorensen and Yang's (inexact) truncated QR-iteration [13], the following correction equation is used:

$(I - VV^T)(A - \rho I)(I - VV^T)\,\tilde q = r. \qquad (10)$
Clearly, $\tilde q$ is further restricted to a smaller subspace that is orthogonal to the current search subspace $V$. Therefore, this correction equation may be viewed as the block version of the Jacobi–Davidson equation. Some research [14] has suggested, however, that in the context of Davidson's method this block alternative is less effective than the original Jacobi–Davidson correction equation, and hence no further consideration will be given to this particular form of correction equation in this work.

Similarly, in the context of the block Rayleigh quotient iteration (BRQI) method, Fattebert [15] proposed the following correction equation:

$(I - \bar V\bar V^T)(A - \rho I)(q + \tilde q) = 0 \qquad (11)$

with $\bar V^T \tilde q = 0$. The notable difference is the presence of a different subspace $\bar V$. The bases of $\bar V$ are constructed from a prescribed number of vectors from $Q$ (note: $Q$ are the orthonormal bases of $V$); they may include only the current Ritz vector $q$, or may be extended to incorporate all vectors in $Q$. In the former case, (11) reduces to the original Jacobi–Davidson equation, and in the latter case, (11) becomes the block Jacobi–Davidson version. The suitability of this scheme as the correction equation for Davidson's method will not be pursued further.

3.2.3. The Newton schemes

Recently, Wu et al. [16] proposed a class of correction equations derived by applying the Newton method to the minimization of the Rayleigh quotient. This class of schemes is termed the Newton schemes and includes the following three equations:

1. The constrained Newton (CN) equation

$C_{CN}\,\tilde q = r, \qquad (12)$

where

$C_{CN} = A - \rho I - 2qq^T A.$
2. The (normalized) augmented Newton (AN) equation

$C_{AN}\, q_{AN} = b_{AN}, \qquad (13)$

where

$C_{AN} = \begin{bmatrix} A - \rho I & q \\ q^T & 0 \end{bmatrix}, \qquad q_{AN} = \begin{bmatrix} \tilde q \\ \delta \end{bmatrix}, \qquad b_{AN} = \begin{bmatrix} r \\ 0 \end{bmatrix}$

in which $\delta$ is an auxiliary scalar unknown.

3. The inflated Newton (IN) equation

$C_{IN}\,\tilde q = r, \qquad (14)$

where

$C_{IN} = A - \rho I + \alpha\, qq^T$

in which $\alpha$ is a free parameter to be chosen. This scheme was first proposed by Galick [17].
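For illustration only, the following sketch assembles the three Newton matrices densely for a small test problem, using the definitions of Eqs. (12)–(14) as reconstructed above. In practice these operators are never formed explicitly (see Section 4.3), and the helper name is hypothetical.

```python
import numpy as np

def newton_correction_matrices(A, q, rho, alpha):
    """Dense assembly of C_CN, C_AN and C_IN per Eqs. (12)-(14); for large
    problems these matrices are kept implicit, never formed like this."""
    n = A.shape[0]
    S = A - rho * np.eye(n)                      # shifted matrix A - rho*I
    C_CN = S - 2.0 * np.outer(q, q) @ A          # nonsymmetric CN matrix
    C_AN = np.block([[S, q[:, None]],            # bordered (n+1)-size AN matrix
                     [q[None, :], np.zeros((1, 1))]])
    C_IN = S + alpha * np.outer(q, q)            # symmetric rank-one update
    return C_CN, C_AN, C_IN
```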
The suitability of these three equations as correction equations for the Davidson method has been investigated [16], mainly by means of numerical experiments.

3.3. Correction equation selection criteria

Above, we have briefly reviewed several correction equations. This naturally leads to the question: which correction equation should be used? In order to answer this question, we propose three criteria.

Criterion 1 – Equivalence. Any candidate correction equation should be equivalent to the RQI correction equation in exact arithmetic. In other words, when both correction equations are solved exactly, the Davidson algorithm should generate the same search subspace $V_i$, or the same space spanned by $\{q_i, \tilde q_i\}$, at each iteration.

Criterion 2 – Well-definedness. Any candidate correction equation should be well defined; in particular, it should be non-singular when the exact solution is reached.

Criterion 3 – Solvability. Any candidate correction equation should be solvable effectively by iterative solvers.

Remark 1. If a candidate satisfies the first criterion and can be solved accurately, the resulting Davidson iteration will achieve an asymptotically cubic convergence rate. In practice, however, the correction equation is generally solved approximately, and therefore we may lose the cubic convergence. Nevertheless, we can be confident that if the equation can be solved to a moderate level of accuracy, the Davidson method can still be expected to exhibit rapid convergence.

Remark 2. The second and third criteria indicate whether one particular form of correction equation can be effectively solved by iterative solvers, which, together with the first criterion, determines the overall performance of the Davidson method. Of course, it is easy to verify whether a correction equation is well defined when the current Ritz value $\rho$ approaches the exact eigenvalue, but it is rather difficult to define the solvability of an equation in a measurable fashion; therefore, numerical experiments are often required in the evaluation process.

After applying the first two criteria to all the previously listed correction equations, it turns out that the Jacobi–Davidson scheme and all three Newton schemes are equivalent to the RQI and are all well defined even when the shift value is very close to an eigenvalue. Strictly speaking, $C_{JD}$ is singular when $\rho = \lambda$, but because the solution sought lies in the space orthogonal to the corresponding eigenvector, $C_{JD}$ can be considered non-singular within this subspace of interest.

By closely examining the three Newton schemes, it is evident that $C_{CN}$ in the CN scheme is nonsymmetric and slightly more expensive to work with. For the (normalized) AN scheme, the dimension of $C_{AN}$ is one order larger than the others. Furthermore, it will be shown in Section 5.2 that the CN equation is closely related to the IN scheme, and it can also be proved that the normalized AN scheme is identical to the Jacobi–Davidson scheme. Consequently, we finally choose the Jacobi–Davidson and Inflated Newton equations as the two possibly best correction equations. Although the JD and IN equations are mathematically equivalent in exact arithmetic and both $C_{JD}$ and $C_{IN}$ are symmetric, their formulations are distinct. Therefore, the solvability of these two equations and the corresponding performance of Davidson's method should be examined.
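Before turning to the solver, a hedged sketch of the Davidson loop of Box 2 may help fix ideas. The correction solver is a caller-supplied stand-in for whichever of the correction equations above is chosen; restarting and deflation against converged eigenvectors are omitted for brevity, and all names are illustrative.

```python
import numpy as np

def davidson(A, V0, solve_correction, tol=1e-11, max_it=200):
    """Davidson loop of Box 2 for the smallest eigenpair (standard problem).

    solve_correction(A, rho, q, r) returns an approximate solution of the
    chosen correction equation; it stands in for the MG solver of Section 4.
    """
    V, _ = np.linalg.qr(V0)                     # orthonormal initial subspace
    rho, q = None, None
    for _ in range(max_it):
        theta, U = np.linalg.eigh(V.T @ A @ V)  # reduced eigenproblem, step 1
        rho, q = theta[0], V @ U[:, 0]          # desired Ritz pair
        r = A @ q - rho * q
        if np.linalg.norm(r) / abs(rho) < tol:  # relative residual test
            break
        t = solve_correction(A, rho, q, r)      # inexact correction, step 2
        t = t - V @ (V.T @ t)                   # orthogonalize against V, step 3
        V = np.hstack([V, (t / np.linalg.norm(t))[:, None]])
    return rho, q
```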
In the present work, one particular iterative solver – the multigrid method – is considered for the solution of the two selected correction equations.

4. Multigrid method: an effective equation solver

4.1. The basic idea

Over the last decade, multigrid has been established as an efficient iterative approach for obtaining solutions of a wide variety of practical problems in the context of the finite difference, finite volume and finite element methods.
The solution of

$Ax = b \qquad (15)$

is equivalent to

$Ae = t, \qquad (16)$

where $e = x - \bar x$ is the error, $t = b - A\bar x$ the residual, and $\bar x$ an approximation to $x$. The essential MG principle is based on the observation that the smooth (or long-wavelength) part of the error $e$, which may not be efficiently swept out by iterative methods, can be substantially reduced by a coarse mesh correction. The success of MG strategies lies primarily in: (i) their excellent convergence characteristics, which theoretically should not depend on the size of the finite element mesh; (ii) their high efficiency, whereby solutions of problems with $n$ unknowns are obtained with $O(n)$ work and storage for large classes of problems. For further information, we refer the reader to [18–20]. Multigrid methods can be broadly divided into two classes: geometric or algebraic [21]. In the current work, only geometric multigrid methods are considered.

4.2. Main building blocks of MG

The main building blocks of a geometric MG are briefly described below. For a more detailed presentation, see [22].

4.2.1. Generation of a series of grids

The first key issue related to the implementation of the MG method is the generation of a sequence of coarse and fine meshes. Our implementation allows these meshes to be created independently by any automatic mesh generator. It is also possible for coarse meshes to be generated automatically from the finest mesh by, for instance, auto-coarsening strategies [23]; in this case, users only need to provide the finest mesh as usual, and the MG procedure can be implemented as a black box. The coarse and fine meshes can be nested or non-nested and structured or unstructured, and different element orders may even be used in different meshes.

4.2.2. Generation of transfer operators

The operators that transfer variables between two meshes play a critical part in the success of the MG strategy. The coarse-to-fine mesh interpolation operator $P$ interpolates the `displacements' from the coarse mesh back to the fine mesh, while the fine-to-coarse mesh projection operator $G$ projects the `forces' from the fine mesh to the coarse grid. In our case, the projection operator $G$ is always taken as the transpose of the interpolation operator $P$, i.e., $G = P^T$. The main difficulty encountered in the generation of $P$ lies in efficiently determining the enclosing coarse mesh element for each fine mesh nodal point when unstructured meshes are involved. The so-called alternating digital tree (ADT) search algorithm, with $O(n \log n)$ complexity, by Bonet and Peraire [24] is employed in the current work and has proved very efficient for a variety of applications. Note that for plate or shell structures, the generation of $P$ should also take the corresponding element formulation into account.

4.2.3. Construction of coarse grid equations

The coarse grid equation can be denoted as

$A_c x_c = b_c. \qquad (17)$
In the linear MG scheme, this equation is used to accomplish the coarse correction and may be derived by two different approaches. The first (referred to as the FE procedure) obtains the global stiffness matrices by applying the same finite element discretisation procedure to the coarse meshes, while the second approach, referred to as the Galerkin strategy, constructs the coarse grid equations by directly projecting the fine mesh equations into the space defined by the transfer operators, i.e.

$A_c = P^T A P. \qquad (18)$
In the last few years, we have developed an MG strategy based on this Galerkin approach, termed the Galerkin multigrid (GMG) method, and have successfully applied it to the solution of various solid mechanics problems [25–28]. The efficient implementation of the projection procedure is crucial to achieving overall high performance of the GMG. Based on the underlying topological information of $P$ and $A$, the sparsity pattern of $A_c$ can be established. In addition, an efficient implementation for computing $A_c$ with a linear complexity of $O(n)$ has also been developed. More details regarding these topics can be found in Feng et al. [26].

4.2.4. MG iteration cycle

One cycle of multigrid iterations consists of three phases: pre-smoothing, coarse grid correction and post-smoothing. Let $S(u, \mu)$ denote the smoothing procedure to be used, with $u$ the initial guess and $\mu$ the maximum number of iterations. Also let $\mu_1$ and $\mu_2$ be the maximum iterations of the pre- and post-smoothing procedures performed, respectively, before and after the coarse mesh correction. Then the $l$th cycle of the MG iteration attains an approximate solution of the following error equation:

$A\,\Delta u^l = t^l, \qquad (19)$

where $t^l = b - A u^{l-1}$ and $u^{l-1}$ is the approximate solution of Eq. (15) after $l-1$ MG cycles. A two-grid MG iteration cycle is outlined in Box 3. For truly multilevel cases, different visit sequences result in different types of cycles; the most popular choices include V- and W-cycles.

Obviously, the efficiency of MG approaches depends upon the quality of the coarse meshes and the appropriate selection of the interpolation and projection operators. Once $A_c$, $P$ and $G$ are determined, the performance of MG methods depends entirely on the smoother $S$ and the numbers of iterations $\mu_1, \mu_2$. The practical selection of smoothers ranges from the very simple Jacobi, Gauss–Seidel, SOR and SSOR schemes to incomplete decompositions, and even to any iterative algorithm. In the present work, IC(0), preconditioned CG, BiCGStab [29] and GMRES [30] can be chosen as the smoother for symmetric and nonsymmetric problems.

Box 3. One cycle of two-level MG iterations

· Pre-smoothing
    Smoothing on fine mesh: $x = S(0, \mu_1)$.
    Compute the new residual: $t = b - Ax$.
· Coarse mesh correction
    Project the residual: $t_c = P^T t$.
    Solve: $A_c\,\Delta x_c = t_c$.
· Post-smoothing
    Update initial guess: $x = x + P\,\Delta x_c$.
    Smoothing on fine mesh: $x = S(x, \mu_2)$.

It is important to stress that each MG iteration cycle can be viewed as a linear or nonlinear preconditioning step, depending on the choice of smoothers and on how the coarse mesh equations are solved.

4.2.5. Acceleration of MG cycles

The final building block of MG accelerates these cycles by some iteration scheme. In theory, many iterative algorithms can serve as the outer acceleration procedure with the MG cycle as the preconditioning. In view of the fact that the equations to be solved may be indefinite and the MG preconditioning may be nonlinear, the FGMRES [31] algorithm is utilized as the accelerator for reasons of stability, efficiency and robustness [27].
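A minimal sketch of one such cycle, assuming the Galerkin relation (18) and a generic user-supplied smoother in place of the IC(0)/Krylov smoothers named above, might read as follows; a direct solve is used on the coarse grid, as in the paper's two-grid setting.

```python
import numpy as np

def two_grid_cycle(A, Ac, P, b, x, smooth, mu1=1, mu2=2):
    """One two-grid cycle in the spirit of Box 3, assuming Ac = P.T @ A @ P
    and a caller-supplied smoother smooth(A, b, x, n_sweeps)."""
    x = smooth(A, b, x, mu1)            # pre-smoothing on the fine mesh
    t = b - A @ x                       # new fine-mesh residual
    tc = P.T @ t                        # project the residual, G = P^T
    dxc = np.linalg.solve(Ac, tc)       # coarse correction by a direct solve
    x = x + P @ dxc                     # interpolate the correction back
    return smooth(A, b, x, mu2)         # post-smoothing on the fine mesh
```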
4.3. Additional issues related to the solution of correction equations

The application of the MG approach described above to the JD and IN correction equations is not straightforward, however, because the coefficient matrices of these equations are rank-one updates of the shifted matrix $\bar A = A - \rho I$ and are therefore full matrices. The only practical option is to store either $A$ or $\bar A$ explicitly (the latter is actually stored in our implementation) and to define the matrices $C_{JD}$ and $C_{IN}$ implicitly. The standard MG solution procedure must therefore be changed slightly so that both the JD and IN correction equations can be solved effectively. Two operations in the MG procedure need to be modified: matrix–vector multiplication and preconditioning.

4.3.1. Matrix–vector multiplication

For the IN scheme, the operation $C_{IN} u = (\bar A + \alpha qq^T) u$ can be performed with one extra dot-product and one extra saxpy operation in addition to the standard matrix–vector operation $\bar A u$. For the JD scheme, the computation of $C_{JD} u = (I - qq^T)\bar A(I - qq^T) u$ requires one additional pre-projection and one additional post-projection $(I - qq^T) u$, which are equivalent to two dot-product and two saxpy operations.
4.3.2. Preconditioning

As both $C_{JD}$ and $C_{IN}$ are dense matrices and are implicitly represented, the direct construction of their preconditioning matrices appears to be a difficult and challenging task. The common option is therefore to focus on the preconditioning matrix $M$ of $\bar A$ and to express the preconditioning matrices of $C_{JD}$ and $C_{IN}$ in terms of $M$. Suppose that the preconditioning matrix $M$ for $\bar A$ is defined either implicitly or explicitly; a typical preconditioning operation then involves the solution of $Mu = w$.

1. The Jacobi–Davidson scheme. Symbolically, the inverse of $C_{JD}$ can be expressed as [7]

$C_{JD}^{-1} = \bar A^{-1}(I - \beta q v^T) = \bar A^{-1} - \beta vv^T, \qquad v = \bar A^{-1} q, \quad \beta = 1/(q^T v). \qquad (20)$
Then, after replacing $\bar A^{-1}$ by $M^{-1}$, we immediately obtain a preconditioning matrix $M_{JD}$ for $C_{JD}$ as

$M_{JD}^{-1} = M^{-1}(I - \tilde\beta q \tilde v^T) = M^{-1} - \tilde\beta \tilde v\tilde v^T, \qquad \tilde v = M^{-1} q, \quad \tilde\beta = 1/(q^T \tilde v). \qquad (21)$

This formula was first proposed by Olsen et al. [6]. As $\tilde v = M^{-1} q$ and $\tilde\beta = 1/(q^T \tilde v)$ need to be computed only once at the start of the solution of the Jacobi–Davidson equation (9) by the MG approach, the preconditioning can be performed with one extra dot-product and one extra saxpy operation in addition to one standard preconditioning step. It is trivial to verify that

$q^T M_{JD}^{-1} = M_{JD}^{-1} q = 0. \qquad (22)$
According to these relations, we can further derive the following properties:

$M_{JD}^{-1} = (I - qq^T)\, M_{JD}^{-1}, \qquad (23)$

$M_{JD}^{-1} = M_{JD}^{-1}\,(I - qq^T) \qquad (24)$

and

$M_{JD}^{-1} = (I - qq^T)\, M_{JD}^{-1}\,(I - qq^T), \qquad (25)$

i.e. $M_{JD}^{-1}$ maps every vector into the space orthogonal to $q$. Consequently, after each preconditioning step no orthogonalization is actually required. In addition, this feature is valid for any form of preconditioning matrix $M$. In particular, if no preconditioning is provided for $\bar A$, i.e. $M = I$, then $M_{JD}^{-1}$ simply becomes an orthogonal projection:

$M_{JD}^{-1} = I - qq^T. \qquad (26)$
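Putting Eq. (21) into code, a minimal sketch of the resulting preconditioning step, assuming only a routine that applies $M^{-1}$ (e.g. one IC(0) sweep), might look as follows; the factory name is illustrative.

```python
import numpy as np

def make_jd_preconditioner(M_solve, q):
    """Application of M_JD^{-1} from Eq. (21); M_solve(w) applies M^{-1}.
    v and beta are computed once per correction equation, as noted above."""
    v = M_solve(q)                  # v~ = M^{-1} q
    beta = 1.0 / (q @ v)            # beta~ = 1 / (q^T v~)

    def apply(w):
        # M_JD^{-1} w = M^{-1} w - beta~ (v~^T w) v~  (one extra dot + saxpy);
        # note apply(q) == 0, consistent with Eq. (22).
        return M_solve(w) - beta * (v @ w) * v

    return apply
```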
In most iterative solvers, such as Krylov-type methods, matrix–vector multiplication and preconditioning are two consecutive operations. If we consider these two operations as one integrated operation, we may be able to save some work. In left preconditioning situations, we basically need to deal with the operator

$M_{JD}^{-1} C_{JD} = M_{JD}^{-1}(I - qq^T)\,\bar A\,(I - qq^T). \qquad (27)$

By taking relation (24) into account, it follows that

$M_{JD}^{-1} C_{JD} = M_{JD}^{-1}\,\bar A\,(I - qq^T). \qquad (28)$

Therefore, the post-projection operation in $C_{JD}$ can be omitted. Similarly, in right preconditioning situations, we have

$C_{JD} M_{JD}^{-1} = (I - qq^T)\,\bar A\, M_{JD}^{-1}. \qquad (29)$
In this case, the pre-projection can be omitted. Note that after each coarse grid correction, one post-projection may be necessary to ensure that the initial solution for the post-smoothing is in the right subspace. We also mention that a slightly different preconditioning approach is adopted in [32].

2. The Inflated Newton scheme. Similarly to the JD case, we can also derive an `Olsen'-like preconditioning matrix for $C_{IN}$. Since the inverse of $C_{IN}$ can be symbolically expressed as

$C_{IN}^{-1} = \bar A^{-1}(I - \gamma q v^T) = \bar A^{-1} - \gamma vv^T, \qquad \gamma = \dfrac{\alpha}{1 + \alpha/\beta}, \qquad (30)$

replacing $\bar A^{-1}$ by $M^{-1}$ gives rise to a preconditioning matrix $M_{IN}$ for $C_{IN}$ as

$M_{IN}^{-1} = M^{-1}(I - \tilde\gamma q \tilde v^T) = M^{-1} - \tilde\gamma \tilde v\tilde v^T, \qquad \tilde\gamma = \dfrac{\alpha}{1 + \alpha/\tilde\beta}. \qquad (31)$
Clearly, this preconditioning requires the same operations as in the JD case. Note, however, that there is no gain from considering the matrix–vector multiplication and preconditioning operations together in this case.

Remark 3. In theory, the matrix $M$ should be reconstructed at each Davidson iteration. When $\rho$ is near convergence, however, it may not be necessary to recompute $M$, as it should be converged as well. Therefore, in our implementation, a new preconditioning matrix $M_i$ is not evaluated but is simply taken as $M_{i-1}$ whenever the following condition is satisfied:

$|1 - \rho_{i-1}/\rho_i| < \epsilon, \qquad (32)$

where the tolerance $\epsilon$ can be chosen as a relatively large value; in our cases, $\epsilon = 10^{-2}$ is used.
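This test is cheap to implement; a one-line sketch (with an assumed default matching the value quoted above):

```python
def reuse_preconditioner(rho_prev, rho, eps=1e-2):
    """Condition (32): skip rebuilding M when the Ritz value has settled."""
    return abs(1.0 - rho_prev / rho) < eps
```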
5. Integrated MG and Davidson solution strategy

In this section, we present an integrated MG and Davidson solution procedure and propose an alternative form of the IN scheme. We also address the changes that become necessary when a generalized eigenvalue problem is considered.

5.1. Two-level integrated MG and Davidson solution approach

In the previous section, MG was chosen as an ideal approach for the inexact solution of the correction equations in Davidson's method. The characteristics of the MG, in terms of efficiency and robustness, in turn considerably improve the performance of the Davidson method. As a matter of fact, we can further exploit the information provided by the coarse meshes and involve the MG with the
Davidson method at another level, which leads to the following two-level combined Davidson and MG solution strategy.

At the first combination level, we compute the first $p$ (the dimension of the initial search subspace) eigenpairs on the next coarser grid and then provide them as initial guesses for the current grid. It is well known that, under general conditions, the `quality' of these initial guesses is fairly good, and consequently the convergence of Davidson's method should be improved. At the second combination level, as discussed in the previous section, MG is employed to solve the correction equation in Davidson's method.

We can now put these two levels together to attain a fully integrated solution approach: we recursively apply the Level 1 combination to all grids, starting from the coarsest grid, and within this level apply the Level 2 combination on each grid. Note that since the computed eigenpairs of the coarse meshes are used only as initial guesses, it is not necessary to evaluate them to very high accuracy; a modest level of accuracy is usually sufficient. In addition, on the coarsest mesh, traditional eigenvalue solution methods may be more efficient.
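The overall strategy can be summarised by the following schematic driver, in which every name is an illustrative placeholder rather than the actual implementation, and the tolerances merely echo the "modest accuracy on coarse grids" idea above.

```python
def integrated_mg_davidson(grids, davidson_solve, prolongate):
    """Schematic driver for the two-level strategy of Section 5.1; grids are
    ordered from coarsest to finest, davidson_solve(grid, V0, tol) runs the
    Davidson method (Box 2) with MG-solved correction equations, and
    prolongate(i, V) interpolates Ritz vectors from grid i to grid i+1."""
    V = None   # on the coarsest grid a traditional eigensolver could be used
    for i, grid in enumerate(grids):
        # Modest accuracy suffices on all but the finest grid.
        tol = 1e-4 if i + 1 < len(grids) else 1e-11
        V = davidson_solve(grid, V, tol)         # Level 2: MG inside Davidson
        if i + 1 < len(grids):
            V = prolongate(i, V)                 # Level 1: initial guesses
    return V
```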
5.2. Alternative form of the Inflated Newton scheme

The selected IN correction equation includes a free (dimensional) parameter $\alpha$ to be chosen in practice. It may be expected that different values of $\alpha$ result in different performance of the corresponding Davidson method. It would clearly be of significant importance if one could identify a unique value that leads to optimal or near-optimal overall performance for a large class of practical applications. Based on the observation that a different value may be necessary for the computation of each eigenpair of the same problem, we suggest the following choice for $\alpha$:

$\alpha = s\rho, \qquad (33)$

where $\rho$ is the current Ritz value and $s$ is another (non-dimensional) parameter. In addition, a constant $s$ is used for the evaluation of all wanted eigenpairs of the same problem. Consequently, the IN equation becomes

$C_{IN}(s)\,\tilde q = r \qquad (34)$

with

$C_{IN}(s) = A - \rho I + s\rho\, qq^T. \qquad (35)$
This modified version is referred to below as the IN(s) correction equation. It can be alternatively motivated by the fact that the matrix $C_{CN}$ in the constrained Newton (CN) equation (12) can be rewritten as

$C_{CN} = A - \rho I - 2\rho\, qq^T - 2q r^T. \qquad (36)$
When the Davidson iteration approaches convergence, $r \to 0$. It therefore follows that

$C_{CN} = C_{IN}(-2) + O(\|r\|), \qquad (37)$

i.e. $C_{IN}(-2)$ is the first-order, symmetric approximation to $C_{CN}$. The parametric studies conducted in Section 6 further show that the IN scheme with $s = -2$ exhibits a reasonable performance for the problems tested. Also note that with $s = 0$ the IN scheme degenerates to the MS choice (7).

5.3. Changes for generalized eigenvalue problems

So far, a standard eigenvalue problem has been assumed. If a generalized eigenvalue problem is considered, some of the previously established formulations need to be modified accordingly.
Firstly, the definition of the Rayleigh quotient is changed to

$RQ(x) = \dfrac{x^T A x}{x^T B x} \qquad (38)$

and the residual $r$ of a Ritz pair $(q, \rho)$ is defined by

$r = \bar A q, \qquad \bar A = A - \rho B. \qquad (39)$
More importantly, the Jacobi–Davidson equation takes the form

$(I - \hat q q^T)\,\bar A\,(I - q\hat q^T)\,\tilde q = r, \qquad \hat q = Bq, \qquad (40)$

where the correction vector $\tilde q$ satisfies the orthogonality condition $\hat q^T \tilde q = 0$. Accordingly, the preconditioning matrix $M_{JD}$ is changed to

$M_{JD}^{-1} = M^{-1}(I - \tilde\beta \hat q \tilde v^T) = M^{-1} - \tilde\beta \tilde v\tilde v^T, \qquad \tilde v = M^{-1}\hat q, \quad \tilde\beta = 1/(\hat q^T \tilde v), \qquad (41)$
where $M$ is the preconditioning matrix of $\bar A$. It can be shown that expressions (22)–(25) become, respectively,

$\hat q^T M_{JD}^{-1} = M_{JD}^{-1} \hat q = 0, \qquad (42)$

$M_{JD}^{-1} = (I - q\hat q^T)\, M_{JD}^{-1}, \qquad (43)$

$M_{JD}^{-1} = M_{JD}^{-1}\,(I - \hat q q^T) \qquad (44)$

and

$M_{JD}^{-1} = (I - q\hat q^T)\, M_{JD}^{-1}\,(I - \hat q q^T). \qquad (45)$
Similarly, in the Inflated Newton correction equation, the matrix $C_{IN}$ is changed to

$C_{IN} = \bar A + \alpha\, \hat q\hat q^T \qquad (46)$

and the corresponding preconditioning matrix $M_{IN}$ is defined as

$M_{IN}^{-1} = M^{-1}(I - \tilde\gamma \hat q \tilde v^T) = M^{-1} - \tilde\gamma \tilde v\tilde v^T, \qquad \tilde\gamma = \dfrac{\alpha}{1 + \alpha/\tilde\beta}. \qquad (47)$
For the modified IN(s) scheme, we simply replace $\alpha$ in (46) and (47) by $s\rho$. It is worth noting that almost all claims made in the previous sections, especially in Section 4.3, still hold in this case. Also note that a slightly different form of the Jacobi–Davidson equation for the generalized eigenvalue problem was proposed and used in [32–34].

6. Numerical experiments

In this section, three examples of 3D finite element simulation of structural dynamic problems are presented to provide an assessment of the performance of the Jacobi–Davidson and (modified) Inflated Newton correction equations in the context of the Davidson method. The numerical experiments are also undertaken to investigate the influence of different values of $s$ on the behaviour of the IN(s) scheme.

6.1. Details of the MG and Davidson methods used

A two-grid form of the GMG scheme is used in the present work for the approximate solution of the two correction equations. Both the fine and coarse meshes are totally non-nested and have been generated
independently of each other. Moreover, except for the second example, all the meshes are fully unstructured. The preconditioning scheme adopted is the standard IC(0), and one and two IC(0) iterations are employed as the pre- and post-smoothing, respectively. The coarse mesh equation is solved by a direct solver. As mentioned earlier, FGMRES(10) is employed as the outer acceleration procedure, with each MG cycle serving as the preconditioning step. The MG iteration is terminated when the relative residual norm of the correction equation falls below $10^{-5}$ or the number of MG smoothing iterations exceeds 90. Clearly, different tolerance values and maximum iteration numbers affect both the local cost of the MG approach and the global convergence of the Davidson method, which inevitably leads to different overall performance. It is not clear, however, how to choose the local solution accuracy in order to achieve the most effective global performance; this important topic is not elaborated in this work.

In Davidson's method, the dimension of the initial search subspace, $p$, is taken to be 5. A standard restarting technique is applied when the dimension of the search space exceeds 10. Convergence is assumed to be established when the relative residual norm of the current Ritz pair satisfies the termination condition $\|r_i\|/\rho_i < 10^{-11}$, and the first five smallest eigenpairs are evaluated. All the computations presented here are carried out on an SG Origin 2000 with 2 GB memory, using one R10000 195 MHz processor.

6.2. Finite element meshes of the three examples

The first example consists of a simple, one-end-clamped 3D beam structure with a square section. The fine mesh, shown in Fig. 1, is composed of 28,489 unstructured 10-node tetrahedral elements with 121,542 active d.o.f., while the coarse mesh consists of 7656 unstructured 4-node tetrahedral elements with 4827 active d.o.f. The second example has a more complex geometry. Fig. 2 shows the fine mesh, which has 30,672 structured 8-node hexahedral elements with 102,573 active d.o.f., while the coarse mesh consists of 3744 structured 8-node hexahedral elements with 13,059 active d.o.f. The third example is a realistic problem. The fine model contains 236,036 unstructured 10-node tetrahedral elements with 1,003,794 active d.o.f., and the coarse mesh has 31,646 unstructured 4-node tetrahedral elements with 18,021 active d.o.f. The fine mesh is shown in Fig. 3.

Fig. 1. Finite element model of Example 1 – fine mesh.
Fig. 2. Finite element model of Example 2 – fine mesh.
Fig. 3. Finite element model of Example 3 – fine mesh.

Details of the above three examples, together with the memory requirements, are summarized in Table 1. Note that for Example 3, with over 1 million d.o.f., merely 1.2 GB of memory is required to run the analysis. We also remark that for all examples, the global mass matrices are assumed to be diagonal.

Table 1
Summary of the FE meshes for Examples 1–3

No. | Fine mesh: element type | Elements | DOF       | Coarse mesh: element type | Elements | DOF    | Memory (MB)
----|-------------------------|----------|-----------|----------------------------|----------|--------|------------
1   | 10-node tetrahedral     | 28,489   | 121,542   | 4-node tetrahedral         | 7656     | 4827   | 114
2   | 8-node hexahedral       | 30,672   | 102,573   | 8-node hexahedral          | 3744     | 13,059 | 120
3   | 10-node tetrahedral     | 236,036  | 1,003,794 | 4-node tetrahedral         | 31,646   | 18,021 | 1220

6.3. Comparative study

The first batch of tests compares the performance, in terms of the number of Davidson iterations and the amount of MG smoothing required, of the Jacobi–Davidson and modified Inflated Newton schemes as the correction equation in Davidson's method. IN(s) with $s = -2$ is always used in this study.

For Example 1, Fig. 4(a) depicts the number of outer Davidson iterations required for the solution of each eigenpair in the JD and IN cases, respectively. It is clear that for each eigenpair, almost the same number of Davidson iterations is required in the JD and IN cases. Furthermore, at least a quadratic convergence rate is achieved, as illustrated in Fig. 5. Note that for this example, every correction equation is solved to the prescribed tolerance $10^{-5}$ in fewer than 90 MG smoothing iterations. However, at each Davidson iteration, the number of MG smoothing iterations required by the JD and by the IN differs. Fig. 4(b) shows the total number of MG smoothing iterations required for each eigenpair for both equations. It is obvious that the JD requires slightly more MG smoothing than the IN to compute the
eigensolutions. In fact, about 50–60 smoothing iterations are required to solve the IN equation, but nearly 30% more are required to solve the JD equation. This example therefore illustrates that, although almost the same global convergence of Davidson's method is achieved with both schemes, the IN equation can be solved slightly more effectively than the JD by the MG method, and thereby a slightly better overall performance is obtained with the IN scheme.

Fig. 4. Example 1 – performance comparison between IN(-2) and the JD scheme for each computed eigenpair: (a) number of Davidson iterations; (b) total number of MG smoothing.
Fig. 5. Example 1 – convergence histories of IN(-2) and JD for the first five eigenpairs.

Figs. 6 and 7 present the results for Example 2. Fig. 6 shows that for the first three eigenpairs the JD and IN require almost the same number of Davidson iterations, but for the fourth and fifth eigenpairs the JD is
clearly much inferior to the IN in terms of convergence. Note that for this example, the MG cannot solve either the JD or the IN equations to the prescribed accuracy within 90 smoothing iterations. However, the solution accuracy achieved for the IN equations is generally higher than for the JD equations. This demonstrates again that the JD equation is slightly more difficult to solve. Fig. 7 plots the convergence histories of both the JD and IN schemes for the five eigenpairs, showing that a linear convergence rate is attained for this example.

Fig. 6. Example 2 – number of Davidson iterations for each eigenpair.
Fig. 7. Example 2 – convergence histories of IN(-2) and JD for the first five eigenpairs.

For the largest problem, Example 3, it is slightly surprising that both the JD and IN perform well in terms of the number of Davidson iterations, as shown in Fig. 8(a), and attain almost the same quadratic convergence rate. As in Example 1, all the JD and IN correction equations are solved to the prescribed accuracy in fewer than 90 MG iterations. However, examining the total number of MG smoothing iterations required for each eigenpair, shown in Fig. 8(b), it is again evident that the MG converges slightly faster for the IN equations than for the JD, with a difference of about 20% in favour of the IN. We note that, on average, around 650 CPU seconds are required for the solution of each IN equation.

Fig. 8. Example 3 – performance comparison between IN(-2) and the JD scheme for each computed eigenpair: (a) number of Davidson iterations; (b) total number of MG smoothing.

Therefore, the above comparative test suggests that, in comparison with the JD, the IN correction equation may be solved slightly more effectively by the MG approach. It should be pointed out, however,
that under different circumstances different results may be obtained, and therefore more tests need to be carried out before a final conclusion on this matter can be drawn.

6.4. Parametric study

The second set of tests is designed to find out how the performance of IN(s) is affected by the selection of different values of $s$. For Example 1, the tested values of $s$ include: -100, -10, -2, -1, -0.5, 1, 2, 5, 10, 20, 50 and 100. Fig. 9 presents the total number of MG smoothing iterations required for the computation of all five eigenpairs for each $s$, together with the corresponding numbers for the JD and MS cases. It clearly shows that there is no significant difference (the maximum difference is around 10%) among these IN cases. It is also obvious that there is no general pattern indicating how to determine a (near) optimal value of $s$ (although the best performance is achieved with $s = 10$ in this particular case). It should be pointed out, however, that the tested IN schemes exhibit better performance than the JD method. As expected, the MS choice shows the worst performance.

Fig. 9. Example 1 – total number of MG smoothing for MS, JD and 12 different IN schemes.

The same parametric study is also conducted for the second example, and the corresponding results are presented in Fig. 10. Almost the same conclusion as in the first example can be drawn. The slight difference is that the IN(s) shows even more consistent behaviour than in the first example, with IN(50) performing best. In fact, the performance differences among most of the IN(s) schemes amount to only one
or two Davidson iterations. It is also worth mentioning that the Davidson iteration with the MS scheme stagnates when the third eigenpair is evaluated.

Fig. 10. Example 2 – total number of MG smoothing for JD and 12 different IN schemes.

In summary, the IN(s) scheme exhibits a consistent performance over a large range of $s$ (except for very small values, which degenerate IN(s) towards the MS scheme). In other words, the IN(s) scheme appears to be robust, which is of practical importance. Of course, the influence of the value of $s$ on the performance of the IN scheme is nonlinear and depends on many other factors; further theoretical and numerical work is necessary in order to fully understand this issue.

7. Concluding remarks

In the present work, the most popular choices of correction equations in the Davidson method have been evaluated against the three proposed selection criteria. The Jacobi–Davidson and (modified) Inflated
Newton schemes have been identified as the two possibly best candidates. These two correction equations potentially give rise to a cubic convergence rate. In practice, however, a lower rate is normally obtained, since the correction equations are solved approximately by iterative solution procedures. A highly effective MG solution approach has been considered for the solution of the selected correction equations, and the issues associated with these two special forms of the equations have been discussed; in particular, the associated preconditioning aspects have been addressed. Moreover, a two-level integrated MG and Davidson solution strategy has been presented, in which the multigrid is further involved in the Davidson method by providing good quality initial guesses.

By means of numerical experiments, the real performance of the JD and IN schemes has been demonstrated. It has been shown that when the correction equation is solved to a moderate accuracy, the Davidson method exhibits a quadratic convergence rate. Under the specified test conditions, it has also been observed that the IN scheme performs slightly better than the JD method. A further parametric study has shown that IN(s) presents a consistent behaviour over a wide range of values of $s$. We conclude that the IN(s) scheme is an attractive alternative correction equation to the JD scheme in terms of effectiveness and robustness. It has also been demonstrated that, with the proposed solution strategy, very large scale 3D eigenvalue problems can be solved on normal computers within a reasonable time scale.

Acknowledgements

The author would like to thank Dr. R.B. Lehoucq for his encouragement and many helpful suggestions.
References

[1] R.G. Grimes, J.G. Lewis, H.D. Simon, A shifted block Lanczos algorithm for solving sparse symmetric generalized eigenproblems, SIAM J. Matrix Anal. Appl. 15 (1) (1994) 228–272.
[2] J.W.H. Liu, The multifrontal method for sparse matrix solution: theory and practice, SIAM Rev. 34 (1) (1992) 82–109.
[3] A.V. Knyazev, Preconditioned eigensolvers – an oxymoron?, Electron. Trans. Numer. Anal. 7 (1998) 104–123.
[4] E.R. Davidson, The iterative calculation of a few of the lowest eigenvalues and corresponding eigenvectors of large real-symmetric matrices, J. Comput. Phys. 17 (1975) 87–94.
[5] R.B. Morgan, D.S. Scott, Generalizations of Davidson's method for computing eigenvalues of sparse symmetric matrices, SIAM J. Sci. Statist. Comput. 7 (1986) 817–825.
[6] J. Olsen, P. Jorgensen, J. Simons, Passing the one-billion limit in full configuration-interaction (FCI) calculations, Chem. Phys. Lett. 169 (1990) 463–472.
[7] G.L.G. Sleijpen, H.A. van der Vorst, A Jacobi–Davidson iteration method for linear eigenvalue problems, SIAM J. Matrix Anal. Appl. 17 (2) (1996) 401–425.
[8] J.-L. Fattebert, An inverse iteration using multigrid for quantum chemistry, BIT 36 (1996) 494–508.
[9] J. Mandel, S. McCormick, A multilevel variational method for Au = λBu on composite grids, J. Comput. Phys. 80 (1989) 442–450.
[10] B.N. Parlett, The Symmetric Eigenvalue Problem, Prentice-Hall, Englewood Cliffs, NJ, 1980.
[11] A. Stathopoulos, Y. Saad, K. Wu, Dynamic thick restarting of the Davidson and the implicitly restarted Arnoldi methods, SIAM J. Sci. Comput. 19 (1) (1998) 227–245.
[12] A. Stathopoulos, Y. Saad, Restarting techniques for the (Jacobi–)Davidson symmetric eigenvalue methods, Electron. Trans. Numer. Anal. 7 (1998) 163–181.
[13] D.C. Sorensen, C. Yang, A truncated RQ-iteration for large scale eigenvalue calculations, SIAM J. Matrix Anal. Appl. 19 (4) (1998) 1045–1073.
[14] M. Genseberger, G.L.G. Sleijpen, Alternative correction equations in the Jacobi–Davidson method, Numer. Linear Algebra Appl. 6 (3) (1999) 235–253.
[15] J.-L. Fattebert, A block Rayleigh quotient iteration with local quadratic convergence, Electron. Trans. Numer. Anal. 7 (1998) 56–74.
[16] K. Wu, Y. Saad, A. Stathopoulos, Inexact Newton preconditioning techniques for large symmetric eigenvalue problems, Electron. Trans. Numer. Anal. 7 (1998) 202–214.
[17] A.T. Galick, Effective solution of large sparse eigenvalue problems in microelectronic simulation, Ph.D. Thesis, University of Illinois at Urbana-Champaign, 1993.
[18] A. Brandt, Multi-level adaptive solutions to boundary-value problems, Math. Comput. 31 (1977) 333–390.
[19] W. Hackbusch, Multi-grid Methods and Applications, Springer, Berlin, 1985.
[20] P. Wesseling, An Introduction to Multigrid Methods, Wiley, Chichester, 1992.
[21] J.W. Ruge, K. Stüben, Algebraic multigrid, in: S.F. McCormick (Ed.), Multigrid Methods, SIAM, Philadelphia, PA, 1987, pp. 73–130 (Chapter 4).
[22] C.C. Douglas, Multigrid methods in science and engineering, Comput. Sci. Engrg. 3 (4) (1996) 55–68.
[23] H. Guillard, Node-nested multigrid with Delaunay coarsening, Technical Report 1898, Institut National de Recherche en Informatique et en Automatique, 1993.
[24] J. Bonet, J. Peraire, An alternating digital tree (ADT) algorithm for 3D geometric searching and intersection problems, Int. J. Numer. Meth. Engng. 31 (1991) 1–17.
[25] D.R.J. Owen, Y.T. Feng, D. Peric, Iterative methods on parallel computers for FE simulation of 3D sheet forming operations, in: R.H. Wagoner, J.K. Lee, G.L. Kinzel (Eds.), Proceedings of the Third International Conference on Numerical Simulation of 3D Sheet Forming Processes (NUMISHEET'96), Dearborn, Michigan, USA, 1996.
[26] Y.T. Feng, D. Peric, D.R.J. Owen, A non-nested Galerkin multi-grid method for solving linear and nonlinear solid mechanics problems, Comput. Methods Appl. Mech. Engrg. 144 (1997) 307–325.
[27] Y.T. Feng, D. Peric, D.R.J. Owen, A multi-grid enhanced GMRES algorithm for elasto-plastic problems, Int. J. Numer. Meth. Engng. 42 (1998) 1441–1462.
[28] Y.T. Feng, D. Peric, Coarse mesh evolution strategies in the Galerkin multigrid method with mesh adaptivity for geometrically nonlinear problems, Int. J. Numer. Meth. Engng. 49 (2000) 547–571.
[29] H.A. van der Vorst, Bi-CGSTAB: a fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems, SIAM J. Sci. Statist. Comput. 13 (1992) 631–644.
[30] Y. Saad, M.H. Schultz, GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Statist. Comput. 7 (1986) 856–869.
[31] Y. Saad, A flexible inner-outer preconditioned GMRES algorithm, SIAM J. Sci. Statist. Comput. 14 (1993) 461–469.
[32] G.L.G. Sleijpen, H.A. van der Vorst, Efficient expansion of subspaces in the Jacobi–Davidson method for standard and generalized eigenproblems, Electron. Trans. Numer. Anal. 7 (1998) 75–89.
[33] G.L.G. Sleijpen, J.G.L. Booten, D.R. Fokkema, H.A. van der Vorst, Jacobi–Davidson type methods for generalized eigenproblems and polynomial eigenproblems, BIT 36 (3) (1996) 595–633.
[34] D.R. Fokkema, G.L.G. Sleijpen, H.A. van der Vorst, Jacobi–Davidson style QR and QZ algorithms for the partial reduction of matrix pencils, SIAM J. Sci. Comput. 20 (1998) 94–125.