Advances in Water Resources 26 (2003) 833–849 www.elsevier.com/locate/advwatres
Efficient steady-state solution techniques for variably saturated groundwater flow Matthew W. Farthing a,*, Christopher E. Kees b, Todd S. Coffey c, C.T. Kelley b, Cass T. Miller a a
b
Center for the Advanced Study of the Environment, Department of Environmental Sciences and Engineering, University of North Carolina, Chapel Hill, NC 27599-7431, USA Center for Research in Scientific Computation, Department of Mathematics, North Carolina State University, Raleigh, NC 27695-8205, USA c Mathematical Information and Computational Sciences, Sandia National Laboratories, Albuquerque, NM 87185-1110, USA Received 31 October 2002; received in revised form 10 April 2003; accepted 5 May 2003
Abstract We consider the simulation of steady-state variably saturated groundwater flow using RichardsÕ equation (RE). The difficulties associated with solving RE numerically are well known. Most discretization approaches for RE lead to nonlinear systems that are large and difficult to solve. The solution of nonlinear systems for steady-state problems can be particularly challenging, since a good initial guess for the steady-state solution is often hard to obtain, and the resulting linear systems may be poorly scaled. Common approaches like Picard iteration or variations of NewtonÕs method have their advantages but perform poorly with standard globalization techniques under certain conditions. Pseudo-transient continuation has been used in computational fluid dynamics for some time to obtain steady-state solutions for problems in which NewtonÕs method with standard line-search strategies fails. Here, we examine the use of pseudo-transient continuation as well as NewtonÕs method combined with standard globalization techniques for steady-state problems in heterogeneous domains. We investigate the methodsÕ performance with direct and preconditioned Krylov iterative linear solvers. We then make recommendations for robust and efficient approaches to obtain steady-state solutions for RE under a range of conditions. Published by Elsevier Ltd. Keywords: RichardsÕ equation; Steady-state; Pseudo-transient continuation
1. Introduction Variably saturated groundwater flow is commonly modeled using RichardsÕ equation (RE) along with a set of constitutive relations describing the interdependence among fluid pressures, saturations, and relative permeabilities (p–S–k relations). While analytical and semianalytical approximations for variably saturated flow exist, these are valid for limited sets of auxiliary conditions and domains [32]. As a result, significant effort has focused on developing robust techniques for solving RE numerically [9,28,30,34,36,38–40]. Obtaining solutions for RE for many realistic physical conditions remains a *
Corresponding author. E-mail addresses:
[email protected], matthew_farthing@ unc.edu (M.W. Farthing),
[email protected] (C.E. Kees), tscoffe@ sandia.gov (T.S. Coffey),
[email protected] (C.T. Kelley),
[email protected] (C.T. Miller). 0309-1708/03/$ - see front matter Published by Elsevier Ltd. doi:10.1016/S0309-1708(03)00076-9
challenge. Infiltration problems are often characterized by sharp fronts in both space and time. Steady-state solution is often nontrivial as well, since the volume fraction can vary steeply for problems with realistic boundary conditions and heterogeneous porous media. It is the nature of the nonlinearities introduced through standard p–S–k relations like the van Genuchten [37] and Mualem [29] models that accounts for the majority of the difficulties associated with solving RE. A number of issues associated with resolving this nonlinear behavior have received attention. These include solution variable transformations [4,38], the evaluation of p–S–k relations [35], approximation of interface conductivities in standard low-order spatial discretizations [28,40], choice of dependent variable [39], and time discretization approach [9,20,33,35]. Since the majority of time discretizations are implicit, both transient and steadystate problems typically lead to the solution of a system of discrete nonlinear equations. These systems can be
834
M.W. Farthing et al. / Advances in Water Resources 26 (2003) 833–849
Nomenclature Roman letters A accumulation term for pressure head form of RE A accumulation term contribution to Jacobian J Jacobian C scaling factor for Dirichlet boundary conditions F nonlinear function for DAE formulation, semi-discrete G nonlinear function for DAE formulation, discrete Ks saturated hydraulic conductivity Kss surface saturated hydraulic conductivity Ke effective hydraulic conductivity Nmax maximum number of iterations for nonlinear solution methods Od discrete spatial operator R discrete spatial operator and source term Se effective saturation T extent of temporal domain Xd extent of spatial domain along xd axis f source term for the aqueous phase evaluated at cell centers f source term for the aqueous phase g gravitational vector gu unit vector, g=kgk ki intrinsic permeability kr relative permeability mv parameter for VGM n unit outward normal for X ne total number of nodes nv parameter for VGM nxd number of nodes along xd axis p pressure of the aqueous phase t time coordinate u mass flux ub Neumann boundary value ur precipitation rate y variable for DAE formulation y0 temporal derivative for DAE formulation ymin lower bound for solution in test for evaluation error ymax upper bound for solution in test for evaluation error Greek letters Dt time step for Wtc methods Dtmax maximum time step for Wtc methods Dxd spatial increment in xd direction Dy Newton increment C boundary of physical domain CD portion of C for which Dirichlet boundary conditions are set
CN X av b c s h hr hs k kþ l . q rmin rmax s w wb w0
portion of C for which Neumann boundary conditions are set physical domain parameter for VGM compressibility of the aqueous phase switching tolerance in hybrid Newton–Picard method sufficient decrease parameter for Armijo line search volume fraction of the aqueous phase residual volumetric water content saturated volumetric water content line-search scaling factor intermediate scaling factor for quadratic line search viscosity of the aqueous phase density of the aqueous phase normalized density of the aqueous phase bound parameter for quadratic line search bound parameter for quadratic line search TTE parameter for truncation error estimate bound pressure head Dirichlet boundary value initial condition for w
Subscripts and superscripts d coordinate axis identifier (subscript) i node qualifier along x1 axis (subscript) j node qualifier along x2 axis (subscript) l global identifier for solution unknowns (subscript) n time level identifier (superscript) sn nonlinear iteration index (superscript) Abbreviations Clock wall clock time (s) DAE differential algebraic equation Feval function evaluation HAS two-level hybrid additive Schwarz domain decomposition preconditioner Jeval Jacobian evaluation LF linear iteration failure LI linear iteration LU-B banded LU decomposition from LAPACK LU-S sparse LU decomposition from SPOOLES (version 2.2) NILS NewtonÕs method with a quadratic line search NLF nonlinear iteration failure NLI nonlinear iteration ODE ordinary differential equation PIH Newton–Picard hybrid approach RE RichardsÕ equation
M.W. Farthing et al. / Advances in Water Resources 26 (2003) 833–849
SER Steps TTE VGM
switched evolution relaxation Wtc approach completed iterations for method temporal truncation errror Wtc approach combined van Genuchten and Mualem p–S–k relation
quite large and difficult to solve, especially for problems with heterogeneous domains in two and three spatial dimensions. The nonlinear system solution method used can then have a strong impact on the overall success of a simulator for RE. The most common approaches have been Picard iteration [9,10,20] or a variant of NewtonÕs method [22,36,39]. Each of these methods has its strengths and weaknesses, and several works have compared the robustness and efficiency of Picard and Newton approaches for a variety of problems [8,26,28,30]. NewtonÕs method combined with globalization techniques like a line search, reduction of time step (backtracking), or the use of Picard iteration to obtain an initial guess has proven more reliable and efficient than Picard iteration for several problems [28,30]. However, both Newton and Picard approaches have been shown to perform poorly under certain conditions. The solution of nonlinear systems can be particularly difficult for steady-state problems due to the increased difficulty of obtaining a good initial guess for the steady-state solution. In addition, the conditioning of the linearized systems is typically worse [16], and there is no longer recourse to reducing time steps when convergence breaks down [30]. Pseudo-transient continuation (Wtc) has been used in computational fluid dynamics for some time to obtain steady-state solutions for problems in which NewtonÕs method with standard line-search strategies fails [11,24,25]. Wtc combines features of backward Euler time integration and NewtonÕs method with the idea of using information about the transient physical problem to guide selection of intermediate iterates to the steady-state solution [16]. Since it uses information from a time-dependent problem, Wtc can be more robust than standard line-search strategies, while incurring less computational expense than full integration of the transient problem [24]. In this work, we seek effective techniques for obtaining steady-state solutions to RE. We examine the use of Wtc methods as well as NewtonÕs method combined with standard globalization techniques for steady-state problems in homogeneous and heterogeneous domains. We investigate the methodsÕ performance with both direct and preconditioned Krylov iterative linear solvers. We then make recommendations for robust and efficient approaches to obtain steady-state solutions for RE under various conditions.
p–S–k Wtc
835
pressure–saturation–relative permeability constitutive relation pseudo-transient continuation
2. Background Historically, Picard iteration has been the most common nonlinear solution method used for RE [9,33]. Its appeal can be traced to the fact that it is simple to implement, since it does not involve derivatives of terms involving the p–S–k relations and produces symmetric linear systems [30]. While Picard iteration is still used [10,20], Newton and inexact Newton approaches have become increasingly more common [15,19,22], particularly for more realistic multidimensional problems. An inexact Newton approach can be thought of as NewtonÕs method where the equation for the Newton update is satisfied only approximately [36]. This may be due to the fact that the Newton update is solved only approximately using, for example, a Krylov-type iterative linear solver [19]. Or, only an approximate Jacobian may be used in the Newton update. Common approaches for approximating the full Jacobian are to use finite differences (a numerical Jacobian) [15] or to lag its evaluation over a number of iterates (chord or modified NewtonÕs method) [35]. While these approaches may be conceptually straightforward, a number of subtleties often arise [23]. For instance, the choice of stopping criteria and the associated tolerances for both the nonlinear and linear solvers, the frequency of Jacobian evaluations, and the increment used in evaluating numerical Jacobians can all play a significant role in the success of an approach for a given problem [8,26,28,30,36]. Since the convergence of NewtonÕs method can be quite sensitive to the quality of initial guess, supplementing a Newton approach with strategies like a line search and backtracking has improved nonlinear solver performance for some problems [28,31]. Line-search techniques typically select a fraction of the full Newton correction to update the current iterate with the step chosen to insure that the updated solution represents a sufficient decrease in the nonlinear residual [11,19]. For transient problems, methods commonly adjust the time step based on nonlinear solver performance. Specifically, in the case of poor nonlinear solver performance, the time step is often reduced [8,33,35]. This can improve the scaling of the Jacobian, since the reciprocal of the time step typically appears on the diagonal [16], and can improve the quality of the initial guess obtained from the solution at the previous time step [30]. Another strategy designed to
836
M.W. Farthing et al. / Advances in Water Resources 26 (2003) 833–849
address Newton approachesÕ sensitivity to initial guess is a hybrid Newton–Picard algorithm which uses a Picard iteration initially until a certain level of convergence is reached and then switches to NewtonÕs method [8,30]. Like the hybrid Newton–Picard algorithm, Wtc attempts to combine the rapid convergence of NewtonÕs method near the steady-state solution with a more robust iteration far from the solution. Wtc employs a timestepping approach when the iteration is far from the steady state, and thus exploits the relationship between the nonlinear system for the steady-state problem and the initial value problem from which it is derived. The time-stepping algorithm used in Wtc has properties similar to forward Euler with simple heuristics, which makes it both unstable and inaccurate as a time-integration method for stiff ordinary differential equations (ODEs). Nevertheless, Wtc can effectively maintain certain important qualities of the transient solution of some problems such as enforcing a CFL condition, and has been shown to be effective in reaching the steady state for models with discontinuous transient phenomena such as shocks. While RE does not exhibit shocks under physically realistic choices of parameters, it does exhibit sharp fronts due to the nonlinearities as well as steep gradients due to discontinuities in the porous media properties.
Here p, ., and l are the pressure, density, and viscosity of the aqueous phase, h is the volume fraction, and f is a source term for the aqueous phase. .0 is the density at p0 , b is the compressibility of the aqueous phase, w is the pressure head, and g is a vector accounting for the acceleration of gravity. Ke is the effective hydraulic conductivity, Ks is the saturated hydraulic conductivity, and ki is the intrinsic permeability of the porous medium. X R2 is the spatial domain, and ½0; T is the temporal domain. The auxiliary conditions for Eq. (1) are given by w ¼ wb u n ¼ ub w ¼ w0
on CD ; t 2 ½0; T
ð3Þ
on CN ; t 2 ½0; T
ð4Þ
in X; t ¼ 0
ð5Þ
u ¼ qKe ðrw qgu Þ
ð6Þ
where
is the mass flux, wb and ub are boundary condition functions. w0 is the initial condition, and n is the unit outward normal for X. We have also set C ¼ oX ¼ CD [ CN with CD \ CN ¼ ;. The steady-state form of Eq. (1) is r qKe ðrw qgu Þ ¼ f
in X
ð7Þ
3. Approach 3.1. Formulation RE can be formulated in a number of ways. We begin with an expression for the conservation of mass for the aqueous phase in an air–water system where the solid phase is assumed immobile, and interphase mass transfer is neglected. Combining this with the standard extension of DarcyÕs law to variable saturated conditions [27] gives oðqhÞ ¼ r qKe ðrw qgu Þ þ f ot
in X ½0; T
ð1Þ
. ¼ .0 ebðp p0 Þ q ¼ .=.0 w¼
Ks ¼
.0 kgkki l
Ke ¼ kr ðwÞKs
Se ¼
ðh hr Þ ðhs hr Þ
Se ¼ ½1 þ ðav jwjÞnv mv kr ¼
pffiffiffiffiffi m 2 Se ½1 ð1 Se1=mv Þ v
ð8Þ ð9Þ ð10Þ
where Se is the effective saturation, hr is the residual volumetric water content, hs is the saturated volumetric water content, av is a parameter related to the mean pore size, nv is a parameter related to the uniformity of the soil pore-size distribution, and mv ¼ 1 1=nv . For w P 0, the porous medium is fully saturated, and Eqs. (8)–(10) revert to
with
p .0 kgk g gu ¼ kgk
with auxiliary data given by Eqs. (3) and (4) without temporal dependence. We use nonhysteretic forms of the p–S–k relations of van Genuchten [37] and Mualem [29] (VGM). For w < 0, these are given by
ð2Þ
Se ¼ 1
ð11Þ
kr ¼ 1
ð12Þ
The pressure head form of RE is obtained from Eq. (1) by applying the chain rule to the left-hand side
M.W. Farthing et al. / Advances in Water Resources 26 (2003) 833–849
ow ¼ r qKe ðrw qgu Þ þ f ot oq oh þq AðwÞ ¼ h ow ow
AðwÞ
ð13Þ
The physical boundaries are located at the nodes (cell centers), so specifying Dirichlet conditions in pressure or volume fraction is straightforward. For example, at a cell Xi;j CD , we set Cðwi;j wbi;j Þ ¼ 0;
3.2. Spatial discretization We consider a discretization of X ¼ ½0; X1
½0; X2 R2 into a regular, orthogonal grid with ne ¼ nx1 nx2 nodes with Dxd ¼ Xd =ðnxd 1Þ, for d ¼ 1; 2. In the interior, we define cells of size Dx1 Dx2 around each node and apply a cell-centered finite difference approximation to the right-hand side of Eq. (1). We also apply a cell-centered discretization along the boundary by extending the domain with fictitious ghost cells of width Dxd =2. For a cell Xij in the interior, we write oðqi;j hi;j Þ ¼ Ri;j ot Ri;j ¼ Od;i;j fi;j
ð14Þ i ¼ 2; . . . ; nx1 1; j ¼ 2; . . . ; nx2 1 ð15Þ
For a cell Xi;j in the interior, the discrete spatial approximation is
Od;i;j ¼
wiþ1;j wi;j 1 q Ke;iþ1=2;j qiþ1=2;j gu1 Dx1 iþ1=2;j Dx1 wi;j wi 1;j qi 1=2;j Ke;i 1=2;j qi 1=2;j gu1 Dx1 wi;jþ1 wi;j 1 qi;jþ1=2 Ke;i;jþ1=2 þ qi;jþ1=2 gu2 Dx2 Dx2 wi;j wi;j 1 qi;j 1=2 Ke;i;j 1=2 qi;j 1=2 gu2 Dx2 ð16Þ
where the subscripts in gu ¼ ½gu1 ; gu2 T indicate values for each coordinate, and quantities with a 1/2 subscript denote values estimated at cell interfaces. For the interface values, we use a harmonic average for saturated hydraulic conductivity, the arithmetic average for density, and upwind the relative permeability qiþ1=2;j ¼ ½qðwiþ1;j Þ þ qðwi;j Þ =2
ð17Þ
Ke;iþ1=2;j ¼ kr;iþ1=2;j Ks;iþ1=2;j Ks;iþ1=2;j ¼ 2Ks;i;j Ks;iþ1;j =ðKs;i;j þ Ks;iþ1;j Þ ( w wi;j kr ðwiþ1;j Þ if iþ1;j > qiþ1=2;j gu1 Dx1 kr;iþ1=2;j ¼ kr ðwi;j Þ otherwise
ð18Þ
The corresponding terms along the other coordinate axis are defined symmetrically.
837
ð19Þ
where C is a scaling factor that gives the boundary equation roughly the same scaling as the interior nodes. For cells along CN , we use linear extrapolation to apply the flux at the exterior (artificial) boundary of the extended ghost cell rather than the cell center. This approach allows us to use the same nodal equation at boundary nodes Xi;j CN as we do in the interior. If, for example, X1;j CN , we set u1;1=2;j ¼ 2ub1;1;j u1;3=2;j
ð20Þ T
at the fictitious left cell boundary. Here u ¼ ½u1 ; u2 . 3.3. Temporal approximation Since we consider both direct solution of the steadystate problem Eq. (7) and integration of Eq. (1) to steady state, we begin by applying the cell-centered finite difference spatial approximation to Eq. (13) in a method of lines (MOL) context Aðwi;j Þ
owi;j ¼ Ri;j i ¼ 2; . . . ; nx1 1; j ¼ 2; . . . ; nx2 1 ot ð21Þ
with the Dirichlet and Neumann boundary conditions given by Eqs. (19) and (20). The semi-discrete system corresponding to Eq. (21) can be written as a set of differential algebraic equations (DAEs) Fðt; y; y0 Þ ¼ 0
ð22Þ
where F represents a set of equations that depend on time t, a set of dependent variables y, and a set of firstorder derivatives with respect to time of these dependent variables, y0 . A variety of approaches can be used to integrate Eq. (22). To illustrate the structure of the nonlinear system for transient problems, we use a backward Euler approximation to convert Eq. (22) to a fully discrete system [14,22]. 1 G tnþ1 ; ynþ1 ; nþ1 ðynþ1 yn Þ ¼ 0 ð23Þ Dt 3.3.1. Nonlinear solution for the transient problem At each time level, a full time integration approach such as backward Euler must solve the nonlinear system Eq. (23). A general Newton iteration for Eq. (23) can be written
838
M.W. Farthing et al. / Advances in Water Resources 26 (2003) 833–849
½Jnþ1;sn fDysn þ1 g ¼ fGnþ1;sn g
ð24Þ
where fDysn þ1 g ¼ fynþ1;sn þ1 g fynþ1;sn g and sn is a nonlinear iteration index. The Jacobian, J, is formed by differentiating Eq. (23) with respect to y. For Eq. (21), we can write J as " # nþ1;sn 1 oðAy0 Þ oOd nþ1;sn nþ1;sn nþ1;sn ½J ¼ nþ1 ½A þ Dt oy oy ð25Þ where A is diagonal with ½A l;l ¼ Aðwi;j Þ, oAy0 =oy is also diagonal, and oOd =oy will be banded with five nonzero entries. Here, l is a global identifier corresponding to cell Xi;j with, for example, l ¼ ðj 1Þnx1 þ i for i ¼ 1; . . . ; nx1 , j ¼ 1; . . . ; nx2 . For unknowns along CD , J is simply (see Eq. (19)) ½Jnþ1;sn l;l ¼ C
ð26Þ
3.4. Wtc approximation Wtc attempts to find a solution to Eq. (7) by integrating Eq. (21) to steady state. The approach is straightforward and can be included in many existing steady-state or transient solvers with minor modifications. The fully discrete Wtc system can be obtained from Eq. (21) by first applying a backward Euler time discretization as in Eq. (23) and then using NewtonÕs method with yn as the initial guess. 3.4.1. Solution for Wtc update While applying multiple iterations of NewtonÕs method is possible, the form of Wtc which we present here performs only a single Newton update for Eq. (23) [24]. The resulting equation for the Wtc iterate is ½Jn fDynþ1 g ¼ fRn g ¼ fOd ðyn Þg þ ff n g
ð27Þ
with fDynþ1 g ¼ fynþ1 g fyn g since only one Newton iteration is performed and fynþ1;0 g ¼ fyn g. J has the same form as Eq. (25), but without a derivative of the accumulation term n 1 oOd n n ½J ¼ nþ1 ½A ð28Þ Dt oy Note that when Dt is small enough ½Jn Dt1nþ1 ½An and therefore Dy
nþ1
Dt
nþ1
n 1
n
½A fR g
ð29Þ
which is the update corresponding to the forward Euler method applied to the ODE form of RE. 3.4.2. Wtc step selection Wtc solves a series of problems of the form in Eq. (27), while adapting the time step Dtnþ1 based on the intermediate solutionÕs behavior. However, we note that
the intermediate solution does not necessarily represent an accurate approximation to the physical solution at an intermediate time. The Wtc iterations terminate based on the norm of the steady-state residual kRk becoming sufficiently small [24]. There are a number of common strategies for selecting Dtnþ1 , including the switched evolution relaxation (SER) strategy. SER increases the time step in inverse proportion to the reduction in the steady-state residual [11,16,24]. Dtnþ1 ¼ Dtn
kRn 1 k kRn k
ð30Þ
The temporal truncation error approach (TTE) attempts to control the time step based on the local temporal truncation error [11,16,24]. It chooses Dtnþ1 so that
ðDtnþ1 Þ2 o2 y
l n
ð31Þ ðt Þ
6s
2ð1 þ jyln jÞ ot2 for each component of the solution yl and some constant s. Here, we estimate o2 yl =ot2 at tn by [11] n o2 yl n 2 yl yln 1 yln 1 yln 2 ðt Þ n ð32Þ Dt þ Dtn 1 ot2 Dtn Dtn 1 To avoid evaluation errors in the constitutive relations and to increase the robustness of the Wtc iterations, we include a test to make sure that the solution is within broad, physically relevant bounds ½ymin ; ymax . If the solution is outside these bounds the time step is halved and the step is repeated. In the numerical experiments presented below, we use an interval ymin ¼ 100 m, ymax ¼ 100 m. We also enforce an upper bound Dtmax on the chosen time step for both SER and TTE. 3.5. Newton’s method for the steady-state problem Following Eq. (21), we can write a cell-centered finite difference spatial approximation of Eq. (7) to solve the steady-state problem directly Ri;j ¼ 0
i ¼ 2; . . . ; nx1 1; j ¼ 2; . . . ; nx2 1
ð33Þ
with Dirichlet and Neumann boundary conditions given by Eqs. (19) and (20) without temporal dependence. The Newton update for Eq. (33) is ½Jsn fDysn þ1 g ¼ fRsn g oOd ½J ¼ oy
ð34Þ
3.5.1. Globalization techniques A number of globalization strategies are commonly used to address the sensitivity of NewtonÕs method to initial guess. The Armijo line-search technique scales the original Newton update by a factor chosen to take a step as close to the original update as possible while insuring
M.W. Farthing et al. / Advances in Water Resources 26 (2003) 833–849
a sufficient decrease in the nonlinear residual [19]. At iteration level sn for Eq. (34), the Armijo line search can be written (1) Solve Eq. (34) for Dysn þ1 , and set k ¼ kþ ¼ 1 (2) While kRðysn þ kDysn þ1 Þk > ð1 s kÞkRðysn Þk Choose kþ if kþ < rmin k, kþ ¼ rmin k else if kþ > rmax k, kþ ¼ rmax k k ¼ kþ (3) Set fysn þ1 g ¼ fysn g þ kfDysn þ1 g where s is a parameter controlling the amount of decrease required in the nonlinear residual and k is the final scaling factor. Here, we chose kþ so that it minimized a three-point parabolic approximation of kRðysn þ kDysn þ1 Þk ¼ f ðkÞ. Bounds on kþ are dictated by rmin and rmax . The details of this approach can be found in [23]. For this work we set rmin ¼ 0:1, rmax ¼ 0:55, and s ¼ 10 4 . The line search can fail if k becomes too small. In this case, the nonlinear solver is said to have failed due to line-search stagnation [11]. As with the Wtc approaches, we include a test to make sure that the solution is within a large interval in order to avoid errors evaluating constitutive relations and improve the robustness of the iterations. (1) Solve Eq. (34) for Dysn þ1 , and set k ¼ 1 (2) While ylsn þ kDylsn þ1 62 ½ymin ; ymax 8l k ¼ k=2 (3) Set fysn þ1 g ¼ fysn g þ kfDysn þ1 g In the numerical experiments, we include this test for both Newton and hybrid Newton–Picard methods. We use the same interval ymin ¼ 100 m, ymax ¼ 100 m as is used in the corresponding bounds test for the Wtc methods. 3.6. Picard iteration Picard iteration has been used widely in the solution of RE [10,20]. In brief, a Picard linearization of Eq. (33) can be formulated as the Newton update in Eq. (34) but with an approximate Jacobian in which the coefficient derivatives okr =ow and oq=ow are omitted from oOd =oy [26,30]. The performance of Picard iteration and Newton approaches have been compared in several works [26,28,30]. In many cases, NewtonÕs method with line search has proven more robust than a straightforward application of Picard iteration [28]. However, a hybrid Newton– Picard algorithm has also been suggested for reducing the sensitivity of NewtonÕs method to the quality of the initial guess [8,30]. This approach performs an initial number of iterations for Eq. (34) using the Picard approximation for J and then switches to a Newton update
839
with the full Jacobian [8,30]. The motivation for this approach is that the Picard iterations are, in general, cheaper than their Newton counterparts. Ideally, the switch to NewtonÕs method is chosen so that (1) a minimal number of Picard iterations are performed; and, (2) upon switching, ysn is sufficiently close to the true to solution that the asymptotic convergence rate for the Newton updates is realized. There are several possible criteria for determining the crossover from Picard to Newton iteration, including kDysn k < c [8]. Alternatively, one can base the switch on sufficient decrease in the initial residual, kRsn þ1 k < c kR0 k
ð35Þ
We use Eq. (35) in the numerical results presented below. 3.7. Linear system solution We test five algorithms for solving the linear systems arising in the nonlinear iteration. As a direct solver, we use both the banded LU decomposition from LAPACK (LU-B) [1], as well as the implementation of LU from the SPOOLES package (LU-S) [2,3] in PETSc [5–7]. We also test the iterative method BiCGstab using ILU preconditioning with zero fill from PETSc (BiCGstabILU), and a two-level hybrid additive Schwarz domain decomposition method (BiCGstab-HAS) [18].
4. Results In the following sections we will present results of several numerical experiments and compare the behavior of Wtc with NewtonÕs method and a hybrid Newton– Picard iteration. First we summarize the test problems on which the numerical experiments were carried out. 4.1. Test problems 4.1.1. Problem 1: infiltration example The first test problem is a relatively simple onedimensional example and provides a case where each of the methods should perform well. We consider each test case as a transient problem with a corresponding steadystate solution. The first example simulates infiltration in vertical domain X ¼ ½0; X1 . Initial conditions for the infiltration are set to static equilibrium with the water table, located at the bottom of the domain. Constant Dirichlet boundary conditions are set at the top so that the steady-state solution contains both saturated and unsaturated regions. Table 1 summarizes the relevant physical parameters and auxiliary conditions.
840
M.W. Farthing et al. / Advances in Water Resources 26 (2003) 833–849
Table 1 Fluid and domain properties for Problem 1 Variable
Value
Units
gu1 kgk .0 b p0 X1 wb ðx1 ¼ 0Þ wb ðx1 ¼ 10Þ w0 ðx1 Þ nv av Ks
)1.0 7.321 · 1010 998.2 6.564 · 10 20 0 10 0 )0.05 x1 4.264 5.470 5.040
– m/d2 kg/m3 m d2 /kg kg/m d2 m m m m – m 1 m/d
4.1.2. Problem 2: hillslope example The spatial domain for the second problem, X ¼ ½0; X1 ½0; X2 , is illustrated in Fig. 1. The domain is partitioned into rectangular blocks, each with constant media properties. The lower-left and upper-right coordinates of each block are given in Table 2 along with the media type for the block, which we have labeled P1, P2, P3, or P4. Table 3 lists the corresponding media properties. The media types represent a range of soil
types from clay to sand. The temporal domain is t 2 ½0; T with T chosen so that the solution is at steady state. The relevant fluid properties and auxiliary conditions for Problem 2 are given in Table 4. The log of the saturated hydraulic conductivity is given in Fig. 2. Note that the domain is rotated 45 to simulate a simple hillslope. The boundary and initial conditions are configured to reflect an impermeable bedrock at the X2 boundary, no flow at the X1þ boundary, and static, saturated equilibrium along X1 . The surface boundary condition at X2þ is a nonlinear flux boundary condition that simulates infiltrating precipitation until the surface becomes saturated. After saturated conditions are reached at the surface, the outward normal flux increases to zero and becomes positive as the pressure rises above atmospheric pressure. This condition reflects a well-drained surface that permits very little ponding at the surface. The boundary conditions can be summarized as follows: w X ¼ x2
ð36Þ
ubX þ ¼ 0
ð37Þ
ubX ¼ 0
ð38Þ
1
1
2
X2+
(X1 , X2 )
ubX þ ¼ 2
X1
ur ; ur þ Kss w;
w<0 w>0
ð39Þ
where ur is the precipitation rate and Kss is the saturated conductivity at the surface. For the numerical experiments presented, Kss ¼ 10 m/d. The steady-state solution of the volume fraction is given in Fig. 3. This solution was computed by integrating the transient problem to T ¼ 1700 d at which time kRk1 < 10 5 . The solution was obtained using a DAE integrator with relative and absolute integration tolerances of 10 4 [21,22]. The transient solution exhibits steep moving fronts throughout the domain while the equilibrium solution shown maintains a number of steep moisture gradients due to the layering of the media.
X 1+
(0, 0)
X2-
2
Fig. 1. Problem domain X.
Table 2 Media distribution for Problem 2 Medium type
P1
P2
P1
P1
P1
P1
P1
P2
P2
P2
Left [m] Bottom [m] Right [m] Top [m]
0 0 10 0.1
0 0.1 0.71 1
0.7 0.1 1.4 0.8
1.4 0.1 2.9 0.3
2.9 0.2 6.4 0.3
7.1 0.1 9.3 0.3
8.6 0.3 9.3 0.8
2.9 0.1 7.1 0.2
6.4 0.2 7.1 0.3
9.3 0.1 10 1
P2
P2
P2
P2
P2
P3
P3
P3
P4
P1
1.4 0.6 8.6 0.7
0.7 0.8 9.3 1
1.4 0.7 2.9 0.8
7.1 0.7 8.6 0.8
1.4 0.4 3.6 0.6
2.9 0.7 7.1 0.8
5.0 0.4 8.6 0.5
1.4 0.3 8.6 0.4
Left [m] Bottom [m] Right [m] Top [m]
3.6 0.4 5.0 0.6
5.0 0.5 8.6 0.6
M.W. Farthing et al. / Advances in Water Resources 26 (2003) 833–849
841
Table 3 Media properties for Problem 2 Medium type
hs
hr
nv
av [m 1 ]
Ks [m/d]
P1 P2 P3 P4
0.41 0.40 0.39 0.39
0.07749 0.03120 0.03822 0.02691
2.090 4.264 2.370 3.264
0.244 5.470 0.478 0.244
1.10808 · 10 5 5.04000 · 100 1.80100 · 10 3 4.04000 · 100
Table 4 Fluid and domain properties for Problem 2 Variable
Value
Units
gu1 gu2 kgk .0 b p0 X1 X2 ur
)0.7071 )0.7071 7.321 · 1010 998.2 6.564 · 10 20 0 10 1 )0.4
– – m/d2 kg/m3 m d2 /kg kg/m d2 m m m/d
4.2. Numerical experiments For both test problems, we performed several numerical experiments on a series of grids. On each grid, we ran NewtonÕs method with a quadratic line search (NILS), the Newton–Picard hybrid approach (PIH), as well as two Wtc methods, SER and TTE, until kRk2 < ðkR0 k2 þ 1Þ 10 5 . The initial guess was taken to be the initial conditions for the corresponding transient problem. For the NILS and PIH calculations, we enforced a maximum number of nonlinear iterations sn 6 Nmax ¼ 1000, while the maximum number of steps
for the SER and TTE runs was n 6 Nmax ¼ 5000. The maximum number of line searches allowed was 1000 and the sufficient decrease parameter for Newton line search was s ¼ 10 4 . The PIH approach used a value of c ¼ 10 2 in its switching strategy, Eq. (35). The Wtc methods used an initial time step of Dt0 ¼ 4:050 10 5 , a maximum time step Dtmax ¼ 105 , and the TTE algorithm used s ¼ 1:0. The relevant parameters are summarized in Table 5. The linear systems for Problem 1 were solved using the LU-B decomposition from LAPACK with each method. For Problem 2, we ran the numerical experiments for each of the four steady-state solution methods with the LU-S direct solver as well the two iterative methods (BiCGstab-ILU and BiCGstab-HAS). A relative residual test was used for BiCGstab with a tolerance of 10 6 . The maximum number of linear iterations allowed was 2000 except where noted. The simulations were performed on a Pentium 4 (2.53 GHz) workstation with 256 MB of RAM running Redhat Linux 7.3. The elapsed time for the simulations was recorded using the ANSI C clock and time intrinsics. The results for Problem 1 on spatial grids of size ne ¼ 41–161 are presented in Table 6. The labels are as follows:
Fig. 2. logðKs Þ for Problem 2.
842
M.W. Farthing et al. / Advances in Water Resources 26 (2003) 833–849
Fig. 3. Steady-state volume fraction for second test problem.
• • • • • • •
Table 5 Numerical methods summary
Nmax s c
Nmax Dt0 Dtmax s
NILS
PIH
1000 10 4 –
1000 – 10 2
TTE
SER
5000 4.050 · 10 5 105 1.0
5000 4.050 · 10 5 105 –
Steps––completed iterations for method NLI––nonlinear iteration NLF––nonlinear iteration failure LI––linear iteration LF––linear iteration failure Clock––wall clock time (s) X––failed to converge
The simulations for Problem 1 were small and involved a homogeneous medium. As a result, we expected the methods to have little difficulty in its solution. Both the NILS method and the PIH iteration performed well, even though the NILS method employed a number of line searches on each spatial grid. PIH took roughly half as many steps as NILS and needed only one line search
• Jeval––Jacobian evaluation • Feval––function evaluation Table 6 LU-B runs for Problem 1 Jeval
Steps
NLI
NLF
LI
LF
LS
Clock
299 33 103 167
30 14 51 83
30 14 0 0
0 0 0 0
0 0 0 0
0 0 0 0
102 1 0 0
0.05 0.01 0.06 0.04
26 15 104 168
253 35 209 337
26 15 104 168
26 15 0 0
0 0 0 0
0 0 0 0
0 0 0 0
88 1 0 0
0.04 0.03 0.14 0.09
25 16 207 353
329 35 410 705
25 16 202 351
25 16 0 0
0 0 0 0
0 0 0 0
0 0 0 0
128 0 0 0
0.08 0.06 1.43 0.33
ne ¼ 41 NILS PIH TTE SER
30 14 51 83
ne ¼ 81 NILS PIH TTE SER ne ¼ 161 NILS PIH TTE SER
Feval
M.W. Farthing et al. / Advances in Water Resources 26 (2003) 833–849 -1
10
NILS PIH -1
10
||R|| 2 /||R0||2
-3
10
10
10
-5
-7
-9
10
0
5
10
15 iterate
20
25
30
Fig. 4. NILS/PIH residual history, Problem 1.
once it switched to the Newton iteration, indicating that the initial Picard iterations were more successful in advancing the intermediate iterates closer to the steadystate solution than NewtonÕs method with a line search alone. The Wtc methods also converged to the correct solution, but required significantly more steps than either the NILS or PIH approaches. To illustrate the methodsÕ performance, Fig. 4 shows the residual history for NILS and PIH. Both methods achieved quadratic convergence as they approached the root. Problem 2 was more challenging than Problem 1 due to its dimensionality, heterogeneous medium, and the
843
nonlinear boundary condition at the surface. Results of the numerical experiments with LU-S and BiCGstab are presented in Tables 7–9. The increased wall clock times, iteration counts, and line searches reflect the added difficulty of Problem 2. NILS converged for every linear solver and spatial grid in Tables 7–9. It required a similar number of nonlinear iterations and line searches for a particular grid, regardless of the linear solver used. Unlike NILS, PIH failed for the cases considered. With one exception where it failed in the initial linear solve, PIH did not reduce the original nonlinear residual sufficiently to satisfy Eq. (35) and switch to NILS. In these cases, it exhausted the allowed number of nonlinear iterations. Both the Wtc methods converged for each spatial grid and linear solver combination in Tables 7–9, with TTE consistently between 2 and 5 times faster than SER. The run times for NILS and TTE were similar for most of the simulations. NILS was more efficient with the LU-S solver, particularly for the 81 · 81 grid. On the other hand, TTE was more efficient for the BiCGstab calculations. The difference in run times increased for the HAS preconditioner, where TTE was 3 and 1.4 times faster than NILS on the 41 · 41 and 81 · 81 grids respectively. The results also indicate a difference in the way NILS and TTE behaved as the spatial grid was refined. Namely, NILS required roughly the same number of iterations to converge regardless of spatial grid or linear solver, while the number of steps taken by TTE grew noticeably as ne increased. To investigate
Table 7 LU-S runs for Problem 2 Jeval
Feval
Steps
NLI
NLF
LI
LF
LS
ne ¼ 11 11 NILS PIH TTE SER
Clock
83 1000 49 145
3901 6173 99 291
83 1000 49 145
83 1000 0 0
0 1 0 0
83 1000 49 145
0 0 0 0
1806 0 0 0
1.5 X 0.26 1.65
ne ¼ 21 21 NILS PIH TTE SER
132 1000 91 379
7057 2239 183 759
132 1000 91 379
132 1000 0 0
0 1 0 0
132 1000 91 379
0 0 0 0
3291 0 0 0
2.55 X 2.59 7.5
ne ¼ 41 41 NILS PIH TTE SER
106 1000 173 866
4725 2885 346 1732
106 1000 172 865
106 1000 0 0
0 1 0 0
106 1000 173 866
0 0 0 0
2181 0 0 0
9.92 X 12.23 53.29
ne ¼ 81 81 NILS PIH TTE SER
149 1000 450 2068
7369 3449 888 4135
149 1000 437 2066
149 1000 0 0
0 1 0 0
149 1000 450 2068
0 0 0 0
3412 0 0 0
56.44 X 164.43 709.67
844
M.W. Farthing et al. / Advances in Water Resources 26 (2003) 833–849
Table 8 BiCGstab-ILU runs for Problem 2 Jeval
Feval
Steps
NLI
NLF
ne ¼ 11 11 NILS PIH TTE SER
83 1000 49 145
3901 6173 99 291
83 1000 49 145
83 1000 0 0
0 1 0 0
ne ¼ 21 21 NILS PIH TTE SER
134 1000 98 378
7205 2239 197 757
134 1000 98 378
134 1000 0 0
ne ¼ 41 41 NILS PIH TTE SER
111 1000 177 865
4863 2893 354 1730
111 1000 176 864
ne ¼ 81 81 NILS PIH TTE SER
145 1 424 2068
7127 3 847 4135
145 1 422 2066
LI
LF
LS
Clock
1297 7371 210 445
0 0 0 0
1806 0 0 0
1.58 X 0.3 1.77
0 1 0 0
6895 19419 970 2559
0 0 0 0
3360 0 0 0
4.5 X 2.08 7.14
111 1000 0 0
0 1 0 0
11153 38981 2215 11693
0 0 0 0
2239 0 0 0
17.01 X 13.49 54.44
145 1 0 0
0 0 0 0
31799 2000 7934 50243
0 1 0 0
3301 0 0 0
148.57 X 136.65 604
Clock
Table 9 BiCGstab-HAS runs for Problem 2 Jeval
Feval
Steps
NLI
NLF
LI
LF
LS
ne ¼ 11 11 NILS PIH TTE SER
84 1000 49 146
3959 6173 99 293
84 1000 49 146
84 1000 0 0
0 1 0 0
4200 24663 425 1076
0 0 0 0
1833 0 0 0
1.13 X 0.38 1.99
ne ¼ 21 21 NILS PIH TTE SER
131 1000 88 379
7059 2239 177 759
131 1000 88 379
131 1000 0 0
0 1 0 0
46899 51097 1628 5799
0 0 0 0
3293 0 0 0
26.3 X 2.63 9.04
ne ¼ 41 41 NILS PIH TTE SER
110 1000 168 866
4991 3019 336 1732
110 1000 167 865
110 1000 0 0
0 1 0 0
24471 74767 4814 26241
0 0 0 0
2305 0 0 0
69.02 X 23.93 117.69
ne ¼ 81 81 NILS PIH TTE SER
148 1000 439 2068
7315 3449 867 4135
148 1000 427 2066
148 1000 0 0
0 1 0 0
30762 69639 11964 78706
0 0 0 0
3389 0 0 0
395.15 X 288.42 1377.61
the performance of TTE, we ran an additional set of calculations where s was set to 10 3 and scaled by ne for each grid. The corresponding s values ranged from 0.121 on the 11 · 11 grid to 25.9 on a grid with ne ¼ 161 161. As a result, the TTE time step selection was slightly more conservative on coarser grids than the original choice of s ¼ 1 and was more ag-
gressive on the finer grids. Table 10 contains the results for TTE with LU-S and BiCGstab for ne ¼ 11 11 to ne ¼ 161 161 as well as the results for NILS on the finest grid. With the use of a scaled s, Steps for TTE was roughly the same for ne ¼ 11 11 to ne ¼ 81 81 and each linear solver. There was still, however, an increase in the
M.W. Farthing et al. / Advances in Water Resources 26 (2003) 833–849
845
Table 10 Runs with s scaled by ne for Problem 2 ne
Jeval
Feval
Steps
NLI
NLF
11 · 11 21 · 21 41 · 41 81 · 81 161 · 161 161 · 161
114 127 125 146 248 232
229 255 250 281 458 12961
114 127 124 134 209 232
0 0 0 0 0 232
0 0 0 0 0 0
BiCGstab-ILU TTE 11 · 11 TTE 21 · 21 TTE 41 · 41 TTE 81 · 81 TTE 161 · 161 NILS 161 · 161
113 129 130 148 806 1
227 259 260 289 1185 3
113 129 129 140 378 1
0 0 0 0 0 1
BiCGstab-HAS TTE 11 · 11 TTE 21 · 21 TTE 41 · 41 TTE 81 · 81 TTE 161 · 161 NILS 161 · 161
115 125 131 152 283 236
231 251 262 288 472 13153
115 125 130 135 188 236
0 0 0 0 0 236
number of steps for the finest grid, particularly for the BiCGstab-ILU solver. TTE with the scaled s took slightly longer than the original s ¼ 1 calculations on the coarser grids, but reduced the total simulation time as the grids were refined. There was a corresponding improvement for TTE on the larger spatial grids with all three solvers. The computational effort for NILS and TTE with s scaled was essentially the same for LU-S except for the finest grid, where NILS was 1.6 times faster. For BiCGstab-ILU and BiCGstab-HAS, TTE with s scaled was twice as fast as the simulations with s ¼ 1 on the ne ¼ 81 81 grid. We also note that BiCGstab-ILU did not perform well with the ne ¼ 161 161 system. Simulations with each of the steady-state solution methods and BiCGstab-ILU typically required significantly less computational effort than BiCGstab-HAS for the coarser grids. However, for ne ¼ 161 161 the total number of steps, linear iterations, and simulation time increased significantly for TTE while NILS failed after both 2000 and 10,000 linear iterations. As an example of the progress of the Wtc iterations in comparison to the Newton and Newton–Picard iterations, we graphed the history of kRk2 versus iteration for SER on the ne ¼ 81 81 grid with the LU-S linear solver in Fig. 5. As it neared the steady-state solution, the methodÕs convergence steepened. This reflects the convergence of the Newton iteration that Wtc reduces to near the root. Unlike the globalized Newton iteration, however, Wtc does not enforce a decreasing sequence of residuals. As a result, Fig. 5 shows increased residuals at
LF
LS
Clock
114 127 125 146 248 232
0 0 0 0 0 0
0 0 0 0 0 6043
1.57 2.96 8.16 58 667.07 429.58
0 0 0 0 0 0
447 1132 1739 3474 92232 10000
0 0 0 0 0 1
0 0 0 0 0 0
0.69 2.36 9.72 65.85 2330 X
0 0 0 0 0 0
979 2173 3782 7192 20806 70484
0 0 0 0 0 0
0 0 0 0 0 6130
1.86 4.28 18.17 131.9 1386.07 3397
0
10
-2
10
-4
10 ||R|| /||R || 2 2 0
LU-S TTE TTE TTE TTE TTE NILS
LI
-6
10
-8
10
-10
10
-12
10
0
500
1000
1500
2000
2500
step
Fig. 5. SER residual history, Problem 2.
some points in the SER calculations. The iteration history of TTE was more extreme, often oscillating wildly before nearing the steady-state solution. Fig. 6 shows the TTE calculation with s ¼ 1 on the ne ¼ 81 81 grid. While only the final, steady-state solution is of interest, the residual oscillations can make determining convergence for TTE more difficult and indicate that a conservative steady-state stopping criterion should be used. For some problems, it has been found that performing multiple sub-iterations for Eq. (27) can improve convergence [16,24]. However, we investigated this
846
M.W. Farthing et al. / Advances in Water Resources 26 (2003) 833–849 0
10
-2
10
-4
||R|| /||R || 2 2 0
10
-6
10
-8
10
-10
10
-12
10
0
50
100
150
200
250 step
300
350
400
450
Fig. 6. TTE residual history, Problem 2.
alternative for the SER and TTE updates for the second test problem and found no improvement for either approach.
5. Discussion We began with the goal of identifying effective methods for the steady-state solution of RE. To this end, we performed several numerical experiments for NewtonÕs method with two globalization techniques (NILS and PIH) as well as two versions of Wtc (SER and TTE). Various observations can be made based on the results, which reflect a range of difficulty for the one and two-dimensional variably saturated flow problems examined in this work. The use of a line search or hybrid Picard iteration made NewtonÕs method more robust for Problem 1. PIH required roughly half the iterations of NILS, while both it and NILS took considerably fewer steps than either SER or TTE to reach steady state. The first test problem was intended to represent behavior for mild problems since the boundary conditions were simple and the conductivity was homogeneous. The resulting linear systems were also small and tridiagonal. Under these ideal conditions, NILS had little difficulty, and PIH was able to speed convergence to the steadystate solution. The Wtc methods offered no benefit in this situation. The methodsÕ relative performance changed though, as we moved to more realistic conditions. The controlling factor in their performance for Problem 2 was the difficulty associated with solving the resulting linear systems for each method. In addition to nonlinear boundary conditions, Problem 2 involved a two-
dimensional, block-heterogeneous domain with medium properties corresponding to sand and clay. As a result, the linear systems for Problem 2 were more poorly scaled than in Problem 1. On the first step of the 81 · 81 grid the condition number of the NILS Jacobian was approximately 1019 and after 150 iterations was 109 (condition numbers calculated using the dgbtrf/ dgbcon routines from LAPACK). Poor scaling of the Jacobians manifested itself in a number of ways, including the added work required by BiCGstab to converge with both preconditioners. In general, the difficulty of solving the linear systems can also translate into less accurate search directions as, for example, one is forced to use coarser linear solver tolerances to obtain results given computational limitations. The performance of NILS with the LU-S solver illustrates the impact of the Jacobian scaling on direct methods and the result of inaccurate search directions. The results in Table 7 indicate that NILS was the most efficient approach when LU-S was used for the LU decompositions. However, when we performed the tests with the LU solver from PETSc which does not include pivoting, the solution was less accurate, and NILS either exhausted the allowed nonlinear iterations or stagnated in the line search. The PIH approach did not perform as it was intended for Problem 2 because the initial Picard iterations did not adequately reduce the initial residual. The convergence of PIH could have been improved by relaxing the switching criterion in Eq. (35) or enforcing a maximum number of Picard iterations, so that it would have switched to the Newton updates before exhausting the allowed iterations. However, this would be, in essence, just NILS and would have masked the ineffectiveness of the initial Picard iterations for Problem 2. In contrast, SER was robust but relatively inefficient. It converged for each of the runs and linear solvers. Its iteration histories were more predictable than TTE and showed more consistent reduction in the steady-state residual. However, SER was significantly slower than TTE for most cases. The two most efficient methods for Problem 2 were NILS and TTE. Their relative efficiency was dictated largely by the computational expense associated with the linear solvers. With both methods, the number of steps required for a given spatial grid was similar using the LU-S and BiCGstab solvers. However, the computational effort required changed significantly. Using a robust direct solver, NILS was three time faster than TTE with s ¼ 1 on the 81 · 81 grid, but it required four times as many total linear iterations with BiCGstab-ILU and three times as many using BiCGstabHAS. As a result, it was 9% slower using BiCGstab-ILU and 28% slower using BiCGstab-HAS. The total number of iterations for NILS was fairly consistent on the spatial grids considered. On the other
M.W. Farthing et al. / Advances in Water Resources 26 (2003) 833–849
hand, TTE with s ¼ 1 did not scale as well when the spatial grids were refined. The total number of steps required to converge increased with ne , making TTE more expensive. Varying s based on ne led to more consistent results for TTE and improved its performance for finer grids. NILS was still more efficient with a direct solver on the 161 · 161 grid, but it required more than four times as many linear iterations as TTE with BiCGstab. As a result, TTE was between 2 and 3 times faster than NILS on the finer grids using the iterative linear solvers and converged for BiCGstab-ILU on the 161 · 161 grid when NILS failed. The value of a number of parameters like c ; s ; rmin ; rmax and the linear solver tolerances for BiCGstab effected each of the methods considered to some degree. Other work has suggested that an adaptive strategy for selecting the linear solver tolerances based on the Eisenstat–Walker criteria [13] can reduce overall computational expense of both NILS and Wtc [16]. In Problem 2, however, the behavior of NILS was already quite sensitive to the reduced accuracy of the linear solvers due to ill-conditioning, even at early stages of the iteration. Tuning a robust and effective adaptive strategy for linear solver tolerances would likely be a worthwhile but difficult undertaking. Other parameter tuning issues arise in TTE, in particular the choice of s. Finding a successful configuration for a given method and problem usually requires some effort. While different choices of parameter values may lead to improved performance, the basic behavior of the methods was the same for the range of parameters we investigated. NILS was preferable for the simulations with a robust direct solver like LU-S, but BiCGstab performed better with the Wtc algorithms. TTE was more aggressive than SER and more efficient than NILS when BiCGstab was used with either ILU or HAS preconditioning. As a reference point, we also compared the performance of the Wtc methods to time-accurate integration using a variable-order, variable step-size DAE integrator [21]. As might be expected, the full time integration was the most reliable approach for obtaining steadystate solutions, but the computational expense incurred was from 5 to 25 times greater than that of SER, which was similarly robust. An initial guess closer to the final solution would have improved the performance of each of the methods considered. Since the goal of PIH and the Wtc methods is to approximate a Newton iteration near the steadystate solution, one can expect the advantage of NILS to increase as the initial guess better approximates the steady-state solution. However, we used static equilibrium as a reasonable initial guess, since we are interested in evaluating the methods for problems where good initial guesses for the steady-state solution are difficult to obtain.
847
The range of spatial grids presented for the numerical experiments was relatively coarse and the resulting linear systems were small to moderate in size. Still, the Jacobians were ill-conditioned for Problem 2, which proved to be a significant test for the direct and iterative linear solvers. For larger systems arising from realistic two and three-dimensional problems, we can only expect the difficulties associated with the linear systems to become more severe. Moreover, large simulations will often have to be solved in parallel to obtain results in a reasonable time frame. For the numerical experiments presented here, NILS combined with the LU-S solver was the most efficient approach on each spatial grid. However, preconditioned Krylov methods and sparse direct solvers have their own advantages and disadvantages depending on the problem and architecture [12,17]. A general comparison of their relative merits is beyond the scope of this paper.
6. Conclusions Our numerical experiments for Wtc approaches as well as NewtonÕs method with various globalization techniques lead us to the following conclusions and recommendations: • For problems where use of a robust direct solver is feasible, NewtonÕs method with a line search is the most efficient approach for obtaining steady-state solutions to RE. • Using an initial number of Picard iterations for NewtonÕs method with line search can improve performance in some instances, but it does not necessarily lead to more robust performance for difficult problems. • Inexact Newton methods with standard globalization techniques have particular difficulty when the Jacobian is poorly scaled due to factors such as heterogeneous conductivity fields. • If NewtonÕs method fails or performs poorly for a given steady-state problem, it is worth examining a range of linear solver and line-search parameters before abandoning a Newton approach. • Wtc is a relatively simple approach that can improve the efficiency and robustness of existing steady-state solvers for RE on difficult problems, particularly if iterative linear solvers are used.
Acknowledgements The authors would like to thank Dr. T.-C. Yeh for several helpful suggestions. The research of CEK and CTK was supported in part by National Science
848
M.W. Farthing et al. / Advances in Water Resources 26 (2003) 833–849
Foundation grants DMS-0070641, DMS-0112542, and Army Research Office grant DAAD19-02-1-0391. The efforts of MWF and CTM were supported by grants 5 P42 ES05948 from the National Institute of Environmental Health Sciences and DMS-0112653 from the National Science Foundation. Computational support was provided by the North Carolina Supercomputer Center.
References [1] Anderson E, Bai Z, Bischof C, Demmel J, Dongarra J, Du Croz J, et al. LAPACK UsersÕ Guide. Philadelphia, PA: SIAM; 1992. [2] Ashcraft C, Grimes R. Spooles: An object-oriented sparse matrix library. In: Proceedings of the 1999 SIAM Conference on Parallel Processing for Scientific Computing, March 22–24, San Antonio. SIAM; 1999. [3] Ashcraft C, Grimes R. SPOOLES homepage. Technical Report http://www.netlib.org/linalg/spooles/spooles.2.2.html, 1999. [4] Baca RG, Chung JN, Mulla DJ. Mixed transform finite element method for solving the non-linear equation for flow in variably saturated porous media. Int J Numer Methods Fluids 1997; 24:441–55. [5] Balay S, Gropp WD, McInnes LC, Smith BF. Efficient management of parallelism in object oriented numerical software libraries. In: Arge E, Bruaset AM, Langtangen HP, editors. Modern software tools in scientific computing. Boston, MA: Birkhauser; 1997. p. 163–202. [6] Balay S, Gropp WD, McInnes LC, Smith BF. PETSc homepage. Technical Report http://www.mcs.anl.gov/petsc, Argonne National Laboratory, 1998. [7] Balay S, Gropp WD, McInnes LC, Smith BF. PETSc 2.0 users manual. Technical Report ANL-95/11- Revision 2.0.28, Argonne National Laboratory, 2000. [8] Bergamaschi L, Putti M. Mixed finite elements and Newton-type linearization for the solution of RichardsÕ equation. Int J Numer Methods Eng 1999;45:1025–46. [9] Celia MA, Bouloutas ET, Zarba RL. A general mass-conservative numerical solution for the unsaturated flow equation. Water Resour Res 1990;26(7):1483–96. [10] Chounet LM, Hilhorst D, Jouron C, Kelanemer Y, Nicolas P. Simulation of water flow and heat transfer in soils by means of a mixed finite element method. Adv Water Resour 1999;22(5):445–60. [11] Coffey TS. Temporal and pseudo-temporal numerical integration methods. PhD thesis, North Carolina State University, Raleigh, NC, 2002. [12] Duff IS, van der Vorst HA. Developments and trends in the parallel solution of linear systems. Parallel Comput 1999;25(13– 14):1931–70. [13] Eisenstat SC, Walker HF. Choosing the forcing terms in an inexact Newton method. SIAM J Sci Comput 1996;17:16–32. [14] Farthing MW, Kees CE, Miller CT. Mixed finite element methods and higher-order temporal approximations. Adv Water Resour 2002;25(1):85–101. [15] Forsyth PA, Wu Y-S, Pruess K. Robust numerical methods for saturated–unsaturated flow with dry initial conditions in heterogeneous media. Adv Water Resour 1995;18:25–38. [16] Gropp WD, Keyes DE, McInnes LC, Tidriri MD. Globalized Newton–Krylov–Schwarz algorithms and software for parallel implicit CFD. Int J High Performance Comput Appl 2000; 14(2):102–36. [17] Gupta A. Recent advances in direct methods for solving unsymmetric sparse systems of linear equations. Assoc Comput Machinery, Trans Math Software 2002;28(3):301–24.
[18] Jenkins EW, Kees CE, Kelley CT, Miller CT. An aggregationbased domain decomposition preconditioner for groundwater flow. SIAM J Sci Comput 2001;23(2):430–41. [19] Jones JE, Woodward CS. Newton–Krylov-multigrid solvers for large-scale, highly heterogeneous, variably saturated flow problems. Adv Water Resour 2001;24(7):763–74. [20] Kavetski D, Binning P, Sloan SW. Adaptive time stepping and error control in a mass conservative numerical solution of the mixed form of RichardsÕ equation. Adv Water Resour 2001; 24:595–605. [21] Kees CE, Miller CT. C++ implementations of numerical methods for solving differential-algebraic equations: design and optimization considerations. Assoc Comput Machinery, Trans Math Software 1999;25(4):377–403. [22] Kees CE, Miller CT. Higher order time integration methods for two-phase flow. Adv Water Resour 2002;25(2):159–77. [23] Kelley CT. Iterative methods for linear and nonlinear equations. Philadelphia, PA: SIAM; 1995. [24] Kelley CT, Keyes DE. Convergence analysis of pseudotransient continuation. SIAM J Numer Analysis 1998; 35(2): 508–23. [25] Knoll DA, McHugh PR. Enhanced nonlinear iterative techniques applied to nonequilibrium plasma flow. SIAM J Sci Comput 1998;19(1):291–301. [26] Lehmann F, Ackerer Ph. Comparison of iterative methods for improved solutions of the fluid flow equation in partially saturated porous media. Transport Porous Media 1998;31(3): 275–92. [27] Miller CT, Christakos G, Imhoff PT, McBride JF, Pedit JA, Trangenstein JA. Multiphase flow and transport modeling in heterogeneous porous media: challenges and approaches. Adv Water Resour 1998;21(2):77–120. [28] Miller CT, Williams GA, Kelley CT, Tocci MD. Robust solution of RichardsÕ equation for non-uniform porous media. Water Resour Res 1998;34(10):2599–610. [29] Mualem Y. A new model for predicting the hydraulic conductivity of unsaturated porous media. Water Resour Res 1976;12(3):513– 22. [30] Paniconi C, Putti M. A comparison of Picard and Newton iteration in the numerical solution of multidimensional variably saturated flow problems. Water Resour Res 1994;30(12):3357– 74. [31] Paniconi C, Putti M. Newton-type linearization and line search methods for unsaturated flow models. In: Aral MM, editor. Advances in groundwater pollution control and remediation. NATO ASI Series 2: Environment, vol. 9. Kluwer Academic; 1996. p. 155–72. [32] Parlange J-Y, Hogarth WL, Barry DA, Parlange MB, Haverkamp R, Ross PJ, et al. Analytical approximations to the solution of RichardsÕ equation with applications to infiltration, ponding, and time compression approximation. Adv Water Resour 1999;23: 189–94. [33] Rathfelder K, Abriola LM. Mass conservative numerical solutions of the head-based Richards equation. Water Resour Res 1994;30(9):2579–86. [34] Simunek J, Vogel T, van Genuchten MTh. The SWMS_2D code for simulating water flow and solute transport in two-dimensional variably saturated media. Technical Report Research Report 132, US Salinity Laboratory, Agricultural Research Service, US Department of Agriculture, Riverside, CA, 1994. [35] Tocci MD, Kelley CT, Miller CT. Accurate and economical solution of the pressure-head form of RichardsÕ equation by the method of lines. Adv Water Resour 1997;20(1):1–14. [36] Tocci MD, Kelley CT, Miller CT, Kees CE. Inexact Newton methods and the method of lines for solving RichardsÕ equation in two space dimensions. Comput Geosci 1999; 2(4):291–309.
M.W. Farthing et al. / Advances in Water Resources 26 (2003) 833–849 [37] van Genuchten MTh. A closed-form equation for predicting the hydraulic conductivity of unsaturated soils. Soil Sci Soc Am J 1980;44:892–8. [38] Williams GA, Miller CT, Kelley CT. Transformation approaches for simulating flow in variably saturated porous media. Water Resour Res 2000;36(4):923–34.
849
[39] Wu Y-S, Forsyth PA. On the selection of primary variables in numerical formulation for modeling multiphase flow in porous media. J Contaminant Hydrol 2001;48:277–304. [40] Zaidel J, Russo D. Estimation of finite difference interblock conductivities for simulation of infiltration into initially dry soils. Water Resour Res 1992;28(9):2285–95.