Journal of Computational and Applied Mathematics 311 (2017) 599–617
A BDDC algorithm with adaptive primal constraints for staggered discontinuous Galerkin approximation of elliptic problems with highly oscillating coefficients

Hyea Hyun Kim (Department of Applied Mathematics and Institute of Natural Sciences, Kyung Hee University, Republic of Korea), Eric T. Chung (Department of Mathematics, The Chinese University of Hong Kong, Hong Kong; corresponding author), Chenxiao Xu (Department of Applied Mathematics and Statistics, The State University of New York, Stony Brook, United States)
Article history: Received 13 June 2016; received in revised form 27 August 2016.
Keywords: BDDC algorithm; staggered discontinuous Galerkin method; primal unknowns; generalized eigenvalue problem.
Abstract. A BDDC (Balancing Domain Decomposition by Constraints) algorithm for a staggered discontinuous Galerkin approximation is considered. After applying a domain decomposition method, a global linear system on the subdomain interface unknowns is obtained and solved by the conjugate gradient method combined with a preconditioner. To construct a preconditioner that is robust to coefficient variations, a generalized eigenvalue problem is solved on each subdomain interface and primal unknowns are selected from the eigenvectors using a predetermined tolerance. By the construction of staggered discontinuous Galerkin methods, the degrees of freedom on subdomain interfaces are shared by only two subdomains, and hence the construction of the primal unknowns is simplified. The resulting BDDC algorithm is shown to have a condition number bounded by the predetermined tolerance. A modified algorithm for parameter dependent problems is also introduced, where the primal unknowns are computed only in an offline stage. Numerical results are included to show the performance of the proposed method and to verify the theoretical estimate.
1. Introduction

In this paper, a BDDC (Balancing Domain Decomposition by Constraints) algorithm with adaptive primal unknowns is developed and analyzed for a staggered discontinuous Galerkin (SDG) formulation of second order elliptic problems with highly varying and random coefficients in both two and three dimensions. The SDG method was developed by Chung and Engquist in [1,2] for wave propagation problems and further extended to curl–curl problems, advection–diffusion equations, the Stokes system and the Navier–Stokes equations [3–6]. In two-dimensional SDG modeling, each triangle in a given triangulation is subdivided into three triangles by connecting an interior point to its three vertices. A pair of finite element spaces is then defined on the refined triangulation, with functions in one space required to be continuous across the edges of the original triangulation and functions in the other continuous across the set of edges generated by the subdivision process. The resulting basis functions are thus globally discontinuous, but locally continuous on a union of a few triangles. A similar idea can be generalized to 3D modeling, see [2,3]. The staggered continuity property provides a conservative inter-element flux condition and results in an optimal order of approximation. For elliptic problems, the final linear system arising from the
∗ Corresponding author.
E-mail addresses: [email protected], [email protected] (H.H. Kim), [email protected] (E.T. Chung), [email protected] (C. Xu).
http://dx.doi.org/10.1016/j.cam.2016.08.028
0377-0427/© 2016 Elsevier B.V. All rights reserved.
SDG method is symmetric and positive definite without adding numerical fluxes or penalty parameters, in contrast to standard DG methods [7–11]. However, one disadvantage is that the linear system is larger and less sparse than those from other DG or conforming Galerkin formulations. Therefore, in practical use, a fast solver for the SDG formulation is highly desirable. In the work [12] by Kim, Chung, and Lee, a BDDC algorithm was developed and analyzed for the SDG formulation of elliptic problems with irregular subdomain partitions and a coefficient that is constant in each subdomain. Under certain assumptions it is proved to have an optimal bound C(1 + log(H/h))² of the condition number for the two-dimensional case and a sub-optimal but scalable bound C(H/h) for the three-dimensional case, where the constant C is independent of coefficient jumps and mesh parameters. However, standard BDDC algorithms, and also that in [12], require a strong assumption on the coefficients inside each subdomain. In the works by Pechstein and Scheichl [13,14], a bound on the condition number of FETI (Finite Element Tearing and Interconnecting) algorithms has been derived that depends on the coefficient variations inside the subdomains and across the subdomain interfaces. A similar problem was also considered in [15], where an additional set of primal unknowns is introduced as in [16–18]. As another variant, to address a similar problem for two-level Schwarz methods, adaptive coarse spaces have been introduced and analyzed in [19–21].

In this paper, we develop and analyze a BDDC algorithm with adaptive primal unknowns for the SDG formulation of elliptic problems. We introduce a set of adaptive primal unknowns by considering a generalized eigenvalue problem on each subdomain interface. The generalized eigenvalue problem is formed by the parallel sum of local Schur complement matrices, regardless of the subdomain partition or the coefficient variations across the interfaces, as in [22–25]. We emphasize that, due to the properties of the SDG approximation, unknowns on the subdomain interface are shared by only two subdomains. This is different from conforming Galerkin approximations, where unknowns at subdomain vertices, and unknowns on subdomain edges in 3D, are shared by more than two subdomains. This considerably simplifies the algorithm for the SDG approximation. The generalized eigenvalue problem is considered for each subdomain interface shared by two subdomains and a set of primal constraints is then formed by the eigenvectors with eigenvalues greater than the given tolerance value λTOL. Similar to [23], we apply a change of basis to express the primal constraints explicitly and enforce them as strong continuity. Our BDDC algorithm can then be applied just as the standard BDDC algorithm with the selected primal unknowns. A condition number bound CλTOL can be derived by purely algebraic calculations on matrices and vectors following [26], where C is independent of the coefficient variations and λTOL is the given tolerance. Due to the relation between the SDG method and hybridized discontinuous Galerkin (HDG) methods [27,28], we expect that the preconditioner developed in this paper can also be used for HDG methods.

We perform numerical experiments for various test models. The results show that our BDDC algorithm maintains good computational performance for highly oscillating and discontinuous coefficients.
In addition, we provide a modified algorithm which works for problems whose coefficient is generated by a combination of several components. We first obtain several sets of primal unknowns from some given combinations. A new set of primal unknowns is then formed by solving an eigenvalue problem generated by those pre-computed primal unknowns. The dual unknowns are then generated by an SVD (Singular Value Decomposition) process. Numerical examples show that the revised algorithm saves the cost of solving generalized eigenvalue problems for large-scale problems.

The rest of the paper is organized as follows. In Section 2, the SDG formulation and the local matrices related to a subdomain partition are introduced for a model problem. In Section 3, we describe our algorithm, i.e., how to construct the adaptive primal unknowns and the BDDC preconditioner. In Section 4, an estimate of the condition number is derived. Several numerical examples are then presented in Section 5 to verify the performance of the proposed method. In Section 6, we extend the algorithm to problems with parameter combinations. Conclusions are given in Section 7.

2. Staggered discontinuous Galerkin formulation

2.1. Model problem

We consider an elliptic problem in a bounded domain Ω ⊂ R^d, d = 2 or 3, with a homogeneous Dirichlet boundary condition: find u ∈ H01(Ω) such that
−∇ · (ρ(x)∇u(x)) = f(x)  ∀x ∈ Ω,     (2.1)

where ρ(x) ≥ ρ0 > 0 and f(x) is in L2(Ω). Here L2(Ω) is the space of square integrable functions and H01(Ω) is the space of square integrable functions whose weak derivatives up to first order are in L2(Ω) and whose trace vanishes on the boundary of Ω. The coefficient ρ(x) can be discontinuous and even highly oscillating in Ω, but will later be assumed to be constant in each finite element.
2.2. Staggered discontinuous Galerkin discretization

Following Chung and Engquist [1,2], we first define an initial triangulation Tu. We suppose the domain Ω is triangulated by a set of triangles in 2D (or tetrahedra in 3D) and call edges (or faces) the equivalence classes shared by two finite elements. The notation Fu will denote the set of all such equivalence classes in this triangulation and Fu0 the subset of Fu obtained after excluding those embedded in ∂Ω. For each triangle of Tu in 2D, we select an interior point ν and denote this triangle by S(ν). We then further divide this triangle into three sub-triangles by connecting the point ν to the three vertices. The resulting triangulation is denoted by T. Here the interior point ν is selected so that the triangulation T is shape regular. We denote by Fp the set of all new edges obtained by the subdivision. For 3D, a similar process can be applied by generating four sub-tetrahedra in one finite element. See Fig. 1 for an illustration of these concepts in 2D.

Fig. 1. Illustration of the subdivision process in 2D.

In the SDG formulation, a pair of finite element spaces is introduced,
Sh := {v : v|τ ∈ P^k(τ) ∀τ ∈ T; v is continuous across f ∈ Fu0; v|∂Ω = 0}

and

Vh := {V : V|τ ∈ [P^k(τ)]^d ∀τ ∈ T; V · n is continuous across f ∈ Fp},

where n is a unit normal vector to a given f ∈ Fp. For the construction of the basis functions for the above two spaces, see [29]. We remark that vector fields are typeset in bold face throughout the paper. By using this pair and introducing an additional unknown Uh approximating ρ(x)∇u(x), the SDG formulation is obtained: find (uh, Uh) ∈ Sh × Vh such that
$$\begin{aligned} (U_h, V)_{L^2_\rho(\Omega)} - b_h^*(u_h, V) &= 0 \qquad \forall V \in V_h, \\ b_h(U_h, v) &= (f, v)_{L^2(\Omega)} \qquad \forall v \in S_h, \end{aligned}$$

where

$$(U, V)_{L^2_\rho(\Omega)} := \int_\Omega \frac{1}{\rho(x)}\, U \cdot V \, dx, \qquad (f, v)_{L^2(\Omega)} := \int_\Omega f v \, dx,$$

$$b_h(V, v) := \int_\Omega V \cdot \nabla v \, dx - \sum_{f \in F_p} \int_f V \cdot n\, [v] \, d\sigma,$$

and

$$b_h^*(v, V) := -\int_\Omega v\, \nabla \cdot V \, dx + \sum_{f \in F_u^0} \int_f v\, [V \cdot n] \, d\sigma.$$
Here [v] and [V · n] are defined as

$$[v] := (v_{\tau_1} n_1 + v_{\tau_2} n_2) \cdot n, \qquad [V \cdot n] := V_{\tau_1} \cdot n_1 + V_{\tau_2} \cdot n_2,$$

where τ1 and τ2 are the sub-triangles sharing an edge f and ni is the outward unit normal vector of τi on f. We note that n is the unit normal vector chosen for the face f, i.e., it can be either n1 or n2. We refer to [12] for details of the derivation. In addition, according to Lemma 2.4 of Chung and Engquist [2], we have

$$b_h(V, v) = b_h^*(v, V) \qquad \forall (v, V) \in S_h \times V_h.$$
Let Mh and Bh be the matrices obtained from (U, V)_{L2ρ(Ω)} and bh(U, v), where (U, V) ∈ Vh × Vh and (V, v) ∈ Vh × Sh. Since bh(V, v) = b*h(v, V), the matrix Bh^T corresponds to the bilinear form b*h(v, V) for (v, V) ∈ Sh × Vh. We then obtain the discrete linear system of the variational form:

$$M_h U_h - B_h^T u_h = 0, \qquad B_h U_h = f_h. \qquad (2.2)$$
Since Mh is symmetric, positive definite, and block diagonal, we can eliminate Uh from (2.2) to obtain an equation for uh,

$$B_h M_h^{-1} B_h^T u_h = f_h, \qquad (2.3)$$

which is a symmetric and positive definite linear system. The positive definiteness can be proved by the inf–sup condition of bh(V, v), see Theorem 3.2 of [2].
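As an illustration of this elimination step, the following is a minimal sketch (not from the paper) of forming and solving (2.3) with SciPy, assuming the SDG matrices Mh, Bh and the load vector fh have already been assembled as sparse matrices; the function name is ours.

```python
import scipy.sparse.linalg as spla

def eliminate_flux(Mh, Bh, fh):
    """Form the condensed system (2.3), Bh Mh^{-1} Bh^T uh = fh, and solve it.

    Mh is block diagonal (one block per sub-triangle), so the solve with Mh is
    cheap; a generic sparse factorization is used here only for brevity.
    """
    Mh_inv_BhT = spla.spsolve(Mh.tocsc(), Bh.T.tocsc())   # Mh^{-1} Bh^T (sparse)
    A = (Bh @ Mh_inv_BhT).tocsc()                          # SPD matrix of (2.3)
    uh = spla.spsolve(A, fh)
    Uh = spla.spsolve(Mh.tocsc(), Bh.T @ uh)               # recover the flux from (2.2)
    return uh, Uh, A
```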
Due to the subdivision process and the staggered continuity property, the linear system obtained in (2.3) has a larger size than those from the conforming approximation and standard DG approximations. Domain decomposition methods can then be applied to reduce the computational cost. For this purpose, a BDDC algorithm was developed in the first and second authors' work [12], jointly with Lee, for the case when the coefficient ρ(x) is constant in each subdomain. On the other hand, when the coefficient ρ(x) has large variations across the subdomain interfaces, the BDDC algorithm equipped with the standard coarse problem does not give scalable results. Here we extend the BDDC algorithm to a more general coefficient ρ(x) by enriching the coarse component of the BDDC algorithm.

3. The BDDC algorithm

In the BDDC algorithm, the problem domain Ω is partitioned into non-overlapping subdomains {Ωi}, where each Ωi is a connected union of triangles in Tu. The algebraic system in (2.3) is then reduced to a problem on the subdomain interface unknowns, which is obtained by eliminating the unknowns interior to each subdomain. The interface problem is then solved by the conjugate gradient method with a preconditioner. In the BDDC algorithm, the preconditioner solves independent local problems and one global coarse problem and then updates the residual by a weighted sum of these solutions. The local and global coarse problems are obtained by separating the interface unknowns into dual and primal unknowns. The coarse problem is formed by energy minimizing basis functions constrained at the primal unknowns. We refer to [30,26] for general introductions to BDDC algorithms.

3.1. Domain decomposition and static condensation

Let {Ωi} be a non-overlapping partition of Ω, where each Ωi is a subdomain consisting of a connected union of finite elements in Tu. We call the intersection of two subdomains a subdomain interface. In the SDG formulation, the unknowns on the subdomain interfaces are shared by only two subdomains, in contrast to conforming finite element methods. Recalling the two discrete spaces Sh and Vh, we note that functions in Sh are continuous across the edges in the set Fu0, while functions in Vh can be discontinuous across those edges. We thus see that functions in Vh are decoupled across the subdomain interfaces. In other words, the algebraic equation in (2.2) is decoupled across the subdomain interfaces. For the given partition, we define local finite element spaces as
Vh,i := Vh|Ωi ,
Sh,i := Sh|Ωi
which are the restrictions of Vh and Sh to Ωi. Similarly, two local bilinear forms can be defined for V in Vh,i and v in Sh,i,

$$b_{h,i}(V, v) := \int_{\Omega_i} V \cdot \nabla v \, dx - \sum_{f \in F_p \cap \Omega_i} \int_f V \cdot n\, [v] \, d\sigma,$$

$$b_{h,i}^*(v, V) := -\int_{\Omega_i} v\, \nabla \cdot V \, dx + \sum_{f \in F_u^0 \cap \Omega_i} \int_f v\, [V \cdot n] \, d\sigma + \sum_{f \in F_u^0 \cap \partial\Omega_i} \int_f v\, V \cdot n_i \, d\sigma,$$
where ni is the unit normal vector to f pointing outward of ∂Ωi. We note that bh,i(V, v) = b*h,i(v, V). We then define the matrix Bi corresponding to the bilinear form bh,i(V, v),

⟨Bi V, v⟩ = bh,i(V, v),

and thus obtain

⟨Bi^T v, V⟩ = b*h,i(v, V).

Here ⟨·, ·⟩ denotes the l2-inner product. Similarly, Mi denotes the matrix corresponding to the bilinear form (U, V)_{L2ρ(Ωi)},
⟨Mi U, V⟩ = (U, V)_{L2ρ(Ωi)}. Let Ri be the restriction operator from Sh to Sh,i. The equations in (2.2) are then written as:

$$M_i U_i - B_i^T R_i u_h = 0, \qquad i = 1, \ldots, N,$$

$$\sum_{i=1}^N R_i^T B_i U_i = \sum_{i=1}^N R_i^T f_i,$$
where N is the number of subdomains, Ui is the restriction of Uh to Ωi and fi is given by
⟨fi, v|Ωi⟩ = (f, v)_{L2(Ωi)}. Since Mi is invertible, the linear system can then be simplified and assembled as

$$\sum_{i=1}^N R_i^T A_i R_i u_h = \sum_{i=1}^N R_i^T f_i, \qquad (3.1)$$
where Ai := Bi Mi−1 BTi . Similar to the previous work [12], instead of solving the whole linear system (3.1), we eliminate the subdomain interior unknowns and only consider the unknowns on the subdomain interfaces by applying static condensation. For a given subdomain Ωi , the matrix Ai and vector fi can be partitioned as
$$A_i = \begin{pmatrix} A_{II}^{(i)} & A_{IB}^{(i)} \\ A_{BI}^{(i)} & A_{BB}^{(i)} \end{pmatrix}, \qquad f_i = \begin{pmatrix} f_I^{(i)} \\ f_B^{(i)} \end{pmatrix},$$
where I and B denote the blocks corresponding to unknowns in the interior and on the boundary of Ωi, respectively. After eliminating the interior unknowns, we then obtain the local Schur complement matrix and the corresponding load vector,

$$S_i = A_{BB}^{(i)} - A_{BI}^{(i)} (A_{II}^{(i)})^{-1} A_{IB}^{(i)}, \qquad g_i = f_B^{(i)} - A_{BI}^{(i)} (A_{II}^{(i)})^{-1} f_I^{(i)},$$

and form a linear system as

$$\sum_{i=1}^N \bar{R}_i^T S_i \bar{R}_i u_B = \sum_{i=1}^N \bar{R}_i^T g_i, \qquad (3.2)$$

where uB denotes the unknowns on the subdomain interfaces and R̄i denotes the restriction operator mapping uB to the local interface unknowns of Ωi.
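For concreteness, the static condensation producing Si and gi of (3.2) can be sketched with NumPy as follows; the index arrays `interior` and `boundary` describing the I- and B-blocks, and the function name, are ours for illustration.

```python
import numpy as np

def local_schur(Ai, fi, interior, boundary):
    """Static condensation on one subdomain: returns S_i and g_i of (3.2)."""
    AII = Ai[np.ix_(interior, interior)]
    AIB = Ai[np.ix_(interior, boundary)]
    ABI = Ai[np.ix_(boundary, interior)]
    ABB = Ai[np.ix_(boundary, boundary)]
    AII_inv_AIB = np.linalg.solve(AII, AIB)      # (A_II)^{-1} A_IB
    AII_inv_fI = np.linalg.solve(AII, fi[interior])
    Si = ABB - ABI @ AII_inv_AIB
    gi = fi[boundary] - ABI @ AII_inv_fI
    return Si, gi
```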
We now define three finite element spaces for the interface unknowns. The unknowns in (3.2) are continuous across each subdomain interface and we define the space of such unknowns as Ŵ. We then consider the space W := Π_{i=1}^N W_i, where W_i is the space of local interface unknowns of Ωi. The unknowns in W can be discontinuous, i.e., they are decoupled across each subdomain interface. We also note that Ŵ is a subspace of W. The BDDC algorithm is constructed on an intermediate space W̃ between Ŵ and W, where part of the unknowns is continuous, called the primal unknowns, while the remaining unknowns can be discontinuous, called the dual unknowns. Concretely, W̃ can be written as

W̃ := Ŵ_Π ⊕ W_∆,

where Ŵ_Π is the space of global primal unknowns and W_∆ := Π_{i=1}^N W_∆^(i), with W_∆^(i) the space of dual unknowns on ∂Ωi.

3.2. Adaptive primal unknowns

The main feature of our method lies in constructing a set of adaptive primal unknowns that reflects the local structure of the coefficient. This is achieved by considering a generalized eigenvalue problem posed on each subdomain interface. Given a tolerance λTOL, dominant eigenvectors are taken as a basis for the adaptive primal unknowns. In this section, we give a precise description of this process.

Let Ωj be one of the neighboring subdomains of Ωi, and let F be the interface shared by Ωi and Ωj, i.e., F = ∂Ωi ∩ ∂Ωj. We partition the matrices Si and Sj into
$$S_i = \begin{pmatrix} S_{FF}^{(i)} & S_{FC}^{(i)} \\ S_{CF}^{(i)} & S_{CC}^{(i)} \end{pmatrix} \quad\text{and}\quad S_j = \begin{pmatrix} S_{FF}^{(j)} & S_{FC}^{(j)} \\ S_{CF}^{(j)} & S_{CC}^{(j)} \end{pmatrix},$$
where F and C denote the blocks corresponding to unknowns on the interface F and to the remaining unknowns on the boundaries of Ωi and Ωj, respectively. We then define the Schur complements of Si and Sj with respect to the interface F by

$$\widetilde{S}_F^{(i)} = S_{FF}^{(i)} - S_{FC}^{(i)} (S_{CC}^{(i)})^{-1} S_{CF}^{(i)} \quad\text{and}\quad \widetilde{S}_F^{(j)} = S_{FF}^{(j)} - S_{FC}^{(j)} (S_{CC}^{(j)})^{-1} S_{CF}^{(j)}.$$

To simplify notation, we also define S_F^(i) := S_FF^(i). Using the matrices defined above, we then introduce a well-designed generalized eigenvalue problem to select a set of primal unknowns.

Generalized eigenvalue problems: For symmetric and positive semi-definite matrices A and B, we define

A : B = A(A + B)^+ B,
where (A + B)^+ is a pseudo-inverse of the matrix A + B. This operation is called the parallel sum and it satisfies the following properties, see [31]:

A : B = B : A,   A : B ≤ A,   A : B ≤ B.     (3.3)
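For concreteness, the parallel sum can be formed numerically with the Moore–Penrose pseudo-inverse exactly as in the definition above; the following is a minimal NumPy sketch (the function name is ours).

```python
import numpy as np

def parallel_sum(A, B, rcond=1e-12):
    """Parallel sum A : B = A (A + B)^+ B of two symmetric positive
    semi-definite matrices; (A + B)^+ is the Moore-Penrose pseudo-inverse."""
    return A @ np.linalg.pinv(A + B, rcond=rcond) @ B
```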
We then impose the following generalized eigenvalue problem:

$$\left( (D_F^{(i)})^T S_F^{(j)} D_F^{(i)} + (D_F^{(j)})^T S_F^{(i)} D_F^{(j)} \right) v_F = \lambda \left( \widetilde{S}_F^{(i)} : \widetilde{S}_F^{(j)} \right) v_F. \qquad (3.4)$$

Here D_F^(i) and D_F^(j) are scaling matrices, which will be introduced later in the BDDC preconditioner. The above form of generalized eigenvalue problem was first introduced in [24] and was designed to control the upper bound in the analysis of the BDDC preconditioner. In that work, the idea of using the parallel sum originated from [32]. In general, we require that the scaling matrices satisfy

$$D_F^{(i)} + D_F^{(j)} = I,$$

where I is the identity matrix and F is the interface shared by the two subdomains. In the BDDC preconditioner, the scaling matrices help to resolve the ill-conditioning caused by coefficient discontinuities across the subdomain interfaces, see [33] and the references therein. Numerical experiments presented later show that the choice of scaling matrices is also important in forming an efficient set of primal constraints for problems with highly heterogeneous and random coefficients. For simplicity, we denote

$$\widehat{S}_F := (D_F^{(i)})^T S_F^{(j)} D_F^{(i)} + (D_F^{(j)})^T S_F^{(i)} D_F^{(j)}, \qquad \widetilde{S}_F := \widetilde{S}_F^{(i)} : \widetilde{S}_F^{(j)},$$

so that (3.4) reads Ŝ_F v_F = λ S̃_F v_F.
The eigenvalues λ of (3.4) lie in (0, ∞]. Let λTOL ≥ 1 be a given tolerance. Among all the eigenvectors, we select those whose eigenvalues are greater than λTOL. Specifically, we consider the following set of generalized eigenvectors,

$$V_F = \left\{ v_F^{(1)}, v_F^{(2)}, \ldots, v_F^{(M_F)} \right\},$$

where each vector v_F^(l) corresponds to an eigenvalue λ_l satisfying λ_l > λTOL. Notice that M_F is, in general, different for different interfaces F. In addition, the above vectors are normalized so that

$$\langle \widehat{S}_F v_F^{(l)}, v_F^{(l)} \rangle = 1, \qquad l = 1, 2, \ldots, M_F.$$
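In practice, (3.4) can be solved with a dense symmetric eigensolver once Ŝ_F and S̃_F have been formed. The sketch below is a practical shortcut and not the paper's implementation: since S̃_F may be only positive semi-definite, a small trace-proportional shift is added so that SciPy's eigh can be used, and the selected eigenvectors are then rescaled to satisfy ⟨Ŝ_F v, v⟩ = 1. All names are ours.

```python
import numpy as np
from scipy.linalg import eigh

def adaptive_primal_vectors(S_hat, S_til, lam_tol, shift=1e-10):
    """Solve S_hat v = lam * S_til v and return eigenvectors with lam > lam_tol,
    rescaled so that <S_hat v, v> = 1."""
    n = S_hat.shape[0]
    B = S_til + shift * np.trace(S_til) * np.eye(n)   # regularization of S~_F
    lam, V = eigh(S_hat, B)                           # ascending eigenvalues
    selected = lam > lam_tol
    VF = V[:, selected].copy()
    for k in range(VF.shape[1]):                      # rescale each eigenvector
        e = VF[:, k] @ S_hat @ VF[:, k]
        if e > 0:
            VF[:, k] /= np.sqrt(e)
    return lam[selected], VF
```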
We will utilize the resulting generalized eigenvectors as a basis for the primal unknowns and perform a change of basis on the original linear system. For w = (w1, w2, . . . , wN) in W, we use the notation w_F^(i) to denote the restriction of wi to an interface F. Recall that wi, i = 1, 2, . . . , N, are decoupled across the subdomain interfaces. Using the selected eigenvectors, we enforce the following constraints for w on each interface F,

$$\left( \widehat{S}_F v_F^{(l)} \right)^T \left( w_F^{(i)} - w_F^{(j)} \right) = 0, \qquad \forall l = 1, \ldots, M_F. \qquad (3.5)$$
To make the constraints explicit, we now perform a change of basis for each w_F^(i). We introduce new unknowns

$$\widetilde{w}_F^{(i)} = \left( \widetilde{w}_{F,\Pi}^{(i)}, \widetilde{w}_{F,\Delta}^{(i)} \right)^T$$

and define the change of unknowns by

$$w_F^{(i)} = T_F \widetilde{w}_F^{(i)} = V_F \widetilde{w}_{F,\Pi}^{(i)} + V_F^{\perp} \widetilde{w}_{F,\Delta}^{(i)}, \qquad (3.6)$$

where V_F is the matrix whose columns are the selected eigenvectors and V_F^⊥ is a matrix whose columns are orthogonal to {Ŝ_F v_F^(1), . . . , Ŝ_F v_F^(M_F)}. Multiplying both sides of (3.6) by (Ŝ_F V_F)^T, we see that

$$(\widehat{S}_F V_F)^T w_F^{(i)} = \widetilde{w}_{F,\Pi}^{(i)},$$

and due to (3.5) we obtain
$$\widetilde{w}_{F,\Pi}^{(i)} = \widetilde{w}_{F,\Pi}^{(j)}.$$

We call w̃_{F,Π}^(i) primal unknowns and w̃_{F,∆}^(i) dual unknowns. When w̃_{F,Π}^(i) = 0, the corresponding vector w_F^(i) is orthogonal to the columns of Ŝ_F V_F, see (3.6). From the orthogonality illustrated above, we have the following property

$$\langle \widehat{S}_F V_F \widetilde{w}_{F,\Pi}^{(i)}, V_F^{\perp} \widetilde{w}_{F,\Delta}^{(i)} \rangle = \langle \widetilde{S}_F V_F \widetilde{w}_{F,\Pi}^{(i)}, V_F^{\perp} \widetilde{w}_{F,\Delta}^{(i)} \rangle = 0. \qquad (3.7)$$

We also note that, using λ_l > λTOL for v_F^(l) ∈ V_F,

$$\langle \widehat{S}_{F,\Delta\Delta} \cdot, \cdot \rangle \le \lambda_{TOL} \langle \widetilde{S}_{F,\Delta\Delta} \cdot, \cdot \rangle, \qquad (3.8)$$

where

$$\widehat{S}_{F,\Delta\Delta} := (V_F^{\perp})^T \widehat{S}_F V_F^{\perp} \quad\text{and}\quad \widetilde{S}_{F,\Delta\Delta} := (V_F^{\perp})^T \widetilde{S}_F V_F^{\perp}.$$
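A small NumPy sketch of the change of basis (3.6): V_F^⊥ can be taken as an orthonormal basis of the null space of (Ŝ_F V_F)^T, which makes the orthogonality (3.7) hold by construction (the function name is ours).

```python
import numpy as np
from scipy.linalg import null_space

def change_of_basis(S_hat, VF):
    """Build T_F = [V_F, V_F_perp] as in (3.6): the columns of V_F_perp span the
    orthogonal complement of the columns of S_hat @ V_F."""
    VF_perp = null_space((S_hat @ VF).T)          # (S_hat V_F)^T V_F_perp = 0
    TF = np.hstack([VF, VF_perp])
    # sanity check of (3.7):  np.allclose((S_hat @ VF).T @ VF_perp, 0.0)
    return TF, VF_perp
```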
In the following, we use the same notation Si for the stiffness matrix obtained from the original matrix Si after the change of basis on all interfaces F of Ωi. The original linear system in (3.2) is then assembled using the transformed local problems,

$$\sum_{i=1}^N \bar{R}_i^T S_i \bar{R}_i \widehat{w} = \sum_{i=1}^N \bar{R}_i^T g_i, \qquad (3.9)$$

where ŵ denotes the unknowns of uB after the change of basis and the same notation gi is used for the original vector gi after the change of basis.

3.3. BDDC preconditioner

After performing the change of unknowns, we obtain the local Schur complement matrix Si and the corresponding load vector gi, which can be partitioned into
$$S_i = \begin{pmatrix} S_{\Delta\Delta}^{(i)} & S_{\Delta\Pi}^{(i)} \\ S_{\Pi\Delta}^{(i)} & S_{\Pi\Pi}^{(i)} \end{pmatrix} \quad\text{and}\quad g_i = \begin{pmatrix} g_{\Delta}^{(i)} \\ g_{\Pi}^{(i)} \end{pmatrix},$$
where Π and ∆ denote the blocks corresponding to the primal unknowns and the dual unknowns, respectively.

As mentioned before, the BDDC preconditioner is built on the partially coupled space W̃, where the coarse-level problem is solved globally and the problem on the dual unknowns is solved independently in each subdomain. Thus two procedures are needed in the algorithm: mapping the unknowns from Ŵ to W̃, and inverting the partially coupled matrix in an efficient way. To derive the preconditioner, we first introduce the restriction mapping R̃_i : W̃ → W_i and form the partially coupled matrix

$$\widetilde{S} = \sum_i \widetilde{R}_i^T S_i \widetilde{R}_i,$$

which is coupled at the primal unknowns and decoupled at the remaining dual unknowns. We define a mapping R̃ from Ŵ to W̃ such that

$$\widetilde{R} = \begin{pmatrix} R_{\Delta} \\ R_{\Pi} \end{pmatrix},$$

where R_∆ is a mapping from Ŵ to W_∆ and R_Π is the restriction of Ŵ to Ŵ_Π. Since W_∆ = Π_i W_∆^(i), R_∆ is obtained as

$$R_{\Delta} = \begin{pmatrix} R_{\Delta}^{(1)} \\ R_{\Delta}^{(2)} \\ \vdots \\ R_{\Delta}^{(N)} \end{pmatrix},$$

where R_∆^(i) is the restriction from Ŵ to W_∆^(i). For the BDDC preconditioner, we introduce a scaling matrix of the form

$$D = \sum_i \widetilde{R}_i^T D_i \widetilde{R}_i.$$

Here the matrix D_i is block diagonal,
$$D_i = \operatorname{diag}_{F \subset \partial\Omega_i}(D_F^{(i)}),$$

where each D_F^(i) is a scaling matrix acting on the unknowns of F ⊂ ∂Ωi. The scaling matrices satisfy a partition of unity property,

$$\sum_{j \in N(F)} D_F^{(j)} = I,$$

where N(F) is the set of subdomains sharing an interface F. We note that in the staggered DG formulation any interface F is shared by only two subdomains. There are various weighting methods to define a concrete form of D_F^(i). We refer to [34] and references therein for various scaling methods such as the multiplicity scaling, ρ scaling, stiffness scaling, and deluxe scaling. As in the work [22], the deluxe scaling is considered in our work; it is given by

$$D_F^{(i)} = (S_F^{(i)} + S_F^{(j)})^{-1} S_F^{(i)},$$

where F is shared by Ωi and Ωj.
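A minimal sketch of the deluxe scaling for one interface F shared by two subdomains, assuming the blocks S_F^(i) and S_F^(j) are available as dense arrays (the function name is ours):

```python
import numpy as np

def deluxe_scaling(SF_i, SF_j):
    """Deluxe scaling matrices D_F^(i), D_F^(j) for one interface F; by
    construction D_F^(i) + D_F^(j) = I."""
    M = SF_i + SF_j
    DF_i = np.linalg.solve(M, SF_i)   # (S_F^(i) + S_F^(j))^{-1} S_F^(i)
    DF_j = np.linalg.solve(M, SF_j)   # (S_F^(i) + S_F^(j))^{-1} S_F^(j)
    return DF_i, DF_j
```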
(i)
where F is shared by Ωi and Ωj . We note that DF can be partitioned into the following block form, (i)
DF =
(i)
(i)
DF ,∆∆
DF ,1Π
DF ,Π∆
D F ,Π Π ,
(i)
(3.10)
(i)
where ∆ and Π denote blocks corresponding to the dual and primal unknowns. In our numerical test, we will use both deluxe scaling and multiplicity scaling, and the result shows that the former gives a much better performance by employing a smaller set of adaptive primal unknowns. We note that for the unknowns on an interface F , the multiplicity scaling is defined as 1/N (F ), where N (F ) is the number of subdomains sharing an interface F . In the staggered DG method, N (F ) = 2 for all interfaces F . The BDDC preconditioner is then given by −1 MBDDC = RT DT S −1 D R,
and by using the block Cholesky factorization as in [26],

$$\widetilde{S}^{-1} = \sum_i (\widetilde{R}_{\Delta}^{(i)})^T (S_{\Delta\Delta}^{(i)})^{-1} \widetilde{R}_{\Delta}^{(i)} + \widetilde{R}_0^T F_{\Pi\Pi}^{-1} \widetilde{R}_0,$$

where R̃_∆^(i) is the restriction from W̃ to W_∆^(i) and R̃_0 is the mapping from W̃ to Ŵ_Π. The matrix F_ΠΠ is the coarse problem matrix, which is the assembly of the local matrices F_ΠΠ^(i) at the global primal unknowns, i.e.,

$$F_{\Pi\Pi} = \sum_i (R_{\Pi}^{(i)})^T F_{\Pi\Pi}^{(i)} R_{\Pi}^{(i)},$$

where

$$F_{\Pi\Pi}^{(i)} = S_{\Pi\Pi}^{(i)} - S_{\Pi\Delta}^{(i)} (S_{\Delta\Delta}^{(i)})^{-1} S_{\Delta\Pi}^{(i)},$$

and R_Π^(i) : Ŵ_Π → W_Π^(i) is the restriction of the global primal unknowns to the subdomain primal unknowns of Ωi. The matrix R̃_0 consists of rows which correspond to the energy minimizing coarse basis functions for the given primal unknowns, i.e.,

$$\widetilde{R}_0 = R_{\Pi} - \sum_i (R_{\Pi}^{(i)})^T S_{\Pi\Delta}^{(i)} (S_{\Delta\Delta}^{(i)})^{-1} \widetilde{R}_{\Delta}^{(i)},$$

where R_Π is the restriction from W̃ to Ŵ_Π.
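Putting the pieces together, the interface problem (3.9) is solved by the preconditioned conjugate gradient method with M_BDDC^{-1}. The following SciPy sketch assumes a user-supplied callable apply_MBDDC realizing R̃^T D^T S̃^{-1} D R̃, together with the assembled interface matrix Shat and load gB of (3.9); these names are ours and not from the paper.

```python
import scipy.sparse.linalg as spla

# Shat: assembled interface operator of (3.9); gB: interface load vector;
# apply_MBDDC(r): action of the BDDC preconditioner on a residual r (hypothetical).
n = Shat.shape[0]
M = spla.LinearOperator((n, n), matvec=apply_MBDDC)
# the paper stops the iteration when the relative residual is reduced by 1e-10
w_hat, info = spla.cg(Shat, gB, M=M)
```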
4. Analysis of condition number bound

In this section, we analyze an upper bound of the condition number of our adaptive BDDC algorithm. Following [26, Theorem 1], we need to show the estimate

$$\langle \widetilde{S} (I - E_D) \widetilde{w}, (I - E_D) \widetilde{w} \rangle \le C \langle \widetilde{S} \widetilde{w}, \widetilde{w} \rangle,$$

where E_D is the average operator defined as

$$E_D = \widetilde{R} \widetilde{R}^T D.$$

We recall the definitions of D, R̃, and S̃ in Section 3.3 and also (3.10). We then obtain that, for w̃ ∈ W̃,

$$((I - E_D)\widetilde{w})^{(i)}|_F = \begin{pmatrix} D_{F,\Delta\Delta}^{(j)} (w_{F,\Delta}^{(i)} - w_{F,\Delta}^{(j)}) \\ D_{F,\Pi\Delta}^{(j)} (w_{F,\Delta}^{(i)} - w_{F,\Delta}^{(j)}) \end{pmatrix} \quad\text{and}\quad ((I - E_D)\widetilde{w})^{(j)}|_F = \begin{pmatrix} D_{F,\Delta\Delta}^{(i)} (w_{F,\Delta}^{(j)} - w_{F,\Delta}^{(i)}) \\ D_{F,\Pi\Delta}^{(i)} (w_{F,\Delta}^{(j)} - w_{F,\Delta}^{(i)}) \end{pmatrix}, \qquad (4.1)$$

where F denotes the common interface of two subdomains Ωi and Ωj and w_{F,∆}^(l) denotes the dual unknowns of w̃|_{Ωl} on F for l = i, j.
Lemma 4.1. For the set of adaptive primal constraints obtained with a given scaling and tolerance λTOL, we have the following bound for w̃ ∈ W̃:

$$\langle \widetilde{S} (I - E_D) \widetilde{w}, (I - E_D) \widetilde{w} \rangle \le C \lambda_{TOL} \langle \widetilde{S} \widetilde{w}, \widetilde{w} \rangle,$$

where C is a constant depending on the maximum number of interfaces F per subdomain but independent of the mesh parameters, the number of subdomains, and the coefficient ρ(x).
Proof. From Eq. (4.1), we have

$$\langle \widetilde{S}(I - E_D)\widetilde{w}, (I - E_D)\widetilde{w} \rangle \le C \sum_i \sum_{F \subset \partial\Omega_i} \langle S_F^{(i)} ((I - E_D)\widetilde{w})^{(i)}|_F,\, ((I - E_D)\widetilde{w})^{(i)}|_F \rangle = C \sum_i \sum_{F \subset \partial\Omega_i} \left\langle S_F^{(i)} \begin{pmatrix} D_{F,\Delta\Delta}^{(j)} \\ D_{F,\Pi\Delta}^{(j)} \end{pmatrix} (w_{F,\Delta}^{(i)} - w_{F,\Delta}^{(j)}),\, \begin{pmatrix} D_{F,\Delta\Delta}^{(j)} \\ D_{F,\Pi\Delta}^{(j)} \end{pmatrix} (w_{F,\Delta}^{(i)} - w_{F,\Delta}^{(j)}) \right\rangle, \qquad (4.2)$$

where C is a constant independent of the mesh parameters. We introduce the notation w̄_{F,∆}^(i) for the zero extension of w_{F,∆}^(i) to the full set of unknowns on F and then obtain

$$\begin{pmatrix} D_{F,\Delta\Delta}^{(j)} \\ D_{F,\Pi\Delta}^{(j)} \end{pmatrix} (w_{F,\Delta}^{(i)} - w_{F,\Delta}^{(j)}) = D_F^{(j)} (\bar{w}_{F,\Delta}^{(i)} - \bar{w}_{F,\Delta}^{(j)}).$$

From (4.2) and the above, we have

$$\langle \widetilde{S}(I - E_D)\widetilde{w}, (I - E_D)\widetilde{w} \rangle \le C \sum_i \sum_{F \subset \partial\Omega_i} \langle S_F^{(i)} D_F^{(j)} (\bar{w}_{F,\Delta}^{(i)} - \bar{w}_{F,\Delta}^{(j)}),\, D_F^{(j)} (\bar{w}_{F,\Delta}^{(i)} - \bar{w}_{F,\Delta}^{(j)}) \rangle.$$

In the above, recalling the definitions of Ŝ_F and S̃_F, and then using (3.8), the Cauchy–Schwarz inequality, the orthogonality in (3.7), and (3.3) for S̃_F (= S̃_F^(i) : S̃_F^(j)), we can conclude that

$$\begin{aligned} \langle \widetilde{S}(I - E_D)\widetilde{w}, (I - E_D)\widetilde{w} \rangle &\le C \sum_F \langle \widehat{S}_F (\bar{w}_{F,\Delta}^{(i)} - \bar{w}_{F,\Delta}^{(j)}),\, \bar{w}_{F,\Delta}^{(i)} - \bar{w}_{F,\Delta}^{(j)} \rangle \\ &\le C \lambda_{TOL} \sum_F \langle \widetilde{S}_F (\bar{w}_{F,\Delta}^{(i)} - \bar{w}_{F,\Delta}^{(j)}),\, \bar{w}_{F,\Delta}^{(i)} - \bar{w}_{F,\Delta}^{(j)} \rangle \\ &\le 2 C \lambda_{TOL} \sum_F \left( \langle \widetilde{S}_F \bar{w}_{F,\Delta}^{(i)}, \bar{w}_{F,\Delta}^{(i)} \rangle + \langle \widetilde{S}_F \bar{w}_{F,\Delta}^{(j)}, \bar{w}_{F,\Delta}^{(j)} \rangle \right) = 2 C \lambda_{TOL} \sum_i \sum_{F \subset \partial\Omega_i} \langle \widetilde{S}_F \bar{w}_{F,\Delta}^{(i)}, \bar{w}_{F,\Delta}^{(i)} \rangle \\ &\le 2 C \lambda_{TOL} \sum_i \sum_{F \subset \partial\Omega_i} \langle \widetilde{S}_F w_F^{(i)}, w_F^{(i)} \rangle \le 2 C \lambda_{TOL} \sum_i \sum_{F \subset \partial\Omega_i} \langle \widetilde{S}_F^{(i)} w_F^{(i)}, w_F^{(i)} \rangle \le 2 C C_F \lambda_{TOL} \langle \widetilde{S} \widetilde{w}, \widetilde{w} \rangle, \end{aligned}$$

where C_F is the maximum number of interfaces F per subdomain, w_F^(i) denotes the restriction of w̃ to F ⊂ ∂Ωi, and the following property of Schur complements is used:

$$\langle \widetilde{S}_F^{(i)} w_F^{(i)}, w_F^{(i)} \rangle \le \langle S_i \widetilde{R}_i \widetilde{w}, \widetilde{R}_i \widetilde{w} \rangle.$$
By Lemma 4.1 and following [26, Theorem 1], we obtain:

Theorem 4.2. For a given λTOL, the adaptive BDDC algorithm with a given scaling D satisfies the condition number bound

$$\kappa(M_{BDDC}^{-1} \widehat{S}) \le C \lambda_{TOL},$$

where Ŝ denotes the assembled interface matrix in (3.9) and C is a constant depending on the number of interfaces F per subdomain but independent of the mesh parameters, the number of subdomains, and the coefficient of the model problem in (2.1).

5. Numerical results

We test our algorithm for various examples of ρ(x). In our experiments, we consider a unit square domain Ω and partition it into uniform square subdomains. In the following, Nd refers to the number of subdomains. Each subdomain is then divided into a uniform grid for a given grid size h. We use H to denote the size of each subdomain. We use piecewise linear functions to construct the two finite element spaces Sh and Vh. The CG iteration is stopped when the relative residual norm is reduced by a factor of 10^-10. The condition number is approximated by the Lanczos algorithm.

As the first experiment, we consider a 3 × 3 uniform subdomain partition and ρ(x) = 1 in all Ωi. We choose λTOL = 1 + log(H/h) for a given H/h. We note that for constant ρ(x) the multiplicity scaling and the deluxe scaling coincide. In Table 1, the results are presented for increasing H/h with a fixed Nd. We observe a 1 + log(H/h) growth of the condition numbers. We also observe that for each interface two primal unknowns are selected regardless of H/h.

We then consider model problems with ρ(x) having channel patterns as in Fig. 2. We fix Nd as 3². Since the coefficients are symmetric across the subdomain interfaces, the multiplicity scaling and the deluxe scaling coincide. We pick λTOL = 1 + log(H/h). The results are presented in Table 2 and they are similar to those in Table 1.
Table 1. Performance of the BDDC algorithm for a problem with uniform ρ(x) = 1 with increasing H/h in a fixed subdomain partition Nd = 3²: Iter (number of iterations), λmin (minimum eigenvalue), λmax (maximum eigenvalue), pnum (number of primal unknowns), time (time spent on PCG solver and eigenvalue problems).

H/h   Iter   λmin     λmax     pnum   Time
6     7      1.0002   1.3143   24     0.30
12    7      1.0007   1.5192   24     0.46
18    8      1.0008   1.6559   24     0.52
24    9      1.0009   1.7598   24     0.65
30    9      1.0001   1.8443   24     0.71
Table 2. Performance of the BDDC algorithm for channels (η = 10^3) and with increasing H/h in a fixed subdomain partition Nd = 3²: Iter (number of iterations), λmin (minimum eigenvalue), λmax (maximum eigenvalue), pnum (number of primal unknowns), time (time spent on PCG solver and eigenvalue problems).

Channel   H/h   Iter   λmin     λmax     pnum   Time
One       6     8      1.0008   1.2795   24     0.32
One       12    8      1.0000   1.5067   24     0.48
One       18    9      1.0000   1.6617   24     0.53
One       24    9      1.0000   1.7808   24     0.58
One       30    9      1.0000   1.8793   24     0.71
Three     14    7      1.0006   1.2919   36     0.53
Three     28    8      1.0000   1.5215   36     0.70
Three     42    8      1.0000   1.6772   36     2.96
Fig. 2. ρ(x) for H/h = 14 and Nd = 3² with three channels in each subdomain: blue (ρ(x) = 1) and red (ρ(x) = η). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
We note that the number of primal constraints increases as more channels are introduced, but it is not affected by increasing H/h. In Fig. 3 we plot the condition numbers from Tables 1 and 2. We observe growth similar to 1 + log(H/h).

For the case with three channels and H/h = 28, our method is tested for varying contrast η. The results are presented in Table 3 and they are quite robust to the contrast η. When the contrast becomes significant, one more primal unknown is selected for each interface so that the condition number can be controlled.

To test our method for highly varying and random coefficients, we consider a group of examples with ρ(x) = 10^r, where r is chosen randomly from (−3, 3) on a given uniform grid, see Fig. 4. In Table 4, we run our algorithm for increasing H/h in a fixed subdomain partition with Nd = 4². Instead of picking λTOL = 1 + log(H/h) only, we also consider the cases λTOL = H/h and λTOL = (1 + log(H/h))². We also test the values λTOL = 40 and 60 to compare the performance. For all these experiments the deluxe scaling is employed. From the results, we observe that the percentage of primal unknowns decreases as H/h increases. The value 1 + log(H/h) seems to be a proper choice of λTOL to give a good computational performance. To show that our algorithm is robust with respect to the contrast, we present in Table 5 the same experiment as above but with r chosen randomly from (−4, 4) and with H/h = 24. We see that we obtain a comparable result.

In Table 6, our algorithm is tested for highly varying and random ρ(x) by increasing Nd = N² with a fixed local problem size H/h = 16. Similar to the previous experiment, we also compare results for various values of λTOL.
Fig. 3. Plot of condition numbers with increasing H /h for constant and channel ρ(x).
Fig. 4. The coefficient for random ρ(x) from 10^-3 to 10^3 for H/h = 18 and Nd = 4².

Table 3. Performance of the BDDC algorithm depending on the contrast η for a model with H/h = 28, Nd = 3² and with three channels of contrast η inside each subdomain: Iter (number of iterations), λmin (minimum eigenvalue), λmax (maximum eigenvalue), pnum (total number of primal unknowns), time (time spent on PCG solver and eigenvalue problems).

η      Iter   λmin     λmax     pnum   Time
1      9      1.0010   1.8178   24     0.65
10^1   10     1.0006   1.6333   24     0.69
10^2   10     1.0000   2.0267   24     0.72
10^3   8      1.0000   1.5185   36     0.62
10^4   8      1.0000   1.5212   36     0.64
10^5   8      1.0000   1.5215   36     0.81
10^6   8      1.0000   1.5212   36     0.62
Since λTOL does not change for different values of Nd, we expect that the condition number does not change much when the computing scale is increased. From the results, we find that the condition number is indeed quite robust with respect to the computing scale Nd. We also find that as λTOL increases, the proportion of primal unknowns does not change much. Thus a smaller λTOL, which requires fewer iterations, gives less computing time. In Table 7, we present the results of the same experiment using r ∈ (−4, 4) and N = 16. Comparing with the results for r ∈ (−3, 3), we see that our algorithm is robust with respect to the contrast.
Table 4. Performance of the BDDC algorithm for a problem with random ρ(x) in (10^-3, 10^3) by increasing H/h in a fixed subdomain partition Nd = 4² for various λTOL: gdof (total number of dofs), Iter (number of iterations), κ (condition number), pnum (total number of primal unknowns), ppnum (proportion of pnum to gdof), time (time spent on PCG solver and eigenvalue problems).

H/h   gdof   λTOL                        Iter   κ         pnum   ppnum    Time
6     288    1 + log(H/h) (2.79)         11     1.8678    40     13.88%   0.79
             (1 + log(H/h))^2 (7.79)     14     2.7701    29     10.06%   0.48
             H/h (6)                     14     2.7701    29     10.06%   0.45
             40                          17     5.0212    25     8.68%    0.55
             60                          17     5.0212    25     8.68%    0.58
12    576    1 + log(H/h) (3.48)         13     3.3292    43     7.46%    1.37
             (1 + log(H/h))^2 (12.14)    16     3.7663    29     5.03%    1.16
             H/h (12)                    16     3.7663    29     5.03%    1.15
             40                          18     10.5374   27     4.68%    1.26
             60                          18     10.5374   27     4.68%    1.26
18    864    1 + log(H/h) (3.89)         15     2.7993    42     4.86%    2.79
             (1 + log(H/h))^2 (15.13)    20     3.1103    32     3.70%    3.11
             H/h (18)                    20     3.2290    31     3.58%    3.22
             40                          23     3.6458    26     3.00%    3.64
             60                          23     3.6692    26     3.00%    3.66
24    1152   1 + log(H/h) (4.17)         14     2.5399    41     3.55%    4.19
             (1 + log(H/h))^2 (17.45)    17     6.1367    32     2.77%    4.51
             H/h (24)                    17     6.1369    31     2.69%    4.50
             40                          19     10.3308   30     2.60%    4.95
             60                          19     10.3869   29     2.51%    4.97
30    1440   1 + log(H/h) (4.40)         13     5.6310    51     3.54%    5.63
             (1 + log(H/h))^2 (19.37)    18     7.1010    36     2.50%    7.13
             H/h (30)                    19     7.4411    35     2.43%    7.14
             40                          20     7.9184    33     2.23%    7.15
             60                          22     8.5202    32     2.22%    8.52
Table 5. Performance of the BDDC algorithm for a problem with random ρ(x) in (10^-4, 10^4) using a fixed subdomain partition Nd = 4² for various λTOL: gdof (total number of dofs), Iter (number of iterations), κ (condition number), pnum (total number of primal unknowns), ppnum (proportion of pnum to gdof), time (time spent on PCG solver and eigenvalue problems).

H/h   gdof   λTOL                        Iter   κ         pnum   ppnum   Time
24    1152   1 + log(H/h) (4.17)         14     4.0321    50     4.34%   3.40
             (1 + log(H/h))^2 (17.45)    25     10.4994   32     2.77%   4.06
             H/h (24)                    28     11.3371   29     2.51%   4.33
             40                          28     12.3034   28     2.43%   4.43
             60                          28     12.3034   28     2.43%   4.36
In Table 8, we compare the performance of the BDDC algorithm for the multiplicity scaling and the deluxe scaling by increasing Nd = N² with a fixed H/h = 16. The deluxe scaling gives a clearly superior result, using only about 10% of the number of primal unknowns that the multiplicity scaling needs to achieve a similar condition number.

As the last set of examples, we test our algorithm for fracture data as illustrated in Fig. 5 for the parameters p = 1 and p = 10^6. We also consider the coefficient pattern for p = 10^3, which is quite similar to that of p = 10^6 except for the contrast. In Table 9, two different computing scales (H/h, Nd) are tested for the model with p = 10^3. In Table 10, several tests are presented for various p. In both tables, we find that λTOL = 1 + log(H/h) provides the best performance in terms of computing time.

From all the experiments above, we find that the condition numbers are determined by the user-defined tolerance λTOL, as confirmed by our analysis. Compared to the multiplicity scaling, the generalized eigenvalue problems with the deluxe scaling produce a set of primal unknowns that is more robust to the coefficient variations. In addition, the value 1 + log(H/h) seems to be a proper choice for λTOL in the test examples.

6. Application to parameter dependent problems

The adaptive BDDC algorithm introduced above works efficiently for highly oscillating coefficients. However, the computational cost of solving the generalized eigenvalue problem in (3.4) increases as the size of the local problems increases, so solving it for each individual parameter is not desirable. In this section, we develop a new algorithm which applies well to a specific class of parameter dependent elliptic problems introduced below. The idea is motivated by [35,36]. Some complicated coefficients can be regarded as a combination of several coefficient components. In our proposed algorithm, we construct a new set of primal unknowns which involves the information of those components.
Table 6. Performance of the BDDC algorithm depending on λTOL for highly varying and random ρ(x) in (10^-3, 10^3) by increasing Nd = N² with a fixed H/h = 16: gdof (total number of dofs), Iter (number of iterations), κ (condition number), pnum (total number of primal unknowns), ppnum (proportion of pnum to gdof), time (time spent on PCG solver and eigenvalue problems).

N    gdof     λTOL                        Iter   κ         pnum   ppnum   Time
4    768      1 + log(H/h) (3.77)         15     2.1871    42     5.46%   2.18
              (1 + log(H/h))^2 (14.23)    18     2.1851    29     3.76%   2.18
              H/h (16)                    19     2.3653    28     3.64%   2.36
              40                          19     2.2285    25     3.25%   2.28
              60                          19     2.2091    25     3.25%   2.20
8    3 584    1 + log(H/h) (3.77)         16     3.6668    202    5.63%   9.85
              (1 + log(H/h))^2 (14.23)    25     7.5411    143    3.99%   14.92
              H/h (16)                    26     9.0213    141    3.93%   14.50
              40                          32     22.2070   127    3.54%   17.58
              60                          37     24.9573   124    3.45%   20.12
16   15 360   1 + log(H/h) (3.77)         16     2.9064    887    5.77%   76.11
              (1 + log(H/h))^2 (14.23)    31     11.0172   636    4.13%   114.86
              H/h (16)                    33     18.0581   616    4.01%   116.93
              40                          47     25.3169   542    3.53%   248.05
              60                          54     30.0954   528    3.43%   318.58
24   35 328   1 + log(H/h) (3.77)         17     4.2873    2017   5.71%   601.52
              (1 + log(H/h))^2 (14.23)    34     16.0957   1453   4.11%   684.69
              H/h (16)                    37     16.0967   1411   3.99%   974.62
              40                          52     30.6304   1265   3.45%   982.17
              60                          62     46.1534   1221   3.45%   1097.37
Table 7. Performance of the BDDC algorithm depending on λTOL for highly varying and random ρ(x) in (10^-4, 10^4) using N = 16 with a fixed H/h = 16: gdof (total number of dofs), Iter (number of iterations), κ (condition number), pnum (total number of primal unknowns), ppnum (proportion of pnum to gdof), time (time spent on PCG solver and eigenvalue problems).

N    gdof     λTOL                        Iter   κ         pnum   ppnum   Time
16   15 360   1 + log(H/h) (3.77)         17     3.5229    954    6.21%   74.53
              (1 + log(H/h))^2 (14.23)    34     12.8785   720    4.68%   95.25
              H/h (16)                    34     12.8786   706    4.59%   93.92
              40                          56     46.2711   603    3.92%   124.91
              60                          59     46.2817   579    3.76%   129.58
Fig. 5. The coefficients ρ(x) of the fracture model with size 460 × 460 and with model parameter p = 1 (left), p = 10^6 (right).
Once the set of primal unknowns is given, the set of dual unknowns is then generated by an SVD (singular value decomposition) process to obtain the orthogonality of the transformed local problems, see (3.7). Our numerical examples show that the proposed BDDC algorithm can save computational cost while keeping its performance as good as the one designed in Section 3, especially for large-scale problems.
Table 8. Comparison of performance of the BDDC algorithm for multiplicity scaling and deluxe scaling depending on λTOL for highly varying and random ρ(x) in (10^-3, 10^3) by increasing Nd = N² with a fixed H/h = 16: ppnum (proportion of primal unknowns to the total number of dofs), Sca (scaling matrix, M: multiplicity, D: deluxe), Iter (number of iterations), κ (condition number), time (time spent on PCG solver and eigenvalue problems).

N    λTOL                Sca   ppnum    Iter   κ         Time
4    1 + log(H/h)        D     5.85%    13     2.2089    1.88
                         M     68.09%   19     3.6542    1.97
     (1 + log(H/h))^2    D     4.03%    17     4.7129    1.96
                         M     50.52%   39     14.1978   2.18
     H/h                 D     3.90%    18     9.9927    2.00
                         M     48.95%   42     4.7452    2.26
     40                  D     3.51%    17     15.9124   1.91
                         M     39.84%   65     4.7508    2.68
     60                  D     3.38%    18     39.5262   2.02
                         M     34.76%   79     59.9930   2.84
8    1 + log(H/h)        D     5.77%    16     3.0945    10.06
                         M     64.56%   19     3.7635    89.71
     (1 + log(H/h))^2    D     4.07%    27     9.9908    15.29
                         M     47.32%   39     14.1215   87.59
     H/h                 D     4.01%    27     9.9927    15.70
                         M     45.75%   42     15.8960   86.95
     40                  D     3.54%    34     25.5546   18.71
                         M     36.10%   66     46.7166   83.48
     60                  D     3.43%    38     40.3541   20.97
                         M     32.64%   81     59.7653   82.01
16   1 + log(H/h)        D     5.77%    16     2.9064    79.26
                         M     64.34%   19     4.0334    3835.25
     (1 + log(H/h))^2    D     4.13%    31     11.0172   106.26
                         M     46.92%   39     17.1507   3833.58
     H/h                 D     4.01%    33     18.0581   108.29
                         M     45.61%   42     17.4502   3892.67
     40                  D     3.53%    48     25.3169   147.76
                         M     36.04%   66     46.0106   3532.49
     60                  D     3.43%    54     40.0954   161.87
                         M     32.57%   81     65.1789   3667.67
Table 9. Performance of the BDDC algorithm for a fracture data with p = 10^3: gdof (total number of dofs), Iter (number of iterations), κ (condition number), ppnum (proportion of primal unknowns to gdof), time (time spent on PCG solver and eigenvalue problems).

H/h   Nd    gdof     λTOL                        Iter   κ         ppnum   Time
46    10²   16 560   1 + log(H/h) (4.82)         16     3.3884    2.08%   144.50
                     (1 + log(H/h))^2 (23.13)    35     12.0648   1.08%   294.82
23    20²   34 960   1 + log(H/h) (4.13)         16     2.9921    4.06%   431.00
                     (1 + log(H/h))^2 (17.10)    33     11.1473   2.17%   436.94
Table 10. Performance of the BDDC algorithm for a fracture data with varying p and with H/h = 23 and Nd = 20²: Iter (number of iterations), κ (condition number), ppnum (proportion of primal unknowns to the total number of dofs), time (time spent on PCG solver and eigenvalue problems).

p      λTOL                        Iter   κ         ppnum   Time
1      1 + log(H/h) (4.13)         12     2.0385    4.34%   435.88
       (1 + log(H/h))^2 (17.10)    23     6.3336    2.17%   619.24
10^3   1 + log(H/h) (4.13)         16     2.9921    4.06%   431.00
       (1 + log(H/h))^2 (17.10)    33     11.1473   2.17%   436.94
10^6   1 + log(H/h) (4.13)         19     2.5447    4.07%   512.34
       (1 + log(H/h))^2 (17.10)    36     10.8634   2.15%   532.21
6.1. Model problem

We consider a series of problems: find u ∈ H01(Ω) such that

−∇ · (ρµ(x)∇u(x)) = f(x)  ∀x ∈ Ω,     (6.1)

where ρµ(x) depends on a vector µ = (µ1, µ2, . . . , µm) and a collection of coefficient components ρ1(x), ρ2(x), . . . , ρm(x). In more detail, the constant vector µ = (µ1, µ2, . . . , µm) satisfies

0 < µi < 1 for i = 1, 2, . . . , m,     (6.2)

$$\sum_{i=1}^m \mu_i = 1, \qquad (6.3)$$

and there exists an operator g such that

ρµ(x) = g(µ1ρ1(x), µ2ρ2(x), . . . , µmρm(x)).

We remark that {ρi(x), i = 1, . . . , m} can be regarded as a decomposition of ρµ(x). Thus each ρi(x) can be much more regular even though ρµ(x) is highly oscillating. We call the ρi(x) parameter components and ρµ(x) the parameter combination.

6.2. Primal unknowns

In this subsection, we introduce a new way of designing the primal unknown basis. Our intuition is that the set of primal unknowns should meet three requirements. First, it should contain enough information about all parameter components, and it should depend only on the components, not on the parameter combination, so that it can be pre-computed and applied to problems with different µ. Second, the primal set should be easy to obtain; in particular, it is desirable to avoid solving a generalized eigenvalue problem for each new parameter. Last, the number of primal unknowns needs to be kept relatively small.

We first pick several constant vectors ν1, ν2, . . . , νp, each of which meets the two requirements (6.2) and (6.3) imposed on µ. We then obtain a set of coefficients ρν1(x), ρν2(x), . . . , ρνp(x) such that
$$\rho_{\nu_i}(x) = g(\nu_i^{(1)} \rho_1(x), \nu_i^{(2)} \rho_2(x), \ldots, \nu_i^{(m)} \rho_m(x)), \qquad i = 1, 2, \ldots, p,$$

where νi = (νi^(1), νi^(2), . . . , νi^(m)). For each ρνi(x), we consider the elliptic problem (2.1). We follow the same algorithm up to forming the generalized eigenvalue problem (GEP) (3.4) on each interface F. We then solve the GEP and obtain the matrix ViF := VF,Π whose column vectors are the eigenvectors with eigenvalues λ ≥ λTOL. We apply the above procedure for ρν1(x), ρν2(x), . . . , ρνp(x) and obtain the matrices V1F, V2F, . . . , VpF for F. We let
We then obtain a set of eigenvalues λ of UFT UF , such that λ1 ≥ λ2 ≥ λ3 ≥ · · · λk = λk+1 = · · · λm = 0. We can pick (1)
(2)
(q)
a threshold value λs and collect eigenvectors {wF , wF , . . . , wF } with their corresponding eigenvalues λ > λs after removing the linearly dependent components. We then define a new set of basis for primal unknowns as (1)
(2)
(q)
VΠF = {UF wF , UF wF , . . . , UF wF }. In general, q is smaller than the number of unknowns on F . We remark that in practice, we pick constant vectors {v1 , . . . , vp } and follow the procedure we have just introduced to construct a basis of primal unknowns. The basis can be stored first and utilized whenever needed. If the basis contains enough information of all the components, this primal set can work well for any model problem (6.1) with ρµ (x). In addition, we can control λs to select a proper number of primal unknowns. Our numerical results show that a comparable but not large set of primal unknowns is enough for a good performance. 6.3. New BDDC Algorithm In this subsection, we present our revised BDDC algorithm. For any constant vector µ satisfying the requirements in (6.2) and (6.3), we formulate the problem in (6.1) with ρ(x) = ρµ (x) and follow the procedure as our original algorithm until (i)
obtaining the Schur compliment matrices SF on interfaces F of subdomain Ωi . Since we have already had the set VΠF of primal unknowns, we do not need to form generalized eigenvalue problems (3.4). Using the given set VΠF , we will apply a change of basis so that the orthogonal property in (3.7) holds for SF . The orthogonality helps to control the energy of dual unknowns by the energy of full unknowns with both dual and primal parts, i.e.,
⟨SF ,∆∆ wF ,∆ , wF ,∆ ⟩ ≤ ⟨SF wF , wF ⟩, where wF ,∆ denotes the dual part of wF . This can be achieved by applying the SVD technique. We let M = (SF VΠF )T and factor M into M = UΣV T , where the dimension of V is NF , the dimension of U is MF , and MF < NF .
(6.4)
614
H.H. Kim et al. / Journal of Computational and Applied Mathematics 311 (2017) 599–617
We can further write the decomposition into M = U Σ1
Σ2
V1
V2
T
,
where Σ1 is a diagonal matrix and Σ2 is a zero-matrix. The column vectors of matrix V2 can then be used as a basis of the dual unknowns. We denote V∆F := V2 and let the change of basis be T = [VΠF V∆F ]. The orthogonal condition can be shown as follows. We consider the transformed matrix T T SF T = [VΠF V∆F ]T SF [VΠF V∆F ]
=
(VΠF )T SF VΠF
(VΠF )T SF V∆F
(V∆F )T SF VΠF
(V∆F )T SF V∆F
and consider its off-diagonal block
(VΠF )T SF V∆F =MV∆F =U Σ 1
Σ2
V1
V2
=U Σ 1
Σ2
V1T V2 V2T V2
T
V2
.
By the theory of SVD, V1T V2 = 0 and V2T V2 is an identity matrix. Since Σ2 is the zero matrix, we obtain that the off-diagonal block vanishes, i.e., (VΠF )T SF V∆F = 0. The remaining procedure follows our previous algorithm. After the change of basis, we construct the BDDC preconditioner using the primal and dual unknowns and solve the global linear system by PCG. Compared to the original algorithm, our new method is computationally more efficient for a series of problems once coefficient basis {νi } is properly selected and the corresponding set of primal unknowns is obtained. The SVD procedure in (6.4) requires much less computational cost than solving the generalized eigenvalue problem. 6.4. Numerical tests In this section, we apply our new algorithm to two model problems to show its performance. Similar to Section 5, we consider a unit square domain Ω and partition it into uniform square subdomains. Each subdomain is then divided into uniform grids for a given mesh size h. We also use H to denote the size of each subdomain. In PCG iteration, the stop condition is when the relative residual norm is less than 10−10 . In the first set of experiments, we define ρµ (x) = eµρ1 (x)+(1−µ)ρ2 (x) , where ρ1 (x) and ρ2 (x) are used as horizontal channel and vertical channel coefficients. The coefficient components are illustrated in Fig. 6 and the coefficient combination is illustrated in Fig. 7. We randomly select three vectors νi to construct the set of primal unknowns. We then select µ = 0.3 and solve the model problem with the original algorithm and with the new algorithm to compare their performance. In this case, we let λs = 1 and λTOL = 1 + log(H /h) so that a proper number of primal unknowns is selected. The result is presented in Table 11. From the result, we can see that the new algorithm has a better performance in terms of computing time. We now consider 512 × 512 Perm Model. We define ρµ (x) = µρ1 (x) + (1 − µ)ρ2 (x), where ρ1 (x) and ρ2 (x) are used as horizontal and vertical Perm coefficients. The coefficient components are illustrated in Fig. 9 and the coefficient combination is illustrated in Fig. 8. We select six vectors of νi to construct a primal unknown basis. We then consider a model with µ = 0.6 and solve the model problem by the original algorithm and new algorithm to compare their performance. In this case, we let λs = 10−3 and λTOL = 1 + log(H /h). The result is presented in Table 12. We can observe that the new algorithm has a better performance than the original method. 7. Conclusion In this paper, a BDDC algorithm with adaptive primal unknowns is developed and analyzed for a SDG approximation of elliptic problems with highly oscillating coefficients. To determine a set of primal unknowns, a generalized eigenvalue problem is solved for each subdomain interface and dominant eigenvectors, defined by a user-defined tolerance, are used as primal unknowns. Due to the construction of the SDG method, the degrees of freedom on subdomain interfaces are shared by only two subdomains. This feature simplifies the construction of the primal unknowns. Moreover, a bound for the condition number is proved, which depends on the user-defined tolerance but is independent of the coefficient. We also propose an algorithm for parameter dependent elliptic problems, in which the primal unknowns are determined in an offline stage and are not required to form for each individual parameter. 
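The SVD step (6.4) can be sketched as follows with NumPy: the trailing right singular vectors of M = (Ŝ_F V_ΠF)^T span its null space and therefore give a dual basis satisfying (V_ΠF)^T Ŝ_F V_∆F = 0 (names are ours).

```python
import numpy as np

def dual_basis_by_svd(S_F, V_PiF):
    """Given the primal basis V_PiF, build a dual basis V_DeltaF such that
    (V_PiF)^T S_F V_DeltaF = 0, via the SVD of M = (S_F V_PiF)^T as in (6.4)."""
    M = (S_F @ V_PiF).T                  # M_F x N_F matrix
    U, sigma, Vt = np.linalg.svd(M)      # full SVD: Vt is N_F x N_F
    r = M.shape[0]                       # M_F (< N_F)
    V_DeltaF = Vt[r:, :].T               # columns of V2 span the null space of M
    T = np.hstack([V_PiF, V_DeltaF])     # change of basis T = [V_PiF, V_DeltaF]
    return T, V_DeltaF
```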
6.4. Numerical tests

In this section, we apply our new algorithm to two model problems to show its performance. Similar to Section 5, we consider a unit square domain Ω and partition it into uniform square subdomains. Each subdomain is then divided into a uniform grid for a given mesh size h. We also use H to denote the size of each subdomain. The PCG iteration is stopped when the relative residual norm is less than 10^-10.

In the first set of experiments, we define ρµ(x) = e^{µρ1(x)+(1−µ)ρ2(x)}, where ρ1(x) and ρ2(x) are used as horizontal channel and vertical channel coefficients. The coefficient components are illustrated in Fig. 6 and the coefficient combination is illustrated in Fig. 7. We randomly select three vectors νi to construct the set of primal unknowns. We then select µ = 0.3 and solve the model problem with the original algorithm and with the new algorithm to compare their performance. In this case, we let λs = 1 and λTOL = 1 + log(H/h) so that a proper number of primal unknowns is selected. The results are presented in Table 11. From the results, we can see that the new algorithm has a better performance in terms of computing time.

We now consider the 512 × 512 Perm model. We define ρµ(x) = µρ1(x) + (1 − µ)ρ2(x), where ρ1(x) and ρ2(x) are used as horizontal and vertical Perm coefficients. The coefficient components are illustrated in Fig. 9 and the coefficient combination is illustrated in Fig. 8. We select six vectors νi to construct a primal unknown basis. We then consider a model with µ = 0.6 and solve the model problem by the original algorithm and the new algorithm to compare their performance. In this case, we let λs = 10^-3 and λTOL = 1 + log(H/h). The results are presented in Table 12. We observe that the new algorithm has a better performance than the original method.

7. Conclusion

In this paper, a BDDC algorithm with adaptive primal unknowns is developed and analyzed for an SDG approximation of elliptic problems with highly oscillating coefficients. To determine a set of primal unknowns, a generalized eigenvalue problem is solved for each subdomain interface and dominant eigenvectors, defined by a user-defined tolerance, are used as primal unknowns. Due to the construction of the SDG method, the degrees of freedom on subdomain interfaces are shared by only two subdomains. This feature simplifies the construction of the primal unknowns. Moreover, a bound for the condition number is proved, which depends on the user-defined tolerance but is independent of the coefficient. We also propose an algorithm for parameter dependent elliptic problems, in which the primal unknowns are determined in an offline stage and need not be formed for each individual parameter. Numerical results are presented to verify the theoretical statements and show the performance of the preconditioner.
Fig. 6. The original parameters ρ1(x) and ρ2(x) with H/h = 42 and N = 3² for channel models.
Fig. 7. The combined parameter of ρ1 (x) and ρ2 (x) with µ = 0.3.
Fig. 8. The combined parameter ρ(x) = 0.6ρ1 (x) + 0.4ρ2 (x) for 512 × 512 modified Perm model.
Table 11. Comparison of performance of the two BDDC algorithms for a model with channels as components (η = 10^6) with Nd = 3² and increasing H/h: Method (O: Original, N: New), Iter (number of iterations), κ (condition number), pnum (number of primal unknowns), time (time spent on PCG solver and eigenvalue problems for the original method, and on PCG solver and SVD for the new method).

Channel   H/h   Method   Iter   κ        pnum   Time
One       6     O        7      1.2577   24     0.22
                N        8      1.2566   24     0.13
One       12    O        8      1.4816   24     0.34
                N        8      1.4532   24     0.14
One       18    O        9      1.6058   24     0.39
                N        9      1.6025   24     0.22
One       24    O        9      1.7102   24     0.47
                N        9      1.9058   24     0.31
One       30    O        10     1.8035   24     0.52
                N        10     1.7988   24     0.39
Three     14    O        8      1.3748   24     0.44
                N        8      1.3730   24     0.29
Three     28    O        9      1.5841   24     0.79
                N        9      1.5868   24     0.40
Three     42    O        10     1.7242   24     1.01
                N        10     1.7206   24     0.78
Table 12. Comparison of performance of the two BDDC algorithms for the Perm model: gdof (total number of dofs), Method (O: Original, N: New), Iter (number of iterations), κ (condition number), ppnum (proportion of primal unknowns to gdof), time (time spent on PCG solver and eigenvalue problems for the original method and on PCG solver and SVD for the new method).

N    H/h   gdof     Method   Iter   κ        ppnum   Time
4    128   6 144    O        14     6.3261   0.78%   139.73
                    N        11     2.0704   1.32%   89.63
8    64    14 336   O        17     4.3932   1.41%   154.45
                    N        12     3.5976   2.53%   110.33
16   32    30 720   O        18     3.9664   2.93%   387.59
                    N        16     3.2802   4.39%   281.07
Fig. 9. The original parameters ρ1 (x) and ρ2 (x) for 512 × 512 modified Perm model.
Acknowledgments

The research of Hyea Hyun Kim is supported by the National Research Foundation of Korea (NRF) grants NRF2014R1A1A3052427 and NRF20151009350. The research of Eric Chung is partially supported by the Hong Kong RGC General Research Fund (Project: 400813), the CUHK Direct Grant for Research 2015/16 and the CUHK Research Incentive Fund 2015/16.
References

[1] E.T. Chung, B. Engquist, Optimal discontinuous Galerkin methods for wave propagation, SIAM J. Numer. Anal. 44 (2006) 2131–2158.
[2] E.T. Chung, B. Engquist, Optimal discontinuous Galerkin methods for the acoustic wave equation in higher dimensions, SIAM J. Numer. Anal. 47 (2009) 3820–3848.
[3] E. Chung, C.S. Lee, A staggered discontinuous Galerkin method for the convection–diffusion equation, J. Numer. Math. 20 (2012) 1–32.
[4] E.T. Chung, C.S. Lee, A staggered discontinuous Galerkin method for the curl–curl operator, IMA J. Numer. Anal. (2011) drr039.
[5] H.H. Kim, E.T. Chung, C.S. Lee, A staggered discontinuous Galerkin method for the Stokes system, SIAM J. Numer. Anal. 51 (2013) 3327–3350.
[6] S.W. Cheung, E. Chung, H.H. Kim, Y. Qian, Staggered discontinuous Galerkin methods for the incompressible Navier–Stokes equations, J. Comput. Phys. 302 (2015) 251–266.
[7] D.N. Arnold, F. Brezzi, B. Cockburn, L.D. Marini, Unified analysis of discontinuous Galerkin methods for elliptic problems, SIAM J. Numer. Anal. 39 (2002) 1749–1779.
[8] F. Brezzi, G. Manzini, D. Marini, P. Pietra, A. Russo, Discontinuous Galerkin approximations for elliptic problems, Numer. Methods Partial Differential Equations 16 (2000) 365–378.
[9] B. Cockburn, C.-W. Shu, The local discontinuous Galerkin method for time-dependent convection–diffusion systems, SIAM J. Numer. Anal. 35 (1998) 2440–2463.
[10] J. Douglas, T. Dupont, Interior penalty procedures for elliptic and parabolic Galerkin methods, in: Computing Methods in Applied Sciences, Springer, 1976, pp. 207–216.
[11] B. Rivière, M.F. Wheeler, V. Girault, Improved energy estimates for interior penalty, constrained and discontinuous Galerkin methods for elliptic problems. Part I, Comput. Geosci. 3 (1999) 337–360.
[12] H.H. Kim, E.T. Chung, C.S. Lee, A BDDC algorithm for a class of staggered discontinuous Galerkin methods, Comput. Math. Appl. 67 (2014) 1373–1389.
[13] C. Pechstein, R. Scheichl, Analysis of FETI methods for multiscale PDEs, Numer. Math. 111 (2008) 293–333.
[14] C. Pechstein, R. Scheichl, Analysis of FETI methods for multiscale PDEs. Part II: interface variation, Numer. Math. 118 (2011) 485–529.
[15] S. Gippert, A. Klawonn, O. Rheinbach, Analysis of FETI-DP and BDDC for linear elasticity in 3D with almost incompressible components and varying coefficients inside subdomains, SIAM J. Numer. Anal. 50 (2012) 2208–2236.
[16] A. Klawonn, M. Lanser, P. Radtke, O. Rheinbach, On an adaptive coarse space and on nonlinear domain decomposition, in: Domain Decomposition Methods in Science and Engineering XXI, Springer, 2014, pp. 71–83.
[17] J. Mandel, B. Sousedík, J. Šístek, Adaptive BDDC in three dimensions, Math. Comput. Simulation 82 (2012) 1812–1831.
[18] N. Spillane, V. Dolean, P. Hauret, F. Nataf, D.J. Rixen, Solving generalized eigenvalue problems on the interfaces to build a robust two-level FETI method, C. R. Math. 351 (2013) 197–201.
[19] J. Galvis, Y. Efendiev, Domain decomposition preconditioners for multiscale flows in high-contrast media, Multiscale Model. Simul. 8 (2010) 1461–1483.
[20] N. Spillane, V. Dolean, P. Hauret, F. Nataf, C. Pechstein, R. Scheichl, A robust two-level domain decomposition preconditioner for systems of PDEs, C. R. Math. 349 (2011) 1255–1259.
[21] V. Dolean, F. Nataf, N. Spillane, R. Scheichl, Analysis of a two-level Schwarz method with coarse spaces based on local Dirichlet-to-Neumann maps, Comput. Methods Appl. Math. 12 (2012) 391–414.
[22] C.R. Dohrmann, C. Pechstein, Modern domain decomposition solvers: BDDC, deluxe scaling, and an algebraic approach, 2013. http://people.ricam.oeaw.ac.at/c.pechstein/pechstein-bddc2013.pdf.
[23] H.H. Kim, E.T. Chung, A BDDC algorithm with optimally enriched coarse spaces for two-dimensional elliptic problems with oscillatory and high contrast coefficients, SIAM Multiscale Model. Simul. (2015) 571–593.
[24] A. Klawonn, P. Radtke, O. Rheinbach, FETI-DP with different scalings for adaptive coarse spaces, Proc. Appl. Math. Mech. (2014).
[25] H.H. Kim, E. Chung, J. Wang, BDDC and FETI-DP algorithms with adaptive coarse spaces for three-dimensional elliptic problems with oscillatory and high contrast coefficients, 2016, arXiv preprint arXiv:1606.07560.
[26] J. Li, O.B. Widlund, FETI-DP, BDDC, and block Cholesky methods, Internat. J. Numer. Methods Engrg. 66 (2006) 250–271.
[27] E. Chung, B. Cockburn, G. Fu, The staggered DG method is the limit of a hybridizable DG method, SIAM J. Numer. Anal. 52 (2014) 915–932.
[28] E. Chung, B. Cockburn, G. Fu, The staggered DG method is the limit of a hybridizable DG method. Part II: the Stokes flow, J. Sci. Comput. (2015) 1–18.
[29] E.T. Chung, C.Y. Lam, J. Qian, A staggered discontinuous Galerkin method for the simulation of seismic waves with surface topography, Geophysics 80 (2015) T119–T135.
[30] C.R. Dohrmann, A preconditioner for substructuring based on constrained energy minimization, SIAM J. Sci. Comput. 25 (2003) 246–258.
[31] W.N. Anderson Jr., R.J. Duffin, Series and parallel addition of matrices, J. Math. Anal. Appl. 26 (1969) 576–594.
[32] C. Dohrmann, C. Pechstein, Modern domain decomposition solvers: BDDC, deluxe scaling and an algebraic approach, in: 22nd International Conference on Domain Decomposition Methods.
[33] L. Beirão da Veiga, L.F. Pavarino, S. Scacchi, O.B. Widlund, S. Zampini, Isogeometric BDDC preconditioners with deluxe scaling, SIAM J. Sci. Comput. 36 (2014) A1118–A1139.
[34] A. Klawonn, P. Radtke, O. Rheinbach, A comparison of adaptive coarse spaces for iterative substructuring in two dimensions, Tech. Rep. Preprint 2015-05, Fakultät für Mathematik und Informatik, TU Bergakademie Freiberg, Germany, 2015.
[35] Y. Efendiev, E. Chung, G. Li, T. Leung, Sparse generalized multiscale finite element methods and their applications, Int. J. Multiscale Comput. Eng. 14 (1) (2016).
[36] Y. Efendiev, J. Galvis, F. Thomines, A systematic coarse-scale model reduction technique for parameter-dependent flows in highly heterogeneous media and its applications, Multiscale Model. Simul. 10 (2012) 1317–1343.