Applied Mathematics and Computation 378 (2020) 125212
A coupled monodomain solver with optimal memory usage for the simulation of cardiac wave propagation

Nagaiah Chamakuri a,b,∗, Philipp Kügler a

a Institute of Applied Mathematics and Statistics, University of Hohenheim, Stuttgart 70599, Germany
b Department of Mathematics, Mahindra-École Centrale (MEC), Hyderabad 500 043, India
Article history: Received 21 October 2019; Revised 14 February 2020; Accepted 7 March 2020

Keywords: Monodomain model; Physiological ionic models; Compile-time sparse matrix; FEM; Adaptive Runge-Kutta methods; Reentry; Parallel computing
Abstract

The monodomain model is a common description for explaining the electrical activity of the heart. The model is nonlinear and consists of an ODE system that describes the electrochemical reactions in the cardiac cells and a parabolic PDE that expresses the diffusion of the electrical signal. FEM approaches for the numerical solution of the monodomain equations can be classified into those that simultaneously update the state variables with respect to time by solving a coupled system and those that perform the temporal update in a decoupled manner. While coupled strategies are known to yield more accurate results, they have so far not been applied to physiological cell models due to their enormous memory requirements and computational times. In this paper, we tackle these challenges and suggest a novel computational strategy that exploits the sparsity of local matrices in the assembly of global FEM matrices and hence features optimal usage of memory. We demonstrate the practicability of our coupled approach by employing three popular physiological cell models (Luo-Rudy phase-I, Ten Tusscher 2006 and O'Hara-Rudy 2011) and compare it with a commonly used decoupled strategy. Qualitatively different results obtained for the simulation of cardiac reentry underline the potential relevance of our work for future studies of cardiac pathology.
1. Introduction

The bidomain model provides the basis for many numerical studies of cardiac electrophysiology. It consists of a coupled system of partial differential equations describing the diffusion of the intracellular and extracellular potential fields across the cardiac tissue. Furthermore, ionic currents across the cell membrane and the encompassing cellular reactions are described by systems of ordinary differential equations (ODEs) based on either phenomenological or electrophysiological cell models. The numerical solution of the bidomain equations is challenging, see [10,19,24,31], as the state variables act across multiple temporal and spatial scales, and as sufficiently large domain sizes and observation times are needed to study wavefront propagation free of boundary effects, and in particular arrhythmic events such as reentry. The bidomain model reduces to the monodomain model if the two media, intra- and extracellular, have the same anisotropy ratio. In many large scale simulations, the monodomain model is used in order to reduce the computational load. The spatial discretization of the monodomain model leads to a system of algebraic-differential equations, and its numerical integration with respect to time can either be done in a coupled or a decoupled manner.
∗ Corresponding author at: Institute of Applied Mathematics and Statistics, University of Hohenheim, 70599 Stuttgart, Germany. E-mail addresses: [email protected] (N. Chamakuri), [email protected] (P. Kügler).
https://doi.org/10.1016/j.amc.2020.125212
In the coupled approach, the state variables of the differential equations are simultaneously updated in each time step of the time integrator, typically using a fully implicit time stepping scheme [7–9,17]. In decoupled approaches such as implicit-explicit (IMEX) time stepping [5,11,12,19,22–24,33] or operator splitting methods [16,22,24,26,27], the reaction and diffusion terms are treated separately such that in each time step only a system of linear equations has to be solved for the partial differential equation (PDE) variables. While coupled approaches yield more accurate results than decoupled ones, they involve a nonlinear system in each time step for the PDE variables and hence are much more demanding with respect to computing time and memory requirements. As the large number of state variables in physiological cell models leads to a few million degrees of freedom in the discretized monodomain model, coupled solvers with implicit time stepping have so far only been applied to phenomenological cell models with only a handful of state variables.

In this paper, we suggest a novel coupled strategy for solving the monodomain model equations that is based on an efficient assembly of the global compressed row storage (CRS) matrix of the finite element method (FEM) discretization. The core idea is to exploit the sparse coupling of the cellular state variables and to implement the local block matrices appearing at each CRS element as compile-time sparse matrices. Furthermore, we take advantage of higher order time stepping schemes with an adaptive choice of the step size. Consequently, computational costs as well as computer memory requirements are significantly reduced, so that the monodomain equations are for the first time solved in a coupled manner even in the presence of physiological cell models. In particular, this enables the accurate and efficient simulation of pathological behavior such as cardiac reentry.

The remaining part of the paper is organized as follows. In the next section, we present the monodomain model equations and list the physiological cell models (Luo-Rudy phase-I [15], Ten Tusscher and Panfilov [29] and O'Hara-Rudy et al. [18]) used in our numerical experiments. In Section 3, we illustrate the spatial discretization of the monodomain model, give a comparison between coupled and decoupled solution strategies, and finally explain our central idea of compile-time local sparse matrices. Section 4 contains numerical results, where we first compare our approach with a prohibitive dense matrix implementation, and then compare it to a common decoupled solver based on the simulation of cardiac reentry with the three aforementioned cell models.

2. Governing equations

Monodomain model
In the bidomain model, see [10,25] for a detailed description, the dynamics of the intra- and extracellular potentials are described by a set of reaction-diffusion equations, while the ionic activity is expressed in terms of ordinary differential equations. The monodomain model can be derived from the bidomain model by assuming aligned conductivity tensors, see [10,25]. Let Ω ⊂ R^d, d ∈ {2, 3}, be a bounded connected domain with Lipschitz continuous boundary ∂Ω. Furthermore, Q = Ω × (0, T] and Σ = ∂Ω × (0, T] denote the space-time domain and its lateral boundary, respectively. The governing equations of the monodomain model read:
βCm ∂v/∂t = ∇ · (σ̄_i ∇v) − β Iion(v, w, c) + Istim(x, t) in Q,  (1)

∂c/∂t = S(v, w, c) in Q,  (2)

∂w/∂t = G(v, w) in Q,  (3)
where v : Q → R is the membrane potential, w : Q → R^n represents the n gating variables, c : Q → R^m represents the m ionic concentration variables and σ̄_i : Ω → R^{d×d} is the harmonic average conductivity tensor between the intra- and extracellular ones. The surface to volume ratio is denoted by β and the membrane capacitance by Cm. Istim represents the external current stimulus used to generate the excitation wavefronts. Iion(v, w, c) is the ionic current flowing through the ionic channels, and the functions G(v, w) and S(v, w, c) determine the evolution of the gating variables and concentrations; they are defined by a specific cell model, see [1] for details. Eq. (1) is a parabolic type equation and Eqs. (2)–(3) are a set of ODEs which need to be solved at each spatial point of the computational domain. We impose no flux boundary conditions due to the absence of a conductive bath, such that the intracellular and extracellular domains are electrically isolated along the tissue boundaries. We consider the following initial and boundary conditions for the monodomain model:
η · σ̄_i ∇v = 0 on Σ,  (4)

v(x, 0) = v_0, w(x, 0) = w_0 and c(x, 0) = c_0 on Ω,  (5)
where v_0 : Ω → R denotes the initial membrane potential, c_0 : Ω → R^m the initial concentration variables and w_0 : Ω → R^n the initial gating variables at time t = 0, and η is the outward unit normal vector to the boundary of the domain.
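Since a specific cell model enters (1)–(3) only through Iion, G and S, the coupling can be organized behind a small interface. The following is a hypothetical C++ sketch of such an interface (our illustration, not the interface of the paper's dune-pdelab code):

```cpp
// Hypothetical interface bundling the model-specific terms of (1)-(3):
// the ionic current Iion(v, w, c) and the ODE right-hand sides G and S.
#include <vector>

struct CellModel {
  int nGates;          // n, number of gating variables in w
  int nConcentrations; // m, number of concentration variables in c

  virtual ~CellModel() = default;

  // Ionic current through the membrane channels.
  virtual double Iion(double v, const std::vector<double>& w,
                      const std::vector<double>& c) const = 0;

  // Right-hand side of the gating ODEs (3), written into dw.
  virtual void G(double v, const std::vector<double>& w,
                 std::vector<double>& dw) const = 0;

  // Right-hand side of the concentration ODEs (2), written into dc.
  virtual void S(double v, const std::vector<double>& w,
                 const std::vector<double>& c,
                 std::vector<double>& dc) const = 0;
};
```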
Cell models
In our simulation study, we use the following three well accepted physiological cell models of increasing complexity, which specify the ionic current Iion through the ionic channels in the membrane and the source functions G(v, w) and S(v, w, c).

Luo-Rudy model (LRI)
The Luo-Rudy phase-I (LRI) cell model [15] describes the membrane action potential of the mammalian ventricular cell. The ionic current
Iion = INa + Isi + IK + IK1 + IKp + Ib

comprises 6 components, namely the background current (Ib), the plateau potassium current (IKp), the time independent potassium current (IK1), the slow inward calcium current (Isi), the time dependent potassium current (IK) and the fast sodium current (INa). The time dependent currents depend on six activation and inactivation gates m, h, j, d, f, X, which are governed by ODEs of the form
dk/dt = α_k(v)(1 − k) − β_k(v) k, where k = m, h, j, d, f, X.  (6)
The voltage-dependent rate constants α_k and β_k depend on the transmembrane voltage v. We denote the gating variables vector by w = (m, h, j, d, f, X). Furthermore, the slow inward calcium current (Isi) depends on the calcium concentration (c) described by
dc/dt = S(v, w, c).  (7)
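For later reference, a standard reformulation of (6) (a short derived step, not taken verbatim from [15]) makes the gate dynamics explicit as a relaxation process: since dk/dt = α_k(v) − (α_k(v) + β_k(v)) k, Eq. (6) is equivalent to

dk/dt = (k_∞(v) − k)/τ_k(v), with k_∞(v) = α_k(v)/(α_k(v) + β_k(v)) and τ_k(v) = 1/(α_k(v) + β_k(v)),

so each gate relaxes towards its voltage-dependent steady state k_∞(v) with time constant τ_k(v).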
Summarizing, the LRI model consists of 8 state variables and involves 6 ionic currents, 6 gating equations and a nonlinear ODE for the calcium concentration. All model parameters and functions were used as in the original paper [15].

Ten Tusscher model (TP06)
The ten Tusscher and Panfilov [29] model describes the ventricular action potential of human epicardial, endocardial and midmyocardial cells. The ionic transmembrane current has the following 12 components: fast Na+, L-type Ca2+, rapid and slow components of the delayed rectifier K+, inward rectifier K+, transient outward K+, plateau K+, Na+-Ca2+ exchanger, Na+-K+ pump, sarcolemmal Ca2+ pump, and background Na+ and Ca2+ currents. The model comprises 19 state variables, namely the membrane voltage, 11 gating and 7 concentration variables; see [29] for variable names and a more detailed description. In our simulations, we used the epicardial cell parameter and function setting of the original paper [29].

O'Hara-Rudy model (ORd11)
Another description of the human ventricular action potential for epicardial, endocardial and midmyocardial cells is given by the O'Hara-Rudy [18] model, which is based on data from human ventricular myocyte experiments. The ionic transmembrane current comprises 16 components; compared with the TP06 model, it additionally includes the late Na+ current, the Na+ current through the L-type Ca2+ channel, the K+ current through the L-type Ca2+ channel, and the myoplasmic and subspace components of the Na+-Ca2+ exchange current. The model has 41 state variables, namely the membrane voltage, 29 gating variables and 11 concentration variables. In our simulations, we used the epicardial cell parameter and function setting of the original paper [18].

3. Numerical approach

3.1. Weak formulation

In the following, we denote the inner product and norm in L^2(Ω) by (·,·) and |·|, respectively, and those in H^1(Ω) by (·,·)_{H^1} and ||·||. Furthermore, we denote H = L^2(Ω) and V = H^1(Ω). We assume that for constants 0 < m < M
m|ξ|² ≤ ξ^T σ̄_i ξ ≤ M|ξ|² for all ξ ∈ R^d.  (8)
The weak formulation of the monodomain model reads as follows. A triple (v, w, c) ∈ (L^2(0, T; V) ∩ C(0, T; H) ∩ L^p(Q)) × C(0, T; H) × C(0, T; H) with v_t ∈ L^2(0, T; V∗) + L^{p′}(Q), w_t ∈ L^2(0, T; H) and c_t ∈ L^2(0, T; H) is called a weak solution of (1)–(3) if (v(0), w(0), c(0)) = (v_0, w_0, c_0) and if
⟨βCm v_t(t), ϕ⟩_{V∗,V} + ∫_Ω σ̄_i ∇v(t) · ∇ϕ + ∫_Ω β Iion(v(t), w(t), c(t)) ϕ = ⟨Istim(t), ϕ⟩_{V∗,V},  (9)

⟨c_t(t), φ⟩ − ∫_Ω S(v(t), w(t), c(t)) φ = 0,  (10)

⟨w_t(t), ψ⟩ − ∫_Ω G(v(t), w(t)) ψ = 0  (11)
is satisfied for a.e. t ∈ (0, T) and all (ϕ, φ, ψ) ∈ V × H × H. Here, V∗ denotes the dual of V with H as pivot space and 1/p + 1/p′ = 1. For some simplified cell models, the existence and uniqueness of a weak solution to the bidomain model was demonstrated in [6]. The wellposedness of the monodomain model coupled with the LRI cell model was derived in [30] but is an open problem for more physiological cell models.

3.2. Spatial discretization

In this paper, we use the conforming linear finite element method (FEM) for the spatial discretization of the monodomain model. We denote by Vh ⊂ H^1(Ω) a finite dimensional subspace spanned by piecewise linear basis functions {ωi}_{i=0}^{N} with respect to the spatial grid. The approximate solutions v, c and w are expressed in the form v(t) = Σ_{i=0}^{N} v_i(t) ωi, c(t) = Σ_{i=0}^{N} c_i(t) ωi and w(t) = Σ_{i=0}^{N} w_i(t) ωi, respectively. This semi-discretization in space results in the differential algebraic system:
βCm M ∂v/∂t = −A_i v − β Iion(v, w, c) + Istim,  (12)

M ∂c/∂t = S(v, w, c),  (13)

M ∂w/∂t = G(v, w),  (14)
together with initial conditions for v, c and w. Here, M = {⟨ωi, ωj⟩}_{i,j=1}^{N} is the mass matrix, A_i = {⟨σ̄_i ∇ωi, ∇ωj⟩}_{i,j=1}^{N} is the stiffness matrix, and the vector Istim is defined by Istim = {⟨Istim, ωj⟩}_{j=1}^{N}. The nonlinear reaction terms of the PDEs are given by
Iion(v, w, c) = Iion( Σ_{i=0}^{N} v_i ωi, Σ_{i=0}^{N} w_i ωi, Σ_{i=0}^{N} c_i ωi ),

S(v, w, c) = S( Σ_{i=0}^{N} v_i ωi, Σ_{i=0}^{N} w_i ωi, Σ_{i=0}^{N} c_i ωi ),

G(v, w) = G( Σ_{i=0}^{N} v_i ωi, Σ_{j=0}^{N} w_j ωj ).
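These reaction terms are evaluated from the nodal coefficient vectors. The following sketch (our illustration, reusing the hypothetical CellModel interface from Section 2 and evaluating the ionic terms nodally for brevity, whereas the actual FEM assembly integrates them against the basis functions) shows how the right-hand side of the resulting coupled system can be assembled; the mass matrices on the left-hand sides of (12)–(14) are handled by the time integrator.

```cpp
// Illustrative evaluation of the coupled right-hand side of (12)-(14).
// The state is ordered nodewise, x_i = (v_i, w_i, c_i), so each node
// contributes an nm x nm block with nm = 1 + n + m unknowns.
#include <cstddef>
#include <vector>

void evaluateF(const CellModel& cm,
               const std::vector<double>& x,     // nodewise state, size N*nm
               const std::vector<double>& Av,    // precomputed A_i * v, size N
               const std::vector<double>& Istim, // stimulus vector, size N
               double beta,
               std::vector<double>& F)           // output, size N*nm
{
  const int n = cm.nGates, m = cm.nConcentrations, nm = 1 + n + m;
  const std::size_t N = x.size() / nm;
  std::vector<double> w(n), c(m), dw(n), dc(m);
  for (std::size_t i = 0; i < N; ++i) {
    const double v = x[i * nm];
    for (int k = 0; k < n; ++k) w[k] = x[i * nm + 1 + k];
    for (int k = 0; k < m; ++k) c[k] = x[i * nm + 1 + n + k];
    F[i * nm] = -Av[i] - beta * cm.Iion(v, w, c) + Istim[i]; // PDE row, cf. (12)
    cm.G(v, w, dw);      // gating ODEs (14)
    cm.S(v, w, c, dc);   // concentration ODEs (13)
    for (int k = 0; k < n; ++k) F[i * nm + 1 + k] = dw[k];
    for (int k = 0; k < m; ++k) F[i * nm + 1 + n + k] = dc[k];
  }
}
```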
3.3. Temporal discretization

Time discretization schemes for the numerical integration of (12)–(14) can be divided into coupled and decoupled methods. For a detailed comparative study of the stability and accuracy of implicit, semi-implicit and explicit time discretizations of the monodomain model equations with simplified cell models we refer to [12].

Coupled strategy
The semi-discretized Eqs. (12)–(14) can be expressed as a coupled system
M̃ ∂x/∂t = F(x),  x(t_0) = x_0,  (15)
for the state variable x = (v, w, c). In order to solve the ODE system (15), we used a Rosenbrock method, which belongs to the class of linearly implicit Runge–Kutta methods. An s-stage Rosenbrock method of order p with an embedding of order p̂ ≠ p, advancing from time t_i to t_i + τ_i, has the form
( 1/(τ_i γ) M̃ − J ) k_o = F( t_i + α_o τ_i, x_i + Σ_{l=1}^{o−1} a_{ol} k_l ) − M̃ Σ_{l=1}^{o−1} (b_{ol}/τ_i) k_l,  o = 1, . . . , s,  (16)

x_{i+1} = x_i + Σ_{l=1}^{s} m_l k_l,  (17)

x̂_{i+1} = x_i + Σ_{l=1}^{s} m̂_l k_l.  (18)
The constant coefficients γ, α_o, a_{ol}, b_{ol}, m_l and m̂_l are chosen in such a way that certain consistency order conditions are fulfilled to assure the convergence orders p and p̂. We used exact derivatives of the vector F(x) for the construction of the Jacobian matrix J in (16). In our computations, the stiff ODE solver ROS3PL [14] is used, which has s = 4 internal stages and is third order accurate (p = 3). Moreover, it exhibits no order reduction in the PDE case and satisfies the L-stability property. To determine the suggested next time step, the second solution x̂_{i+1} is computed according to Eq. (18). Here we choose p > p̂ = 2, since we prefer to continue the integration with the higher order solution x. We omit the details of the time step adaptation strategy and refer to [14] for a complete description.

We now turn to the systems of linear equations that arise in each time step and their solution by efficient iterative solvers. For solving the discretized systems at each internal stage of (16), we employed a BiCGSTAB [32] method with block Jacobi preconditioning. In a parallel numerical implementation, the key point is to distribute the computational grid evenly across all available processors, which is mainly achieved by domain decomposition algorithms. In our numerical simulations, we used the internal YASP grid generator of DUNE [3] for parallel grid construction. This grid generator supports various levels of overlap between the parallel subgrids; we chose zero overlap, in other words non-overlapping grids, in both strategies. For the domain decomposition technique, the computational domain is partitioned into subdomains and each subdomain is assigned to a single processor. A reduced size problem on each subdomain, which is coupled to the adjacent problems along the interface boundaries, is solved. The interface coupling at the processor boundaries can be relaxed by introducing an additional communication step in the solution of the algebraic system, see [4].
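The step size suggestion derived from the embedded solution (18) follows the usual controller for embedded pairs; the sketch below shows the basic rule with illustrative safety factors (our choice, not the exact heuristics of [14] or of the authors' code).

```cpp
// Standard embedded-error step size controller: err is the weighted
// norm of x_{i+1} - xhat_{i+1}, scaled so that err <= 1 means the
// step is acceptable; pHat is the order of the embedded solution.
#include <algorithm>
#include <cmath>

double suggestTimeStep(double tau, double err, int pHat)
{
  const double safety = 0.9, facMin = 0.2, facMax = 5.0; // illustrative bounds
  const double fac =
      safety * std::pow(1.0 / std::max(err, 1e-12), 1.0 / (pHat + 1));
  return tau * std::min(facMax, std::max(facMin, fac));
}
```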
Decoupled strategy
For numerical comparisons, a decoupled strategy for solving (12)–(14) is implemented. We chose a popular operator splitting approach that separates the reaction operator (associated with the ionic current) from the diffusion operator (associated with the conduction in the media), and that treats the dynamics of the gating and concentration variables separately [2,22,31]. Here we adapt this bidomain decoupling strategy to the monodomain problem. Such splitting strategies eliminate strong dependencies between the state variables of the differential equations and increase the computational efficiency by using different numerical schemes for the reaction and the diffusion terms. Their main disadvantage is a loss of numerical accuracy, because the state variables are decoupled and hence not simultaneously updated with respect to time. The decoupled strategy realized in this paper for solving the monodomain equations at each time step reads as follows.

1. First, we solve for the gating variables w^{n+1} at t + Δt by means of the implicit Euler method
w^{n+1} − Δt G(v^n, w^{n+1}) = w^n.  (19)

2. Second, we solve for the ionic concentrations c^{n+1} by means of the explicit Euler method

c^{n+1} = c^n + Δt S(v^n, w^{n+1}, c^n).  (20)

3. Finally, we update the transmembrane potential v^{n+1} by solving the parabolic equation

( (βCm/Δt) M + A_i ) v^{n+1} = (βCm/Δt) M v^n − β M Iion(v^n, w^{n+1}, c^{n+1}).  (21)
For solving the systems of linear equations in (21), we again used a BiCGSTAB method with block Jacobi preconditioning.
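Schematically, one time step of this decoupled scheme executes the three stages in order; the following sketch (our illustration, with the solver routines passed in as callbacks) only fixes the data flow.

```cpp
// One decoupled time step (19)-(21): the gating solve is pointwise and
// nonlinear, the concentration update is explicit, and the potential
// update is one linear solve (BiCGSTAB with block Jacobi in the paper).
#include <functional>
#include <vector>

using Vec = std::vector<double>;

void decoupledStep(
    double dt, Vec& v, Vec& w, Vec& c,
    const std::function<void(const Vec&, Vec&, double)>& gatingImplicit,
    const std::function<void(const Vec&, const Vec&, Vec&, double)>& concExplicit,
    const std::function<void(Vec&, const Vec&, const Vec&, double)>& potentialSolve)
{
  gatingImplicit(v, w, dt);    // step 1: w^{n+1} from (19)
  concExplicit(v, w, c, dt);   // step 2: c^{n+1} from (20)
  potentialSolve(v, w, c, dt); // step 3: v^{n+1} from (21)
}
```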
3.4. Optimal memory usage

Discretized versions of physiological cell models must be solved on fine computational grids due to the rapid changes of the dynamics occurring in very short time intervals and over small spatial domains. In addition, small time steps must be used. The need for relatively fine computational meshes and the large size of the more physiological cell model equations requires large computer memory and leads to a high computational cost [2,20,22,28]. Here we demonstrate that the implementation of a compile-time local sparse matrix is a memory efficient technique for solving the monodomain equations in a coupled manner.

Suppose that the chosen cell model has nm state variables. Then, in the context of the finite element (FE) method, each nodal point contributes an nm × nm block matrix to the global FE matrix. In most of the existing FE simulation packages, this block matrix is implemented as a dense matrix (DM). Consequently, nm² floating point operations are needed at each nodal point during the matrix-vector product operations of the Krylov solvers. However, due to the loose coupling between the state variables, it is advantageous to use specialized data structures that take advantage of the sparse structure of the local matrix. In this regard, we implement the local block matrix as a compile-time sparse matrix (CTSM). By using template metaprogramming in C++, the CTSM is implemented as a vector and the local matrix-vector products are realized by unrolling the loops. Though this requires more work at compile time, the run time can be significantly reduced as unnecessary floating point operations are avoided. In addition, the need for computer memory is lower as only non-zero matrix entries have to be stored.
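To make the idea concrete, the following is a minimal sketch of such a local sparse block (our illustration; the authors' actual implementation uses C++ template metaprogramming within dune-pdelab [4] so that the pattern is part of the type and the product loop is unrolled at compile time).

```cpp
// Local block of size Rows x Rows stored as its NNZ non-zero entries
// plus a fixed index pattern; the local matrix-vector product then
// costs exactly NNZ multiply-adds instead of Rows*Rows.
#include <array>
#include <cstddef>

template <std::size_t Rows, std::size_t NNZ>
struct LocalSparseBlock {
  std::array<std::size_t, NNZ> row, col; // sparsity pattern of the block
  std::array<double, NNZ> val;           // only non-zero entries stored

  void mv(const std::array<double, Rows>& x,
          std::array<double, Rows>& y) const // y = B x
  {
    y.fill(0.0);
    for (std::size_t k = 0; k < NNZ; ++k)
      y[row[k]] += val[k] * x[col[k]];
  }
};

// For the LRI model below: 8 state variables, 24 non-zeros instead of 64.
using LRIBlock = LocalSparseBlock<8, 24>;
```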
Table 1
Memory usage of DM and CTSM for different mesh sizes and different ionic models, where NNZ denotes the number of non-zero entries and R represents the ratio between DM and CTSM.

Cells       NNZ          LRI model                  TP06 model                 ORd11 model
                         DM (GB)  CTSM (GB)  R      DM (GB)  CTSM (GB)  R      DM (GB)  CTSM (GB)  R
90,000      811,801      0.41     0.15       2.67   2.34     0.51       4.62   10.91    1.46       7.47
157,151     2,224,511    1.13     0.42       2.67   6.42     1.38       4.62   29.91    4.0        7.47
5,760,000   51,854,401   26.5     9.95       2.67   149.7    32.3       4.62   697.3    105.7      6.59
For example, the Jacobian matrix of the Luo-Rudy phase-I cell model with its 8 state variables (v, m, h, j, d, f, X, c) has the following sparse matrix structure:

      v  m  h  j  d  f  X  c
  v [ ×  ×  ×  ×  ×  ×  ×  × ]
  m [ ×  ×  ·  ·  ·  ·  ·  · ]
  h [ ×  ·  ×  ·  ·  ·  ·  · ]
  j [ ×  ·  ·  ×  ·  ·  ·  · ]
  d [ ×  ·  ·  ·  ×  ·  ·  · ]
  f [ ×  ·  ·  ·  ·  ×  ·  · ]
  X [ ×  ·  ·  ·  ·  ·  ×  · ]
  c [ ×  ·  ·  ·  ×  ×  ·  × ]        (22)
Here, × denotes a non-zero entry and · a zero entry of the local matrix. Hence, in case of a CTSM implementation, the storage of the local block matrix entries only requires a one dimensional array of 24 doubles instead of 64. The sparsity, i.e. the number of zero-valued elements divided by the total number of elements, of the local block matrix is 62%. Furthermore, only 24 floating point operations are needed at each nodal point during the matrix-vector product operations of the Krylov solvers. Likewise, in the case of the TP06 cell model with its 19 variables, the CTSM implementation requires a one dimensional array of 78 doubles to store the local block matrix entries instead of 361 doubles and requires 78 flops for the local matrix-vector product. Here the sparsity of the local block matrix is 78%. For the ORd11 ionic model with its 41 state variables, the CTSM implementation requires a one dimensional array of 225 doubles instead of 1681 doubles and needs 225 flops for the local matrix-vector product. In this model, the sparsity of the local block matrix is 86%. We remark that such very large local matrices are infeasible to manipulate using standard dense-matrix algorithms on realistic heart geometries. In our study, we implemented this compile-time sparse matrix strategy in the framework of the dune-pdelab [4] software. The gain in computer memory obtained by using the CTSM implementation instead of the dense matrix implementation is summarized in Table 1 for different mesh sizes and different ionic models. For instance, the global FE matrix for the monodomain equation with the ORd11 model defined on a 2D slab geometry with 5,760,000 FE cells requires about 6.6 times less memory if assembled according to the CTSM approach (cf. Table 1).

4. Numerical results

In this section, we present the simulation results obtained with the CTSM approach. First, we compare the simulation runtime and the matrix assembly time with those of the DM approach. Second, we solve the monodomain equation for the simulation of cardiac reentry and compare the CTSM approach with the decoupled strategy outlined above with respect to accuracy. Our code implementation is based on the public domain FEM packages DUNE [3] and Dune-PDELab [4]. The computations were performed on the bwUniCluster (located at KIT, Germany) using up to 64 Intel Xeon E5-2660 v4 (Broadwell) computing nodes, where each node consists of 28 cores and 128 GB main memory. Again, we used the aforementioned three physiological cell models with original parameter settings in case of TP06 [29] and ORd11 [18]. For the LRI model, we chose Gsi = 0.02, GNa = 4.0 and GK = 0.423 to obtain spiral wavefronts but kept all other parameters at their original values [15]. In all simulations, the diagonal conduction coefficients were set to 0.0015 cm²/ms. The initial values of the transmembrane voltage, the gating variables and the concentration variables are given by the resting values of the cardiac tissue. We used uniform quadrilateral meshes in our computations.

4.1. Comparison of sparse and dense matrix implementations

For the comparison of the CTSM and the DM strategy, we chose a simulation time span of 1 ms, fixed the time step size of the coupled solver to 0.02 ms, and tested with different numbers of processors. For these simulations, we chose the computational domain Ω = [0, 1]² cm² and a grid resolution of 300 × 300 quadrilateral elements.
Table 2
Computational times (in seconds) and speedup for the LRI model.

Cores   DM                    CTSM                  Speedup (DM/CTSM)
        Assembly   Solver     Assembly   Solver     Assembly   Solver   Total
4       178.85     684.77     147.26     427.52     1.21       1.60     1.50
16      54.20      214.44     45.77      134.45     1.18       1.59     1.49
112     8.12       31.45      6.66       18.87      1.21       1.66     1.54
448     1.94       8.11       1.65       5.42       1.17       1.49     1.42
896     0.99       4.58       0.84       2.92       1.17       1.56     1.48
Table 3
Computational times (in seconds) and speedup for the TP06 model.

Cores   DM                    CTSM                  Speedup (DM/CTSM)
        Assembly   Solver     Assembly   Solver     Assembly   Solver   Total
4       4354.55    8601.36    2268.88    2876.71    1.91       2.98     2.51
16      1354.94    3172.43    703.58     1051.54    1.92       3.01     2.57
112     198.78     324.36     104.45     112.29     1.90       2.88     2.41
448     48.14      93.02      25.66      30.67      1.87       3.03     2.50
896     25.38      44.91      12.91      15.01      1.96       2.99     2.51
Table 4
Computational times (in seconds) and speedup for the ORd11 model.

Cores   DM                    CTSM                  Speedup (DM/CTSM)
        Assembly   Solver     Assembly   Solver     Assembly   Solver   Total
4       10134.99   15479.93   3753.7     3011.66    2.70       5.13     3.78
16      2791.79    4104.86    1011.52    824.27     2.75       4.97     3.75
112     416.79     558.72     147.80     109.77     2.81       5.08     3.78
448     98.06      135.84     35.88      26.92      2.73       5.04     3.72
896     46.75      71.16      17.69      14.13      2.64       5.03     3.70
For the test with the LRI model, Table 2 presents the computational times needed for the FEM matrix assembly and the complete ODE solution as a function of the number of cores. The speedup is reported for the matrix assembly, the solver and the complete simulation, and is calculated as the ratio of the DM values over the corresponding CTSM values. In this example, the matrix assembly with CTSM is approximately 1.2 times faster than with DM across the different core numbers, the ODE solver is roughly 1.6 times faster, and the total computation time is about 1.5 times shorter. Furthermore, good parallel efficiency was observed up to 896 cores. Likewise, Tables 3 and 4 list the numerical results obtained with the TP06 model and the ORd11 model. While with TP06 the total computation was about 2.5 times faster for CTSM than for DM across all tested numbers of cores, this factor even increased to roughly 3.7 in case of ORd11. Again, good parallel efficiency was observed up to 896 cores for both models.

Furthermore, we investigated the weak parallel scalability of the CTSM approach and considered an increasing number n_p of processors according to n_p = 8, 28, 112, 448 and 1792. The speedup test of the CTSM technique was performed on a slab domain, where a fine mesh was chosen such that the local mesh size on each subdomain (or processor) is fixed at 56 × 56 quadrilateral elements. That way, the global size of the discrete LRI system increases from about 0.2 million dofs for the smallest domain with 8 subdomains to 46 million dofs for the largest domain with 1792 subdomains. For this benchmark, the simulation was run for 20 time steps of 0.05 ms during the depolarization phase, which is computationally the most intense phase. The parallel efficiency for the LRI model is presented in Fig. 1 and clearly shows that the numerical scheme is scalable. We remark that we observed similar parallel weak scalability results for the TP06 and ORd11 models. Furthermore, comparable results were obtained for the DM technique.

4.2. Comparison of coupled and decoupled approaches

For all three cell models, we compared the coupled with the decoupled approach for solving the monodomain equation in two ways. First, we studied the relative error of the state variables at the reference time t∗ = 80 ms during repolarization, and second, we analyzed the behavior during cardiac reentry in response to an S1-S2 protocol. With respect to the convergence analysis of the error, we chose the computational domain Ω = [0, 1]² cm² and grid resolutions ranging from 75 × 75 cells to 1200 × 1200 cells. The reference solution was calculated using the coupled strategy on the finest grid with the fixed time step size τ = 0.025 ms. Here the small time step size was chosen to ensure the accuracy of the numerical solution at the finest grid level. For the coarser grids, the coupled method was combined with adaptive time stepping. In the implementation of the decoupled method, the time step size was set to 0.04 ms for the coarsest grid and reduced by a
Fig. 1. Weak scaling of LRI model.
Fig. 2. L2-error at time 80 ms of the decoupled approach (left) and the coupled scheme (right).

Table 5
Action potential characteristics for the LRI model.

                            Decoupled method   Coupled method
APD90 (ms)                  102.03             102.68
dv/dt (V/s)                 33.02              31.78
Conduction velocity (cm/s)  36.28              37.92
factor 1/2 for each grid refinement. In all cases, the relative error of a state variable Z ∈ {v, c1, . . . , w1, . . .} was calculated as

err_Z(t∗) = ||Z(·, t∗) − Z_r(·, t∗)||_{L²(Ω)} / ||Z_r(·, t∗)||_{L²(Ω)},  (23)

where Z_r denotes the corresponding variable of the reference solution.
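In discrete form, the L²(Ω) norm of a finite element function with coefficient vector z equals sqrt(z^T M z), with M the mass matrix; after interpolating the reference solution to the coarser grid, (23) can be evaluated as in the following sketch (our illustration, assuming a CRS stored mass matrix).

```cpp
// Relative L2 error (23): ||z - zr||_M / ||zr||_M with ||z||_M = sqrt(z^T M z).
#include <cmath>
#include <cstddef>
#include <vector>

double normM(const std::vector<double>& z,
             const std::vector<double>& Mval,      // CRS values of M
             const std::vector<std::size_t>& Mcol, // CRS column indices
             const std::vector<std::size_t>& Mrow) // CRS row pointers
{
  double s = 0.0;
  for (std::size_t i = 0; i + 1 < Mrow.size(); ++i)
    for (std::size_t k = Mrow[i]; k < Mrow[i + 1]; ++k)
      s += z[i] * Mval[k] * z[Mcol[k]];
  return std::sqrt(s);
}

double relativeError(std::vector<double> z, const std::vector<double>& zr,
                     const std::vector<double>& Mval,
                     const std::vector<std::size_t>& Mcol,
                     const std::vector<std::size_t>& Mrow)
{
  for (std::size_t i = 0; i < z.size(); ++i) z[i] -= zr[i]; // Z - Z_r
  return normM(z, Mval, Mcol, Mrow) / normM(zr, Mval, Mcol, Mrow);
}
```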
With respect to the reentry analysis, we used the originally published domain sizes, which are listed below for convenience.

Results with LRI model
For the LRI model, Fig. 2 illustrates the dependence of the relative error on the grid size and shows that the coupled approach yields more accurate results than the decoupled one. However, we numerically observed only a slight difference in the solutions for the simulation of one complete action potential. With respect to the reentry analysis, we chose Ω = [0, 6]² cm² and grid resolutions ranging from 300 × 300 to 2400 × 2400. At time t = 0 ms, an initial stimulus S1 of Itr = 100 μA/cm² was applied for a duration of 2 ms along the bottom edge of the tissue sheet to induce a planar wavefront traveling towards the top edge of the slab domain. The characteristics of the resulting action potential at the midpoint of Ω are given in Table 5, as obtained for the coupled approach with adaptive time stepping and the decoupled approach with step size Δt = 0.02 ms, both at grid resolution 1200 × 1200.
Fig. 3. The solution of v at the midpoint (3,3) of the computational domain. Left: decoupled method. Middle: coupled method. Right: convergence of the solutions with varying time step size for the decoupled method.
Fig. 4. The transmembrane voltage v of the LRI model at times t = 40, 120, 240, 400 and 600 ms, respectively. First row: with the decoupled method. Second row: with the coupled method.
Here the time step size and the domain size were chosen based on the convergence of the solution, which is shown in Fig. 3. Though the small differences in APD90, maximal upstroke velocity dv/dt and conduction velocity might seem uncritical for the simulation of normal wave propagation, they have a strong impact on the simulation of reentry. For the initiation of the latter, a second S2 stimulus with Itr = 100 μA/cm² and duration 2 ms was applied in the region [0, 0.1] × [0, 3] at time t = 115 ms, when the critical recovery isoline arrived at the center of the sheet. This second stimulus generated a single phase singularity at the intersection between the critical recovery isoline and the boundary of the S2 stimulus region. The resulting membrane potential (v) at the midpoint of Ω, recorded for a simulation period of 600 ms, is plotted in Fig. 3, both for the coupled and the decoupled approach and in dependence on the grid size. While the coupled approach leads to five action potentials for all grid sizes, the decoupled approach at the two finest grid resolutions only leads to four action potentials. This difference in simulation outcomes is due to an accumulation of the small deviations listed in Table 5 and is further illustrated in Fig. 4 by a temporal sequence of snapshots of the membrane potential v. While at t = 40 ms the difference between the solutions obtained by the two methods seems negligible, the excitation wavefront is already faster for the coupled method. At t = 120 ms, the end of the stimulus S2 period, the difference between the two solutions becomes visible and grows further as time progresses. As a consequence, the coupled and the decoupled approach lead to completely different predictions of the spiral wave development at t = 600 ms. For this simulation, the decoupled approach took 1 h 31 min on 84 cores and was only 1.35 times faster than the coupled approach.

Results with TP06 model
Also for the TP06 model, the convergence analysis of the relative error demonstrates a higher accuracy of the coupled approach, see Fig. 5. With respect to the reentry analysis, we chose Ω = [0, 25]² cm² and two grid resolutions, 1000 × 1000 and 2000 × 2000, both leading to similar results. At time t = 0 ms, an initial stimulus S1 of Itr = 100 μA/cm² was applied for a duration of 1 ms along the bottom edge of the tissue sheet to induce a planar wavefront traveling towards the top edge of the slab domain. The characteristics of the resulting action potential at the midpoint of Ω are given in Table 6. While the coupled and the decoupled approach only lead to slightly different values of the maximum peak of v, the APD90 and the maximal upstroke velocity, the conduction velocity obtained with the coupled approach is about 10% higher.
Fig. 5. L2-error at time 80 ms of the decoupled approach (left) and the coupled scheme (right).

Table 6
Action potential characteristics for the TP06 model.

                            Decoupled method   Coupled method
Maximum peak value (mV)     34.91              35.02
APD90 (ms)                  346.11             346.06
dv/dt (V/s)                 26.12              26.23
Conduction velocity (cm/s)  56.62              64.01
Fig. 6. The time courses of v at the point (0,25) in the computational domain obtained by the decoupled and coupled methods.
As a consequence, the two methods produce different reentry behavior. The latter was triggered at time t = 550 ms by the application of a second S2 stimulus of Itr = 100 μA/cm² in the region [0, 9] × [0, 9] for a duration of 1 ms. A comparison of the time courses of the membrane potential (v) at the point (0,25) of Ω, see Fig. 6, shows that in the coupled approach the action potentials occur earlier than in the decoupled one. Also, the spiral wave tip moves faster in case of the coupled approach due to the higher conduction velocity; see Fig. 7 for a comparison of sequences of transmembrane voltage snapshots. For this reentry simulation, the decoupled approach took 2 h 8 min on 896 cores and was only 2.25 times faster than the coupled approach.

Results with ORd11 model
Finally, we repeated our tests with the ORd11 model and made similar observations as in the case of LRI and TP06. In particular, the results obtained with the coupled approach are more accurate, as demonstrated by the convergence error analysis summarized in Fig. 8. While this difference was not significant for the simulation of one heartbeat, it again mattered in the context of reentry. For the reentry simulation, we chose Ω = [0, 18]² cm² and GKr = 0.2392 in order to reduce the excitation wave width. At t = 0, an initial stimulus S1 of Itr = 120 μA/cm² with a duration of 2 ms was applied along the bottom edge of the tissue sheet in order to induce a planar wavefront traveling towards the top edge.
Fig. 7. The transmembrane voltage v of the TP06 model at times t = 375, 712, 831, 1523 and 2000 ms, respectively. First row: with the decoupled method. Second row: with the coupled method.
Fig. 8. L2-error at time 80 ms of the decoupled approach (left) and the coupled scheme (right).

Table 7
Action potential characteristics for the ORd11 model.

                            Decoupled method   Coupled method
Maximum peak value (mV)     29.94              28.91
APD90 (ms)                  96.0               96.04
dv/dt (V/s)                 29.19              29.11
Conduction velocity (cm/s)  43.31              43.92
The characteristics of the resulting action potential are given in Table 7 and show small differences between the coupled and the decoupled approach. Yet, in particular, the difference in conduction velocity is large enough to cause clear differences in the reentry behavior. At time t = 330 ms, a second S2 stimulus of Itr = 120 μA/cm² was applied in the region [0, 9] × [0, 9] for a duration of 2 ms. Again, the resulting difference between the two simulation methods is illustrated by comparing the time course of the membrane potential at the point (0,18) in Ω, see Fig. 9, and by comparing the spiral wave formation and movement in a sequence of voltage snapshots, see Fig. 10. For this reentry simulation, the decoupled approach took 3 h 43 min on 896 cores and was 3.46 times faster than the coupled approach.

Results on a 3D realistic geometry
In this subsection, we present numerical results on a realistic 3D geometry using the LRI model. In our computations, a fully three-dimensional rabbit ventricle (RV) geometry, see the left-hand side of Fig. 11, is considered, which was generated based on histological images [21]. The size of the RV domain is 2.79 × 2.91 × 2.56 cm³.
Fig. 9. The solution of v at the point (0,18) in the computational domain using the decoupled and coupled methods.
Fig. 10. The transmembrane voltage v solution of ORd11 model at times t= 316, 556, 776, 864, 1028 and 1440 ms respectively. First row: with the decoupled method. Second row: with the coupled method.
Fig. 11. The 3D RV domain (left) and glyphs of the fiber directions a_l(x) on the RV domain (right).
It consists of 3,073,529 tetrahedra and 547,680 nodal points. As a consequence of the fibrous structure of the myocardium, the conductivity tensors are anisotropic. We denote by a_l(x), a_t(x) and a_n(x) the fiber, sheet and sheet-normal directions, respectively, which form an orthonormal basis [13] and depend on the position in the cardiac tissue. In this test case, rotational isotropy of the cardiac tissue structure is assumed, i.e. σ_n^i = σ_t^i. Then the local conductivity tensor σ̄_i is expressed as
σ̄_i = (σ_l^i − σ_t^i) a_l(x) a_l^T(x) + σ_t^i I.  (24)

Here σ_l^i and σ_t^i denote the measured conductivity coefficients along the corresponding directions and I is the identity matrix.
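Pointwise, (24) is straightforward to evaluate from the local fiber direction; the following small sketch (our illustration) assembles the 3 × 3 tensor for given conductivities σ_l^i and σ_t^i.

```cpp
// Conductivity tensor (24): sigma = (sigma_l - sigma_t) a_l a_l^T + sigma_t I,
// assuming rotational isotropy (sigma_n = sigma_t) and a unit fiber vector a_l.
#include <array>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;

Mat3 conductivityTensor(const Vec3& al, double sigma_l, double sigma_t)
{
  Mat3 sigma{};
  for (int i = 0; i < 3; ++i) {
    for (int j = 0; j < 3; ++j)
      sigma[i][j] = (sigma_l - sigma_t) * al[i] * al[j]; // rank-one fiber part
    sigma[i][i] += sigma_t;                              // isotropic transverse part
  }
  return sigma;
}
```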
Table 8
Action potential characteristics on the realistic 3D RV geometry for the LRI model.

                            Decoupled method   Coupled method
Maximum peak value (mV)     −6.25              −6.80
APD90 (ms)                  43.80              44.69
dv/dt (V/s)                 54.59              58.38
Conduction velocity (cm/s)  35.77              37.62
Fig. 12. The solution of v at the point (−0.81, 0.61, −0.77) in the 3D RV computational domain using the decoupled and coupled methods.
Fig. 13. The transmembrane voltage v solution of LRI model on 3D RV geometry at times t= 47, 101, 191, 353 and 551 ms, respectively. First row: with the decoupled method. Second row: with the coupled method.
The fiber directions of the cardiac tissue have been modeled based on anatomical observations; see [21] for more information and the right-hand side of Fig. 11 for the distribution of the fiber orientation used in our computations. We chose similar parameters as in the 2D simulations with the exception of the S2 stimulus time. First, an initial stimulus S1 of Itr = 100 μA/cm³ was applied at time t = 0 ms for a duration of 2 ms at the bottom of the RV geometry to initiate a planar wavefront. At time t = 135 ms, a second S2 stimulus of strength Itr = 100 μA/cm³ was applied for a duration of 2 ms in a small region of 0.8 cm radius around the point (0.0281, 0.5744, 1.1987). The action potential characteristics on the RV domain are given in Table 8. In this test case, the coupled and the decoupled approach lead to different values in all listed action potential characteristics. The differences between the decoupled and the coupled values are larger for the 3D realistic domain including the fiber orientation than for the 2D slab. As a consequence, the two methods produce different reentry behavior in the 3D domain. The time course of the membrane potential (v) at the spatial point (−0.81, 0.61, −0.77) in Ω for the two approaches is depicted in Fig. 12 and shows that the action potentials occur earlier in the coupled approach than in the decoupled one. The 3D snapshots of the transmembrane voltage v at different time instances are shown in Fig. 13; the excitation wavefront is faster for the coupled approach than for the decoupled one, and by time t = 600 ms the two approaches lead to clearly different solutions.
In this test case, the decoupled approach took 1 h 24 min on 84 cores and was only 1.89 times faster than the coupled approach. We made similar observations in the case of the TP06 and ORd11 membrane models.

5. Conclusion

The presented compile-time sparse matrix (CTSM) approach allows an efficient storage of the global FEM CRS matrix and an effective handling of the several species of the corresponding system of reaction-diffusion equations. The memory ratio of the dense matrix to the CTSM is 2.67 for the LRI model, 4.62 for the TP06 model and about 7.0 for the ORd11 model. It is evident that, given the sheer number of solution species, the storage requirement of the CTSM implementation is much lower than for the dense matrix implementation. Additionally, the computational times are significantly reduced, and the speedup for the linear solver using the CTSM approach is substantial if a higher number of state variables is present in the simulation. Our numerical experiments show that the total speedup of the CTSM over the DM approach is roughly 1.5 for the LRI model, 2.5 for the TP06 model and approximately 3.75 for the ORd11 model. Also, a good parallel efficiency is obtained up to 1792 cores for all cell models. By using the CTSM technique, a large class of cell models, e.g. the LRI, TP06 and ORd11 models, can be solved efficiently in a coupled approach. The presented numerical results show that the L² errors of the coupled method are smaller than those of the decoupled method and that the AP characteristics differ slightly between the two approaches. Moreover, we remark that the numerical results obtained with the coupled approach are qualitatively different from those of the decoupled approach in the case of cardiac reentry simulations. Our numerical experiments strongly suggest the usage of the coupled approach for the simulation of pathological scenarios. Since our CTSM technique is a generalization of the one used in the FEM toolbox Dune-PDELab [4], it can immediately be applied to a large variety of reaction-diffusion problems. In the future, we plan to extend the presented numerical techniques to solve the bidomain models on 3D realistic geometries with inclusion of anisotropic conductivity coefficients and fiber rotation.

Acknowledgments

We greatly acknowledge the support for high-performance computing time at the bwUniCluster, which is funded by the Ministry of Science, Research and Arts of the State of Baden-Württemberg, Germany. NC gratefully acknowledges the financial support from the SERB, Department of Science and Technology, India (EMR/2017/000664 and MTR/2017/000598). Also we acknowledge our student Mr. Athish for his help with the CTSM implementation.

References

[1] CellML Model Repository, http://models.cellml.org/cellml.
[2] T. Austin, M.L. Trew, A. Pullan, Solving the cardiac bidomain equations for discontinuous conductivities, IEEE Trans. Biomed. Eng. 53 (7) (2006) 1265–1272, doi:10.1109/TBME.2006.873750.
[3] P. Bastian, M. Blatt, A. Dedner, C. Engwer, R. Klöfkorn, R. Kornhuber, M. Ohlberger, O. Sander, A generic grid interface for parallel and adaptive scientific computing. Part II: implementation and tests in DUNE, Computing 82 (2) (2008) 121–138, doi:10.1007/s00607-008-0004-9.
[4] P. Bastian, F. Heimann, S. Marnach, Generic implementation of finite element methods in the distributed and unified numerics environment (DUNE), Kybernetika 46 (2) (2010) 294–315.
[5] M. Boulakia, M.A. Fernández, J.-F. Gerbeau, N. Zemzemi, A coupled system of PDEs and ODEs arising in electrocardiograms modeling, Appl. Math. Res. eXpress 2008 (2008), doi:10.1093/amrx/abn002.
[6] Y. Bourgault, Y. Coudière, C. Pierre, Existence and uniqueness of the solution for the bidomain model used in cardiac electrophysiology, Nonlinear Anal. Real World Appl. 10 (1) (2009) 458–482.
[7] N. Chamakuri, Parallel and space-time adaptivity for the numerical simulation of cardiac action potentials, Appl. Math. Comput. 353 (2019) 406–417, doi:10.1016/j.amc.2019.01.063.
[8] N. Chamakuri, K. Kunisch, G. Plank, Optimal control approach to termination of re-entry waves in cardiac electrophysiology, J. Math. Biol. (2013) 1–30, doi:10.1007/s00285-012-0557-2.
[9] P. Colli Franzone, L.F. Pavarino, S. Scacchi, A comparison of coupled and uncoupled solvers for the cardiac bidomain model, ESAIM: M2AN 47 (4) (2013) 1017–1035, doi:10.1051/m2an/2012055.
[10] P. Colli Franzone, L.F. Pavarino, S. Scacchi, Mathematical Cardiac Electrophysiology, Springer, 2014.
[11] P. Colli Franzone, L.F. Pavarino, B. Taccardi, Simulating patterns of excitation, repolarization and action potential duration with cardiac bidomain and monodomain models, Math. Biosci. 197 (1) (2005) 35–66, doi:10.1016/j.mbs.2005.04.003.
[12] M. Ethier, Y. Bourgault, Semi-implicit time-discretization schemes for the bidomain model, SIAM J. Numer. Anal. 46 (5) (2008) 2443–2468, doi:10.1137/070680503.
[13] D.A. Hooks, K.A. Tomlinson, S.G. Marsden, I.J. LeGrice, B.H. Smaill, A.J. Pullan, P.J. Hunter, Cardiac microstructure: implications for electrical propagation and defibrillation in the heart, Circ. Res. 91 (4) (2002) 331–338, doi:10.1161/01.RES.0000031957.70034.89.
[14] J. Lang, D. Teleaga, Towards a fully space-time adaptive FEM for magnetoquasistatics, IEEE Trans. Magn. 44 (6) (2008) 1238–1241, doi:10.1109/TMAG.2007.914837.
[15] C. Luo, Y. Rudy, A model of the ventricular cardiac action potential: depolarization, repolarization, and their interaction, Circ. Res. 68 (1991) 1501–1526.
[16] M.C. MacLachlan, J. Sundnes, R.J. Spiteri, A comparison of non-standard solvers for ODEs describing cellular reactions in the heart, Comput. Methods Biomech. Biomed. Eng. 10 (5) (2007) 317–326.
[17] M. Murillo, X.-C. Cai, A fully implicit parallel algorithm for simulating the non-linear electrical activity of the heart, Numer. Linear Algebra Appl. 11 (2–3) (2004) 261–277, doi:10.1002/nla.381.
[18] T. O'Hara, L. Virág, A. Varró, Y. Rudy, Simulation of the undiseased human cardiac ventricular action potential: model formulation and experimental validation, PLoS Comput. Biol. 7 (5) (2011) 1–29, doi:10.1371/journal.pcbi.1002061.
[19] P. Pathmanathan, M.O. Bernabeu, R. Bordas, J. Cooper, A. Garny, J.M. Pitt-Francis, J.P. Whiteley, D.J. Gavaghan, A numerical guide to the solution of the bidomain equations of cardiac electrophysiology, Prog. Biophys. Mol. Biol. 102 (2) (2010) 136–155, doi:10.1016/j.pbiomolbio.2010.05.006.
[20] L.F. Pavarino, S. Scacchi, Multilevel additive Schwarz preconditioners for the bidomain reaction-diffusion system, SIAM J. Sci. Comput. 31 (2008) 420–443, doi:10.1137/070706148.
[21] G. Plank, R.A. Burton, P. Hales, M. Bishop, T. Mansoori, M.O. Bernabeu, A. Garny, A.J. Prassl, C. Bollensdorff, F. Mason, F. Mahmood, B. Rodriguez, V. Grau, J.A.E. Schneider, D. Gavaghan, P. Kohl, Generation of histo-anatomically representative models of the individual heart: tools and application, Philos. Trans. Royal Soc. A: Math. Phys. Eng. Sci. 367 (1896) (2009) 2257–2292, doi:10.1098/rsta.2009.0056.
[22] G. Plank, M. Liebmann, R.W. dos Santos, E. Vigmond, G. Haase, Algebraic multigrid preconditioner for the cardiac bidomain model, IEEE Trans. Biomed. Eng. 54 (4) (2007) 585–596.
[23] Z. Qu, A. Garfinkel, An advanced algorithm for solving partial differential equation in cardiac conduction, IEEE Trans. Biomed. Eng. 46 (9) (1999) 1166–1168, doi:10.1109/10.784149.
[24] J.A. Southern, G. Plank, E.J. Vigmond, J.P. Whiteley, Solving the coupled system improves computational efficiency of the bidomain equations, IEEE Trans. Biomed. Eng. 56 (10) (2009) 2404–2412, doi:10.1109/TBME.2009.2022548.
[25] J. Sundnes, G.T. Lines, X. Cai, B.F. Nielsen, K.-A. Mardal, A. Tveito, Computing the Electrical Activity in the Heart, Springer, 2006.
[26] J. Sundnes, G.T. Lines, A. Tveito, An operator splitting method for solving the bidomain equations coupled to a volume conductor model for the torso, Math. Biosci. 194 (2) (2005) 233–248, doi:10.1016/j.mbs.2005.01.001.
[27] J.A. Trangenstein, C. Kim, Operator splitting and adaptive mesh refinement for the Luo-Rudy I model, J. Comput. Phys. 196 (2) (2004) 645–679, doi:10.1016/j.jcp.2003.11.014.
[28] N. Trayanova, J. Constantino, T. Ashihara, G. Plank, Modeling defibrillation of the heart: approaches and insights, IEEE Rev. Biomed. Eng. 4 (2011) 89–102, doi:10.1109/rbme.2011.2173761.
[29] K.H.W.J. ten Tusscher, A.V. Panfilov, Alternans and spiral breakup in a human ventricular tissue model, Am. J. Physiol.-Heart Circ. Physiol. 291 (3) (2006) H1088–H1100, doi:10.1152/ajpheart.00109.2006. PMID: 16565318.
[30] M. Veneroni, Reaction-diffusion systems for the macroscopic bidomain model of the cardiac electric field, Nonlinear Anal. Real World Appl. 10 (2) (2009) 849–868, doi:10.1016/j.nonrwa.2007.11.008.
[31] E.J. Vigmond, R. Weber dos Santos, A.J. Prassl, M. Deo, G. Plank, Solvers for the cardiac bidomain equations, Prog. Biophys. Mol. Biol. 96 (1–3) (2008) 3–18.
[32] H.A. van der Vorst, Bi-CGSTAB: a fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems, SIAM J. Sci. Stat. Comput. 13 (1994) 631–644.
[33] J.P. Whiteley, An efficient numerical technique for the solution of the monodomain and bidomain equations, IEEE Trans. Biomed. Eng. 53 (11) (2006) 2139–2147, doi:10.1109/TBME.2006.879425.