Applied Mathematics and Computation 339 (2018) 899–915
Highlighting numerical insights of an efficient SPH method

E. Francomano*, M. Paliaga

University of Palermo, Polytechnic School, Department of Industrial and Digital Innovation, Viale delle Scienze, Ed. 6, 90128 Palermo, Italy; Istituto Nazionale di Alta Matematica, Gruppo Nazionale per il Calcolo Scientifico, Italy

*Corresponding author: [email protected] (E. Francomano). https://doi.org/10.1016/j.amc.2018.07.060
Dedicated to Maryam Mirzakhani

Keywords: Kernel based methods; Smoothed Particle Hydrodynamics; Accuracy; Convergence; Improved fast Gaussian transform
Abstract

In this paper we focus on two sources of enhancement, in accuracy and in computational demand, in approximating a function and its derivatives by means of the Smoothed Particle Hydrodynamics method. The approximating power of the standard method is known to be poor, and improvements can be gained by making use of the Taylor series expansion of the kernel approximation of the function and its derivatives. The modified formulation is appealing, providing more accurate results for the function and its derivatives simultaneously, without changing the kernel function adopted in the computation. The request for greater accuracy requires kernel function derivatives of order up to the desired accuracy order in approximating the function, or higher for the derivatives. In this paper we discuss the scheme with the infinitely differentiable Gaussian kernel function. Studies on the accuracy, convergence and computational effort with various sets of data sites are provided. Moreover, to make large scale problems tractable, the improved fast Gaussian transform is considered, keeping the computational cost at an acceptable level while preserving the accuracy of the computation. © 2018 Elsevier Inc. All rights reserved.
1. Introduction

In recent years, meshless methods have gained growing interest in many different areas of science [2,4,8,9,21,39]. The basic idea of these methods is to provide numerical solutions without using any mesh in the problem domain. Methods without predefined connectivity are easily adapted to domains with complex and/or time-evolving geometries, avoiding the difficulties that handling those features with topological data structures would entail. They can be useful in nonlinear problems involving viscous fluids, heat and mass transfer, linear and non-linear elastic or plastic deformations, etc. In the Lagrangian approach the points describing the problem domain move with the medium, and points may be added or deleted in order to maintain a prescribed sampling density. In the Eulerian approach the points are fixed in space, but new points may be added where there is need for increased accuracy. So, in both approaches the nearest neighbors of a point are not fixed. Numerical simulations usually need the values of a function and its derivatives at certain points, and in this paper we focus on their approximation by means of the Smoothed Particle Hydrodynamics (SPH) method. This method was originally developed for solving astrophysical problems [16,17,29–33] and has subsequently been used in other areas of science and engineering [1,3,5,12–14,22–24,26,33,36,37,41]. The method is very attractive, but it suffers from several drawbacks due to inaccurate approximation at boundaries and at irregular interior regions. This often confines its utility to limited scenarios. Many techniques have been devised to alleviate these problems and some of these have been documented
in [6,25,27,28] and in the references therein. In this paper we discuss sources of enhancement in accuracy of the standard SPH method via Taylor expansion of the kernel approximation of a function and its derivatives [27]. In this way, accurate estimates of the function and its derivatives are simultaneously provided, and no lower order derivatives are involved in approximating the higher order ones. Therefore, possible numerical errors in a lower order derivative are not propagated to the higher order ones. Moreover, a high order of accuracy can be obtained without changing the kernel function, avoiding unphysical results, such as negative density or negative energy, that can lead to breakdown of the entire computation in simulating some problems [23]. Accuracy can be increased in approximating a function or its derivatives by employing kernel function derivatives of order up to the desired accuracy order for the function, or higher when more accurate results for the derivatives of the function are requested. The Gaussian kernel function, infinitely differentiable and adequately smooth even for higher order derivatives, is therefore a suitable choice. In our study we propose numerical investigations of the standard and improved method with the Gaussian kernel. Many experiments are conducted with the aim of addressing the numerical features of the method with various data set locations, gridded and scattered in a unit square domain, referring to bivariate test functions [35,40]. The modified approach is very interesting for applications, but its computational demand becomes a bottleneck as data locations get finer and the number of evaluation points grows. The computational effort is mainly due to the summations of the kernel and its derivatives in assembling the matrix of the solving system for each evaluation point. Motivated by the need to speed up the computation and to make large scale problems tractable, we focus on efficient processing of this task. Working with the Gaussian kernel, the derivatives involve sums of products of polynomials with the Gaussian itself. This allows us to take advantage of fast algorithms [19] for all the summations required by the proposed strategy. Namely, we consider the improved fast Gaussian transform [34], keeping the computational cost at an acceptable level while preserving the accuracy of the computation. Furthermore, the matrix and known-vector assembly is organized by exploiting the Gaussian function as the common element of the fundamental tasks. The overall computational work becomes linear for a fixed level of accuracy. We present the computations with the direct and the improved fast summation algorithm, showing satisfactory results for a bivariate case study. The remainder of the paper is as follows. In Section 2 we present a review of the standard formulation. In Section 3 we describe the improved method, supported by numerical simulations for some test functions in a unit square domain; some discussions on the errors versus the number of data are reported for different orders of accuracy and different data sets. Section 4 is devoted to computational topics, presenting the direct and the fast sum computation via the Improved Fast Gaussian Transform adapted to our purposes. In Section 5 the results presented in the paper are briefly summarized.
2. Ab initio formulation

To make the paper self-contained we briefly review the SPH standard formalism from first principles. The method makes use of a kernel approximation, using ideas from distribution theory to approximate a function through a delta distribution representation [15].

Definition 1. Let $f : \Omega \subset \mathbb{R}^d \to \mathbb{R}$, $d \ge 1$. The kernel approximation is defined as

$$\langle f_h(x) \rangle := \int_\Omega f(\xi)\, K(x,\xi;h)\, d\Omega \qquad (1)$$
with $x = (x^{(1)},\dots,x^{(d)})$, $\xi = (\xi^{(1)},\dots,\xi^{(d)}) \in \Omega$. The function $K(x,\xi;h)$ is named kernel function and $h$ is the smoothing length. The parameter $h$ localizes the influence of the kernel function, which approximates a Dirac $\delta$-function in the limit $h \to 0$. The kernel is usually normalized to unity and is required to be symmetric and sufficiently smooth. Under these assumptions the error of the kernel approximation can be estimated as of second order of accuracy, or of first order of consistency [16,17,23]. Any function $K(x,\xi;h)$ with the above properties can be employed as smoothing kernel function. A common choice is the Gaussian function

$$K(x,\xi;h) = \frac{1}{h^d \sqrt{\pi^d}}\, e^{-\frac{\|\xi - x\|_2^2}{h^2}}. \qquad (2)$$
The kernel (2) clearly decays as $x$ moves away from $\xi$, and the dimensional constant $\alpha_d = 1/(h^d \sqrt{\pi^d})$ is chosen to satisfy the unity condition requirement [23]. Moreover, it is an infinitely differentiable, radial and strictly positive definite function on $\mathbb{R}^d$ for any $d$ [7,10,11,42]. This function will be taken as the kernel from now on. When the entire domain is represented by a finite number of data sites, we proceed in the approximation as follows.
Definition 2. Given a set of data sites $\Xi = \{\xi_j\}_{j=1}^N \subset \Omega$ and the corresponding measurements $\{y_j = f(\xi_j)\}_{j=1}^N \in \mathbb{R}$, the particle approximation of the function is defined as

$$f_h(x) := \sum_{j=1}^N f(\xi_j)\, K(x,\xi_j;h)\, d\Omega_j, \qquad (3)$$

where $d\Omega_j$ is the measure of the subdomain $\Omega_j$ associated with each data site. The triple $(K, \Xi, h)$ essentially characterizes the approximation.
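To fix ideas, the following is a minimal sketch of the particle approximation (3) with the Gaussian kernel (2). It is written in Python/NumPy rather than in the MATLAB environment used for the experiments in this paper; the function names are illustrative, and the uniform cell measures are an assumption of the sketch (the paper adopts Voronoi cell measures $d\Omega_j$).

```python
# A minimal sketch of the particle approximation (3) with the Gaussian
# kernel (2); illustrative names, uniform cell measures assumed.
import numpy as np

def gaussian_kernel(x, xi, h):
    """Gaussian kernel (2): K(x, xi; h) = exp(-||xi - x||^2 / h^2) / (h^d sqrt(pi^d))."""
    d = x.shape[-1]
    r2 = np.sum((xi - x) ** 2, axis=-1)
    return np.exp(-r2 / h ** 2) / (h ** d * np.pi ** (d / 2))

def sph_approx(x, xi, f_xi, dOmega, h):
    """Particle approximation (3): f_h(x) = sum_j f(xi_j) K(x, xi_j; h) dOmega_j."""
    K = gaussian_kernel(x[:, None, :], xi[None, :, :], h)   # (M, N) kernel matrix
    return K @ (f_xi * dOmega)

# Example on gridded data in the unit square for test function (4a).
n = 17                                     # 289 = (2^4 + 1)^2 sites
g = np.linspace(0.0, 1.0, n)
X1, X2 = np.meshgrid(g, g)
xi = np.column_stack([X1.ravel(), X2.ravel()])
f = lambda p: 16 * p[:, 0] * p[:, 1] * (1 - p[:, 0]) * (1 - p[:, 1])
dOmega = np.full(len(xi), 1.0 / len(xi))   # uniform measures; the paper uses Voronoi cells
fh = sph_approx(xi, xi, f(xi), dOmega, h=0.05)
```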
Table 1
MAE, RMSE, MEAN error with h. N = 1089. Test function (4a).

                MAE      RMSE     MEAN     h
Gridded data    0.0302   0.0104   0.0072   0.0214
Halton data     0.0753   0.0214   0.0162   0.0680
Sobol data      0.0624   0.0184   0.0124   0.0560
Random data     0.2157   0.0353   0.0226   0.0570
2.1. Some numerical behaviors of the standard SPH

In the following we discuss the method through various numerical experiments referring to the bivariate functions (4), depicted in Fig. 1, originally proposed in the ACM Transactions on Mathematical Software test packages [35,40]:
$$f(x^{(1)},x^{(2)}) = 16\, x^{(1)} x^{(2)} (1 - x^{(1)})(1 - x^{(2)}), \qquad (4a)$$

$$f(x^{(1)},x^{(2)}) = \frac{1}{9}\tanh\!\big(9(x^{(2)} - x^{(1)}) + 1\big), \qquad (4b)$$

$$f(x^{(1)},x^{(2)}) = \frac{1.25 + \cos(5.4\, x^{(2)})}{6 + 6\,(3x^{(1)} - 1)^2}. \qquad (4c)$$
We consider the Gaussian kernel and different data sets as centers of the kernel, taken in number following the progression $(2^n + 1)^2$. Gridded, Halton, Sobol and random points are taken as data sequences, and we refer to these sets as G, H, S, R. The second and third ones are available in the MATLAB® Statistics and Machine Learning Toolbox as haltonset and sobolset, respectively; the former was introduced by Halton [20] and the latter by Sobol [38]. The random points are generated by means of the random function of MATLAB®. The evaluation points $\{x_1,\dots,x_M\}$ lie on a regular mesh laid out over the computational domain. For all the simulations $M = 1600$, $\Omega = [0,1]^2$, and the Voronoi tessellation is adopted to define the $\Omega_j$. In Fig. 2 we show N = 289 data sites for the four data sequences G, H, S, R together with the M evaluation points.
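As an indication of how the four data sequences could be generated, the sketch below uses the quasi-random generators of scipy.stats.qmc in place of MATLAB's haltonset and sobolset; the helper name data_sites and the seeding are illustrative assumptions.

```python
# A sketch of the four data sequences G, H, S, R in [0,1]^2; the paper
# uses MATLAB's haltonset, sobolset and random instead.
import numpy as np
from scipy.stats import qmc

def data_sites(N, kind="gridded", seed=0):
    """Return N points in the unit square (exactly N = (2^n + 1)^2 for gridded)."""
    if kind == "gridded":
        n = int(round(np.sqrt(N)))
        g = np.linspace(0.0, 1.0, n)
        X1, X2 = np.meshgrid(g, g)
        return np.column_stack([X1.ravel(), X2.ravel()])
    if kind == "halton":
        return qmc.Halton(d=2, seed=seed).random(N)
    if kind == "sobol":
        return qmc.Sobol(d=2, seed=seed).random(N)
    return np.random.default_rng(seed).random((N, 2))   # random data

N = (2 ** 4 + 1) ** 2   # 289, one term of the progression (2^n + 1)^2
sets = {k: data_sites(N, k) for k in ("gridded", "halton", "sobol", "random")}
```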
The following formulas are adopted to estimate the accuracy of the solution:

$$\mathrm{MAE} := \max_{1 \le i \le M} |f_h(x_i) - f(x_i)|, \qquad (5)$$

$$\mathrm{RMSE} := \sqrt{\frac{\sum_{i=1}^M |f_h(x_i) - f(x_i)|^2}{M}}, \qquad (6)$$

$$\mathrm{MEAN} := \frac{\sum_{i=1}^M |f_h(x_i) - f(x_i)|}{M}, \qquad (7)$$

and the errors are plotted on a loglog scale throughout the paper.
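The error measures (5)-(7) translate directly into code; the following is a direct transcription, with illustrative names.

```python
# Error measures (5)-(7) as used throughout the paper.
import numpy as np

def errors(fh, fx):
    """MAE, RMSE and MEAN error between approximation fh and exact values fx."""
    e = np.abs(fh - fx)
    mae = e.max()                     # (5)
    rmse = np.sqrt(np.mean(e ** 2))   # (6)
    mean = e.mean()                   # (7)
    return mae, rmse, mean
```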
In this section the discussion focuses on the test function (4a). First of all we give evidence of the influence of h on the approximation. To this aim we perform a series of experiments varying the smoothing length h for the data sets of Fig. 2; in Fig. 3 we plot the MAE, fixing the number of data to N = 1089. In Table 1 we report the values of h and the MAE, RMSE, MEAN errors for G, H, S, R with N = 1089. The standard method usually does not yield satisfactory results, and increasing the data density improves the accuracy only slightly for the first few refinements, as we observe in Fig. 4. The approximation of the derivatives via standard SPH is also not good enough. We test the approximation on a uniform grid with increasing data density in the unit square domain, and the results are displayed in Fig. 5. Analogous conclusions are reached with the data sets H, S, R and when approximating the derivatives along the $x^{(2)}$ direction. The numerical simulations show that the approximation with the standard particle formulation is poor and does not match the second order of accuracy claimed for the kernel approximation. This is mainly due to the discrepancy between the kernel and the particle approximation. The SPH discretization refers to a set of data for which the conditions valid for the kernel approximation are not always preserved, for instance with data near the boundary of the problem domain or with irregular data distributions, owing to the unbalanced contributions to the discretized summations [16,17,23]. With the goal of improving the approximation of the function and its derivatives, we consider the Taylor expansion of the function, projected with the kernel function and its derivatives and integrated over the problem domain [27]. This procedure is appealing because it improves the accuracy without changing the kernel function. Moreover, the function and its derivatives are simultaneously computed, for each evaluation point, with the desired order of accuracy, and high order derivatives are generated independently of the lower order ones. In the next section we discuss the basic idea of the improved approach coupled with some validation results.
Fig. 1. Test functions (left) and their first derivatives (right) used in the numerical experiments.
Fig. 2. The circles are the set of N = 289 data sites $\xi_j$; the crosses are the set of M = 1600 evaluation points $x_i$.
Fig. 3. Maximum absolute error vs. h for N = 1089 data sites in G, H, S, R.
3. Enhancing approximation

To ensure the first order of accuracy for the kernel approximation we consider the Taylor expansion of $f(\xi)$, retaining only the first term, multiplying by the kernel function and integrating over $\Omega$ with respect to the variable $\xi$:

$$\int_\Omega f(\xi) K(x,\xi;h)\, d\Omega = \int_\Omega f(x) K(x,\xi;h)\, d\Omega + \int_\Omega O(h)\, K(x,\xi;h)\, d\Omega, \qquad (8)$$

$$f(x) = \frac{\int_\Omega f(\xi) K(x,\xi;h)\, d\Omega}{\int_\Omega K(x,\xi;h)\, d\Omega} + O(h). \qquad (9)$$
Fig. 4. Error versus the number of data in G, H, S, R. Test function (4a).
Fig. 5. Error versus the number of data in G for $D_{x^{(1)}} f$. Test function (4a).
The corresponding discrete formulation is

$$f(x) = \frac{\sum_{j=1}^N f(\xi_j) K(x,\xi_j;h)\, d\Omega_j}{\sum_{j=1}^N K(x,\xi_j;h)\, d\Omega_j} + O(h). \qquad (10)$$
The requirement of the unity condition for the kernel function corresponds to assuring the first order of accuracy, or 0th order of consistency (k = 0), for the kernel approximation, but it is not always assured for the discrete counterpart. So we proceed in the approximation by adopting (10) instead of (3) to improve the results. Numerical simulations, summarized in Fig. 6, are performed referring to the test function (4a) with the data sets G, H, S, R and increasing the density of the data in the unit square domain.
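A minimal sketch of the corrected approximation (10): the same kernel sums as in (3), normalized by the discrete counterpart of the unity condition. Python/NumPy is used for illustration and the names are assumptions.

```python
# The corrected approximation (10): restoring 0th order consistency by
# normalizing with the discrete unity condition.
import numpy as np

def sph_approx_k0(x, xi, f_xi, dOmega, h):
    """Normalized particle approximation (10)."""
    d = x.shape[-1]
    r2 = np.sum((x[:, None, :] - xi[None, :, :]) ** 2, axis=-1)
    K = np.exp(-r2 / h ** 2) / (h ** d * np.pi ** (d / 2))   # Gaussian kernel (2)
    num = K @ (f_xi * dOmega)          # numerator of (10)
    den = K @ dOmega                   # discrete unity condition
    return num / den
```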
Fig. 6. k = 0. Error versus the number of data in G, H, S, R. Test function (4a).
Table 2
k = 0. MAE, RMSE, MEAN error for data in G. Test function (4a). Data used in Fig. 6(a).

N       MAE       Rate     RMSE      Rate     MEAN      Rate     h
9       0.5097    –        0.2178    –        0.1849    –        0.2357
25      0.3556    0.7046   0.1203    1.1627   0.0923    1.3603   0.1414
81      0.2136    0.8672   0.0592    1.2047   0.0368    1.5642   0.0786
289     0.1175    0.9402   0.0283    1.1629   0.0136    1.5640   0.0415
1089    0.0616    0.9723   0.0142    1.0435   0.0052    1.4407   0.0214
4225    0.0316    0.9865   0.0072    0.9898   0.0026    1.0399   0.0109
16641   0.0160    0.9950   0.0037    0.9733   0.0013    1.0134   0.0055
66049   0.0080    0.9968   0.0018    0.9971   6.40e−04  1.0046   0.0028
The comparison between the plots in Fig. 4 and in Fig. 6 gives evidence of the improvements in the approximation. In Tables 2 and 3, the errors and the convergence rates are reported for the gridded and random data sets. The convergence rate is computed using the formula

$$\mathrm{rate} = \frac{\log(e_{n-1}/e_n)}{\log(\theta_{n-1}/\theta_n)}, \qquad n \ge 2,$$

with $e_{n-1}$ and $e_n$ the errors for two subsequent data sequences, and $\theta_{n-1}$ and $\theta_n$ fixed respectively as the fill distance [7,10,11,42]

$$\theta = \sup_{x \in \Omega}\, \min_{\xi_j \in \Xi}\, \|\xi_j - x\|_2 \qquad (11)$$
for the two data sets. The results confirm the theoretical assumptions, giving evidence of a less accurate approximation for data in R. In Fig. 7 we summarize the maximum absolute error for the test functions (4b) and (4c) for the different data sequences G, H, S, R.
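The fill distance (11) and the rate formula can be computed as in the sketch below, where the sup over $\Omega$ is replaced by a discrete proxy over the evaluation points; this discretization is our assumption, not a prescription of the paper.

```python
# Fill distance (11), estimated on the evaluation grid, and the
# convergence-rate formula above.
import numpy as np

def fill_distance(x_eval, xi):
    """theta = sup_x min_j ||xi_j - x||_2, with the sup taken over x_eval."""
    d2 = np.sum((x_eval[:, None, :] - xi[None, :, :]) ** 2, axis=-1)
    return np.sqrt(d2.min(axis=1).max())

def conv_rate(e_prev, e_n, theta_prev, theta_n):
    """rate = log(e_{n-1}/e_n) / log(theta_{n-1}/theta_n), n >= 2."""
    return np.log(e_prev / e_n) / np.log(theta_prev / theta_n)
```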
Table 3
k = 0. MAE, RMSE, MEAN error for data in R. Test function (4a). Data used in Fig. 6(d).

N       MAE       Rate      RMSE      Rate     MEAN      Rate     h
9       0.6812    –         0.2986    –        0.2552    –        0.4398
25      0.7022    −0.1035   0.2136    1.1417   0.1751    1.2845   0.3280
81      0.3457    0.9830    0.1170    0.8355   0.0949    0.8499   0.1595
289     0.2759    0.4228    0.0642    1.1262   0.0454    1.3813   0.0936
1089    0.1643    1.4872    0.0398    1.3698   0.0255    1.6619   0.0661
4225    0.0786    1.1075    0.0183    1.1683   0.0091    1.5523   0.0340
16641   0.0398    0.6855    0.0071    0.9538   0.0030    1.1179   0.0126
66049   0.0229    1.0415    0.0041    1.0440   0.0015    1.2564   0.0074
Fig. 7. k = 0. MAE versus the number of data in G, H, S, R. (a) Test function (4b) and (b) test function (4c).
3.1. Higher order of accuracy

Now we extend the idea presented in the previous section to generate methods with higher approximation order, taking into consideration the Taylor series expansion of the function $f(\xi)$ up to the order $k$:

$$f(\xi) = \sum_{|\alpha| \le k} \frac{1}{\alpha!} (\xi - x)^\alpha D^\alpha f(x) + O(h^{k+1}), \qquad (12)$$

where $\alpha = (\alpha_1, \alpha_2, \dots, \alpha_d) \in \mathbb{N}^d$ is a multi-index with $|\alpha| = \sum_{i=1}^d \alpha_i$, $\alpha! = \alpha_1! \alpha_2! \cdots \alpha_d!$, and the differential operator is defined as

$$D^\alpha := \frac{\partial^{|\alpha|}}{(\partial x^{(1)})^{\alpha_1} \cdots (\partial x^{(d)})^{\alpha_d}}.$$
Hence, (12) is multiplied by the kernel function and its derivatives up to the order $k$ and integrated over $\Omega$ with respect to the variable $\xi$:

$$\int_\Omega f(\xi) K(x,\xi;h)\, d\Omega = \sum_{|\alpha| \le k} \frac{1}{\alpha!} \int_\Omega (\xi - x)^\alpha D^\alpha f(x)\, K(x,\xi;h)\, d\Omega + \int_\Omega O(h^{k+1}) K(x,\xi;h)\, d\Omega$$

$$\vdots$$

$$\int_\Omega f(\xi) D^k K(x,\xi;h)\, d\Omega = \sum_{|\alpha| \le k} \frac{1}{\alpha!} \int_\Omega (\xi - x)^\alpha D^\alpha f(x)\, D^k K(x,\xi;h)\, d\Omega + \int_\Omega O(h^{k+1}) D^k K(x,\xi;h)\, d\Omega.$$

In linear algebra notation, neglecting the error, the improved formulation corresponds to the point-wise linear systems

$$\langle A^{(k)} \rangle\, c^{(k)} = \langle b^{(k)} \rangle \qquad (13)$$
where

$$\langle A^{(k)} \rangle = \begin{pmatrix} \int_\Omega K(x,\xi;h)\, d\Omega & \cdots & \frac{1}{k!} \int_\Omega (\xi^{(d)} - x^{(d)})^k K(x,\xi;h)\, d\Omega \\ \vdots & \ddots & \vdots \\ \int_\Omega D^k K(x,\xi;h)\, d\Omega & \cdots & \frac{1}{k!} \int_\Omega (\xi^{(d)} - x^{(d)})^k D^k K(x,\xi;h)\, d\Omega \end{pmatrix},$$

$$c^{(k)} = \begin{pmatrix} f(x) \\ \vdots \\ D^k_{x^{(d)}} f(x) \end{pmatrix}, \qquad \langle b^{(k)} \rangle = \begin{pmatrix} \int_\Omega f(\xi) K(x,\xi;h)\, d\Omega \\ \vdots \\ \int_\Omega f(\xi) D^k K(x,\xi;h)\, d\Omega \end{pmatrix},$$
where the notation $D^k_{x^{(d)}}$ indicates that the operator is applied with the multi-index having $\alpha_d = k$. We are now ready to propose the discrete formulation for the estimates of the function and its derivatives up to order $k$ at the evaluation point $x$:
$$A^{(k)} = \begin{pmatrix} \sum_{j=1}^N K(x,\xi_j;h)\, d\Omega_j & \cdots & \frac{1}{k!} \sum_{j=1}^N (\xi_j^{(d)} - x^{(d)})^k K(x,\xi_j;h)\, d\Omega_j \\ \vdots & \ddots & \vdots \\ \sum_{j=1}^N D^k K(x,\xi_j;h)\, d\Omega_j & \cdots & \frac{1}{k!} \sum_{j=1}^N (\xi_j^{(d)} - x^{(d)})^k D^k K(x,\xi_j;h)\, d\Omega_j \end{pmatrix},$$

$$c^{(k)} = \begin{pmatrix} f(x) \\ \vdots \\ D^k_{x^{(d)}} f(x) \end{pmatrix}, \qquad b^{(k)} = \begin{pmatrix} \sum_{j=1}^N f(\xi_j) K(x,\xi_j;h)\, d\Omega_j \\ \vdots \\ \sum_{j=1}^N f(\xi_j) D^k K(x,\xi_j;h)\, d\Omega_j \end{pmatrix}.$$
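As an illustration of the point-wise system (13), the following sketch assembles and solves the 3 × 3 system for k = 1, d = 2, returning $f(x)$ and its two first derivatives simultaneously; the routine name and the NumPy formulation are illustrative assumptions.

```python
# A minimal sketch of the point-wise system (13) for k = 1, d = 2,
# solved per evaluation point; Gaussian kernel (2) assumed.
import numpy as np

def improved_sph_k1(x, xi, f_xi, dOmega, h):
    """Return [f(x), D_{x^(1)}f(x), D_{x^(2)}f(x)] from the 3x3 system (13)."""
    d = 2
    diff = xi - x                                       # xi_j - x, shape (N, 2)
    K = np.exp(-np.sum(diff**2, axis=1) / h**2) / (h**d * np.pi**(d / 2))
    DK = (2.0 / h**2) * diff * K[:, None]               # D_{x^(1)}K and D_{x^(2)}K
    A = np.empty((3, 3)); b = np.empty(3)
    rows = (K, DK[:, 0], DK[:, 1])                      # kernel and derivative rows
    cols = (np.ones(len(xi)), diff[:, 0], diff[:, 1])   # moments 1, (xi - x)^(1), (xi - x)^(2)
    for r, Kr in enumerate(rows):
        for c, mc in enumerate(cols):
            A[r, c] = np.sum(mc * Kr * dOmega)          # entries of A^(1)
        b[r] = np.sum(f_xi * Kr * dOmega)               # entries of b^(1)
    return np.linalg.solve(A, b)
```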
The described process provides values for the function and its derivatives that are accurate of order k+1 for the function f and of order (k+1)−p for the derivatives of order p. To give evidence of this assertion, we consider the error vector
$$e^{(k)} = \begin{pmatrix} O(h^{k+1}) \sum_{j=1}^N K(x,\xi_j;h)\, d\Omega_j \\ \vdots \\ O(h^{k+1}) \sum_{j=1}^N D^k K(x,\xi_j;h)\, d\Omega_j \end{pmatrix}$$
and the system matrix $A^{(k)}$ approximated as

$$A^{(k)} \cong \underbrace{\begin{pmatrix} p_{11} & \cdots & p_{1k} \\ \vdots & \ddots & \vdots \\ p_{k1} & \cdots & p_{kk} \end{pmatrix}}_{P} \begin{pmatrix} 1 & & & \\ & h & & \\ & & \ddots & \\ & & & h^k \end{pmatrix}.$$
Table 4
k = 1. MAE, RMSE for data in G. Test function (4a).

        f                                      D_{x^{(1)}} f
N       MAE       Rate     RMSE      Rate     MAE      Rate     RMSE     Rate     h
9       0.3016    –        0.1424    –        2.7931   –        1.0237   –        0.2357
25      0.1158    1.8745   0.0563    0.1818   1.6832   0.9914   0.5379   1.2598   0.1414
81      0.0366    1.9596   0.0183    1.9120   0.9237   1.0207   0.2512   1.2949   0.0786
289     0.0114    1.8258   0.0052    1.9690   0.4825   1.0211   0.1131   1.2540   0.0415
1089    0.0035    1.8001   0.0014    1.9784   0.2461   1.0145   0.0530   1.1431   0.0214
4225    9.54e−04  1.9052   3.64e−04  1.9909   0.1241   1.0111   0.0268   1.0089   0.0109
16641   2.45e−04  1.9874   9.27e−05  1.9996   0.0629   0.9935   0.0134   1.0134   0.0055
66049   6.39e−05  1.9771   2.33e−05  1.9994   0.0319   0.9855   0.0067   0.9999   0.0028
96721   4.38e−05  1.9857   1.57e−05  2.0526   0.0264   0.9909   0.0056   0.9264   0.0023
Table 5
k = 1. MAE, RMSE for data in R. Test function (4a).

        f                                      D_{x^{(1)}} f
N       MAE       Rate     RMSE      Rate     MAE      Rate     RMSE     Rate     h
9       0.5337    –        0.1925    –        2.8825   –        1.0814   –        0.4398
25      0.2936    2.0375   0.1518    0.8063   2.4489   0.5558   0.8345   0.8838   0.3280
81      0.1295    1.1353   0.0602    1.2815   1.4324   0.7437   0.4485   0.8609   0.1595
289     0.0857    0.7732   0.0235    1.7659   0.9766   0.7185   0.2333   1.2259   0.0936
1089    0.0416    2.0767   0.0116    2.0340   0.7202   0.8744   0.1350   1.5700   0.0661
4225    0.0106    2.0559   0.0031    1.9577   0.3799   0.9611   0.0597   1.2276   0.0340
16641   0.0011    2.2823   4.53e−04  1.9375   0.1711   0.8036   0.0233   0.9478   0.0126
66049   7.73e−04  0.6629   1.54e−04  2.0273   0.0960   1.0858   0.0127   1.1402   0.0074
96721   3.79e−04  2.9256   9.79e−05  1.8595   0.0859   0.4563   0.0103   0.8597   0.0058
Table 6
k = 2. MAE for data in G. Test function (4a).

        f                  D_{x^{(1)}} f       D^2_{x^{(1)}} f
N       MAE       Rate     MAE       Rate      MAE      Rate     h
9       0.2647    –        1.9477    –         4.5084   –        0.2357
25      0.0741    2.4917   0.8781    1.5591    2.9491   0.8306   0.1414
81      0.0146    2.7587   0.3074    1.7855    1.7288   0.9094   0.0786
289     0.0028    2.8822   0.0921    1.8941    0.9434   0.9485   0.0415
1089    3.32e−04  2.9417   0.0253    1.9473    0.4939   0.9756   0.0214
4225    4.44e−05  2.9710   0.0066    1.9737    0.2528   0.9876   0.0109
16641   5.73e−06  2.9934   0.0017    1.9831    0.1279   0.9961   0.0055
66049   7.29e−07  2.9954   4.30e−04  1.9934    0.0643   0.9532   0.0028
96721   4.11e−07  2.9954   2.94e−04  1.9959    0.0532   0.9673   0.0023

So, in computing the solution the error is about

$$(A^{(k)})^{-1} e^{(k)} \cong \begin{pmatrix} 1 & & & \\ & \frac{1}{h} & & \\ & & \ddots & \\ & & & \frac{1}{h^k} \end{pmatrix} P^{-1} \begin{pmatrix} O(h^{k+1}) \sum_{j=1}^N K(x,\xi_j;h)\, d\Omega_j \\ \vdots \\ O(h^{k+1}) \sum_{j=1}^N D^k K(x,\xi_j;h)\, d\Omega_j \end{pmatrix} \cong \begin{pmatrix} O(h^{k+1}) \\ O(h^k) \\ \vdots \\ O(h) \end{pmatrix}.$$
In the following we report some results for the test function (4a) in approximating the function and its derivatives with k = 1 and k = 2. Tables 4 and 5 report the results with k = 1, whilst Tables 6–9 report the results with k = 2. These illustrate that the approximation converges well, almost reaching the rates predicted by the theory. In Fig. 8, fixing N = 1089, we focus on the MAE, comparing SPH and the improved method with k = 0, k = 1 and k = 2. We observe that the error is reduced inside the domain; it is still present on the boundaries, but a significant decrease is obtained with k = 1, 2. We conclude this section presenting the MAE for finer data locations for the three functions (4a), (4b), (4c) with the standard SPH formulation and with the modified one, varying k = 0, 1, 2. In Figs. 9–11 we show the maximum absolute error versus the number of data in G, H, S, R.
Fig. 8. MAE for (a) standard SPH; improved method with (b) k = 0, (c) k = 1, (d) k = 2. N = 1089 data in G. Test function (4a).
Fig. 9. Test function (4a). Comparison of the MAE on the function with the standard SPH formulation and the improved ones with k = 0, 1, 2. (a) Gridded data; (b) Halton data; (c) Sobol data; (d) random data.
Fig. 10. Test function (4b). Comparison of the MAE on the function with the standard SPH formulation and the improved ones with k = 0, 1, 2. (a) Gridded data; (b) Halton data; (c) Sobol data; (d) random data.
Fig. 11. Test function (4c). Comparison of the MAE on the function with the standard SPH formulation and the improved ones with k = 0, 1, 2. (a) Gridded data; (b) Halton data; (c) Sobol data; (d) random data.
Table 7
k = 2. MAE for data in R. Test function (4a).

        f                  D_{x^{(1)}} f       D^2_{x^{(1)}} f
N       MAE       Rate     MAE       Rate      MAE       Rate      h
9       15.5032   –        37.2568   –         68.8620   –         0.4398
25      3.4298    5.1438   31.2706   0.5972    49.2390   1.1437    0.3280
81      0.7152    2.1742   9.7030    1.6229    49.8307   −0.0166   0.1595
289     0.3016    1.6194   3.2224    2.0675    26.2748   1.2004    0.0936
1089    0.0242    7.2410   0.9811    3.4140    5.9107    4.2828    0.0661
4225    0.0016    4.0504   0.3792    1.4286    4.4230    0.4357    0.0340
16641   3.37e−04  1.5692   0.0548    1.9487    3.7019    0.1793    0.0126
66049   3.99e−05  4.0176   0.0083    3.5404    0.8011    2.8826    0.0074
96721   3.00e−05  1.2219   0.0146    −2.3887   0.0146    −47520    0.0058
Table 8
k = 2. RMSE for data in R. Test function (4a).

        f                  D_{x^{(1)}} f       D^2_{x^{(1)}} f
N       RMSE      Rate     RMSE      Rate      RMSE      Rate      h
9       0.6575    –        2.2986    –         3.6311    –         0.4398
25      0.2018    4.0278   1.8702    0.7034    3.5045    0.7034    0.3280
81      0.0379    2.3202   0.5166    1.7841    3.0262    1.7841    0.1595
289     0.0096    2.5729   0.1566    2.2388    1.1947    2.2388    0.0936
1089    0.0012    5.8605   0.0518    3.1739    0.4512    3.1739    0.0661
4225    1.06e−04  3.6944   0.0137    2.0001    0.2175    2.0001    0.0340
16641   1.39e−05  2.0466   0.0027    1.6361    0.1547    0.3432    0.0126
66049   1.73e−06  3.9153   5.60e−04  2.9516    0.0427    2.4187    0.0074
96721   1.37e−06  0.9577   5.77e−04  −0.1200   0.0802    −2.5873   0.0058
Table 9
k = 2. MEAN for data in R. Test function (4a).

        f                  D_{x^{(1)}} f       D^2_{x^{(1)}} f
N       MEAN      Rate     MEAN      Rate      MEAN      Rate      h
9       0.1512    –        0.7934    –         2.2061    –         0.4398
25      0.0749    2.3926   0.7176    0.3425    2.2544    −0.0738   0.3280
81      0.0118    2.5671   0.1746    1.9601    1.1652    0.9152    0.1595
289     0.0020    3.3536   0.0539    2.2055    0.4935    1.6112    0.0936
1089    3.85e−04  4.6827   0.0218    2.6055    0.2133    2.4088    0.0661
4225    3.83e−05  3.4661   0.0058    1.9948    0.0810    1.4554    0.0340
16641   3.27e−06  2.4788   9.62e−04  1.8099    0.0384    0.7519    0.0126
66049   4.65e−07  3.6734   2.88e−04  2.2694    0.0152    1.7448    0.0074
96721   2.70e−07  2.3176   1.95e−04  1.6593    0.0162    −0.2748   0.0058
4. Computational skills

The proposed computational strategy is more expensive than the standard one. As detailed in the previous section, for fixed k, the linear system (13) has to be generated and solved for each evaluation point x. The size of the linear system (13) depends on the domain dimension d and on k, i.e. $s_{k,d} = \frac{(d+k)!}{d!\,k!}$, and the overall computational effort is

$$C_{k,d} \approx M N s_{k,d}\, \frac{s_{k,d} + 3}{2}. \qquad (14)$$
We notice that the matrix $A^{(k)}$ is partially composed of the entries of the matrices $A^{(j)}$, with $j = 0,\dots,k-1$, as shown in Fig. 12, and $2dk\, s_{k-1,d}\big[(d+2k)\, s_{k-1,d} + 1\big]$ more elements are needed to complete the matrix $A^{(k)}$ starting from the matrix $A^{(k-1)}$. This computational extra effort is required to generate the system matrix if we desire the (k+1)th order of accuracy only on the boundary or on subdomains where the data are less regularly distributed. The matrices $K^{(k)}$, $V$, $P^{(k)}$
$$K^{(k)} = \begin{pmatrix} K(x,\xi_1;h) & K(x,\xi_2;h) & \cdots & K(x,\xi_N;h) \\ \vdots & \vdots & & \vdots \\ D^k K(x,\xi_1;h) & D^k K(x,\xi_2;h) & \cdots & D^k K(x,\xi_N;h) \end{pmatrix} \qquad (15)$$
Fig. 12. Skeleton of the matrix associated with the (k+1)th order of accuracy.
$$V = \begin{pmatrix} d\Omega_1 & & & \\ & d\Omega_2 & & \\ & & \ddots & \\ & & & d\Omega_N \end{pmatrix} \qquad (16)$$
$$P^{(k)} = \begin{pmatrix} 1 & (\xi_1^{(1)} - x^{(1)}) & \cdots & \frac{1}{k!}(\xi_1^{(d)} - x^{(d)})^k \\ 1 & (\xi_2^{(1)} - x^{(1)}) & \cdots & \frac{1}{k!}(\xi_2^{(d)} - x^{(d)})^k \\ \vdots & \vdots & & \vdots \\ 1 & (\xi_N^{(1)} - x^{(1)}) & \cdots & \frac{1}{k!}(\xi_N^{(d)} - x^{(d)})^k \end{pmatrix} \qquad (17)$$
allow us to rewrite the system as

$$(K^{(k)} V P^{(k)})\, c^{(k)} = K^{(k)} V f \qquad (18)$$

where f collects the function values.
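The assembly (18) can be transcribed compactly; the sketch below, for k = 1 and d = 2, computes the same entries as the direct summations but through matrix products, anticipating the BLAS level 3 remark that follows. Python/NumPy is used, all names are illustrative, and the dense diagonal V is formed only for clarity.

```python
# A sketch of the factorized assembly (18) for k = 1, d = 2.
import numpy as np

def solve_via_matrix_products(x, xi, f_xi, dOmega, h):
    d = 2
    diff = xi - x                                           # rows: xi_j - x
    g = np.exp(-np.sum(diff**2, axis=1) / h**2) / (h**d * np.pi**(d / 2))
    # K^(1) of (15): kernel row and the two first-derivative rows.
    K1 = np.vstack([g, (2 / h**2) * diff[:, 0] * g, (2 / h**2) * diff[:, 1] * g])
    V = np.diag(dOmega)                                     # V of (16); dense only for clarity
    P1 = np.column_stack([np.ones(len(xi)), diff])          # P^(1) of (17)
    A = K1 @ V @ P1                                         # system matrix of (18)
    b = K1 @ V @ f_xi                                       # right-hand side of (18)
    return np.linalg.solve(A, b)                            # [f(x), Dx1 f(x), Dx2 f(x)]
```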
The result of this formulation is no different from finding each entry individually by means of inner vector products (BLAS level 1) but, from a performance point of view, computing the system matrix as in (18) is faster, taking advantage of matrix-matrix products (BLAS level 3) [18]. Anyway, in order to improve the computation, in the next section we introduce a fast summation technique for the system matrix assembly.

4.1. Fast computation via IFGT

The experiments presented in the previous section illustrate the approximation of large data sets with the Gaussian kernel function. This function is infinitely differentiable, with high order derivatives sufficiently smooth, and it appears as the common element of all its derivatives. The gained results are perceptible, but a controversial aspect concerns the computational effort for the system matrix assembly, which makes the procedure rather expensive and not easily approachable in applications. The computation of the summations appears as a bottleneck, but a fast technique for the simultaneous evaluation at M points of sums of N Gaussian kernel functions allows the computational complexity to be reduced from the standard O(MN) to O(M+N). To this aim the Improved Fast Gaussian Transform (IFGT) offers advantages, for uniform and non uniform data site locations, allowing the desired accuracy to be tuned. By opportunely combining the proposed approach with the IFGT we considerably speed up the method for a large number of evaluation points. A full description of the IFGT is beyond the scope of this paper, and the interested reader is directed to [34] for an expository treatment of this tool. Each entry of the matrix $A^{(k)}$ and of the known vector $b^{(k)}$ needs a summation and can be computed by the IFGT with different weights, depending on the derivatives of the kernel involved. The required algebra is somewhat tedious but straightforward, and some computations can be re-used thanks to the Gaussian kernel employed in the formulation of the derivatives. In the following we provide some computational details referring, for the sake of simplicity, to k = 1 and d = 2. In this case we need the six fundamental IFGTs

$$\Phi_1^{IFGT} := \sum_{j=1}^N d\Omega_j\, K(x,\xi_j;h)$$

$$\Phi_2^{IFGT} := \sum_{j=1}^N (d\Omega_j\, \xi_j^{(1)})\, K(x,\xi_j;h)$$

$$\Phi_3^{IFGT} := \sum_{j=1}^N (d\Omega_j\, \xi_j^{(2)})\, K(x,\xi_j;h)$$

$$\Phi_4^{IFGT} := \sum_{j=1}^N (d\Omega_j\, \xi_j^{(1)} \xi_j^{(1)})\, K(x,\xi_j;h)$$

$$\Phi_5^{IFGT} := \sum_{j=1}^N (d\Omega_j\, \xi_j^{(1)} \xi_j^{(2)})\, K(x,\xi_j;h)$$

$$\Phi_6^{IFGT} := \sum_{j=1}^N (d\Omega_j\, \xi_j^{(2)} \xi_j^{(2)})\, K(x,\xi_j;h).$$
Fig. 13. CPU times (s) versus N in G: (a) k = 1; (b) k = 2.

Table 10
CPU times (s) versus N. Data in G. Test function (4a).

N       k=1       k=1 IFGT   k=2       k=2 IFGT
9       0.1539    0.0255     0.7095    0.1756
81      0.8124    0.0264     1.3353    0.4532
289     1.3506    0.5747     2.7261    0.9541
1089    4.4933    0.7062     4.8085    1.2359
4225    19.5425   1.0367     21.3708   2.4587
16641   39.8844   1.6538     84.6397   3.0125
Table 11
CPU times (s) versus N. Data in R. Test function (4a).

N       k=1       k=1 IFGT   k=2       k=2 IFGT
9       0.5099    0.0253     0.6818    0.1856
81      0.7778    0.0341     1.0210    0.4232
289     1.3719    0.5747     1.3925    0.9824
1089    3.6146    0.7062     3.6724    1.2523
4225    10.6324   1.0985     16.0703   2.6577
16641   42.3251   1.5315     79.7602   3.1543
These are coupled with different weights, giving rise to the corrective matrix $A^{(1)}$:

$$A_{11}^{(1)} = \Phi_1^{IFGT}$$

$$A_{12}^{(1)} = \Phi_2^{IFGT} - x_i^{(1)} \Phi_1^{IFGT}$$

$$A_{13}^{(1)} = \Phi_3^{IFGT} - x_i^{(2)} \Phi_1^{IFGT}$$

$$A_{21}^{(1)} = \frac{2}{h^2} A_{12}^{(1)}$$

$$A_{22}^{(1)} = \frac{2}{h^2}\left[\Phi_4^{IFGT} + (x_i^{(1)})^2 \Phi_1^{IFGT} - 2 x_i^{(1)} \Phi_2^{IFGT}\right]$$

$$A_{23}^{(1)} = \frac{2}{h^2}\left[\Phi_5^{IFGT} - x_i^{(2)} \Phi_2^{IFGT} - x_i^{(1)} \Phi_3^{IFGT} + x_i^{(1)} x_i^{(2)} \Phi_1^{IFGT}\right]$$

$$A_{31}^{(1)} = \frac{2}{h^2} A_{13}^{(1)}$$

$$A_{32}^{(1)} = \frac{2}{h^2} A_{23}^{(1)}$$

$$A_{33}^{(1)} = \frac{2}{h^2}\left[\Phi_6^{IFGT} + (x_i^{(2)})^2 \Phi_1^{IFGT} - 2 x_i^{(2)} \Phi_3^{IFGT}\right].$$
Fig. 13 compares the cost of the direct summation versus the IFGT summation, showing that the efficiency is greatly improved by making use of the IFGT. We compare the CPU times needed to generate and solve the linear system (13) with and without the IFGT, for k = 1 and k = 2, with the same approximation error, referring to the test function (4a) with data in G. In Tables 10 and 11 the CPU times for k = 1 and k = 2, with gridded and random data respectively, in approximating the
test function (4a) are reported. The simulations are conducted on a computer equipped with an Intel(R) Core(TM) i7-3537U CPU at 2.00 GHz.

5. Conclusions

In this paper we present numerical investigations of an improved SPH method based on Taylor series expansion, dealing with the Gaussian kernel function. We give evidence of the accuracy, convergence and computational demand in approximating a function and its derivatives. Many experiments are conducted with the aim of addressing the basic features of the method with various data sets, referring to bivariate test functions. By combining the proposed approach with the improved fast Gaussian transform to assemble the system matrices, we considerably speed up the method for a large number of evaluation points. Satisfactory results and computational advantages with uniform and non uniform large data sets encourage us to proceed in the approximation of a function and its derivatives with the combined methodology.

Acknowledgments

The research has been supported by the Istituto Nazionale di Alta Matematica - Progetto 2018 of the Gruppo Nazionale per il Calcolo Scientifico "Metodi, algoritmi e applicazioni dell'approssimazione multivariata" and inside Rete ITaliana di Teoria dell'Approssimazione.

References

[1] G. Ala, E. Francomano, Numerical investigations of an implicit leapfrog time-domain meshless method, J. Sci. Comput. 62 (3) (2014) 898–912.
[2] G. Ala, E. Francomano, S. Ganci, G.E. Fasshauer, M. McCourt, The method of fundamental solutions in solving coupled boundary value problems for M/EEG, SIAM J. Sci. Comput. 37 (4) (2015) B570–B590.
[3] G. Ala, E. Francomano, S. Ganci, Unconditionally stable meshless integration of time-domain Maxwell's curl equations, Appl. Math. Comput. 255 (15) (2015) 157–164.
[4] G. Ala, E. Francomano, S. Ganci, G.E. Fasshauer, M. McCourt, An augmented MFS approach for brain activity reconstruction, Math. Comput. Simul. 141 (2017) 3–15.
[5] G. Ala, E. Francomano, A marching-on in time meshless kernel based solver for full-wave electromagnetic simulation, Numer. Algorithms 62 (4) (2013) 541–558.
[6] T. Belytschko, Y. Krongauz, J. Dolbow, C. Gerlach, On the completeness of meshfree methods, Int. J. Numer. Methods Eng. 43 (1998) 785–819.
[7] M.D. Buhmann, Radial Basis Functions: Theory and Implementations, Cambridge Monographs on Applied and Computational Mathematics, vol. 12, Cambridge University Press, 2003.
[8] X. Chen, J.H. Jung, Matrix stability of multiquadric radial basis function methods for hyperbolic equations with uniform centers, J. Sci. Comput. 51 (3) (2012) 683–702.
[9] A. Chowdhury, A. Wittek, K. Miller, G.R. Joldes, An element free Galerkin method based on the modified moving least squares approximation, J. Sci. Comput. (2016) 1–15.
[10] G.E. Fasshauer, Meshfree Approximation Methods with MATLAB, Interdisciplinary Mathematical Sciences, vol. 6, World Scientific, Hackensack, NJ, 2007.
[11] G.E. Fasshauer, M. McCourt, Kernel-based Approximation Methods Using MATLAB, Interdisciplinary Mathematical Sciences, vol. 19, World Scientific, 2015.
[12] E. Francomano, F.M. Hilker, M. Paliaga, E. Venturino, An efficient method to reconstruct invariant manifolds of saddle points, Dolomites Res. Notes Approx. 10 (2017) 25–30.
[13] E. Francomano, F.M. Hilker, M. Paliaga, E. Venturino, Separatrix reconstruction to identify tipping points in an eco-epidemiological model, Appl. Math. Comput. 318 (2018) 80–91.
[14] E. Francomano, M. Paliaga, Detecting 3D multistability with a meshfree reconstruction of invariant manifolds of saddle point, Math. Meth. Appl. Sci. (2018) 1–9, doi:10.1002/mma.
[15] D.A. Fulk, A Numerical Analysis of Smoothed Particle Hydrodynamics, Ph.D. dissertation, Air Force Institute of Technology, School of Engineering, Air University, Montgomery, Alabama, 1994.
[16] R.A. Gingold, J.J. Monaghan, Smoothed particle hydrodynamics: theory and application to non-spherical stars, Monthly Notices Roy. Astron. Soc. 181 (1977) 375–389.
[17] R.A. Gingold, J.J. Monaghan, Kernel estimates as a basis for general particle methods in hydrodynamics, J. Comput. Phys. 46 (1982) 429–453.
[18] G.H. Golub, C.F. Van Loan, Matrix Computations, fourth ed., Johns Hopkins University Press, Baltimore, MD, 2012.
[19] L. Greengard, J. Strain, The fast Gauss transform, SIAM J. Sci. Statist. Comput. 12 (1) (1991) 79–94.
[20] J.H. Halton, On the efficiency of certain quasi-random sequences of points in evaluating multi-dimensional integrals, Numer. Math. 2 (1960) 84–90.
[21] B. Li, F. Habbal, M. Ortiz, Optimal transportation meshfree approximation schemes for fluid and plastic flows, Int. J. Numer. Methods Eng. 83 (2010) 1541–1579.
[22] L.D. Libersky, A.G. Petschek, Smooth particle hydrodynamics with strength of materials, in: Proceedings of The Next Free-Lagrange Method, vol. 395, Springer, 1991, pp. 248–257.
[23] G.R. Liu, M.B. Liu, Smoothed Particle Hydrodynamics - A Mesh-Free Particle Method, World Scientific Publishing, Singapore, 2003.
[24] M.B. Liu, G.R. Liu, Smoothed particle hydrodynamics (SPH): an overview and recent developments, Arch. Comput. Methods Eng. 17 (1) (2010) 25–76.
[25] M.B. Liu, G.R. Liu, K.Y. Lam, Constructing smoothing functions in smoothed particle hydrodynamics with applications, J. Comput. Appl. Math. 155 (2003) 263–284.
[26] M.B. Liu, W.P. Xie, G.R. Liu, Modeling incompressible flows using a finite particle method, Appl. Math. Model. 29 (12) (2005) 1252–1270.
[27] M.B. Liu, W.P. Xie, G.R. Liu, Restoring particle inconsistency in smoothed particle hydrodynamics, Appl. Numer. Math. 56 (1) (2006) 19–36.
[28] W.K. Liu, S. Jun, Y.F. Zhang, Reproducing kernel particle methods, Int. J. Numer. Methods Fluids 20 (8–9) (1995) 1081–1106.
[29] L.B. Lucy, A numerical approach to the testing of the fission hypothesis, Astron. J. 82 (1977) 1013–1024.
[30] J.J. Monaghan, J.C. Lattanzio, A refined particle method for astrophysical problems, Astron. Astrophys. 149 (1985) 135–143.
[31] J.J. Monaghan, An introduction to SPH, Comput. Phys. Commun. 48 (1988) 89–96.
[32] J.J. Monaghan, Smoothed particle hydrodynamics, Annu. Rev. Astron. Astrophys. 30 (1992) 543–574.
[33] J.J. Monaghan, Simulating free surface flows with SPH, J. Comput. Phys. 110 (1994) 399–406.
[34] V.I. Morariu, B.V. Srinivasan, V.C. Raykar, R. Duraiswami, L.S. Davis, Automatic online tuning for fast Gaussian summation, Adv. Neural Inf. Process. Syst. 21 (2009) 1113–1120.
[35] R.J. Renka, R. Brown, Algorithm 792: accuracy tests of ACM algorithms for interpolation of scattered data in the plane, ACM Trans. Math. Softw. 25 (1999) 78–94.
[36] S. Shao, Incompressible smoothed particle hydrodynamics simulation of multifluid flows, Int. J. Numer. Methods Fluids 69 (11) (2012) 1715–1735.
[37] H.F. Schwaiger, An implicit corrected SPH formulation for thermal diffusion with linear free surface boundary conditions, Int. J. Numer. Methods Eng. 75 (6) (2008) 647–671.
[38] I.M. Sobol, Distribution of points in a cube and approximate evaluation of integrals, U.S.S.R. Comput. Math. Math. Phys. 7 (1967) 86–112.
[39] C. Shu, H. Ding, K.S. Yeo, Local radial basis function-based differential quadrature method and its application to solve two-dimensional incompressible Navier-Stokes equations, Comput. Methods Appl. Mech. Eng. 192 (7–8) (2003) 941–954.
[40] W.I. Thacker, W.I. Zhang, L.T. Watson, J.B. Birch, M.A. Iyer, M.W. Berry, Algorithm 905: SHEPPACK - modified Shepard algorithm for interpolation of scattered multivariate data, ACM Trans. Math. Softw. 37 (3) (2010) 1–20.
[41] C. Ulrich, M. Leonardi, T. Rung, Multi-physics SPH simulation of complex marine-engineering hydrodynamic problems, Ocean Eng. 64 (2013) 109–121.
[42] H. Wendland, Scattered Data Approximation, Cambridge Monographs on Applied and Computational Mathematics, vol. 17, Cambridge University Press, 2005.