Discrete least-squares radial basis functions approximations

Siqing Li (b,a,*), Leevan Ling (a), Ka Chun Cheung (c)

a Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Hong Kong
b College of Mathematics, Taiyuan University of Technology, Shanxi, China
c NVIDIA AI Technology Center (NVAITC), NVIDIA, USA

* Corresponding author. E-mail addresses: [email protected], [email protected] (S. Li), [email protected] (L. Ling), [email protected] (K.C. Cheung).

Applied Mathematics and Computation 355 (2019) 542–552

MSC: 65D15; 65N35; 41A63

Keywords: Error estimate; Meshfree approximation; Kernel methods; Tikhonov regularization; Noisy data

Abstract. We consider discrete least-squares methods using radial basis functions. A general $\ell_2$-Tikhonov regularization with $W_2^m$-penalty is considered. We provide error estimates that are comparable to kernel-based interpolation in cases where the function being approximated is within, and is outside of, the native space of the kernel. Our proven theories concern the denseness condition of collocation points and the selection of regularization parameters. In particular, the unregularized least-squares method is shown to have $W_2^\mu(\Omega)$ convergence for $\mu > d/2$ on smooth domains $\Omega \subset \mathbb{R}^d$. For any properly regularized least-squares method, the same convergence estimates hold for a large range of $\mu \ge 0$. These results are extended to the case of noisy data. Numerical demonstrations are provided to verify the theoretical results. In terms of applications, we also apply the proposed method to solve a heat equation whose initial condition has huge oscillation in the domain.

1. Introduction

Given a set $X = \{x_1, \ldots, x_{n_X}\}$ of data points in some bounded domain $\Omega \subset \mathbb{R}^d$, suppose that at each point a function value $f_i = f(x_i) \in \mathbb{R}$, $1 \le i \le n_X$, was specified by some unknown function $f$. One important application is to reconstruct $f$ based on the data; commonly used methods include interpolation and function approximation. The function reconstruction process by radial basis functions (RBF) seeks an interpolant or approximant from the trial space

$$U_{Z,\Phi,\Omega} := \operatorname{span}\{\Phi(\cdot, z_j) : z_j \in Z\},$$

defined by some translation-invariant symmetric positive definite kernel $\Phi : \Omega \times \Omega \to \mathbb{R}$ centered at a set $Z = \{z_1, \ldots, z_{n_Z}\} \subset \Omega$ of trial centers. In RBF interpolation, we pick $Z = X$, and the reconstructed interpolant takes the form

$$s = \sum_{j=1}^{n_Z} \alpha_j \Phi(\cdot - z_j), \tag{1}$$

which can be uniquely determined by the solution $\alpha = [\alpha_1, \ldots, \alpha_{n_Z}]^T$ of the (square) linear system

$$\Phi(X, X)\,\alpha = f|_X,$$

where $[\Phi(X, X)]_{i,j} = \Phi(x_i, x_j)$, $1 \le i, j \le n_X$, is the interpolation matrix of $\Phi$ on $X$ and $f|_X = [f_1, \ldots, f_{n_X}]^T$ is the vector of data values. Traditional interpolation error estimates were proven under the assumption that $f$ lies in the reproducing kernel


Hilbert space, a.k.a. the native space, $\mathcal{N}_{\Phi,\Omega}$, associated with the RBF kernel $\Phi$. Convergence estimates were measured in terms of the fill distance of the trial centers $Z$,

$$h_Z := \sup_{z \in \Omega} \min_{z_j \in Z} \|z - z_j\|_{\ell_2(\mathbb{R}^d)}. \tag{2}$$
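For concreteness, the interpolation step (1) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the Whittle–Matérn–Sobolev kernel is borrowed from Section 4, and the point set, smoothness order $m = 3$, and the sampled test function are placeholder choices of ours.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.special import kv  # modified Bessel function of the second kind

def matern_kernel(r, m, d=2):
    """Whittle-Matern-Sobolev kernel phi_m(r) = r^(m - d/2) K_(m - d/2)(r)."""
    nu = m - d / 2.0
    r = np.maximum(r, 1e-14)      # sidestep the removable singularity at r = 0
    return r**nu * kv(nu, r)

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))   # data sites, and Z = X for interpolation
f_X = np.cos(np.pi * X[:, 0]) * X[:, 1]     # samples of an "unknown" function f

alpha = np.linalg.solve(matern_kernel(cdist(X, X), m=3), f_X)   # coefficients of (1)

def interpolant(x):
    """Evaluate s = sum_j alpha_j * phi(. - z_j) at the rows of x."""
    return matern_kernel(cdist(np.atleast_2d(x), X), m=3) @ alpha
```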

Generally speaking, smoother kernels, which imply a smoother unknown function $f \in \mathcal{N}_{\Phi,\Omega}$, yield faster convergence rates. Readers are referred to the monographs [1–4] for details and to [5] for $f$ not in the native space. Still assume that the data points in $X$ and the values $f|_X$ are given. A more general setting removes the restriction that the sets $Z$ of trial centers and $X$ be identical, but we do insist on having $n_Z \le n_X$, which yields different minimization problems. In comparison, there are far fewer theories for least-squares function approximation by RBF.

A continuous least-squares problem takes the form

$$\arg\inf\left\{ \|f - s\|_{L_2(\Omega)} : s \in U_{Z,\Phi,\Omega} \right\}. \tag{3}$$

Suppose $\Omega \subset \mathbb{R}^d$ is a bounded domain that satisfies a cone condition and has a Lipschitz boundary. Suppose further that the Fourier transform of the kernel $\Phi$ on $\mathbb{R}^d$ decays like

$$c \left(1 + \|\omega\|_2^2\right)^{-m} \le \hat{\Phi}(\omega) \le C \left(1 + \|\omega\|_2^2\right)^{-m} \quad\text{for all } \omega \in \mathbb{R}^d, \tag{4}$$

for some $m > d/2$ with two positive constants $0 < c \le C$. Under these assumptions, the native space $\mathcal{N}_{\Phi,\Omega}$ is norm equivalent to the Sobolev space $W_2^m(\Omega)$. If $f \in W_2^m(\Omega)$, then it was shown in [6] and [7, Proposition 3.2] that

$$\min_{s \in U_{Z,\Phi,\Omega}} \|f - s\|_{L_2(\Omega)} \le C h_Z^m \|f\|_{W_2^m(\Omega)}$$

for some $C > 0$ independent of $Z$ and $f$. One may also consider the discrete least-squares problem

$$\arg\inf\left\{ \|f - s\|^2_{\ell_2(X)} := \sum_{i=1}^{n_X} [f(x_i) - s(x_i)]^2 : s \in U_{Z,\Phi,\Omega} \right\}, \tag{5}$$

which can be solved by the solution of the overdetermined linear system

$$\Phi(X, Z)\,\alpha = f|_X,$$

where $[\Phi(X, Z)]_{i,j} = \Phi(x_i, z_j)$, $1 \le i \le n_X$, $1 \le j \le n_Z$, is the collocation matrix of $\Phi$ on $Z$ at $X$. Under the same assumptions, [8, Theorem 2.3] implies an error estimate for the discrete error norm,

$$\min_{s \in U_{Z,\Phi,\Omega}} \|f - s\|_{\ell_2(X)} \le C n_X^{1/2} \rho_X^{d/2} h_Z^m \|f\|_{W_2^m(\Omega)}$$

for some constant $C > 0$ independent of $X$, $Z$, and $f$. Here, $\rho_X = h_X / q_X$ is the mesh ratio of $X$, defined by its fill distance as in (2) and its separation distance

$$q_X := \frac{1}{2} \min_{i \ne j} \|x_i - x_j\|_{\ell_2(\mathbb{R}^d)}.$$

An immediate application of discrete least-squares approximation is to parabolic PDEs. Considering the regularity estimate [9] for parabolic problems

$$u_t + Lu = f \in L_2(0, T; L_2(\Omega))$$

with homogeneous boundary conditions and initial condition $u = g \in H_0^1(\Omega)$, i.e.,

$$\operatorname{ess\,sup}_{0 \le t \le T} \|u(t)\|_{H_0^1(\Omega)} + \|u\|_{L_2(0,T;H^2(\Omega))} + \|u_t\|_{L_2(0,T;L_2(\Omega))} \le C \left( \|f\|_{L_2(0,T;L_2(\Omega))} + \|g\|_{H_0^1(\Omega)} \right),$$

it is desirable to have an $H^1$-approximation of the initial condition $g$ in the form of (1). Then, the evolution of the time-dependent coefficients $\alpha(t)$ can be obtained by solving ODE systems. In this paper, we derive some error estimates in continuous norms for the discrete least-squares function approximation by radial basis functions.

2. Stability of discrete least-squares problems

We consider a more general setting of the discrete least-squares problem with a smoothness penalty given by the native space norm of some kernel $\Phi$ satisfying (4). For any $s \in U_{Z,\Phi,\Omega}$ in the form of (1), its native space norm is given by

$$\|s\|^2_{\mathcal{N}_{\Phi,\Omega}} = \alpha^T \Phi(Z, Z)\,\alpha.$$

We still assume that $\Omega \subset \mathbb{R}^d$ is a bounded Lipschitz domain satisfying a cone condition. Now, for any regularization parameter $\lambda \ge 0$, we define the corresponding regularized least-squares approximant to the data $(X, f|_X)$ by

$$s_\lambda := \arg\inf_{s \in U_{Z,\Phi,\Omega}} \left\{ \|s - f\|^2_{\ell_2(X)} + \lambda^2 \|s\|^2_{\mathcal{N}_{\Phi,\Omega}} \right\}. \tag{6}$$


In this paper, we apply the native space norm in the regularization term, which is related to kriging [10] in statistics. Other regularization techniques also play important roles in statistical applications: for example, the $\ell_1$ and $\ell_2$ penalties are used in the Lasso method [11] and in ridge regression [12], respectively, and combining the two yields the elastic net [13]. Recently, a new regularized method called principal components lasso (pcLasso) was proposed [14]. In a recently published monograph [15, Section 8.6], it was proven that the solution $s_\lambda$ of (6) is stable with respect to the approximand:

$$\|s_\lambda - f\|^2_{\ell_2(X)} + \lambda^2 \|s_\lambda\|^2_{\mathcal{N}_{\Phi,\Omega}} \le \left(1 + \lambda^2 / n_X\right) \|f\|^2_{\mathcal{N}_{\Phi,\Omega}} \quad\text{for all } f \in \mathcal{N}_{\Phi,\Omega},$$

and with respect to the regularization parameter:

$$\|s_\lambda\|_{\mathcal{N}_{\Phi,\Omega}} \le \|s_0\|_{\mathcal{N}_{\Phi,\Omega}} \quad\text{for all } \lambda \ge 0,$$

where $s_0$ denotes the unregularized solution with $\lambda = 0$. Moreover, the regularized solution $s_\lambda$ converges to $s_0$ in the sense that

$$\|s_\lambda - s_0\|^2_{\mathcal{N}_{\Phi,\Omega}} = O(\lambda^2) \quad\text{and}\quad \|s_\lambda - s_0\|^2_{\ell_2(X)} = O(n_X \lambda^2) \quad\text{as } \lambda \to 0.$$

To obtain convergent estimates to $f$, our first goal is to derive a stability estimate for (6) within the trial space $U_{Z,\Phi,\Omega}$. By the sampling inequality in [16, Theorem 3.3], there exists some constant depending only on $\Omega$, $m$ and $\mu$ such that

$$\|s\|_{W_2^\mu(\Omega)} \le C_{\Omega,m,\mu} \left( h_X^{d/2-\mu} \|s\|_{\ell_2(X)} + h_X^{m-\mu} \|s\|_{W_2^m(\Omega)} \right) \quad\text{for } 0 \le \mu \le m, \tag{7}$$

for any $s \in W_2^m(\Omega)$ with $m > d/2$ and any discrete set $X \subset \Omega$ with sufficiently small mesh norm $h_X$. By the inequality $(a + b)^2 \le 2(a^2 + b^2)$ for any $a, b \ge 0$, we have

$$\|s\|^2_{W_2^\mu(\Omega)} \le C h_X^{d-2\mu} \left( \|s\|^2_{\ell_2(X)} + h_X^{2m-d} \|s\|^2_{W_2^m(\Omega)} \right) \le C h_X^{d-2\mu} \left( \|s\|^2_{\ell_2(X)} + \lambda^2 \|s\|^2_{W_2^m(\Omega)} + \left(h_X^{2m-d} - \lambda^2\right)_+ \|s\|^2_{W_2^m(\Omega)} \right)$$

for $0 \le \mu \le m$, with $(x)_+ = \max(x, 0)$. Denote the critical regularization parameter

$$\lambda^* := h_X^{m-d/2}.$$

For $\lambda \ge \lambda^*$, we immediately see that there is a constant depending on $\Omega$, $m$, and $\mu$ such that

$$\|s\|^2_{W_2^\mu(\Omega)} \le C h_X^{d-2\mu} \left( \|s\|^2_{\ell_2(X)} + \lambda^2 \|s\|^2_{W_2^m(\Omega)} \right) \quad\text{for } 0 \le \mu \le m, \tag{8}$$

holds for any $s \in W_2^m(\Omega)$. For $0 \le \lambda < \lambda^*$, we focus only on trial functions $s \in U_{Z,\Phi,\Omega} \subseteq \mathcal{N}_{\Phi,\Omega} = W_2^m(\Omega)$. Within the trial space $U_{Z,\Phi,\Omega}$, we have a Bernstein inequality [17, Lemma 3.2], which states that there is a constant depending only on $\Omega$, $\Phi$, $m$, and $\mu$ such that

$$\|s\|_{W_2^m(\Omega)} \le C_{\Omega,\Phi,m,\mu}\, q_Z^{-m+\mu} \|s\|_{W_2^\mu(\Omega)} \quad\text{for } d/2 < \mu,\ \mu \le m, \tag{9}$$

holds for all finite sets $Z \subset \Omega$ with separation distance $q_Z$. Although the original lemma there requires the integer index to satisfy $d/2 < \mu \le m$, a closer inspection of the proof shows that the updated condition on $\mu$ in (9) yields the appropriate extension to fractional orders. Thus, we have

$$\|s\|^2_{W_2^\mu(\Omega)} \le C h_X^{d-2\mu} \left( \|s\|^2_{\ell_2(X)} + \lambda^2 \|s\|^2_{W_2^m(\Omega)} + \left(h_X^{2m-d} - \lambda^2\right)_+ q_Z^{-2m+2\mu} \|s\|^2_{W_2^\mu(\Omega)} \right).$$

For sufficiently dense $X$, in the sense that

$$C_{\Omega,\Phi,m,\mu}\, h_X^{d-2\mu} \left(h_X^{2m-d} - \lambda^2\right)_+ q_Z^{-2m+2\mu} \le \frac{1}{2}, \tag{10}$$

we obtain the same stability estimate for $0 \le \lambda < \lambda^*$, in the same form as (8), that holds for all $s \in U_{Z,\Phi,\Omega}$. We summarize the result as follows:

Lemma 1. Let a kernel $\Phi : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ satisfying (4) with smoothness $m > d/2$ be given. Suppose $\Omega \subset \mathbb{R}^d$ is a bounded Lipschitz domain satisfying an interior cone condition. Let $Z \subset \Omega$ be a discrete set of trial centers with separation distance $q_Z$. Let $X \subset \Omega$ be another discrete set of collocation points with fill distance $h_X$. Then, there exists a constant depending on $\Omega$, $\Phi$, $m$, and $\mu$ such that

$$\|s\|^2_{W_2^\mu(\Omega)} \le C h_X^{d-2\mu} \left( \|s\|^2_{\ell_2(X)} + \lambda^2 \|s\|^2_{W_2^m(\Omega)} \right) \quad\text{for any } \lambda \ge 0$$

holds for all $s \in U_{Z,\Phi,\Omega}$ in two circumstances:


• for $\lambda \ge h_X^{m-d/2}$, under the condition $0 \le \mu \le m$ and with the set $X$ being sufficiently dense for (7) to hold;
• or, for $0 \le \lambda < h_X^{m-d/2}$, under the conditions $d/2 < \mu$, $\mu \le m$, and with the set $X$ being dense enough with respect to $Z$ and $\lambda$ for (10) to also hold.

3. Error estimates

We consider cases $f \in W_2^\beta(\Omega)$ with $\beta > d/2$ and $\beta \le m$; i.e., $f$ is not in the native space of $\Phi$ if $\beta < m$. Let $s_f \in U_{Z,\Phi,\Omega}$ be the interpolant of $f$ on $Z$ in the trial space. For any appropriate value of $\mu$ as specified in Lemma 1, we can manipulate the $W_2^\mu(\Omega)$ approximation error of $s_\lambda \in U_{Z,\Phi,\Omega}$ in (6) by a chain of comparisons:

$$\begin{aligned}
\|s_\lambda - f\|^2_{W_2^\mu(\Omega)} &\le \|s_\lambda - s_f\|^2_{W_2^\mu(\Omega)} + \|s_f - f\|^2_{W_2^\mu(\Omega)} \\
&\le C h_X^{d-2\mu} \left( \|s_\lambda - s_f\|^2_{\ell_2(X)} + \lambda^2 \|s_\lambda - s_f\|^2_{W_2^m(\Omega)} \right) + \|s_f - f\|^2_{W_2^\mu(\Omega)} \\
&\le C h_X^{d-2\mu} \left( \|s_\lambda - f\|^2_{\ell_2(X)} + \lambda^2 \|s_\lambda\|^2_{W_2^m(\Omega)} + \|s_f - f\|^2_{\ell_2(X)} + \lambda^2 \|s_f\|^2_{W_2^m(\Omega)} \right) + \|s_f - f\|^2_{W_2^\mu(\Omega)}.
\end{aligned}$$

By the minimization property of $s_\lambda$, we obtain an upper bound in terms of the interpolant:

$$\|s_\lambda - f\|^2_{W_2^\mu(\Omega)} \le 2 C h_X^{d-2\mu} \left( \|s_f - f\|^2_{\ell_2(X)} + \lambda^2 \|s_f\|^2_{W_2^m(\Omega)} \right) + \|s_f - f\|^2_{W_2^\mu(\Omega)} \tag{11}$$

for some positive constant $C = C_{\Omega,\Phi,m,\mu}$. Error estimates can now be derived based on the approximation power of RBF interpolants. For this, we rely on the results in [5], which improve some error bounds found in [7]. To begin, we bound the two continuous norms in (11). For any $0 \le \mu \le \beta \le m$, [5, Theorem 4.2] states that

$$\|s_f - f\|_{W_2^\mu(\Omega)} \le C h_Z^{\beta-\mu} \rho_Z^{m-\mu} \|f\|_{W_2^\beta(\Omega)},$$

and using [17, Lemma 3.2] followed by Narcowich et al. [5, Corollary 4.3] yields that, for $\beta > d/2$,

$$\|s_f\|_{W_2^m(\Omega)} \le C q_Z^{-m+\beta} \|s_f\|_{W_2^\beta(\Omega)} \le C q_Z^{-m+\beta} \left(1 + C' \rho_Z^{m-\beta}\right) \|f\|_{W_2^\beta(\Omega)}.$$

It remains to handle the discrete norm in (11). If $\beta = m$ and $f \in W_2^m(\Omega)$, the zero lemma in [8, Theorem 2.3] ensures that

$$\|s_f - f\|_{\ell_2(X)} \le C n_X^{1/2} \rho_X^{d/2} h_Z^m \|s_f - f\|_{W_2^m(\Omega)} \le C n_X^{1/2} \rho_X^{d/2} h_Z^m \|f\|_{W_2^m(\Omega)}.$$

The last inequality follows from the optimality of $s_f$ and the norm equivalence between $\mathcal{N}_{\Phi,\Omega}$ and $W_2^m(\Omega)$.

For $f \in W_2^\beta(\Omega)$, with $\beta \in \mathbb{N}$ and $d/2 < \beta \le m$, not in the native space, we have

$$\|s_f - f\|_{\ell_2(X)} \le C n_X^{1/2} \rho_X^{d/2} \rho_Z^{m-\beta} h_Z^\beta \|f\|_{C^\beta(\bar{\Omega})}$$

by Narcowich et al. [7, Corollary 3.11]. Using the standard estimates $n_X \le C q_X^{-d} = C \rho_X^d h_X^{-d}$ and $\|f\|_{W_2^\beta(\Omega)} \le \|f\|_{C^\beta(\bar{\Omega})}$, we can complete the proof of the following theorem.

Theorem 1. Suppose all of the assumptions in Lemma 1 hold, and let $s_\lambda$ be the discrete least-squares solution of (6). Then:

• if $f \in W_2^m(\Omega)$,

$$\|s_\lambda - f\|_{W_2^\mu(\Omega)} \le C \left( \rho_X^d h_X^{-\mu} h_Z^m + h_X^{d/2-\mu} \lambda + \rho_Z^{m-\mu} h_Z^{m-\mu} \right) \|f\|_{W_2^m(\Omega)};$$

• or, if $f \in W_2^\beta(\Omega)$ with $\beta \in \mathbb{N}$, $d/2 < \beta$ and $\beta \le m$,

$$\|s_\lambda - f\|_{W_2^\mu(\Omega)} \le C \left( \rho_X^d \rho_Z^{m-\beta} h_X^{-\mu} h_Z^\beta + \rho_Z^{2m-2\beta} h_X^{d/2-\mu} h_Z^{\beta-m} \lambda + \rho_Z^{m-\mu} h_Z^{\beta-\mu} \right) \|f\|_{C^\beta(\bar{\Omega})}$$

holds, for some constants depending on $\Omega$, $\Phi$, $m$, $\mu$, and $\beta$.


The least amount of regularization used in (6) that does not impose any denseness requirement on the collocation set $X$ is $\lambda^*$. The resulting approximation power is comparable to RBF interpolation.

Corollary 1. Let $s_{\lambda^*}$ be the regularized least-squares solution of (6) with regularization parameter $\lambda^* = h_X^{m-d/2}$. Under the assumptions of Lemma 1, there exists some constant depending on $\Omega$, $\Phi$, and $m$ such that

$$\|s_{\lambda^*} - f\|_{W_2^\mu(\Omega)} \le C \left( \rho_X^d h_X^{-\mu} h_Z^m + h_X^{m-\mu} + \rho_Z^{m-\mu} h_Z^{m-\mu} \right) \|f\|_{W_2^m(\Omega)} \quad\text{for } 0 \le \mu \le m$$

holds for all $f \in W_2^m(\Omega)$. In particular, we have

$$\|s_{\lambda^*} - f\|_{L_2(\Omega)} \le C \left( (\rho_X^d + \rho_Z^m) h_Z^m + h_X^m \right) \|f\|_{W_2^m(\Omega)}.$$

3.1. Noisy data

When data are contaminated by measurement or some other sort of error, we only have some noisy data $f_\delta|_X$ to work with instead of the exact data values $f|_X$ in (6). The regularized solution with noisy data takes the form

$$s_{\delta,\lambda} := \arg\inf_{s \in U_{Z,\Phi,\Omega}} \left\{ \|s - f_\delta\|^2_{\ell_2(X)} + \lambda^2 \|s\|^2_{\mathcal{N}_{\Phi,\Omega}} \right\}. \tag{12}$$

Here, we do not require that $f_\delta$ be a function; only its values at $X$ are required. Following the same line of logic used in deriving (11), we have

$$\begin{aligned}
\|s_{\delta,\lambda} - f\|^2_{W_2^\mu(\Omega)} &\le C h_X^{d-2\mu} \left( \|s_{\delta,\lambda} - f\|^2_{\ell_2(X)} + \lambda^2 \|s_{\delta,\lambda}\|^2_{W_2^m(\Omega)} + \|s_f - f\|^2_{\ell_2(X)} + \lambda^2 \|s_f\|^2_{W_2^m(\Omega)} \right) + \|s_f - f\|^2_{W_2^\mu(\Omega)} \\
&\le 2 C h_X^{d-2\mu} \left( \|s_f - f\|^2_{\ell_2(X)} + \lambda^2 \|s_f\|^2_{W_2^m(\Omega)} \right) + \|s_f - f\|^2_{W_2^\mu(\Omega)} + 2 C h_X^{d-2\mu} \|f - f_\delta\|^2_{\ell_2(X)},
\end{aligned}$$

which differs from (11) only in the last term.

Corollary 2. Suppose all of the assumptions in Lemma 1 hold, and let $s_{\delta,\lambda}$ be the discrete least-squares solution of (12) with noisy data $f_\delta|_X$. Then the error estimates in Theorem 1 hold with an extra term $C h_X^{d/2-\mu} \|f - f_\delta\|_{\ell_2(X)}$ added to the upper bounds, for some constant $C$ depending on $\Omega$, $\Phi$, $m$, and $\mu$.

We now focus on the case that $f \in W_2^m(\Omega)$ is in the native space of $\Phi$ and the pointwise absolute error at each data point is bounded. Following the general framework in [18, Section 8], we assume the existence of a noise level $\delta_\infty \ge 0$ relative to the Sobolev norm of $f$ such that

$$\max_{x \in X} |f(x) - f_\delta(x)| \le \delta_\infty \|f\|_{W_2^m(\Omega)} \tag{13}$$

for any set $X \subset \Omega$. Then,

$$\|f - f_\delta\|_{\ell_2(X)} \le n_X^{1/2} \delta_\infty \|f\|_{W_2^m(\Omega)} \le q_X^{-d/2} \delta_\infty \|f\|_{W_2^m(\Omega)},$$

and, hence, Corollary 2 yields

$$\|s_{\delta,\lambda} - f\|_{W_2^\mu(\Omega)} \le C \left( \rho_X^d h_X^{-\mu} h_Z^m + h_X^{d/2-\mu} \lambda + \rho_Z^{m-\mu} h_Z^{m-\mu} + \rho_X^{d/2} h_X^{-\mu} \delta_\infty \right) \|f\|_{W_2^m(\Omega)}.$$

This suggests a regularization strategy of using

$$0 \le \lambda \le \lambda_\delta := h_X^{-d/2} \delta_\infty = \lambda^* h_X^{-m} \delta_\infty$$

for any allowed value of $\mu$. If we take $\lambda = 0$, we fail the theoretical requirement for $L_2(\Omega)$ convergence; this will be studied numerically in the next section. If we take $\lambda = \lambda_\delta$, Lemma 1 allows $0 \le \mu \le m$, and the density requirement on the data points $X$ is independent of the trial centers $Z$ for significantly large noise, in the sense that $\delta_\infty \ge h_X^m$. In applications, this means that we can take $n_Z \lesssim n_X$. Due to the presence of the $\rho_Z$ terms in the error estimates in Theorem 1, we are highly motivated to choose $Z \subset \Omega$ quasi-uniformly, or uniformly if possible. The $L_2(\Omega)$ error bound for the regularized least-squares solution of (12) with regularization parameter $\lambda_\delta$ will be

$$\|s_{\delta,\lambda_\delta} - f\|_{L_2(\Omega)} \le C \left( \rho_X^d h_Z^m + \rho_Z^m h_Z^m + \rho_X^{d/2} \delta_\infty \right) \|f\|_{W_2^m(\Omega)}.$$

The presence of noise affects the accuracy linearly, up to some constant that depends on $\Omega$, $\Phi$, $m$, and the mesh ratio $\rho_X$ of $X$. In other words, for RBF least-squares methods, the admissible noise in the data is comparable to the interpolation error, i.e., $\delta_\infty = O(h_Z^m)$.
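The two parameter choices discussed so far reduce to one-liners; a minimal sketch, assuming $\delta_\infty$ (and hence an estimate of $\|f\|_{W_2^m(\Omega)}$) is available:

```python
def lambda_star(h_X, m, d=2):
    """Critical regularization parameter of Section 2: h_X^(m - d/2)."""
    return h_X ** (m - d / 2.0)

def lambda_delta(h_X, delta_inf, d=2):
    """Noise-aware regularization parameter: h_X^(-d/2) * delta_inf."""
    return h_X ** (-d / 2.0) * delta_inf

# e.g. h_X = 0.05, m = 2, d = 2 gives lambda_star = 0.05, and a relative
# noise level delta_inf = 0.01 gives lambda_delta = 20 * 0.01 = 0.2, which
# exceeds lambda_star since delta_inf >= h_X^m = 0.0025.
```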


4. Numerical examples

Note that the native space norm of a trial function in $U_{Z,\Phi,\Omega}$ in the form of (1) is given by

$$\|s\|^2_{\mathcal{N}_{\Phi,\Omega}} = \alpha^T \Phi(Z, Z)\,\alpha.$$

Then, it is straightforward to show that the unique solution of (6) (or of (12)) is given by the normal equation

$$\left( \Phi(X, Z)^T \Phi(X, Z) + \lambda^2 \Phi(Z, Z) \right) \alpha = \Phi(X, Z)^T f|_X \quad\text{or}\quad \Phi(X, Z)^T f_\delta|_X,$$

for identifying the unknown coefficients of the approximant $s_\lambda$ (or $s_{\delta,\lambda}$) in the form of (1). To avoid worsening the ill-conditioning of the problem, we compute a stabilized Cholesky decomposition

$$\Phi(Z, Z) + \epsilon I_{n_Z} = L L^T$$

for the smallest possible $\epsilon \ge 0$ such that the Cholesky algorithm does not break down due to loss of positive definiteness. This stabilization is implemented to safeguard the robustness of our algorithm when a large order of smoothness $m$ is used. With $L$ in hand, we can recast the normal equation as

$$\begin{bmatrix} \Phi(X, Z) \\ \lambda L^T \end{bmatrix} \alpha = \begin{bmatrix} f|_X \\ 0_{n_Z} \end{bmatrix} \quad\text{or}\quad \begin{bmatrix} f_\delta|_X \\ 0_{n_Z} \end{bmatrix},$$

which is an overdetermined system that can be solved by standard linear solvers. If any $\epsilon > 0$ is required in the Cholesky decomposition, the least-squares problem above is equivalent to adding an extra smoothness term $\lambda^2 \epsilon \|\alpha\|_2^2$ to the regularization for the sake of numerical stability.
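The stabilized solve just described translates directly into code. The sketch below is our reading of the algorithm, not the authors' implementation; it reuses matern_kernel, fill_distance, and lambda_star from the earlier sketches, and the jitter-growth loop for choosing $\epsilon$ is an implementation detail of ours.

```python
import numpy as np
from scipy.spatial.distance import cdist

def regularized_ls(X, f_X, Z, lam, m, d=2):
    """Solve (6)/(12) via the stacked system [Phi(X,Z); lam * L^T] alpha = [f|_X; 0]."""
    A = matern_kernel(cdist(X, Z), m, d)     # collocation matrix Phi(X, Z)
    B = matern_kernel(cdist(Z, Z), m, d)     # Gram matrix Phi(Z, Z)
    eps, L = 0.0, None
    while L is None:                         # smallest eps making B + eps*I numerically SPD
        try:
            L = np.linalg.cholesky(B + eps * np.eye(len(Z)))
        except np.linalg.LinAlgError:
            eps = max(2.0 * eps, 1e-12)
    M = np.vstack([A, lam * L.T])
    rhs = np.concatenate([f_X, np.zeros(len(Z))])
    alpha, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    return alpha

# e.g. the lambda-star regularization of Corollary 1:
# alpha = regularized_ls(X, f_X, Z, lambda_star(fill_distance(X), m=3), m=3)
```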

We provide some demonstrations to verify several aspects of the proven theories. All numerical tests use the standard Whittle–Matérn–Sobolev kernel

$$\Phi_m(x) := \|x\|_{\ell_2(\mathbb{R}^d)}^{m-d/2}\, K_{m-d/2}\!\left( \|x\|_{\ell_2(\mathbb{R}^d)} \right) \quad\text{for all } x \in \mathbb{R}^d,$$

which satisfies (4) with exact Fourier transform $(1 + \|\omega\|_2^2)^{-m}$. We test $m > d/2$ as required in Lemma 1. Note that one can always scale the kernel by $\|x\|_2 \leftarrow \epsilon \|x\|_2$ with any $\epsilon > 0$. Instead of scaling, we normalize the test functions in $\Omega = [-1, 1]^2$ for easy comparison with fewer parameters. In particular, we consider the standard test functions

$$f_1(x, y) = \texttt{peaks}(3x, 3y), \qquad f_2(x, y) = \texttt{franke}(x/2 + 1/2,\, y/2 + 1/2),$$

and functions of the form $(x + y)^p$, which are in $W_2^2(\Omega)$ for any $p > 3/2$, aiming to test the limiting cases of the theories. In our implementation, we evaluate the function and its derivatives via a smoothing by machine epsilon $\varepsilon_{\mathrm{mach}} = 2.2 \times 10^{-16}$ and use

$$f_3(x, y) := \max\left( \varepsilon_{\mathrm{mach}},\, (x + y)^{3/2} \right).$$

All reported errors are estimated on a $51^2$ uniform grid.

4.1. Example 1: Oversampling ratio

The least-squares stability in Lemma 1 holds when the fill distance $h_X$ of the collocation points is small enough to fulfill (7), and also (10) if $\lambda < \lambda^* = h_X^{m-d/2}$. Although these denseness requirements are not known explicitly in practice, practitioners have seldom had stability problems with RBF least-squares approximation, which hints that (7) and (10) can be fulfilled easily by some oversampling with $n_X > n_Z$. To test the requirement (10), we set up an unconventional asymmetric interpolation problem. We take $n_Z = 21^2$ regular nodes as trial centers $Z$. A set of $n_X = n_Z$ collocation points $X$ was quasi-randomly generated by a Halton sequence. The resulting linear system (6) is therefore an asymmetric square system. Note that, with $\lambda = 0$, the solvability of such a system is not guaranteed by any theory. Fig. 1(a) shows the resulting interpolant, which fails badly near the boundary, where no collocation data is present. The colorbar there indicates the absolute error. In Fig. 1(b), the regularized fit using $\lambda = \lambda^*$ is shown. Appropriate regularization is necessary in cases of insufficient oversampling. Next, we include more points from the Halton sequence to obtain an expanded set of $n_X = 529$ collocation points, which yields an oversampling ratio of $n_X : n_Z = 1.2$. The corresponding least-squares approximants are shown in Fig. 2; there is no observable difference between the unregularized and regularized fits. The former is hence preferred due to its lower computational cost. The same observations can be made for the other two test functions, and thus we omit their results from this presentation.
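A sketch of the Example 1 point sets, using scipy's scrambled Halton generator as a stand-in for the paper's (unspecified) Halton implementation; the data values are placeholders, since MATLAB's peaks function is not reproduced here.

```python
import numpy as np
from scipy.stats import qmc

g = np.linspace(-1.0, 1.0, 21)
Z = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)   # n_Z = 21^2 = 441 regular nodes

X = 2.0 * qmc.Halton(d=2, seed=0).random(len(Z)) - 1.0    # n_X = n_Z Halton points in [-1,1]^2
f_X = np.cos(np.pi * X[:, 0]) * np.cos(np.pi * X[:, 1])   # placeholder data values

h_X = fill_distance(X)
alpha_unreg = regularized_ls(X, f_X, Z, lam=0.0, m=3)                     # Fig. 1(a) analogue
alpha_reg = regularized_ls(X, f_X, Z, lam=lambda_star(h_X, m=3), m=3)     # Fig. 1(b) analogue
```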


Fig. 1. Example 1. Least-squares approximants obtained with an $n_Z = 441$ uniformly placed set of trial centers $Z$ and an $n_X = n_Z$ quasi-random Halton set of collocation points $X$, with (a) no regularization ($\lambda = 0$) and (b) regularization parameter $\lambda^*$.

Fig. 2. Example 1. Least-squares approximants obtained by the setup in Fig. 1 but with oversampling nX = 1.2 nZ .

4.2. Example 2: $h_Z$-convergence

To test convergence in the case of scattered data points, we use a Halton sequence to create sets of $7^2 \le n_Z \le 21^2$ trial centers $Z$. All tested sets $Z$ are parts of the same Halton sequence with the same starting point and, hence, are nested. The oversampling ratio is fixed at $n_X = 1.2\, n_Z$, and we generate the collocation sets $X$ similarly with a different starting point; with a large enough skip, the resulting points in $X$ are distinct from those in $Z$. Then, we seek the unregularized ($\lambda = 0$) and $\lambda^*$-regularized least-squares approximants by solving (6). For easy comparison between the different test functions, we report the relative $W_2^2(\Omega)$ and $L_2(\Omega)$ errors

$$E_q = \frac{\|s_\lambda - f\|_{W_2^q(\Omega)}}{\|f\|_{W_2^q(\Omega)}}, \qquad q \in \{0, 2\}, \tag{14}$$

in Fig. 3 against $h_Z \approx (4/n_Z)^{1/2}$. We point out that our proven theories do not apply to the $L_2(\Omega)$ convergence of the unregularized cases, and yet we can see that the convergence profiles of the unregularized and regularized approximants are analogous. Under close inspection, the $\lambda^*$-regularized least-squares approximations display slightly more stable convergence behavior. Test by test, the peaks function $f_1$ allows convergence faster than predicted. The franke function $f_2$ is also infinitely smooth and results in similar superconvergence [19]. Most notably, the predicted $W_2^2(\Omega)$ divergence rate for the case $m = 2$ should be $(h_X/h_Z)^{-2} \approx 1.2^{-2}$ by Corollary 1, and yet convergent behavior can be clearly seen in Fig. 3(a)–(d). Now, we consider $f_3$ for a test of functions outside the native space. The $W_2^2(\Omega)$ error remains constant for all tested orders $m$ of the kernel and sets $Z$, whereas the regularized least-squares approximations converge with order 2 even for kernels with higher order.

Fig. 3. Example 2. $W_2^q(\Omega)$ relative errors $E_q$, $q \in \{0, 2\}$, with respect to $h_Z = (4/n_Z)^{1/2}$, for the unregularized and $\lambda^*$-regularized least-squares approximants.
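A convergence-study sketch in the spirit of this example, again reusing the helpers defined earlier. We only track the relative $L_2(\Omega)$ error $E_0$ on an evaluation grid (the $W_2^2(\Omega)$ error requires derivatives of the approximant and is omitted), and the test function is a placeholder, not the paper's peaks or franke.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.stats import qmc

def relative_L2_error(alpha, Z, f, m, n_grid=51):
    """E_0 of (14), approximated on an n_grid x n_grid uniform grid."""
    g = np.linspace(-1.0, 1.0, n_grid)
    G = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)
    s_G, f_G = matern_kernel(cdist(G, Z), m) @ alpha, f(G)
    return np.linalg.norm(s_G - f_G) / np.linalg.norm(f_G)

f = lambda P: np.cos(np.pi * P[:, 0]) * np.cos(np.pi * P[:, 1])   # placeholder test function
for n_Z in [7**2, 11**2, 15**2, 21**2]:
    Z = 2.0 * qmc.Halton(d=2, seed=0).random(n_Z) - 1.0           # nested as n_Z grows
    X = 2.0 * qmc.Halton(d=2, seed=1).random(int(1.2 * n_Z)) - 1.0
    alpha = regularized_ls(X, f(X), Z, lam=0.0, m=2)
    print(n_Z, relative_L2_error(alpha, Z, f, m=2))
```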

4.3. Example 3: $h_X$-divergence

We further investigate a possible $h_X$-divergence in the upper bound of the error estimate suggested by Corollary 1 when $\mu > 0$. We focus on $\mu = 2$ and study the $W_2^2(\Omega)$ relative errors $E_2$. For ease of reproduction, we only consider uniform sets $Z$ and $X$. The number of trial centers is fixed at $n_Z = 21^2$ with $h_Z \approx 10^{-1}$. The number of collocation points is $n_X \approx \gamma\, n_Z$ for $1.1 \le \gamma \le 10$. Because $f_1$ and all least-squares approximants are smooth, having some tested $n_X$ larger than the number of error evaluation points does not affect the results. Among all three test functions, the peaks function reveals the most noticeable trends of $h_X$-divergence; see Fig. 4 for the results with kernel smoothness orders $m = 3$ and $m = 4$. Despite observing the trends, the rate of divergence is much below the predicted rate of $-2$. Although we did not further investigate larger oversampling ratios, we conjecture that $h_X$-divergence, if present, is negligible in practice.


Fig. 4. Example 3. $W_2^2(\Omega)$ error profiles of least-squares approximation using the same uniform $n_Z = 441$ trial centers against different sets of uniformly distributed collocation points under various oversampling ratios.

Fig. 5. Example 4. Error profiles of least-squares approximation based on noisy data against the maximum pointwise noise $\xi = \max_X |f(x) - f_\delta(x)|$.

4.4. Example 4: Noisy data

Using the $n_Z = 21^2$ uniform trial centers and uniform collocation points with oversampling ratio $n_X = 1.2\, n_Z$ as in the last example, we test the linear divergence with respect to noise shown in Corollary 2. We focus on reconstructing the first two smooth functions using kernels with $m = 2$. Noisy function values at the collocation points in $X$ were generated by

$$f_\delta = f + \operatorname{Unif}(-\xi, \xi) \quad\text{with } 10^{-5} \le \xi \le 15.$$
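Continuing the sketches above, the Example 4 pipeline (noise injection, then the $\lambda_\delta$ fit) might look as follows; the Sobolev-norm estimate used to convert the noise amplitude $\xi$ into the $\delta_\infty$ of (13) is a placeholder.

```python
import numpy as np

rng = np.random.default_rng(0)
xi = 1e-2                                        # noise amplitude
f_noisy = f(X) + rng.uniform(-xi, xi, size=len(X))

f_norm_estimate = 1.0                            # placeholder for ||f||_{W_2^m(Omega)}
delta_inf = xi / f_norm_estimate                 # relative noise level of (13)
lam = lambda_delta(fill_distance(X), delta_inf)
alpha_noisy = regularized_ls(X, f_noisy, Z, lam=lam, m=2)
```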

Fig. 5 shows the various relative errors $E_q$ as in (14), for $q = 0, 1, 2$, of the least-squares reconstructions (12) of the peaks function $f_1$ and the franke function $f_2$ with $\lambda = 0$ and $\lambda = \lambda_\delta$. To identify the parameter $\delta_\infty$ in (13), we must first have an estimate of $\|f\|_{W_2^m(\Omega)}$, which is a nontrivial task in general. For the sake of demonstration, we make such an estimation based on the same procedure as the error evaluation. We once again see that the unregularized and regularized least-squares approximations yield similar accuracy, and the expected linear divergence with respect to noise can be observed.

4.5. Example 5: An application to heat equations

As mentioned in the introduction, the proposed approximation methods allow us to obtain approximate initial conditions in the trial space. From there, we can update with a method of lines or Rothe's time-stepping method. Consider the following unforced heat equation in $\Omega = [-1, 1]^2$:

$$\begin{cases}
u_t(x, y, t) = \Delta u(x, y, t) & \text{for } (x, y, t) \in \Omega \times (0, T], \\
u(x, y, t) = 0 & \text{for } (x, y) \in \partial\Omega,\ t \in [0, T], \\
u(x, y, 0) = g(x, y) & \text{for } (x, y) \in \Omega.
\end{cases} \tag{15}$$


Fig. 6. Example 5. Snapshots at times $t = 0.25$ and $0.5$ of numerical solutions of the heat equation obtained by the method of lines with (a) unregularized and (b) $\lambda^*$-regularized initial conditions (IC), and by (c) the finite element method.

We set the initial function as

$$g(x, y) = \cos(\pi x / 2)\, \cos(\pi y / 2)\, \arctan(50 (x - y)),$$

which "jumps" along the line $x = y$ and is compatible with the zero boundary condition. For demonstration purposes only, we set up a standard RBF method of lines [20] for (15), in which we treat the RBF coefficients as time dependent. The Whittle–Matérn–Sobolev kernel with smoothness order $m = 3$ is used in this demonstration. We use sets of interior and boundary collocation points $X \subset \Omega$ and $Y \subset \partial\Omega$, and simply take $Z = X \cup Y$ as trial centers. By first enforcing the boundary conditions at $Y$, we obtain an $n_X \times n_X$ ODE system, which is solved by the MATLAB ODE45 solver. To capture this initial condition with rapid changes well, we use quasi-uniform (i.e., without refinement near the jump) data points in $\Omega$ with fill distance approximately $h_Z / 3$ to approximate $g$ and, hence, obtain the initial value for the ODE system set up above. In Fig. 6, we present the numerical solutions (with $n_Z \approx 600$ and $h_Z \approx 0.08$) corresponding to the unregularized ($\lambda = 0$) and $\lambda^*$-regularized approximants of the initial condition $g$. The finite element solution (with 2577 nodes) obtained by the MATLAB PDE Toolbox is also presented for comparison. At $t = 0.25$ and $0.5$, we can see that the $\lambda^*$-regularized and FEM solutions are analogous. The unregularized solution clearly does not decay as fast as the other two solutions; this is the consequence of the oscillations near the jump at the initial stage.
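A compact method-of-lines sketch for (15) is given below. It deviates from the paper's setup in labeled ways: a Gaussian kernel replaces the Matérn kernel because its Laplacian has a simple closed form, the boundary rows are kept in a square collocation system instead of being eliminated, the initial coefficients are obtained by plain interpolation of $g$ rather than the regularized least-squares fit, and the shape parameter and point counts are illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.integrate import solve_ivp

eps = 3.0                                             # Gaussian shape parameter (assumed)
phi = lambda r: np.exp(-(eps * r) ** 2)
lap_phi = lambda r: (4 * eps**4 * r**2 - 4 * eps**2) * np.exp(-(eps * r) ** 2)  # Laplacian, d = 2

g1 = np.linspace(-1.0, 1.0, 15)
P = np.stack(np.meshgrid(g1, g1), axis=-1).reshape(-1, 2)
bdy = np.abs(P).max(axis=1) == 1.0
X_in, Y_bd = P[~bdy], P[bdy]                          # interior / boundary points
Z = np.vstack([X_in, Y_bd])                           # trial centers Z = X union Y

g = lambda Q: (np.cos(np.pi * Q[:, 0] / 2) * np.cos(np.pi * Q[:, 1] / 2)
               * np.arctan(50.0 * (Q[:, 0] - Q[:, 1])))

A = np.vstack([phi(cdist(X_in, Z)),                   # u at interior points
               phi(cdist(Y_bd, Z))])                  # u = 0 rows on the boundary
B = np.vstack([lap_phi(cdist(X_in, Z)),               # Laplacian at interior points
               np.zeros((len(Y_bd), len(Z)))])        # boundary rows stay zero in time

alpha0 = np.linalg.solve(phi(cdist(Z, Z)), g(Z))      # interpolatory initial coefficients
C = np.linalg.solve(A, B)                             # alpha'(t) = C @ alpha(t)
sol = solve_ivp(lambda t, a: C @ a, (0.0, 0.5), alpha0, t_eval=[0.25, 0.5])
u_final = phi(cdist(Z, Z)) @ sol.y[:, -1]             # solution values at Z at t = 0.5
```

Since $g$ vanishes on $\partial\Omega$, the interpolatory initial coefficients already satisfy the boundary rows of the system, and the zero rows of B keep them satisfied for all $t$.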

5. Conclusion

We provided Sobolev error estimates for regularized RBF discrete least-squares approximation. With appropriate regularization, least-squares stability holds for any sampling ratio of the fill distances of the collocation points and trial centers. This yields various $W_2^\mu(\Omega)$ error estimates for $\mu \ge 0$. For the unregularized least-squares approach, stability can only be shown when the collocation points are sufficiently dense with respect to the trial centers; this was demonstrated by a numerical example. The consequence is an extra condition on all of the proven error estimates, which then only hold for $\mu > d/2$. With sufficiently dense collocation points, the unregularized least-squares method shows numerical convergence profiles that closely resemble the regularized case. It is our future work to develop new theoretical tools to extend the results to $0 \le \mu \le d/2$.

Acknowledgements

This work was partially supported by a Hong Kong Research Grant Council GRF Grant.


References

[1] M.D. Buhmann, Radial Basis Functions: Theory and Implementations, Cambridge Monographs on Applied and Computational Mathematics 12, Cambridge University Press, Cambridge, 2003.
[2] H. Wendland, Scattered Data Approximation, Cambridge Monographs on Applied and Computational Mathematics 17, Cambridge University Press, Cambridge, 2005.
[3] G.E. Fasshauer, Meshfree Approximation Methods with MATLAB, Interdisciplinary Mathematical Sciences 6, World Scientific, Hackensack, NJ, 2007.
[4] G.E. Fasshauer, M.J. McCourt, Kernel-based Approximation Methods Using MATLAB, Interdisciplinary Mathematical Sciences 19, World Scientific, Hackensack, NJ, 2015.
[5] F.J. Narcowich, J.D. Ward, H. Wendland, Sobolev error estimates and a Bernstein inequality for scattered data interpolation via radial basis functions, Constr. Approx. 24 (2) (2006) 175–186.
[6] J.D. Ward, Least squares approximation using radial basis functions: an update, in: M. Neamtu, E.B. Saff (Eds.), Advances in Constructive Approximation: Vanderbilt 2003, Nashboro Press, Brentwood, TN, 2004, pp. 499–508.
[7] F.J. Narcowich, J.D. Ward, H. Wendland, Sobolev bounds on functions with scattered zeros, with applications to radial basis function surface fitting, Math. Comp. 74 (250) (2005) 743–763.
[8] Q.T. Le Gia, F.J. Narcowich, J.D. Ward, H. Wendland, Continuous and discrete least-squares approximation by radial basis functions on spheres, J. Approx. Theory 143 (2006) 124–133.
[9] L.C. Evans, Partial Differential Equations, Graduate Studies in Mathematics, American Mathematical Society, Providence, RI, 1998. Reprinted with corrections: 1999, 2002.
[10] M. Scheuerer, R. Schaback, M. Schlather, Interpolation of spatial data – a stochastic or a deterministic problem? Eur. J. Appl. Math. 24 (2013) 601–629.
[11] R. Tibshirani, Regression shrinkage and selection via the lasso, J. Royal Stat. Soc.: Series B 58 (1) (1996) 267–288.
[12] A.E. Hoerl, R.W. Kennard, Ridge regression: biased estimation for nonorthogonal problems, Technometrics 12 (1) (1970) 55–67.
[13] H. Zou, T. Hastie, Regularization and variable selection via the elastic net, J. Royal Stat. Soc.: Series B 67 (2) (2005) 301–320.
[14] J.K. Tay, J. Friedman, R. Tibshirani, Principal component-guided sparse regression, arXiv:1810.04651 (2018).
[15] A. Iske, Approximation, Springer-Lehrbuch Masterclass, Springer Spektrum, 2018.
[16] R. Arcangéli, M.C.L. de Silanes, J.J. Torrens, An extension of a bound for functions in Sobolev spaces, with applications to (m, s)-spline interpolation and smoothing, Numer. Math. 107 (2) (2007) 181–211.
[17] K.C. Cheung, L. Ling, R. Schaback, H2-convergence of least-squares kernel collocation methods, SIAM J. Numer. Anal. 56 (1) (2018) 614–633.
[18] R. Schaback, All well-posed problems have uniformly stable and convergent discretizations, Numer. Math. 132 (2016) 597–630.
[19] R. Schaback, Superconvergence of kernel-based interpolation, J. Approx. Theory 235 (2018) 1–19.
[20] Y.C. Hon, R. Schaback, M. Zhong, The meshless kernel-based method of lines for parabolic equations, Comput. Math. Appl. 68 (12A) (2014) 2057–2067.