Available online at www.sciencedirect.com
Journal of Approximation Theory 171 (2013) 65–83 www.elsevier.com/locate/jat
Full length article
Interpolation and approximation in Taylor spaces
Barbara Zwicknagl a, Robert Schaback b,∗
a Institut für Angewandte Mathematik, Universität Bonn, Endenicher Allee 60, 53115 Bonn, Germany
b Institut für Numerische und Angewandte Mathematik, Universität Göttingen, Lotzestraße 16-18, D–37073 Göttingen, Germany
Received 25 June 2012; accepted 9 March 2013
Available online 19 March 2013
Communicated by Paul Nevai
Abstract
The univariate Taylor formula without remainder allows one to reproduce a function completely from certain derivative values. Thus one can look for Hilbert spaces in which the Taylor formula acts as a reproduction formula. It turns out that there are many Hilbert spaces which allow this, and they should be called Taylor spaces. They have certain reproducing kernels which are either polynomials or power series with nonnegative coefficients. Consequently, Taylor spaces can be spanned by translates of various classical special functions such as exponentials, rationals, hyperbolic cosines, logarithms, and Bessel functions. Since the theory of kernel-based interpolation and approximation is well-established, this leads to a variety of results. In particular, interpolation by shifted exponentials, rationals, hyperbolic cosines, logarithms, and Bessel functions provides exponentially convergent approximations to analytic functions, generalizing the classical Bernstein theorem for polynomial approximation to analytic functions. Finally, we prove sampling inequalities in Taylor spaces that allow one to derive similar convergence rates for non-interpolatory approximations.
© 2013 Elsevier Inc. All rights reserved.
Keywords: Univariate interpolation; Nonlinear approximation; Kernels; Reproducing kernel Hilbert spaces; Error bounds; Convergence orders
∗ Corresponding author.
E-mail addresses: [email protected] (B. Zwicknagl), [email protected], [email protected] (R. Schaback).
0021-9045/$ - see front matter © 2013 Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.jat.2013.03.006
B. Zwicknagl, R. Schaback / Journal of Approximation Theory 171 (2013) 65–83
1. Introduction

The theory of reproducing kernels $K$ in Hilbert spaces $H$ of continuous functions on some domain $\Omega$ is well-known [7,18], and there are plenty of applications in Numerical Analysis, Stochastics, and Machine Learning [25]. The usual reproduction formula is
$$f(x) = (f, K(x,\cdot))_H \quad \text{for all } x \in \Omega,\ f \in H, \tag{1}$$
but one of the most important reproduction formulas known otherwise is the Taylor formula
$$f(x) = \sum_{j=0}^{\infty} \frac{f^{(j)}(0)}{j!}\, x^j \quad \text{for all } x \in I,\ f \in F \tag{2}$$
for functions in some space $F$ of analytic functions on some interval $I$ around the origin of $\mathbb{R}$. It is well-known [7,18] that each reproducing kernel Hilbert space has a unique kernel and a unique reproduction formula, but the converse is not true: there may be different function spaces with different kernels having the same reproduction formula. Therefore this paper focuses on cases where the Taylor formula (2) takes the form (1) of a reproduction formula (see [12]). It turns out that there is a variety of interesting kernels allowing this identification, i.e. roughly all kernels $K(x,t)$ which are power series in $xt$ with nonnegative coefficients. Each of these kernels defines a “native” Hilbert space in which it is reproducing via both (2) and (1). Following [12], all such spaces will be called Taylor spaces in this paper, and it turns out that classical function spaces like Hardy, Bergman, and Dirichlet spaces are special examples. This unifying approach allows the application of standard results concerning interpolation and approximation by spaces of the form
$$F_X := \operatorname{Span}\{K(x,\cdot) : x \in X\} \tag{3}$$
for fixed finite sets $X := \{x_0, \ldots, x_n\} \subset I$. Thus the trial spaces can consist of exponentials, rationals, hyperbolic cosines, logarithms, and Bessel functions, for instance, depending on which kernel is chosen. Consequently, we can derive results on approximation and interpolation by those families from known results on kernel-based interpolation and approximation (see [26] and the references therein). These include optimality properties and error bounds, the latter being particularly interesting when they take the form of Bernstein-type theorems concerning spectral approximation orders for approximations of analytic functions. For interpolation, such theorems come as special cases of general results of [28] on power series kernels. We recall some properties of kernel-based interpolants in Sections 3 and 4, and present applications to popular functions in Section 5. A more general approach uses sampling inequalities in the framework of [21], for which a priori bounds have to be proven. The paper closes with some numerical examples.

2. Taylor spaces

We deal with Taylor spaces in the spirit of [12], and hope that there will be no confusion with the related Taylor spaces introduced by Calderón and Zygmund [9]. We fix a subset $\mathcal{N}$ of $\mathbb{N} := \{0, 1, 2, \ldots\}$ and consider functions $f$ for which the remainder-free Taylor formula
$$f(x) = \sum_{j \in \mathcal{N}} \frac{f^{(j)}(0)}{j!}\, x^j \tag{4}$$
is valid and allows reproduction of $f$ via its derivatives at zero.
If $\mathcal{N}$ is finite and of the form $\mathcal{N} := \{0, 1, \ldots, N\}$, then the admissible functions $f$ form the space $P_N$ of polynomials of degree at most $N$. Other finite sets $\mathcal{N}$ lead to special spaces of lacunary polynomials. If $\mathcal{N}$ is an infinite subset of $\mathbb{N} := \{0, 1, 2, \ldots\}$, then (4) is required to be valid in a neighborhood of the origin with absolute convergence. This suggests working in the complex plane $\mathbb{C}$ in all cases, allowing complex-valued functions, and assuming that (4) holds at least on some disk $D_R := \{x \in \mathbb{C} : |x| < R\}$ of finite radius $R > 0$ with absolute convergence. But we can also consider cases where (4) is absolutely convergent in the full complex plane, defining an entire function $f$ there, and we shall refer to this case via $R = \infty$.

Before we take a closer look at spaces of functions satisfying (4) for infinite $\mathcal{N}$, we consider ways to turn (4) into a reproduction formula of the form (1). Here we focus on a special technique using weighted series expansions (see [24,12]). We take positive weights $\lambda_j$ for all $j \in \mathcal{N}$ satisfying
$$\sum_{j\in\mathcal{N}} \frac{\lambda_j R^{2j}}{(j!)^2} < \infty \ \text{ if } R < \infty, \qquad \sum_{j\in\mathcal{N}} \frac{\lambda_j e^{j}}{j!\,\sqrt{j}} < \infty \ \text{ if } R = \infty, \tag{5}$$
noting that the first condition follows for all $R > 0$ if the second is satisfied. Then we form the weighted inner product
$$(f, g)_F := \sum_{j\in\mathcal{N}} \frac{f^{(j)}(0)\, g^{(j)}(0)}{\lambda_j} \tag{6}$$
on the space of all functions $f$ which have all derivatives $f^{(j)}(0)$ for $j \in \mathcal{N}$, satisfy (4) and additionally also
$$\|f\|_F^2 := \sum_{j\in\mathcal{N}} \frac{|f^{(j)}(0)|^2}{\lambda_j} < \infty. \tag{7}$$
The inner product (6) defines a Hilbert space $F$ that we call a Taylor space. We drop the dependence of $F$ on the sets $\mathcal{N}$ and $\Lambda := \{\lambda_j : j \in \mathcal{N}\}$ for the reader’s convenience, but note that the weight set $\Lambda$ will in all cases determine the inner product structure on $F$. Rewritten in power series form, the functions in $F$ are
$$f(z) = \sum_{j\in\mathcal{N}} a_j z^j \quad\text{with}\quad \sum_{j\in\mathcal{N}} \frac{(j!)^2\, |a_j|^2}{\lambda_j} < \infty$$
where $a_j = f^{(j)}(0)/j!$. The inequality above is equivalent to (7), and together with (5) it implies that (4) is absolutely convergent in $D_R$ for finite $R$ and in $\mathbb{C}$ for $R = \infty$. In fact, in $D_R$ we have
$$|f(z)|^2 \le \sum_{j\in\mathcal{N}} \frac{|a_j|^2 (j!)^2}{\lambda_j} \cdot \sum_{j\in\mathcal{N}} \frac{R^{2j} \lambda_j}{(j!)^2} < \infty \tag{8}$$
by the Cauchy–Schwarz inequality.

Theorem 2.1. If $\mathcal{N}$ is finite then the Taylor space $F$ with respect to $\mathcal{N}$ consists of the span of the monomials $x^j$ for $j \in \mathcal{N}$. If $\mathcal{N}$ is infinite, and if weights $\lambda_j$ are chosen with (5), the Taylor space $F$ consists of real-analytic functions with power series expansions around zero which are absolutely convergent in $D_R$. The space will depend on $\Lambda$, consist of all functions $f$ of the form (4) with (7) and carry the inner product (6).

We still have to turn (4) into a Hilbert space reproduction formula
$$f(x) = (f, K(x,\cdot))_F \quad \text{for all } x \in D_R,\ f \in F \tag{9}$$
with a suitable positive (semi-)definite reproducing kernel $K : D_R \times D_R \to \mathbb{C}$. As is well-known [7,18], such a kernel must exist and is uniquely defined.

Theorem 2.2. The kernel
$$K(x,t) := \sum_{j\in\mathcal{N}} \lambda_j \frac{(tx)^j}{(j!)^2} =: \kappa(tx), \quad x, t \in I \tag{10}$$
is well-defined due to (5) and the series is absolutely convergent for infinite $\mathcal{N}$ and for all $x, t \in D_R$. It is positive (semi-)definite and reproducing in the Taylor space, and the Taylor formula coincides with the reproduction formula (1).

Proof (see [12,28]). The first sentence follows from arguments we have used before. Now we evaluate
$$\frac{\partial^j}{\partial t^j}\Big|_{t=0} K(x,t) = \lambda_j \frac{x^j}{j!}, \quad \text{where } \lambda_j := 0 \text{ if } j \notin \mathcal{N},$$
and
$$(f, K(x,\cdot))_F = \sum_{j\in\mathcal{N}} \frac{f^{(j)}(0)}{\lambda_j} \cdot \frac{\lambda_j x^j}{j!} = f(x).$$
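As a hedged numerical illustration of Theorem 2.2 (it is not part of the original paper), the following Python sketch sums the truncated series (10) for three weight sequences and compares the results with the corresponding closed-form kernels. It also evaluates the inner product (6) between a sample function and $K(x,\cdot)$. The helper names `kappa` and `inner_product`, the truncation length of 60 terms, and the sample function $f(x) = 1/(2-x)$ are assumptions made for this example only.

```python
import math

def kappa(z, weights, terms=60):
    """Truncated series kappa(z) = sum_j lambda_j z^j / (j!)^2, cf. (10)."""
    return sum(weights(j) * z**j / math.factorial(j)**2 for j in range(terms))

def inner_product(df0, dg0, weights, terms=60):
    """Truncated inner product (6): sum_j f^(j)(0) g^(j)(0) / lambda_j."""
    return sum(df0(j) * dg0(j) / weights(j) for j in range(terms))

# Weight sequences lambda_j for three kernels with known closed forms.
def szego_weights(j):
    return math.factorial(j)**2            # kappa(z) = 1/(1 - z)

def exp_weights(j):
    return math.factorial(j)               # kappa(z) = exp(z)

def log_weights(j):
    return math.factorial(j)**2 / (j + 1)  # kappa(z) = -log(1 - z)/z

z = 0.3
kernel_checks = {
    "szego": (kappa(z, szego_weights), 1 / (1 - z)),
    "exp": (kappa(z, exp_weights), math.exp(z)),
    "log": (kappa(z, log_weights), -math.log(1 - z) / z),
}

# Reproduction check in the Szegő case for f(x) = 1/(2 - x) = sum_j x^j / 2^(j+1):
# f^(j)(0) = j!/2^(j+1), and the j-th t-derivative of K(x, t) at t = 0
# equals lambda_j x^j / j!, as computed in the proof of Theorem 2.2.
x = 0.5
reproduced = inner_product(
    lambda j: math.factorial(j) / 2**(j + 1),
    lambda j: szego_weights(j) * x**j / math.factorial(j),
    szego_weights,
)
exact = 1 / (2 - x)
```

In this sketch each pair in `kernel_checks` holds the truncated series value and the closed form, and `reproduced` should agree with `exact` up to rounding, which mirrors the identity $(f, K(x,\cdot))_F = f(x)$.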
This yields plenty of examples of kernels when starting from power series
$$\kappa(z) = \sum_{j=0}^{\infty} \lambda_j \frac{z^j}{(j!)^2}$$
which represent complex-analytic functions in a neighborhood of the origin. The connection to kernels $K(x,t)$ is via $K(x,t) = \kappa(tx)$. It allows selections of $\lambda_j$ which grow like $R^j (j!)^2$ for $j \to \infty$ and arbitrary $R > 0$. Consequently, there are many spaces which allow the Taylor formula for reproduction. Table 1 gives some examples that we partially cover later in more detail. Many of these are special instances of hypergeometric functions (see [1]), i.e.
$${}_2F_1(a, b; c; z) = \sum_{n=0}^{\infty} \frac{(a)_n (b)_n}{(c)_n} \frac{z^n}{n!}, \qquad {}_pF_q(a_1, \ldots, a_p; b_1, \ldots, b_q; z) = \sum_{n=0}^{\infty} \frac{(a_1)_n \cdots (a_p)_n}{(b_1)_n \cdots (b_q)_n} \frac{z^n}{n!}$$
with the Pochhammer symbol $(a)_n := a(a+1)\cdots(a+n-1)$. Other and much more general cases for multivariate power series kernels are covered by [28]. In complex analysis, special cases of Taylor kernels and their associated native spaces are well-studied (see e.g. [23,14,4,3]).

Table 1. Some kernels for Taylor spaces.

| $\kappa(z) = \sum_{j\in\mathcal{N}} \lambda_j z^j/(j!)^2$ | $\mathcal{N}$ | $\lambda_j$ |
|---|---|---|
| $(1-z)^{-1}$, $0 \le \lvert z\rvert < 1$ | $\mathbb{N}$ | $(j!)^2$ |
| $(1-z^2)^{-1}$, $0 \le \lvert z\rvert < 1$ | $2\mathbb{N}$ | $((j/2)!)^2$ |
| $(1-z)^{-\alpha}$, $\alpha \in \mathbb{N}_{>0}$, $0 \le \lvert z\rvert < 1$ | $\mathbb{N}$ | $\frac{(\alpha+j-1)!\, j!}{(\alpha-1)!}$ |
| $(1-z^2)^{-\alpha}$, $\alpha \in \mathbb{N}_{>0}$, $0 \le \lvert z\rvert < 1$ | $2\mathbb{N}$ | $\frac{(\alpha-1+j/2)!\,(j/2)!}{(\alpha-1)!}$ |
| $-\frac{\log(1-z)}{z}$, $0 \le \lvert z\rvert < 1$ | $\mathbb{N}$ | $\frac{(j!)^2}{j+1}$ |
| $\exp(z)$ | $\mathbb{N}$ | $j!$ |
| $\sinh(z)$ | $2\mathbb{N}+1$ | $j!$ |
| $\sinh(z)/z$ | $2\mathbb{N}$ | $\frac{j!}{j+1}$ |
| $\cosh(z)$ | $2\mathbb{N}$ | $j!$ |
| $z^{-\alpha} I_\alpha(z)$ | $2\mathbb{N}$ | $\frac{(j!)^2}{2^{j+\alpha}\,\Gamma(\alpha+1+j/2)\,(j/2)!}$ |

3. Interpolation

In what follows, we consider interpolation at finite sets $X := \{x_0, \ldots, x_n\} \subset D_R$ of pairwise different nodes by functions from the space (3). Interpolation of arbitrary data on $X$ requires nonsingularity of the kernel matrix $A_X := (K(x_j, x_k))_{0 \le j,k \le n}$. It is Hermitian and positive semidefinite because it can be written as a Gramian, but it is positive definite only if additional conditions are satisfied. Before we focus on such conditions, we consider the case that the interpolated data on $X$ come from a function $f$ in the Taylor space $F$ with kernel $K$. For these special data, the system is always solvable. In fact, the Hilbert space projector $\Pi_X$ from $F$ to $F_X$ yields a function
$\Pi_X(f) = s_{f,X} \in F_X$ such that $f - \Pi_X(f)$ is orthogonal to $F_X$. By (1) this means that $s_{f,X}$ interpolates $f$ in $X$. For checking positive definiteness of the kernel matrix, let $a \in \mathbb{C}^{n+1}$ be a vector with
$$0 = \bar{a}^T A_X a = \sum_{j\in\mathcal{N}} \frac{\lambda_j}{(j!)^2} \left| \sum_{m=0}^{n} a_m x_m^j \right|^2.$$
Positive definiteness of the kernel matrix is guaranteed if
$$0 = \sum_{m=0}^{n} a_m x_m^j \quad \text{for all } j \in \mathcal{N}$$
implies $a = 0$, i.e. iff the $|\mathcal{N}| \times |X|$ Vandermonde matrix built from the $n + 1 = |X|$ points in $X$ and the $|\mathcal{N}|$ exponents in $\mathcal{N}$ has rank $n + 1$. In case $\mathcal{N} = \{0, 1, \ldots, N\}$ with $N \ge n$, the matrix is positive definite, but for other cases of finite $\mathcal{N}$, the above nondegeneracy condition is all we can say here. However, for $\mathcal{N} = \mathbb{N}$ we have positive definiteness for all point sets. The same holds for $\mathcal{N} = 2\mathbb{N}$ if symmetry under squaring is avoided, i.e. if all $|x_m|^2$ are different.

4. Properties of interpolants

The existence of interpolants being settled to some extent, we can apply known results from the theory of scattered data interpolation to interpolation in Taylor spaces. For more details and proofs we refer to [26] and the references therein. The first type of results concerns optimality. The interpolant $s_{f,X} \in F_X$ to some function $f \in F$ on a discrete set $X$ satisfies the norm-minimality property
$$\|s_{f,X}\|_F \le \|g\|_F \quad \text{for all } g \in F \text{ with } g|_X = f|_X$$
and in particular $\|s_{f,X}\|_F \le \|f\|_F$. We now focus on error bounds and another optimality criterion. Consider all linear functionals of the form
$$\mu_{a,x} := f \mapsto f(x) - \sum_{j=0}^{n} a_j f(x_j)$$
with arbitrary vectors $a \in \mathbb{C}^{n+1}$ and arbitrary but fixed $x \in D_R$. It is well-known in the context of reproducing kernel Hilbert spaces that the solution of
$$\min_{a \in \mathbb{C}^{n+1}} \|\mu_{a,x}\|_{F^*} =: P_X(x) \tag{11}$$
exists and is attained at a vector $a = u(x) \in \mathbb{C}^{n+1}$ which solves the system
$$K(x, x_k) = \sum_{j=0}^{n} u_j(x) K(x_j, x_k), \quad 0 \le k \le n,$$
which is solvable because of $K(x, \cdot) \in F$. The solution is unique and satisfies Lagrange conditions $u_j(x_k) = \delta_{jk}$ if the system is nonsingular, but in general we know only that the
solution may be nonunique. We can now define the function
$$L_{f,X}(x) := \sum_{j=0}^{n} f(x_j)\, u_j(x) \quad \text{for all } x \in D_R$$
and by construction we have the standard error bound
$$|f(x) - L_{f,X}(x)| \le P_X(x)\, \|f\|_F \quad \text{for all } f \in F,\ x \in D_R \tag{12}$$
with the power function $P_X(x)$ defined in (11). This function can be explicitly calculated since
$$P_X(x)^2 = \|\mu_{u(x),x}\|_{F^*}^2 = K(x,x) - \sum_{j=0}^{n} u_j(x) K(x, x_j) - \sum_{j=0}^{n} u_j(x) K(x_j, x) + \sum_{j,k=0}^{n} u_j(x) u_k(x) K(x_j, x_k).$$
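The power function formula can be checked on a small case. The following Python sketch is an illustration under stated assumptions, not the authors' code: it uses the kernel $K(x,t) = 1/(1-xt)$ on five rational nodes, exact `Fraction` arithmetic to sidestep ill-conditioning, and the sample function $f = K(\cdot, y)$, whose norm $\|f\|_F^2 = K(y,y)$ follows from the reproduction formula. The names `szego`, `solve`, and `power_function_sq` are hypothetical helpers.

```python
from fractions import Fraction

def szego(x, t):
    """Kernel K(x, t) = 1/(1 - x t) for real arguments in (-1, 1)."""
    return 1 / (1 - x * t)

def solve(A, b):
    """Gauss-Jordan elimination; exact when all entries are Fractions."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [mr - factor * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def power_function_sq(x, nodes, kernel=szego):
    """Return P_X(x)^2 and u(x) solving K(x, x_k) = sum_j u_j(x) K(x_j, x_k)."""
    A = [[kernel(xj, xk) for xk in nodes] for xj in nodes]
    kx = [kernel(x, xk) for xk in nodes]
    u = solve(A, kx)  # A is symmetric for real nodes
    n = len(nodes)
    cross = sum(u[j] * kx[j] for j in range(n))
    quad = sum(u[j] * u[k] * A[j][k] for j in range(n) for k in range(n))
    # For real nodes both mixed sums in the formula coincide, hence the factor 2.
    return kernel(x, x) - 2 * cross + quad, u

nodes = [Fraction(k, 4) for k in range(-2, 3)]  # -1/2, -1/4, 0, 1/4, 1/2
x = Fraction(1, 3)
p_sq, u = power_function_sq(x, nodes)

# Sample function f = K(., y) with native-space norm ||f||_F^2 = K(y, y).
y = Fraction(3, 5)
f = lambda s: szego(s, y)
lagrange_value = sum(f(xj) * uj for xj, uj in zip(nodes, u))
error_sq = (f(x) - lagrange_value) ** 2
bound_sq = p_sq * szego(y, y)
p_sq_at_node, _ = power_function_sq(nodes[1], nodes)
```

Here `error_sq <= bound_sq` is the squared form of (12), and `p_sq_at_node` vanishes because the functional $\mu_{u(x),x}$ is zero when $x$ is itself a node.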
Note that $P_X$ is uniquely defined even if the interpolation problem is not uniquely solvable. Since (11) implies that the optimal functional $\mu_{u(x),x}$ must be orthogonal to all functionals $\delta_{x_j}$, we know that $L_{f,X}$ is an interpolant to $f$ on $X$ which will coincide with $s_{f,X}$ if the system is nonsingular. In any case, the error bound (12) is useful. For instance, one can generate new interpolation points by choosing maximal points of $P_X$ (see [10]), or one can assess the quality of polynomial interpolation on point sets whose cardinality is much lower than the degree (see Section 7).

If we are interested in asymptotic results for $n \to \infty$, we have to confine ourselves to the case of infinite $\mathcal{N}$. Furthermore, in Section 5, we consider the framework described above and restrict ourselves to interpolation and evaluation on intervals $I := [-a, a]$ with $a \le R$. We consider only functions $f$ from $F$ with real coefficients in their power series, and we denote the resulting space by $F_{\mathbb{R}}$. If we interpolate functions $f$ from $F_{\mathbb{R}}$, we shall make use of the analyticity of $f$ and apply results of [28] after some modifications. These error bounds will in all cases come out to be of spectral order in terms of the fill distance
$$h := \sup_{x \in I} \min_{x_j \in X} |x_j - x|$$
of $X$ in $I$. Since we shall later work with univariate functions on bounded intervals only, we can assume $h \approx 1/n$ in case of $n$ quasi-uniformly distributed data points, i.e. those with bounded mesh ratio in the sense used in spline theory. Building on these results, the more general case of non-interpolatory approximation is addressed in Section 6.

5. Examples

5.1. Hardy space

We consider the case $\kappa(z) = (1 - z)^{-1}$ on the open unit disk $D_1$. This yields the Szegő kernel
$$S(\omega, z) := \frac{1}{1 - z\omega} \quad \text{for all } \omega, z \in D_1.$$
The native Hilbert space for this kernel is the well-known Hardy space $H^2$, which consists of those functions analytic in the unit disk whose Taylor coefficients form a square-summable series. It is
known (see, e.g., [4,3]) that the norm can also be realized as an integral
$$\|f\|_{H^2}^2 = \frac{1}{2\pi} \int_0^{2\pi} \left| f\left(e^{i\theta}\right) \right|^2 d\theta.$$
The reproducing formula then becomes the Cauchy formula. Thus the two most interesting reproduction formulas agree in this case and take the form of Hilbert space reproduction formulas. When working in an interval $I = [-a, a]$ with $0 < a < 1$ and real-valued functions, we get

Proposition 5.1. The native Hilbert space $F_R := F_{\mathbb{R}}$ for the rational kernel $R(x,t) = (1 - xt)^{-1}$ consists of real-valued functions whose complex extensions lie in the Hardy space $H^2$.

Interpolation in point sets $X = \{x_0, \ldots, x_n\} \subset I = [-a, a]$ with $0 < a < 1$ using this kernel and the trial space (3) will be in terms of rational functions $p/q$ with polynomials $p$ and $q$ of degree up to $n$ and $n + 1$, respectively, the denominator polynomial $q$ being fixed up to a multiplicative constant by its zeros in the points $1/x_j$, $0 \le j \le n$. These classical rational interpolants are easy to calculate via polynomial interpolation of degree $n$. For interpolation in the corresponding native Hilbert space, we obtain the following convergence results from [28, Corollaries 1 and 2]:

Theorem 5.2.
1. For $0 < a < 1$ there are $c > 0$ and $h_0 > 0$ such that for all discrete sets $X \subset I = [-a, a]$ with fill distance $h \le h_0$ and all $f \in F_R$, the error between $f$ and its interpolant $s_{f,X}$ is bounded by
$$\|f - s_{f,X}\|_{L_\infty[-a,a]} \le e^{-c/h}\, \|f\|_{F_R}.$$
2. Suppose $j \in \mathbb{N}$ and $0 < a < 1$. Then there are $C > 0$ and $\tilde{h}_0 > 0$ such that for all discrete sets $X \subset I = [-a, a]$ with fill distance $h \le \tilde{h}_0$ and all $f \in F_R$, the error between the $j$-th derivatives of $f$ and of its interpolant $s_{f,X}$ is bounded by
$$\left\|f^{(j)} - s_{f,X}^{(j)}\right\|_{L_\infty[-a,a]} \le e^{-C/\sqrt{h}}\, \|f\|_{F_R}.$$
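To make the statement of Theorem 5.2 more concrete, the following Python sketch interpolates a sample function with translates of $K(x,t) = 1/(1-xt)$ on nested equidistant node sets and records the maximal error on a fixed grid. This is only an illustration under assumptions of this example: the test function $f(x) = 1/(4+x^2)$, the interval $[-1/2, 1/2]$, the evaluation grid, and the helper names are not taken from the paper, and the experiment does not determine the constants $c$ or $h_0$ of the theorem. Exact rational arithmetic is used so that ill-conditioning of the kernel matrix plays no role.

```python
from fractions import Fraction

def szego(x, t):
    """Kernel K(x, t) = 1/(1 - x t) for real arguments in (-1, 1)."""
    return 1 / (1 - x * t)

def solve(A, b):
    """Gauss-Jordan elimination; exact when all entries are Fractions."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [mr - factor * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def interpolant(nodes, values, kernel=szego):
    """Return s = sum_k c_k K(x_k, .) with s(x_j) = values[j]."""
    A = [[kernel(xj, xk) for xk in nodes] for xj in nodes]
    c = solve(A, values)
    return lambda x: sum(ck * kernel(xk, x) for ck, xk in zip(c, nodes))

def f(x):
    # Poles at +-2i lie outside the closed unit disk, so the Taylor
    # coefficients of f at zero are square-summable.
    return 1 / (4 + x * x)

a = Fraction(1, 2)
grid = [Fraction(k, 40) for k in range(-20, 21)]
max_errors = []
for n in (2, 4, 8):  # n + 1 equidistant nodes, fill distance h = a/n
    nodes = [-a + 2 * a * Fraction(k, n) for k in range(n + 1)]
    s = interpolant(nodes, [f(xj) for xj in nodes])
    max_errors.append(float(max(abs(f(x) - s(x)) for x in grid)))
```

Halving the fill distance twice should reduce the entries of `max_errors` by several orders of magnitude in this setting, which is the qualitative behavior that a bound of the form $e^{-c/h}$ describes.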
Note that for quasi-uniform data sites we get spectral or exponential convergence with respect to $n$.

5.2. Bergman space

Similarly we can proceed for the Bergman kernel
$$B(w, z) = \frac{1}{(1 - zw)^2}$$
on the unit disk. The native Hilbert space for this kernel is the Bergman space $B^2$, which consists of those holomorphic functions on the unit disk that are square-integrable with respect to the planar Lebesgue measure $m$ (see [3]). The norm can be realized as
$$\|f\|_{B^2}^2 = \frac{1}{\pi} \int_{D_1} |f(z)|^2\, dm(z).$$
The Bergman space contains the Hardy space. We now restrict everything to the real line and real-valued functions.
Proposition 5.3. The native Hilbert space $F_B := F_{\mathbb{R}}$ for the rational kernel $B(x,t) := (1 - xt)^{-2}$ on intervals $I := [-a, a]$ with $0 < a < 1$ consists of real-valued functions analytic in $I$ whose complex extensions lie in the Bergman space $B^2$.

Now the kernel-based interpolation problem on $n + 1$ points uses rational functions $p/q$ where $q$ has the points $1/x_j$ as double zeros and is of degree $2n + 2$, while $p$ is of degree at most $2n$. For interpolation in $F_B$ we obtain the following convergence results.

Theorem 5.4.
1. For $0 < a < 1$ there are constants $c, h_0 > 0$ such that for all discrete sets $X \subset [-a, a]$ with fill distance $h \le h_0$ and all $f \in F_B$,
$$\|f - s_{f,X}\|_{L_\infty[-a,a]} \le e^{-c/h}\, \|f\|_{F_B}.$$
2. Suppose $j \in \mathbb{N}$ and $0 < a < 1$. Then there are constants $C, \tilde{h}_0 > 0$ such that for all discrete sets $X \subset [-a, a]$ with fill distance $h \le \tilde{h}_0$ and all $f \in F_B$, the error between the $j$-th derivatives of $f$ and of its interpolant $s_{f,X}$ is bounded by
$$\left\|f^{(j)} - s_{f,X}^{(j)}\right\|_{L_\infty[-a,a]} \le e^{-C/\sqrt{h}}\, \|f\|_{F_B}.$$

Proof. 1. It suffices to check that (see [28, Theorem 3])
$$\tilde{C}^{(2k)} := \sup_{x,y \in [-a,a]} \left| D_y^{2k} \frac{1}{(1 - xy)^2} \right| \le c^k\, k!^2$$
holds for almost all $k \in \mathbb{N}$ with some constant $c$ independent of $k$, where $D_y^{\ell}$ denotes the $\ell$-th derivative with respect to the variable $y$. We compute
$$\tilde{C}^{(2k)} = \sup_{x,y \in [-a,a]} \left| \frac{(2k+1)!\, x^{2k}}{(1 - xy)^{2k+2}} \right| \le \frac{(2k+1)!\, a^{2k}}{(1 - a^2)^{2k+2}} \le \frac{4}{(1 - a^2)^2} \left( \frac{16 a^2}{(1 - a^2)^2} \right)^k k!^2,$$
where the last step uses $(2k+1)! \le 4 \cdot 16^k\, k!^2$.

2. It suffices to check that (see [28, Theorem 6])
$$C^{(2k)} := \max_{\ell + m = 2k}\ \sup_{x,y \in (-a,a)} \left| D_x^m D_y^{\ell} K(x,y) \right| \le e^{ck}\, k^{2k}$$
holds for some constant $c$ independent of $k$. By symmetry, we may assume $m \le \ell$. Explicit calculations yield
$$C^{(2k)} = \max_{\ell+m=2k}\ \sup_{x,y\in[-a,a]} \left| D_x^m D_y^{\ell} \sum_{n=0}^{\infty} (n+1)\, x^n y^n \right| = \max_{\ell+m=2k}\ \sup_{x,y\in[-a,a]} \left| \sum_{n=\ell}^{\infty} \frac{(n+1)!\, n!}{(n-\ell)!\,(n-m)!}\, x^{n-m} y^{n-\ell} \right| = a^{-2k} \max_{\ell+m=2k} \sum_{n=\ell}^{\infty} \frac{(n+1)!\, n!}{(n-\ell)!\,(n-m)!}\, a^{2n}.$$
We claim that the maximum is attained for $\ell = m = k$. Indeed, if $\ell \ge m + 2$,
$$\sum_{n=\ell}^{\infty} \frac{(n+1)!\, n!}{(n-\ell)!\,(n-m)!}\, a^{2n} \le \sum_{n=\ell}^{\infty} \frac{(n+1)!\, n!}{(n-\ell+1)!\,(n-m-1)!}\, a^{2n} \le \sum_{n=\ell-1}^{\infty} \frac{(n+1)!\, n!}{(n-(\ell-1))!\,(n-(m+1))!}\, a^{2n},$$
and we conclude that the maximum is attained for $m = \ell = k$. We proceed along the lines of [28, Proof of Lemma 3]. Using relations between the hypergeometric functions and the Jacobi polynomials we observe that
$$C^{(2k)} = a^{-2k} \sum_{n=k}^{\infty} \frac{(n+1)!\, n!}{((n-k)!)^2}\, a^{2n} = \sum_{n=0}^{\infty} \frac{(n+k+1)!\,(n+k)!}{n!^2}\, a^{2n} = (k+1)!\, k!\, \left(1 - a^2\right)^{-k-1} P_{k+1}^{(0,-1)}\!\left( \frac{a^2+1}{1-a^2} \right),$$
where
$$P_{k+1}^{(0,-1)}\!\left( \frac{1+a^2}{1-a^2} \right) := \frac{1}{(1-a^2)^{k+1}} \sum_{m=0}^{k} \binom{k+1}{m} \binom{k}{m}\, a^{2m}.$$
By the recurrence relation of the Jacobi polynomials $P_k := P_k^{(0,-1)}$ (see [13, (8.961(2))]),
$$\begin{pmatrix} P_{k+1}(x) \\ P_k(x) \end{pmatrix} = M_k \begin{pmatrix} P_k(x) \\ P_{k-1}(x) \end{pmatrix} \quad \text{with} \quad M_k := \begin{pmatrix} \frac{2k+1}{k+1}\, x - \frac{1}{(2k-1)(k+1)} & -\frac{(k-1)(2k+1)}{(k+1)(2k-1)} \\ 1 & 0 \end{pmatrix}.$$
Since for $k \to \infty$, the eigenvalues of $M_k$ tend to $x \pm \sqrt{-1 + x^2}$, we deduce that
$$\left| P_{k+1}^{(0,-1)}(x) \right| \le C^k,$$
where $C$ depends on $x = -\frac{a^2+1}{a^2-1}$ but not on $k$. Finally there is $c > 0$ that does not depend on $k$ with
$$C^{(2k)} = (k+1)!\, k!\, \left(1 - a^2\right)^{-k-1} P_{k+1}^{(0,-1)}\!\left( -\frac{a^2+1}{a^2-1} \right) \le c^k\, k!\, (k+1)! \le c^k\, k^{2k}.$$

5.3. Dirichlet space

There are also Taylor kernels of logarithmic type. A well-known one is the kernel $L(x,y) := -\frac{1}{xy} \log(1 - xy)$, which can be extended via
$$L(w, z) := -\frac{1}{zw} \log(1 - zw).$$
This kernel is the reproducing kernel for the Dirichlet space $\mathcal{D}$ (see [3]), which consists of those holomorphic functions $f$ on the unit disk $D_1$ whose derivative $f'$ is in the Bergman space.

Proposition 5.5. The native Hilbert space for the logarithmic kernel $L$ consists of real-valued functions analytic in $[-a, a]$ whose complex extensions lie in the Dirichlet space $\mathcal{D}$.

The interpolation on $X := \{x_0, \ldots, x_n\} \subset I = [-a, a]$ for $0 < a < 1$ is now carried out with linear combinations of functions
$$t \mapsto \begin{cases} -\frac{1}{t} \log(1 - t x_j) & \text{for } x_j \ne 0, \\ 1 & \text{for } x_j = 0, \end{cases}$$
which is quite an unusual setting. The approximation orders due to [28] are the same as for the rational case. Thus Theorem 5.2 can be reformulated exactly also for this kernel and functions from its native space.

5.4. The exponential case

In this section we consider a kernel of exponential type on $\mathbb{R}$, namely $E(x,t) := \exp(xt)$, which arises when the entire function $\kappa(z) = \exp(z)$ is restricted to the real line. The native Hilbert space
$$F_E := \left\{ f : \mathbb{R} \to \mathbb{R} \ \middle|\ f(z) = \sum_{n=0}^{\infty} a_n z^n \text{ with } a_n \in \mathbb{R},\ \sum_{n=0}^{\infty} n!\, a_n^2 < \infty \right\}$$
consists of real-analytic functions with entire complex extensions. If an interval $I := [-a, a]$ is fixed, and if point sets $X := \{x_0, \ldots, x_n\} \subset I$ are used for interpolation, we have an interpolation with classical exponential sums, the $x_j$ being fixed frequencies. We shall remove this coupling between frequencies and data points later. To characterize functions from $F_E$ more precisely, we recall some basic definitions (see [8, Chapters 1 and 2]).

Definition 5.6. For an entire function $f$ of a complex variable $z$, we denote by $M(r)$ the maximum modulus of $f(z)$ for $|z| = r < \infty$. We say that $f$ is of order $\rho$ if
$$\limsup_{r \to \infty} \frac{\log \log M(r)}{\log r} = \rho.$$
By convention, a constant function has order 0.

Theorem 5.7.
1. The native Hilbert space $F_E$ of the exponential kernel consists of real-valued analytic functions that have entire complex extensions. In particular, it contains all polynomials.
2. If an entire function $f(z) = \sum_{n=0}^{\infty} a_n z^n$ with real coefficients $a_n$ is of order $\rho < 2$, then its restriction to $\mathbb{R}$ lies in $F_E$.
3. For a function from $F_E$, the complex extension is of order less than or equal to 2.
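Items 2 and 3 of Theorem 5.7 leave open what happens at order exactly 2. The following Python sketch, which is an illustration added here and not an example from the paper, looks at the family $f_c(x) = \exp(c x^2)$, which has order 2 for every $c > 0$. Its coefficients are $a_{2m} = c^m/m!$, so that $n!\, a_n^2 = \binom{2m}{m} c^{2m}$ for $n = 2m$. By the generating function $\sum_m \binom{2m}{m} t^m = (1 - 4t)^{-1/2}$, the series defining the $F_E$ norm converges for $c < 1/2$, while for $c = 1/2$ its terms behave like $1/\sqrt{\pi m}$ and the partial sums keep growing. The function name and the truncation levels are assumptions of this sketch.

```python
import math

def fe_norm_sq_partial(c, M):
    """Partial sum of sum_n n! a_n^2 for f(x) = exp(c x^2), using n = 2m <= 2M.

    With a_{2m} = c^m / m!, each term equals binom(2m, m) * c^(2m).
    """
    return sum(math.comb(2 * m, m) * c ** (2 * m) for m in range(M + 1))

# c = 1/4: the series converges, so exp(x^2/4) has a finite F_E norm.
inside = [fe_norm_sq_partial(0.25, M) for M in (50, 100, 400)]
limit_inside = 1 / math.sqrt(1 - 4 * 0.25**2)

# c = 1/2: the partial sums grow roughly like 2*sqrt(M/pi).
borderline = [fe_norm_sq_partial(0.5, M) for M in (50, 100, 400)]
```

In this family the order alone does not decide membership in $F_E$: `inside` settles at `limit_inside`, whereas `borderline` roughly doubles when $M$ is quadrupled.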
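Proof. 1. Suppose $f(x) = \sum_{n=0}^{\infty} a_n x^n$ and $f \in F_E$. Then the natural complex extension $\tilde{f}(z) = \sum_{n=0}^{\infty} a_n z^n$ converges in the whole complex plane since
$$\limsup_{n\to\infty} \sqrt[n]{|a_n|} \le \left( \limsup_{n\to\infty} \sqrt[n]{|a_n|^2\, n!} \cdot \lim_{n\to\infty} \sqrt[n]{\frac{1}{n!}} \right)^{1/2} = 0.$$

2. If an entire function $f(z) = \sum_{n=0}^{\infty} a_n z^n$ is of order $\rho = 2 - 2\epsilon$ with $1 > \epsilon > 0$, then (see [8, Theorem 2.2.2])
$$\limsup_{n\to\infty} \frac{n \log n}{\log(1/|a_n|)} = 2 - 2\epsilon,$$
where the quotient is to be taken as 0 if $a_n = 0$. Since $a_n \to 0$ for an entire function, there is $n_0 \in \mathbb{N}$ such that $|a_n| \le 1$ for all $n \ge n_0$. Then there exists $N \in \mathbb{N}$ such that for all $n \ge N$
$$n \log n \le (2 - \epsilon) \log \frac{1}{|a_n|},$$
i.e., $|a_n| \le n^{-\frac{n}{2-\epsilon}}$. Hence, using $n! \le n^n$,
$$\sum_{n=0}^{\infty} n!\, a_n^2 \le C + \sum_{n=N}^{\infty} n!\, n^{-\frac{2n}{2-\epsilon}} \le C + \sum_{n=N}^{\infty} n^{-\frac{\epsilon n}{2-\epsilon}} < \infty.$$
Similarly, if $f$ is of order 0 then there is $N \in \mathbb{N}$ such that $n \log n \le \log(1/|a_n|)$ for all $n \ge N$, and the claim follows as above if we choose $\epsilon = 1$.

3. Suppose $f(x) = \sum_{n=0}^{\infty} a_n x^n$ and $f \in F_E$. Then $a_n^2 \le C/n!$ for some $C > 0$. By Stirling’s formula
$$|a_n| \le \frac{C}{\sqrt{n!}} \le \frac{C e^{n/2}}{n^{\frac{2n+1}{4}}},$$
and thus
$$\limsup_{n\to\infty} \frac{n \log n}{\log \frac{1}{|a_n|}} \le \lim_{n\to\infty} \frac{n \log n}{\frac{n}{2} \log n + \frac{1}{4} \log n - \log C - \frac{n}{2}} = 2.$$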
Following [28, Corollaries 1 and 2], there are the following approximation orders for classical interpolation in $F_E$.

Theorem 5.8.
1. For $a > 0$ there are $c > 0$ and $h_0 > 0$ such that for all discrete subsets $X$ of intervals $I = [-a, a]$ with fill distance $h \le h_0$ and all $f \in F_E$, the error between $f$ and its interpolant $s_{f,X}$ is bounded above by
$$\|f - s_{f,X}\|_{L_\infty[-a,a]} \le e^{c \log h / h}\, \|f\|_{F_E}.$$
2. Suppose $j \in \mathbb{N}$ and $a > 0$. Then there are $C > 0$ and $\tilde{h}_0 > 0$ such that for all discrete sets $X \subset I = [-a, a]$ with fill distance $h \le \tilde{h}_0$ and all $f \in F_E$, the error between the $j$-th derivatives of $f$ and of its interpolant $s_{f,X}$ is bounded above by
$$\left\|f^{(j)} - s_{f,X}^{(j)}\right\|_{L_\infty[-a,a]} \le e^{C \log h / \sqrt{h}}\, \|f\|_{F_E}.$$
By some simple calculational tricks, similar results on spectral convergence hold when frequencies are somewhat decoupled from data points. We assume that frequencies $\mu_j$ are chosen with fill distance $h$ in some interval $[\alpha, \beta]$, and we want to work with these frequencies, but evaluate the approximation error on $[-a, a]$. We then use the points
$$x_j := \varphi(\mu_j) := -a + 2a\, \frac{\mu_j - \alpha}{\beta - \alpha} \in [-a, a]$$
for interpolation in $[-a, a]$ using scaled trial functions $\exp(c t x_j)$ as functions of $t \in [-a, a]$. Since the above error bounds will also hold for scaled kernels, we can take $c = \frac{\beta - \alpha}{2a}$ to see that we have worked with functions
$$\exp(c t x_j) = \exp\left( ct \left( -a + 2a\, \frac{\mu_j - \alpha}{\beta - \alpha} \right) \right) = \exp\left( ct \left( -a - \frac{2a\alpha}{\beta - \alpha} \right) \right) \exp\left( ct\, \frac{2a \mu_j}{\beta - \alpha} \right) = \exp\left( -t\, \frac{\alpha + \beta}{2} \right) \exp(t \mu_j)$$
which now have the desired frequencies. The given function, however, has to be multiplied by $\exp\left( t\, \frac{\alpha + \beta}{2} \right)$ before computations start.

6. Approximation in Taylor spaces

A standard technique for the error analysis of non-interpolatory approximation processes is provided by sampling inequalities (see [19,27,17,22,5,21,6]). The latter quantify the observation that a function with small values on a sufficiently dense discrete set cannot have large values anywhere if its derivatives are bounded. A typical example for spaces $H(\Omega)$ of smooth functions $u : \Omega \to \mathbb{R}$ reads (see [21])
$$\|u\|_{L_\infty(\Omega)} \le e^{c \log(ch)/\sqrt{h}}\, \|u\|_{H(\Omega)} + c\, \|u|_X\|_{\ell_\infty(X)} \tag{13}$$
for all functions $u \in H(\Omega)$ and all sets $X \subset \Omega$ with sufficiently small fill distance $h_{X,\Omega} \le h_0$. If such inequalities are applied to residuals of stable reconstruction processes, they imply error estimates which can even cope with noise in the data (see [22] for a general overview). Well-studied examples include spline smoothing [27], support vector regression [20], and integral equations [16,15]. The right-hand side of (13) consists of a continuous term and a discrete term. The explicit forms of the constants depend on the function space $H(\Omega)$, and are characterized by the asymptotic behavior of the embedding constants $E(k)$ of $H(\Omega) \hookrightarrow W^{2,k}(\Omega)$ as $k \to \infty$, i.e.,
$$E(k) = \sup_{f \in H(\Omega) \setminus \{0\}} \frac{\|f\|_{W^{2,k}(\Omega)}}{\|f\|_{H(\Omega)}}, \tag{14}$$
where $W^{2,k}(\Omega)$ denotes the Sobolev space of square-integrable functions $f : \Omega \to \mathbb{R}$ whose weak derivatives up to order $k$ are also square-integrable [2].

For the reader’s convenience, we illustrate how sampling inequalities help to prove convergence rates for approximation (see e.g. [27,22] for details). Consider the quadratic
minimization problem
$$s_\mu^* = \arg\min \left\{ \|s\|_{H(\Omega)}^2 + \mu \sum_{j=0}^{n} \left( s(x_j) - f(x_j) \right)^2 : s \in H(\Omega) \right\}$$
which is well-known from Learning Theory, using a quadratic loss function with a positive weight $\mu$. Without going into details about the calculation of the minimizer $s_\mu^*$, we observe that $\|s_\mu^*\|_{H(\Omega)}^2 \le \|f\|_{H(\Omega)}^2$ holds for all $\mu$. Now an application of the sampling inequality (13) to $u := f - s_\mu^*$ yields the error bound
$$\|f - s_\mu^*\|_{L_\infty(\Omega)} \le 2 e^{c \log(ch)/\sqrt{h}}\, \|f\|_{H(\Omega)} + c\, \|(f - s_\mu^*)|_X\|_{\ell_\infty(X)}$$
for all $\mu$, but note that for $\mu \to \infty$ we can make the discrete term as small as we want, due to
$$\mu \sum_{j=0}^{n} \left( s_\mu^*(x_j) - f(x_j) \right)^2 \le \|f\|_{H(\Omega)}^2.$$
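The two inequalities used in this argument can be observed on a small example. The sketch below is an illustration under assumptions, not the authors' implementation: it takes $H(\Omega)$ to be the native space of $K(x,t) = 1/(1-xt)$, uses the standard representer theorem to write the minimizer as $s_\mu^* = \sum_k c_k K(x_k, \cdot)$ with $(A_X + \mu^{-1} I)\, c = f|_X$, and picks $f = K(\cdot, y)$ so that $\|f\|^2 = K(y,y)$ is known exactly. The names `regularized_fit`, `szego`, and `solve` are hypothetical helpers, and exact `Fraction` arithmetic is used throughout.

```python
from fractions import Fraction

def szego(x, t):
    """Kernel K(x, t) = 1/(1 - x t) for real arguments in (-1, 1)."""
    return 1 / (1 - x * t)

def solve(A, b):
    """Gauss-Jordan elimination; exact when all entries are Fractions."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [mr - factor * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def regularized_fit(nodes, values, mu, kernel=szego):
    """Minimize ||s||^2 + mu * sum_j (s(x_j) - f_j)^2 over the native space.

    Returns (||s||^2, sum of squared residuals at the nodes) for the minimizer
    s = sum_k c_k K(x_k, .), where (A + I/mu) c = values.
    """
    n = len(nodes)
    A = [[kernel(xj, xk) for xk in nodes] for xj in nodes]
    shifted = [[A[j][k] + (Fraction(1) / mu if j == k else 0) for k in range(n)]
               for j in range(n)]
    c = solve(shifted, values)
    s_at_nodes = [sum(A[j][k] * c[k] for k in range(n)) for j in range(n)]
    norm_sq = sum(c[j] * s_at_nodes[j] for j in range(n))  # c^T A c
    residual_sq = sum((sj - fj) ** 2 for sj, fj in zip(s_at_nodes, values))
    return norm_sq, residual_sq

nodes = [Fraction(k, 4) for k in range(-2, 3)]
y = Fraction(3, 5)
values = [szego(xj, y) for xj in nodes]  # samples of f = K(., y)
f_norm_sq = szego(y, y)                  # ||f||^2 = K(y, y)
results = {mu: regularized_fit(nodes, values, mu) for mu in (1, 100, 10000)}
```

For every weight in `results`, both the norm of the minimizer and $\mu$ times the squared residual stay below `f_norm_sq`, and the residual shrinks as $\mu$ grows, which is the mechanism that lets the discrete term in the error bound be made small.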
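In particular, choosing $\mu$ to satisfy
$$\mu^{-1} \approx e^{c \log(ch)/\sqrt{h}}$$
yields exponential convergence for this non-interpolatory scheme.

To derive sampling inequalities for Taylor spaces, we fix weights $\lambda_j$ and consider the associated Taylor space $F$ of analytic functions $f : [-a, a] \to \mathbb{R}$. To bound the embedding constants $E(k)$ above, observe that for $f \in F$ with $f(x) = \sum_{n=0}^{\infty} f_n x^n$
$$\begin{aligned}
\|f\|_{W^{2,k}([-a,a])}^2 &= \sum_{j=0}^{k} \int_{-a}^{a} \left| D^j f(x) \right|^2 dx = \sum_{j=0}^{k} \int_{-a}^{a} \left| \sum_{n=j}^{\infty} f_n \frac{n!}{(n-j)!}\, x^{n-j} \right|^2 dx \\
&\le \sum_{j=0}^{k} \int_{-a}^{a} \left( \sum_{\ell=j}^{\infty} f_\ell^2\, \frac{\ell!^2}{\lambda_\ell} \right) \left( \sum_{n=j}^{\infty} \frac{\lambda_n}{(n-j)!^2}\, x^{2n-2j} \right) dx \\
&= \sum_{j=0}^{k} \left( \sum_{\ell=j}^{\infty} f_\ell^2\, \frac{\ell!^2}{\lambda_\ell} \right) \sum_{n=j}^{\infty} \frac{\lambda_n}{(n-j)!^2}\, \frac{2 a^{2n-2j+1}}{2n-2j+1} \\
&\le 2a(k+1)\, \|f\|_F^2 \max_{0 \le j \le k} \sum_{n=j}^{\infty} \frac{\lambda_n}{(n-j)!^2}\, \frac{a^{2n-2j}}{2n-2j+1} \\
&\le 2(k+1)\, \|f\|_F^2 \max_{0 \le j \le k} \sum_{n=0}^{\infty} \frac{\lambda_{n+j}}{n!^2}\, a^{2n}.
\end{aligned} \tag{15}$$
We now consider the examples of the previous section separately.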
6.1. Hardy space

The weights $\lambda_j$ associated to the Szegő kernel $S(x,t) = (1 - xt)^{-1}$ are $\lambda_j = j!^2$. Here $a < 1$. Then by [28, Lemmas 4 and 5]
$$\max_{0 \le j \le k} \sum_{n=0}^{\infty} \frac{\lambda_{n+j}}{n!^2}\, a^{2n} = \sum_{n=0}^{\infty} \frac{\lambda_{n+k}}{n!^2}\, a^{2n} \le C^k\, k!^2.$$
Thus $E(k) \le C^k k^k$ for all $k \in \mathbb{N}$ (see (15)), and by [21, Theorems 3.5 and 4.5] there are the following sampling inequalities which generalize results of [28, Section 7] to derivatives.

Theorem 6.1. For $a < 1$ and $j \in \mathbb{N}$ there exist $h_0 > 0$ and $C > 0$ such that for all $X \subset [-a, a]$ with fill distance $h \le h_0$ and all $f \in F_S$
$$\|f\|_{L_\infty([-a,a])} \le e^{-C/h}\, \|f\|_{F_S} + e^{C/h}\, \|f|_X\|_{\ell_\infty(X)},$$
and
$$\left\|f^{(j)}\right\|_{L_\infty[-a,a]} \le e^{-C/\sqrt{h}}\, \|f\|_{F_S} + C h^{-j}\, \|f|_X\|_{\ell_\infty(X)}.$$

6.2. Bergman space

The weights $\lambda_j$ associated to the Bergman kernel $B(x,t) = (1 - xt)^{-2}$ are $\lambda_j = j!\,(j+1)!$. Here $a < 1$. Hence by (15) and the computations from the proof of Theorem 5.4
$$E(k)^2 \le C^k \sum_{n=0}^{\infty} \frac{(n+k)!\,(n+k+1)!}{n!^2}\, a^{2n} \le C^k k^{2k}.$$
Thus, we obtain the same sampling inequalities as for the Hardy space (see [21, Theorems 3.5 and 4.5]), and Theorem 6.1 can be reformulated exactly also for functions from $F_B$.

6.3. Dirichlet space

The weights $\lambda_j$ associated to the logarithmic kernel $L(t,x) = -\frac{1}{xt} \log(1 - xt)$ are $\lambda_j = \frac{j!^2}{j+1}$. Again the asymptotic behavior of $E(k)$ for $k \to \infty$ is the same as for the Szegő kernel, and Theorem 6.1 can be reformulated exactly also for functions from $F_D$.

6.4. Exponential case

The weights $\lambda_j$ associated to the exponential kernel $E(x,t) = \exp(xt)$ are $\lambda_j = j!$. By definition of the Laguerre polynomials $L_k$ (see [1])
$$\sum_{n=0}^{\infty} \frac{(n+k)!}{n!^2}\, a^{2n} = e^{a^2} k! \sum_{j=0}^{k} \binom{k}{j} \frac{a^{2j}}{j!} = e^{a^2} k!\, L_k\!\left(-a^2\right),$$
and thus by (15) and the computations of the proof of [28, Lemma 3]
$$E(k)^2 \le C^k \sum_{n=0}^{\infty} \frac{(n+k)!}{n!^2}\, a^{2n} = C^k e^{a^2} k!\, L_k\!\left(-a^2\right) \le C^k k^k.$$
Hence by [21, Theorems 3.5 and 4.5], we derive the following sampling inequalities.
Theorem 6.2. For $a > 0$ and $j \in \mathbb{N}$ there exist $h_0 > 0$ and $C > 0$ such that for all $X \subset [-a, a]$ with fill distance $h \le h_0$ and all $f \in F_E$
$$\|f\|_{L_\infty([-a,a])} \le e^{C \log(Ch)/\sqrt{h}}\, \|f\|_{F_E} + e^{C/h}\, \|f|_X\|_{\ell_\infty(X)},$$
and
$$\left\|f^{(j)}\right\|_{L_\infty[-a,a]} \le e^{-C/\sqrt{h}}\, \|f\|_{F_E} + C h^{-j}\, \|f|_X\|_{\ell_\infty(X)}.$$

Table 2. $L_\infty$ errors for different examples and kernels.

| $f$ | Szegő | exp | log | Szegő² |
|---|---|---|---|---|
| $\sin(x)$ | 7.31e−04 | 6.53e−08 | 2.59e−04 | 2.30e−03 |
| $1/(1 + x^2/25)$ | 8.73e−06 | 8.76e−10 | 2.97e−06 | 2.83e−05 |
| $1/(1 + x^2)$ | 1.49e−02 | 1.11e−03 | 9.67e−03 | 2.48e−02 |
| $1/(1 + 25x^2)$ | 3.89e+00 | 9.07e−01 | 2.97e+00 | 5.38e+00 |
| B-spline | 1.23e+00 | 2.88e−01 | 9.36e−01 | 1.70e+00 |
7. Numerical examples

All the subsequent figures use four different interpolants:

• the solid line for the Szegő kernel,
• the dotted line for the exponential kernel,
• the - - line for the log kernel, and
• the - . line for the squared Szegő kernel.
The figures differ in the function f providing the data. In all cases, the arising kernel matrices are severely ill-conditioned, but we did not apply any preconditioning techniques. The cases with expansions into purely even or purely odd terms are ignored and will be similar, provided that the symmetries are taken into account when setting up the problems. To allow comparisons between kernels, we fixed 9 equidistant data locations on [−0.9, +0.9] in all cases. Since the scalings of the figures might be hard to read, we present the $L_\infty$ errors in Table 2.

By the classical results of Bernstein on approximation of analytic functions by polynomials [11, Theorem 8.1, p. 229], the location of the singularities of f outside the interval is expected to have a strong influence on the error. For the entire function sin(x), one can expect the best possible behavior, and the results are in Fig. 1. We then have three examples of the Runge type, with the singularities moving closer from ±5i via ±i to ±0.2i, see Figs. 2–4. The results get dramatically worse, and for the B-spline f(x) = 1 − |x| they are disastrous, but this function is too far away from any of the native Hilbert spaces of analytic functions. In all cases, the exponential kernel performed best, followed by the log kernel, the Szegő kernel, and the squared Szegő kernel.

Our results carry over to integration by just taking the integrals of the interpolants. We get the same convergence rates as given in the above theorems, and these rates are better than the geometric rates for Monte Carlo techniques given in [12]. For multivariate quadrature, one should apply results on interpolation from [28].
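The experimental setup described in this section (9 equidistant nodes on [−0.9, 0.9], four kernels, no preconditioning) can be sketched as follows. This is not the code behind Table 2. It is a minimal illustration under several assumptions: the "squared Szegő" kernel is read as $(1 - xt)^{-2}$, the error is measured on a uniform grid over [−0.9, 0.9], and all function names are hypothetical. Because the kernel matrices are severely ill-conditioned, the computed errors need not match Table 2 digit for digit.

```python
import numpy as np


# Kernels as functions of z = x * t.
def szego(z):
    return 1.0 / (1.0 - z)


def szego_squared(z):
    # Assumption: "squared Szegő kernel" means (1 - x t)^(-2).
    return 1.0 / (1.0 - z) ** 2


def exponential(z):
    return np.exp(z)


def logarithmic(z):
    # -log(1 - z) / z, with the limit value 1 + z/2 used near z = 0.
    z = np.asarray(z, dtype=float)
    small = np.abs(z) < 1e-8
    safe = np.where(small, 0.5, z)  # placeholder where z is tiny; result unused there
    return np.where(small, 1.0 + 0.5 * z, -np.log1p(-safe) / safe)


KERNELS = {
    "Szego": szego,
    "exp": exponential,
    "log": logarithmic,
    "Szego^2": szego_squared,
}


def interpolant(kernel, nodes, values):
    """Return s(x) = sum_j c_j K(x, x_j) with s(x_i) = values[i]."""
    gram = kernel(np.outer(nodes, nodes))
    coeffs = np.linalg.solve(gram, values)  # plain solve, no preconditioning
    return lambda x: kernel(np.outer(np.atleast_1d(x), nodes)) @ coeffs


def max_errors(f, n_nodes=9, radius=0.9, n_eval=2001):
    """Maximum error on a uniform grid for each kernel (grid is an assumption)."""
    nodes = np.linspace(-radius, radius, n_nodes)
    grid = np.linspace(-radius, radius, n_eval)
    errors = {}
    for name, kernel in KERNELS.items():
        s = interpolant(kernel, nodes, f(nodes))
        errors[name] = float(np.max(np.abs(s(grid) - f(grid))))
    return errors


errors_sin = max_errors(np.sin)
errors_runge = max_errors(lambda x: 1.0 / (1.0 + 25.0 * x**2))
```

A direct solve is used here only to mirror the statement that no preconditioning was applied; for larger node sets a more stable basis or regularization would likely be needed.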
Fig. 1. Errors for $f(x) = \sin(x)$.
Fig. 2. Errors for $f(x) = 1/(1 + x^2/25)$.
Acknowledgments

Special thanks are due to Christian Rieger for many fruitful discussions. The first author warmly thanks the Center for Nonlinear Analysis (NSF Grant No. DMS-0635983), where part of this work was carried out. Her research was funded by a postdoctoral fellowship of the National Science Foundation under Grant No. DMS-0905778.
Fig. 3. Errors for $f(x) = 1/(1 + x^2)$.
Fig. 4. Errors for $f(x) = 1/(1 + 25x^2)$.
References

[1] M. Abramowitz, I.A. Stegun, Handbook of Mathematical Functions, Dover, New York, 1970.
[2] R. Adams, Sobolev Spaces, Academic Press, New York, 1975.
[3] J. Agler, J.E. McCarthy, Pick Interpolation and Hilbert Function Spaces, in: Graduate Studies in Mathematics, vol. 44, American Mathematical Society, Providence, Rhode Island, 2002.
[4] D. Alpay, The Schur Algorithm, Reproducing Kernel Spaces and System Theory, in: SMF/AMS Texts and Monographs, vol. 5, American Mathematical Society, USA, 2001.
[5] Rémi Arcangéli, Maria López de Silanes, Juan Torrens, An extension of a bound for functions in Sobolev spaces, with applications to (m, s)-spline interpolation and smoothing, Numer. Math. 107 (2007) 181–211.
[6] Rémi Arcangéli, Maria López de Silanes, Juan Torrens, Extension of sampling inequalities to Sobolev semi-norms of fractional order and derivative data, Numer. Math. 121 (2012) 587–608.
[7] N. Aronszajn, Theory of reproducing kernels, Trans. Amer. Math. Soc. 68 (1950) 337–404.
[8] J.R.P. Boas, Entire Functions, in: A Series of Monographs and Textbooks, Academic Press Inc., New York, NY, 1954.
[9] A.P. Calderón, A. Zygmund, Local properties of solutions of elliptic partial differential equations, Studia Math. 20 (1961) 171–225.
[10] Stefano De Marchi, R. Schaback, H. Wendland, Near-optimal data-independent point locations for radial basis function interpolation, Adv. Comput. Math. 23 (3) (2005) 317–330.
[11] R. DeVore, G.G. Lorentz, Constructive Approximation, in: Grundlehren der Mathematischen Wissenschaften, vol. 303, Springer, 1993.
[12] J. Dick, A Taylor space for multivariate integration, Monte Carlo Methods Appl. 12 (2006) 99–112.
[13] I.S. Gradshteyn, I.M. Ryzhik, Table of Integrals, Series, and Products, Dover, Elsevier, Burlington, 2007.
[14] Brian C. Hall, Holomorphic methods in analysis and mathematical physics, in: First Summer School in Analysis and Mathematical Physics, in: Contemporary Mathematics, American Mathematical Society, 1998, pp. 1–59.
[15] J. Krebs, Support vector regression for the solution of linear integral equations, Inverse Problems 27 (6) (2011) 065007.
[16] J. Krebs, A.K. Louis, H. Wendland, Sobolev error estimates and a priori parameter selection for semi-discrete Tikhonov regularization, J. Inverse Ill-Posed Probl. 17 (2009) 845–869.
[17] W.R. Madych, An estimate for multivariate interpolation, II, J. Approx. Theory 142 (2) (2006) 116–128.
[18] H. Meschkowski, Hilbertsche Räume mit Kernfunktion, Springer, Berlin, 1962.
[19] F.J. Narcowich, J.D. Ward, H. Wendland, Sobolev bounds on functions with scattered zeros, with applications to radial basis function surface fitting, Math. Comp. 74 (2005) 743–763.
[20] C. Rieger, B. Zwicknagl, Deterministic error analysis of kernel-based regression and related kernel based algorithms, J. Mach. Learn. Res. 10 (2009) 2115–2132.
[21] C. Rieger, B. Zwicknagl, Sampling inequalities for infinitely smooth functions, with applications to interpolation and machine learning, Adv. Comput. Math. 32 (2010) 103–129.
[22] C. Rieger, B. Zwicknagl, R. Schaback, Sampling and stability, in: M. Dæhlen, M.S. Floater, T. Lyche, J.-L. Merrien, K. Mørken, L.L. Schumaker (Eds.), Mathematical Methods for Curves and Surfaces, in: Lecture Notes in Computer Science, vol. 5862, 2010, pp. 347–369.
[23] S. Saitoh, Integral Transforms, Reproducing Kernels and their Applications, in: Pitman Research Notes in Mathematics, Longman, Great Britain, 1997.
[24] R. Schaback, A unified theory of radial basis functions. Native Hilbert spaces for radial basis functions, II, J. Comput. Appl. Math. 121 (1–2) (2000) 165–177. Numerical analysis in the 20th century, Vol. I, Approximation theory.
[25] R. Schaback, H. Wendland, Kernel techniques: from machine learning to meshless methods, Acta Numer. 15 (2006) 543–639.
[26] H. Wendland, Scattered Data Approximation, Cambridge University Press, 2005.
[27] H. Wendland, C. Rieger, Approximate interpolation with applications to selecting smoothing parameters, Numer. Math. 101 (2005) 643–662.
[28] B. Zwicknagl, Power series kernels, Constr. Approx. 29 (1) (2009) 61–84.