Spherical basis functions and uniform distribution of points on spheres


Journal of Approximation Theory 151 (2008) 186 – 207 www.elsevier.com/locate/jat

Xingping Sun (a,∗), Zhenzhong Chen (b)

(a) Institute of Mathematical Sciences, Henan Normal University, Xinxiang, Henan Province, China
(b) Xinxiang College, Xinxiang, Henan Province, China

Received 13 February 2007; received in revised form 19 September 2007; accepted 22 September 2007. Communicated by Joseph Ward. Available online 31 December 2007.

Abstract

The main purpose of the present paper is to employ spherical basis functions (SBFs) to study uniform distribution of points on spheres. We extend Weyl’s criterion for uniform distribution of points on spheres to include a characterization in terms of an SBF. We show that every set of minimal energy points associated with an SBF is uniformly distributed on the spheres. We give an error estimate for numerical integration based on the minimal energy points. We also estimate the separation of the minimal energy points.

© 2007 Elsevier Inc. All rights reserved.

MSC: 41A55; 11K38; 11K41; 46E22

Keywords: Uniform distribution of points; Positive definite functions; Weyl’s criterion; Minimal energies; Reproducing kernel Hilbert spaces; Discrepancy

1. Introduction

Radial basis function (RBF) methods have recently found applications in many areas of computational mathematics. One of the true advantages of RBF methods is their flexibility and robustness in dealing with scattered data; see [32,33,43]. They have been implemented successfully to handle problems that occur in higher dimensional spaces in which the data are scattered and sparse; see [38,39,52]. In these situations, other methods, such as finite element methods and multivariate spline methods, become inefficient and encounter formidable computational complexities.

∗ Corresponding author. Permanent address: Department of Mathematics, Missouri State University, Springfield, MO 65897, USA. E-mail address: [email protected] (X. Sun).
0021-9045/$ - see front matter © 2007 Elsevier Inc. All rights reserved. doi:10.1016/j.jat.2007.09.009

X. Sun, Z. Chen / Journal of Approximation Theory 151 (2008) 186 – 207


Lately, RBF methods have also been proven highly effective on manifolds without boundaries; see [12,31,34–36]. The unit sphere $S^d$ in $\mathbb{R}^{d+1}$ is a typical example of such manifolds. The counterpart of RBF on spheres is the spherical basis function (SBF). A basic idea of RBF (SBF) methods is to use some intricately designed linear combinations of the “shifts” of a prescribed RBF (SBF) to approximate the underlying unknown function. The order of approximation is often gauged in terms of the smallest separation between the data sites, much like the role played by the Nyquist frequency in approximation by trigonometric polynomials.

The main purpose of this paper is to explore the application of SBF methods to a different area: the uniform distribution of points on spheres. Uniform distribution of points has been extensively studied in many branches of mathematics, including harmonic analysis, number theory [27], and statistics [23]. Two pillars of the theory of uniform distribution are Weyl’s criterion and the Erdős–Turán discrepancy theorem, in which approximation by trigonometric polynomials plays a central role. Very naturally, this is an area in which some SBF methods can flourish. In fact, a part of our goal in the present paper is to characterize uniform distribution of points on spheres in terms of SBFs. The main result in this aspect is Theorem 3.4.

Besides the natural connection to SBF methods mentioned above, the present study is closely linked to that of minimal discrete Riesz $s$-energies ($0 \le s < \infty$) on manifolds; see [16,15] and the references therein. Although Riesz kernels are not ordinary SBFs, they share the common feature that we call “(conditionally) positive definiteness”. Readers who are familiar with this concept can naturally attribute many results to this kinship as they read through the present paper. A highlight in this aspect is that minimal energy points associated with Riesz $s$-kernels (for each $s \ge 0$) and those associated with every SBF kernel are uniformly distributed on the spheres. We remind readers that the distributions of minimal Riesz energy points on some other manifolds $M$ (for example, the interval $[-1, 1]$ and a properly imbedded torus) vary with $s$ for $s$ in the range $0 \le s < \dim M$, $\dim M$ being the Hausdorff dimension of $M$. We refer readers to [15,42] for an explanation of this interesting phenomenon. However, when $s \ge \dim M$ (where $M$ is a rectifiable manifold), a remarkable result of Hardin and Saff [16] asserts that minimal Riesz $s$-energy points are uniformly distributed on $M$ (the case $s = d$ requires that $M$ be a subset of a $C^1$ manifold).

We have included in this paper a somewhat larger than usual amount of background information. We have also at times given elaborate details in presenting the arguments. Our intention is to make the paper accessible to a wider audience, including the research communities of RBF and minimal energies, and statisticians who are interested in the RBF theory of uniform distribution.

The present paper is arranged as follows. In Section 2, we give a brief summary of the evolution of SBF theory, the central theme being conditionally positive definiteness. In Section 3, we prove two criteria for uniform distributions of points on spheres in terms of an SBF. The criteria are equivalent to Weyl’s criterion. In Section 4, we prove that minimal energy points associated with an SBF are uniformly distributed on the spheres. In Section 5, we use Reproducing Kernel Hilbert Space (RKHS) theory to obtain a numerical integration error estimate based on minimal energy points. In Section 6, we estimate the separation of the minimal energy points.

2. Conditionally positive definite functions

Spherical harmonics. Let $L^2(S^d)$ be the real Hilbert space equipped with the inner product

$$\langle f, g\rangle := \omega_d \int_{S^d} f(x)\,g(x)\,d\sigma(x),$$

where $d\sigma$ is the rotational invariant probability measure on $S^d$, and $\omega_d$ is the volume of $S^d$. We will use $Y_{\ell,m}$ to denote the usual orthonormal basis of spherical harmonics [28]. For each fixed $\ell$, the set $\{Y_{\ell,m} : m = 1, \dots, q_\ell\}$ spans the eigenspace of the Laplace–Beltrami operator on $S^d$ corresponding to the eigenvalue $\lambda_\ell = \ell(\ell + d - 1)$. Here $q_\ell$ is the dimension of the eigenspace corresponding to $\lambda_\ell$ and is given by [28, p. 4]

$$q_\ell = \begin{cases} 1, & \ell = 0,\\[4pt] \dfrac{(2\ell + d - 1)\,\Gamma(\ell + d - 1)}{\Gamma(\ell + 1)\,\Gamma(d)}, & \ell \ge 1. \end{cases} \tag{2.1}$$

For large $\ell$, $q_\ell = O(\ell^{d-1})$. If $f \in L^2(S^d)$, then we may expand it in a series of spherical harmonics,

$$f = \sum_{\ell=0}^{\infty} \sum_{m=1}^{q_\ell} \hat f_{\ell,m}\, Y_{\ell,m}, \qquad \text{where } \hat f_{\ell,m} := \langle f, Y_{\ell,m}\rangle.$$

Let $x, y \in S^d$, and let $x \cdot y$ denote the usual dot product in $\mathbb{R}^{d+1}$. Then spherical harmonics satisfy the addition formula [28]:

$$\sum_{m=1}^{q_\ell} Y_{\ell,m}(x)\,Y_{\ell,m}(y) = \frac{q_\ell}{\omega_d}\, P_\ell^{(\lambda)}(x \cdot y). \tag{2.2}$$

Here $\lambda = \frac{d-1}{2}$, and $P_\ell^{(\lambda)}(t)$ denotes the degree $\ell$ Legendre polynomial of order $\lambda$. The addition formula and the well-known inequality

$$|P_\ell^{(\lambda)}(t)| \le P_\ell^{(\lambda)}(1) = 1$$

yield the following inequality:

$$\left|\sum_{m=1}^{q_\ell} Y_{\ell,m}(x)\,Y_{\ell,m}(y)\right| \le \frac{q_\ell}{\omega_d} = \sum_{m=1}^{q_\ell} Y_{\ell,m}^2(x). \tag{2.3}$$
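As a quick sanity check on Eq. (2.1), the dimension $q_\ell$ can be evaluated in exact integer arithmetic. The sketch below is only an illustration: the function name `harmonic_dim` is hypothetical, and the Gamma functions are replaced by factorials, which is valid because all arguments are positive integers.

```python
from math import factorial


def harmonic_dim(ell: int, d: int) -> int:
    """Dimension q_ell of the degree-ell spherical harmonics on S^d, per Eq. (2.1).

    Uses Gamma(n) = (n - 1)! for positive integers n, so the result is exact.
    """
    if ell == 0:
        return 1
    numerator = (2 * ell + d - 1) * factorial(ell + d - 2)
    denominator = factorial(ell) * factorial(d - 1)
    return numerator // denominator


# Familiar special cases: 2 on the circle, 2*ell + 1 on S^2, (ell + 1)^2 on S^3.
for ell in range(0, 5):
    print(ell, harmonic_dim(ell, 1), harmonic_dim(ell, 2), harmonic_dim(ell, 3))
```

For fixed $d$ the values grow like $\ell^{d-1}$, in line with the remark following Eq. (2.1).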

Conditionally positive definite functions. Let $k$ be a nonnegative integer, and let $\alpha$ be a multi-index. A continuous function $\phi : [-1, 1] \to \mathbb{R}$ is called conditionally positive definite of order $k$ on $S^d$ if for every possible set of $n$ distinct points $x_1, \dots, x_n$ on $S^d$, and every nonzero vector $(c_1, \dots, c_n) \in \mathbb{R}^n$ satisfying

$$\sum_{i=1}^{n} c_i\, x_i^{\alpha} = 0, \qquad |\alpha| < k, \tag{2.4}$$

we have

$$\sum_{i=1}^{n} \sum_{j=1}^{n} c_i c_j\, \phi(\cos(d(x_i, x_j))) \ge 0. \tag{2.5}$$

In this paper, we will use the acronym c.p.d. to stand for the phrase “conditionally positive definite”. We will say that $\phi$ is strictly conditionally positive definite (s.c.p.d.) of order $k$ on $S^d$ if the strict inequality in (2.5) holds true for every nonzero vector $(c_1, \dots, c_n) \in \mathbb{R}^n$ satisfying Eq. (2.4). Here, $d(x, y)$ is the geodesic distance between $x$ and $y$ on $S^d$, and is precisely the angle between the vectors $x$ and $y$ in $\mathbb{R}^{d+1}$. Since $|x| = |y| = 1$, we have $x \cdot y = \cos(d(x, y))$. If $k = 0$, then the condition in Eq. (2.4) is vacuous. The set of all c.p.d. functions of order 0 on $S^d$ coincides with the set of all positive definite functions on $S^d$ defined by Schoenberg [47]. In the same paper, Schoenberg also characterized all the positive definite functions on spheres. He showed that a continuous function $\phi$ is positive definite on $S^d$ if and only if its expansion in Legendre polynomials,

$$\phi(x \cdot y) = \sum_{\ell=0}^{\infty} a_\ell\, \frac{q_\ell}{\omega_d}\, P_\ell^{(\lambda)}(x \cdot y), \tag{2.6}$$

has all coefficients $a_\ell \ge 0$ and $\sum_\ell a_\ell q_\ell < \infty$. Schoenberg’s argument can be modified slightly to show that a function $\phi$ is c.p.d. of order $k$ on $S^d$ if and only if its expansion in Legendre polynomials in Eq. (2.6) has coefficients $a_\ell \ge 0$ for all $\ell \ge k$ and $\sum_\ell a_\ell q_\ell < \infty$. Characterizations of s.c.p.d. functions of order $k$ on $S^d$ have been recently accomplished. These results were proved by Chen et al. [8] for the cases $d \ge 2$, and by Pinkus [37] for the case $d = 1$. [1] A sufficient (but not necessary) condition for s.c.p.d. is that the coefficients $a_\ell$ are positive for all $\ell \ge k$. Functions satisfying this condition are precisely the ones we will mostly be working with in the present paper. For easy reference, we give a formal definition here.

Definition 2.1. Let $k \ge 0$. A continuous function $\phi : [-1, 1] \to \mathbb{R}$ is called a spherical basis function (SBF) of order $k$ on $S^d$ if its expansion in Legendre polynomials in Eq. (2.6) has coefficients $a_\ell > 0$ for all $\ell \ge k$ and $\sum_\ell a_\ell q_\ell < \infty$.

An SBF of order 0 will simply be called an SBF. These functions provide powerful tools in meshless approximations on spherical domains; see [31,34]. SBFs were introduced by Xu and Cheney [53], and Narcowich [29] showed that they are strictly positive definite in a stronger sense. Most of the useful SBFs of order $k$ on $S^d$ are “restrictions” to $S^d$ of their “counterparts” in $\mathbb{R}^{d+1}$. Before we elaborate on the restriction procedure, let us first recall that a continuous function $\psi : [0, \infty) \to \mathbb{R}$ is called conditionally positive definite of order $k$ in $\mathbb{R}^{d+1}$ (regarded as the metric space associated with the Euclidean distance) if for every possible $n$ distinct points $x_1, \dots, x_n \in \mathbb{R}^{d+1}$, and every nonzero vector $(c_1, \dots, c_n) \in \mathbb{R}^n$ satisfying

$$\sum_{j=1}^{n} c_j\, x_j^{\alpha} = 0, \qquad |\alpha| < k,$$

we have

$$\sum_{i=1}^{n} \sum_{j=1}^{n} c_i c_j\, \psi(|x_i - x_j|^2) \ge 0. \tag{2.7}$$

Here $|x - y|$ denotes the Euclidean distance between $x$ and $y$. If the strict inequality holds true, then the function $\psi$ is called strictly conditionally positive definite of order $k$ in $\mathbb{R}^{d+1}$. Again, the set of all c.p.d. functions of order 0 in $\mathbb{R}^{d+1}$ coincides with the set of all positive definite functions in $\mathbb{R}^{d+1}$ as defined by Schoenberg [45]. [2]

[1] The authors of [8,37] only considered the case $k = 0$. However, their methods can be modified in an obvious way to handle the general case.

For $x, y \in S^d$, we have, by the law of cosines, $|x - y|^2 = 2 - 2x \cdot y$. Now let $\psi$ be a c.p.d. function in $\mathbb{R}^{d+1}$. Then there is a unique function $\phi : [-1, 1] \to \mathbb{R}$ such that

$$\phi(x \cdot y) := \psi(2 - 2x \cdot y) = \psi(|x - y|^2), \qquad x, y \in S^d. \tag{2.8}$$
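The restriction in Eq. (2.8) can be illustrated numerically. The sketch below rests on an assumed example that is not taken from the paper: the Gaussian $\psi(s) = e^{-s}$, which is positive definite in every $\mathbb{R}^{d+1}$. It checks the identity $|x - y|^2 = 2 - 2x \cdot y$ on a pair of random points of $S^2$ and evaluates the quadratic form of (2.5) for the restricted kernel $\phi(t) = \psi(2 - 2t)$. All helper names are hypothetical.

```python
import math
import random


def random_unit_vector(rng, dim):
    """Draw a point on the unit sphere in R^dim by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def psi(s):
    # Assumed example kernel: psi(|x - y|^2) = exp(-|x - y|^2), a Gaussian.
    return math.exp(-s)


def phi(t):
    # Restriction to the sphere as in Eq. (2.8): phi(x . y) = psi(2 - 2 x . y).
    return psi(2.0 - 2.0 * t)


rng = random.Random(0)
points = [random_unit_vector(rng, 3) for _ in range(20)]  # 20 points on S^2
coeffs = [rng.gauss(0.0, 1.0) for _ in points]            # arbitrary real weights

# Law of cosines on the unit sphere: |x - y|^2 = 2 - 2 x . y.
x, y = points[0], points[1]
dist_sq = sum((a - b) ** 2 for a, b in zip(x, y))
law_of_cosines_gap = abs(dist_sq - (2.0 - 2.0 * dot(x, y)))

# Quadratic form from (2.5) with order k = 0 (no side condition on the weights).
quadratic_form = sum(
    ci * cj * phi(dot(xi, xj))
    for ci, xi in zip(coeffs, points)
    for cj, xj in zip(coeffs, points)
)
print(law_of_cosines_gap, quadratic_form)
```

A nonnegative quadratic form for one random weight vector is of course only a spot check, not a proof of positive definiteness.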

Naturally, $\phi$ is c.p.d. on $S^d$. It is also obvious that if $\psi$ is s.c.p.d. in $\mathbb{R}^{d+1}$ then $\phi$ is an s.c.p.d. function on $S^d$. Throughout this paper, we will assume that $\phi$ and $\psi$ have this relationship characterized by the “restriction”. We will mostly be working with $\phi$. However, in Section 6, we find it convenient to work with $\psi$. There is a well-established relation between the Fourier transform of the function (defined in $\mathbb{R}^{d+1}$) $x \mapsto \psi(|x|^2)$ and the Legendre–Fourier coefficients of the kernel (defined on $S^d$) $(x, y) \mapsto \phi(x \cdot y)$; see [7,30,34]. Schoenberg characterized all the positive definite functions in $\mathbb{R}^{d+1}$ as Bessel transforms of a certain kind. Let $\Omega(x)$ be the function defined by

$$\Omega(x) := \omega_d \int_{S^d} e^{i x \cdot y}\, d\sigma(y), \qquad x \in \mathbb{R}^{d+1}.$$

It is easy to see that $\Omega(x)$ is radial. It is well known [45] that

$$\Omega(t) = \Gamma\!\left(\frac{d+1}{2}\right) \left(\frac{2}{t}\right)^{\frac{d-1}{2}} J_{\frac{d-1}{2}}(t),$$

where $J_{\frac{d-1}{2}}(t)$ denotes the Bessel function of the first kind.

Theorem 2.2 (Schoenberg [45,46]). Let $\psi$ be a continuous function on $[0, \infty)$. Then $\psi$ is positive definite in $\mathbb{R}^{d+1}$ if and only if there is a bounded, positive Borel measure $\mu$ on $[0, \infty)$ such that

$$\psi(\sqrt{t}) = \int_0^{\infty} \Omega(tu)\, d\mu(u), \qquad t \in [0, \infty). \tag{2.9}$$

Conditionally positive definite functions of order $k \ge 1$ in $\mathbb{R}^{d+1}$ also enjoy an integral representation (albeit in a more complicated form); see [14]. However, for the special case $k = 1$, the representation reduces to a simpler form that agrees with the one given by Schoenberg [45] [3]; see also [51].

Theorem 2.3 (Schoenberg [45]). Let $\psi$ be a continuous function on $[0, \infty)$. Then $\psi$ is c.p.d. of order 1 in $\mathbb{R}^{d+1}$ if and only if $\psi$ has the following integral representation:

$$\psi(\sqrt{t}) - \psi(0) = \int_0^{\infty} \frac{\Omega(tu) - 1}{u^2}\, d\mu(u), \qquad t \in [0, \infty), \tag{2.10}$$

where $\mu$ is a positive Borel measure on $[0, \infty)$ satisfying

$$\int_1^{\infty} \frac{d\mu(u)}{u^2} < \infty.$$

[2] Similar to what Schoenberg had done in [45,47], one can unify c.p.d. functions on $S^d$ and those in $\mathbb{R}^{d+1}$ as “c.p.d. functions on metric spaces”. On $S^d$, one uses the geodesic distance, while in $\mathbb{R}^{d+1}$, one uses the Euclidean distance.

[3] Schoenberg did not use the terminology “conditionally positive definite functions”. Rather, he characterized these functions in terms of metric transform and imbedding.

If the measure $\mu$ in Eq. (2.9) (Eq. (2.10)) is not concentrated on a subset that has no limit point, then the function $\psi$ given rise to by the measure $\mu$ is s.p.d. (s.c.p.d. of order 1) in $\mathbb{R}^{d+1}$, and its restriction to $S^d$ is an SBF (of order 1) on $S^d$; see [30].

3. Uniform distributions of points on spheres

For a fixed $x \in S^d$, and $0 < r < 2$, let $C(x, r) := \{y : |y - x| \le r\}$. We will call $C(x, r)$ a spherical cap centered at $x$ and having radius $r$, or just a spherical cap when the center and radius are not important in the context.

Definition 3.1. For each $N \ge 2$, let $\{x_{N,1}, \dots, x_{N,N}\}$ be a set of $N$ points on $S^d$. We say that the set $\{x_{N,1}, \dots, x_{N,N}\}$ is uniformly distributed on $S^d$ if for each spherical cap $C(x, r)$ (whose volume is denoted by $\mathrm{Vol}(C(x, r))$), we have

$$\lim_{N \to \infty} \frac{\#\{x_{N,j} : x_{N,j} \in C(x, r)\}}{N} = \frac{\mathrm{Vol}(C(x, r))}{\omega_d}.$$

We will also say that the points $x_{N,1}, \dots, x_{N,N}$ are uniformly distributed on $S^d$. When it is unnecessary to emphasize the dependence of the points $x_{N,1}, \dots, x_{N,N}$ on $N$, we will suppress the double indices, and simply write the points as $x_1, \dots, x_N$.

The spherical version of Weyl’s criterion. The following theorem is known as Weyl’s criterion; see Kuipers and Niederreiter [23].

Theorem 3.2 (Weyl’s criterion). Let $x_1, \dots, x_N$ be $N$ points on $S^d$. Then the following three statements are equivalent:

1. The points $x_1, \dots, x_N$ are uniformly distributed on $S^d$.
2. For each fixed integer $\ell \ge 1$, and each fixed $m$, $1 \le m \le q_\ell$, we have
$$\lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} Y_{\ell,m}(x_j) = 0. \tag{3.1}$$
3. For every continuous function $f$ on $S^d$, we have
$$\lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} f(x_j) = \int_{S^d} f(x)\, d\sigma(x).$$

The main purpose of this paper is to use SBFs of order $k$ to study uniform distributions of points on $S^d$. For the convenience of writing proofs, we use the addition formula (2.2) for spherical harmonics to write $\phi$, an SBF of order $k$, in the following form:

$$\phi(x \cdot y) = \sum_{\ell=0}^{\infty} a_\ell \sum_{m=1}^{q_\ell} Y_{\ell,m}(x)\, Y_{\ell,m}(y), \tag{3.2}$$

with $a_\ell > 0$ for all $\ell \ge k$, and $\sum_\ell a_\ell q_\ell < \infty$. The coefficients $a_\ell$ are determined by

$$a_\ell = \omega_d^2 \int_{S^d} \int_{S^d} \phi(x \cdot y)\, Y_{\ell,m}(x)\, Y_{\ell,m}(y)\, d\sigma(x)\, d\sigma(y).$$

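Using the Funk–Hecke formula (see [28]), we can express $a_\ell$ as follows:

$$a_\ell = \omega_{d-1} \int_{-1}^{1} \phi(t)\, P_\ell^{(\lambda)}(t)\, (1 - t^2)^{\lambda - \frac{1}{2}}\, dt,$$

where $\lambda = (d - 1)/2$. Due to the importance and frequent appearance of the coefficient $a_0$ throughout the paper, we will systematically denote $a_0$ by $A_\phi$. To give a clear illustration and introduction of what our main theorem in this section is about, we begin with the following simple result, which we shall call an “averaging Weyl’s criterion”.

Proposition 3.3. Let $a_\ell$ be a positive sequence such that

$$\sum_{\ell=1}^{\infty} a_\ell q_\ell < \infty.$$

Then the $N$ points $x_1, \dots, x_N$ are uniformly distributed on $S^d$ if and only if

$$\lim_{N \to \infty} \sum_{\ell=1}^{\infty} a_\ell \sum_{m=1}^{q_\ell} \left(\frac{1}{N} \sum_{j=1}^{N} Y_{\ell,m}(x_j)\right)^2 = 0. \tag{3.3}$$

Proof. To prove the necessity, assume that the points $x_1, \dots, x_N$ are uniformly distributed. Then by Weyl’s criterion, Eq. (3.1) holds true for each fixed $\ell \ge 1$ and fixed $m$, $1 \le m \le q_\ell$. Applying Inequality (2.3), we have

$$\sum_{\ell=1}^{\infty} a_\ell \sum_{m=1}^{q_\ell} \left(\frac{1}{N} \sum_{j=1}^{N} Y_{\ell,m}(x_j)\right)^2 \le \sum_{\ell=1}^{\infty} a_\ell\, \frac{q_\ell}{\omega_d} < \infty.$$

This allows us to apply the Lebesgue Dominated Convergence Theorem to get

$$\lim_{N \to \infty} \sum_{\ell=1}^{\infty} a_\ell \sum_{m=1}^{q_\ell} \left(\frac{1}{N} \sum_{j=1}^{N} Y_{\ell,m}(x_j)\right)^2 = \sum_{\ell=1}^{\infty} a_\ell \sum_{m=1}^{q_\ell} \lim_{N \to \infty} \left(\frac{1}{N} \sum_{j=1}^{N} Y_{\ell,m}(x_j)\right)^2 = 0.$$

To prove the sufficiency, assume that Eq. (3.3) holds true. Since every term of the series in (3.3) is nonnegative and $a_\ell > 0$, we have, for each fixed integer $\ell \ge 1$, and each fixed $m$, $1 \le m \le q_\ell$, that

$$\limsup_{N \to \infty} \left(\frac{1}{N} \sum_{j=1}^{N} Y_{\ell,m}(x_j)\right)^2 \le \frac{1}{a_\ell} \lim_{N \to \infty} \left[\sum_{\ell'=1}^{\infty} a_{\ell'} \sum_{m'=1}^{q_{\ell'}} \left(\frac{1}{N} \sum_{j=1}^{N} Y_{\ell',m'}(x_j)\right)^2\right] = 0. \tag{3.4}$$

By Weyl’s criterion, the $N$ points $x_1, \dots, x_N$ are uniformly distributed on $S^d$. □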



The idea behind the averaging Weyl’s criterion can be exploited further to characterize uniform distribution of points on $S^d$ in terms of an SBF of order $k$.


Theorem 3.4. Let $k \ge 0$ and let $\phi$ be an SBF of order $k$ on $S^d$. Let $x_1, \dots, x_N$ be $N$ points on $S^d$. Then the following three statements are equivalent:

1. The points $x_1, \dots, x_N$ are uniformly distributed on $S^d$.
2. Eq. (3.1) holds true for each spherical harmonic $Y_{\ell,m}$ with $1 \le \ell < k$ and $m = 1, \dots, q_\ell$, and the limit
$$\lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} \phi(x_j \cdot y) = A_\phi \tag{3.5}$$
holds true uniformly in $y \in S^d$.
3. Eq. (3.1) holds true for each spherical harmonic $Y_{\ell,m}$ with $1 \le \ell < k$ and $m = 1, \dots, q_\ell$, and the following limit holds true:
$$\lim_{N \to \infty} \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \phi(x_i \cdot x_j) = A_\phi. \tag{3.6}$$

The equivalence between Parts 1 and 2 of the above theorem (for the case $k = 0$) has been proved in [11] on homogeneous manifolds (of which $S^d$ is a special case). We give a full proof of the theorem here for completeness.

Proof. We will complete the circle (1) ⇒ (2) ⇒ (3) ⇒ (1). To show (1) ⇒ (2), we start with Eq. (3.2) and write down

$$\frac{1}{N} \sum_{j=1}^{N} \phi(x_j \cdot y) = \sum_{\ell=0}^{\infty} a_\ell \sum_{m=1}^{q_\ell} Y_{\ell,m}(y) \left(\frac{1}{N} \sum_{j=1}^{N} Y_{\ell,m}(x_j)\right). \tag{3.7}$$

Assume that the points $\{x_1, \dots, x_N\}$ are uniformly distributed. Then by Weyl’s criterion, Eq. (3.1) holds true for each fixed $\ell \ge 1$ and $m$, $1 \le m \le q_\ell$. We intend to take the limit with respect to $N$ on both sides of Eq. (3.7) to establish Eq. (3.5). But first we need to bound the series on the right-hand side of Eq. (3.7). For each fixed pair $(\ell, m)$, let $\xi_{\ell,m} \in S^d$ be such that for all $x \in S^d$, $|Y_{\ell,m}(x)| \le |Y_{\ell,m}(\xi_{\ell,m})|$. We have

$$\left|\sum_{\ell=0}^{\infty} \sum_{m=1}^{q_\ell} a_\ell\, Y_{\ell,m}(y) \left(\frac{1}{N} \sum_{j=1}^{N} Y_{\ell,m}(x_j)\right)\right| \le \sum_{\ell=0}^{\infty} a_\ell \sum_{m=1}^{q_\ell} |Y_{\ell,m}(\xi_{\ell,m})| \left(\frac{1}{N} \sum_{j=1}^{N} |Y_{\ell,m}(x_j)|\right) \le \sum_{\ell=0}^{\infty} a_\ell \sum_{m=1}^{q_\ell} |Y_{\ell,m}(\xi_{\ell,m})|^2 = \sum_{\ell=0}^{\infty} a_\ell\, \frac{q_\ell}{\omega_d} < \infty. \tag{3.8}$$

Here we have used Inequality (2.3). This enables us to apply the Lebesgue Dominated Convergence Theorem to the right-hand side of Eq. (3.7) (combining with Eq. (3.1)) to get that

$$\lim_{N \to \infty} \sum_{\ell=1}^{\infty} a_\ell \sum_{m=1}^{q_\ell} Y_{\ell,m}(y) \left(\frac{1}{N} \sum_{j=1}^{N} Y_{\ell,m}(x_j)\right) = \sum_{\ell=1}^{\infty} a_\ell \sum_{m=1}^{q_\ell} Y_{\ell,m}(y) \lim_{N \to \infty} \left(\frac{1}{N} \sum_{j=1}^{N} Y_{\ell,m}(x_j)\right) = 0.$$


Eq. (3.5) then follows as a consequence. The version of the Lebesgue Dominated Convergence Theorem we applied above can be found in Rudin [41, p. 26]. The measure space is $\{1, 2, \dots\}$ with the counting measure.

To show (2) ⇒ (3), we note that

$$\frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \phi(x_i \cdot x_j)$$

is the arithmetic mean of the $N$ sequences $\frac{1}{N} \sum_{j=1}^{N} \phi(x_i \cdot x_j)$, $i = 1, \dots, N$, each of which approaches $A_\phi$ as $N \to \infty$, uniformly in $i$ because the limit in Eq. (3.5) is uniform in $y$. Thus, Eq. (3.6) follows directly from Eq. (3.5).

To show (3) ⇒ (1), we shall prove that Eq. (3.1) holds true for all the spherical harmonics $Y_{\ell,m}$ with $\ell \ge 1$. By the assumptions in Part 3, it is already true for each spherical harmonic $Y_{\ell,m}$ with $1 \le \ell < k$ and $m$, $1 \le m \le q_\ell$, which implies that

$$\lim_{N \to \infty} \sum_{\ell=1}^{k-1} a_\ell \sum_{m=1}^{q_\ell} \left(\frac{1}{N} \sum_{j=1}^{N} Y_{\ell,m}(x_j)\right)^2 = 0. \tag{3.9}$$

Making use of Eq. (3.7), we write down

$$\frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \phi(x_i \cdot x_j) = \sum_{\ell=0}^{\infty} a_\ell \sum_{m=1}^{q_\ell} \left(\frac{1}{N} \sum_{j=1}^{N} Y_{\ell,m}(x_j)\right)^2. \tag{3.10}$$

Combining Eq. (3.6) with Eqs. (3.9) and (3.10), and noting that the $\ell = 0$ term in Eq. (3.10) does not depend on the points, we have

$$\lim_{N \to \infty} \sum_{\ell=\max\{k,1\}}^{\infty} a_\ell \sum_{m=1}^{q_\ell} \left(\frac{1}{N} \sum_{j=1}^{N} Y_{\ell,m}(x_j)\right)^2 = 0.$$

Since the $a_\ell$ ($\ell \ge k$) are all positive, the desired result follows from the same argument used in the proof of Proposition 3.3. □

The most appealing result of the above theorem comes from the cases $k = 0, 1$, in which the set of integers $\ell$ satisfying $1 \le \ell < k$ is empty. Therefore, the uniform distribution of the points $x_1, \dots, x_N$ on $S^d$ is simply equivalent to either Eq. (3.5) or Eq. (3.6). In the rest of this paper, we will concentrate on these two cases.

4. Extremal energies and uniform distributions

Let $\phi$ be an SBF of order $k$, $k = 0, 1$, on $S^d$. For each natural number $N$, let $X_N := \{x_j\}_{j=1}^{N}$ denote a set of $N$ points on $S^d$. We define the (normalized) $N$ point discrete $\phi$-energy, $E_\phi(X_N)$, by

$$E_\phi(X_N) := \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \phi(x_i \cdot x_j). \tag{4.1}$$
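The discrete energy in Eq. (4.1) is straightforward to evaluate directly. The following sketch makes two assumptions that are not part of the paper: the kernel is the Legendre generating function $\phi(t) = (1 - 2rt + r^2)^{-1/2} = \sum_\ell r^\ell P_\ell(t)$ on $S^2$ with $r = 1/2$, whose Legendre coefficients are all positive and whose average over the sphere is 1, and the test points are a Fibonacci spiral, used here only as a convenient well-spread configuration rather than as minimal energy points.

```python
import math


def fibonacci_points(n):
    """A Fibonacci spiral on S^2 (assumed example of well-spread points)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for j in range(n):
        z = 1.0 - (2.0 * j + 1.0) / n
        radius = math.sqrt(1.0 - z * z)
        theta = golden_angle * j
        pts.append((radius * math.cos(theta), radius * math.sin(theta), z))
    return pts


def phi(t, r=0.5):
    # Assumed example kernel: sum_{l >= 0} r^l P_l(t) = (1 - 2 r t + r^2)^(-1/2).
    return 1.0 / math.sqrt(1.0 - 2.0 * r * t + r * r)


def discrete_energy(pts, kernel):
    """Normalized N point discrete energy, Eq. (4.1)."""
    n = len(pts)
    total = 0.0
    for x in pts:
        for y in pts:
            t = x[0] * y[0] + x[1] * y[1] + x[2] * y[2]
            total += kernel(max(-1.0, min(1.0, t)))  # clamp against rounding
    return total / (n * n)


energy_spread = discrete_energy(fibonacci_points(200), phi)
energy_clustered = discrete_energy([(0.0, 0.0, 1.0)] * 200, phi)  # all points equal
print(energy_spread, energy_clustered)
```

For this kernel the sphere average of $\phi$ is 1, so the well-spread configuration should give an energy just above 1, while the fully clustered one gives $\phi(1) = 2$.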


The $N$ point discrete $\phi$-energy is also realized as a function:

$$E_\phi(X_N) : \underbrace{S^d \times \cdots \times S^d}_{N} \to \mathbb{R}.$$

Inspecting Eq. (3.10) and noting that all the coefficients $a_\ell$ ($\ell \ge 1$) are positive, we see immediately that the following inequality

$$E_\phi(X_N) \ge A_\phi := \int_{S^d} \phi(x \cdot y)\, d\sigma(x) \tag{4.2}$$

holds true for any set $X_N$ of $N$ points. Also, the following result is evident.

Proposition 4.1. Let $\phi$ be an SBF of order $k$, $k = 0, 1$, on $S^d$. Let $X_N^{(1)}$ and $X_N^{(2)}$ each be a set of $N$ points. Assume that $E_\phi(X_N^{(1)}) \ge E_\phi(X_N^{(2)})$, and that $X_N^{(1)}$ is uniformly distributed on $S^d$. Then $X_N^{(2)}$ is also uniformly distributed on $S^d$.

Proof. Since the points in $X_N^{(1)}$ are uniformly distributed on $S^d$, we have by Theorem 3.4 that

$$\lim_{N \to \infty} E_\phi(X_N^{(1)}) = A_\phi.$$

By the assumption, we have

$$E_\phi(X_N^{(1)}) \ge E_\phi(X_N^{(2)}) \ge A_\phi.$$

The desired result (also) follows from Theorem 3.4. □



The result of Proposition 4.1 leads us to anticipate that uniform distribution can be achieved by minimizing the $N$ point discrete $\phi$-energy $E_\phi(X_N)$ over $\underbrace{S^d \times \cdots \times S^d}_{N}$. We will show that this is indeed true.

Definition 4.2. Let $N \ge 2$. Let a collection $D := \{D_i\}_{i=1}^{N}$ of $N$ closed subsets of $S^d$ be given with

$$\|D\| := \max_{1 \le i \le N} \operatorname{diameter}(D_i).$$

It is called an equal-volume partition of $S^d$ with small diameters if

1. $\bigcup_{i=1}^{N} D_i = S^d$.
2. $\operatorname{int}(D_i) \cap \operatorname{int}(D_j) = \emptyset$, $i \ne j$, $1 \le i, j \le N$.
3. $\mathrm{Vol}(D_i) = \omega_d / N$, $i = 1, \dots, N$.
4. $\lim_{N \to \infty} \|D\| = 0$.

By modifying the algorithm described in Proposition 2.1 in [31], we can have a simple way of generating equal-volume partitions of $S^d$ with small diameters. However, much effort has been given to the construction of the stronger kind of equal-volume partitions of $S^d$ in which there is a constant $C$ independent of $N$ such that $\|D\| \le C / N^{1/d}$. Although in the present paper we do not resort to the stronger kind of partitions, we inform the readers of the fact that many authors have either assumed the existence of such partitions or used them pointedly to prove important results. These authors include Beck and Chen [3], Bourgain and Lindenstrauss [4], Kuijlaars and Saff [22], and Stolarsky [48]. Rakhmanov et al. [40] constructed equal-volume partitions of $S^2$ with diameter of each part $\le 7/\sqrt{N}$. Furthermore, the partitions given in [40] have zonal structure. Leopardi [24] has recently generalized their result to $S^d$ ($d \ge 3$). See also Feige and Schechtman [13].

Theorem 4.3. Let $\phi$ be an SBF of order $k$, $k = 0, 1$, on $S^d$, and let $X_N^* = \{x_1^*, \dots, x_N^*\}$ be a set of $N$ points on $S^d$ that minimizes the $N$ point discrete $\phi$-energy, i.e.,

$$E_\phi(X_N^*) = \min_{X_N} E_\phi(X_N),$$

where the minimum is taken over the set of all possible $X_N$. Then the points $x_1^*, \dots, x_N^*$ are uniformly distributed on $S^d$.

Proof. Let $D := \{D_1, \dots, D_N\}$ be an equal-volume partition of $S^d$ with small diameters, and let $y_1, \dots, y_N$ be, respectively, taken from the interiors of $D_1, \dots, D_N$. We write

$$\frac{1}{N} \sum_{j=1}^{N} \phi(x \cdot y_j) = \sum_{\ell=0}^{\infty} a_\ell \sum_{m=1}^{q_\ell} Y_{\ell,m}(x) \left(\frac{1}{N} \sum_{j=1}^{N} Y_{\ell,m}(y_j)\right). \tag{4.3}$$

Note that for each pair of $\ell, m$, the summation

$$\frac{1}{N} \sum_{j=1}^{N} Y_{\ell,m}(y_j)$$

is a Riemann sum for the integral

$$\int_{S^d} Y_{\ell,m}(y)\, d\sigma(y).$$

Therefore, for each $\ell > 0$, and $1 \le m \le q_\ell$, we have

$$\lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} Y_{\ell,m}(y_j) = 0.$$

As in the proof of Theorem 3.4 ((1) ⇒ (2)), we apply the Lebesgue Dominated Convergence Theorem in Eq. (4.3) to get the following uniform convergence in $x$:

$$\lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} \phi(x \cdot y_j) = A_\phi, \qquad x \in S^d.$$

Consequently, Theorem 3.4 gives us

$$\lim_{N \to \infty} \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \phi(y_i \cdot y_j) = A_\phi.$$


Since $X_N^*$ minimizes the $N$ point discrete $\phi$-energy, we have

$$E_\phi(X_N^*) \le \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \phi(y_i \cdot y_j).$$

It follows from Proposition 4.1 that $x_1^*, \dots, x_N^*$ are uniformly distributed on $S^d$. □

We will refer to the points $x_1^*, \dots, x_N^*$ as minimal energy points associated with the function $\phi$. If their association with the function $\phi$ is not to be emphasized, then we will simply call them minimal energy points. Based on the above proof, we have carried out some preliminary numerical experiments with relatively small values of $N$ (ranging from 20 to 100). We first make an equal-volume partition of $S^2$, and select an interior point $x_j$ from each part $D_j$. We start with $(x_1, \dots, x_N)$, and do a number of iterations using the “steepest descent method”. The results seem encouraging: each iteration appears to produce “better uniformly distributed” points on $S^2$. We are currently expanding the scope of the numerical experiments to include relatively large values of $N$.

5. Numerical integration errors based on minimal energy points

Let $\phi$ be an SBF of order $k$, $k = 0, 1$, on $S^d$, and let $x_1, \dots, x_N$ be a collection of $N$ points that minimizes the $N$-point discrete $\phi$-energy. We know from the results of previous sections that the measure

$$\frac{1}{N} \sum_{j=1}^{N} \delta_{x_j}$$

converges (as $N \to \infty$) to the rotational invariant probability measure on $S^d$ in the weak star sense. In this section, we will quantify this convergence in an RKHS setting. Estimating the quantity

$$\left|\frac{1}{N} \sum_{j=1}^{N} f(x_j) - \int_{S^d} f(x)\, d\sigma(x)\right|$$

for a certain class of functions is often called a “discrepancy estimate” in the literature. Here we will carry out the proof for SBFs of order 0. A slight modification of RKHS is needed to deal with the case $k = 1$; see [25]. Reproducing kernels are also called “Mercer kernels”. The terminology “native space” is more prevalent in the community of approximation theorists; see [52]. Among many excellent sources, the 1950 paper by Aronszajn [2] is still considered by many to be a standard reference for RKHS. Besides many applications to different areas of mathematics, RKHS theory plays an important role in RBF methods and in the mathematical theory of learning; see Cucker and Smale [9]. The formation of the native space we use here is quite similar to that developed in [9]. For readers who are particularly interested in the spherical version of the formation, we refer to [26].

Let $\phi$ be an SBF of order 0 on $S^d$. We consider the linear space consisting of all the finite linear combinations of zonal shifts of $\phi$. Denote this linear space by $PH_\phi$, and define the following bilinear form on $PH_\phi$:

$$\langle f, g\rangle_{PH_\phi} := \sum_{\xi \in \Xi} \sum_{\eta \in \Upsilon} c_\xi\, d_\eta\, \phi(\xi \cdot \eta),$$

where $f(x) = \sum_{\xi \in \Xi} c_\xi\, \phi(\xi \cdot x)$ and $g(x) = \sum_{\eta \in \Upsilon} d_\eta\, \phi(\eta \cdot x)$, both $\Xi$ and $\Upsilon$ are finite subsets of $S^d$, and $c_\xi$, $d_\eta$ are real numbers. It is easy to verify that this is an inner product on $PH_\phi$, thanks to the strict positive definiteness of $\phi$. The native space $\mathcal{N}_\phi$, associated with $\phi$, is a linear subspace of $L^2(S^d)$ defined by

$$\mathcal{N}_\phi := \left\{ f = \sum_{\ell=0}^{\infty} \sum_{m=1}^{q_\ell} \hat f_{\ell,m}\, Y_{\ell,m} \;:\; \sum_{\ell=0}^{\infty} a_\ell^{-1} \sum_{m=1}^{q_\ell} \hat f_{\ell,m}^{\,2} < \infty \right\}.$$

The native space $\mathcal{N}_\phi$ is a Hilbert space with the inner product:

$$\langle f, g\rangle_{\mathcal{N}_\phi} = \sum_{\ell=0}^{\infty} a_\ell^{-1} \sum_{m=1}^{q_\ell} \hat f_{\ell,m}\, \hat g_{\ell,m}.$$

The following facts about the native space $\mathcal{N}_\phi$ and the inner product space $PH_\phi$ are well known:

1. The native space $\mathcal{N}_\phi$ is an RKHS, and the reproducing kernel is $\phi(x \cdot y)$.
2. The native space $\mathcal{N}_\phi$ is the completion of $PH_\phi$.

For each fixed $x \in S^d$, the point evaluation functional $\delta_x$ defined on $\mathcal{N}_\phi$ by

$$f \mapsto f(x), \qquad f \in \mathcal{N}_\phi,$$

is continuous. This functional is “represented” by the function $y \mapsto \phi(x \cdot y)$. That is, for each fixed $x \in S^d$, we have

$$\langle \phi(x \cdot y), f(y)\rangle_{\mathcal{N}_\phi} = f(x), \qquad f \in \mathcal{N}_\phi.$$

We will call the function $y \mapsto \phi(x \cdot y)$ the representor of $\delta_x$.

Proposition 5.1. The “integration” functional $I$ defined by

$$f \mapsto \int_{S^d} f(x)\, d\sigma(x), \qquad f \in \mathcal{N}_\phi,$$

is continuous on $\mathcal{N}_\phi$. The representor of $I$ is the constant function:

$$x \mapsto A_\phi, \qquad x \in S^d.$$


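Proof. For any given $f \in \mathcal{N}_\phi$, we write

$$f = \sum_{\ell=0}^{\infty} \sum_{m=1}^{q_\ell} \hat f_{\ell,m}\, Y_{\ell,m}, \tag{5.1}$$

and apply the Cauchy–Schwarz inequality to get

$$\begin{aligned}
\left|\int_{S^d} f(x)\, d\sigma(x)\right| \le \max_{x \in S^d} |f(x)|
&\le \max_{y \in S^d} \sum_{\ell=0}^{\infty} \left(\sum_{m=1}^{q_\ell} \hat f_{\ell,m}^{\,2}\right)^{\frac12} \left(\sum_{m=1}^{q_\ell} Y_{\ell,m}^2(y)\right)^{\frac12} \\
&= \sum_{\ell=0}^{\infty} \left(\frac{q_\ell}{\omega_d}\right)^{\frac12} \left(\sum_{m=1}^{q_\ell} \hat f_{\ell,m}^{\,2}\right)^{\frac12} \\
&= \sum_{\ell=0}^{\infty} \left(a_\ell\, \frac{q_\ell}{\omega_d}\right)^{\frac12} \left[a_\ell^{-\frac12} \left(\sum_{m=1}^{q_\ell} \hat f_{\ell,m}^{\,2}\right)^{\frac12}\right] \\
&\le \left(\sum_{\ell=0}^{\infty} a_\ell\, \frac{q_\ell}{\omega_d}\right)^{\frac12} \left(\sum_{\ell=0}^{\infty} a_\ell^{-1} \sum_{m=1}^{q_\ell} \hat f_{\ell,m}^{\,2}\right)^{\frac12}
= \left(\sum_{\ell=0}^{\infty} a_\ell\, \frac{q_\ell}{\omega_d}\right)^{\frac12} \|f\|_{\mathcal{N}_\phi}.
\end{aligned}$$

This shows that $I$ is continuous, and that the series in Eq. (5.1) converges uniformly on $S^d$ to $f$. We can then integrate term by term to get

$$\int_{S^d} f(x)\, d\sigma(x) = \hat f_0 = \langle f, A_\phi\rangle_{\mathcal{N}_\phi}.$$

Here we have used the fact that

$$\int_{S^d} Y_{\ell,m}(x)\, d\sigma(x) = 0, \qquad \ell \ge 1, \quad m = 1, 2, \dots, q_\ell. \qquad \square$$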



Lemma 5.2. Let $\phi$ be an SBF of order 0 on $S^d$, and let $x_1, \dots, x_N$ be a collection of $N$ points that minimizes the $N$-point discrete $\phi$-energy. Then the following inequality holds true:

$$\left\|\frac{1}{N} \sum_{j=1}^{N} \delta_{x_j} - I\right\|_{\mathcal{N}_\phi} \le \left(\frac{\phi(1)}{N}\right)^{\frac12}. \tag{5.2}$$

Proof. This is a direct application of an estimate by Hardin et al. [17]. In fact, using different notations, they showed ([17, Section 6, Eq. (28)]) that

$$A_\phi \le \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \phi(x_i \cdot x_j) \le A_\phi + \frac{\phi(1)}{N}.$$

Note that

$$\left\|\frac{1}{N} \sum_{j=1}^{N} \delta_{x_j}\right\|_{\mathcal{N}_\phi}^2 = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \phi(x_i \cdot x_j).$$


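Therefore, by Proposition 5.1, we have

$$\begin{aligned}
\left\|\frac{1}{N} \sum_{j=1}^{N} \delta_{x_j} - I\right\|_{\mathcal{N}_\phi}^2
&= \left\langle \frac{1}{N} \sum_{j=1}^{N} \delta_{x_j} - I,\; \frac{1}{N} \sum_{j=1}^{N} \delta_{x_j} - I \right\rangle_{\mathcal{N}_\phi} \\
&= \left\|\frac{1}{N} \sum_{j=1}^{N} \delta_{x_j}\right\|_{\mathcal{N}_\phi}^2 - 2 \left\langle \frac{1}{N} \sum_{j=1}^{N} \delta_{x_j},\, I \right\rangle_{\mathcal{N}_\phi} + \|I\|_{\mathcal{N}_\phi}^2 \\
&= \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \phi(x_i \cdot x_j) - A_\phi \le \frac{\phi(1)}{N}.
\end{aligned}$$

The desired inequality then follows. □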



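Theorem 5.3. Let $\phi$ be an SBF of order 0 on $S^d$. Let $x_1, \dots, x_N$ be a set of $N$ points on $S^d$ that minimizes the $N$ point discrete $\phi$-energy. Then for each $f \in \mathcal{N}_\phi$, we have

$$\left|\frac{1}{N} \sum_{j=1}^{N} f(x_j) - \int_{S^d} f(x)\, d\sigma(x)\right| \le \left(\frac{\phi(1)}{N}\right)^{\frac12} \|f\|_{\mathcal{N}_\phi}.$$

Proof. By Lemma 5.2 and the Cauchy–Schwarz inequality, we have

$$\left|\frac{1}{N} \sum_{j=1}^{N} f(x_j) - \int_{S^d} f(x)\, d\sigma(x)\right| = \left|\left\langle \frac{1}{N} \sum_{j=1}^{N} \delta_{x_j} - I,\; f \right\rangle_{\mathcal{N}_\phi}\right| \le \|f\|_{\mathcal{N}_\phi} \left\|\frac{1}{N} \sum_{j=1}^{N} \delta_{x_j} - I\right\|_{\mathcal{N}_\phi} \le \left(\frac{\phi(1)}{N}\right)^{\frac12} \|f\|_{\mathcal{N}_\phi}. \qquad \square$$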
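The Cauchy–Schwarz step in this proof does not use minimality: for any point set, the norm identity from the proof of Lemma 5.2 gives a worst-case error of $\sqrt{E_\phi(X_N) - A_\phi}$ times the native space norm of $f$. The sketch below checks this numerically under assumptions of our own: the example kernel $\phi(t) = (1 - 2rt + r^2)^{-1/2}$ on $S^2$ with $r = 1/2$ (for which $A_\phi = 1$ when $A_\phi$ is read as the sphere average in Eq. (4.2)), a Fibonacci spiral as the point set, and the test function $f(x) = \phi(x \cdot z)$, whose squared norm is $\phi(1)$ by the reproducing property.

```python
import math


def fibonacci_points(n):
    """A Fibonacci spiral on S^2 (assumed example point set, not minimal energy points)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for j in range(n):
        z = 1.0 - (2.0 * j + 1.0) / n
        radius = math.sqrt(1.0 - z * z)
        pts.append((radius * math.cos(golden_angle * j),
                    radius * math.sin(golden_angle * j), z))
    return pts


def dot(x, y):
    t = x[0] * y[0] + x[1] * y[1] + x[2] * y[2]
    return max(-1.0, min(1.0, t))  # clamp against rounding


def phi(t, r=0.5):
    # Assumed example kernel with sphere average A_phi = 1 and phi(1) = 2.
    return 1.0 / math.sqrt(1.0 - 2.0 * r * t + r * r)


points = fibonacci_points(200)
n = len(points)
A_phi = 1.0

# Discrete energy, Eq. (4.1).
energy = sum(phi(dot(x, y)) for x in points for y in points) / (n * n)

# Test function f(x) = phi(x . z); its integral is A_phi and its norm is sqrt(phi(1)).
norm_z = math.sqrt(14.0)
z = (1.0 / norm_z, 2.0 / norm_z, 3.0 / norm_z)
quadrature = sum(phi(dot(x, z)) for x in points) / n
error = abs(quadrature - A_phi)

# Worst-case bound: ||f|| * sqrt(E_phi(X_N) - A_phi).
bound = math.sqrt(phi(1.0)) * math.sqrt(max(energy - A_phi, 0.0))
print(error, bound)
```

For minimal energy points, the estimate of Hardin et al. quoted in Lemma 5.2 turns this bound into the $N^{-1/2}$ rate of Theorem 5.3; the Fibonacci spiral is used here only because it is easy to generate.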

We advise readers to compare Theorem 5.3 with the error estimates (both upper bounds and lower bounds) recently established by Brauchart [5], Brauchart and Hesse [6], Damelin and Grabner [10], Hesse [18], and Hesse and Sloan [19–21]. The result of Theorem 5.3 can be considered as a complement to those results. Its novelty is that the order $N^{-1/2}$ of the error estimate is independent of both the choice of the SBF and the dimension $d$. We are currently exploring the stochastic version of this phenomenon and considering its ramifications in statistical learning theory.

6. Separation of minimal energy points

Configurations of minimal energy points are of great interest to mathematicians and scientists in many diverse disciplines. From the results of the previous two sections, we know that every set of minimal energy points is uniformly distributed on $S^d$ and enjoys good discrepancy estimates. With this in mind, it is perhaps surprising that up to this point we do not even know whether or not the minimal energy points are all distinct. It is also important to know how well the minimal energy points are separated. To address these issues, we generalize a method of Stolarsky [49]. The proofs in this section do not use positive definiteness. We remind the readers that every function $\varphi : [-1, 1] \to \mathbb{R}$ can be uniquely related to the function $\psi : [0, 4] \to \mathbb{R}$ by the following equation:
\[
\varphi(x \cdot y) := \psi(2 - 2x \cdot y) = \psi(|x - y|^2), \qquad x, y \in S^d.
\]
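The change of variable between the two functions rests on the identity $|x - y|^2 = 2 - 2x\cdot y$ for unit vectors. A minimal numerical sketch follows. The helper names `random_unit_vector` and `psi_from_phi`, the test kernel `math.exp`, and the choice of $S^3$ are assumptions made for illustration only.

```python
import math
import random

def random_unit_vector(dim, rng):
    """Draw a point on the unit sphere in R^dim by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def psi_from_phi(phi):
    """Given phi on [-1, 1], return the function t -> phi(1 - t/2) on [0, 4]."""
    return lambda t: phi(1.0 - t / 2.0)

phi = math.exp            # illustrative choice of phi
psi = psi_from_phi(phi)
rng = random.Random(0)    # fixed seed so the run is reproducible

max_gap = 0.0
for _ in range(1000):
    x = random_unit_vector(4, rng)   # points on S^3
    y = random_unit_vector(4, rng)
    dot = sum(a * b for a, b in zip(x, y))
    dist2 = sum((a - b) ** 2 for a, b in zip(x, y))
    # phi(x . y) and psi(|x - y|^2) should agree up to rounding.
    max_gap = max(max_gap, abs(phi(dot) - psi(dist2)))
print(max_gap)
```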

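6.1. The distinctness of minimal energy points

Lemma 6.1. Let $y \in S^{d-1}$ ($d \ge 2$) be fixed, and let $d\mu(x)$ be the rotationally invariant probability measure on $S^{d-1}$. Then
\[
\int_{S^{d-1}} (x \cdot y)\,d\mu(x) = 0, \qquad \int_{S^{d-1}} (x \cdot y)^2\,d\mu(x) = \frac{1}{d}.
\]

Proof. The first part is obvious. By the Funk–Hecke formula (see Müller [28]), with $\omega_{d-1} = 2\pi^{d/2}/\Gamma(d/2)$ denoting the surface area of $S^{d-1}$, we have
\[
\begin{aligned}
\int_{S^{d-1}} (x \cdot y)^2\,d\mu(x)
&= \frac{\omega_{d-2}}{\omega_{d-1}} \int_{-1}^{1} t^2 (1 - t^2)^{\frac{d-3}{2}}\,dt \\
&= \frac{2\pi^{\frac{d-1}{2}}}{\Gamma\bigl(\frac{d-1}{2}\bigr)} \Bigl[ \frac{2\pi^{\frac{d}{2}}}{\Gamma\bigl(\frac{d}{2}\bigr)} \Bigr]^{-1} \int_{0}^{1} t^{\frac{1}{2}} (1 - t)^{\frac{d-3}{2}}\,dt \\
&= \frac{\Gamma\bigl(\frac{d}{2}\bigr)}{\sqrt{\pi}\,\Gamma\bigl(\frac{d-1}{2}\bigr)} \cdot \frac{\Gamma\bigl(\frac{3}{2}\bigr)\Gamma\bigl(\frac{d-1}{2}\bigr)}{\Gamma\bigl(\frac{d+2}{2}\bigr)}
= \frac{\Gamma\bigl(\frac{d}{2}\bigr)}{\sqrt{\pi}} \cdot \frac{\frac{\sqrt{\pi}}{2}}{\frac{d}{2}\,\Gamma\bigl(\frac{d}{2}\bigr)}
= \frac{1}{d}.
\end{aligned}
\]
Here we have used the following familiar formulas for the Gamma function; see [1]:
\[
\int_{0}^{1} u^{p-1} (1 - u)^{q-1}\,du = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p + q)}, \qquad \Gamma(t + 1) = t\,\Gamma(t), \qquad \Gamma\Bigl(\frac{3}{2}\Bigr) = \frac{\sqrt{\pi}}{2}.
\]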
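The Gamma-function computation in this proof is easy to check numerically. The sketch below evaluates the closed-form product obtained after applying the Beta integral and compares it with $1/d$. The function name `second_moment` is a hypothetical helper introduced here.

```python
import math

def second_moment(d):
    """Closed form of the integral of (x . y)^2 over S^{d-1}, via Gamma functions."""
    # Ratio of surface areas omega_{d-2} / omega_{d-1}.
    area_ratio = math.gamma(d / 2) / (math.sqrt(math.pi) * math.gamma((d - 1) / 2))
    # Beta integral with p = 3/2 and q = (d - 1)/2.
    beta_integral = math.gamma(1.5) * math.gamma((d - 1) / 2) / math.gamma((d + 2) / 2)
    return area_ratio * beta_integral

for d in range(2, 11):
    print(d, second_moment(d), 1.0 / d)
```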



Proposition 6.2. Let $\psi : [0, 4] \to \mathbb{R}$ be continuous. Assume that $\psi$ is twice differentiable on $(0, 4)$, and that
\[
\lim_{t \to 0^+} \psi'(t) = -\infty.
\]
Let $X_N := \{x_1, \ldots, x_N\}$ be a set of $N$ distinct points on $S^d$ ($d \ge 2$). Let $U_N$ be the function on $S^d$ defined by
\[
U_N(x) := \sum_{j=1}^{N} \psi(|x - x_j|^2).
\]
Let $x_0 \in S^d$ be such that
\[
U_N(x_0) = \min_{x \in S^d} U_N(x).
\]
Then $x_0 \notin X_N$.

Proof. Suppose on the contrary that $x_0 \in X_N$. Without loss of generality, assume that $x_0 = x_1$. Thus we have, for all $x \in S^d$, that
\[
\psi(0) + \sum_{j=2}^{N} \psi(|x_1 - x_j|^2) \le \sum_{j=1}^{N} \psi(|x - x_j|^2),
\]
which allows us to conclude that, for all $x \in S^d$,
\[
\psi(0) - \psi(|x - x_1|^2) \le \sum_{j=2}^{N} \psi(|x - x_j|^2) - \sum_{j=2}^{N} \psi(|x_1 - x_j|^2). \tag{6.1}
\]

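Assume that $x_1 = (1, 0, \ldots, 0)$. Use $x_1$ as the pole, and introduce the spherical coordinate system on $S^d$. We can then represent a point $x$ on $S^d$ in the polar form:
\[
x = (\cos\theta, \sin\theta\,\tilde{x}), \qquad \tilde{x} \in S^{d-1}, \qquad \cos\theta = x \cdot x_1.
\]
Here $\theta$ is the angle between $x$ and $x_1$, and $\tilde{x}$ is often called the "curvilinear projection" of $x$ onto the "equator" of $S^d$, which is $S^{d-1}$. Likewise, each $x_j$ can be represented in the polar form:
\[
x_j = (\cos\theta_j, \sin\theta_j\,\tilde{x}_j), \qquad \tilde{x}_j \in S^{d-1}, \qquad \cos\theta_j = x_j \cdot x_1.
\]
Applying the Mean Value Theorem on the interval $[0, |x - x_1|^2]$, and noting that $|x - x_1|^2 = 2 - 2\cos\theta = 4\sin^2\frac{\theta}{2}$, we get
\[
\psi(0) - \psi(|x - x_1|^2) = -4\,\psi'(\xi)\,\sin^2\frac{\theta}{2}, \tag{6.2}
\]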

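where $\xi \in (0, |x - x_1|^2)$. Denote
\[
U^{(1)}(x) := \sum_{j=2}^{N} \psi(|x - x_j|^2),
\]
and expand $U^{(1)}$ in Taylor polynomials (with remainder). Since
\[
|x - x_j|^2 - |x_1 - x_j|^2 = 2\bigl[\cos\theta_j (1 - \cos\theta) - \sin\theta \sin\theta_j\,\tilde{x}_j \cdot \tilde{x}\bigr],
\]
the expansion reads as follows:
\[
\begin{aligned}
U^{(1)}(x) = U^{(1)}(x_1)
&+ 2\sum_{j=2}^{N} \psi'(|x_1 - x_j|^2)\bigl[\cos\theta_j (1 - \cos\theta) - \sin\theta \sin\theta_j\,\tilde{x}_j \cdot \tilde{x}\bigr] \\
&+ 2\sum_{j=2}^{N} \psi''(|x_1 - x_j|^2)\bigl[\cos\theta_j (1 - \cos\theta) - \sin\theta \sin\theta_j\,\tilde{x}_j \cdot \tilde{x}\bigr]^2 + o(\theta^2).
\end{aligned}
\]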

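Fixing $\theta$, we integrate over $\tilde{x}$ with respect to the rotationally invariant probability measure $\mu$ on $S^{d-1}$. Using Lemma 6.1 and Inequality (6.1), we get
\[
\begin{aligned}
\int_{S^{d-1}} U^{(1)}(x)\,d\mu(\tilde{x}) = U^{(1)}(x_1)
&+ 2\sum_{j=2}^{N} \psi'(|x_1 - x_j|^2) \cos\theta_j (1 - \cos\theta) \\
&+ \frac{2}{d}\sum_{j=2}^{N} \psi''(|x_1 - x_j|^2) \sin^2\theta\,\sin^2\theta_j + o(\theta^2).
\end{aligned}
\]
We continue simplifying by using the equations
\[
\cos\theta_j = x_1 \cdot x_j, \qquad \sin^2\theta_j = 1 - (x_1 \cdot x_j)^2, \qquad |x_1 - x_j|^2 = 2 - 2x_1 \cdot x_j,
\]
and the asymptotic relations
\[
\sin^2\theta = \theta^2 + O(\theta^3), \qquad 1 - \cos\theta = \frac{\theta^2}{2} + O(\theta^3).
\]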

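We obtain
\[
\begin{aligned}
\int_{S^{d-1}} U^{(1)}(x)\,d\mu(\tilde{x})
&= U^{(1)}(x_1) + \theta^2 \sum_{j=2}^{N} \bigl[\psi'(|x_1 - x_j|^2)(x_1 \cdot x_j - 1 + 1)\bigr] \\
&\quad + \frac{2\theta^2}{d} \sum_{j=2}^{N} \bigl[\psi''(|x_1 - x_j|^2)(1 - x_1 \cdot x_j)(1 + x_1 \cdot x_j)\bigr] + o(\theta^2) \\
&= U^{(1)}(x_1) + \theta^2 \sum_{j=2}^{N} \psi'(|x_1 - x_j|^2) - \frac{\theta^2}{2} \sum_{j=2}^{N} |x_1 - x_j|^2\,\psi'(|x_1 - x_j|^2) \\
&\quad + \frac{2\theta^2}{d} \sum_{j=2}^{N} |x_1 - x_j|^2\,\psi''(|x_1 - x_j|^2) - \frac{\theta^2}{2d} \sum_{j=2}^{N} |x_1 - x_j|^4\,\psi''(|x_1 - x_j|^2) + o(\theta^2) \\
&= U^{(1)}(x_1) + \theta^2 \sum_{j=2}^{N} \Lambda(|x_1 - x_j|^2) + o(\theta^2),
\end{aligned} \tag{6.3}
\]
where
\[
\Lambda(t) = \Bigl(1 - \frac{t}{2}\Bigr)\psi'(t) + \frac{t}{d}\Bigl(2 - \frac{t}{2}\Bigr)\psi''(t).
\]
Integrating both sides of Inequality (6.1) over $S^{d-1}$ with respect to the measure $\mu(\tilde{x})$, noting that the left-hand side of the inequality depends only on $\theta$, and using Eqs. (6.2) and (6.3), we have
\[
-4\,\psi'(\xi)\,\sin^2\frac{\theta}{2} \le \theta^2 \sum_{j=2}^{N} \Lambda(|x_1 - x_j|^2) + o(\theta^2).
\]
This is a contradiction since $\lim_{t \to 0^+} \psi'(t) = -\infty$: dividing by $\theta^2$ and letting $\theta \to 0^+$ forces $\xi \to 0^+$, so the left-hand side tends to $+\infty$ while the right-hand side remains bounded.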
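The second-order expansion (6.3) can be sanity-checked numerically in the case $d = 2$, where $\tilde{x}$ runs over a circle. The sketch below assumes the form $\Lambda(t) = (1 - t/2)\psi'(t) + (t/d)(2 - t/2)\psi''(t)$ and uses the illustrative choice $\psi(t) = -t^{1/2}$. The three points in `OTHERS` and all helper names are hypothetical, chosen only for this check.

```python
import math

BETA = 0.5   # illustrative exponent: psi(t) = -t**BETA
D = 2        # the sphere is S^2, so x-tilde lies on the circle S^1

def psi(t):
    return -t ** BETA

def dpsi(t):
    return -BETA * t ** (BETA - 1)

def d2psi(t):
    return -BETA * (BETA - 1) * t ** (BETA - 2)

def lam(t):
    """Lambda(t) in the form assumed from Eq. (6.3)."""
    return (1 - t / 2) * dpsi(t) + (t / D) * (2 - t / 2) * d2psi(t)

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def dist2(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

X1 = (1.0, 0.0, 0.0)   # the pole x_1
OTHERS = [(0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), normalize((0.3, -0.5, 0.8))]

def u1(x):
    """U^(1)(x): sum of psi(|x - x_j|^2) over the points other than x_1."""
    return sum(psi(dist2(x, xj)) for xj in OTHERS)

def circle_average(theta, m=720):
    """Average of U^(1) over the circle at angular distance theta from x_1."""
    total = 0.0
    for k in range(m):
        a = 2 * math.pi * k / m
        x = (math.cos(theta), math.sin(theta) * math.cos(a), math.sin(theta) * math.sin(a))
        total += u1(x)
    return total / m

theta = 1e-2
lhs = (circle_average(theta) - u1(X1)) / theta ** 2   # empirical second-order coefficient
rhs = sum(lam(dist2(X1, xj)) for xj in OTHERS)        # coefficient predicted by (6.3)
print(lhs, rhs)
```

The two printed numbers should agree up to the higher-order terms absorbed in $o(\theta^2)$.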



Remark 6.3. The long proof presented above serves to obtain Eq. (6.3), which is crucial in the proof of Theorem 6.5 in the next subsection. A simple proof (without the benefit of obtaining Eq. (6.3)) has been pointed out to us by an anonymous referee, in which one minimizes the function $U^{(1)}$ on a suitably chosen arc.

Theorem 6.4. Let $\psi : [0, 4] \to \mathbb{R}$ be a function satisfying the conditions of Proposition 6.2. Let $x_1, \ldots, x_N$ be $N$ points that minimize the $N$-point discrete $\psi$-energy. Then the points $x_1, \ldots, x_N$ are all distinct.

Proof. For $N = 2$, any two minimal energy points are clearly distinct. Let $N$ ($\ge 3$) be the minimal natural number for which the distinctness among the minimal energy points $x_1, \ldots, x_N$ no longer holds. Without loss of generality, we assume that there is an $x_j$, $2 \le j \le N$, such that $x_j = x_1$. The $N$ points $x_1, \ldots, x_N$ also minimize the function $E_\psi(Y_N)$ defined by
\[
E_\psi(Y_N) := \sum_{i \ne j} \psi(|y_i - y_j|^2), \qquad Y_N := (y_1, \ldots, y_N) \in \underbrace{S^d \times \cdots \times S^d}_{N},
\]
where the summation $\sum_{i \ne j}$ is over all $i$ and $j$ with $i \ne j$. If only the point $x_1$ is moved and the others are held fixed, then the pairwise interactions of pairs not involving $x_1$ are constant. Thus we have
\[
\sum_{j=2}^{N} \psi(|x_1 - x_j|^2) \le \sum_{j=2}^{N} \psi(|x - x_j|^2), \qquad x \in S^d,
\]
which contradicts Proposition 6.2.
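A small numerical experiment illustrates why coincident points cannot be energy minimizers when $\psi'(t) \to -\infty$ as $t \to 0^+$. The sketch below uses the illustrative choice $\psi(t) = -\sqrt{t}$ and a hypothetical three-point configuration on $S^2$. It only compares two specific configurations and is not a proof. The names `energy` and `tilted` are introduced here for illustration.

```python
import math

def energy(points, psi):
    """Discrete energy: sum over ordered pairs i != j of psi(|y_i - y_j|^2)."""
    total = 0.0
    for i, yi in enumerate(points):
        for j, yj in enumerate(points):
            if i != j:
                total += psi(sum((a - b) ** 2 for a, b in zip(yi, yj)))
    return total

def psi(t):
    # Illustrative choice: psi(t) = -sqrt(t), whose derivative tends to -infinity at 0+.
    return -math.sqrt(t)

north = (0.0, 0.0, 1.0)
south = (0.0, 0.0, -1.0)

def tilted(theta):
    """Point on S^2 at angle theta from the north pole."""
    return (math.sin(theta), 0.0, math.cos(theta))

e_coincident = energy([north, north, south], psi)        # two points coincide
e_separated = energy([north, tilted(1e-3), south], psi)  # one of them moved slightly
print(e_coincident, e_separated)
```

Moving one of the coincident points by a small angle strictly lowers the energy in this example, in line with the statement of Theorem 6.4.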



6.2. Separation estimate for minimal energy points

Let $S$ denote the set of continuous functions $\psi$ on $[0, 4]$ that also satisfy the following conditions:

1. $\psi$ is twice differentiable on $(0, 4)$.
2. $\lim_{t \to 0^+} \psi'(t) = -\infty$.
3. The function $\eta$ defined by
\[
\eta(t) := -\frac{1}{2}\Bigl( t\,\psi'(t) + \frac{1}{d}\,t^2\,\psi''(t) \Bigr)
\]
is nonnegative and bounded on $[0, 4]$.
4. The function $\kappa$ defined by
\[
\kappa(t) := -\psi'(t) - \frac{2}{d}\,t\,\psi''(t)
\]
is nonnegative and strictly decreasing on $(0, 4]$, and satisfies $\lim_{t \to 0^+} \kappa(t) = \infty$.

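Theorem 6.5. Let $\psi \in S$, and let $\{x_1, x_2, \ldots, x_N\}$ be a set of $N$ points on $S^d$ ($d \ge 2$) that minimizes the $N$-point discrete $\psi$-energy on $S^d$. Write $\eta$ and $\kappa$ for the functions in conditions 3 and 4 of the definition of $S$. Then the following estimate holds true:
\[
\min_{i \ne j} |x_i - x_j| \ge \sqrt{\kappa^{-1}(BN)},
\]
where $B := \sup_{0 \le t \le 4} \eta(t)$, and $\kappa^{-1}$ denotes the inverse function of $\kappa$.

Proof. We know from Theorem 6.4 that the points $x_1, x_2, \ldots, x_N$ are all distinct, and that for each fixed $x_i$,
\[
\sum_{j \ne i} \psi(|x_j - x_i|^2) = \min_{x \in S^d} \sum_{j \ne i} \psi(|x_j - x|^2).
\]
Resorting to Eq. (6.3), whose second-order coefficient $\Lambda(t) = \bigl(1 - \frac{t}{2}\bigr)\psi'(t) + \frac{t}{d}\bigl(2 - \frac{t}{2}\bigr)\psi''(t)$ must be nonnegative in total at a minimum, we obtain that
\[
\sum_{j=2}^{N} \Lambda(|x_1 - x_j|^2) \ge 0.
\]
Since $\Lambda(t) = \eta(t) - \kappa(t)$, the above is
\[
\sum_{j=2}^{N} \kappa(|x_1 - x_j|^2) \le \sum_{j=2}^{N} \eta(|x_1 - x_j|^2).
\]
It then follows that, for each $j \ne 1$,
\[
\kappa(|x_1 - x_j|^2) \le \sum_{k=2}^{N} \kappa(|x_1 - x_k|^2) \le \sum_{k=2}^{N} \eta(|x_1 - x_k|^2) \le BN. \tag{6.4}
\]
Since $\kappa$ is a strictly decreasing function, we have $|x_1 - x_j|^2 \ge \kappa^{-1}(BN)$. That is,
\[
|x_1 - x_j| \ge \sqrt{\kappa^{-1}(BN)}.
\]
As $x_1$ and $x_j$ ($\ne x_1$) can be chosen arbitrarily, the desired inequality follows.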



Examples. There are abundant SBFs of order $k$, $k = 0, 1$, that are also in the set $S$. We consider two such examples. Let $0 < \beta < 1$. Then the function $\varphi_1$ defined by
\[
\varphi_1(x) = -|x|^{2\beta}, \qquad x \in \mathbb{R}^{d+1},
\]

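is s.c.p.d. of order 1 in $\mathbb{R}^{d+1}$. In fact, it was shown in [50] that, with $t = |x|$,
\[
-|x|^{2\beta} = -\frac{2^{2\beta+1}\,\Gamma\bigl(\frac{2\beta+d+1}{2}\bigr)}{\Gamma(-\beta)\,\Gamma\bigl(\frac{d+1}{2}\bigr)} \int_{0}^{\infty} \frac{\Omega(tu) - 1}{u^2}\,u^{1-2\beta}\,du,
\]
where $\Omega$ denotes the kernel function used in [50].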

Thus, its restriction to $S^d$ is an SBF of order 1. Let $\{x_1, \ldots, x_N\}$ be a set of $N$ points on $S^d$ that minimize the $N$-point discrete $\varphi_1$-energy. Then by Theorem 3.4, the points $x_1, \ldots, x_N$ are uniformly distributed on $S^d$. Applying Theorem 6.5, we can get the following separation estimate:
\[
\min_{i \ne j} |x_i - x_j| \ge \Bigl( \frac{d + 2\beta - 2}{2^{2\beta-1}(d + \beta - 1)} \Bigr)^{\frac{1}{2-2\beta}} N^{-\frac{1}{2-2\beta}}.
\]
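The closed-form estimate for the first example can be cross-checked against the general recipe of Theorem 6.5. The sketch below assumes the forms $\eta(t) = -\frac12\bigl(t\psi'(t) + \frac1d t^2\psi''(t)\bigr)$ and $\kappa(t) = -\psi'(t) - \frac2d t\psi''(t)$ with $\psi(t) = -t^\beta$, inverts $\kappa$ by bisection, and compares the result with the displayed formula. The function names and the bisection bracket are choices made here for illustration.

```python
import math

def separation_bound(beta, d, n):
    """sqrt(kappa^{-1}(B * n)) for psi(t) = -t**beta, found by bisection."""
    dpsi = lambda t: -beta * t ** (beta - 1)
    d2psi = lambda t: beta * (1 - beta) * t ** (beta - 2)
    eta = lambda t: -0.5 * (t * dpsi(t) + t * t * d2psi(t) / d)
    kappa = lambda t: -dpsi(t) - 2 * t * d2psi(t) / d

    b = eta(4.0)            # eta is increasing for this psi, so its sup on [0, 4] is eta(4)
    target = b * n
    lo, hi = 1e-12, 4.0     # bracket assumed to contain the solution of kappa(t) = target
    assert kappa(lo) > target > kappa(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if kappa(mid) > target:   # kappa is decreasing: the solution lies to the right
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo)

def closed_form(beta, d, n):
    """The separation estimate displayed for the first example."""
    c = (d + 2 * beta - 2) / (2 ** (2 * beta - 1) * (d + beta - 1))
    return c ** (1 / (2 - 2 * beta)) * n ** (-1 / (2 - 2 * beta))

numeric = separation_bound(0.5, 2, 100)
formula = closed_form(0.5, 2, 100)
print(numeric, formula)
```

For $\beta = 1/2$, $d = 2$ and $N = 100$ both routes give $1/150$, which matches the $N^{-1}$ rate expected in this case.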

For the special case $d = 2$, this estimate was obtained by Stolarsky in 1975 [49].

Let $0 < \beta < 1$. Then the function $\varphi_2$ defined by
\[
\varphi_2(x) = e^{-|x|^{2\beta}}, \qquad x \in \mathbb{R}^{d+1},
\]
is strictly positive definite in $\mathbb{R}^{d+1}$; see [44]. Its restriction to $S^d$ is an SBF. Therefore, every set of minimal energy points $\{x_1, \ldots, x_N\}$ associated with $\varphi_2$ is uniformly distributed on $S^d$. The discrepancy estimate in Theorem 5.3 applies. Furthermore, applying Theorem 6.5, we get the following estimate:
\[
\min_{i \ne j} |x_i - x_j| \ge C(\beta, d)\,N^{-\left(\frac{1}{2-2\beta} + \epsilon\right)},
\]
where $C(\beta, d)$ is a constant depending only on $\beta$ and $d$, and $\epsilon > 0$ can be selected as small as one wishes. The appearance of $\epsilon$ here is the result of a crude estimate for the inverse function required in Theorem 6.5. Heuristic analysis strongly suggests the following estimate:
\[
\min_{i \ne j} |x_i - x_j| \ge C(\beta, d)\,N^{-\frac{1}{2-2\beta}} \log N.
\]

Acknowledgments

The authors thank the two anonymous referees for their insightful suggestions and corrections that have enhanced the quality of the paper.

References

[1] M. Abramowitz, I. Stegun, Handbook of Mathematical Functions, with Formulas, Graphs, and Mathematical Tables, Dover Publications, New York, 1974.
[2] N. Aronszajn, Theory of reproducing kernels, Trans. Amer. Math. Soc. 68 (1950) 337–404.
[3] J. Beck, W. Chen, Irregularities of Distribution, Cambridge University Press, Cambridge, 1987.
[4] J. Bourgain, J. Lindenstrauss, Distribution of points on spheres and approximation by zonotopes, Israel J. Math. 64 (1988) 25–31.
[5] J.S. Brauchart, Invariance principle for energy functionals on spheres, Monatsh. Math. 141 (2004) 101–117.
[6] J.S. Brauchart, K. Hesse, Numerical integration over spheres of arbitrary dimension, Constr. Approx. 25 (2007) 41–71.
[7] W. zu Castell, F. Filbir, Radial basis functions and corresponding zonal series expansions on the sphere, J. Approx. Theory 134 (2005) 65–79.
[8] D. Chen, V.A. Menegatto, X. Sun, A necessary and sufficient condition for strictly positive definite functions on spheres, Proc. Amer. Math. Soc. 131 (2003) 2733–2740.
[9] F. Cucker, S. Smale, On the mathematical foundations of learning, Bull. Amer. Math. Soc. (N.S.) 39 (2001) 1–49.
[10] S.B. Damelin, P.J. Grabner, Energy functionals, numerical integration and asymptotic equidistribution on the sphere, J. Complexity 19 (3) (2003) 231–246.
[11] S.B. Damelin, J. Levesley, X. Sun, Energy estimates and the Weyl criterion on compact homogeneous manifolds, in: Algorithm for Approximation, Springer, 2007, pp. 359–367.
[12] N. Dyn, F. Narcowich, J. Ward, Variational principles and Sobolev-type estimates for generalized interpolation on a Riemannian manifold, Constr. Approx. 15 (1999) 175–208.
[13] U. Feige, G. Schechtman, On the optimality of the random hyperplane rounding technique for MAX CUT, Random Structures and Algorithms 20 (2002) 403–440 (special issue: Probabilistic Methods in Combinatorial Optimization).
[14] K. Guo, S. Hu, X. Sun, Conditionally positive definite functions and Laplace–Stieltjes integrals, J. Approx. Theory 74 (1993) 249–265.
[15] D.P. Hardin, E.B. Saff, Discretizing manifolds via minimum energy points, Notices Amer. Math. Soc. 51 (10) (2004) 1186–1194.
[16] D.P. Hardin, E.B. Saff, Minimal Riesz energy point configurations for rectifiable d-dimensional manifolds, Adv. Math. 193 (2005) 174–204.
[17] D.P. Hardin, E.B. Saff, H. Stahl, The support of the logarithmic equilibrium measure on the sets of revolution, J. Math. Phys. 48 (2007) 022901 (14pp).
[18] K. Hesse, A lower bound for the worst-case cubature error on spheres of arbitrary dimension, Numer. Math. 103 (2006) 413–433.
[19] K. Hesse, I.H. Sloan, Optimal lower bounds for cubature error on the sphere $S^2$, J. Complexity 21 (2005) 790–803.
[20] K. Hesse, I.H. Sloan, Worst-case errors in a Sobolev space setting for cubature over the sphere $S^2$, Bull. Austral. Math. Soc. 71 (2005) 81–105.
[21] K. Hesse, I.H. Sloan, Cubature over the sphere $S^2$ in Sobolev spaces of arbitrary order, J. Approx. Theory 141 (2006) 118–133.
[22] A.B.J. Kuijlaars, E.B. Saff, Asymptotics for minimal discrete energy on the sphere, Trans. Amer. Math. Soc. 350 (1998) 523–538.
[23] L. Kuipers, H. Niederreiter, Uniform Distribution of Sequences, Wiley, New York, 1974.

[24] P. Leopardi, A partition of the unit sphere into regions of equal area and small diameter, Electron. Trans. Numer. Anal., 2006, to appear.
[25] J. Levesley, W. Light, D. Ragozin, X. Sun, A simple approach to variational theory for interpolation on spheres, Internat. Theory Numer. Anal. 132 (1999) (Birkhauser, Basel, Switzerland).
[26] J. Levesley, X. Sun, Approximation in rough native space by shifts of smooth kernels on spheres, J. Approx. Theory 133 (2005) 269–283.
[27] H. Montgomery, Ten lectures on the interface between analytic number theory and harmonic analysis, CBMS, Regional Conference Series in Mathematics, No. 84, American Mathematical Society, Providence, Rhode Island, 1990.
[28] C. Müller, Spherical Harmonics, Lecture Notes in Mathematics, Vol. 17, Springer, Berlin, 1966.
[29] F.J. Narcowich, Generalized Hermite interpolation and positive definite kernels on a Riemannian manifold, J. Math. Anal. Appl. 190 (1995) 165–193.
[30] F.J. Narcowich, X. Sun, J.D. Ward, Approximation power of RBFs and their associated SBFs: a connection, Adv. Comput. Math. 27 (2007) 107–124.
[31] F.J. Narcowich, X. Sun, J.D. Ward, H. Wendland, Direct and inverse Sobolev error estimates for scattered data interpolation via spherical basis functions, Found. Comput. Math. (2007) 369–390.
[32] F.J. Narcowich, J.D. Ward, Norm of inverses and condition numbers of matrices associated with scattered data, J. Approx. Theory 64 (1991) 69–94.
[33] F.J. Narcowich, J.D. Ward, Norm estimates for the inverses of a general classes of scattered-data radial function interpolation matrices, J. Approx. Theory 69 (1992) 84–109.
[34] F.J. Narcowich, J.D. Ward, Scattered data interpolation on spheres: error estimates and locally supported basis functions, SIAM J. Math. Anal. 33 (2002) 1393–1410.
[35] F.J. Narcowich, J.D. Ward, Scattered-data interpolation on $\mathbb{R}^n$: error estimates for radial basis and band-limited functions, SIAM J. Math. Anal. 36 (2004) 284–300.
[36] F.J. Narcowich, J.D. Ward, H. Wendland, Sobolev error estimates and a Bernstein inequality for scattered-data interpolation via radial basis functions, Constr. Approx. 24 (2006) 175–186.
[37] A. Pinkus, Strictly Hermitian positive definite functions, J. Anal. Math. 94 (2004) 293–318.
[38] T. Poggio, Networks for approximation and learning, Proc. IEEE 78 (1990) 1481–1497.
[39] M.J.D. Powell, The theory of radial basis functions in 1990, in: W. Light (Ed.), Wavelets, Subdivision, and Radial Basis Functions, Oxford University Press, Oxford, 1990.
[40] E.A. Rakhmanov, E.B. Saff, Y.M. Zhou, Minimal discrete energy on the sphere, Math. Res. Lett. 1 (6) (1994) 647–662.
[41] W. Rudin, Real and Complex Analysis, 3rd ed., McGraw-Hill Series in Higher Mathematics, McGraw-Hill Inc., New York, 1987.
[42] E.B. Saff, A.B.J. Kuijlaars, Distributing many points on a sphere, Math. Intelligencer 19 (1997) 5–11.
[43] R. Schaback, H. Wendland, Inverse and saturation theorems for radial basis function interpolation, Math. Comp. 71 (2001) 669–681.
[44] I.J. Schoenberg, On metric spaces arising from Euclidean spaces by a change of metric and their imbedding in Hilbert space, Ann. of Math. 37 (1937) 787–793.
[45] I.J. Schoenberg, Metric spaces and positive definite functions, Trans. Amer. Math. Soc. 44 (1938) 522–536.
[46] I.J. Schoenberg, Metric spaces and completely monotone functions, Ann. of Math. 39 (1938) 811–841.
[47] I.J. Schoenberg, Positive definite functions on spheres, Duke Math. J. 9 (1942) 96–108.
[48] K.B. Stolarsky, Sums of distances between points on a sphere II, Proc. Amer. Math. Soc. 41 (1973) 575–582.
[49] K.B. Stolarsky, Spherical distribution of N points with maximal distance sums are well spaced, Proc. Amer. Math. Soc. 48 (1975) 203–206.
[50] X. Sun, Norm estimates for inverses of Euclidean distance matrices, J. Approx. Theory 70 (1992) 339–347.
[51] X. Sun, Conditionally positive definite functions and their applications to multivariate interpolations, J. Approx. Theory 73 (1993) 159–180.
[52] H. Wendland, Scattered Data Approximation, Cambridge Monographs on Applied and Computational Mathematics, Cambridge University Press, Cambridge, 2005.
[53] Y. Xu, E.W. Cheney, Strictly positive definite functions on spheres, Proc. Amer. Math. Soc. 116 (1992) 977–981.