Econometrics and Statistics 1 (2017) 118–127
On the consistency of bootstrap methods in separable Hilbert spaces
Gil González-Rodríguez∗, Ana Colubi
INDUROT/Department of Statistics and OR, Campus de Mieres, University of Oviedo, Mieres 3600, Spain
Article info
Article history: Received 2 June 2016; Revised 14 November 2016; Accepted 14 November 2016; Available online 22 November 2016
Keywords: Bootstrap methods; Consistency; Hilbert spaces; Functional data; Independent random elements; Functional sample mean; Functional regression models
Abstract
Hilbert spaces are frequently used in statistics as a framework to deal with general random elements, especially with functional-valued random variables. The scarcity of common parametric distribution models in this context makes it important to develop non-parametric techniques, and among them, the bootstrap has already proved to be especially valuable. The aim is to establish a methodology to derive consistency results for some usual bootstrap methods when working in separable Hilbert spaces. Naive bootstrap, bootstrap with arbitrary sample size, wild bootstrap, and more generally, weighted bootstrap methods, including double bootstrap and bootstrap generated by deterministic weights with the particular case of the delete-h jackknife, will be proved to be consistent by applying the proposed methodology. The main results concern the bootstrapped sample mean; however, since many usual statistics can be written in terms of means by considering suitable spaces, the applicability is notable. An illustration showing how to employ the approach in the context of a functional regression problem is included.
© 2016 ECOSTA ECONOMETRICS AND STATISTICS. Published by Elsevier B.V. All rights reserved.
1. Introduction
The explosion of functional data analysis during the last decades underlines the need to develop statistical tools in general spaces (Hormann and Kokoszka, 2010; Ramsay and Silverman, 2005; Wang et al., 2016; Yao et al., 2005). Separable Hilbert spaces (Cardot et al., 2013; Gabrys and Kokoszka, 2007; González-Rodríguez et al., 2012) provide a natural and flexible framework. The good properties of the metric structure of separable Hilbert spaces make it intuitive to generalize classical concepts and results, such as the expectation, the covariance matrix, linear regression models, etc. (Cardot et al., 1999; Kosorok, 2008; Ledoux and Talagrand, 1991). These general concepts and results then become applicable to functional and other complex data (Biglieri and Yao, 1989; González-Rodríguez et al., 2012; Li and Hsing, 2010). Given the generality and the inherent high dimensionality of spaces of this kind, few parametric models are used in practice to model Hilbert-valued random elements, although Gaussian processes continue to play an important role via the CLT (Araujo and Giné, 1980). Thus, non-parametric tools are widely employed (Ferraty and Vieu, 2006) and, in this context, bootstrap techniques are very useful (Cuevas et al., 2006; Ferraty et al., 2010; Wang et al., 2016). In order to theoretically support the use of these techniques, their consistency should be analyzed. There exist a number of results in the literature devoted to proving the consistency of different types of bootstrap in certain general spaces for the
∗ Corresponding author: G. González-Rodríguez.
http://dx.doi.org/10.1016/j.ecosta.2016.11.001 2452-3062/© 2016 ECOSTA ECONOMETRICS AND STATISTICS. Published by Elsevier B.V. All rights reserved.
G. González-Rodríguez, A. Colubi / Econometrics and Statistics 1 (2017) 118–127
case of independent random elements, which is the one considered in this paper. Remarkably, Gine and Zinn (1990) proved the consistency of the naive bootstrap for the sample mean in a space of indexed empirical processes and derived the same result in separable Hilbert spaces as a corollary. Given the importance of such a space in dealing with empirical measures, other bootstrap methods have been studied, e.g. Kosorok (2003); Ledoux and Talagrand (1988); Praestgaard and Wellner (1993). The variety of techniques makes it complicated to analyze their counterparts in Hilbert spaces. A unified framework will be provided to derive the consistency of the usual bootstrap approaches that can be employed for separable Hilbert spaces under independence. A number of bootstrap methods for the sample mean will be obtained as examples within the considered framework. Most notably, naive bootstrap, bootstrap with arbitrary sample size, wild bootstrap, and various weighted bootstraps, including double bootstrap, Bayesian bootstrap, urn model bootstrap and bootstrap generated by deterministic weights, with the particular case of the delete-h jackknife, can be addressed. Moreover, how to extend these results to other mean-based statistics will be illustrated. As an important example, a problem concerning a functional regression model will be considered.
The novelty is based on the introduction of a linkage function that allows us to easily derive bootstrap results in Hilbert spaces from the well-developed theory of empirical processes indexed by a family of functions. This simplifies the task of applied statisticians by translating elaborate probability results into a closer and more familiar framework.
The rest of the manuscript is organized as follows. In Section 2, the main results for separable Hilbert spaces are stated.
That is, the linkage function to connect any separable Hilbert space with a particular space of functions indexed by a class of functions is introduced, and its main properties are derived. The consistency of weighted bootstrap approaches is guaranteed under general conditions. For illustrative purposes, Section 3 will be devoted to the development of a linear independence test between a functional response and any subset of scalar regressors involved in a multiple linear model. Various bootstrap procedures will be proposed, and they will be shown to be consistent by employing the results previously stated. In Section 4, technical details are gathered, namely, some notation and well-known results concerning bootstrap methods for empirical processes are recalled. This supporting theory is used in order to prove the main results stated in Section 2. The notation for the relevant case of $L^2$ spaces is clarified in a subsection. Some concluding remarks are collected in Section 5.

2. Bootstrap approaches in separable Hilbert spaces
Let $\mathcal{X}$ be an arbitrary space, and let $\mathcal{F}$ be any class of measurable functions $f : \mathcal{X} \to \mathbb{R}$. In the same way, let
$$l^\infty(\mathcal{F}) = \{g : \mathcal{F} \to \mathbb{R},\ \|g\|_{\mathcal{F}} < \infty\} \quad \text{with} \quad \|g\|_{\mathcal{F}} = \sup_{f \in \mathcal{F}} |g(f)|,$$
which is a Banach space when the sum and the product are defined pointwise. This is a natural space for analyzing the behavior of the empirical processes indexed by the class of functions $\mathcal{F}$, which will be considered in Section 4 for technical purposes.
Let $(H, \langle \cdot, \cdot \rangle)$ be a separable Hilbert space and denote by $\|\cdot\|$ the norm associated with the inner product. Let $(\Omega, \mathcal{A}, P)$ be a probability space and let $X$ be an $H$-valued random element so that $E\|X\|^2 < \infty$. $Z_X$ will denote a centred Gaussian $H$-valued random element having the same covariance operator as $X$. Finally, let $X_1, X_2, \dots$ be a sequence of i.i.d. $H$-valued random elements following the distribution of $X$. The CLT for i.i.d. separable Hilbert-valued random elements (see, e.g., Laha and Rohatgi, 1979) ensures that
$$\frac{1}{\sqrt{n}} \sum_{i=1}^n (X_i - E(X)) \to Z_X$$
weakly in $H$. In order to link the spaces $H$ and $l^\infty(\mathcal{F})$, the index class of functions to be considered is the closed unit ball of the dual space of $H$, namely,
$$\mathcal{F} = \{ f \in H' \mid \|f\| \le 1 \}.$$
It should be noted that, by definition, $f \in H'$ if and only if $f : H \to \mathbb{R}$ is a continuous linear function and consequently is measurable.
The next theorem establishes a useful linkage between the spaces $H$ and $l^\infty(\mathcal{F})$. This linkage will allow us to derive results for $H$-valued random elements from their counterpart results stated for empirical processes indexed by a class of functions. Although this paper focuses only on the consistency of bootstrap approaches, other interesting results within the well-known context of empirical processes could be adapted as well (e.g., Kosorok, 2008).
Theorem 1. Let $D : H \to l^\infty(\mathcal{F})$ be defined so that
$$D(h)(f) = f(h) \quad \text{for all } h \in H \text{ and all } f \in \mathcal{F}.$$
Then
(a) $D$ is a continuous mapping.
(b) There exists $D^{-1} : l^\infty(\mathcal{F}) \to H$ continuous with $D^{-1}(D(h)) = h$ for all $h \in H$.
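The linkage map of Theorem 1 can be illustrated numerically. The following is a minimal sketch, assuming the toy case $H = \mathbb{R}^d$ (so that every functional in the unit ball of the dual is $f \mapsto \langle f, h \rangle$ with $\|f\| \le 1$). The names `D`, `ball`, `sup_norm` and `h_norm` are hypothetical helpers introduced only for this illustration; the supremum over the unit ball is approximated over a finite sample of the ball that includes the maximiser $h/\|h\|$.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5                                   # toy dimension; H = R^d is an assumption
h = rng.normal(size=d)

def D(h):
    """Linkage map: send h to the functional f -> f(h) = <f, h>."""
    return lambda f: float(np.dot(f, h))

# Sample points of the closed unit ball and add the maximiser h / ||h||.
ball = rng.normal(size=(10_000, d))
ball /= np.maximum(1.0, np.linalg.norm(ball, axis=1, keepdims=True))
ball = np.vstack([ball, h / np.linalg.norm(h)])

Dh = D(h)
sup_norm = max(abs(Dh(f)) for f in ball)   # approximates the sup-norm of D(h)
h_norm = float(np.linalg.norm(h))          # norm of h in H
```

In this sketch `sup_norm` and `h_norm` agree up to rounding, which is the isometry that the proof relies on to obtain a continuous inverse on the range of $D$.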
Proof. On the one hand,
$$\|D(h)\|_{\mathcal{F}} = \sup_{f \in \mathcal{F}} |D(h)(f)| = \sup_{f \in \mathcal{F}} |f(h)| = \|h\|.$$
Then, $D$ is a bounded operator. It is easy to check that it is also linear. Consequently $D$ is continuous, so (a) follows.
Let $R(D) \subset l^\infty(\mathcal{F})$ be the range of $D$. As $\|D(h)\|_{\mathcal{F}} = \|h\|$ for all $h \in H$, $D$ has a continuous inverse $D^{-1} : R(D) \to H$ (see, e.g., Kosorok, 2008, Lemma 6.16 – i). Since $H$ and $l^\infty(\mathcal{F})$ are complete and
$$N(D) = \{h \in H \mid D(h) = 0\} = \{0\},$$
Banach's theorem (see, e.g., Kosorok, 2008, Lemma 6.16 – ii) guarantees that $R(D)$ is closed. In addition, as $D^{-1} : R(D) \to H$ is continuous and $R(D)$ is closed, Dugundji's extension theorem (see, e.g., Kosorok, 2008, Theorem 10.9) ensures the existence of a continuous extension $D^{-1} : l^\infty(\mathcal{F}) \to H$, which proves (b).

In order to introduce the practical exchangeable weighted bootstrap, an array of random variables $\{W_{nj}\}_{j=1}^n$ verifying the following conditions (see Praestgaard and Wellner, 1993) will be considered:
A1) $(W_{n1}, \dots, W_{nn})$ are exchangeable,
A2) $W_{nj} \ge 0$ for all $j$ and $\sum_{j=1}^n W_{nj} = n$ for all $n \in \mathbb{N}$,
A3) $\sup_n \int_0^\infty (P(|W_{n1}| > t))^{1/2}\,dt < \infty$,
A4) $\lim_{\lambda \to \infty} \limsup_{n \to \infty} \sup_{t \ge \lambda} t^2 P(W_{n1} \ge t) = 0$,
A5) $(1/n) \sum_{j=1}^n (W_{nj} - 1)^2 \to c^2 > 0$ in probability.
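Two weight arrays of this kind can be generated and checked numerically. The sketch below is only an illustration under stated assumptions: `naive_weights` and `delete_h_weights` are hypothetical helper names, the multinomial counts stand for the naive bootstrap, and the second function places weight $n/(n-h)$ on $n-h$ indices kept without replacement. The quantities `c2_naive` and `c2_jack` are the empirical versions of the limit in A5; for the delete-h weights this average equals $h/(n-h)$ exactly.

```python
import numpy as np

def naive_weights(n, rng):
    """Multinomial counts: how many times each index is drawn with replacement."""
    return rng.multinomial(n, np.full(n, 1.0 / n)).astype(float)

def delete_h_weights(n, h, rng):
    """Weight n/(n-h) on n-h indices kept without replacement, 0 on the rest."""
    w = np.zeros(n)
    keep = rng.choice(n, size=n - h, replace=False)
    w[keep] = n / (n - h)
    return w

rng = np.random.default_rng(0)
n = 10_000
w_naive = naive_weights(n, rng)
w_jack = delete_h_weights(n, h=n // 4, rng=rng)

# A2 requires non-negative weights adding up to n.
# A5: the empirical second moment of (W - 1) estimates c^2.
c2_naive = float(np.mean((w_naive - 1.0) ** 2))   # close to 1 for large n
c2_jack = float(np.mean((w_jack - 1.0) ** 2))     # equals h / (n - h)
```

With $h/n = 1/4$ the jackknife value is $1/3$, that is, $\alpha/(1-\alpha)$ with $\alpha = 1/4$.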
The almost sure consistency of the exchangeable weighted bootstrap and, consequently, the naive bootstrap, is derived as a simple corollary of the well-known results collected in Section 4 by applying the Continuous Mapping Theorem to $D^{-1}$ (see Section 4.1 for technical details).
Corollary 1. If $X_1, X_2, \dots$ is a sequence of i.i.d. $H$-valued random elements so that $E\|X_1\|^2 < \infty$ and $\{W_{nj}\}_{j=1}^n$ is any array of random variables verifying conditions A1-5, then
$$\frac{1}{\sqrt{n}} \sum_{j=1}^n W_{nj} X_j^{\omega} - \frac{1}{\sqrt{n}} \sum_{j=1}^n X_j \to c Z_X$$
weakly in $H$ $P$-a.s. In particular,
• Naive bootstrap: $\{X_i^*\}_{i=1}^n$ chosen at random with replacement from $\{X_i\}_{i=1}^n$,
$$\frac{1}{\sqrt{n}} \sum_{j=1}^n X_j^* - \frac{1}{\sqrt{n}} \sum_{j=1}^n X_j \to Z_X \quad \text{weakly in } H \quad P\text{-a.s.}$$
• Double bootstrap: $\{X_i^{**}\}_{i=1}^n$ chosen at random with replacement from $\{X_i^*\}_{i=1}^n$,
$$\frac{1}{\sqrt{n}} \sum_{j=1}^n X_j^{**} - \frac{1}{\sqrt{n}} \sum_{j=1}^n X_j \to \sqrt{2}\, Z_X \quad \text{weakly in } H \quad P\text{-a.s.}$$
• Delete-$h_n$ jackknife: $h_n \in \{1, \dots, n\}$, $\{X_i^*\}_{i=1}^{n-h_n}$ chosen at random without replacement from $\{X_i\}_{i=1}^n$. If $h_n/n \to \alpha \in (0,1)$ then
$$\frac{1}{\sqrt{n}} \sum_{j=1}^{n-h_n} \frac{n}{n - h_n} X_j^* - \frac{1}{\sqrt{n}} \sum_{j=1}^n X_j \to \sqrt{\frac{\alpha}{1-\alpha}}\, Z_X \quad \text{weakly in } H \quad P\text{-a.s.}$$
• Arbitrary sample size: $m \in \mathbb{N}$, $\{X_i^*\}_{i=1}^m$ chosen at random with replacement from $\{X_i\}_{i=1}^n$. If $\min\{n, m\} \to \infty$ then
$$\sqrt{m} \left( \frac{1}{m} \sum_{j=1}^m X_j^* - \frac{1}{n} \sum_{j=1}^n X_j \right) \to Z_X \quad \text{weakly in } H \quad P\text{-a.s.}$$
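The naive and arbitrary-sample-size cases above can be sketched for discretized curves. This is a hedged illustration, not the authors' implementation: it assumes $H = L^2([0,1])$ with curves observed on a uniform grid (so that the squared norm is approximated by a grid average), and the function names `l2_sq_norm` and `bootstrap_mean_sq_norms`, as well as the simulated sample, are hypothetical.

```python
import numpy as np

def l2_sq_norm(curves):
    """Approximate squared L2([0,1]) norm of curves sampled on a uniform grid."""
    return np.mean(curves ** 2, axis=-1)

def bootstrap_mean_sq_norms(X, B, rng, m=None):
    """Replicates of || sqrt(m) * (mean of m resampled curves - sample mean) ||^2."""
    n = X.shape[0]
    m = n if m is None else m             # m = n gives the naive bootstrap
    xbar = X.mean(axis=0)
    out = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, size=m)  # draw m indices with replacement
        out[b] = m * l2_sq_norm(X[idx].mean(axis=0) - xbar)
    return out

rng = np.random.default_rng(1)
n, grid = 200, np.linspace(0.0, 1.0, 101)
# Toy functional sample: random combinations of two smooth curves.
scores = rng.normal(size=(n, 2))
X = scores[:, [0]] * np.sin(np.pi * grid) + 0.5 * scores[:, [1]] * np.cos(np.pi * grid)

naive = bootstrap_mean_sq_norms(X, B=2000, rng=rng)
m_out_of_n = bootstrap_mean_sq_norms(X, B=2000, rng=rng, m=80)
# Trace of the empirical covariance operator, i.e. the conditional mean of both replicates.
trace_cov = float(l2_sq_norm(X - X.mean(axis=0)).mean())
```

The averages of `naive` and `m_out_of_n` should both be close to `trace_cov`, in line with a common Gaussian limit $Z_X$ whose expected squared norm is the trace of the covariance operator.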
Many other examples of different exchangeable weights verifying A1-5 can be found in (Praestgaard and Wellner, 1993), including the Bayesian bootstrap or the Urn-model bootstrap, among others. The consistency of all of them is established by the previous corollary.
A particular bootstrap technique that cannot be derived as an exchangeable weighted bootstrap is the wild bootstrap. According to Gine and Zinn (1990) and Ledoux and Talagrand (1988), a result due to Ledoux, Talagrand and Zinn establishes the equivalence between the CLT for random elements in a separable Banach space and the almost sure weak consistency of the wild bootstrap. Thus, the result holds directly for separable Hilbert spaces, and it is stated below for the sake of completeness.
Let $\{\xi_i^*\}_{i=1}^n$ be a sequence of i.i.d. random variables with zero mean, variance 1 and $\int_0^\infty (P(|\xi_1^*| > t))^{1/2}\,dt < \infty$. This last condition is proposed in (Gine and Zinn, 1990) as a generalization of the original result, which just considers Gaussian
random variables (see also Theorem 2.1 of (Wellner, 1992) for the general result for empirical processes indexed by a class of functions). Then the consistency of the wild bootstrap for random elements in separable Hilbert spaces is guaranteed in this general situation.
Theorem 2. If $X_1, X_2, \dots$ is a sequence of i.i.d. $H$-valued random elements so that $E\|X_1\|^2 < \infty$, then
$$\frac{1}{\sqrt{n}} \sum_{i=1}^n \xi_i^* X_i(\omega) \to Z_X \quad \text{weakly in } H \quad P\text{-a.s.}$$
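A wild bootstrap replicate can be sketched in the same discretized setting. The code below is an illustration under assumptions that are not part of the statement above: curves are observed on a uniform grid of $[0,1]$, the multipliers are Rademacher or standard normal (both have zero mean, unit variance and satisfy the integrability condition), and the curves are centred at the sample mean before multiplying so that the replicates target the covariance of $X$. The name `wild_bootstrap_sq_norms` and the simulated data are hypothetical.

```python
import numpy as np

def wild_bootstrap_sq_norms(X, B, rng, gaussian=False):
    """Replicates of || n^{-1/2} * sum_i xi_i (X_i - sample mean) ||^2."""
    n = X.shape[0]
    centred = X - X.mean(axis=0)        # assumption: centre at the sample mean
    out = np.empty(B)
    for b in range(B):
        xi = rng.standard_normal(n) if gaussian else rng.choice([-1.0, 1.0], size=n)
        s = xi @ centred / np.sqrt(n)   # one H-valued wild bootstrap replicate
        out[b] = np.mean(s ** 2)        # squared L2([0,1]) norm on a uniform grid
    return out

rng = np.random.default_rng(3)
n, grid = 200, np.linspace(0.0, 1.0, 101)
scores = rng.normal(size=(n, 2))
X = scores[:, [0]] * np.sin(np.pi * grid) + 0.5 * scores[:, [1]] * np.cos(np.pi * grid)

wild_rademacher = wild_bootstrap_sq_norms(X, B=2000, rng=rng)
wild_gaussian = wild_bootstrap_sq_norms(X, B=2000, rng=rng, gaussian=True)
trace_cov = float(np.mean((X - X.mean(axis=0)) ** 2, axis=1).mean())
```

Both sets of replicates should average close to `trace_cov`, the trace of the empirical covariance operator.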
3. Bootstrap linear independence tests
The aim is to illustrate, through a simple situation, the applicability of the presented technique by proposing different bootstrap procedures in order to test the linear independence between an $H$-valued response variable and a subset of scalar covariates. The linear model to be considered is described as follows:
$$Y = \beta_1 X_1 + \dots + \beta_k X_k + C + \varepsilon, \qquad (1)$$
where $Y$ and $\varepsilon$ are $H$-valued random elements, $\beta_1, \dots, \beta_k, C \in H$ are the coefficients of the model, and $X_1, \dots, X_k$ are uncorrelated real random variables with $0 < Var(X_j) < \infty$ for all $j \in \{1, \dots, k\}$. The error term is assumed to be centered and such that $E(\varepsilon \mid (X_1, \dots, X_k)) = E(\varepsilon)$. In this setting the regression function reduces to
$$m(x_1, \dots, x_k) = C + \sum_{j=1}^k \beta_j x_j.$$
When $H$ is the space of $L^2$ functions defined on a closed interval, this regression model with functional response and scalar covariates was introduced by Faraway (1997) as an alternative to the classical longitudinal methods (see also, Ramsay and Silverman, 2005), although the conditions considered here are different for illustrative purposes. Namely, no fixed design for the regressors is considered, but an uncorrelatedness condition on the regressors is imposed to simplify the developments and highlight the applicability of the results. This restrictive condition could be avoided by simply considering the non-collinearity of the regressors. In this case, a testing procedure involving linear combinations of the regressors could be derived from the techniques in this section, which could be applied to the principal components as regressors.
Let $K = \{j_1, \dots, j_l\}$ be a non-empty subset of $\{1, \dots, k\}$. The aim is to prove the asymptotic correctness and consistency of different bootstrap approaches that will be proposed in order to test whether the set of explanatory variables $\{X_j\}_{j \in K}$ is linearly independent of $Y$, namely
$$H_0 : \beta_j = 0 \text{ for all } j \in K \qquad \text{versus} \qquad H_1 : \beta_j \ne 0 \text{ for some } j \in K. \qquad (2)$$
Under the considered model, given $j \in \{1, \dots, k\}$, the covariance between $Y$ and $X_j$ reduces to
$$E\big([X_j - E(X_j)][Y - E(Y)]\big) = Var(X_j)\,\beta_j,$$
and consequently $H_0$ is satisfied if and only if
$$\sum_{j \in K} \big\| E\big([X_j - E(X_j)][Y - E(Y)]\big) \big\|^2 = 0.$$
Let $\{(Y_i, X_{1i}, \dots, X_{ki})\}_{i=1}^n$ be a sequence of i.i.d. $H \oplus \mathbb{R}^k$-valued random elements with the same distribution as $(Y, X_1, \dots, X_k)$, where $\oplus$ denotes the direct sum of Hilbert spaces. In order to test $H_0$ versus $H_1$, the following statistic will be used,
$$T_n = \sum_{j \in K} \left\| \frac{1}{\sqrt{n}} \sum_{i=1}^n (X_{ji} - \bar{X}_j)(Y_i - \bar{Y}) \right\|^2.$$
$H_0$ will be rejected at a given significance level $\alpha$ whenever $T_n > c_\alpha$ for a certain $c_\alpha$ appropriately chosen.
In order to analyze the asymptotic behavior of $T_n$, let $H^l$ be the separable Hilbert space $H \oplus \overset{(l)}{\cdots} \oplus H$, and consider the sequence of i.i.d. $H^l$-valued random elements given by
$$Q_i = \big( [X_{j_1,i} - E(X_{j_1})][Y_i - E(Y)], \dots, [X_{j_l,i} - E(X_{j_l})][Y_i - E(Y)] \big),$$
for all $i \in \{1, \dots, n\}$. Some algebra shows that $0 < Var(Q_i) < \infty$ irrespective of the values of $(\beta_1, \dots, \beta_k)$ whenever $E(\|X_j - E(X_j)\|)^4 < \infty$ for all $j \in K$. Under these conditions the CLT for i.i.d. Hilbert-valued random elements (see, e.g., Laha and Rohatgi, 1979) can be applied to the sequence $\{Q_i\}_{i=1}^n$, guaranteeing that
$$\mathbf{T}_n = \frac{1}{\sqrt{n}} \sum_{i=1}^n (Q_i - E(Q_1)) \to Z_Q$$
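weakly in $H^l$ as $n$ tends to infinity. Consider the sequence of $H^l$-valued random elements given by
$$\widehat{Q}_i = \big( [X_{j_1,i} - \bar{X}_{j_1}][Y_i - \bar{Y}], \dots, [X_{j_l,i} - \bar{X}_{j_l}][Y_i - \bar{Y}] \big),$$
for all $i \in \{1, \dots, n\}$. Then
$$T_n = \left\| \frac{1}{\sqrt{n}} \sum_{i=1}^n \widehat{Q}_i \right\|^2. \qquad (3)$$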
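Expression (3) translates directly into code once the response curves are discretized. The following is a minimal sketch, assuming $H = L^2([0,1])$ observed on a uniform grid and a simulated data set in which only the first regressor has a non-null coefficient; the names `q_hat`, `sq_norm_Hl`, `statistic_tn` and the data-generating choices are hypothetical.

```python
import numpy as np

def q_hat(Y, X, K):
    """Empirical elements (X_ji - mean X_j) * (Y_i - mean Y), shape (n, l, grid)."""
    Yc = Y - Y.mean(axis=0)
    Xc = X[:, K] - X[:, K].mean(axis=0)
    return Xc[:, :, None] * Yc[:, None, :]

def sq_norm_Hl(s):
    """Squared norm in H^l for s of shape (l, grid); grid average approximates the integral."""
    return float(np.sum(np.mean(s ** 2, axis=-1)))

def statistic_tn(Y, X, K):
    """T_n as in (3): squared norm of the scaled sum of the empirical elements."""
    n = Y.shape[0]
    return sq_norm_Hl(q_hat(Y, X, K).sum(axis=0) / np.sqrt(n))

rng = np.random.default_rng(2)
n, grid = 300, np.linspace(0.0, 1.0, 50)
X = rng.normal(size=(n, 2))                       # two uncorrelated scalar regressors
beta1 = np.sin(np.pi * grid)                      # non-null coefficient; beta2 = 0
eps = 0.5 * rng.normal(size=(n, 1)) * np.cos(np.pi * grid)
Y = X[:, [0]] * beta1 + eps

t_active = statistic_tn(Y, X, K=[0])   # H0 is false for K = {1}
t_null = statistic_tn(Y, X, K=[1])     # H0 is true for K = {2}
```

In this toy setting `t_active` grows with $n$ while `t_null` stays bounded in probability, which is the behavior that the test exploits.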
The following result can be stated.
Theorem 3. Given the model (1), if $E(\|X_j - E(X_j)\|)^4 < \infty$ for all $j \in K$ and $\alpha \in (0,1)$, the test (2) in which $H_0$ is rejected whenever $T_n > c_\alpha$, where $c_\alpha$ is the $(1-\alpha)$-quantile of the distribution of $\|Z_Q\|^2$, is asymptotically correct and consistent at a significance level $\alpha$.
In addition, consider the sequence of Pitman local alternatives $\beta_j^{(n)} = \frac{\delta_n}{\sqrt{n}} \beta_j$ with $\|\beta_j\| > 0$ for some $j \in K$. If $\delta_n \to \infty$ and $\delta_n / \sqrt{n} \to 0$ as $n$ tends to infinity, then $\lim_{n \to \infty} P(T_n > c) = 1$ for any $c > 0$.
Proof. Recall (3) and note that
$$\frac{1}{\sqrt{n}} \sum_{i=1}^n \widehat{Q}_i = \mathbf{T}_n + \sqrt{n}\,E(Q_1) + R_n,$$
$R_n$ being the $H^l$-valued random element given by
$$R_n = -\sqrt{n}\, \big( [\bar{X}_{j_1} - E(X_{j_1})][\bar{Y} - E(Y)], \dots, [\bar{X}_{j_l} - E(X_{j_l})][\bar{Y} - E(Y)] \big).$$
The CLT applied to $\sqrt{n}\,(\bar{Y} - E(Y))$, together with the SLLN and the Slutsky Theorem, guarantees that $R_n$ converges in probability to 0.
On the other hand, $\sqrt{n}\,E(Q_1)$ vanishes under the null hypothesis. Thus, the Continuous Mapping Theorem ensures that $T_n$ converges weakly to the same limit as $\|\mathbf{T}_n\|^2$, that is, $\|Z_Q\|^2$. This guarantees the asymptotic correctness at a significance level $\alpha$ of the proposed test.
Furthermore, note that if $H_0$ is not verified then $\|\sqrt{n}\,E(Q_1)\|^2$ tends to infinity with $n$. Consequently,
$$\lim_{n \to \infty} P(T_n > c) = 1$$
for any $c > 0$ and thus the proposed test is asymptotically consistent.
Finally, for the sequence of local alternatives it is verified that
$$\|\sqrt{n}\,E(Q_1)\|^2 = \delta_n^2 \sum_{j \in K} \|Var(X_j)\,\beta_j\|^2.$$
As there exists $j \in K$ such that $\|\beta_j\| > 0$ and $\delta_n \to \infty$ as $n$ tends to infinity, the result follows in a similar way.
In practice, the distribution of $\|Z_Q\|^2$ is unknown, so an approximation is needed. The aim is to propose an exchangeable weighted bootstrap procedure and a wild bootstrap procedure, and to prove their consistency by using the results of the previous section.
The key point is to mimic the distribution of $\mathbf{T}_n$, as its asymptotic distribution is $Z_Q$ irrespective of whether the null hypothesis is fulfilled or not. For this purpose, consider any array of random variables $\{W_{ni}\}_{i=1}^n$ verifying conditions A1-5, and the following bootstrap statistic,
$$\mathbf{T}_n^* = \frac{1}{c\sqrt{n}} \sum_{i=1}^n W_{ni} Q_i - \frac{1}{c\sqrt{n}} \sum_{i=1}^n Q_i.$$
According to Corollary 1, $\mathbf{T}_n^*$ converges weakly to $Z_Q$ $P$-a.s. as $n$ tends to infinity. Unfortunately $\mathbf{T}_n^*$ cannot be used in practice as it depends on unknown expectations. Instead, the following approximation will be used:
$$\widehat{\mathbf{T}}_n^* = \frac{1}{c\sqrt{n}} \sum_{i=1}^n W_{ni} \widehat{Q}_i - \frac{1}{c\sqrt{n}} \sum_{i=1}^n \widehat{Q}_i.$$
Finally, by defining $T_n^* = \|\widehat{\mathbf{T}}_n^*\|^2$, the next result can be stated.
Theorem 4. Given the model (1), if $E(\|X_j - E(X_j)\|)^4 < \infty$ for all $j \in K$, the array of weights verifies conditions A1-5 and $\alpha \in (0,1)$, the test (2) in which $H_0$ is rejected whenever $T_n > c_\alpha^n$, where $c_\alpha^n$ is the $(1-\alpha)$-quantile of the distribution of $T_n^*$, is asymptotically correct and consistent $P$-a.s. at a significance level $\alpha$.
Proof. It is enough to check that $\widehat{\mathbf{T}}_n^* \to Z_Q$ weakly $P$-a.s. as $n$ tends to infinity, and to apply the Continuous Mapping Theorem to $\|\cdot\|^2$.
First of all, consider the sequence of i.i.d. $\mathbb{R}^l$-valued random vectors given by $\widetilde{X}_i = (X_{j_1,i}, \dots, X_{j_l,i})$ for all $i \in \{1, \dots, n\}$, with sample mean $\bar{X}$. As in the proof of Theorem 3, note that
$$\widehat{\mathbf{T}}_n^* = \mathbf{T}_n^* + (E(Y) - \bar{Y})\,R_n^X + (E(\widetilde{X}) - \bar{X})\,R_n^Y,$$
$R_n^X$ being the $\mathbb{R}^l$-valued random vector given by
$$R_n^X = \frac{1}{c\sqrt{n}} \sum_{i=1}^n W_{ni} \widetilde{X}_i - \frac{1}{c\sqrt{n}} \sum_{i=1}^n \widetilde{X}_i,$$
and $R_n^Y$ the $H$-valued random element given by
$$R_n^Y = \frac{1}{c\sqrt{n}} \sum_{i=1}^n W_{ni} Y_i - \frac{1}{c\sqrt{n}} \sum_{i=1}^n Y_i.$$
By applying Corollary 1 to the i.i.d. sequences $\{\widetilde{X}_i\}_{i=1}^n$ and $\{Y_i\}_{i=1}^n$, it follows that $R_n^X$ and $R_n^Y$ converge weakly $P$-a.s. to $Z_{\widetilde{X}_1}$ and $Z_Y$ in their corresponding spaces, respectively.
On the other hand, the SLLN guarantees that both $E(Y) - \bar{Y}$ and $E(\widetilde{X}) - \bar{X}$ converge $P$-a.s. to 0 in their corresponding spaces and, consequently, by virtue of the Slutsky theorem it can be ensured that the random elements $(E(Y) - \bar{Y})\,R_n^X$ and $(E(\widetilde{X}) - \bar{X})\,R_n^Y$ converge in probability to 0, $P$-a.s. Thus $\widehat{\mathbf{T}}_n^*$ and $\mathbf{T}_n^*$ have the same asymptotic behavior and, consequently, $\widehat{\mathbf{T}}_n^*$ converges weakly to $Z_Q$ $P$-a.s., which concludes the proof.

Finally, the asymptotic distribution of $\mathbf{T}_n$ can also be mimicked by means of the wild bootstrap. To this aim let $\{\xi_i\}_{i=1}^n$ be a sequence of i.i.d. random variables with zero mean, variance 1 and $\int_0^\infty (P(|\xi_1| > t))^{1/2}\,dt < \infty$, and consider the following bootstrap statistic,
$$\mathbf{S}_n^* = \frac{1}{\sqrt{n}} \sum_{i=1}^n \xi_i Q_i.$$
According to Theorem 2, $\mathbf{S}_n^*$ converges weakly to $Z_Q$ $P$-a.s. as $n$ tends to infinity. As for the weighted bootstrap approach, $\mathbf{S}_n^*$ cannot be used directly, since it depends on unknown expectations, so the following approximation will be used,
$$\widehat{\mathbf{S}}_n^* = \frac{1}{\sqrt{n}} \sum_{i=1}^n \xi_i \widehat{Q}_i.$$
Thus, the wild bootstrap statistic to be used to approximate the distribution of $T_n$ under the null hypothesis will be $S_n^* = \|\widehat{\mathbf{S}}_n^*\|^2$.
Theorem 5. Given the model (1), if $E(\|X_j - E(X_j)\|)^4 < \infty$ for all $j \in K$, $\{\xi_i\}_{i=1}^n$ is a sequence of i.i.d. random variables with zero mean, variance 1 and $\int_0^\infty (P(|\xi_1| > t))^{1/2}\,dt < \infty$, and $\alpha \in (0,1)$, the test (2) in which $H_0$ is rejected whenever $T_n > c_\alpha^n$, where $c_\alpha^n$ is the $(1-\alpha)$-quantile of the distribution of $S_n^*$, is asymptotically correct and consistent $P$-a.s. at a significance level $\alpha$.
Proof. It is enough to check that $\widehat{\mathbf{S}}_n^* \to Z_Q$ weakly $P$-a.s. as $n$ tends to infinity.
As in the proof of Theorem 4, consider the sequence of i.i.d. $\mathbb{R}^l$-valued random vectors given by $\widetilde{X}_i = (X_{j_1,i}, \dots, X_{j_l,i})$ for all $i \in \{1, \dots, n\}$. In this case
$$\widehat{\mathbf{S}}_n^* = \mathbf{S}_n^* + (E(Y) - \bar{Y})\,R_n^X + (E(\widetilde{X}) - \bar{X})\,R_n^Y + (E(Y) - \bar{Y})(E(\widetilde{X}) - \bar{X})\,R_n^\xi,$$
where
$$R_n^X = \frac{1}{\sqrt{n}} \sum_{i=1}^n \xi_i \widetilde{X}_i, \qquad R_n^Y = \frac{1}{\sqrt{n}} \sum_{i=1}^n \xi_i Y_i \qquad \text{and} \qquad R_n^\xi = \frac{1}{\sqrt{n}} \sum_{i=1}^n \xi_i,$$
all converging weakly to $Z_{\widetilde{X}_1}$, $Z_Y$ and $Z_{\xi_1}$, respectively, by virtue of Theorem 2. Thus, by reasoning as in the proof of Theorem 4 it is deduced that
$$(E(Y) - \bar{Y})\,R_n^X + (E(\widetilde{X}) - \bar{X})\,R_n^Y + (E(Y) - \bar{Y})(E(\widetilde{X}) - \bar{X})\,R_n^\xi$$
converges in probability to 0 $P$-a.s. and, thus, $\widehat{\mathbf{S}}_n^*$ converges weakly to $Z_Q$ $P$-a.s., which concludes the proof.
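The two bootstrap tests can be sketched together as a Monte Carlo procedure. This is an illustrative sketch under several assumptions: $H = L^2([0,1])$ on a uniform grid, multinomial weights (for which $c = 1$), standard normal multipliers, and a decision expressed through a bootstrap p-value (the proportion of replicates at least as large as the observed statistic) rather than through the quantile $c_\alpha^n$. The names `q_hat`, `sq_norm_Hl`, `bootstrap_test` and the simulated data are hypothetical.

```python
import numpy as np

def q_hat(Y, X, K):
    """Empirical elements (X_ji - mean X_j) * (Y_i - mean Y), shape (n, l, grid)."""
    Yc = Y - Y.mean(axis=0)
    Xc = X[:, K] - X[:, K].mean(axis=0)
    return Xc[:, :, None] * Yc[:, None, :]

def sq_norm_Hl(s):
    """Squared norm in H^l for s of shape (l, grid) on a uniform grid of [0, 1]."""
    return float(np.sum(np.mean(s ** 2, axis=-1)))

def bootstrap_test(Y, X, K, B, rng):
    """Return T_n and Monte Carlo p-values of the weighted and wild bootstrap tests."""
    n = Y.shape[0]
    Q = q_hat(Y, X, K)
    t_obs = sq_norm_Hl(Q.sum(axis=0) / np.sqrt(n))
    t_weighted, t_wild = np.empty(B), np.empty(B)
    for b in range(B):
        w = rng.multinomial(n, np.full(n, 1.0 / n))   # naive weights, c = 1
        t_weighted[b] = sq_norm_Hl(np.tensordot(w - 1.0, Q, axes=1) / np.sqrt(n))
        xi = rng.standard_normal(n)                   # Gaussian multipliers
        t_wild[b] = sq_norm_Hl(np.tensordot(xi, Q, axes=1) / np.sqrt(n))
    return t_obs, float(np.mean(t_weighted >= t_obs)), float(np.mean(t_wild >= t_obs))

rng = np.random.default_rng(4)
n, grid = 300, np.linspace(0.0, 1.0, 50)
X = rng.normal(size=(n, 2))
eps = 0.5 * rng.normal(size=(n, 1)) * np.cos(np.pi * grid)
Y = X[:, [0]] * np.sin(np.pi * grid) + eps            # beta1 non-null, beta2 = 0

t_alt, p_weighted_alt, p_wild_alt = bootstrap_test(Y, X, K=[0], B=300, rng=rng)
t_null, p_weighted_null, p_wild_null = bootstrap_test(Y, X, K=[1], B=300, rng=rng)
```

Rejecting when the p-value falls below $\alpha$ corresponds, up to Monte Carlo error, to comparing $T_n$ with the $(1-\alpha)$-quantile of the bootstrap replicates. In this simulated example both p-values for `K=[0]` should be essentially zero, whereas those for `K=[1]` carry no evidence against $H_0$ by construction of the data.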
4. Bootstrap for empirical processes
The consistency of the weighted bootstrap for i.i.d. random elements in a separable Hilbert space stated in Corollary 1 can be proved by using the linkage established in Theorem 1 between the spaces $H$ and $l^\infty(\mathcal{F})$. To this aim, some basic notation and the main well-known results concerning bootstrap for empirical processes are needed.
Let $(\Omega, \mathcal{A}, P)$ be a probability space, and let $X_1, \dots, X_n$ be a sample of independent random elements associated with a probability measure $P$ and taking on values in an arbitrary space $\mathcal{X}$. The empirical measure is defined as
$$P_n(\omega) = \frac{1}{n} \sum_{i=1}^n \delta_{X_i(\omega)},$$
for all $\omega \in \Omega$, where $\delta_x$ is the Dirac delta function at $x \in \mathcal{X}$. Let $\mathcal{F}$ be any class of measurable functions $f : \mathcal{X} \to \mathbb{R}$. Then, the empirical process associated with $\mathcal{F}$ is defined as a functional (indexed by $\mathcal{F}$) denoted by $\{P_n f, f \in \mathcal{F}\}$, where
$$P_n f = \int f\,dP_n = \frac{1}{n} \sum_{i=1}^n f(X_i).$$
Analogously, $Pf$ will denote $\int f\,dP$ (see, e.g. Gine and Zinn, 1984). For instance, if $\mathcal{X} = \mathbb{R}$ and $\mathcal{F} = \{I_{(-\infty,t]}, t \in \mathbb{R}\}$, then the empirical distribution function
$$F_n(t) = \frac{1}{n} \sum_{i=1}^n I_{(-\infty,t]}(X_i)$$
is the empirical process $\{P_n f, f \in \mathcal{F}\}$.
Let $G_P = \{G_P(f), f \in \mathcal{F}\}$ be a centered Gaussian process indexed by $\mathcal{F}$ with covariance
$$E\,G_P(f_1) G_P(f_2) = \int f_1 f_2\,dP - \int f_1\,dP \int f_2\,dP,$$
and let $\gamma_P$ be its associated measure. Recall that $l^\infty(\mathcal{F})$ denotes the space $\{g : \mathcal{F} \to \mathbb{R},\ \|g\|_{\mathcal{F}} < \infty\}$ with $\|g\|_{\mathcal{F}} = \sup_{f \in \mathcal{F}} |g(f)|$, which is a Banach space when the sum and the product are defined pointwise. Then, it is said that $\mathcal{F} \in CLT(P)$ if, and only if, $\{n^{1/2}(P_n - P)(f), f \in \mathcal{F}\}$ converges weakly in $l^\infty(\mathcal{F})$ to a Radon centered Gaussian probability measure, $\gamma_P$, on $l^\infty(\mathcal{F})$ (see, e.g. Gine and Zinn, 1990).
The result of consistency of the naive bootstrap in separable Banach spaces is derived in (Gine and Zinn, 1990) as a corollary of the same result for general empirical measures. To this aim, let $F(x) = \sup_{f \in \mathcal{F}} |f(x)|$ for all $x \in \mathcal{X}$, and assume that $F(x) < \infty$ for all $x \in \mathcal{X}$. The class of functions $\mathcal{F}$ must fulfill a number of measurability conditions that guarantee, among other things, that $P_n$ can be randomized. Specifically, it is required that $\mathcal{F} \in M(P)$, that is, $\mathcal{F}$ is nearly linearly deviation measurable for $P$, and such that the classes of functions $\{f^2 : f \in \mathcal{F}\}$ and $\{(f - g)^2 : f, g \in \mathcal{F}\}$ are nearly linearly supremum measurable for $P$. In particular, $\mathcal{F} \in M(P)$ whenever $\mathcal{F}$ is image admissible Suslin (see Gine and Zinn, 1990 and Dudley, 1982 for the details).
Let $\{X_{nj}^{\omega}\}_{j=1,\dots,n}$ be a sequence of independent random elements distributed according to $P_n(\omega)$, and let $\hat{P}_n(\omega)$ be the empirical measure based on $\{X_{nj}^{\omega}\}_{j=1,\dots,n}$. Then, the following are equivalent (Gine and Zinn, 1990):
a) $\int F^2\,dP < \infty$ and $\mathcal{F} \in CLT(P)$.
b) There exists a centered Gaussian process $G$ on $\mathcal{F}$ whose law is Radon in $l^\infty(\mathcal{F})$ such that $n^{1/2}(\hat{P}_n(\omega) - P_n(\omega)) \to G$ weakly in $l^\infty(\mathcal{F})$ $P$-a.s.
If either a) or b) holds, then $G = G_P$. This result states the almost sure consistency of the naive bootstrap for general empirical measures. According to Gine and Zinn (1990), by mimicking the general proof, the almost sure consistency of the naive bootstrap for i.i.d. random elements in a separable Banach space (and consequently also for separable Hilbert spaces) is deduced.
In detail, let $B$ be a separable Banach space. If $\{X_i\}_{i \in \mathbb{N}}$ are i.i.d. $B$-valued random elements, then it is proven in (Gine and Zinn, 1990) that
$$E\|X_1\|^2 < \infty \text{ and } X_1 \in CLT \iff \sum_{j=1}^n (X_{nj}^{\omega} - \bar{X}_n)/n^{1/2} \to G_X \quad \text{weakly } P\text{-a.s.}$$
In a similar way, regarding the practical weighted bootstrap, Praestgaard and Wellner (1993) established that by considering an array of random variables $\{W_{nj}\}_{j=1}^n$ verifying conditions A1-5 (see Section 2), and if $\mathcal{F} \in M(P)$ is a class of functions verifying $\int F^2\,dP < \infty$ such that $\mathcal{F} \in CLT(P)$, then
$$n^{1/2} \left( \frac{1}{n} \sum_{j=1}^n W_{nj}\, \delta_{X_j^{\omega}} - P_n \right) \to c\,G_P \quad \text{weakly in } l^\infty(\mathcal{F}) \quad P\text{-a.s.}$$
As a corollary, Praestgaard and Wellner (1993) establish the almost sure consistency of the naive bootstrap with different resampling sizes, that is, given a sequence $\{X_{mj}^{\omega}\}_{j=1,\dots,m}$ of independent random elements distributed according to $P_n(\omega)$ with associated empirical measure $\hat{P}_{m,n}(\omega)$, if $\mathcal{F} \in M(P)$, $\int F^2\,dP < \infty$ and $\mathcal{F} \in CLT(P)$, then
$$m^{1/2}(\hat{P}_{m,n}(\omega) - P_n(\omega)) \to G_P \quad \text{as } \min\{m, n\} \to \infty$$
weakly in $l^\infty(\mathcal{F})$ $P$-a.s.
If the weights $\{W_{nj}\}_{j=1}^n$ verify A1 and A2, together with some simple conditions involving second and fourth order moments (see Praestgaard and Wellner, 1993, Lemma 3.1), then A3-5 are also satisfied. This is the case of the naive bootstrap or the double bootstrap, which consists in drawing a sample of size $n$ from the bootstrap sample with replacement. Thus, the almost sure consistency of the naive bootstrap for empirical measures is also proven by Praestgaard and Wellner (1993) through the weighted bootstrap approach, although in this case the technique provides only sufficient conditions. Additional examples of exchangeable weights verifying A1-5 can be found in (Praestgaard and Wellner, 1993), including the delete-$h_n$ jackknife, the Bayesian bootstrap, and the Urn-model bootstrap, among others. The consistency of all of them when dealing with empirical processes indexed by an appropriate class of functions is established by the previous result.

4.1. The unit ball of the dual of a separable Hilbert space
In Section 2, when dealing with $H$-valued random elements, the unit ball of the dual space was chosen as an index class of functions $\mathcal{F}$. Indeed, given any $h \in H$ with $\|h\| \le 1$, let $f_h : H \to \mathbb{R}$ be such that $f_h(x) = \langle x, h \rangle$ for all $x \in H$; then, by virtue of the Riesz representation theorem, $\mathcal{F}$ can be alternatively rewritten in the following way:
$$\mathcal{F} = \{ f_h \mid h \in H,\ \|h\| \le 1 \}.$$
Thus, for this class it is verified that
$$F(x) = \sup_{f \in \mathcal{F}} |f(x)| = \sup_{h \in H, \|h\| \le 1} |\langle x, h \rangle| = \|x\| < \infty$$
for all $x \in H$. The following result ensures that $\mathcal{F}$ is a suitable class of functions to guarantee the almost sure consistency of different exchangeable weighted bootstraps and, as a result, the naive bootstrap.
Theorem 6. If $X_1, X_2, \dots$ is a sequence of i.i.d. $H$-valued random elements so that $E\|X_1\|^2 < \infty$, then the class of functions $\mathcal{F}$ verifies the following properties:
(a) $\int F^2\,dP < \infty$.
(b) $\mathcal{F} \in M(P)$.
(c) $\mathcal{F} \in CLT(P)$.
Proof. Concerning (a), $\int F^2\,dP = E(\|X_1\|^2)$, which is finite by hypothesis.
With respect to (b), by considering the weak topology in $H$ it follows that the closed unit ball of the dual space of $H$, $\mathcal{F}$, is $w^*$-compact (Alaoglu's Theorem) and $w^*$-metrizable and thus separable (see, e.g., Aliprantis and Border (2006), Theorems 6.21, 6.30 and 6.31). Consequently, $\mathcal{F}$ is a complete and separable metric space (w.r.t. the weak topology), and thus a Polish space. Consider the mapping
$$g : (H, \|\cdot\|) \times (\mathcal{F}, w^*) \to \mathbb{R},$$
given by $g(h, f) = f(h)$ for all $h \in H$ and all $f \in \mathcal{F}$. The mapping is jointly measurable (see, e.g., Aliprantis and Border (2006), Corollary 6.40) and, consequently, $\mathcal{F}$ is image admissible Suslin. Thus, $\mathcal{F} \in M(P)$.
Finally, regarding (c), the CLT for i.i.d. separable Hilbert-valued random elements (see, e.g., Laha and Rohatgi (1979)) ensures that
$$\frac{1}{\sqrt{n}} \sum_{i=1}^n (X_i - E(X_1)) \to Z_X$$
weakly in $H$, where $Z_X$ is a centered Gaussian $H$-valued random element so that $E(Z_X \langle h, Z_X \rangle) = E(X \langle h, X \rangle) - E(X) \langle h, E(X) \rangle$ for all $h \in H$. The Continuous Mapping Theorem applied to $D$ guarantees that
$$D\left( \frac{1}{\sqrt{n}} \sum_{i=1}^n (X_i - E(X_1)) \right) = n^{1/2}(P_n - P) \to D(Z_X)$$
weakly in $l^\infty(\mathcal{F})$. By definition of weak convergence in $l^\infty(\mathcal{F})$ (see Hoffmann-Jorgensen, 1974), the measure induced by $D(Z_X)$ is a Radon probability measure and consequently tight. According to Kosorok (2008) (Proposition 7.5; there are alternative equivalent conditions in this setting, and it does not matter which particular definition is used in (Gine and Zinn, 1990)), $D(Z_X)$ is Gaussian if and only if $(D(Z_X)(f_1), \dots, D(Z_X)(f_k))$ follows a multivariate normal distribution for any finite set $\{f_1, \dots, f_k\} \subset \mathcal{F}$. According to Riesz's representation theorem, associated with each $f_i \in \mathcal{F}$ there exists a unique $h_i \in H$ such that
$$(D(Z_X)(f_1), \dots, D(Z_X)(f_k)) = (f_1(Z_X), \dots, f_k(Z_X)) = (\langle h_1, Z_X \rangle, \dots, \langle h_k, Z_X \rangle).$$
Now, given $\{\alpha_1, \dots, \alpha_k\} \subset \mathbb{R}$ with $\alpha_i \ne 0$ for all $i \in \{1, \dots, k\}$,
$$\sum_{i=1}^k \alpha_i D(Z_X)(f_i) = \left\langle \sum_{i=1}^k \alpha_i h_i, Z_X \right\rangle,$$
which is Gaussian and centered. Thus, it can be ensured that $D(Z_X)$ is a centered Radon Gaussian process in $l^\infty(\mathcal{F})$, and consequently that $\mathcal{F} \in CLT(P)$, proving (c).
As a consequence of the previous Theorem, by considering the unit ball of the dual space as class of index functions and whenever the weights verify the conditions A1-5, according to Praestgaard and Wellner (1993) it is verified that
$$n^{1/2} \left( \frac{1}{n} \sum_{j=1}^n W_{nj}\, \delta_{X_j^{\omega}} - P_n \right) \to c\,G_P \quad \text{weakly in } l^\infty(\mathcal{F}) \quad P\text{-a.s.}$$
On the other hand, the function $D$ conveniently links the different functionals involved in the bootstrapped CLT in both spaces. In detail, for any array of random variables $\{W_{nj}\}_{j=1}^n$ the linearity of $D$ implies that
$$D\left( \frac{1}{\sqrt{n}} \sum_{j=1}^n W_{nj} X_j^{\omega} - \frac{1}{\sqrt{n}} \sum_{j=1}^n X_j \right) = n^{1/2} \left( \frac{1}{n} \sum_{j=1}^n W_{nj}\, \delta_{X_j^{\omega}} - P_n \right).$$
Thus, in view of Theorem 1, the consistency of the weighted bootstrap for i.i.d. random elements in a separable Hilbert space (Corollary 1) is straightforwardly deduced by applying the Continuous Mapping Theorem to $D^{-1}$.

4.2. Additional notes on L2 spaces
In the functional data analysis context it is quite usual to consider $H$ to be an appropriate $L^2$ space. In general, let $(T, \Sigma, \mu)$ be a measure space such that $\mu(T) < \infty$ and which is separable with the metric $\rho(A, B) = \mu(A \triangle B)$. Then
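\[
L^2(T, \mu) = \Big\{ f : T \to \mathbb{R} \text{ measurable} \;\Big|\; \int_T |f(t)|^2 \mu(dt) < \infty \Big\}
\]
with the sum and product by a scalar defined pointwise and the inner product
\[
\langle f, g \rangle = \int_T f(t) g(t)\, \mu(dt) \quad \text{for all } f, g \in L^2(T, \mu),
\]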
is a separable Hilbert space (see Bruckner et al., 2008). The usual choice in the functional data context, with $T = [0, 1]$, $\mathcal{A}$ the completion of the Borel $\sigma$-field and $\mu = \lambda$, leads to a separable Hilbert space. In this case, for any $x, h \in H$ with $\|h\| = 1$ it is verified that
\[
D(x)(f_h) = f_h(x) = \langle h, x \rangle = \int_{[0,1]} x(t) h(t)\, \lambda(dt),
\]
and both Corollary 1 and Theorem 2 hold. This guarantees the consistency of the wild bootstrap and of the exchangeably weighted bootstraps (the latter whenever the weights satisfy conditions A1-5). In these cases $Z_X$ is a centered Gaussian $H$-valued random element whose covariance operator reduces to
\[
C(y)(t) = \int_{[0,1]} y(s)\, \mathrm{Cov}(X(s), X(t))\, \lambda(ds),
\]
for all $t \in [0, 1]$ and all $y \in L^2([0, 1], \lambda)$.

5. Conclusions

A framework has been provided to use previous key results to guarantee the validity of various bootstrap methods in a practical way. The original results refer to the consistency of disparate bootstrap approaches for the sample mean in the abstract context of indexed empirical processes. By means of a link function it is possible to derive analogous results in the context of separable Hilbert spaces, which includes the case of the usual functional spaces. In this setting, the consistency of a number of bootstrap methods has been established, although the list is not exhaustive, and the same procedure could be applied to other cases. The importance of this result arises from the fact that many statistics used in practice can be written in terms of a mean in a suitable space. As an example, bootstrap methods have been considered to solve a hypothesis testing problem in the context of a regression model with Hilbertian response and scalar predictors. Specifically, a test statistic for independence is proposed, and it is shown how to reformulate it in terms of a sample mean in a direct sum Hilbert space. Thus, the previous results become applicable once the consistency of the asymptotic test is proved. This example highlights how simple it is to exploit the approach, and its notable potential applicability in the context of functional and other complex data analyses.
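As a numerical companion to the $L^2([0, 1])$ setting of Section 4.2, the weighted bootstrap of the functional sample mean can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: curves are assumed to be observed on a common grid, integrals are approximated by the trapezoidal rule, the data are simulated random walks, and the names `sq_l2_norm` and `weighted_bootstrap_mean` are hypothetical. Multinomial weights are used, which corresponds to the naive bootstrap.

```python
import numpy as np


def sq_l2_norm(curves, grid):
    """Squared L2([0,1]) norm of each curve (last axis), via the trapezoidal rule."""
    y = np.asarray(curves, dtype=float) ** 2
    return np.sum((y[..., 1:] + y[..., :-1]) * np.diff(grid), axis=-1) / 2.0


def weighted_bootstrap_mean(X, n_boot, rng):
    """Replicates of n^{-1/2} * (sum_j W_nj X_j - sum_j X_j), multinomial weights.

    X has shape (n, m): n discretized curves on m grid points (an assumption).
    Returns an array of shape (n_boot, m).
    """
    n = X.shape[0]
    # Each row of W is one weight vector (W_n1, ..., W_nn) summing to n.
    W = rng.multinomial(n, np.full(n, 1.0 / n), size=n_boot)
    return (W @ X - X.sum(axis=0)) / np.sqrt(n)


# Simulated data (assumption): random-walk approximations of Brownian motion.
rng = np.random.default_rng(0)
n, m = 200, 101
grid = np.linspace(0.0, 1.0, m)
steps = rng.normal(scale=np.sqrt(1.0 / (m - 1)), size=(n, m - 1))
X = np.concatenate([np.zeros((n, 1)), np.cumsum(steps, axis=1)], axis=1)

# Bootstrap approximation of the 0.95 quantile of ||sqrt(n)(mean - E X)||^2.
reps = weighted_bootstrap_mean(X, n_boot=2000, rng=rng)
crit = np.quantile(sq_l2_norm(reps, grid), 0.95)
print(crit)
```

Drawing `W` from a different weight distribution changes the bootstrap scheme; whether a given scheme is consistent depends on the conditions on the weights discussed above.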
Acknowledgments

The research in this paper has been partially supported by the Spanish Ministry of Science and Innovation Grant MTM2013-44212-P, the CRoNoS COST Action IC1408 and the grant of the Principado de Asturias, Spain, GRUPIN14-005. The authors would like to thank the anonymous associate editor and referees for their valuable suggestions, which have contributed to substantially improving the manuscript.

References

Aliprantis, C., Border, K., 2006. Infinite Dimensional Analysis. Springer, Berlin.
Araujo, A., Giné, E., 1980. The Central Limit Theorem for Real- and Banach-Valued Random Variables. Wiley, New York.
Biglieri, E., Yao, K., 1989. Some properties of singular value decomposition and their applications to digital signal processing. Signal Process. 18, 277–289.
Bruckner, A., Bruckner, J., Thomson, B., 2008. Real Analysis. ClassicalRealAnalysis.com.
Cardot, H., Cénac, P., Zitt, P.-A., 2013. Efficient and fast estimation of the geometric median in Hilbert spaces with an averaged stochastic gradient algorithm. Bernoulli 19, 18–43.
Cardot, H., Ferraty, F., Sarda, P., 1999. Functional linear model. Stat. Probab. Lett. 45, 11–22.
Cuevas, A., Febrero, M., Fraiman, R., 2006. On the use of the bootstrap for estimating functions with functional data. Comput. Stat. Data Anal. 51, 1063–1074.
Dudley, R., 1982. A Course on Empirical Processes. Springer, Berlin.
Faraway, J., 1997. Regression analysis for a functional response. Technometrics 39, 254–261.
Ferraty, F., Keilegom, I., Vieu, P., 2010. On the validity of the bootstrap in non-parametric functional regression. Scand. J. Stat. 37, 286–306.
Ferraty, F., Vieu, P., 2006. Nonparametric Functional Data Analysis. Springer, Berlin.
Gabrys, R., Kokoszka, P., 2007. Portmanteau test of independence for functional observations. J. Am. Stat. Assoc. 102, 1338–1348.
Gine, E., Zinn, J., 1984. Some limit theorems for empirical processes. Ann. Probab. 12, 929–989.
Gine, E., Zinn, J., 1990. Bootstrapping general empirical measures. Ann. Probab. 18, 851–869.
González-Rodríguez, G., Colubi, A., Gil, M., 2012. Fuzzy data treated as functional data: A one-way anova test approach. Comput. Stat. Data Anal. 56, 943–955.
Hoffmann-Jorgensen, J., 1974. Sums of independent Banach space valued random variables. Stud. Math. 52, 59–186.
Hormann, S., Kokoszka, P., 2010. Weakly dependent functional data. Ann. Stat. 38, 1845–1884.
Kosorok, M., 2003. Bootstraps of sums of independent but not identically distributed stochastic processes. J. Multivar. Anal. 84, 299–318.
Kosorok, M., 2008. Introduction to Empirical Processes and Semiparametric Inference. Springer, New York.
Laha, R., Rohatgi, V., 1979. Probability Theory. Wiley, New York.
Ledoux, M., Talagrand, M., 1988. Un critère sur les petites boules dans le théorème limite central. Probab. Theory Relat. Fields 77, 29–47.
Ledoux, M., Talagrand, M., 1991. Probability in Banach Spaces. Springer, Berlin.
Li, Y., Hsing, T., 2010. Deciding the dimension of effective dimension reduction space for functional and high-dimensional data. Ann. Stat. 38, 3028–3062.
Praestgaard, J., Wellner, J., 1993. Exchangeably weighted bootstraps of the general empirical process. Ann. Probab. 21, 2053–2086.
Ramsay, J., Silverman, B., 2005. Functional Data Analysis. Springer, New York.
Wang, S., Huang, M., Wu, X., Yao, W., 2016. Mixture of functional linear models and its application to CO2-GDP functional data. Comput. Stat. Data Anal. 97, 1–15.
Wellner, J., 1992. Bootstrap limit theorems: a partial survey. In: Saleh, A.K.M.E. (Ed.), Nonparametric Statistics and Related Topics. North-Holland, Amsterdam, pp. 313–329.
Yao, F., Mueller, H.-G., Wang, J.-L., 2005. Functional data analysis for sparse longitudinal data. J. Am. Stat. Assoc. 100, 577–590.