Frequency domain bootstrap for ratio statistics under long-range dependence


Journal of the Korean Statistical Society xxx (xxxx) xxx

Contents lists available at ScienceDirect

Journal of the Korean Statistical Society journal homepage: www.elsevier.com/locate/jkss

Frequency domain bootstrap for ratio statistics under long-range dependence ∗

Young Min Kim, Jongho Im

Kyungpook National University, Yonsei University, Republic of Korea

Article info

Article history: Received 30 May 2018; accepted 6 March 2019; available online xxxx.

AMS 2000 subject classifications: primary 62G09; secondary 62P20.

Keywords: Frequency domain bootstrap; Long-range dependence; Ratio statistics; Spectral density; Normalized periodogram ordinates

Abstract

A frequency domain bootstrap (FDB) is a common technique for applying Efron's independent and identically distributed resampling technique (Efron, 1979) to periodogram ordinates, especially normalized periodogram ordinates, by using spectral density estimates. The FDB method is applicable to several classes of statistics, such as estimators of the normalized spectral mean, the autocorrelation (but not autocovariance), the normalized spectral density function, and Whittle parameters. While this FDB method has been extensively studied for short-range dependent time processes, there is a dearth of research on its use with long-range dependent time processes. Therefore, we propose an FDB methodology for ratio statistics under long-range dependence, using semi- and nonparametric spectral density estimates as a normalizing factor. It is shown that the FDB approximation allows for valid distribution estimation for a broad class of stationary, long-range (or short-range) dependent linear processes, without any stringent assumptions on the distribution of the underlying process. The results of a large simulation study show that the FDB approximation using a semi- or nonparametric spectral density estimator is often robust across values of the long-memory parameter, which reflects the magnitude of dependence. We apply the proposed procedure to two data examples.

© 2019 The Korean Statistical Society. Published by Elsevier B.V. All rights reserved.

∗ Corresponding author. E-mail addresses: [email protected] (Y-M. Kim), [email protected] (J. Im). https://doi.org/10.1016/j.jkss.2019.03.001 1226-3192/© 2019 The Korean Statistical Society. Published by Elsevier B.V. All rights reserved.

Please cite this article as: Y-M. Kim and J. Im, Frequency domain bootstrap for ratio statistics under long-range dependence. Journal of the Korean Statistical Society (2019), https://doi.org/10.1016/j.jkss.2019.03.001.

1. Introduction

The bootstrap method known as Efron's independent and identically distributed (iid) bootstrap (individual data resampling) (Efron, 1979) is a powerful tool for approximating statistical properties such as variance, bias, or distribution. It has mainly been used for statistics whose properties cannot easily be obtained in analytic form without excessive calculation. In time series analysis, Singh (1981) reports that Efron's iid bootstrap could be invalid under short-range dependence (SRD). To address this issue in the time domain, Carlstein (1986) suggests the nonoverlapping block bootstrap, Künsch (1989) proposes the moving block bootstrap, and Bühlmann (1997) puts forward the autoregressive (AR) sieve bootstrap. Bootstrap methods developed under SRD have more recently been extended to long-range dependent (LRDt) time processes, even though Lahiri (1993) shows that the moving block bootstrap could be invalid in approximating the sample mean for a class of LRDt time series generated by transformations of Gaussian processes. Kim and Nordman (2011) investigated the bias, variance, and distribution of sample means using the moving and nonoverlapping block bootstraps for stationary linear LRDt processes; they considered the optimal block choice based on the large-sample mean squared error of a bootstrap variance estimator. The AR sieve bootstrap under long-range dependence (LRD) has been justified for causal linear LRDt time processes under certain conditions, as presented in Kapetanios and Psaradakis (2006); additionally, Bühlmann (1997), Kapetanios and Psaradakis (2006), and Poskitt (2008) each investigated optimal order selection for the sieve bootstrap under SRD or LRD.

In comparison to time domain bootstrap methods, a frequency domain bootstrap (FDB) method resamples periodogram ordinates that are studentized by spectral density estimates under weak dependence (Dahlhaus & Janas, 1996; Franke & Härdle, 1992). Franke and Härdle (1992) developed the FDB using kernel-based spectral density estimates and investigated their consistency. However, the method works only for a certain class of statistics, such as ratio statistics: normalized spectral means, autocorrelation estimates, normalized spectral density function estimates, and estimates of Whittle parameters (Dahlhaus & Janas, 1996). The conventional nonparametric spectral density estimation (NSDE) for the FDB under weak dependence uses a kernel function to smooth periodogram ordinates. For stationary and linear LRDt time processes, Kim and Nordman (2013) investigated an FDB for approximating the distribution of Whittle estimators by using parametric spectral density estimates under LRD. Semiparametric approaches to estimating the spectral density by using fractional exponential and fractional AR models have been suggested by Bhansali, Giraitis, and Kokoszka (2006), Hurvich and Brodsky (2001), Moulines and Soulier (1999, 2000), and Narukawa and Matsuda (2011). Those methods use the log-periodogram regression approach to combine the long-memory term with the AR-approximated parametric term for the short-memory part of the spectral density function of fractional exponential models.
Recently, Kim, Lahiri, and Nordman (2018) examined the NSDE under LRD, based on smoothing the periodogram with a kernel function; they provide optimal kernel bandwidths based on uniform and pointwise concepts.

Our goal in the current study is to establish FDB inference about ratio statistics for a different but practically wide class of stationary linear processes exhibiting strong dependence; these include popular stationary linear LRDt models such as fractional Gaussian noise (FGN; Mandelbrot & Van Ness, 1968) and the fractional autoregressive integrated moving average (FARIMA; Adenstedt, 1974; Granger & Joyeux, 1980; Hosking, 1981). We show that the FDB method is consistent for distribution estimation under mild and flexible conditions entailing strong dependence.

The remainder of this paper is organized as follows. In Section 2, we describe the FDB method under LRD and show the validity of FDB inference for ratio statistics under LRD. Section 3 reports numerical studies that consider two spectral density estimation techniques, the semi- and nonparametric approaches. In Section 4, we provide data examples to estimate confidence intervals, and finally we provide concluding remarks in Section 5. Proofs of the theorems, and some simulation results, are found in Appendix A.

2. Frequency domain bootstrap under long-range dependence

2.1. Target process

Suppose that $\{X_t\}_{t\in\mathbb{Z}}$ is a real-valued, stationary linear process defined as

\[ X_t = \mu + \sum_{j\in\mathbb{Z}} b_j \varepsilon_{t-j}, \qquad t\in\mathbb{Z}, \tag{1} \]

where $\mu = EX_t$ and $\{\varepsilon_t\}_{t\in\mathbb{Z}}$ is an iid sequence with $E\varepsilon_t = 0$, $E\varepsilon_t^2 = \sigma_\varepsilon^2 \in (0,\infty)$ and $E\varepsilon_t^4 < \infty$. The sequence of constants $\{b_t\}_{t\in\mathbb{Z}} \subset \mathbb{R}$ satisfies $\sum_{t\in\mathbb{Z}} b_t^2 < \infty$ with $b_0 = 1$. The spectral density function of the process $\{X_t\}_{t\in\mathbb{Z}}$ is defined as

\[ f(\lambda) = \frac{\sigma_\varepsilon^2}{2\pi}\,|b(\lambda)|^2, \qquad \lambda\in\Pi\equiv(-\pi,\pi], \]

where $b(\lambda) = \sum_{j\in\mathbb{Z}} b_j \exp(\imath j\lambda)$ and $\imath = \sqrt{-1}$. In addition, the spectral density function $f(\cdot)$ has the common characteristics of an LRDt process, including

\[ f(\lambda) \sim C_f\,|\lambda|^{-(1-\theta)} \qquad \text{as } \lambda\downarrow 0, \tag{2} \]

with a long-memory parameter $\theta\in(0,1)$ and a constant $C_f \equiv C_f(\theta) > 0$. Here, "$\sim$" denotes that the ratio of the quantities on the left- and right-hand sides of (2) tends to 1 in the limit. If $\theta = 1$, the process is a short-memory one. An alternative formulation of long memory uses the covariance function of the process, $r(k) \equiv \mathrm{Cov}(X_0, X_k) = \int_\Pi e^{-\imath k\lambda} f(\lambda)\,d\lambda$, $k\in\mathbb{Z}$, which satisfies the slow decay condition

\[ r(k) \sim C_r\,k^{-\theta} \qquad \text{as } k\to\infty, \tag{3} \]

for $\theta\in(0,1)$ and some constant $C_r \equiv C_r(\theta) > 0$, whereby the partial covariance sum, $\sum_{k=1}^{n} r(k) \propto n^{1-\theta}$, diverges as $n\to\infty$. Characterizations of long memory through the properties of the covariance (3) or the spectral density (2) are related (see Beran, 1994; Robinson, 1995a).
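As a small numerical illustration of the pole behavior in (2), the following sketch evaluates the FARIMA(0, d, 0) spectral density (the model family introduced in Section 3.1, with d = (1 − θ)/2) and compares it with $C_f|\lambda|^{-(1-\theta)}$ near the origin. The function name and the choice θ = 0.5 are illustrative assumptions; for this particular model with unit innovation variance, $C_f = 1/(2\pi)$.

```python
import numpy as np

def farima0d0_spectral_density(lam, d, sigma2=1.0):
    """FARIMA(0, d, 0) spectral density: sigma2/(2*pi) * |1 - exp(i*lam)|**(-2d)."""
    lam = np.asarray(lam, dtype=float)
    # |1 - exp(i*lam)| equals 2*|sin(lam/2)|
    return sigma2 / (2.0 * np.pi) * (2.0 * np.abs(np.sin(lam / 2.0))) ** (-2.0 * d)

theta = 0.5                      # illustrative long-memory parameter
d = (1.0 - theta) / 2.0          # memory parametrization used in Section 3.1
lam = np.array([1e-1, 1e-2, 1e-3, 1e-4])
C_f = 1.0 / (2.0 * np.pi)        # constant in (2) for this model with sigma2 = 1

# Ratio of the two sides of (2); it should approach 1 as lam decreases to 0.
ratio = farima0d0_spectral_density(lam, d) / (C_f * lam ** (-(1.0 - theta)))
print(ratio)
```

The ratio is close to 1 already at moderate frequencies, which is the sense in which "∼" is used in (2).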


2.2. Frequency domain bootstrap procedure

Let the discrete Fourier transform of the observations $X_1,\ldots,X_n$ from a process satisfying (1) be defined as

\[ d_n(\lambda) = \sum_{t=1}^{n} X_t \exp(\imath t\lambda), \qquad \lambda\in\Pi, \]

and let the periodogram be defined by

\[ I_n(\lambda) = \frac{1}{2\pi n}\,|d_n(\lambda)|^2, \qquad \lambda\in\Pi. \tag{4} \]

Under SRD, the FDB method transforms the time-domain data, $X_1,\ldots,X_n$, into a set of $N$ approximately independent periodogram ordinates at the Fourier frequencies $\lambda_j$, $j = 1, 2, \ldots, N$, with $N \equiv \lfloor (n-1)/2 \rfloor$. The estimation of the spectral density function for stationary time series relies heavily on the asymptotic results for the periodogram ordinates of the process. A bootstrap method using the periodogram ordinates can be directly applied for Gaussian processes (Franke & Härdle, 1992; Nordgaard, 1992). We provide below the procedure to derive an FDB approximation.

Frequency domain bootstrap procedure

1. Obtain periodogram ordinates $I_n(\lambda_j)$, defined in (4), for $j = 1, 2, \ldots, N$ from the observations $X_1,\ldots,X_n$.
2. Obtain estimates $\hat f_n(\lambda_j)$ of the spectral density $f(\cdot)$ for $j = 1, 2, \ldots, N$.
3. Normalize the periodogram ordinates with the spectral density estimates as $\hat\varepsilon_j \equiv I_n(\lambda_j)/\hat f_n(\lambda_j)$ for $j = 1, 2, \ldots, N$.
4. Rescale the normalized periodogram ordinates as $\tilde\varepsilon_j \equiv \hat\varepsilon_j/\hat\varepsilon_\cdot$ for $j = 1, 2, \ldots, N$, where $\hat\varepsilon_\cdot = N^{-1}\sum_{j=1}^{N}\hat\varepsilon_j$.
5. Draw independent bootstrap residuals $\varepsilon_1^*, \varepsilon_2^*, \ldots, \varepsilon_N^*$ from the empirical distribution of the rescaled periodogram ordinates $\{\tilde\varepsilon_j : j = 1, 2, \ldots, N\}$.
6. Define bootstrap periodogram ordinates via $I_n^*(\lambda_j) \equiv \hat f_n(\lambda_j)\,\varepsilon_j^*$ for $j = 1, 2, \ldots, N$.

The rescaled bootstrap residuals, $\varepsilon_j^*$, $j = 1, 2, \ldots, N$, have a mean of 1 with respect to the empirical distribution of $\{\tilde\varepsilon_1,\ldots,\tilde\varepsilon_N\}$. This is asymptotically a correct value, as the true rescaled periodogram ordinates, $\varepsilon_j = I_n(\lambda_j)/f(\lambda_j)$, are asymptotically distributed as an exponential variable (Beran, 1994; Beran, Feng, Ghosh, & Kulik, 2013). In addition, the rescaling step precludes additional bias at the resampling step. Dahlhaus and Janas (1996) utilized kernel-based spectral density estimates under SRD, and Kim and Nordman (2013) considered parametric spectral density estimates for the FDB on Whittle estimation under LRD. In the current study, we apply semi- and nonparametric spectral density estimates under LRD to FDB distribution inference.

2.3. Frequency domain bootstrap for ratio statistics

The main objective of the current study is to prescribe an FDB method for ratio statistics under LRD, as an extension of the work of Dahlhaus and Janas (1996) under SRD. The bootstrap replicates $I_n^*(\lambda)$ cannot appropriately capture certain sampling properties of the periodogram $I_n(\lambda)$ because of the way they are generated (Dahlhaus & Janas, 1996; Paparoditis, 2002). In other words, periodogram ordinates at different frequencies are not independent, whereas the bootstrap method produces independent replicates of the periodogram ordinates. As a result, the limiting variance of certain periodogram statistics, such as the spectral mean, differs from that of their bootstrap versions. Ratio statistics belong to the class of statistics whose limiting variance is not influenced by these sampling properties of the periodogram ordinates.
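The six steps of the procedure in Section 2.2 can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the Fourier frequencies are assumed to follow the usual convention $\lambda_j = 2\pi j/n$, and the spectral density estimate is passed in as a user-supplied callable `f_hat` (the paper uses semi- and nonparametric estimators, which are not reproduced here).

```python
import numpy as np

def periodogram(x):
    """Step 1: I_n(lambda_j) at lambda_j = 2*pi*j/n (assumed convention), j = 1..N."""
    x = np.asarray(x, dtype=float)
    n = x.size
    N = (n - 1) // 2
    lam = 2.0 * np.pi * np.arange(1, N + 1) / n
    # np.fft.fft uses a different sign and time origin than d_n, but the modulus is the same.
    I = np.abs(np.fft.fft(x)[1:N + 1]) ** 2 / (2.0 * np.pi * n)
    return lam, I

def fdb_periodograms(x, f_hat, n_boot=1000, rng=None):
    """Steps 1-6: return (lam, I, I_star), where I_star has shape (n_boot, N)."""
    rng = np.random.default_rng(rng)
    lam, I = periodogram(x)                    # step 1
    f = np.asarray(f_hat(lam), dtype=float)    # step 2: user-supplied estimate
    eps_hat = I / f                            # step 3: normalize
    eps_tilde = eps_hat / eps_hat.mean()       # step 4: rescale to mean 1
    idx = rng.integers(0, eps_tilde.size, size=(n_boot, eps_tilde.size))
    eps_star = eps_tilde[idx]                  # step 5: iid draws with replacement
    I_star = f * eps_star                      # step 6: bootstrap periodograms
    return lam, I, I_star

# Toy usage: white noise with a flat placeholder estimate of the spectral density.
rng = np.random.default_rng(0)
x = rng.standard_normal(251)
flat = lambda lam: np.full_like(lam, np.var(x) / (2.0 * np.pi))
lam, I, I_star = fdb_periodograms(x, flat, n_boot=200, rng=1)
print(I_star.shape)  # (200, 125)
```

Passing the estimator as a callable keeps the resampling logic independent of how $\hat f_n$ is obtained, which mirrors the way the procedure is stated for "any estimate" of the spectral density.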
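See Dahlhaus and Janas (1996) and Paparoditis (2002) for more details.

Let $A(\varphi, F)$ be the normalized spectral mean defined as

\[ A(\varphi, F) \equiv \int_0^\pi \varphi(\lambda)F(\lambda)\,d\lambda = \frac{\int_0^\pi \varphi(\lambda)f(\lambda)\,d\lambda}{\int_0^\pi f(\lambda)\,d\lambda}, \]

where $F(\lambda) \equiv f(\lambda)/\int_0^\pi f(x)\,dx$, and $\varphi : \Pi \to \mathbb{R}$ is a function of $\lambda\in\Pi$ that will be specified later in this subsection. The corresponding normalized spectral mean estimate is

\[ A(\varphi, L_n) \equiv \int_0^\pi \varphi(\lambda)L_n(\lambda)\,d\lambda = \frac{\int_0^\pi \varphi(\lambda)I_n(\lambda)\,d\lambda}{\int_0^\pi I_n(\lambda)\,d\lambda}, \]

where $L_n(\lambda) \equiv I_n(\lambda)/\int_0^\pi I_n(\lambda)\,d\lambda$. For example, if $\varphi(\lambda) = 2\cos(k\lambda)$, $k\in\mathbb{Z}$, $A(\varphi, L_n)$ is an estimator for the autocorrelation, and if $\varphi(\lambda) = I_{[0,x]}(\lambda)$, $x\in[0,\pi]$, $A(\varphi, L_n)$ is the estimator of the normalized spectral distribution function. The asymptotic distribution of the normalized spectral mean estimator is equal to that of the corresponding bootstrap statistic,

\[ B(\varphi, L_n^*) \equiv \frac{2\pi}{n}\sum_{j=1}^{N}\varphi(\lambda_j)L_n^*(\lambda_j), \qquad L_n^*(\lambda_j) \equiv \frac{I_n^*(\lambda_j)}{\frac{2\pi}{n}\sum_{i=1}^{N} I_n^*(\lambda_i)}. \]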
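The discretized ratio statistics above reduce to a weighted average of $\varphi(\lambda_j)$, because the $2\pi/n$ factors in numerator and denominator cancel. The sketch below illustrates this for the sample periodogram, under the assumed convention $\lambda_j = 2\pi j/n$ and with hypothetical function names. It uses $\varphi(\lambda) = \cos(k\lambda)$, for which the discretized ratio coincides with the circular lag-$k$ sample autocorrelation when $n$ is odd; a constant multiple of $\varphi$, such as the $2\cos(k\lambda)$ in the text, only rescales the statistic.

```python
import numpy as np

def fourier_periodogram(x):
    """Periodogram at lambda_j = 2*pi*j/n (assumed convention), j = 1..floor((n-1)/2)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    N = (n - 1) // 2
    lam = 2.0 * np.pi * np.arange(1, N + 1) / n
    I = np.abs(np.fft.fft(x)[1:N + 1]) ** 2 / (2.0 * np.pi * n)
    return lam, I

def ratio_statistic(phi, lam, I):
    """sum_j phi(lam_j) I_j / sum_j I_j; works for I of shape (N,) or (B, N)."""
    I = np.asarray(I, dtype=float)
    return np.sum(phi(lam) * I, axis=-1) / np.sum(I, axis=-1)

rng = np.random.default_rng(1)
x = rng.standard_normal(301)                   # odd n, so the identity below is exact
lam, I = fourier_periodogram(x)
rho1_freq = ratio_statistic(lambda l: np.cos(l), lam, I)

# Time-domain check: circular lag-1 autocorrelation of the centered series.
xc = x - x.mean()
rho1_circ = np.dot(xc, np.roll(xc, 1)) / np.dot(xc, xc)
print(rho1_freq, rho1_circ)                    # the two values agree up to rounding
```

Because `ratio_statistic` sums over the last axis, the same function can be applied to a matrix of bootstrap periodograms to obtain one replicate of $B(\varphi, L_n^*)$ per row.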


To bootstrap the distribution of $A(\varphi, L_n) - A(\varphi, F)$, we use the bootstrap statistic $B(\varphi, L_n^*) - B(\varphi, \hat F_n)$, where

\[ B(\varphi, \hat F_n) \equiv \frac{2\pi}{n}\sum_{j=1}^{N}\varphi(\lambda_j)\hat F_n(\lambda_j), \qquad \hat F_n(\lambda_j) = \frac{\hat f_n(\lambda_j)}{\frac{2\pi}{n}\sum_{i=1}^{N}\hat f_n(\lambda_i)}. \]

Here, $\hat f_n(\cdot)$ is any estimate of the spectral density function based on a sample of the time series. Let $\hat\ell_n$ and $\hat\ell_n^*$ be defined as

\[ \hat\ell_n \equiv A(\varphi, L_n) - A(\varphi, F) \quad\text{and}\quad \hat\ell_n^* \equiv B(\varphi, L_n^*) - B(\varphi, \hat F_n). \]

The bootstrap distributional approximation is known to hold for ratio statistics under SRD (Dahlhaus & Janas, 1996). We now show that the bootstrap approximation also holds for LRDt processes. The assumptions required for the FDB distributional results under LRD are outlined below.

Assumptions.

A.1 $f(\lambda)|\lambda|^{1-\theta}$ and $f^{-1}(\lambda)|\lambda|^{-(1-\theta)}$ are bounded and continuous on $\Pi$.

A.2 $\varphi$ is an even, integrable function such that $|\varphi(\lambda)| \le C|\lambda|^{\gamma}$, $\lambda\in\Pi$, where $0 \le \gamma < 1$ and $\theta + \gamma > 1/2$.

A.3 For $\varphi : \Pi\to\mathbb{R}$, one of the following is fulfilled:
  C.1 $\varphi$ is Lipschitz of order greater than 1/2 on $[0,\pi]$.
  C.2 $\varphi$ is continuous on $\Pi$ and $\left|\partial\varphi(\lambda)/\partial\lambda\right| \le C|\lambda|^{\gamma-1}$ for some $\gamma$ with $2\theta + \gamma > 1$.
  C.3 $\varphi$ is of bounded variation on $[0,\pi]$ with finitely many discontinuities, and $\theta > 1/2$ with $|r(k)| \le Ck^{-\nu}$ for some $\nu > 1/2$ (e.g., $\nu = \theta$), where $r(k)$ is the autocovariance function at lag $k$.

A.4 The spectral density estimator $\hat f_n(\lambda)$ of $f(\lambda)$ satisfies

\[ \sup_{\lambda\in(0,\pi]}\left|\frac{\hat f_n(\lambda)}{f(\lambda)} - 1\right| \xrightarrow{\;p\;} 0 \qquad\text{as } n\to\infty. \]

A.5 On $(0,\pi]$, either (i) $f$ is differentiable and $|\partial f(\lambda)/\partial\lambda| = O\left(|\lambda|^{\theta-2-\eta}\right)$ for $-(2-\theta) < \eta < -\theta$ as $\lambda\to 0$, or (ii) each $f(\lambda)\varphi(\lambda)$ is of bounded variation or is piecewise Lipschitz of order greater than 1/2 on $(0,\pi]$. As $n\to\infty$,

\[ P\left(0 \in \mathrm{ch}^{\circ}\left\{\pi\varphi(\lambda_j)\left[I_n(\lambda_j) - f(\lambda_j)\right]\right\}_{j=1}^{N}\right) \to 1, \]

where $\mathrm{ch}^{\circ}A$ denotes the interior convex hull of a finite set $A\subset\mathbb{R}$.

Remark 1. Assumption A.1 entails that the spectral density $f$ under LRD can be appropriately transformed into a bounded, SRD-like density, as suggested by the pole behavior (2). For the Whittle estimation, Fox and Taqqu (1986) assume a similar condition with parametric spectral density functions under LRD. In particular, the density smoothness condition in A.1 holds for FARIMA and fractional Gaussian noise models for stationary linear LRDt processes, as does the stronger condition in A.5. In addition, the behavior of $\varphi$ in Assumption A.2 controls the growth rate of the scaled periodogram ordinates, $\varphi(\lambda_j)I_n(\lambda_j)$, at low frequencies under LRD.

Remark 2. Assumption A.2 outlines the smoothness criteria for the function $\varphi$. The functions treated by the FDB distributional inference in the study of Dahlhaus and Janas (1996) satisfy A.2, including those for autocorrelations and the normalized spectral distribution.

Remark 3. Hannan (1973) considered the Whittle estimation for autoregressive moving average (ARMA) spectral densities for which the function $\varphi$ satisfies C.1 of A.3. The functions $f^{-1}$ associated with the fractional Gaussian and FARIMA densities fulfill C.2 of A.3 (Dahlhaus, 1983; Fox & Taqqu, 1986; Giraitis & Surgailis, 1990). A process dependence that is not extremely strong (for example, such that $f^2$ is integrable) allows greater flexibility in choosing more general functions under C.3 of A.3.

Remark 4. In Assumption A.4, the convergence requirement $\sup_{\lambda\in(0,\pi]}|\hat f_n(\lambda)/f(\lambda) - 1| = o_p(1)$ is mild and is fulfilled by standard semi- and nonparametric approaches for estimating the spectral density function.

Remark 5. The FARIMA process is defined in (5). Its spectral density has the form

\[ f(\lambda) = \frac{\sigma_\varepsilon^2}{2\pi}\,\frac{|\psi(e^{\imath\lambda})|^2}{|\varrho(e^{\imath\lambda})|^2}\,|1 - e^{\imath\lambda}|^{\theta-1} \sim C|\lambda|^{\theta-1} \qquad\text{as } \lambda\to 0, \]

which satisfies Assumption A.5.


Remark 6. Because the FDB method for ratio statistics is applied here to LRDt processes, Assumptions A.1, A.4, and A.5 differ from those used under SRD. For Assumption A.4, the spectral density under LRD has a pole at the zero frequency, so $\lambda = 0$ is excluded from the consistency requirement. For Assumptions A.1 and A.4, since the spectral density function under LRD is not finite, the additional conditions on $f(\lambda)$ and the restriction $\lambda\in(0,\pi]$ are added.

Theorem 1. Let $\{X_t\}$ be a real-valued stationary linear process satisfying (1) and (2). Under Assumptions A.1–A.5, for almost all samples (i.e., periodogram ordinates $\{I_n(\lambda_j)\}$ for $j = 1,\ldots,N$ with $N \equiv \lfloor (n-1)/2 \rfloor$), as $n\to\infty$,

1. $\sqrt{n}\,\hat\ell_n \xrightarrow{\;d\;} N(0, V)$,
2. $\sqrt{n}\,\hat\ell_n^* \xrightarrow{\;d\;} N(0, V)$ in probability, and
3. $\sup_{x\in\mathbb{R}}\left|P^*\left(\sqrt{n}\,\hat\ell_n^* \le x\right) - P\left(\sqrt{n}\,\hat\ell_n \le x\right)\right| \xrightarrow{\;p\;} 0$,

where

\[ V = \pi\int_0^\pi F^2(x)\left\{\varphi(x) - \int_0^\pi \varphi(y)F(y)\,dy\right\}^2 dx. \]

Theorem 1 states that the FDB approximation under LRD holds for the distribution of ratio statistics that fulfill Assumptions A.1–A.5. This implies that the resampling procedure conducted on the standardized periodogram ordinates is consistent under LRD as well as under SRD.

3. Numerical studies

3.1. Simulation setup

In this section, we examine the FDB distribution approximation for the ratio statistics of stationary linear LRDt processes. In the following simulation study, we consider several types of FARIMA(p, d, q) processes (cf. Beran et al., 2013, p.738), defined by

\[ X_t = (1 - B)^{-d}\,\frac{\psi(B)}{\varrho(B)}\,\varepsilon_t, \tag{5} \]

with memory parametrization $d = (1-\theta)/2 \in [0, 1/2)$ for $\theta\in(0,1]$, where $B$ is the backshift operator, and $\varrho(z) = 1 - \sum_{j=1}^{p}\varrho_j z^j$ and $\psi(z) = 1 + \sum_{j=1}^{q}\psi_j z^j$ are the AR and moving average (MA) polynomials (i.e., no common roots and all roots outside the unit circle), respectively. Additionally, $\{\varepsilon_t\}_{t\in\mathbb{Z}}$ denotes a sequence of iid mean-zero random variables. Note that the FARIMA process has the linear form (1) and corresponds to an ARMA model when $d = 0$ ($\theta = 1$). In addition, its spectral density and covariance satisfy (2) and (3) under LRD, respectively (Beran, 1994; Beran et al., 2013).

For the target processes of the simulation study, we utilize three types of FARIMA models: FARIMA(0, d, 0); FARIMA(1, d, 0) with AR parameter $\varrho = 0.3$ or $\varrho = -0.7$; and FARIMA(0, d, 1) with MA parameter $\psi = 0.3$ or $\psi = -0.7$. Three different distributions are used for the innovations $\{\varepsilon_t\}$: standard normal, centered $\chi_1^2 - 1$, and uniform $(-\sqrt{3}, \sqrt{3})$. Moreover, we examine the FDB method over a range of long-memory parameters $\theta\in\{0.2, 0.5, 0.8\}$, corresponding to $d\in\{0.40, 0.25, 0.10\}$, respectively, with sample sizes of $n\in\{250, 500, 1000\}$. The sample size and the memory parameter are expected to affect the performance of the FDB distribution estimation more critically than the innovation type does. The target ratio statistic is the autocorrelation function at lag 1.

The theoretical autocovariance function for FARIMA(p, d, q) processes, where $p$ and $q$ are the numbers of AR and MA parameters, respectively, is written as

\[ r(k) = \sigma_\varepsilon^2 \sum_{l=-q}^{q}\sum_{s=1}^{p} \varsigma(l)\,\xi_s\,M(d, p + l - k, \varrho_s), \tag{6} \]

with

\[ \varsigma(l) = \sum_{j=\max(0,l)}^{\min(q,q+l)} \psi_j\psi_{j-l}, \qquad \xi_s = \left[\varrho_s\prod_{i=1}^{p}(1 - \varrho_i\varrho_s)\prod_{m\neq s}(\varrho_s - \varrho_m)\right]^{-1}, \]

and $M(d, k, \varrho) = r_0(k)\left[\varrho^{2p}\nu(k) + \nu(-k) - 1\right]/\sigma_\varepsilon^2$, where $\nu(k) = GH(d + k, 1, 1 - d + k, \varrho)$ and the Gaussian hypergeometric function is defined by

\[ GH(x_1, x_2, x_3, y) = 1 + \frac{x_1\cdot x_2}{x_3\cdot 1}\,y + \frac{x_1(x_1+1)\cdot x_2(x_2+1)}{x_3(x_3+1)\cdot 1\cdot 2}\,y^2 + \cdots. \]

See Beran (1994), Beran et al. (2013) and Palma (2007) for more details. The theoretical autocorrelation at lag 1, $\rho(1)$, is $r(1)/r(0)$. This theoretical value is used to investigate the performance of the FDB for ratio statistics under LRD.


Table 1. Empirical coverage rates and average lengths for two-sided 95% confidence intervals for the autocorrelation at lag 1, ρ(1), from FARIMA processes with standard normal innovations using FDB, while considering NSDE with h = Cn^{−1/5}[log n]^{3/5} and SSDE with p = C log(n).

FARIMA(0, (1 − θ)/2, 0)
        θ = 0.2, ρ(1) = 0.67         | θ = 0.5, ρ(1) = 0.33         | θ = 0.8, ρ(1) = 0.11
              n = 250     n = 1000   |       n = 250     n = 1000   |       n = 250     n = 1000
Method  C     CP    AL    CP    AL   | C     CP    AL    CP    AL   | C     CP    AL    CP    AL
NSDE    2.5   0.82  0.28  0.86  0.17 | 0.3   0.88  0.27  0.94  0.15 | 0.15  0.90  0.23  0.94  0.13
        3.0   0.83  0.27  0.88  0.17 | 0.5   0.91  0.28  0.94  0.15 | 0.2   0.91  0.24  0.94  0.13
        3.5   0.82  0.27  0.87  0.17 | 1.0   0.92  0.28  0.93  0.15 | 0.3   0.92  0.24  0.93  0.13
SSDE    0.5   0.61  0.32  0.51  0.20 | 0.5   0.85  0.30  0.86  0.16 | 0.5   0.87  0.27  0.88  0.13
        1.0   0.63  0.33  0.52  0.20 | 1.0   0.86  0.32  0.87  0.16 | 1.0   0.89  0.28  0.88  0.14
        1.5   0.64  0.34  0.52  0.20 | 1.5   0.88  0.33  0.88  0.17 | 1.5   0.90  0.29  0.88  0.14
        DR    0.59  0.32  0.49  0.19 | DR    0.84  0.30  0.86  0.16 | DR    0.87  0.27  0.87  0.13

FARIMA(1, (1 − θ)/2, 0) with ϱ = 0.3
        θ = 0.2, ρ(1) = 0.84         | θ = 0.5, ρ(1) = 0.61         | θ = 0.8, ρ(1) = 0.41
              n = 250     n = 1000   |       n = 250     n = 1000   |       n = 250     n = 1000
Method  C     CP    AL    CP    AL   | C     CP    AL    CP    AL   | C     CP    AL    CP    AL
NSDE    0.5   0.86  0.23  0.92  0.14 | 0.15  0.86  0.21  0.93  0.13 | 0.1   0.89  0.20  0.94  0.11
        1.0   0.91  0.23  0.95  0.14 | 0.2   0.88  0.22  0.94  0.13 | 0.15  0.91  0.21  0.94  0.12
        1.5   0.91  0.22  0.95  0.14 | 0.3   0.90  0.23  0.94  0.14 | 0.2   0.91  0.22  0.94  0.12
SSDE    0.5   0.66  0.22  0.54  0.13 | 0.5   0.86  0.24  0.87  0.13 | 0.5   0.87  0.25  0.88  0.12
        1.0   0.68  0.23  0.55  0.13 | 1.0   0.88  0.25  0.87  0.13 | 1.0   0.88  0.26  0.88  0.12
        1.5   0.69  0.23  0.56  0.13 | 1.5   0.89  0.26  0.87  0.13 | 1.5   0.90  0.26  0.88  0.12
        DR    0.69  0.22  0.63  0.13 | DR    0.86  0.24  0.86  0.13 | DR    0.87  0.24  0.87  0.12

FARIMA(0, (1 − θ)/2, 1) with ψ = −0.7
        θ = 0.2, ρ(1) = −0.21        | θ = 0.5, ρ(1) = −0.36        | θ = 0.8, ρ(1) = −0.43
              n = 250     n = 1000   |       n = 250     n = 1000   |       n = 250     n = 1000
Method  C     CP    AL    CP    AL   | C     CP    AL    CP    AL   | C     CP    AL    CP    AL
NSDE    0.6   0.93  0.23  0.94  0.17 | 0.1   0.89  0.17  0.93  0.10 | 0.1   0.90  0.16  0.93  0.09
        0.7   0.95  0.23  0.96  0.17 | 0.15  0.91  0.18  0.93  0.10 | 0.15  0.91  0.17  0.93  0.09
        0.8   0.96  0.23  0.97  0.17 | 0.2   0.92  0.18  0.92  0.10 | 0.2   0.92  0.17  0.92  0.09
SSDE    0.5   0.75  0.23  0.61  0.13 | 0.5   0.87  0.20  0.87  0.10 | 0.5   0.85  0.19  0.88  0.09
        1.0   0.78  0.25  0.66  0.13 | 1.0   0.88  0.21  0.88  0.10 | 1.0   0.87  0.19  0.87  0.09
        1.5   0.79  0.26  0.68  0.14 | 1.5   0.89  0.22  0.88  0.10 | 1.5   0.88  0.20  0.88  0.10
        DR    0.68  0.24  0.55  0.12 | DR    0.84  0.21  0.85  0.10 | DR    0.83  0.19  0.86  0.09

CP: coverage probability; AL: average length; DR: data-driven; d = (1 − θ)/2; C = 0.5, 1, 1.5.

3.2. Numerical results

For each FARIMA process and sample size, we computed the empirical coverage probabilities and the average lengths of 95% confidence intervals for $\rho(1)$, based on one- and two-sided tests; for this purpose we used NSDE (Kim et al., 2018) and semiparametric spectral density estimation (SSDE) (Beran et al., 2013; Hurvich & Brodsky, 2001; Moulines & Soulier, 1999, 2000; Narukawa & Matsuda, 2011). See the on-line supplementary materials for more details on these spectral density estimation methods. Here, we examine various constants $C$, from $C = 0.01$ to $C = 4.5$, for the fixed bandwidth of the NSDE and the truncation order of the SSDE to investigate the performance of the FDB. We report the optimal $C$ that minimizes criteria such as the root mean squared percentage error and the mean absolute percentage error, depending on the time process and the true long-memory parameter. Additionally, the data-driven truncation order selection method is applied for the SSDE. Several practical options can be used to estimate the memory parameter $\theta\in(0,1]$ in the initialization step of the NSDE method. We used M = 5,000 Monte Carlo simulation runs and B = 1,000 bootstrap replicates.

Table 1 presents the coverage accuracies of 95% confidence intervals for the autocorrelation at lag 1 from FARIMA(0, (1 − θ)/2, 0), FARIMA(1, (1 − θ)/2, 0) with $\varrho = 0.3$, and FARIMA(0, (1 − θ)/2, 1) with $\psi = -0.7$. We do not report in this paper the simulation results for a sample size of 500 or for the processes FARIMA(1, (1 − θ)/2, 0) with $\varrho = -0.7$ and FARIMA(0, (1 − θ)/2, 1) with $\psi = 0.3$, as those results show asymptotic properties similar to those in Table 1. The simulation results for the $\chi_1^2 - 1$ and uniform $(-\sqrt{3}, \sqrt{3})$ innovations are reported in Appendix A. The simulation results in Table 1 show that the FDB method has comparable performance across the values of the LRDt parameter.

Moreover, the FDB performance relies on that of the spectral density estimation methods. In general, FDB performance with the NSDE was better than that with the SSDE; however, the performance of the FDB method with the NSDE was affected by the optimal bandwidth for the NSDE, which in turn depends on the process, the long-memory parameter, and the sample size (Kim et al., 2018), but not on the magnitude of the autocorrelation at lag 1. In addition, the simulation results in Table 1 using the SSDE when $\theta < 0.5$, sometimes called stronger LRD, show that some coverage probabilities with n = 1000 are smaller than those with n = 500 for some methods. This tendency sometimes appears for stronger LRD. Kim and Nordman (2011) obtained similar simulation results for the autoregressive sieve bootstrap. A more detailed theoretical treatment is beyond the scope of this manuscript. In summary, the coverage probabilities of the FDB confidence intervals with the NSDE show better performance than do those with the SSDE, whereas the FDB performance with the NSDE depends on the optimal bandwidth for the NSDE.

4. Data examples

In this section, we demonstrate the FDB method for ratio statistics using the NSDE and the SSDE on two real data sets. Section 4.1 considers a realization from a potentially weaker LRDt process ($\theta\in[1/2, 1)$) with a relatively small sample size. Section 4.2 involves data from an apparently stronger LRDt process ($\theta\in(0, 1/2)$) with a comparatively large sample size. We considered a fixed-bandwidth rule for the NSDE, based on the order of the theoretically optimal bandwidth of Kim et al. (2018), and a resampling-based technique for the bandwidth selection (presented in Appendix A). In our data examples, the initial and resampling bandwidths are set to $h_i = h_r = 0.1\,n^{-1/5}[\log n]^{3/5}$, as motivated by the bandwidth considerations in Kim et al. (2018) and similar bandwidths in Franke and Härdle (1992). We also used a fixed-order selection rule and a data-driven order selection method for the SSDE (also presented in the on-line supplementary materials).

4.1. NIST measurement deviations

Fig. 1 illustrates the time process of high-precision weight measurements of a 1-kg check standard weight performed at the National Institute of Standards and Technology (NIST) (see Beran, 1994; Pollak, Croarkin, & Hagewood, 1993). The observations were measured between June 24, 1963, and October 17, 1975, with the same weighing machine; NIST then converted the measured values to deviations from 1 kg, in microgram units.

Fig. 1. Plot of NIST measurement deviation time series.

The number of observations is n = 289 and the corresponding LRDt parameter estimate using the log-periodogram regression is $\hat\theta_n = 0.74$. The estimated value implies that the time series in Fig. 1 is a weaker LRDt process (stronger LRD if $\theta\in(0, 1/2)$ and weaker LRD if $\theta\in[1/2, 1)$; see Kim and Nordman (2011) for more details). The estimate of the autocorrelation at lag 1 is 0.0769, and its 95% confidence interval estimate based on the tapered method in Hyndman (2015) and McMurry and Politis (2010) is (−0.0536, 0.1743). In Table 2, by applying the FDB methods using the NSDE and the SSDE to the NIST data, we obtained approximate 95% confidence interval estimates for the autocorrelation at lag 1 as (−0.0298, 0.2385), (0.0011, 0.2697), (0.0352, 0.2979), and (−0.0295, 0.1832) from the NSDE and (−0.2119, 0.0948), (−0.2116, 0.1107), (−0.1989, 0.1675), and (−0.2240, 0.0970) from the SSDE. Here, the confidence interval estimates from the NSDE and the SSDE were computed with a fixed bandwidth, $h = Cn^{-1/5}[\log n]^{3/5}$, and a fixed truncation order, $p = C\log(n)$, corresponding to C = 0.5, 1.0, 1.5, and with the optimal bandwidth and the data-driven truncation order from Appendix A. The FDB confidence interval using the NSDE with the optimal bandwidth aligns with the tapered confidence interval estimate.
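The long-memory estimate $\hat\theta_n$ above comes from a log-periodogram regression. The paper does not spell out the variant or the number of frequencies used, so the following is only a sketch of one common form, the Geweke and Porter-Hudak type regression, with the assumed conventions $\lambda_j = 2\pi j/n$ and $\theta = 1 - 2d$; the function name and the default number of frequencies `m` are illustrative choices.

```python
import numpy as np

def gph_theta(x, m=None):
    """Log-periodogram regression estimate of theta = 1 - 2d (GPH-type sketch)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if m is None:
        m = int(n ** 0.5)                 # assumed rule of thumb, not taken from the paper
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n
    I = np.abs(np.fft.fft(x - x.mean())[1:m + 1]) ** 2 / (2.0 * np.pi * n)
    # Near zero, log I(lam_j) ~ const - 2*d*log(2*sin(lam_j/2)), so the slope on
    # the regressor below estimates d.
    regressor = -2.0 * np.log(2.0 * np.sin(lam / 2.0))
    d_hat = np.polyfit(regressor, np.log(I), 1)[0]
    return 1.0 - 2.0 * d_hat

# Toy usage on white noise, where the estimate should be near theta = 1 (short memory).
rng = np.random.default_rng(2)
x = rng.standard_normal(4096)
print(gph_theta(x, m=222))
```

The estimate is sensitive to `m`: few frequencies give a noisy slope, while many frequencies let short-memory structure bias it, so in practice the choice should be checked against the data at hand.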


Table 2. Empirical 95% confidence intervals and average lengths for the autocorrelation at lag 1 from NIST data using FDB, considering NSDE with h = Cn^{−1/5}[log n]^{3/5} and SSDE with p = C log(n).

        NSDE                         | SSDE
C       LB        UB       AL        | LB        UB       AL
0.5     −0.0298   0.2385   0.2684    | −0.2119   0.0948   0.3067
1.0      0.0011   0.2697   0.2686    | −0.2116   0.1107   0.3223
1.5      0.0352   0.2979   0.2627    | −0.1989   0.1675   0.3663
DR      −0.0295   0.1832   0.2127    | −0.2240   0.0970   0.3209

LB and UB: lower and upper bounds; AL: average length; DR: data-driven; C = 0.5, 1, 1.5.

Table 3. Empirical 95% confidence intervals and average lengths for the autocorrelation at lag 1 from northern hemisphere temperature data using FDB, while considering NSDE with h = Cn^{−1/5}[log n]^{3/5} and SSDE with p = C log(n).

        NSDE                         | SSDE
C       LB        UB       AL        | LB        UB       AL
0.5     0.4712    0.6222   0.1510    | 0.4409    0.6733   0.2324
1.0     0.4874    0.6583   0.1709    | 0.4276    0.6896   0.2620
1.5     0.5099    0.6765   0.1666    | 0.4286    0.6655   0.2369
DR      0.4820    0.6358   0.1538    | 0.4112    0.6232   0.2120

LB and UB: lower and upper bounds; AL: average length; DR: data-driven; C = 0.5, 1, 1.5.

Fig. 2. Plot of time series of monthly temperature for the northern hemisphere.

4.2. Northern hemisphere temperature data

Fig. 2 presents monthly temperature data for the northern hemisphere, from 1854 to 1989 and in degrees Celsius (Beran, 1994; Jones & Briffa, 1992); the number of observations is n = 1,632. The data points are obtained as the difference between the monthly average temperature from 1950 to 1979 and the observed temperature from 1854 to 1989. To apply the FDB method using the NSDE and the SSDE, we first remove the linear trend from the data; the long-memory parameter is then estimated by the log-periodogram regression as $\hat\theta_n = 0.38$, implying a stronger long-memory process. The estimate of the autocorrelation at lag 1 is 0.6137, and its 95% confidence interval estimate based on the tapered method is (0.4852, 0.7100). In Table 3, by applying the FDB methods to the northern hemisphere temperature data, we obtained approximate 95% confidence interval estimates for the autocorrelation at lag 1 as (0.4712, 0.6222), (0.4874, 0.6583), (0.5099, 0.6765), and (0.4820, 0.6358) from the NSDE and (0.4409, 0.6733), (0.4276, 0.6896), (0.4286, 0.6655), and (0.4112, 0.6232) from the SSDE, as before. The FDB confidence interval estimates using the NSDE with the optimal bandwidth agree with the tapered confidence interval estimate.
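The intervals in Tables 2 and 3 are derived from bootstrap replicates of $\hat\ell_n^*$, but this section does not state the exact interval construction. The sketch below therefore assumes the basic (reverse-percentile) bootstrap interval: since the distribution of $\hat\ell_n^*$ approximates that of $\hat\ell_n = A(\varphi, L_n) - A(\varphi, F)$, the bootstrap quantiles $q_{\alpha/2}$ and $q_{1-\alpha/2}$ give the interval $[A(\varphi, L_n) - q_{1-\alpha/2},\; A(\varphi, L_n) - q_{\alpha/2}]$. The function name and the toy replicates are hypothetical.

```python
import numpy as np

def basic_bootstrap_ci(estimate, ell_star, alpha=0.05):
    """Basic bootstrap interval from replicates of ell* = B(phi, L*) - B(phi, F_hat).

    Assumed construction (not confirmed by the paper): invert the bootstrap
    quantiles of ell* around the point estimate.
    """
    ell_star = np.asarray(ell_star, dtype=float)
    q_lo, q_hi = np.quantile(ell_star, [alpha / 2.0, 1.0 - alpha / 2.0])
    return estimate - q_hi, estimate - q_lo

# Toy usage with artificial, evenly spaced replicates (illustration only, not data).
ell_star = np.linspace(-1.0, 1.0, 201)
lower, upper = basic_bootstrap_ci(0.5, ell_star)
print(lower, upper)  # approximately -0.45 and 1.45
```

With replicates that are symmetric about zero, as in the toy example, this interval coincides with the percentile interval; the two differ when the bootstrap distribution is skewed.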


5. Concluding remarks

In this study, we examined the frequency domain bootstrap (FDB) method for ratio statistics under long-range dependence, using semi- and nonparametric spectral density estimation methods. We showed that the FDB method offers valid estimation of the sampling distribution of ratio statistics for a broad class of linear (but not necessarily causal) time processes, which could exhibit strong time dependence. The FDB method with semi- and nonparametric spectral density estimates has an advantage: it presupposes no knowledge on the part of the practitioner of the exact structure of the data-generating mechanism or of the full joint distribution of the time series observations for linear and stationary long-memory processes. The FDB with the NSDE performs better than the FDB with the SSDE, even though the optimal bandwidth of the NSDE depends on the data-generating process and on the sample size, whereas the optimal truncation order for the SSDE does not strongly depend on the process. Nonetheless, the FDB with nonparametric spectral density estimates using the data-driven bandwidth selection method proposed by Franke and Härdle (1992) performed well in the simulation studies presented in Section 4.

Acknowledgments

The authors would like to thank the editor, associate editor and reviewers for their careful readings and thoughtful comments. The authors are supported by the National Research Foundation of Korea (NRF), NRF-2016R1D1A1B03932212 and NRF-2018R1D1A1B07045220, respectively.

Appendix A. Proofs of main results

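A.1. Proofs of Theorem 1

In the following, C denotes a generic constant that depends on the arguments (if any) but does not depend on the sample size n.

Proof of Theorem (i). We can rewrite $\hat{\ell}_n$ as

$$
\begin{aligned}
\hat{\ell}_n &= \int_0^\pi \phi(x) L_n(x)\,dx - \int_0^\pi \phi(x) F(x)\,dx \\
&= \frac{1}{\int_0^\pi \{F(x) I_n(x)/f(x)\}\,dx}
\Bigg[ \int_0^\pi \Big\{\phi(x) - \int_0^\pi \phi(z)F(z)\,dz\Big\} F(x)\,\frac{(I_n(x) - f(x))}{f(x)}\,dx \\
&\qquad\qquad + \int_0^\pi \Big\{\phi(x) - \int_0^\pi \phi(z)F(z)\,dz\Big\} \frac{F(x)}{f(x)}\, f(x)\,dx \Bigg] \\
&\equiv A_n \cdot \{B_n + C\}.
\end{aligned}
$$

It is natural that $C = 0$, because $F$ integrates to one over $[0, \pi]$. Note that

$$
\frac{1}{A_n} = \frac{\int_0^\pi (I_n(x) - f(x))\,dx}{\int_0^\pi f(x)\,dx} + 1.
$$

By Lemma 3, we have that

$$
\frac{\int_0^\pi (I_n(x) - f(x))\,dx}{\int_0^\pi f(x)\,dx} \xrightarrow{p} 0 \quad \text{as } n \to \infty.
$$

Thus, $A_n \xrightarrow{p} 1$ as $n \to \infty$. By Lemma 3, we obtain that as $n \to \infty$,

$$
\Bigg| B_n - \frac{2\pi}{n} \sum_{j=1}^{N} \Big( \phi(\lambda_j) - \int_0^\pi \phi(x)F(x)\,dx \Big) \frac{F(\lambda_j)}{f(\lambda_j)} \big\{ I_n(\lambda_j) - f(\lambda_j) \big\} \Bigg| = o_p\big(n^{-1/2}\big). \tag{7}
$$

Because of the result (7), we use $\frac{2\pi}{n} \sum_{j=1}^{N} \big( \phi(\lambda_j) - \int_0^\pi \phi(x)F(x)\,dx \big) \frac{F(\lambda_j)}{f(\lambda_j)} \{ I_n(\lambda_j) - f(\lambda_j) \}$ instead of $B_n$. Thus, by Lemma 3, we also have that as $n \to \infty$,

$$
\sqrt{n} \Bigg( \frac{2\pi}{n} \sum_{j=1}^{N} \Big( \phi(\lambda_j) - \int_0^\pi \phi(x)F(x)\,dx \Big) \frac{F(\lambda_j)}{f(\lambda_j)} \big\{ I_n(\lambda_j) - f(\lambda_j) \big\} \Bigg) \xrightarrow{d} N(0, V). \tag{8}
$$

Note that since

$$
\int_0^\pi f(x) \Big( \phi(x) - \int_0^\pi \phi(y)F(y)\,dy \Big) \frac{1}{\int_0^\pi f(y)\,dy}\,dx = 0,
$$

we have

$$
V = \pi \int_0^\pi F^2(x) \Big\{ \phi(x) - \int_0^\pi \phi(y)F(y)\,dy \Big\}^2 dx.
$$

Therefore, by Slutsky's theorem,

$$
\sqrt{n}\,\hat{\ell}_n \xrightarrow{d} N(0, V) \quad \text{as } n \to \infty.
$$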
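The decomposition $\hat{\ell}_n = A_n\{B_n + C\}$ used in this proof can be checked numerically once the integrals are replaced by Riemann sums over the Fourier frequencies. The sketch below does this under assumptions that are not part of the paper: an AR(1)-type spectral density, a toy periodogram built as $f$ times exponential noise, $\phi(x) = \cos x$, and a hypothetical helper `integ` for the Riemann sum.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 401
N = (n - 1) // 2
lam = 2.0 * np.pi * np.arange(1, N + 1) / n
w = 2.0 * np.pi / n                                  # Riemann weight 2*pi/n

def integ(g):
    """Riemann-sum stand-in for the integral over [0, pi]."""
    return w * np.sum(g)

f = 1.0 / (2.0 * np.pi * (1.25 - np.cos(lam)))       # AR(1) density, coefficient 0.5
I = f * rng.exponential(size=N)                      # toy periodogram ordinates
phi = np.cos(lam)

F = f / integ(f)                                     # normalized spectral density
L = I / integ(I)                                     # normalized periodogram
mu = integ(phi * F)

ell = integ(phi * L) - mu                            # the ratio statistic, centred
A = 1.0 / integ(F * I / f)
B = integ((phi - mu) * F * (I - f) / f)
C = integ((phi - mu) * F)                            # vanishes since F sums to one

print(ell, A * (B + C), C)
```

The identity is exact for the discretized quantities because the normalized density sums to one under the same Riemann weights, which is the discrete counterpart of the reason $C = 0$.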


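Proof of Theorem (ii). Define $\hat{F}_n(\lambda_j) = \hat{f}_n(\lambda_j) \big[ \frac{2\pi}{n} \sum_{k=1}^{N} \hat{f}_n(\lambda_k) \big]^{-1}$. We denote $\hat{\ell}^*_n = A^*_n (B^*_n + C_n)$, where

$$
A^*_n = \frac{1}{\frac{2\pi}{n} \sum_{j=1}^{N} \hat{F}_n(\lambda_j) \big\{ I^*_n(\lambda_j) / \hat{f}_n(\lambda_j) \big\}},
$$

$$
B^*_n = \frac{2\pi}{n} \sum_{j=1}^{N} \Big\{ \phi(\lambda_j) - \frac{2\pi}{n} \sum_{k=1}^{N} \phi(\lambda_k) \hat{F}_n(\lambda_k) \Big\} \hat{F}_n(\lambda_j)\, \frac{\big( I^*_n(\lambda_j) - \hat{f}_n(\lambda_j) \big)}{\hat{f}_n(\lambda_j)},
$$

$$
C_n = \frac{2\pi}{n} \sum_{j=1}^{N} \Big\{ \phi(\lambda_j) - \frac{2\pi}{n} \sum_{k=1}^{N} \phi(\lambda_k) \hat{F}_n(\lambda_k) \Big\} \hat{F}_n(\lambda_j).
$$

Since $I^*_n(\lambda_j) = \varepsilon^*_j \hat{f}_n(\lambda_j)$, we rewrite $A^*_n$, with $E_* A^{*-1}_n = 1$ from the result of $E_* \varepsilon^*_j = 1$, as

$$
\frac{1}{A^*_n} = \frac{2\pi}{n} \sum_{j=1}^{N} \Big\{ \hat{F}_n(\lambda_j) \frac{I^*_n(\lambda_j)}{\hat{f}_n(\lambda_j)} \Big\}
= \frac{\frac{2\pi}{n} \sum_{j=1}^{N} I^*_n(\lambda_j)}{\frac{2\pi}{n} \sum_{j=1}^{N} \hat{f}_n(\lambda_j)}
= \frac{\frac{2\pi}{n} \sum_{j=1}^{N} \hat{f}_n(\lambda_j)\, \varepsilon^*_j}{\frac{2\pi}{n} \sum_{j=1}^{N} \hat{f}_n(\lambda_j)}.
$$

To show $A^*_n \xrightarrow{p} 1$ as $n \to \infty$, we first define $\tilde{A}^*_n$ as

$$
\frac{1}{\tilde{A}^*_n} = \frac{2\pi}{n} \sum_{j=1}^{N} F_1(\lambda_j)\, \varepsilon^*_j, \quad \text{where } F_1(\lambda_j) = f(\lambda_j) \Big[ \frac{2\pi}{n} \sum_{k=1}^{N} f(\lambda_k) \Big]^{-1},
$$

which implies $E_* \tilde{A}^{*-1}_n = 1$. Thus, by Lemmas 4 and 5,

$$
\frac{1}{\tilde{A}^*_n} - E_*\Big( \frac{1}{\tilde{A}^*_n} \Big) = \frac{2\pi}{n} \sum_{j=1}^{N} F_1(\lambda_j) \big( \varepsilon^*_j - 1 \big) \xrightarrow{p} 0 \quad \text{as } n \to \infty.
$$

In addition,

$$
\begin{aligned}
&\Big[ \frac{1}{\tilde{A}^*_n} - E_*\Big( \frac{1}{\tilde{A}^*_n} \Big) \Big] - \Big[ \frac{1}{A^*_n} - E_*\Big( \frac{1}{A^*_n} \Big) \Big] \\
&\quad = \frac{1}{\frac{2\pi}{n} \sum_{j=1}^{N} \hat{f}_n(\lambda_j) \cdot \frac{2\pi}{n} \sum_{j=1}^{N} f(\lambda_j)}
\Bigg[ - \frac{2\pi}{n} \sum_{j=1}^{N} f(\lambda_j) \Big\{ \frac{2\pi}{n} \sum_{k=1}^{N} (\varepsilon^*_k - 1) \big( \hat{f}_n(\lambda_k) - f(\lambda_k) \big) \Big\} \\
&\qquad\qquad + \frac{2\pi}{n} \sum_{j=1}^{N} f(\lambda_j) (\varepsilon^*_j - 1) \cdot \frac{2\pi}{n} \sum_{k=1}^{N} \big( \hat{f}_n(\lambda_k) - f(\lambda_k) \big) \Bigg].
\end{aligned}
$$

Since $\frac{2\pi}{n} \sum_{j=1}^{N} f(\lambda_j)$ is bounded, by Assumption A.4,

$$
\Big[ \frac{1}{\tilde{A}^*_n} - E_*\Big( \frac{1}{\tilde{A}^*_n} \Big) \Big] - \Big[ \frac{1}{A^*_n} - E_*\Big( \frac{1}{A^*_n} \Big) \Big] \xrightarrow{p} 0 \quad \text{as } n \to \infty.
$$

Hence,

$$
\frac{1}{A^*_n} \xrightarrow{p} E_*\Big( \frac{1}{A^*_n} \Big) = 1 \quad \text{as } n \to \infty.
$$

It is natural that $C_n = 0$, since $\frac{2\pi}{n} \sum_{j=1}^{N} \hat{F}_n(\lambda_j) = 1$ by the definition of $\hat{F}_n$. Next, we want to show that $B^*_n$ is $\sqrt{n}$-consistent in probability. Let $\tilde{B}^*_n$ be defined by

$$
\tilde{B}^*_n \equiv \frac{2\pi}{n} \sum_{j=1}^{N} \Big\{ \phi(\lambda_j) - \frac{2\pi}{n} \sum_{k=1}^{N} \phi(\lambda_k) F_1(\lambda_k) \Big\} F_1(\lambda_j) \big( \varepsilon^*_j - 1 \big).
$$

By Lemma 5, as $n \to \infty$,

$$
\sqrt{n}\, \tilde{B}^*_n \xrightarrow{d} N(0, V) \quad \text{in probability},
$$

where $V$ is the limit in

$$
\mathrm{Var}_*\big( \sqrt{n}\, \tilde{B}^*_n \big) = \frac{\pi^2}{N} \sum_{j=1}^{N} \Big\{ \phi(\lambda_j) - \frac{\pi}{N} \sum_{k=1}^{N} \phi(\lambda_k) F_1(\lambda_k) \Big\}^2 F_1^2(\lambda_j)\, \mathrm{Var}_*(\varepsilon^*_j)
\longrightarrow \pi \int_0^\pi \Big\{ \phi(x) - \int_0^\pi \phi(y)F(y)\,dy \Big\}^2 F^2(x)\,dx,
$$

with $\mathrm{Var}_*(\varepsilon^*_j) \xrightarrow{p} 1$ for $j = 1, \ldots, N$ by Lemma 4 and $|F_1(x) - F(x)| \to 0$ as $n \to \infty$. Next,

$$
\big| B^*_n - \tilde{B}^*_n \big| \equiv \Bigg| \frac{2\pi}{n} \sum_{j=1}^{N} (\varepsilon^*_j - 1)\, \Delta_n \Bigg|,
$$

where

$$
\Delta_n = \phi(\lambda_j) \big\{ \hat{F}_n(\lambda_j) - F_1(\lambda_j) \big\} + F_1(\lambda_j)\, \frac{2\pi}{n} \sum_{k=1}^{N} \phi(\lambda_k) \big\{ F_1(\lambda_k) - \hat{F}_n(\lambda_k) \big\} + \big\{ F_1(\lambda_j) - \hat{F}_n(\lambda_j) \big\} \frac{2\pi}{n} \sum_{k=1}^{N} \phi(\lambda_k) \hat{F}_n(\lambda_k).
$$

Note that by A.4, we have as $n \to \infty$,

$$
\big| \hat{F}_n(\lambda_j) - F_1(\lambda_j) \big| = \frac{\Big| f(\lambda_j) \Big( \frac{2\pi}{n} \sum_{k=1}^{N} \big( f(\lambda_k) - \hat{f}_n(\lambda_k) \big) \Big) + \big( \hat{f}_n(\lambda_j) - f(\lambda_j) \big) \Big( \frac{2\pi}{n} \sum_{k=1}^{N} f(\lambda_k) \Big) \Big|}{\Big| \frac{2\pi}{n} \sum_{k=1}^{N} \hat{f}_n(\lambda_k) \cdot \frac{2\pi}{n} \sum_{k=1}^{N} f(\lambda_k) \Big|} \xrightarrow{p} 0.
$$

Thus,

$$
\big| \hat{F}_n(\lambda_j) - F_1(\lambda_j) \big| \xrightarrow{p} 0 \quad \text{as } n \to \infty.
$$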

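Proof of Theorem (iii). Using Polya's theorem,

$$
\sup_{x \in \mathbb{R}} \Big| P_*\big( \sqrt{n}\, \hat{\ell}^*_n \le x \big) - \Phi(x, V) \Big| \xrightarrow{p} 0 \quad \text{as } n \to \infty,
$$

where $\Phi(x, V)$ denotes the distribution function of the $N(0, V)$ distribution. For $\hat{\ell}_n$,

$$
\sup_{x \in \mathbb{R}} \Big| P\big( \sqrt{n}\, \hat{\ell}_n \le x \big) - \Phi(x, V) \Big| \longrightarrow 0
$$

follows from Theorem (i), implying the consistency result

$$
\sup_{x \in \mathbb{R}} \Big| P_*\big( \sqrt{n}\, \hat{\ell}^*_n \le x \big) - P\big( \sqrt{n}\, \hat{\ell}_n \le x \big) \Big| \xrightarrow{p} 0 \quad \text{as } n \to \infty.
$$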
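The supremum distance appearing in this consistency statement is the Kolmogorov distance between a distribution function and $\Phi(\cdot, V)$. As a rough illustration, the sketch below evaluates that distance for the empirical distribution of a set of replicates. The replicates here are simulated normal draws rather than actual bootstrap output, and the helper names `normal_cdf` and `sup_distance_to_normal` are hypothetical.

```python
import math
import numpy as np

def normal_cdf(x, var):
    """Distribution function Phi(x, var) of N(0, var)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0 * var)))

def sup_distance_to_normal(samples, var):
    """sup_x |ECDF(x) - Phi(x, var)|, attained at the jump points of the ECDF."""
    s = np.sort(np.asarray(samples, dtype=float))
    m = s.size
    cdf = np.array([normal_cdf(v, var) for v in s])
    upper = np.arange(1, m + 1) / m - cdf   # ECDF value just after each jump
    lower = cdf - np.arange(0, m) / m       # ECDF value just before each jump
    return float(max(upper.max(), lower.max()))

rng = np.random.default_rng(3)
V = 2.0
replicates = rng.normal(0.0, math.sqrt(V), size=20000)  # stand-in for sqrt(n) * ell*_n
d_match = sup_distance_to_normal(replicates, V)          # small when the variance matches
d_wrong = sup_distance_to_normal(replicates, 4.0 * V)    # large when it does not
print(d_match, d_wrong)
```

In a simulation study, `replicates` would hold the bootstrap values of $\sqrt{n}\,\hat{\ell}^*_n$ and `V` the limiting variance from Theorem (i).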

A.2. Proofs of lemmas

We require some additional lemmas to assist with the proofs of the main results.

Lemma 1. Assume that A.1 holds. Then, as $n \to \infty$,
i. $Z_n \equiv \max_{1 \le j \le N} I_n(\lambda_j)/f(\lambda_j) = o_p(n^{1/2})$,
ii. $\tilde{Z}_n \equiv \max_{1 \le j \le N} I_n(\lambda_j)/\hat{f}_n(\lambda_j; h) = o_p(n^{1/2})$,
iii. $2\pi n^{-1} \sum_{j=1}^{N} I_n(\lambda_j)/\hat{f}_n(\lambda_j) \xrightarrow{p} \pi$,
iv. $2\pi n^{-1} \sum_{j=1}^{N} I_n^2(\lambda_j)/\hat{f}_n^2(\lambda_j) \xrightarrow{p} 2\pi$, and
v. $2\pi n^{-1} \sum_{j=1}^{N} \hat{f}_n(\lambda_j) \xrightarrow{p} \int_0^\pi f(x)\,dx = r(0)/2$.

Proof. For the proof of Lemma 1-i, select $\varepsilon > 0$. Since $f^{-1}(\lambda_j) \le C |\lambda_j|^{1-\theta}$ by A.1 and $E I_n^4(\lambda_j) \le C |\lambda_j|^{-4(1-\theta)}$ by Lemma 1-ii of Kim et al. (2018), by Markov's inequality, Jensen's inequality and A.1 we have

$$
P\Big( \frac{1}{n^{1/2}} Z_n > \varepsilon \Big)
\le \frac{1}{\varepsilon n^{1/2}} E\Bigg( \Big[ \max_{1 \le j \le N} \frac{I_n^4(\lambda_j)}{f^4(\lambda_j)} \Big]^{1/4} \Bigg)
\le \frac{1}{\varepsilon n^{1/2}} \Bigg[ E\Bigg( \sum_{j=1}^{N} \frac{I_n^4(\lambda_j)}{f^4(\lambda_j)} \Bigg) \Bigg]^{1/4}
\le \frac{1}{\varepsilon n^{1/2}} \Bigg[ \sum_{j=1}^{N} C |\lambda_j|^{-4(1-\theta) + 4(1-\theta)} \Bigg]^{1/4} = o(1).
$$

By Lemma 1-i and A.4, we have

$$
\big| \tilde{Z}_n \big| \le \Big| \max_{1 \le j \le N} \frac{I_n(\lambda_j)}{f(\lambda_j)} \Big| \cdot \Big| \max_{1 \le j \le N} \frac{f(\lambda_j)}{\hat{f}_n(\lambda_j)} \Big| = o_p\big( n^{1/2} \big).
$$

By Lemma 4 in Kim and Nordman (2013), we prove that as $n \to \infty$,

$$
\frac{2\pi}{n} \sum_{j=1}^{N} I_n(\lambda_j) \xrightarrow{p} \int_0^\pi f(x)\,dx < \infty, \quad \text{and} \quad \frac{2\pi}{n} \sum_{j=1}^{N} \frac{I_n(\lambda_j)}{f(\lambda_j)} \xrightarrow{p} \pi. \tag{9}
$$

For any $\lambda_j$, $j = 1, 2, \ldots, N$, A.4 and the formula (9) imply that

$$
\frac{2\pi}{n} \sum_{j=1}^{N} \frac{I_n(\lambda_j)}{f(\lambda_j)} \frac{f(\lambda_j)}{\hat{f}_n(\lambda_j)} \xrightarrow{p} \pi.
$$

Next, for the proof of Lemma 1-iv, by A.4 we have that as $n \to \infty$,

$$
o_p(1) = \Big( \frac{f(\lambda_j)}{\hat{f}_n(\lambda_j)} - 1 \Big)^2 = \Bigg\{ \Big( \frac{f(\lambda_j)}{\hat{f}_n(\lambda_j)} \Big)^2 - 1 \Bigg\} + 2 \Big( 1 - \frac{f(\lambda_j)}{\hat{f}_n(\lambda_j)} \Big). \tag{10}
$$

Thus, the formula (10) implies that $\big( f(\lambda_j)/\hat{f}_n(\lambda_j) \big)^2 \xrightarrow{p} 1$ as $n \to \infty$. Hence, Lemma 7 in Nordman and Lahiri (2006) implies that as $n \to \infty$,

$$
\frac{2\pi}{n} \sum_{j=1}^{N} \frac{I_n^2(\lambda_j)}{\hat{f}_n^2(\lambda_j)} = \frac{2\pi}{n} \sum_{j=1}^{N} \frac{I_n^2(\lambda_j)}{f^2(\lambda_j)} \frac{f^2(\lambda_j)}{\hat{f}_n^2(\lambda_j)} \xrightarrow{p} 2\pi.
$$

Finally, by the dominated convergence theorem, as $n \to \infty$, $\frac{2\pi}{n} \sum_{j=1}^{N} f(\lambda_j) \to \int_0^\pi f(x)\,dx = r(0)/2$, where $r(0) = \mathrm{Cov}(X_t, X_t)$. Likewise, we can show that as $n \to \infty$,

$$
\frac{2\pi}{n} \sum_{j=1}^{N} \hat{f}_n(\lambda_j) = \frac{2\pi}{n} \sum_{j=1}^{N} f(\lambda_j) \frac{\hat{f}_n(\lambda_j)}{f(\lambda_j)} \xrightarrow{p} \int_0^\pi f(x)\,dx = \frac{r(0)}{2}.
$$
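The limits in Lemma 1 can be illustrated with a small simulation. The sketch below is a toy check under assumptions that differ from the lemma's setting: the series is Gaussian white noise (short memory), and the known spectral density $f \equiv 1/(2\pi)$ is used in place of the estimate $\hat{f}_n$. In that case the ratios $I_n(\lambda_j)/f(\lambda_j)$ behave like independent standard exponential variables, which makes the constants $\pi$, $2\pi$ and $r(0)/2 = 1/2$ easy to see.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20001
N = (n - 1) // 2
x = rng.standard_normal(n)                       # white noise with r(0) = 1

# Periodogram at the Fourier frequencies lambda_j = 2*pi*j/n, j = 1..N
I = np.abs(np.fft.fft(x)[1:N + 1]) ** 2 / (2.0 * np.pi * n)
f = np.full(N, 1.0 / (2.0 * np.pi))              # known density, stand-in for f_hat
w = 2.0 * np.pi / n

z_scaled = np.max(I / f) / np.sqrt(n)            # Lemma 1-i: should be close to 0
s1 = w * np.sum(I / f)                           # analogue of Lemma 1-iii: near pi
s2 = w * np.sum((I / f) ** 2)                    # analogue of Lemma 1-iv: near 2*pi
s3 = w * np.sum(f)                               # analogue of Lemma 1-v: near 1/2

print(z_scaled, s1, s2, s3)
```

Repeating the experiment with a long-memory series and an estimated spectral density would be the more relevant check, but it requires an estimator that is outside the scope of this sketch.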

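Lemma 2. Suppose that A.5 holds. If $\int_\Pi \phi(x) f(x)\,dx = 0$, then

$$
P\Big( 0 \in \mathrm{ch}^{\circ} \big\{ \pi \phi(\lambda_j) I_n(\lambda_j) \big\}_{j=1}^{N} \Big) \to 1 \quad \text{as } n \to \infty.
$$

Proof. See Lemma 5 in Nordman and Lahiri (2006).

Lemma 3. Suppose that A.2 and A.5 hold. Let $T_n = 2\pi n^{-1} \sum_{j=1}^{N} \phi(\lambda_j) I_n(\lambda_j)$. Then,

$$
\sqrt{n} \Big( T_n - \int_0^\pi \phi(x) f(x)\,dx \Big) \xrightarrow{d} N(0, V_T) \quad \text{as } n \to \infty,
$$

where

$$
V_T = \pi \int_\Pi \phi^2(x) f^2(x)\,dx + \frac{\kappa_{4,\varepsilon}}{\sigma_\varepsilon^2} \Big( \int_\Pi \phi(x) f(x)\,dx \Big)^2,
$$

with the fourth-order cumulant of the white noise process, $\kappa_{4,\varepsilon}$. Additionally,

$$
\sqrt{n} \Bigg( \frac{2\pi}{n} \sum_{j=1}^{N} \phi(\lambda_j) \big[ I_n(\lambda_j) - f(\lambda_j) \big] \Bigg) \xrightarrow{d} N(0, V_T) \quad \text{as } n \to \infty.
$$

Proof. See Lemma 6 in Nordman and Lahiri (2006).

Lemma 4. Under the assumptions of the Theorem, as $n \to \infty$,

$$
\mathrm{Var}_*\big( \varepsilon^*_j \big) \xrightarrow{p} 1.
$$

Proof. Using $E_* \varepsilon^*_j = 1$, we expand $\mathrm{Var}_*(\varepsilon^*_{1n}) = E_*(\varepsilon^*_{1n})^2 - 1 = V_{2n} \cdot V_{1n} - 1$, where

$$
V_{1n} \equiv \Bigg( \frac{1}{N} \sum_{j=1}^{N} \frac{I_n(\lambda_j)}{\hat{f}_n(\lambda_j)} \Bigg)^{-2} \quad \text{and} \quad V_{2n} \equiv \frac{1}{N} \sum_{j=1}^{N} \frac{I_n^2(\lambda_j)}{\hat{f}_n^2(\lambda_j)}.
$$

By Lemma 1 and $n \sim 2N$ as $n \to \infty$, we have $\pi V_{2n} \xrightarrow{p} 2\pi$ and $\pi V_{1n}^{-1/2} \xrightarrow{p} \pi$, which imply $V_{2n} V_{1n} \xrightarrow{p} 2$ by the continuous mapping theorem.

Lemma 5. Under Assumptions A.2 and A.5, as $n \to \infty$,

$$
T^*_n \equiv \sqrt{n} \Bigg( \frac{2\pi}{n} \sum_{j=1}^{N} \phi(\lambda_j) f(\lambda_j) \big( \varepsilon^*_j - 1 \big) \Bigg) \xrightarrow{d} N(0, V_1) \quad \text{in probability},
$$

where $V_1 \equiv 2\pi \int_0^\pi \phi^2(x) f^2(x)\,dx$.

Proof. Note that by the property of the bootstrap innovation $\varepsilon^*_j$, with $E_* \varepsilon^*_j = 1$ for $j = 1, 2, \ldots, N$, we derive

$$
\mathrm{Var}_*\big( T^*_n \big) = \frac{(2\pi)^2}{n} \sum_{j=1}^{N} \phi^2(\lambda_j) f^2(\lambda_j)\, \mathrm{Var}_*\big( \varepsilon^*_j \big).
$$

By A.5 and the dominated convergence theorem, we have $\frac{4\pi^2}{n} \sum_{j=1}^{N} \phi^2(\lambda_j) f^2(\lambda_j) \to 2\pi \int_0^\pi \phi^2(x) f^2(x)\,dx \equiv V_1$ as $n \to \infty$. By Lemma 4 we also have

$$
W_n \equiv \mathrm{Var}_*(\varepsilon^*_1) \xrightarrow{p} 1, \tag{11}
$$

implying $V_{1n} \equiv \mathrm{Var}_*(T^*_n) \xrightarrow{p} V_1$. Additionally, by Lemma 1-ii, for $j = 1, 2, \ldots, N$,

$$
Y_n \equiv \max_{1 \le j \le N} U_j = o_p\big( n^{1/2} \big), \qquad S_n \equiv \frac{1}{N} \sum_{j=1}^{N} U_j \xrightarrow{p} 1, \qquad \text{for } U_j \equiv \frac{I_n(\lambda_j)}{\hat{f}_n(\lambda_j; h)}. \tag{12}
$$

Let $n_k$ be any subsequence of $\{n\}$. By the formulas (11) and (12), there exists a subsequence $\{n_j\} \subset \{n_k\}$, such that $V_{1 n_j} \to V_1$, $S_{n_j} \to 1$, $Y_{n_j}/n_j^{1/2} \to 0$ and $W_{n_j} \to 1$ as $n \to \infty$. Select $\varepsilon > 0$. Define

$$
X^*_j \equiv \frac{2\pi}{n} f(\lambda_j) \big( \varepsilon^*_j - 1 \big), \quad \text{for } j = 1, 2, \ldots, N,
$$

$$
V^*_{2n} \equiv \mathrm{Var}_*\Bigg( \sum_{j=1}^{N} X^*_j \Bigg) = \frac{2\pi}{n} \sum_{j=1}^{N} f(\lambda_j)\, \mathrm{Var}_*\big( \varepsilon^*_j \big) = \frac{2\pi}{n} \sum_{j=1}^{N} f(\lambda_j),
$$

and

$$
G^*_{n,\varepsilon} \equiv \frac{1}{V^*_{2n}} \sum_{j=1}^{N_n} E_*\Big[ \big| X^*_j \big|^2\, I\Big( \big| X^*_j \big| > \varepsilon \sqrt{V^*_{2n}} \Big) \Big],
$$

where $N_n \equiv \lfloor (n-1)/2 \rfloor$ and we suppress the dependence of $X^*_j$ on $n$ in our notation here. We may bound

$$
\begin{aligned}
G^*_{n_j,\varepsilon} &\le \frac{C^2 (2\pi)^2}{n_j V^*_{2n}} \sum_{j=1}^{N_{n_j}} E_*\Big[ \big| \varepsilon^*_j - 1 \big|^2\, I\Big( 2\pi C \big| \varepsilon^*_j - 1 \big| > \sqrt{n_j}\, \varepsilon \sqrt{V^*_{2n}} \Big) \Big] \\
&\le \frac{C^2 (2\pi)^2}{n_j V^*_{2n}} \sum_{j=1}^{N_{n_j}} E_*\big[ \varepsilon^*_j - 1 \big]^2\, I\Bigg( \max_{1 \le k \le N} \Big| S_{n_j}^{-1} \frac{I_{n_j}(\lambda_k)}{\hat{f}_n(\lambda_k)} - 1 \Big| > \frac{\varepsilon \sqrt{n_j V^*_{2n}}}{2\pi C} \Bigg) \\
&\le \frac{C^2 (2\pi)^2}{n_j V^*_{2n}} N_{n_j} W_{n_j} \Bigg[ I\Bigg( \frac{Y_{n_j}}{\sqrt{n_j}} > S_{n_j} \frac{\varepsilon \sqrt{V^*_{2n}}}{4\pi C} \Bigg) + I\Bigg( 1 > \frac{\sqrt{n_j}\, \varepsilon \sqrt{V^*_{2n}}}{4\pi C} \Bigg) \Bigg],
\end{aligned}
$$

using the fact that $\varepsilon^*_j$ assumes values among $S_{n_j}^{-1} I_{n_j}(\lambda_k)/\hat{f}_n(\lambda_k)$ for $k = 1, 2, \ldots, N$. As $n_j \to \infty$, by the formula (12), $V_{1 n_j} \to V_1 > 0$, $S_{n_j} \to 1$ and $Y_{n_j}/\sqrt{n_j} \to 0$ almost surely (a.s.), which implies

$$
I\Bigg( \frac{Y_{n_j}}{\sqrt{n_j}} > S_{n_j} \frac{\varepsilon \sqrt{V^*_{2n}}}{4\pi C} \Bigg) \to 0 \quad \text{and} \quad I\Bigg( 1 > \frac{\sqrt{n_j}\, \varepsilon \sqrt{V^*_{2n}}}{4\pi C} \Bigg) \to 0 \quad \text{a.s. as } n \to \infty.
$$

Since additionally, by the formula (12), we have $\frac{W_{n_j} N_{n_j}}{n_j V^*_{2n}} \to \frac{1}{2} \frac{1}{V_1}$ as $n \to \infty$, it follows that $G^*_{n,\varepsilon} \to 0$ a.s. In fact, since $\varepsilon > 0$ is arbitrary, it holds that $P\big( \lim_{n_j \to \infty} G^*_{n,\varepsilon} = 0 \text{ for } \varepsilon > 0 \big) = 1$. As a consequence of this almost sure convergence of $G^*_{n,\varepsilon}$ for $\varepsilon$, it holds that $T^*_{n_j} \xrightarrow{d} N(0, V_1)$ a.s. as $n_j \to \infty$, by the Lindeberg–Feller central limit theorem. In other words,

$$
P\Big[ \lim_{n_j \to \infty} P_*\big( T^*_{n_j} \le x \big) = \Phi(x, V_1), \ \forall x \in \mathbb{R} \Big] = 1,
$$

where $\Phi(x, V_1)$ is the cumulative distribution function of a $N(0, V_1)$ distribution, or, by Polya's theorem,

$$
P\Big[ \lim_{n_j \to \infty} \Big( \sup_{x \in \mathbb{R}} \big| P_*\big( T^*_{n_j} \le x \big) - \Phi(x, V_1) \big| \Big) = 0 \Big] = 1.
$$

Since the original subsequence $n_k$ was arbitrary, we have

$$
T^*_n \xrightarrow{d} N(0, V_1) \quad \text{in probability as } n \to \infty.
$$

In other words, for any $\varepsilon > 0$, $P\big( \sup_{x \in \mathbb{R}} \big| P_*( T^*_n \le x ) - \Phi(x, V_1) \big| > \varepsilon \big) \to 0$ as $n \to \infty$.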
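Lemmas 4 and 5 concern the bootstrap variance of the rescaled residuals and of the weighted sum $T^*_n$. The sketch below illustrates both in a toy setting that is an assumption of this example rather than the paper's setup: Gaussian white noise, the known density $f \equiv 1/(2\pi)$ standing in for $\hat{f}_n$, and $\phi(x) = \cos x$, for which $V_1 = 2\pi \int_0^\pi \phi^2 f^2\,dx = 1/4$. It compares the closed-form bootstrap variance of $T^*_n$ with a Monte Carlo estimate from i.i.d. resampling.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4001
N = (n - 1) // 2
lam = 2.0 * np.pi * np.arange(1, N + 1) / n
x = rng.standard_normal(n)
I = np.abs(np.fft.fft(x)[1:N + 1]) ** 2 / (2.0 * np.pi * n)
f = np.full(N, 1.0 / (2.0 * np.pi))       # known density, stand-in for f_hat

U = I / f
eps_hat = U / U.mean()                    # values taken by the bootstrap innovations
V1n = U.mean() ** (-2)
V2n = np.mean(U ** 2)
var_eps = V2n * V1n - 1.0                 # Var*(eps*) as expanded in Lemma 4

phi = np.cos(lam)
# Closed-form bootstrap variance of T*_n from the proof of Lemma 5
var_T_formula = (2.0 * np.pi) ** 2 / n * np.sum(phi ** 2 * f ** 2) * var_eps

# Monte Carlo counterpart: resample the rescaled residuals with replacement
n_boot = 2000
draws = rng.choice(eps_hat, size=(n_boot, N), replace=True)
T_star = np.sqrt(n) * (2.0 * np.pi / n) * ((draws - 1.0) @ (phi * f))
var_T_mc = T_star.var()

print(var_eps, var_T_formula, var_T_mc)
```

With these choices `var_eps` should be near 1 and both variance figures near 1/4, up to simulation error.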

Appendix B. Supplementary data

Supplementary material related to this article can be found online at https://doi.org/10.1016/j.jkss.2019.03.001.

References

Adenstedt, R. K. (1974). On large-sample estimation for the mean of a stationary sequence. The Annals of Statistics, 2, 1095–1107.
Beran, J. (1994). Statistical methods for long memory processes. London: Chapman & Hall.
Beran, J., Feng, Y., Ghosh, S., & Kulik, R. (2013). Long-memory processes: probabilistic properties and statistical methods. Berlin Heidelberg: Springer-Verlag.
Bhansali, R. J., Giraitis, L., & Kokoszka, P. S. (2006). Estimation of the memory parameter by fitting fractionally differenced autoregressive models. Journal of Multivariate Analysis, 94, 2101–2130.
Bühlmann, P. (1997). Sieve bootstrap for time series. Bernoulli, 3, 123–148.
Carlstein, E. (1986). The use of subseries methods for estimating the variance of a general statistic from a stationary time series. The Annals of Statistics, 14, 1171–1179.
Dahlhaus, R. (1983). Spectral analysis with tapered data. Journal of Time Series Analysis, 4, 163–175.
Dahlhaus, R., & Janas, D. (1996). A frequency domain bootstrap for ratio statistics in time series analysis. The Annals of Statistics, 24, 1934–1963.
Efron, B. (1979). Bootstrap methods: another look at the jackknife. The Annals of Statistics, 7, 1–26.
Fox, R., & Taqqu, M. S. (1986). Large sample properties of parameter estimates for strongly dependent stationary gaussian time series. The Annals of Statistics, 14, 517–532.
Franke, J., & Härdle, W. (1992). On bootstrapping kernel spectral estimates. The Annals of Statistics, 20, 121–145.
Giraitis, L., & Surgailis, D. (1990). A central limit theorem for quadratic forms in strongly dependent linear variables and its application to asymptotical normality of Whittle's estimate. Probability Theory and Related Fields, 86, 87–104.
Granger, C. W. J., & Joyeux, R. (1980). An introduction to long-memory time series models and fractional differencing. Journal of Time Series Analysis, 1, 15–29.
Hannan, E. J. (1973). The asymptotic theory of linear time-series models. Journal of Applied Probability, 10, 130–145.
Hosking, J. R. M. (1981). Fractional differencing. Biometrika, 68, 165–176.
Hurvich, C. M., & Brodsky, J. (2001). Broadband semiparametric estimation of the memory parameter of a long-memory time series using fractional exponential models. Journal of Time Series Analysis, 22, 221–249.
Hyndman, R. J. (2015). Discussion of high-dimensional autocovariance matrices and optimal linear prediction. Electronic Journal of Statistics, 9, 792–796.
Jones, P. D., & Briffa, K. R. (1992). Global surface air temperature variations during the twentieth century, part 1. The Holocene, 2, 165–179.
Kapetanios, G., & Psaradakis, Z. (2006). Sieve bootstrap for strongly dependent stationary processes. In Working paper 552. Dept. of Economics, Queen Mary, University of London.
Kim, Y.-M., Lahiri, S. N., & Nordman, D. J. (2018). Nonparametric spectral density estimation under long-range dependence. Journal of Time Series Analysis, 39(3), 380–401. http://dx.doi.org/10.1111/jtsa.12284.
Kim, Y.-M., & Nordman, D. J. (2011). Properties of a block bootstrap under long-range dependence. Sankhya, 73(1), 79–109.
Kim, Y.-M., & Nordman, D. J. (2013). A frequency domain bootstrap on whittle estimation under long-range dependence. Journal of Multivariate Analysis, 115, 405–420.
Künsch, H. R. (1989). The jackknife and bootstrap for general stationary observations. The Annals of Statistics, 17, 1217–1261.
Lahiri, S. N. (1993). On the moving block bootstrap under long range dependence. Statistics & Probability Letters, 18, 405–413.
Mandelbrot, B. B., & Van Ness, J. W. (1968). Fractional brownian motions, fractional noises and applications. SIAM Review, 10, 422–437.
McMurry, T. L., & Politis, D. N. (2010). Banded and tapered estimates for autocovariance matrices and the linear process bootstrap. Journal of Time Series Analysis, 31(6), 471–482.
Moulines, E., & Soulier, P. (1999). Broadband log-periodogram regression of time series with long-range dependence. The Annals of Statistics, 27, 1415–1439.
Moulines, E., & Soulier, P. (2000). Data driven order selection for projection estimator of the spectral density of time series with long range dependence. Journal of Time Series Analysis, 21, 193–218.
Narukawa, M., & Matsuda, Y. (2011). Broadband semi-parametric estimation of long-memory time series by fractional exponential models. Journal of Time Series Analysis, 32, 175–193.
Nordgaard, A. (1992). Lecture notes in economic and mathematical systems: Vol. 376, Resampling stochastic processes using a bootstrap approach (pp. 181–185).
Nordman, D. J., & Lahiri, S. N. (2006). A frequency domain empirical likelihood for short- and long-range dependence. The Annals of Statistics, 34, 3019–3050.
Palma, W. (2007). Long-memory time series: theory and methods. New Jersey: Wiley.
Paparoditis, E. (2002). Frequency domain bootstrap for time series. In Empirical process techniques for dependent data (pp. 365–381). Springer.
Pollak, M., Croarkin, C., & Hagewood, C. (1993). Surveillance schemes with applications to mass calibration, Vol. 5158. U.S. Department of Commerce, National Institute of Standards and Technology.
Poskitt, D. S. (2008). Properties of the sieve bootstrap for fractionally integrated and non-invertible processes. Journal of Time Series Analysis, 29(2), 224–250.
Robinson, P. M. (1995a). Gaussian semiparametric estimation of long range dependence. The Annals of Statistics, 23, 1630–1661.
Singh, K. (1981). On the asymptotic accuracy of the Efron's bootstrap. The Annals of Statistics, 9, 1187–1195.

Further reading

Andrews, D. W. K., & Guggenberger, P. (2003). A bias-reduced log-periodogram regression estimator for the long-memory parameter. Econometrica, 71, 675–712.
Andrews, D. W. K., & Sun, Y. (2004). Adaptive local polynomial whittle estimation of long-range dependence. Econometrica, 72, 569–614.
Geweke, J., & Porter-Hudak, S. (1983). The estimation and application of long-memory time series models. Journal of Time Series Analysis, 4, 221–238.
Lahiri, S. N. (2003a). A necessary and sufficient condition for asymptotic independence of discrete fourier transforms under short- and long-range dependence. The Annals of Statistics, 31, 613–641.
Lahiri, S. N. (2003b). Resampling methods for dependent data. New York: Springer.
Robinson, P. M. (1995b). Log-periodogram regression of time series with long range dependence. The Annals of Statistics, 23, 1048–1072.
