ARTICLE IN PRESS
Physica A 375 (2007) 546–562 www.elsevier.com/locate/physa
Bootstrap approaches and confidence intervals for stationary and non-stationary long-range dependence processes Glaura C. Francoa,, Valderio A. Reisenb a
Departamento de Estatı´stica, UFMG, Av. Antoˆnio Carlos, 6627 - Belo Horizonte, MG, CEP 31270-901, Brazil b Department of Statistics, UFES, ES, Brazil Received 18 May 2006; received in revised form 19 July 2006 Available online 5 September 2006
Abstract This paper deals with different bootstrap approaches and bootstrap confidence intervals in the fractionally autoregressive moving average ðARFIMAðp; d; qÞÞ process [J. Hosking, Fractional differencing, Biometrika 68(1) (1981) 165–175] using parametric and semi-parametric estimation techniques for the memory parameter d. The bootstrap procedures considered are: the classical bootstrap in the residuals of the fitted model [B. Efron, R. Tibshirani, An Introduction to the Bootstrap, Chapman and Hall, New York, 1993], the bootstrap in the spectral density function [E. Paparoditis, D.N Politis, The local bootstrap for periodogram statistics. J. Time Ser. Anal. 20(2) (1999) 193–222], the bootstrap in the residuals resulting from the regression equation of the semi-parametric estimators [G.C Franco, V.A Reisen, Bootstrap techniques in semiparametric estimation methods for ARFIMA models: a comparison study, Comput. Statist. 19 (2004) 243–259] and the Sieve bootstrap [P. Bu¨hlmann, Sieve bootstrap for time series, Bernoulli 3 (1997) 123–148]. The performance of these procedures and confidence intervals for d in the stationary and non-stationary ranges are empirically obtained through Monte Carlo experiments. The bootstrap confidence intervals here proposed are alternative procedures with some accuracy to obtain confidence intervals for d. r 2006 Elsevier B.V. All rights reserved. Keywords: Semi-parametric and parametric procedures; Fractionally integrated ARMA process; Bootstrap
1. Introduction The ARFIMAðp; d; qÞ process has become one of the most popular tools to model series that present longrange dependence and it has been an interesting research topic in time series and correlated areas. The fractional parameter d governs the memory of the process. The long-memory property is characterized by the fact that the spectral density is unbounded in the neighborhood of zero frequency, i.e., the spectral behaves like f ðoÞCjoj2d , for o ! 0 and some positive constant C, and the autocorrelation function decays hyperbolically, rðkÞk2d1 , as k ! 1. For do0:5, the process is stationary and it is also long memory when d is positive. When d ¼ 0 or do0 the process is said to be short and intermediate memory, respectively. For a Corresponding author. Departamento de Estatı´ stica, UFMG, Av. Antoˆnio Carlos, 6627 - Belo Horizonte, MG, CEP 31270-901, Brazil. Tel.: +55 31 34995949; fax: +55 31 34995924. E-mail address:
[email protected] (G.C. Franco).
0378-4371/$ - see front matter r 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.physa.2006.08.027
ARTICLE IN PRESS G.C. Franco, V.A. Reisen / Physica A 375 (2007) 546–562
547
detailed review of these processes see, for example, Refs. [1,2]. The most interesting feature of the long-range process is when the memory parameter d 2 Rþ . Special attention has been given for d in ð0:5; 1:0Þ. In this range, the process is long-memory non-stationary and mean-reverting. Since the first published papers related to the long-range dependence processes (see, for example, Ref. [1] for a historical overview), one of the most interesting research topics focused on modelling time series with this characteristic is related to the estimation of the parameters of the ARFIMA processes. Although many researches have given their contributions to this topic, there are still questions that remain open. Since the nineties it has also been observed that the number of researchers involved in the area has been increasing significantly. To estimate the parameters, various estimators have been suggested and some of them have recently become very popular, such as the parametric method, based on the maximum likelihood [3] and semi-parametric ones, based on the regression equation using the periodogram function [4] and its modified versions such as the tapered periodogram [5], the smoothed periodogram [6,7], among others. The asymptotic properties of the estimators have also been an interesting research area and, recently, many papers have dedicated contributions to this topic. Simulation studies comparing different estimation procedures for long-memory processes can be found in Bisaglia and Gue´gan [8], Hurvich and Deo [9] and Reisen et al. [10]. The book of Doukham et al. [11] contains the global picture of long-range dependence processes in both theory and application. Some attention has also been given to applying the non-parametric bootstrap methodology to time series data. The most common alternative to perform the bootstrap in discrete time series is to resample the residuals of the fitted model, which are generally uncorrelated if there is not order misspecification [12]. Recently, different bootstrap procedures in time series data have been a main research focus in the area. Andersson and Gredenhoff [13] and references therein, de Peretti [14] and Grau-Carles [15] employ the idea of bootstrapping to construct tests for long memory processes. Andrews and Lieberman [16] prove some asymptotic properties of parametric bootstraps. Arteche and Orbe [17] and Franco and Reisen [18] present some different kinds of bootstraps and make comparisons. Unit root tests based on bootstrap procedures are the focus of Franco et al. [19] and some references therein. The use of moving blocks bootstrap in time series is also a very interesting research topic in the area and some authors (Lahiri [20] and Bu¨hlmann [21], among others) give some insights into the problem. However, the use of bootstrap procedures in the estimation and test of long memory ARFIMA processes is still in its infancy, and this is the main motivation of this paper. This work deals with different bootstrap procedures in the stationary and non-stationary ARFIMA processes. The bootstraps are used as an alternative methodology to obtain the empirical confidence intervals. Paparoditis and Politis [22] have suggested the bootstrap in the periodogram function and this method is adapted here to model series generated from the ARFIMA processes. The Sieve bootstrap proposed by Bu¨hlmann [23] and later considered in Chang and Park [24], Bisaglia and Procidano [25] and Alonso et al. [26], the bootstrap in the residuals of the regression equation used to calculate the semi-parametric estimators [18] and the popular bootstrap in the residual of the fitted model are also investigated in this paper, which is organized as follows: In Section 2 the ARFIMA model and the estimators of d are presented. Section 3 introduces the bootstrap procedures mentioned above. Section 4 deals with the simulation results and Section 5 concludes the work.
2. The model and the parameter estimators fX t g1 t¼1 is a fractionally integrated ARMA model if it satisfies fðBÞð1 BÞd X t ¼ yðBÞt ,
(1)
where fðBÞ and yðBÞ are the polynomials of order p and q, respectively, with all roots outside the unit circle and B is the back-shift operator. t is a white noise process normally distributed with zero mean and finite variance s2 and ð1 BÞd is the fractional differencing operator. The stationarity and invertibility conditions are satisfied when d 2 ð0:5; 0:5Þ. A more detailed description of ARFIMA models can be found in Hosking [27], Reisen [6] and most recently in Doukham et al. [11].
ARTICLE IN PRESS G.C. Franco, V.A. Reisen / Physica A 375 (2007) 546–562
548
For a stationary ARFIMA process, the spectral density of X t is f ðoÞ ¼ f u ðoÞð2 sin ðo=2ÞÞ2d ,
(2) d
where the function f u ðoÞ is the spectral density of the ARMAðp; qÞ process, ut ¼ ð1 BÞ X t . f ðoÞ satisfies the property f ðoÞjoj2d , as o ! 0. Let X t be a time series with length n. The usual P non-biased but inconsistent estimator of the spectral function is the periodogram function, IðoÞ ¼ ð2pnÞ1 j nt¼1 X t eiot j2 . There are a number of estimators for the memory parameter d found in the literature and they are mainly divided into parametric and semi-parametric groups. This work deals with the parametric estimator provided by Fox and Taqqu [3] and the semi-parametric ones proposed by Geweke and Porter-Hudak [4], the tapered periodogram estimator given in Hurvich and Ray [5] and the smoothed periodogram regression estimator by Reisen [6]. These procedures are briefly described as follows: 2.1. The parametric FT estimator This is a procedure due to Fox and Taqqu [3] for Gaussian long-memory processes and it is based on the periodogram and the spectral density functions. This estimator is obtained by using all harmonic frequencies and it is calculated by minimizing the approximate Gaussian log-likelihood Iðoj Þ 1 X LW ðhÞ ¼ ln f h ðoj Þ þ , (3) 2n j f h ðoj Þ 2 where Pf h is the spectral density, h ¼ ðd; f1 ; . . . ; fp ; y1 ; . . . ; yq ; s Þ denotes the vector of unknown parameters and j is sum over j ¼ 1; . . . ; n 1. Fox p and Taqqu [3] show that for strongly dependent Gaussian processes ^ , is consistent and ffiffinffiðd FT ^ dÞ converges in distribution to Nð0; 6=p2 Þ. the FT estimator, d FT
2.2. Semi-parametric estimators The semi-parametric estimators described below are obtained taking the logarithm of the spectral density (2), which approximates a regression equation, having the logarithm of the spectral density as the dependent variable and lnð2 sinðo=2ÞÞ2 as the independent variable f u ðoÞ 2 ln f ðoÞ ¼ ln f u ð0Þ d lnð2 sin ðo=2ÞÞ þ ln , (4) f u ð0Þ where f u ð:Þ is the function previously specified (see Eq. (2)). In each semi-parametric method, the spectral density is approximated by a proper estimator, that will be described when the procedure is presented. For all three semi-parametric methods the number of observations in the regression equation (the bandwidth) is a function of the sample size n, i.e., gðnÞ ¼ nZ with 0:0pZp1:0. The appropriated choice of Z has been discussed in many papers, see, for example, Reisen [6] and Robinson [7].
GPH estimator: The GPH estimator was initially proposed by Geweke and Porter-Hudak [4], and it has become one of the most popular fractional estimation methods. The estimate of d is the slope estimate obtained from the linear least squares regression equation (4) when the dependent variable is the periodogram function, ln IðoÞ. Geweke and Porter-Hudak [4] have demonstrated the asymptotic Gaussian distribution of this semi-parametric estimator when do0:0. Almost one decade later, under very mild conditions, Robinson [7] derived the asymptotic theory of a modified version of GPH method which holds asymptotic properties even for positive d. GPH estimator, d^ p has variance given by p2 , vðd^ p Þ ¼ PgðnÞ 6 i¼1 ðvi vÞ2 where vi ¼ lnf4 sin2 ðoi =2Þg.
(5)
ARTICLE IN PRESS G.C. Franco, V.A. Reisen / Physica A 375 (2007) 546–562
GPHT-Tapered estimator: Like GPH, GPHT estimator is based on log-periodogram regression. This method has been presented in Hurvich and Ray [5]. The method consists in obtaining the regression estimate based on the tapered data wt X t , for an adequate weighted function wt . Here, wt is the cosine-bell taper function. The periodogram function of the tapered series wt X t , which is used as the dependent variable in Eq. (4), is then given by
Iðoj Þ ¼
549
2p
1 P
2 X n1 w X expðio tÞ . t t j w2t t¼0
(6)
Velasco [28] has also studied the GPHT estimator. He has proved that, under Gaussian increments, GPHT estimate is consistent and asymptotically normal distributed for any d, including non-stationary and noninvertible processes. The smoothed log-periodogram regression estimator (SPR): Reisen [6] has suggested replacing, in the GPH estimator, the periodogram by a smoothed estimate of the spectral density. The author has considered the Parzen window generator. In this case, the estimate of the spectral density to be used in Eq. (4) as the dependent variable is given by the smoothed periodogram function, f sp ðoÞ, f sp ðoÞ ¼
n1 1 X lðkÞ^gðkÞ cos ðkoÞ, 2p k¼ðn1Þ
(7)
where lðkÞ is a weighting function, known as ‘‘lag window’’ and g^ ðkÞ is the sample autocovariance function. Different approaches for lðkÞ are suggested in the literature. Under Gaussian time series and some conditions, the estimator is asymptotically normal distributed (see Refs. [6,11, p. 263]). SPR estimator, d^ sp , has variance m2 vðd^sp Þ ¼ 0:53928 PgðnÞ , n i¼1 ðvi vÞ2
(8)
where m ¼ nb is the bandwidth in the lag Parzen Window and vi is defined as in GPH method. The parameter estimation for the ARFIMA model has also been extended for the non-stationary case, see for example, Robinson [7], Velasco [28,29], Santander et al. [30] and Lopes et al. [31], among others. 3. Bootstrap The basic idea of the bootstrap, as stated in Efron [32], is to replace the unknown distribution of a random variable by the empirical distribution of a random sample drawn from that distribution. This is usually done by generating a large number of resamples, based on the original sample, and computing the statistics of interest in each resample. However, in time series the data are generally not independent and some adaptations to perform the bootstrap are needed. The most common way of doing the bootstrap in time series is resampling the residuals of the fitted model [12], and its application to the ARFIMA model is straightforward. The disadvantage of this method is that it is completely model based, and serious errors can occur if the model is not correctly specified. Regarding this fact, new kinds of bootstrap are being proposed and the ARFIMA class is particularly favorable, as most of the estimation methods of the long memory parameter are done in the frequency domain. Thus, the bootstrap can be performed either in the time or frequency domain, enabling a variety of new bootstrap approaches. Some of the bootstrap procedures presented in this work have already been investigated in Franco and Reisen [18], but only for semi-parametric estimators of d. These procedures include the nonparametric bootstrap in the residuals of the fitted model (NP), the local bootstrap (LOC) and the bootstrap in the residuals of the regression equation (REG), the latter one being used only for semiparametric estimators of d.
ARTICLE IN PRESS G.C. Franco, V.A. Reisen / Physica A 375 (2007) 546–562
550
They will be compared in this work, jointly with the Sieve bootstrap (SIE). A brief description of these procedures follows: 3.1. Nonparametric bootstrap in the residuals of the fitted model (NP) Bootstrap in time series is usually done by performing a resampling in the residuals of the fitted model [12]. In the ARFIMA model, after properly estimating the parameters f, y and d, the estimated residuals, et , are calculated by 1 b b et ¼ b y ðBÞfðBÞð1 BÞd X t
which are supposed to be uncorrelated if the model is correctly specified. The nonparametric bootstrap consists of resampling these residuals with replacement, so that the bootstrap series X t can be constructed as db b 1 ðBÞð1 BÞb X t ¼ f yðBÞet ,
(9)
where et is the residual bootstrap series. Since no distribution is specified for the residuals, this approach is called nonparametric. 3.2. Sieve bootstrap (SIE) The Sieve bootstrap procedure (SIE) is summarized as follows (see, for example, Ref. [23] for more details): let fX t g1 t¼1 be a zero mean stationary process. Here, this procedure will be restricted to the case where X t is an autoregressive process of infinite order, that is, 1 X
fj X tj ¼ j ;
f0 ¼ 1,
j¼0
P1 2 where t is a sequence of uncorrelated variables with expectation Eðt Þ ¼ 0 and j¼0 fj o1. It will be assumed that the observations X 1 ; X 2 ; . . . ; X n are realizations of the process fX t g. The autoregressive coefficients f1 ; f2 ; . . . ; fn can be estimated based in this set of observations. Thus the estimates of fp ¼ ðf1 ; f2 ; . . . ; fn Þ0 can be calculated based on the Yule–Walker method, b ¼ bg , bp ¼ f G p p b p ¼ ½bgði jÞp , bgp ¼ ðbgð1Þ; bgð2Þ; . . . ; bgðpÞÞ0 and bgð:Þ is the sample autocovariance function. The next where G i;j¼1 step is to calculate the residuals bt;n ¼
pðnÞ X
b X tj ; f j;n
b ¼ 1; t ¼ p þ 1; . . . ; n. f 0;n
(10)
j¼0
In practical situations, instead of an ARð1Þ, an ARðpðnÞÞ model is considered. There exist various methodologies to choose the best value of pðnÞ. One of the alternatives is to choose the pðnÞ for the model that presents the smallest AIC (Akaike Criterium). Another possibility is to use pðnÞ ¼ Cnk , 0oko13 [25], letting pmax ¼ 10 log ðnÞ and then using the AIC to choose the optimum p. After obtaining the residuals, the bootstrap is performed in the usual way to obtain the series X t recursively. It should be noted that the stationary and invertible ARFIMA process has infinite MA representation, thus as ck C kd1 for some constant C which depends on the parameters of the process [27]. Then, P1k!1; r k¼0 k ck o1 for r ¼ ½v þ d, for any v40 and ½ denotes integer part. Hence, for the model above the assumptions A1 and A2 given in Bu¨hlmann [23] are satisfied. These assumptions are equivalent to assumptions (1) and (2) in Chang and Park [24].
ARTICLE IN PRESS G.C. Franco, V.A. Reisen / Physica A 375 (2007) 546–562
551
3.3. The local bootstrap (LOC) The local bootstrap is based on the asymptotic independence of the ordinates of the periodogram function. Assuming that the spectral density f ðoÞ is a smooth function of o, the periodogram replicates can be obtained locally, that is, sampling the respective frequencies that are in a small neighborhood of the frequency o of interest. One of the advantages of this method is that it does not require the knowledge of the order of the ARFIMA process to perform the bootstrap. The local bootstrap can be summarized as follows. (1) Select a resampling width ki , in which ki 2 @ and ki p½n=4. (2) Define i.i.d. discrete random variables S1 ; . . . ; S i that assume values in the set f0; 1; . . . ; ki g. (3) Each one of the 2ki þ 1 ordinates can be resampled with probability pki ;s ¼
1 . 2ki þ 1
(4) The bootstrap periodogram is defined by I ðoj Þ ¼ Iðoj þ sj Þ; j ¼ 1; . . . ; ½n=2, I ðoj Þ ¼ Iðoj Þ; oj o0, I ðoj Þ ¼ 0;
oj ¼ 0.
The asymptotic validity of the local bootstrap is given in Paparoditis and Politis [22]. In our simulation exercise, the local bootstrap is performed by considering the simplest case, as stated in Silva et al. [33] and the method is used in the fractional parameter estimation procedure, where an estimate of the spectral density is required. 3.4. Bootstrap in the residuals of the regression equation (REG) The REG procedure, like the local bootstrap (LOC), does not require the knowledge of the order of the ARFIMA process to be calculated. In this case, the bootstrap is performed in the residuals obtained from the approximated regression equation (4) which is used to calculate the GPH, GPHT and SPR estimators. The residuals obtained from the approximated regression equations of these estimators are asymptotically uncorrelated and homoscedastic, and the asymptotic error distributions depend on the estimator used ([6,7]). Using this, the bootstrap distributions of the GPH, GPHT or SPR estimates can be obtained as follows: (1) From an approximated regression equation of (4), calculate db by least squares and obtain the residuals ubj ; j ¼ 1; . . . ; gðnÞ. (2) Resample ubj with replacement to get ubi;j , i ¼ 1; 2; . . . ; nB , where nB is the number of bootstrap replications. (3) Use ubi;j to build the bootstrap distribution of log f i ðb oj Þ, where f ðb oÞ is an estimate of the spectral density f ðoÞ which is given in Section 2 for each semi-parametric estimator. b say db . (4) Using log f i ðb oj Þ, calculate the bootstrap regression to obtain the bootstrap distribution of d, The performance of the REG procedure has been investigated in Franco and Reisen [18] and the method presented better results when compared to other bootstrap techniques. The only disadvantage of this method is that it can only be applied to semi-parametric estimators of d that use the log periodogram regression. 4. Simulation results The bootstrap approaches described in the previous section are here empirically compared based on Monte Carlo experiments by paying special attention to the confidence interval for d. The non-bootstrap estimates, hereafter called Monte Carlo (MC), are also discussed with the aim of giving some insight into the parameter point estimation.
ARTICLE IN PRESS 552
G.C. Franco, V.A. Reisen / Physica A 375 (2007) 546–562
The data generated process (DGP) is the univariate stationary and non-stationary ARFIMAðp; d; qÞ model which is given by (1) where t , t ¼ 1; . . . ; n, are i.i.d sequences with Nð0; 1Þ distribution. The ARFIMA process was generated by using the method suggested in Hosking [34] for values of d in the stationary (d ¼ 0:0 and 0.3) and the non-stationary and non-mean-reverting (d ¼ 1:0 and 1.4) regions. The choices of d ¼ 0 and 1 have a special interest in the unit root theory. The short-memory parameter values were 0:7. The experiment was carried out based on 500 of each Monte Carlo and bootstrap replications and sample size of n ¼ 100 and 500. These are typical sample sizes in macroeconomic studies. Following the work by Reisen et al. [10], the choice of the bandwidth in the semi-parametric long-memory estimators (GPH, SPR and GPHT methods) was based on the type of the model used. The values of Z were 0.8 and 0.5 for the ARFIMAð0; d; 0Þ and the ARFIMAðp; d; qÞ models, respectively. The truncation point b in the lag Parzen window, for the smoothed periodogram function, was chosen equal to 0.9 (see Ref. [6] for more details). Confidence intervals for d were constructed using both the bootstrap and the asymptotic distribution of the estimators, and they were compared at the fixed nominal level of 95% of coverage probability. The bootstrap confidence interval approach was the percentile interval proposed by Efron and Tibshirani [12]. It consists of taking the a and ð1 aÞ percentiles of the bootstrap distribution of db and defining the percentile interval by h ðaÞ i ð1aÞ db ; db . It should be noted that in finite small sample size, the empirical distribution of the estimators of d does not have good approximation to the Gaussian distribution. This was extensively explored by Santander et al. [30] among others. Hence, to obtain the asymptotic confidence limits for the unknown parameter d, critical points were empirically calculated based on the asymptotic properties of the estimators. Estimation point results, sample mean and mean squared error (mse), are presented in Tables 1, 3 and 4 and confidence interval rates and lengths are in Tables 2, 5 and 6. The models, parameter values and sample lengths are specified in the tables (Tables 1–6). 4.1. Estimation and confidence intervals for fractionally integrated noise Table 1 displays the results when the DGP is an ARFIMAð0; d; 0Þ model. Looking at the non-bootstrap estimates, labelled as MC, the following conclusions can be drawn. When dp1:0, all estimators seem to be very competitive by not presenting significative biases which are, in general, negative and of order less then 101 , except for GPHT. The FT method outperforms the others with the smallest mse. This method is model dependent and it looses superiority when dealing with model misspecification, see for example Reisen et al. [35]. In the semi-parametric class, the smallest mse is given by the SPR method. It is not surprising since this estimator is based on a consistent estimator of the spectral density of the process. The semi-parametric GPHT overestimates the parameter and presents the largest mse, but the motivation of including this approach in the study clearly appears when d ¼ 1:4. The choice of the large bandwidth for the semi-parametric estimators ðZ ¼ 0:8Þ when dealing with an ARFIMAð0; d; 0Þ model may be justified as follows: The spectral density of a stationary ARFIMAð0; d; 0Þ is a decreasing monotonic function of the Fourier frequency. Its estimates, such as the periodogram and the smoothed periodogram, are able to capt this decay with small variation. In this situation the increasing of the bandwidth will allow the regression equation to have more periodogram or smoothed periodogram ordinates to obtain the OLS estimate and thus it will produce estimates of d with smaller variance. As an example to evidence this bandwidth property, simulated results for an ARFIMAð0; 0:3; 0Þ process with n ¼ 100 and the bandwidth Z ¼ 0:5 were also obtained. The mean and mse (values in brackets) estimates under 500 replications were 0.318 (0.089), 0.217 (0.056) and 0.335 (0.108) for GPH, SPR and GPHT methods, respectively (compare this example with the same model in Table 1 with Z ¼ 0:8). Studies related to the choice of the bandwidth in the semi-parametric estimators were also explored in the works by Reisen [6] and Reisen et al. [35] among others. Table 1 also reveals an interesting characteristic to estimate d in the non-stationary region. For the random walk process the mse values based on the FT, GPH and SPR methods are reduced. When d41, these estimators have an average close to one, no matter what the value of d is. This finding is very interesting for practical situations where a fractional estimate close to one
ARTICLE IN PRESS G.C. Franco, V.A. Reisen / Physica A 375 (2007) 546–562
553
Table 1 Mean and mse for the estimates of the ARFIMAð0; d; 0Þ (n ¼ 100 and 500) n
100
d
0.0
0.3
1.0
1.4
0.0
0.3
1.0
1.4
MC
0.005 (0.018) 0.004 (0.017) 0.005 (0.018) 0.005 (0.018) 0.011 (0.007)
0.299 (0.018) 0.302 (0.019) 0.299 (0.018) 0.300 (0.018) 0.259 (0.013)
0.998 (0.014) 0.984 (0.006) 0.998 (0.014) 0.997 (0.014) 0.922 (0.010)
1.070 (0.128) 1.021 (0.145) 1.070 (0.128) 1.069 (0.129) 0.976 (0.180)
0.000 (0.003) 0.000 (0.003) 0.000 (0.003) 0.001 (0.003) 0.002 (0.001)
0.303 (0.003) 0.304 (0.003) 0.303 (0.003) 0.302 (0.003) 0.289 (0.002)
1.001 (0.003) 0.996 (0.001) 1.001 (0.003) 1.001 (0.003) 0.972 (0.001)
1.046 (0.135) 1.012 (0.151) 1.046 (0.135) 1.045 (0.136) 1.001 (0.160)
0.025 (0.011) 0.045 (0.012) 0.025 (0.011) 0.021 (0.012) 0.033 (0.008)
0.275 (0.013) 0.250 (0.015) 0.274 (0.013) 0.280 (0.013) 0.236 (0.015)
0.995 (0.010) 0.990 (0.006) 0.995 (0.010) 1.006 (0.010) 0.896 (0.015)
1.119 (0.086) 1.073 (0.108) 1.119 (0.086) 1.128 (0.081) 0.954 (0.200)
0.008 (0.002) 0.015 (0.002) 0.008 (0.002) 0.007 (0.002) 0.009 (0.001)
0.292 (0.002) 0.285 (0.002) 0.292 (0.002) 0.294 (0.002) 0.281 (0.003)
1.004 (0.002) 1.006 (0.001) 1.004 (0.002) 1.007 (0.002) 0.968 (0.002)
1.075 (0.109) 1.046 (0.126) 1.075 (0.109) 1.077 (0.108) 0.999 (0.162)
0.001 (0.025) 0.000 (0.026) 0.001 (0.025) 0.001 (0.026) 0.021 (0.016)
0.321 (0.026) 0.338 (0.030) 0.321 (0.026) 0.317 (0.026) 0.286 (0.020)
1.203 (0.067) 1.337 (0.149) 1.203 (0.067) 1.172 (0.055) 1.100 (0.016)
1.752 (0.155) 2.002 (0.404) 1.752 (0.154) 1.700 (0.119) 1.154 (0.064)
0.003 (0.005) 0.003 (0.005) 0.003 (0.005) 0.003 (0.005) 0.008 (0.003)
0.307 (0.005) 0.312 (0.005) 0.307 (0.005) 0.307 (0.005) 0.294 (0.004)
1.043 (0.007) 1.076 (0.012) 1.043 (0.007) 1.038 (0.007) 1.024 (0.002)
1.472 (0.011) 1.541 (0.026) 1.472 (0.011) 1.464 (0.009) 1.095 (0.098)
0.012 (0.009) 0.022 (0.009) 0.016 (0.009) 0.021 (0.007)
0.293 (0.010) 0.286 (0.011) 0.286 (0.010) 0.268 (0.012)
0.982 (0.008) 0.971 (0.004) 0.991 (0.009) 0.930 (0.008)
1.060 (0.132) 1.007 (0.156) 1.073 (0.124) 0.958 (0.196)
0.002 (0.002) 0.004 (0.002) 0.003 (0.002) 0.004 (0.001)
0.299 (0.001) 0.298 (0.001) 0.298 (0.001) 0.300 (0.001)
0.996 (0.001) 0.992 (0.006) 1.002 (0.007) 0.983 (0.001)
1.053 (0.129) 1.015 (0.149) 1.062 (0.123) 0.999 (0.162)
G
NP
P
REG
H
LOC SIE MC
S
NP
P
REG
R
LOC SIE MC
G
NP
P
REG
H
LOC
T
SIE MC
F
NP
T
LOC SIE
500
Note: Numbers in brackets are the mse; in bold are the bootstrap closer to the Monte Carlo for each estimator.
does not mean that the data follows a random walk process. It is clear that the superiority of the GPHT has smaller bias and mse than the other methods. When dp1:0, the increasing of the sample size reduces substantially the bias and the mse of the estimates. For d41:0, this sample size property only holds true for GPHT method. Comparing the bootstrap procedures, it can be seen that they behave almost the same as the Monte Carlo estimates. The bootstrap approach that is closer, both in mean and mse, to the MC is the REG. The disadvantage of this procedure, as already noted, is that it can only be calculated for the regression type estimators (GPH, SPR and GPHT) and it is parameter estimate dependent. The second bootstrap approach closer to the MC results is the LOC, which has the advantage that it can be calculated for all estimators and it
ARTICLE IN PRESS G.C. Franco, V.A. Reisen / Physica A 375 (2007) 546–562
554
Table 2 Confidence interval rates and lengths for d in the ARFIMAð0; d; 0Þ, n ¼ 100 and 500 n
100
d
0.0
0.3
1.0
0.0
0.3
1.0
ASYMP
0.934 (0.516) 0.934 (0.508) 0.918 (0.486) 0.836 (0.388) 0.992 (0.507)
0.952 (0.544) 0.932 (0.511) 0.926 (0.485) 0.840 (0.384) 0.964 (0.510)
0.938 (0.473) 0.988 (0.435) 0.932 (0.447) 0.856 (0.353) 0.972 (0.508)
0.954 (0.219) 0.964 (0.231) 0.956 (0.229) 0.900 (0.180) 0.996 (0.231)
0.944 (0.233) 0.950 (0.233) 0.944 (0.229) 0.874 (0.182) 0.970 (0.231)
0.948 (0.206) 0.996 (0.214) 0.956 (0.209) 0.866 (0.162) 0.998 (0.231)
0.904 (0.384) 0.922 (0.402) 0.630 (0.202) 0.382 (0.116) 0.966 (0.403)
0.924 (0.459) 0.888 (0.415) 0.632 (0.202) 0.444 (0.124) 0.870 (0.408)
0.932 (0.402) 0.970 (0.366) 0.624 (0.171) 0.540 (0.147) 0.914 (0.426)
0.942 (0.184) 0.940 (0.187) 0.630 (0.091) 0.348 (0.045) 0.976 (0.187)
0.940 (0.186) 0.928 (0.190) 0.670 (0.091) 0.384 (0.048) 0.912 (0.188)
0.946 (0.175) 0.992 (0.167) 0.614 (0.077) 0.482 (0.056) 0.974 (0.190)
0.954 (0.585) 0.960 (0.630) 0.866 (0.476) 0.672 (0.331) 0.982 (0.623)
0.954 (0.668) 0.925 (0.623) 0.864 (0.478) 0.692 (0.341) 0.968 (0.623)
0.270 (0.611) 0.412 (0.606) 0.624 (0.508) 0.620 (0.483) 0.998 (0.630)
0.932 (0.274) 0.950 (0.293) 0.876 (0.226) 0.718 (0.158) 0.984 (0.291)
0.944 (0.279) 0.954 (0.292) 0.882 (0.227) 0.718 (0.161) 0.976 (0.292)
0.798 (0.288) 0.814 (0.279) 0.848 (0.230) 0.706 (0.180) 0.992 (0.291)
0.942 (0.375) 0.940 (0.376) 0.838 (0.291) 0.962 (0.380)
0.940 (0.403) 0.926 (0.377) 0.830 (0.295) 0.925 (0.387)
0.926 (0.343) 0.988 (0.335) 0.904 (0.349) 0.964 (0.383)
0.954 (0.155) 0.936 (0.146) 0.828 (0.114) 0.966 (0.146)
0.954 (0.142) 0.962 (0.147) 0.882 (0.117) 0.954 (0.152)
0.944 (0.135) 0.996 (0.129) 0.914 (0.122) 0.990 (0.148)
G
NP
P
REG
H
LOC SIE ASYMP
S
NP
P
REG
R
LOC SIE ASYMP
G
NP
P
REG
H
LOC
T
SIE ASYMP
F
NP
T
LOC SIE
500
Note: In bold are the coverage percents in a distance of 2.5% from the nominal level of 95%. Numbers in bracket are the length of the intervals.
is not parameter estimate dependent. The SIE bootstrap is the farthest from the MC results, but although it has a large bias (with respect to the MC and to the true value of d), its mse is in general the smallest (except for GPHT, d ¼ 1:4 and n ¼ 500). Figs. 1–4 present the empirical MC and bootstrap distributions of the estimates of d when n ¼ 100. These graphics give additional insights into the results presented in Table 1. The finite sample distribution of the methods described in the previous sections are now investigated by means of confidence intervals and the results are displayed in Table 2. As far as the authors know, the asymptotic theories of the methods here empirically investigated, have not been established yet for d41:0. Because of this and also the results in Table 1, the ARFIMAð0; d; 0Þ model with the fractional parameter in this range was not considered in this step of the experiment.
ARTICLE IN PRESS G.C. Franco, V.A. Reisen / Physica A 375 (2007) 546–562
555
Table 3 Mean and mse for the estimates of the ARFIMAð1; d; 0Þ (n ¼ 500, f ¼ 0:7 and þ0:7)
GPH
d
0.0
f
0:7
þ0:7
0:7
þ0:7
0:7
þ0:7
0:7
þ0:7
MC
0.007 (0.025) 0.008 (0.024) 0.006 (0.025) 0.004 (0.026) 0.004 (0.001) 0.033 (0.015) 0.065 (0.016) 0.033 (0.015) 0.028 (0.016) 0.036 (0.002) 0.002 (0.036) 0.002 (0.039) 0.002 (0.036) 0.001 (0.036) 0.014 (0.011) 0.003 (0.002) 0.007 (0.002) 0.005 (0.002) 0.006 (0.001)
0.072 (0.035) 0.150 (0.054) 0.072 (0.035) 0.076 (0.037) 0.068 (0.007) 0.039 (0.020) 0.081 (0.024) 0.039 (0.020) 0.046 (0.021) 0.035 (0.004) 0.059 (0.045) 0.142 (0.067) 0.060 (0.045) 0.058 (0.048) 0.040 (0.018) 0.014 (0.018) 0.053 (0.012) 0.035 (0.011) 0.028 (0.003)
0.294 (0.031) 0.296 (0.033) 0.295 (0.031) 0.293 (0.031) 0.149 (0.015) 0.246 (0.024) 0.205 (0.030) 0.246 (0.024) 0.255 (0.024) 0.114 (0.049) 0.317 (0.044) 0.340 (0.052) 0.317 (0.044) 0.312 (0.047) 0.231 (0.033) 0.295 (0.002) 0.292 (0.002) 0.293 (0.002) 0.293 (0.002)
0.379 (0.034) 0.457 (0.038) 0.378 (0.033) 0.379 (0.034) 0.295 (0.009) 0.337 (0.019) 0.374 (0.013) 0.337 (0.019) 0.349 (0.020) 0.258 (0.011) 0.394 (0.048) 0.505 (0.065) 0.394 (0.048) 0.393 (0.048) 0.331 (0.023) 0.334 (0.021) 0.367 (0.015) 0.342 (0.013) 0.328 (0.010)
1.002 (0.021) 0.978 (0.010) 1.002 (0.021) 1.001 (0.021) 0.893 (0.016) 0.987 (0.015) 0.961 (0.012) 0.987 (0.015) 1.006 (0.015) 0.847 (0.029) 1.186 (0.082) 1.344 (0.187) 1.186 (0.082) 1.157 (0.071) 1.054 (0.021) 1.040 (0.004) 1.058 (0.004) 1.046 (0.005) 1.022 (0.003)
1.032 (0.026) 1.018 (0.006) 1.032 (0.026) 1.035 (0.027) 0.933 (0.009) 1.023 (0.019) 1.019 (0.007) 1.023 (0.019) 1.045 (0.020) 0.887 (0.017) 1.261 (0.111) 1.483 (0.295) 1.260 (0.111) 1.235 (0.098) 1.084 (0.019) 1.080 (0.029) 1.039 (0.003) 1.088 (0.022) 1.006 (0.008)
1.085 (0.126) 1.023 (0.145) 1.085 (0.126) 1.083 (0.127) 0.964 (0.191) 1.138 (0.079) 1.079 (0.105) 1.138 (0.079) 1.157 (0.070) 0.922 (0.230) 1.712 (0.146) 2.030 (0.472) 1.712 (0.146) 1.671 (0.119) 1.202 (0.050) 1.084 (0.115) 1.021 (0.145) 1.095 (0.108) 0.998 (0.166)
1.083 (0.129) 1.030 (0.140) 1.083 (0.129) 1.080 (0.131) 0.974 (0.183) 1.144 (0.076) 1.091 (0.097) 1.144 (0.076) 1.162 (0.067) 0.933 (0.220) 1.787 (0.200) 2.144 (0.620) 1.786 (0.200) 1.747 (0.168) 1.174 (0.067) 1.065 (0.131) 1.011 (0.152) 1.067 (0.168) 0.975 (0.184)
NP REB LOC SIE SPR
MC NP REG LOC SIE
GPHT
MC NP REG LOC SIE
FT
MC NP LOC SIE
0.3
1.0
1.4
Note: Numbers in brackets are the mse; in bold are the bootstrap closer to the Monte Carlo for each estimator.
Table 2 reveals that, in general, apart from the GPHT asymptotic interval, the coverage rates of the asymptotic confidence intervals (ASYMP) of the remaining estimators are close to the nominal value of 95%. It appears that the coverage rates do not improve by increasing the sample size from 100 to 500, although the length intervals are reduced significantly. The good coverage rates of GPH, SPR and FT, in a random walk process, are indications that the asymptotic distributions of these estimators may be used as a unit root alternative test. This positive empirical evidence is in accordance to many papers that have recently dealt with this research topic, see for example Santander et al. [30] and the references therein. The GPH is the estimator which presents more values close to the 95% nominal level. For this estimator, all bootstrap procedures, except the LOC method, present good coverage percentages, with the REG procedure showing slightly smaller lengths. For the SPR and FT estimators, the asymptotic intervals are better than the bootstraps, but the NP and SIE procedures are good choices in some cases, especially when n increases. For
ARTICLE IN PRESS G.C. Franco, V.A. Reisen / Physica A 375 (2007) 546–562
556
Table 4 Mean and mse for the estimates of the ARFIMAð0; d; 1Þ model, (n ¼ 500, y ¼ 0:7 and þ0:7)
GPH
d
0.0
y
0:7
þ0:7
0:7
þ0:7
0:7
þ0:7
0:7
þ0:7
MC
0.007 (0.030) 0.011 (0.030) 0.007 (0.030) 0.005 (0.030) 0.000 (0.003)
0.073 (0.034) 0.145 (0.046) 0.074 (0.034) 0.078 (0.036) 0.044 (0.003)
0.319 (0.029) 0.328 (0.031) 0.319 (0.029) 0.319 (0.029) 0.213 (0.020)
0.228 (0.035) 0.152 (0.035) 0.227 (0.035) 0.223 (0.036) 0.040 (0.078)
1.002 (0.024) 0.978 (0.010) 1.002 (0.024) 1.000 (0.025) 0.886 (0.020)
0.959 (0.026) 0.918 (0.018) 0.960 (0.026) 0.960 (0.026) 0.818 (0.046)
1.075 (0.124) 1.026 (0.142) 1.075 (0.124) 1.073 (0.126) 0.971 (0.185)
1.075 (0.129) 1.019 (0.149) 1.075 (0.129) 1.073 (0.129) 0.950 (0.205)
0.027 (0.017) 0.053 (0.018) 0.027 (0.017) 0.022 (0.018) 0.032 (0.004)
0.112 (0.030) 0.209 (0.057) 0.112 (0.030) 0.110 (0.030) 0.076 (0.007)
0.269 (0.019) 0.233 (0.022) 0.269 (0.019) 0.278 (0.019) 0.177 (0.027)
0.182 (0.033) 0.090 (0.054) 0.182 (0.033) 0.187 (0.032) 0.006 (0.096)
0.985 (0.018) 0.966 (0.013) 0.985 (0.018) 1.004 (0.018) 0.840 (0.033)
0.928 (0.024) 0.879 (0.028) 0.928 (0.024) 0.945 (0.022) 0.772 (0.065)
1.140 (0.077) 1.086 (0.100) 1.140 (0.077) 1.159 (0.068) 0.929 (0.223)
1.130 (0.083) 1.069 (0.112) 1.130 (0.083) 1.148 (0.073) 0.907 (0.245)
0.022 (0.042) 0.029 (0.046) 0.022 (0.042) 0.019 (0.043) 0.014 (0.017)
0.078 (0.052) 0.159 (0.074) 0.079 (0.052) 0.081 (0.054) 0.081 (0.020)
0.335 (0.042) 0.364 (0.053) 0.334 (0.042) 0.330 (0.043) 0.258 (0.027)
0.240 (0.052) 0.154 (0.043) 0.240 (0.052) 0.232 (0.054) 0.138 (0.057)
1.182 (0.081) 1.356 (0.198) 1.182 (0.082) 1.151 (0.070) 1.058 (0.022)
1.093 (0.047) 1.097 (0.032) 1.093 (0.047) 1.061 (0.042) 0.997 (0.021)
1.731 (0.155) 1.897 (0.291) 1.731 (0.155) 1.687 (0.126) 1.196 (0.054)
1.642 (0.105) 1.692 (0.128) 1.643 (0.106) 1.598 (0.084) 1.220 (0.044)
0.001 (0.002) 0.001 (0.002) 0.000 (0.002) 0.001 (0.002)
0.054 (0.025) 0.099 (0.021) 0.076 (0.020) 0.033 (0.004)
0.297 (0.002) 0.296 (0.002) 0.296 (0.002) 0.298 (0.002)
0.261 (0.025) 0.210 (0.020) 0.231 (0.020) 0.078 (0.059)
1.222 (0.243) 1.159 (0.047) 1.160 (0.079) 1.053 (0.009)
0.968 (0.020) 0.942 (0.011) 0.955 (0.013) 0.898 (0.017)
1.362 (0.087) 1.220 (0.056) 1.337 (0.043) 1.086 (0.104)
1.180 (0.106) 1.018 (0.096) 1.198 (0.062) 0.999 (0.162)
NP REG LOC SIE SPR
MC NP REG LOC SIE
GPHT
MC NP REG LOC SIE
FT
MC NP LOC SIE
0.3
1.0
1.4
Note: Numbers in brackets are the mse; in bold are the bootstrap closer to the Monte Carlo for each estimator.
GPHT, the ASYMP, NP and SIE intervals are the ones that present better behavior, but only for the stationary case ðd ¼ 0:3Þ. These results suggest that further research is necessary in the direction of the use of bootstrap procedures to approximate the distribution of some statistics of long memory estimators. An idea is to use different bootstrap confidence intervals, as suggested in Efron and Tibshirani [12]. 4.2. Estimation and confidence intervals for fractionally integrated ARMA model The estimation results, when the DGP comes from the ARFIMAð1; d; 0Þ and the ARFIMAð0; d; 1Þ models, are presented in Tables 3 and 4, respectively, for n ¼ 500 and f ¼ y ¼ 0:7 (ARFIMA models with sample
ARTICLE IN PRESS G.C. Franco, V.A. Reisen / Physica A 375 (2007) 546–562
557
Table 5 Confidence interval rates and lengths for d in the ARFIMAð1; d; 0Þ; f ¼ 0:7, þ0:7; n ¼ 500 d
ASYMP G
NP
P
REG
H
LOC SIE ASYMP
S
NP
P
REG
R
LOC SIE ASYMP
G
NP
P
REG
H
LOC
T
SIE ASYMP
F
NP
T
LOC SIE
0.0
0.3
1.0
f ¼ 0:7
f ¼ þ0:7
f ¼ 0:7
f ¼ þ0:7
f ¼ 0:7
f ¼ þ0:7
0.928 (0.612) 0.958 (0.674) 0.938 (0.628) 0.864 (0.520) 0.998 (0.672)
0.903 (0.656) 0.907 (0.674) 0.897 (0.627) 0.780 (0.503) 1.000 (0.676)
0.942 (0.705) 0.926 (0.677) 0.886 (0.902) 0.816 (0.500) 0.996 (0.675)
0.872 (0.658) 0.946 (0.681) 0.902 (0.619) 0.774 (0.501) 0.998 (0.674)
0.942 (0.580) 0.984 (0.630) 0.942 (0.574) 0.856 (0.451) 0.982 (0.675)
0.954 (0.652) 0.998 (0.564) 0.925 (0.556) 0.808 (0.446) 0.990 (0.671)
0.898 (0.479) 0.932 (0.512) 0.628 (0.222) 0.410 (0.134) 0.998 (0.512)
0.927 (0.523) 0.933 (0.522) 0.567 (0.229) 0.330 (0.146) 0.993 (0.516)
0.874 (0.576) 0.840 (0.532) 0.532 (0.227) 0.358 (0.145) 0.604 (0.518)
0.918 (0.508) 0.990 (0.542) 0.608 (0.235) 0.374 (0.160) 0.984 (0.520)
0.916 (0.465) 0.966 (0.522) 0.590 (0.288) 0.558 (0.199) 0.892 (0.559)
0.954 (0.541) 0.996 (0.378) 0.540 (0.212) 0.526 (0.215) 0.956 (0.556)
0.954 (0.741) 0.954 (0.822) 0.878 (0.611) 0.714 (0.457) 0.998 (0.821)
0.920 (0.801) 0.910 (0.823) 0.867 (0.632) 0.663 (0.471) 1.000 (0.825)
0.950 (0.834) 0.925 (0.816) 0.834 (0.605) 0.672 (0.458) 0.986 (0.822)
0.848 (0.770) 0.910 (0.816) 0.848 (0.611) 0.652 (0.460) 0.996 (0.824)
0.578 (0.834) 0.640 (0.834) 0.702 (0.639) 0.640 (0.545) 0.996 (0.821)
0.298 (0.814) 0.372 (0.818) 0.622 (0.633) 0.554 (0.543) 1.000 (0.821)
0.940 (0.167) 0.928 (0.163) 0.846 (0.128) 0.972 (0.985)
0.907 (0.522) 0.990 (0.573) 0.917 (0.399) 0.990 (0.509)
0.946 (0.171) 0.948 (0.164) 0.876 (0.458) 0.942 (0.172)
0.876 (0.572) 0.972 (0.573) 0.912 (0.421) 0.984 (0.552)
0.624 (0.193) 0.834 (0.181) 0.672 (0.145) 0.902 (0.178)
0.952 (0.692) 1.000 (0.334) 0.704 (0.376) 0.980 (0.490)
Note: In bold are the coverage percents in a distance of 2.5% from the nominal level of 95%. Numbers in bracket are the length of the intervals.
size equal to 100 and different short-memory parameter values presented similar behavior and the results are available upon request). Now, the bandwidth in the semi-parametric methods was reduced to n0:5 . Because of the shape of the spectral density of the full model, the use of smaller bandwidth in the regression equation of the semi-parametric estimators avoids those ordinates at the frequencies away from zero that can contaminate the OLS estimates. From Table 3 it is noticed that the magnitude of the bias and mse depend on the sign of the AR coefficient. Negative f does not make any large impact on the estimates. This is explained by the shape of the spectral density of the process. For the low frequencies, the spectral density is not contaminated by negative AR coefficients, i.e., it is only dominated by the parameter d. However, contrasting behavior of the estimates is
ARTICLE IN PRESS G.C. Franco, V.A. Reisen / Physica A 375 (2007) 546–562
558
Table 6 Confidence interval rates and lengths for d in the ARFIMAð0; d; 1Þ; y ¼ 0:7, þ0:7; n ¼ 500 d
ASYMP G
NP
P
REG
H
LOC SIE ASYMP
S
NP
P
REG
R
LOC SIE ASYMP
G
NP
P
REG
H
LOC
T
SIE ASYMP
F
NP
T
LOC SIE
0.0
0.3
1.0
y ¼ 0:7
y ¼ þ0:7
y ¼ 0:7
y ¼ þ0:7
y ¼ 0:7
y ¼ þ0:7
0.950 (0.683) 0.948 (0.674) 0.924 (0.625) 0.842 (0.503) 1.000 (0.675)
0.840 (0.640) 0.852 (0.683) 0.880 (0.639) 0.804 (0.515) 0.998 (0.674)
0.952 (0.666) 0.942 (0.678) 0.932 (0.637) 0.842 (0.507) 0.986 (0.676)
0.950 (0.618) 0.984 (0.610) 0.936 (0.572) 0.810 (0.461) 0.960 (0.675)
0.870 (0.692) 0.908 (0.675) 0.858 (0.617) 0.808 (0.496) 0.654 (0.673)
0.898 (0.619) 0.966 (0.640) 0.914 (0.582) 0.834 (0.475) 0.886 (0.674)
0.926 (0.486) 0.936 (0.512) 0.602 (0.222) 0.376 (0.134) 0.994 (0.514)
0.618 (0.509) 0.598 (0.509) 0.458 (0.225) 0.270 (0.130) 0.990 (0.510)
0.896 (0.513) 0.890 (0.535) 0.604 (0.229) 0.426 (0.150) 0.864 (0.519)
0.918 (0.523) 0.964 (0.506) 0.526 (0.210) 0.526 (0.204) 0.854 (0.557)
0.594 (0.556) 0.670 (0.527) 0.424 (0.221) 0.250 (0.133) 0.174 (0.513)
0.752 (0.512) 0.872 (0.548) 0.518 (0.208) 0.518 (0.193) 0.656 (0.554)
0.946 (0.796) 0.950 (0.821) 0.830 (0.604) 0.656 (0.449) 0.994 (0.821)
0.898 (0.899) 0.850 (0.822) 0.786 (0.621) 0.624 (0.447) 0.970 (0.821)
0.948 (0.815) 0.925 (0.817) 0.862 (0.598) 0.682 (0.454) 0.992 (0.819)
0.714 (0.902) 0.588 (0.818) 0.738 (0.630) 0.664 (0.553) 0.986 (0.821)
0.908 (0.878) 0.918 (0.820) 0.808 (0.619) 0.664 (0.472) 0.966 (0.822)
0.850 (0.767) 0.988 (0.832) 0.848 (0.624) 0.792 (0.542) 0.986 (0.819)
0.954 (0.176) 0.934 (0.163) 0.862 (0.129) 0.944 (0.164)
0.798 (0.570) 0.954 (0.591) 0.846 (0.424) 1.000 (0.488)
0.950 (0.156) 0.946 (0.163) 0.874 (0.130) 0.956 (0.169)
0.740 (1.451) 0.986 (1.040) 0.928 (0.631) 0.926 (0.760)
0.824 (0.573) 0.944 (0.596) 0.844 (0.432) 0.604 (0.495)
0.925 (0.540) 0.962 (0.535) 0.900 (0.439) 0.888 (0.458)
Note: In bold are the coverage percents in a distance of 2.5% from the nominal level of 95%. Numbers in bracket are the length of the intervals.
observed when f is positive. In this short-memory parameter region and the frequencies near to the origin, the spectral density of the process has also the contribution of the positive AR coefficient, i.e., the memory of the process is governed by d and f. This explains the increasing of the bias (positive bias) and mse of the estimates. As previously noted, when d ¼ 1:4 the GPHT is the only estimator able to approximate the true value of d. Positive f increases significantly the bias of the estimates. Concerning the bootstrap procedures, the same picture revealed in the ARFIMAð0; d; 0Þ model (Table 1) is noticed here. In general, all bootstrap procedures follow the same behavior of MC estimates, except the SIE bootstrap, that presents in general the largest bias, but the smallest mse (except in the SPR case). When dealing with an ARFIMAð0; d; 1Þ model (Table 4), the MC estimates behave similarly as in the previous model but in the reverse way. In general, positive y seems to produce the effect of decreasing the
ARTICLE IN PRESS G.C. Franco, V.A. Reisen / Physica A 375 (2007) 546–562
559
GPH Estimator 4
Summary MC NP REG LOC
Density
3
SIE
2 1 0 -0.2
0.0
0.2
0.4
0.6
Fig. 1. Empirical MC and Bootstrap distributions for GPH.
4
Legend MC NP REG LOC SIE
Density
3 2 1 0 -0.2
0.0
0.2
0.4
Estimates of d
Fig. 2. Empirical MC and Bootstrap distributions for SPR.
GPHT Estimator 3.5 3.0 Density
2.5 2.0
Summary MC NP REG LOC SIE
1.5 1.0 0.5 0.0 -0.2
0.0
0.2
0.4
0.6
0.8
Fig. 3. Empirical MC and Bootstrap distributions for GPHT.
value of the estimates for all methods when dp1. This is explained by the shape of the spectral density of the ARFIMAð0; d; 1Þ process. Looking at the spectral density function of an MA process, it can be observed that, for low frequencies, its values are very low and it gets closer to zero as y ! 1. This pushes down the power spectrum of the long-memory process and the estimates tend to underestimate the real parameter substantially. However, for negative y the spectral is dominated by low frequencies indicating that the series is positively correlated. Hence, this will certainly contribute to the increasing of the positive correlation of the full model, and consequently, the estimates will tend to overestimate the true parameter. Thus, positive y tends
ARTICLE IN PRESS G.C. Franco, V.A. Reisen / Physica A 375 (2007) 546–562
560 4
Density
3
Legend MC NP LOC SIE
2 1 0 0.0
0.1
0.2 0.3 Estimates of d
0.4
0.5
Fig. 4. Empirical MC and Bootstrap distributions for FT.
to produce estimates with large negative bias while, in general, estimates with positive bias are found when dealing with y ¼ 0:7. It can be seen that in general, the model dependent FT estimator presents the smallest mse. With respect to the bootstrap procedures, it can also be seen that all bootstrap estimates are also dependent of the short-memory parameter. For some estimation methods, sample sizes and parameter values, SIE method follows the MC estimates well, however in other cases the sieve bootstrap is far from the MC point estimates. The REG bootstraps always show similar patterns by following the behavior of the MC estimates. The sign of the MA coefficient appears to produce large impact in the NP bootstrap estimates. For the stationary region, the bias increases significantly when y ¼ 0:7. The coverage rates of the asymptotic and bootstrap confidence intervals are displayed in Tables 5 and 6 for both the ARFIMAð1; d; 0Þ and the ARFIMAð0; d; 1Þ models, respectively, and n ¼ 500. As expected, there is not only one pattern and the methods do not produce constant coverage rates. The introduction of a shortmemory parameter in the models reveals that the confidence lengths are, in general, larger than the case where there is not a short-memory parameter (Table 2). The coverage rates are, in general, in the range 60–99% and there is no single procedure more accurate than the other, i.e., there is not a confidence interval procedure that gives all confidence rates close to the nominal 95% level. Also from the tables, if the estimation method is fixed and one goes through the lines, it can be observed that the sieve and local bootstrap methods give, in general, the largest and smallest coverage rates, respectively (except for SPR). The GPH method and its bootstrap confidence intervals give the best coverage percentages.
5. Conclusions In this paper four types of bootstraps applied to four different estimators of the long memory parameter d in ARFIMA models were studied and compared, based on empirical Monte Carlo results. Besides, bootstrap confidence intervals were built and their performances were analyzed through the coverage percentages. The estimators of d considered were the parametric method based on the maximum likelihood (FT) and the semiparametric methods based on the regression equation using the periodogram function (GPH), the tapered periodogram (GPHT) and the smoothed periodogram (SPR). The bootstrap procedures considered were the classical bootstrap in the residuals of the fitted model (NP), the bootstrap in the spectral density function (LOC), the bootstrap in the residuals resulting from the regression equation of the semi-parametric estimators (REG) and the Sieve bootstrap (SIE). The empirical investigation involved long-memory ARFIMA processes for models in the stationary and non-stationary range with non mean-reverting properties. In general the point estimates obtained from bootstrap resampling showed the same behavior as the Monte Carlo estimators, with the REG approach being the closest one. The disadvantage of the REG procedure is that it can only be calculated for the regression type estimators (GPH, SPR and GPHT). The SIE bootstrap is the only procedure which presents means very far from the MC results and the true value of d, although its mse is in general the smallest one.
ARTICLE IN PRESS G.C. Franco, V.A. Reisen / Physica A 375 (2007) 546–562
561
The confidence intervals built from bootstrap procedures did not show only one pattern. In general, the bootstraps considered here did not improve significantly the finite sample distribution of the memory estimators. Therefore, further research is necessary in the direction of the use of bootstrap procedures to approximate the distribution of some statistics of long-memory estimators. An idea is to use different bootstrap confidence intervals and this is one of the actual research investigation of the authors. Some of the bootstrap procedures explored here have also been considered in Franco and Reisen [36] in the context of bias correction of the estimates of d. The empirical investigation indicated that there was not any improvement in the reduction of the bias. However, this is still one topic of the author’s actual research. Acknowledgements V.A. Reisen and G.C. Franco gratefully acknowledge CNPq/Brazil for partial financial support. V.A. Reisen thanks his undergraduate student Giovanni V. Comarela for his help in some simulation study presented in this work. The authors also thank the anonymous referee for his valuable comments to improve the manuscript. References [1] J. Beran, Statistics for Long Memory Process, Chapman and Hall, New York, 1994. [2] A. Banerjee, G. Urgo, Modelling structural breaks, long memory and stock marked volatility: an overview, J. Econometrics 129 (2005) 1–34. [3] R. Fox, M.S. Taqqu, Large-sample properties of parameters estimates for strongly dependent stationary Gaussian time series, Ann. Statist. 14 (1986) 517–532. [4] J. Geweke, S. Porter-Hudak, The estimation and application of long memory time series model, J. Time Ser. Anal. 4 (4) (1983) 221–238. [5] C.M. Hurvich, B.K. Ray, Estimation of the memory parameter for nonstationary or noninvertible fractionally integrated processes, J. Time Ser. Anal. 16 (1) (1995) 017–042. [6] V.A. Reisen, Estimation of the fractional difference parameter in the ARIMA(p,d,q) model using the smoothed periodogram, J. Time Ser. Anal. 15 (3) (1994) 335–350. [7] P.M. Robinson, Log-periodogram regression of time series with long range dependence, Ann. Statist. 23 (3) (1995) 1630–1661. [8] L. Bisaglia, D. Gue´gan, A comparison of techniques of estimation in long memory processes, Comput. Statist. Data Anal. 27 (1998) 61–81. [9] C.M. Hurvich, R.S. Deo, Plug-in selection of the number of frequencies in regression estimates of the memory parameter of a longmemory time series, J. Time Ser. Anal. 20 (3) (1999) 331–341. [10] V.A. Reisen, B. Abraham, E.M.M. Toscano, Parametric and semiparametric estimations of stationary univariate ARFIMA models, Brazilian J. Probab. Statist. 14 (2001) 185–206. [11] P. Doukham, G. Oppeenhein, M.S. Taqqu, Theory and Applications of Long-Range Dependence, Birkhauser, Basel, 2003. [12] B. Efron, R. Tibshirani, An Introduction to the Bootstrap, Chapman and Hall, New York, 1993. [13] M.K. Andersson, M.P. Gredenhoff, Improving fractional integration tests with bootstrap distributions, Working Paper, 74, National Institute of Economic Research, Stockholm, Sweden, 2002. [14] de C. Peretti, Unilateral and bilateral bootstrap tests for long memory, Comput. Econom. Finance (2002) 334. [15] P. Grau-Carles, Test for long memory processes. A bootstrap approach, Comput. Econom. Finance (2004) 111. [16] D.W.K. Andrews, O. Lieberman, Higher-order improvements of the parametric bootstrap for long-memory Gaussian processes. Cowles Foundation Discussion Papers, Number 1378, 2002. [17] J. Arteche, J. Orbe, Bootstrapping the log-periodogram regression, Econom. Lett. 86 (2005) 79–85. [18] G.C. Franco, V.A. Reisen, Bootstrap techniques in semiparametric estimation methods for ARFIMA models: a comparison study, Comput. Statist. 19 (2004) 243–259. [19] G.C. Franco, V.A. Reisen, P.A. Barros, Unit root tests using semi-parametric estimators of the long-memory parameter, J. Statist. Comput. Simulation 76 (8) (2006) 727–735. [20] S.N. Lahiri, On the moving block bootstrap under long range dependence, Statist. Probab. Lett. 18 (1993) 405–413. [21] P. Bu¨hlmann, Bootstrap for time series, Statistical Science 17 (1) (2002) 52–72. [22] E. Paparoditis, D.N. Politis, The local bootstrap for periodogram statistics, J. Time Ser. Anal. 20 (2) (1999) 193–222. [23] P. Bu¨hlmann, Sieve bootstrap for time series, Bernoulli 3 (1997) 123–148. [24] Y. Chang, J.Y. Park, A sieve bootstrap for the test of a unit root, J. Time Ser. Anal. 24 (2003) 370–400. [25] L. Bisaglia, I. Procidano, On the power of the augmented Dickey-Fuller test against fractional alternatives using bootstrap, Econom. Lett. 77 (2002) 343–347. [26] A. Alonso, D. Pen˜a, J. Romo, Sieve bootstrap prediction intervals, in: OMPSTAT’ 2000. Proceedings in Computational Statistics, 2000, pp. 181–186.
ARTICLE IN PRESS 562 [27] [28] [29] [30] [31] [32] [33] [34] [35] [36]
G.C. Franco, V.A. Reisen / Physica A 375 (2007) 546–562 J. Hosking, Fractional differencing, Biometrika 68 (1) (1981) 165–175. C. Velasco, Non-stationary log-periodogram regression, J. Econometrics 91 (1999) 325–371. C. Velasco, Gaussian semiparametric estimation of non-stationary time series, J. Time Ser. Anal. 20 (1) (1999) 87–127. L.A.M. Santander, V.A. Reisen, B. Abraham, Non-cointegration tests and a fractional ARFIMA process, Statist. Methods 5 (1) (2003) 1–22. S.R.C. Lopes, B.P.A. Olbermann, V.A. Reisen, Comparison of estimation methods in non-stationary ARFIMA process, J. Statist. Comput. Simulation 74 (5) (2004) 339–347. B. Efron, Bootstrap methods: another look at the Jackknife, Ann. Statist. 7 (1979) 1–26. E.M. Silva, G.C. Franco, V.A. Reisen, F.R.B. Cruz, Local bootstrap approaches for fractional differential parameter estimation in ARFIMA models, Comput. Statist. Data Anal., 2006, forthcoming. J. Hosking, Modelling persistence in hydrological time series using fractional differencing, Water Resour. Res. 20 (12) (1984) 1898–1908. V.A. Reisen, M.R. Sena Jr., S.R.C. Lopes, Error and Order Misspecification in ARFIMA Models, Brazilian Rev. Econometrics 21 (1) (2001) 62–79. G.C. Franco, V.A. Reisen, Bootstrap bias correction in semiparametric estimation methods for ARFIMA models, XXXV Simpo´sio Brasileiro de Pesquisa Operacional, 2003, pp. 726–736.