Kalman filter estimation for a regression model with locally stationary errors


Guillermo Ferreira, Alejandro Rodríguez, Bernardo Lagos
Department of Statistics, Universidad de Concepción, Chile

Article history: Received 6 September 2011; received in revised form 24 July 2012; accepted 6 January 2013; available online 21 January 2013.

Keywords: Estimation of the state; Long-range dependence; Local stationarity; Non-stationarity; State space system; Time-varying models

Abstract

In this paper, a methodology for estimating a regression model with locally stationary errors is proposed. In particular, we consider models that have two features: time-varying trends and errors belonging to a class of locally stationary processes. The proposed procedure provides an efficient methodology for estimating, predicting and handling missing values for non-stationary processes. We consider a truncated infinite-dimensional state space representation and, using the Kalman filter algorithm, we estimate the parameters of the model. As suggested by the Monte Carlo simulation studies, the performance of the Kalman filter approach is very good, even with small sample sizes. Finally, the proposed methodology is used in two real life applications.

1. Introduction

There exists a large variety of time series data that display both a local or global trend and correlated processes with short or long memory components. Data with such features arise in many fields, including economics, climatology and hydrology, among others. In order to utilize the well-known statistical techniques for stationary processes (e.g. ARMA models, methods based on the spectrum), various techniques are commonly used for trend removal, such as specialized transformations (differencing) of the data, regression models, or consideration of the data over small piecewise stationary time intervals. This methodology is applied even in situations where it is obvious that a non-stationary model is more adequate; an example of this is when time series data show evidence of instability in stochastic characteristics, such as time-varying trends or an evolutionary second order structure.

Methodologies for estimating the parameters of non-stationary processes that have a time-varying spectral representation have been studied by several authors: Silverman (1957), Priestley (1965) (also Priestley, 1981) and Dahlhaus (1996), among others. More recently, Dahlhaus (1997) has introduced a general class of locally stationary (LS hereafter) processes. In this approach, the parameters of the non-stationary model are allowed to vary smoothly over time so that these processes can be locally approximated by stationary processes. Furthermore, this author has provided a study of the asymptotic properties of short memory processes with time-varying parameters. The estimation of parameters for locally stationary long memory (LSLM) processes has been studied by Jensen and Witcher (2000), Beran (2009) and Palma and Olea (2010); see Palma et al. (2013) for a recent overview. In the context of a regression model with errors belonging to the class of LS processes, Palma (2010) has studied the asymptotic behavior of the sample mean, while Palma and Ferreira (2010) have analyzed the asymptotic properties of the least squares estimator (LSE) of a regression model with locally stationary errors.




However, with respect to the joint estimation of time-varying trends and errors with short or long memory, there has been almost no discussion. In this paper we discuss the estimation and forecasting of LS processes that have exogenous variables, which can be handled by state space models; see Harvey (1989) and Durbin and Koopman (2001). The state space framework proposed in this paper allows the fitting of non-stationary time series data displaying both trends and time-varying short or long memory errors and, moreover, it deals with missing values. Thus, exact and approximate maximum likelihood estimates (MLE) and one-step and multi-step predictors, along with their error bands, can be obtained by means of the Kalman recursive equations.

The article is organized as follows. In Section 2 we discuss a class of regression models with LS errors. In Section 3 we introduce state space models and their application to locally stationary processes; more specifically, we study the regression model with LS autoregressive moving average and LS fractional noise errors. The Kalman filter and its properties are also discussed. In particular, we find a finite-dimensional state space representation of the regression model, allowing estimation in a finite number of steps. In Section 4 we carry out a Monte Carlo simulation in order to assess the estimation performance of the proposed state space methodology. Section 5 includes applications to cave deposit data and tree-ring data. Final remarks are given in Section 6.

2. Regression model with LS errors

Consider a Gaussian regression model with locally stationary errors

$$Y_{t,T} = X\!\left(\tfrac{t}{T}\right)^{\top}\beta\!\left(\tfrac{t}{T}\right) + \int_{-\pi}^{\pi} A^{0}_{t,T}(\lambda)\, e^{i\lambda t}\, dB(\lambda), \qquad (1)$$

for $t = 1, \ldots, T$, where $X(\tfrac{t}{T}) = (x_{t,1}, \ldots, x_{t,k})^{\top}$ is a $k$-vector of non-stochastic regressors, $\beta(\tfrac{t}{T}) = (\beta_{t,1}, \ldots, \beta_{t,k})^{\top}$ is a vector of time-varying parameters, $A^{0}_{t,T}$ is a transfer function, $B(\lambda)$ is a Brownian motion on $[-\pi, \pi]$, and there exist a positive constant $K$ and a $2\pi$-periodic transfer function $A : (0,1] \times \mathbb{R} \to \mathbb{C}$ with $A(u, -\lambda) = \overline{A(u, \lambda)}$ such that $\sup_{t,\lambda} \left| A^{0}_{t,T}(\lambda) - A\!\left(\tfrac{t}{T}, \lambda\right) \right| \le K\,T^{-1}$ for all $T$. The transfer function $A^{0}_{t,T}(\lambda)$ of this class of non-stationary processes changes smoothly over time, hence it can be locally approximated by stationary processes. An example of this class of locally stationary processes is given by the infinite moving average expansion (LSMA($\infty$))

$$Y_{t,T} = X\!\left(\tfrac{t}{T}\right)^{\top}\beta\!\left(\tfrac{t}{T}\right) + \sigma\!\left(\tfrac{t}{T}\right)\sum_{j=0}^{\infty} \psi_j\!\left(\tfrac{t}{T}\right) z_{t-j}, \qquad (2)$$

where $\{\psi_j(u)\}$ are coefficients satisfying $\sum_{j=0}^{\infty} \psi_j(u)^2 < \infty$ for all $u \in [0,1]$ and $\{z_t\}$ is a zero-mean and unit-variance Gaussian white noise process. By utilizing the LSMA($\infty$) representation, regression models with LS short or long memory errors can be easily obtained.

2.1. Short memory

To illustrate the case of a short-memory process, we consider the regression model with LS autoregressive moving average errors, LSARMA($p,q$), defined as follows:

$$\varepsilon_{t,T} = \sum_{j=1}^{p} \phi_j\!\left(\tfrac{t}{T}\right)\varepsilon_{t-j,T} + \sum_{j=1}^{q} \theta_j\!\left(\tfrac{t}{T}\right) z_{t-j} + z_t,$$

for $t = 1, 2, \ldots, T$. Suppose that $p = q = 1$, i.e. $\varepsilon_{t,T} = \phi(\tfrac{t}{T})\varepsilon_{t-1,T} + \theta(\tfrac{t}{T}) z_{t-1} + z_t$; this process is a particular case of the regression model (2) with coefficients $\psi_j(u) = (\phi(u) - \theta(u))\,\phi^{j-1}(u)$ for $j \ge 1$ and $\psi_j(u) = 1$ for $j = 0$ in the LSMA($\infty$) process. The covariance structure for this model is

$$\kappa_T(s,t) = \sigma\!\left(\tfrac{s}{T}\right)\sigma\!\left(\tfrac{t}{T}\right)\left[\delta_{s,t}(A) + \left(\phi\!\left(\tfrac{s}{T}\right) - \theta\!\left(\tfrac{s}{T}\right)\right)\phi\!\left(\tfrac{s}{T}\right)^{s-t-1}\delta_{s,t}(A^{c}) + \frac{\left(\phi\!\left(\tfrac{s}{T}\right) - \theta\!\left(\tfrac{s}{T}\right)\right)\left(\phi\!\left(\tfrac{t}{T}\right) - \theta\!\left(\tfrac{t}{T}\right)\right)\phi\!\left(\tfrac{s}{T}\right)^{s-t}}{1 - \phi\!\left(\tfrac{s}{T}\right)\phi\!\left(\tfrac{t}{T}\right)}\right], \qquad (3)$$

where $\delta_x(A) = 1$ if $x \in A$ and $\delta_x(A) = 0$ if $x \notin A$, with $A = \{(s,t) : s - t = 0\}$. When $q = 0$ we have a LS autoregressive LSAR($p$) process, as defined in Dahlhaus (1997), $\varepsilon_{t,T} = \sum_{j=1}^{p} \phi_j(\tfrac{t}{T})\varepsilon_{t-j,T} + z_t$, for $t = 1, 2, \ldots, T$. If $p = 1$, then $\varepsilon_{t,T} = \phi(\tfrac{t}{T})\varepsilon_{t-1,T} + z_t$; this process is a particular case of the regression model (2) with coefficients $\psi_j(u) = \phi^{j}(u)$ in the LSMA($\infty$) process. Its covariance is given by

$$\kappa_T(s,t) = \sigma\!\left(\tfrac{s}{T}\right)\sigma\!\left(\tfrac{t}{T}\right)\frac{\phi\!\left(\tfrac{t}{T}\right)^{s-t}}{1 - \phi\!\left(\tfrac{t}{T}\right)\phi\!\left(\tfrac{s}{T}\right)}, \qquad (4)$$

for $s, t = 1, \ldots, T$, $s \ge t$. Using similar arguments we can handle regression models with LSMA($q$) errors defined by $\varepsilon_{t,T} = \sum_{j=0}^{q} \psi_j(\tfrac{t}{T}) z_{t-j}$.
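The time-varying covariance in (4) is easy to tabulate. Below is a small sketch (the function name and the parameter choices are ours, for illustration) that fills the full covariance matrix of an LSAR(1) process, extending (4) to $s < t$ by symmetry.

```python
import numpy as np

def lsar1_cov_matrix(T, phi, sigma):
    """Covariance matrix K with K[s-1, t-1] = kappa_T(s, t) from Eq. (4), extended by symmetry."""
    K = np.empty((T, T))
    for s in range(1, T + 1):
        for t in range(1, s + 1):                          # Eq. (4) is stated for s >= t
            k = (sigma(s / T) * sigma(t / T) * phi(t / T) ** (s - t)
                 / (1.0 - phi(t / T) * phi(s / T)))
            K[s - 1, t - 1] = K[t - 1, s - 1] = k
    return K

K = lsar1_cov_matrix(200, phi=lambda u: -0.3 + 0.8 * u, sigma=lambda u: 1.0 + 0.5 * u)
```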


2.2. Long memory

A particular case of the regression model (2) is the generalized version of the fractional noise process described by

$$Y_{t,T} = X\!\left(\tfrac{t}{T}\right)^{\top}\beta\!\left(\tfrac{t}{T}\right) + \sigma\!\left(\tfrac{t}{T}\right)\sum_{j=0}^{\infty} \eta_j\!\left(\tfrac{t}{T}\right) z_{t-j}, \qquad (5)$$

for $t = 1, \ldots, T$, with infinite moving average coefficients $\{\eta_j(u)\}$ given by

$$\eta_j(u) = \frac{\Gamma[j + d(u)]}{\Gamma[j+1]\,\Gamma[d(u)]}, \qquad (6)$$

where $\Gamma[\cdot]$ is the Gamma function, $d(u)$ is a long-memory parameter and $\sigma(u)$ is a noise scale factor. The errors of this regression model are referred to as locally stationary fractional noise (LSFN) processes. Note that, according to Lemma A.1 of Palma (2010), the covariance of a LSFN process $\{Y_{t,T}\}$ described by formula (5) is given by

$$\kappa_T(s,t) = \sigma\!\left(\tfrac{s}{T}\right)\sigma\!\left(\tfrac{t}{T}\right)\frac{\Gamma\!\left[1 - d\!\left(\tfrac{s}{T}\right) - d\!\left(\tfrac{t}{T}\right)\right]\Gamma\!\left[s - t + d\!\left(\tfrac{s}{T}\right)\right]}{\Gamma\!\left[1 - d\!\left(\tfrac{s}{T}\right)\right]\Gamma\!\left[d\!\left(\tfrac{s}{T}\right)\right]\Gamma\!\left[s - t + 1 - d\!\left(\tfrac{t}{T}\right)\right]}, \qquad (7)$$

for $s, t = 1, \ldots, T$, $s \ge t$.
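For numerical work with LSFN errors, the coefficients (6) are conveniently computed through the Gamma-function ratio recursion and the covariance (7) through log-Gamma values. The sketch below uses our own helper names and, for the example, the linear specifications of $d(u)$ and $\sigma(u)$ employed later in Section 4.2.

```python
import numpy as np
from math import lgamma, exp

def lsfn_coefficients(d, m):
    """eta_j(u) of Eq. (6) for a fixed u, j = 0..m, via eta_j = eta_{j-1} (j - 1 + d) / j."""
    eta = np.empty(m + 1)
    eta[0] = 1.0
    for j in range(1, m + 1):
        eta[j] = eta[j - 1] * (j - 1 + d) / j
    return eta

def lsfn_cov(s, t, d, sigma, T):
    """kappa_T(s, t) of Eq. (7) for s >= t, with callables d(u) and sigma(u)."""
    ds, dt = d(s / T), d(t / T)
    log_ratio = (lgamma(1 - ds - dt) + lgamma(s - t + ds)
                 - lgamma(1 - ds) - lgamma(ds) - lgamma(s - t + 1 - dt))
    return sigma(s / T) * sigma(t / T) * exp(log_ratio)

# Example with d(u) = 0.20 + 0.15 u and sigma(u) = 0.3 + 0.2 u
print(lsfn_coefficients(d=0.3, m=5))
print(lsfn_cov(10, 5, d=lambda u: 0.20 + 0.15 * u, sigma=lambda u: 0.3 + 0.2 * u, T=1024))
```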

A natural extension of the LSFN model is the locally stationary autoregressive fractionally integrated moving average (LSARFIMA) process, which is defined by

$$\Phi\!\left(\tfrac{t}{T}, B\right)\varepsilon_{t,T} = \Theta\!\left(\tfrac{t}{T}, B\right)(1 - B)^{-d(t/T)}\,\sigma\!\left(\tfrac{t}{T}\right) z_t,$$

for $t = 1, 2, \ldots, n$, where, for $u \in [0,1]$, $\Phi(u, B) = 1 - \phi_1(u)B - \cdots - \phi_p(u)B^{p}$ is an autoregressive polynomial and $\Theta(u, B) = 1 + \theta_1(u)B + \cdots + \theta_q(u)B^{q}$ is a moving average polynomial.

3. State space representation

In this section we introduce the state space representation and derive the Kalman filter algorithm for estimation and prediction of the LSMA($\infty$) process given in (2). Consider the non-stationary state space system

$$Y_{t,T} = \begin{pmatrix} G_{t,T} & X_{t,T} \end{pmatrix}\begin{pmatrix} \xi_{t,T} \\ \beta_{t,T} \end{pmatrix} + W_{t,T}, \qquad
\begin{pmatrix} \xi_{t+1,T} \\ \beta_{t+1,T} \end{pmatrix} = \begin{pmatrix} F_{t,T} & 0 \\ 0 & I_k \end{pmatrix}\begin{pmatrix} \xi_{t,T} \\ \beta_{t,T} \end{pmatrix} + \begin{pmatrix} H \\ 0 \end{pmatrix} V_{t,T}, \qquad (8)$$

where $\{Y_{t,T}\}$ are the observations, $G_{t,T} \in \mathcal{M}_{\mathbb{R}}(1 \times (m+1))$ are observation operators, $[\xi_{t,T}\ \beta_{t,T}]^{\top}$ is the state vector, $W_{t,T}$ is a white noise with variance $R_{t,T}$, $F_{t,T} \in \mathcal{M}_{\mathbb{R}}((m+1) \times (m+1))$ is a state transition operator, $I_r$ denotes the $r \times r$ identity matrix hereafter, $H$ is a linear operator and $V_{t,T}$ is a state noise with variance $Q_{t,T}$. Hence, given a regression model with errors belonging to a class of LSMA($\infty$) processes, it can be represented by a state space system by generalizing the infinite-dimensional equations given by Hannan and Deistler (1988) to the non-stationary case. The following proposition provides the necessary details.

Proposition 1. The non-stationary process specified in (2) can be represented by the following infinite-dimensional state space system

$$Y_{t,T} = \begin{pmatrix} \sigma\!\left(\tfrac{t}{T}\right)\left(1,\ \psi_1\!\left(\tfrac{t}{T}\right),\ \psi_2\!\left(\tfrac{t}{T}\right),\ \psi_3\!\left(\tfrac{t}{T}\right),\ \cdots\right) & X_{t,T} \end{pmatrix}\begin{pmatrix} \xi_{t,T} \\ \beta_{t,T} \end{pmatrix},$$
$$\begin{pmatrix} \xi_{t+1,T} \\ \beta_{t+1,T} \end{pmatrix} = \begin{pmatrix} F_{t,T} & 0 \\ 0 & I_k \end{pmatrix}\begin{pmatrix} \xi_{t,T} \\ \beta_{t,T} \end{pmatrix} + \begin{pmatrix} 1 & 0 & 0 & 0 & \cdots \end{pmatrix}^{\top} z_{t+1}, \qquad (9)$$

for $t = 1, \ldots, T$, where $F_{t,T} = \begin{pmatrix} 0 & 0 \\ I_{\infty} & 0 \end{pmatrix}$, $I_{\infty} = \mathrm{diag}\{1, 1, \ldots\}$ and $R_{t,T} = 0$.


Proof. Let $\xi_{t,T}(j)$ be the $j$-th component of the state vector. Then, from the state equation we obtain the recursion $\xi_{t,T}(j+1) = \xi_{t-1,T}(j) = z_{t-j}$. Additionally, from the observation equation we have that $Y_{t,T} = X_{t,T}\beta_{t,T} + \sigma(\tfrac{t}{T})\sum_{j=0}^{\infty}\psi_j(\tfrac{t}{T})\,\xi_{t,T}(j+1)$. Now, replacing $\xi_{t,T}(j+1)$ with $z_{t-j}$ proves the state space representation. Furthermore, as there is no observation noise, $R_{t,T} = 0$ for $t = 1, \ldots, T$. Finally, $Q_{t,T} = [1, 0, 0, \ldots]^{\prime}[1, 0, 0, \ldots]$, since we assume $\mathrm{E}(z_t^2) = 1$ for $t = 1, \ldots, T$. $\square$

An approximation of the model in (2) can be performed by truncating the sum after the first $m$ terms as follows:

$$Y_{t,T} = X\!\left(\tfrac{t}{T}\right)^{\top}\beta\!\left(\tfrac{t}{T}\right) + \sigma\!\left(\tfrac{t}{T}\right)\sum_{j=0}^{m} \psi_j\!\left(\tfrac{t}{T}\right) z_{t-j}, \qquad (10)$$

for $t = 1, \ldots, T$ and some positive integer $m$. The regression model given by (10) corresponds to the truncated version of the LSMA($\infty$); in particular, we have an exact expansion for a LSMA($p$) with $p \le m$, since in that case $\psi_j(u) = 0$ for $j > p$. An advantage of this approach is that we can express it in a finite state space framework; see Brockwell and Davis (1991). Consider the state space representation of the model (10) given by

$$Y_{t,T} = \begin{pmatrix} \sigma\!\left(\tfrac{t}{T}\right)\left(1,\ \psi_1\!\left(\tfrac{t}{T}\right),\ \psi_2\!\left(\tfrac{t}{T}\right),\ \cdots,\ \psi_m\!\left(\tfrac{t}{T}\right)\right) & X_{t,T} \end{pmatrix}\begin{pmatrix} \xi_{t,T} \\ \beta_{t,T} \end{pmatrix},$$
$$\begin{pmatrix} \xi_{t+1,T} \\ \beta_{t+1,T} \end{pmatrix} = \begin{pmatrix} F_{t,T} & 0 \\ 0 & I_k \end{pmatrix}\begin{pmatrix} \xi_{t,T} \\ \beta_{t,T} \end{pmatrix} + \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \end{pmatrix}^{\prime} z_{t+1}, \qquad (11)$$

for $t = 1, \ldots, T$, where $F_{t,T} = \begin{pmatrix} 0 & 0 \\ I_m & 0 \end{pmatrix}$.
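As an illustration of how the system matrices of (11) can be assembled in practice, the following sketch (the function and its argument conventions are ours, not the authors' code) builds the observation row, the transition matrix and the innovation loading for given coefficient functions $\psi_j(u)$, noise scale $\sigma(u)$ and a regressor vector at time $t$.

```python
import numpy as np

def truncated_system(u, psi, sigma, x, m):
    """System matrices of the truncated representation (11) at rescaled time u = t/T.

    Returns (G, F, H): observation row G = [sigma(u)(1, psi_1(u), ..., psi_m(u)), x],
    transition matrix F = diag(F_t, I_k) with F_t the (m+1)x(m+1) shift [[0,0],[I_m,0]],
    and H loading the new innovation z_{t+1} into the first state component.
    """
    k = len(x)
    G = np.concatenate([sigma(u) * np.array([psi(j, u) for j in range(m + 1)]),
                        np.asarray(x, dtype=float)])
    F_xi = np.zeros((m + 1, m + 1))
    F_xi[1:, :-1] = np.eye(m)                      # xi_{t+1}(j+1) = xi_t(j)
    F = np.block([[F_xi, np.zeros((m + 1, k))],
                  [np.zeros((k, m + 1)), np.eye(k)]])
    H = np.zeros(m + 1 + k)
    H[0] = 1.0                                     # xi_{t+1}(1) = z_{t+1}
    return G, F, H

# Example for LSAR(1)-type coefficients (psi_0(u) = 1) at u = 0.5 with two regressors
G, F, H = truncated_system(0.5, psi=lambda j, u: (-0.3 + 0.8 * u) ** j,
                           sigma=lambda u: 1.0, x=[1.0, 0.5], m=5)
```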





The following result establishes the asymptotic magnitude of the truncation error when approximating (2) by (10).

Proposition 2. The order of the error variance $r_m$ when approximating $\{Y_{t,T}\}$ by the finite moving average expansion defined by (10) is given by

$$r_m \sim \begin{cases} O(m^{2d-1}), & \text{for a long-memory process}, \\ O(\exp(-am)), & \text{for a short-memory process}, \end{cases}$$

for large $m$, where $a > 0$ and $d = \sup_u d(u) < 1/2$.

Proof. The error variance, $r_m$, of a truncated infinite LSMA($\infty$) process $\{Y_{t,T}\}$ is given by

$$r_m = \mathrm{Var}\!\left(\sum_{j=0}^{\infty}\psi_j\!\left(\tfrac{t}{T}\right) z_{t-j} - \sum_{j=0}^{m}\psi_j\!\left(\tfrac{t}{T}\right) z_{t-j}\right) = \mathrm{Var}\!\left(\sum_{k=m+1}^{\infty}\psi_k\!\left(\tfrac{t}{T}\right) z_{t-k}\right) = \sum_{k=m+1}^{\infty}\psi_k^{2}\!\left(\tfrac{t}{T}\right),$$

where the coefficients $\{\psi_k\}$ are approximated by $\psi_m \sim m^{-(1-d(u))}\,\Gamma[d(u)]^{-1}$ when $m \to \infty$ and $d$ reaches a maximum value at $u \in (0,1)$. Furthermore, using Lemma 3.3 of Palma (2007) we get

$$\sum_{k=m+1}^{\infty}\psi_k^{2}\!\left(\tfrac{t}{T}\right) \sim m^{2d-1}\,\Gamma[d]^{-1}(2d-1)^{-1}, \qquad (12)$$

obtaining that $r_m = O(m^{2d-1})$ when $m \to \infty$. The proof for the case of a short-memory process is similar to the previous case, using $\psi_m \sim \exp(-am)$ when $m \to \infty$. $\square$

Proposition 3. The state space representation of the process in (10) is minimal for both the short and long memory cases.

The proof of this result is provided in the Appendix.

3.1. Estimation of the state

We can use the Kalman filter equations for estimating the parameters of the model cast into the state space representation given in (9). Moreover, the filter allows us to recursively obtain predictions of future values of the series and deal with missing observations.


To this end, consider the following vectors and matrices: $G^{\dagger}_{t,T} = \begin{pmatrix} G_{t,T} & X_{t,T} \end{pmatrix}$, $\Psi_{t,T} = \begin{pmatrix} \xi_{t,T} & \beta_{t,T} \end{pmatrix}^{\top}$, $F^{\dagger}_{t,T} = \begin{pmatrix} F_{t,T} & 0 \\ 0 & I_k \end{pmatrix}$ and $P = \begin{pmatrix} H \\ 0 \end{pmatrix}$. Then, we can re-write the state space system (8) as follows:

$$Y_{t,T} = G^{\dagger}_{t,T}\Psi_{t,T} + W_{t,T},$$
$$\Psi_{t+1,T} = F^{\dagger}_{t,T}\Psi_{t,T} + P V_{t,T}. \qquad (13)$$

Let $\Delta_{t,T} = \mathrm{E}\!\left[(Y_{t,T} - \widehat{Y}_{t,T})(Y_{t,T} - \widehat{Y}_{t,T})^{\top}\right]$ be the prediction error variance and $\Omega_{t,T} = \mathrm{E}\!\left[(\Psi_{t,T} - \widehat{\Psi}_{t,T})(\Psi_{t,T} - \widehat{\Psi}_{t,T})^{\top}\right]$ be the state prediction error variance–covariance matrix, with $\widehat{\Psi}_{t,T} = \begin{pmatrix} \mathrm{E}(\xi_{t,T} \mid Y_{t-1,T}) & \widehat{\beta}_{t,T} \end{pmatrix}^{\top}$. Hence, the Kalman recursion equations are given in the following proposition.

Proposition 4. Suppose that the system in (13) is satisfied. Given the initial conditions $Y_{0,T} = (0, 0, \ldots)$, $\widehat{\Psi}_{1,T} = \begin{pmatrix} (0, 0, \ldots), (0, \ldots, 0) \end{pmatrix}^{\top}$ and $\Omega_{1,T} = \begin{pmatrix} I_{\infty} & 0 \\ 0 & 0_{k \times k} \end{pmatrix}$, the Kalman filter equations are

$$\Delta_{t,T} = G^{\dagger}_{t,T}\Omega_{t,T}(G^{\dagger}_{t,T})^{\top}, \qquad \Theta_{t,T} = F^{\dagger}_{t,T}\Omega_{t,T}(G^{\dagger}_{t,T})^{\top},$$
$$\Omega_{t+1,T} = \begin{cases} F^{\dagger}_{t,T}\Omega_{t,T}(F^{\dagger}_{t,T})^{\top} + Q_{t,T} - \Theta_{t,T}\Delta_{t,T}^{-1}\Theta_{t,T}^{\top}, & \text{if } Y_{t,T} \text{ is observed}, \\ F^{\dagger}_{t,T}\Omega_{t,T}(F^{\dagger}_{t,T})^{\top} + Q_{t,T}, & \text{if } Y_{t,T} \text{ is a missing value}, \end{cases}$$
$$\widehat{Y}_{t,T} = G^{\dagger}_{t,T}\widehat{\Psi}_{t,T},$$
$$\widehat{\Psi}_{t+1,T} = \begin{cases} F^{\dagger}_{t,T}\widehat{\Psi}_{t,T} + \Theta_{t,T}\Delta_{t,T}^{-1}(Y_{t,T} - \widehat{Y}_{t,T}), & \text{if } Y_{t,T} \text{ is observed}, \\ F^{\dagger}_{t,T}\widehat{\Psi}_{t,T}, & \text{if } Y_{t,T} \text{ is a missing value}. \end{cases} \qquad (14)$$

Proof. The result comes from the time-varying state space system given by Eqs. (12.2.6)–(12.2.7) of Brockwell and Davis (1991), and Eqs. (12.3.1)–(12.3.4) when considering missing values. $\square$

Finally, the parameters of the model, denoted by $\theta$, can be estimated by maximizing the log-likelihood function

$$L(\theta) = -\frac{1}{2}\left[\sum_{t=1}^{T}\log\Delta_{t,T}(\theta) + \sum_{t=1}^{T}\frac{(Y_{t,T} - \widehat{Y}_{t,T})^2}{\Delta_{t,T}(\theta)}\right]. \qquad (15)$$

Hence, the MLE is given by $\widehat{\theta} = \arg\max_{\theta} L(\theta)$. Note that (14) gives an exact method for estimating the parameters of the model. However, the Kalman filter in (14) can be applied straightforwardly to the truncated representation in (10), resulting, in this case, in an approximate MLE.

3.2. Prediction

The Kalman filter also provides the best $j$-step linear mean-square in-sample predictor $\widehat{Y}_{n+j,T} = \mathrm{E}(Y_{n+j,T} \mid Y_{n,T}, Y_{n-1,T}, \ldots, Y_{1,T})$, $j = 1, \ldots, T - n$; see Palma et al. (2013). It is given as follows:

$$\widehat{Y}_{n+j,T} = \mathrm{E}(Y_{n+j,T} \mid Y_{n,T}, Y_{n-1,T}, \ldots, Y_{1,T}) = X_{n+j,T}\widehat{\beta}_{n+j,T} + G_{n+j,T}\widehat{\xi}_{n+j,T}, \qquad (16)$$

for $j = 1, \ldots, T - n$, where $G_{n+j,T} = \left(1,\ \psi_1\!\left(\tfrac{n+j}{T}\right),\ \ldots,\ \psi_m\!\left(\tfrac{n+j}{T}\right)\right)$. From the Kalman equations, it can be observed that $\Omega_{n+j,T}$ satisfies the following recursions:

$$\Omega_{n+1,T} = F^{\dagger}\Omega_{n,T}(F^{\dagger})^{\top} + Q,$$
$$\Omega_{n+2,T} = (F^{\dagger})^{2}\Omega_{n,T}((F^{\dagger})^{\top})^{2} + F^{\dagger}Q(F^{\dagger})^{\top} + Q,$$
$$\vdots$$
$$\Omega_{n+j,T} = (F^{\dagger})^{j}\Omega_{n,T}((F^{\dagger})^{\top})^{j} + \sum_{k=0}^{j-1}(F^{\dagger})^{k}Q((F^{\dagger})^{\top})^{k}.$$

Furthermore, the prediction error $\Delta_{n+j,T}$ satisfies

$$\Delta_{n+j,T} = G_{n+j,T}\Omega_{n+j,T}G_{n+j,T}^{\top}, \qquad (17)$$

for $j = 1, \ldots, T - n$. Finally, the $j$-step linear out-of-sample predictor $\widehat{Y}_{T+h,T}$ with $h > 0$ is obtained by re-defining the sample size as $T_u = T + h$ and considering the observations $T+1, \ldots, T+h$ as missing data.
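For the scalar-observation system (13), the recursions (14) and the likelihood (15) amount to a few lines of linear algebra. The following sketch is a generic time-varying filter written for this setting (missing values coded as NaN); the function and argument names are ours and the code is illustrative rather than the authors' implementation.

```python
import numpy as np

def kalman_filter_loglik(y, G, F, Q, omega1, psi1=None):
    """Run the recursions (14) and return the log-likelihood (15) and one-step predictions.

    y       : observations, length T, with np.nan marking missing values.
    G, F, Q : callables of t (0-based) giving the observation row G_t (1D array),
              the transition matrix F_t and the state-noise covariance Q_t.
    omega1  : initial state prediction covariance; psi1 optional initial state prediction.
    """
    T, n = len(y), omega1.shape[0]
    psi = np.zeros(n) if psi1 is None else psi1.copy()
    omega = omega1.copy()
    loglik, yhat = 0.0, np.empty(T)
    for t in range(T):
        g, f = G(t), F(t)
        delta = g @ omega @ g                         # Delta_t = G Omega G'
        yhat[t] = g @ psi                             # one-step prediction
        if np.isnan(y[t]):                            # missing value: prediction step only
            psi = f @ psi
            omega = f @ omega @ f.T + Q(t)
        else:
            theta = f @ omega @ g                     # Theta_t = F Omega G'
            innov = y[t] - yhat[t]
            loglik -= 0.5 * (np.log(delta) + innov ** 2 / delta)
            psi = f @ psi + theta * (innov / delta)
            omega = f @ omega @ f.T + Q(t) - np.outer(theta, theta) / delta
    return loglik, yhat
```

Maximizing the returned log-likelihood over the parameters entering G, F and Q (for instance with a general-purpose numerical optimizer applied to its negative) yields the approximate MLE discussed above.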


4. Numerical and simulation studies

In this section we illustrate the performance of the proposed procedure for estimating a regression model with a non-stationary error term by means of a Monte Carlo simulation. In particular, we consider two models: the first is a LSAR(1), a short-memory case, and the second is a LSFN, a long-memory case. Both models have time-varying parameters. For simplicity in the presentation, we consider the regression parameters to be fixed; however, the Kalman filter can be applied straightforwardly for dealing with time-varying regression parameters; see Merger (2009) for more details.

4.1. Short memory case

For this case we consider two levels of truncation, $m = 10, 30$. In addition, we incorporate 10% and 20% of missing values, which have been randomly selected for each simulation. All simulated series are generated using the innovations algorithm; see Brockwell and Davis (1991). We also incorporate a break in the mean at time $T_B = 350$ and a variance shift at $\tau = 0.34$. Finally, we use sample size $T = 1024$ and 1000 replications. The regression model is given by

$$Y_{t,T} = \mu + \beta_1\tfrac{t}{T} + \beta_2 DU_{t,T} + \varepsilon_{t,T}, \qquad (18)$$

where $DU_{t,T} = \mathbf{1}(t < T_B)$ is an indicator function, $T_B$ is the break point and $\{\varepsilon_{t,T}\}$ is a LSAR(1) process with time-varying parameters given by

$$\phi(u) = a_0 + a_1 u, \qquad \sigma(u) = b_0\,h_u(\tau) + b_1^{2}\,\bar{h}_u(\tau),$$

where $|\phi(u)| < 1$, $\sigma(u) > 0$, $h_u(\tau) = \mathbf{1}(t < [T\tau])$ and $\bar{h}_u(\tau) = 1 - h_u(\tau)$ for $(u, \tau) \in [0,1] \times [0,1]$, and $b_1 > 0$. Hence, the vectors of parameters to be estimated are $a = (a_0, a_1)$ for $\phi(u)$ and $b = (b_0, b_1)$ for the noise scale $\sigma(\cdot)$. The theoretical standard deviations (SD) for the MLE of short-memory LS models are based on Theorem 3.1 of Dahlhaus (2000):

$$\Gamma(a)_{i,j} = \int_0^1 \frac{u^{i+j-2}}{1 - [\phi(u)]^{2}}\,du, \qquad \Gamma(b)_{i,j} = 2\int_0^1 \frac{u^{i+j-2}}{(b_0 + b_1 u)^{2}}\,du, \qquad (19)$$

for $i, j = 1, 2$. The above integrals can be evaluated by standard calculus procedures (see, for example, Gradshteyn and Ryzhik, 2000) or by numerical integration.
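A numerical evaluation of the integrals in (19) is straightforward. The sketch below uses a simple midpoint rule and the linear specifications for $\phi(u)$ and $\sigma(u)$ introduced above; the helper name and the grid size are our own choices.

```python
import numpy as np

def theoretical_info_blocks(a, b, npts=100_000):
    """Midpoint-rule evaluation of the 2 x 2 blocks Gamma(a) and Gamma(b) in Eq. (19)."""
    u = (np.arange(npts) + 0.5) / npts              # midpoints of [0, 1]
    phi = a[0] + a[1] * u
    sig = b[0] + b[1] * u
    G_a, G_b = np.empty((2, 2)), np.empty((2, 2))
    for i in (1, 2):
        for j in (1, 2):
            G_a[i - 1, j - 1] = np.mean(u ** (i + j - 2) / (1.0 - phi ** 2))
            G_b[i - 1, j - 1] = 2.0 * np.mean(u ** (i + j - 2) / sig ** 2)
    return G_a, G_b

G_a, G_b = theoretical_info_blocks(a=(-0.3, 0.8), b=(1.0, 1.3))   # parameter values used in Table 1
```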

Furthermore, some fundamental large sample properties of the LSE in regression models with LS errors have been analyzed in Palma and Ferreira (2010). The LSE of $\beta$ can be expressed as follows:

$$\widehat{\beta}_T = V_T^{-1}\sum_{t=1}^{T} x_{t,T}\,y_{t,T},$$

where $V_T = X_T X_T^{\top} = \sum_{t=1}^{T} x_{t,T} x_{t,T}^{\top}$, with $x_{t,T} = (x_{t,1}, \ldots, x_{t,k})^{\top} = x(\tfrac{t}{T})$. In order to obtain the optimal SD for the LSE, we use the theoretical value of its variance, which is given by

$$\mathrm{Var}(\widehat{\beta}_T) = V_T^{-1}\left[\sum_{s=1}^{T}\sum_{t=1}^{T} x\!\left(\tfrac{s}{T}\right)\kappa_T(s,t)\,x\!\left(\tfrac{t}{T}\right)^{\top}\right]V_T^{-1}, \qquad (20)$$

where $\kappa_T(s,t)$ is given by (4).
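Combining (4) with (20) gives the benchmark standard deviations used for the regression coefficients in the tables below. A compact sketch follows; the helper name is ours and the symmetric extension of $\kappa_T$ to $s < t$ is an assumption of the illustration.

```python
import numpy as np

def lse_variance(X, kappa):
    """Var(beta_hat) of Eq. (20): X is the T x k design matrix (row t is x(t/T)) and
    kappa(s, t) the error covariance function, e.g. Eq. (4) for LSAR(1) errors."""
    T = X.shape[0]
    S = np.array([[kappa(s, t) for t in range(1, T + 1)] for s in range(1, T + 1)])
    Vinv = np.linalg.inv(X.T @ X)
    return Vinv @ (X.T @ S @ X) @ Vinv

# Example: intercept + linear trend with LSAR(1) errors (phi and sigma illustrative)
T = 300
u = np.arange(1, T + 1) / T
X = np.column_stack([np.ones(T), u])
phi = lambda v: -0.3 + 0.8 * v
sig = lambda v: 1.0 + 1.3 * v
kappa = lambda s, t: (sig(s / T) * sig(t / T) * phi(min(s, t) / T) ** abs(s - t)
                      / (1.0 - phi(s / T) * phi(t / T)))
print(np.sqrt(np.diag(lse_variance(X, kappa))))
```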

Table 1 shows the estimates of the parameters and both the theoretical and the sample estimates of the SD. Note that these estimates are very close to their theoretical counterparts, especially as the truncation level $m$ increases. As expected, the precision of the estimates deteriorates as the percentage of missing data increases.

4.2. Long memory case

Consider the following regression model with long-memory errors given by

$$Y_{t,T} = \mu + \beta_1\tfrac{t}{T} + \beta_2 DU_{t,T} + \beta_3\tfrac{t}{T}DU_{t,T} + \varepsilon_{t,T}, \qquad (21)$$

where $\{\varepsilon_{t,T}\}$ is a LSFN process with long-memory and scale parameters given by

$$d(u) = a_0 + a_1 u, \qquad \sigma(u) = b_0 + b_1 u, \qquad (22)$$


Table 1
Estimation of model (18) with a0 = −0.3, a1 = 0.8, b0 = 1, b1 = 1.3, µ = 2, β1 = 5, β2 = 4.

                          m = 10                        m = 30
% NA                 0%       10%      20%        0%       10%      20%
Estimates
  a0              −0.301   −0.299   −0.221    −0.302   −0.300   −0.295
  a1               0.798    0.789    0.679     0.793    0.797    0.786
  b0               0.995    0.890    0.835     0.997    0.991    0.945
  b1               1.298    1.297    1.227     1.298    1.297    1.263
  µ                2.008    2.001    1.984     2.005    1.996    1.997
  β1               4.994    4.997    5.029     4.986    5.003    5.005
  β2               3.991    4.001    4.011     3.996    4.003    4.003
Theoretical SD
  σ(a0)            0.061    0.064    0.068     0.061    0.064    0.068
  σ(a1)            0.102    0.108    0.114     0.102    0.108    0.114
  σ(b0)            0.058    0.061    0.065     0.058    0.061    0.065
  σ(b1)            0.134    0.141    0.150     0.134    0.141    0.150
  σ(µ)             0.263    0.293    0.313     0.263    0.293    0.313
  σ(β1)            0.482    0.525    0.558     0.482    0.525    0.558
  σ(β2)            0.196    0.206    0.212     0.196    0.206    0.212
Sample SD
  σ̂(a0)            0.062    0.068    0.076     0.059    0.063    0.071
  σ̂(a1)            0.103    0.126    0.122     0.103    0.113    0.118
  σ̂(b0)            0.064    0.066    0.069     0.057    0.063    0.064
  σ̂(b1)            0.139    0.146    0.153     0.133    0.143    0.148
  σ̂(µ)             0.270    0.306    0.346     0.270    0.304    0.342
  σ̂(β1)            0.494    0.540    0.581     0.489    0.531    0.562
  σ̂(β2)            0.198    0.212    0.244     0.194    0.204    0.217

Table 2
Estimation of model (21) with a0 = 0.20, a1 = 0.15, b0 = 0.3, b1 = 0.2, µ = 0.5, β1 = 1.5, β2 = 0.65, β3 = 2.

                          m = 10                        m = 30
% NA                 0%       10%      20%        0%       10%      20%
Estimates
  a0               0.195    0.197    0.194     0.196    0.202    0.199
  a1               0.137    0.135    0.141     0.143    0.139    0.138
  b0               0.306    0.289    0.273     0.306    0.287    0.271
  b1               0.189    0.181    0.171     0.192    0.186    0.177
  µ                0.505    0.510    0.499     0.508    0.503    0.507
  β1               1.489    1.488    1.500     1.484    1.494    1.499
  β2               0.652    0.645    0.646     0.646    0.641    0.647
  β3               1.988    2.017    2.019     1.993    2.023    1.984
Theoretical SD
  σ(a0)            0.049    0.051    0.055     0.049    0.051    0.055
  σ(a1)            0.084    0.089    0.094     0.084    0.089    0.094
  σ(b0)            0.015    0.016    0.017     0.015    0.016    0.017
  σ(b1)            0.030    0.032    0.034     0.030    0.032    0.034
  σ(µ)             0.314    0.373    0.467     0.314    0.373    0.467
  σ(β1)            0.520    0.585    0.685     0.520    0.585    0.685
  σ(β2)            0.330    0.389    0.482     0.330    0.389    0.482
  σ(β3)            0.668    0.704    0.779     0.668    0.704    0.779
Sample SD
  σ̂(a0)            0.055    0.063    0.070     0.054    0.059    0.059
  σ̂(a1)            0.093    0.105    0.114     0.089    0.101    0.105
  σ̂(b0)            0.018    0.018    0.019     0.017    0.016    0.018
  σ̂(b1)            0.034    0.034    0.037     0.032    0.033    0.034
  σ̂(µ)             0.295    0.388    0.512     0.292    0.381    0.462
  σ̂(β1)            0.536    0.612    0.703     0.519    0.588    0.681
  σ̂(β2)            0.345    0.405    0.530     0.322    0.388    0.477
  σ̂(β3)            0.734    0.799    0.786     0.674    0.703    0.739

where $0 < d(u) < 1/2$, $\sigma(u) > 0$ for $u \in [0,1]$, and a break in the mean occurs at time $T_B = 420$. In this case the theoretical SD for the vector of parameters $a = (a_0, a_1)$ is based on Theorem 2.2 of Palma and Olea (2010) and is given by $\Gamma(a)_{i,j} = \pi^{2}[6(i+j-1)]^{-1}$, $i, j = 1, 2$. Additionally, the exact value of the variance of $\widehat{\beta}_T$ is given by formula (20), utilizing the time-varying covariance function $\kappa_T(s,t)$ specified in (7). In order to explore the effect of the estimation for a more general specification of $d(u)$, we also consider the following model



$$Y_{t,T} = \mu + \beta_1\sin\!\left(\omega_1\tfrac{t}{T}\right) + \beta_2\cos\!\left(\omega_1\tfrac{t}{T}\right) + \varepsilon_{t,T}, \qquad (23)$$


Table 3
Estimation of model (23) with a0 = 0.35, a1 = 0.1, b0 = 0.25, b1 = 0.25, µ = 4, β1 = 0.5, β2 = 1.5.

                          m = 10                        m = 30
% NA                 0%       10%      20%        0%       10%      20%
Estimates
  a0               0.351    0.345    0.277     0.351    0.343    0.336
  a1               0.098    0.066    0.089     0.098    0.056    0.092
  b0               0.265    0.258    0.240     0.258    0.254    0.234
  b1               0.241    0.245    0.244     0.245    0.243    0.241
  µ                4.000    4.007    4.001     4.000    4.001    4.000
  β1               0.503    0.499    0.501     0.498    0.502    0.496
  β2               1.501    1.495    1.497     1.502    1.498    1.501
Theoretical SD
  σ(a0)            0.024    0.026    0.027     0.024    0.026    0.027
  σ(a1)            0.034    0.036    0.039     0.034    0.036    0.039
  σ(b0)            0.013    0.016    0.017     0.013    0.016    0.017
  σ(b1)            0.028    0.032    0.034     0.028    0.032    0.034
  σ(µ)             0.210    0.212    0.215     0.210    0.212    0.215
  σ(β1)            0.155    0.156    0.157     0.155    0.156    0.157
  σ(β2)            0.116    0.117    0.119     0.116    0.117    0.119
Sample SD
  σ̂(a0)            0.027    0.035    0.040     0.025    0.032    0.033
  σ̂(a1)            0.033    0.046    0.049     0.032    0.044    0.045
  σ̂(b0)            0.018    0.019    0.020     0.015    0.016    0.017
  σ̂(b1)            0.031    0.035    0.036     0.028    0.031    0.035
  σ̂(µ)             0.234    0.245    0.251     0.218    0.221    0.225
  σ̂(β1)            0.175    0.178    0.188     0.163    0.162    0.163
  σ̂(β2)            0.118    0.128    0.135     0.115    0.131    0.133

where $\omega_1 = 2\pi/5$, the time-varying long-memory parameter has a sinusoidal component, $d(u) = a_0 + a_1\sin(4\pi u)$, and the noise scale is given by formula (22), for all $u \in [0,1]$. This model corresponds to a harmonic function for the time-varying long-memory parameter, with an autocovariance matrix given by

$$\Gamma(a)_{ij} = \frac{\pi^{2}}{12}\left[\frac{\sin(\lambda_i - \lambda_j)}{\lambda_i - \lambda_j} + \frac{\sin(\lambda_i + \lambda_j)}{\lambda_i + \lambda_j}\right], \qquad \text{for } i, j = 1, 2, \qquad (24)$$

where $\lambda_1 = 0$ and $\lambda_2 = 2\pi$.
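When evaluating (24) numerically, the ratio $\sin(x)/x$ must be given its limit value 1 at $x = 0$ (here $\lambda_1 = 0$). A short sketch of this evaluation (the helper name is ours):

```python
import numpy as np

def harmonic_info_matrix(lam):
    """2 x 2 matrix of Eq. (24); np.sinc(x/pi) equals sin(x)/x with the value 1 at x = 0."""
    sinc = lambda x: np.sinc(x / np.pi)
    G = np.empty((2, 2))
    for i in range(2):
        for j in range(2):
            G[i, j] = np.pi ** 2 / 12.0 * (sinc(lam[i] - lam[j]) + sinc(lam[i] + lam[j]))
    return G

print(harmonic_info_matrix((0.0, 2.0 * np.pi)))
```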

Tables 2 and 3 show the simulation results from the Kalman filter estimates for the two truncation levels, $m = 10, 30$. Note that the average estimates of the parameters are close to their true values for both truncation levels. On the other hand, the empirical SD are close to their optimal theoretical counterparts.

5. Data applications

In this section we apply the proposed methodology for estimating, predicting and handling missing values to two real time series. The first set of data involves a time series consisting of speleothem layer thickness. The second illustration uses pine tree-ring measurements.

5.1. Cave deposit data

Speleothems are mineral deposits formed from groundwater within underground caverns. Stalagmites, stalactites, and other forms may be annually banded or contain compounds which can be radiometrically dated. The thickness of depositional layers or isotopic records can be used as climate proxies. The speleothem layer thickness was measured in Grotta di Ernesto, an Alpine cave in northern Italy. We use growth measurements of stalagmite ER76 from this cave. These measurements, available from www.ncdc.noaa.gov/paleo/speleothem.html, are plotted in Fig. 1. The data have been observed from AD 1445 to AD 1992; see Frisia et al. (2006). There are two periods of missing values, from AD 1651 to AD 1693 and from AD 1842 to AD 1860. In recent years, several authors have studied the relationship between the mechanism of stalagmite growth and the climate in paleoclimatic data; see for example Qin et al. (1999), Finh et al. (2001) and Manfred (2010). According to these authors, we can use a simple regression model with non-stationary LSARMA(1, 1) errors given by

$$Y_{t,T} = \mu + \delta_0\tfrac{t}{T} + \sum_{j=1}^{2}\left[\alpha_j\sin\!\left(\omega_j\tfrac{t}{T}\right) + \beta_j\cos\!\left(\omega_j\tfrac{t}{T}\right)\right] + \varepsilon_{t,T},$$

where $\omega_1 = 2\pi/150$, $\omega_2 = 2\pi/210$ and $\{\varepsilon_{t,T}\} \sim$ LSARMA(1, 1). In order to find out the dynamic structure of the parameters, an exploratory analysis is proposed for the full data as follows: the series is divided into several windows, each window having $N = 80$ observations with a shift of $S = 30$ observations.


Fig. 1. Cave mineral deposit data from 1445 AD to 1992 AD.

Fig. 2. Cave deposit data. (a) Estimates of the autoregressive parameter. (b) Estimates of the moving average parameter. (c) Estimates of the noise variance. In all panels the continuous line represents the locally stationary ARMA model, the horizontal dashed line indicates the stationary ARMA model and the dots represent the heuristic approach.

For instance, the first window is $Y_{1,T}, \ldots, Y_{80,T}$, the second is $Y_{31,T}, \ldots, Y_{110,T}$, and so on. Within each window, the parameters of an ARMA(1, 1) process are estimated. Fig. 2 shows the heuristic estimates, represented by the dots; additionally, the dashed line represents the parameter estimates of $(\phi, \theta, \sigma)$ for the stationary case, while the continuous line indicates the time-varying Kalman filter estimates of $(\phi(u), \theta(u), \sigma(u))$, using truncation $m = 30$. We use a simple LSARMA model setup with linear functions for the time-varying parameters, i.e.,

$$\phi(u) = a_0 + a_1 u, \qquad \theta(u) = \tilde{a}_0 + \tilde{a}_1 u, \qquad \sigma(u) = b_0 + b_1 u. \qquad (25)$$
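The exploratory windowing scheme described above is easy to reproduce. In the sketch below (our own simplification) the ARMA(1, 1) maximum likelihood fit inside each window is replaced by a crude moment-based AR(1) estimate, so it only illustrates the window bookkeeping behind the dots of Fig. 2.

```python
import numpy as np

def rolling_window_estimates(y, N=80, S=30):
    """Slide windows of length N with shift S over y; for each window return the
    window midpoint, a lag-one AR coefficient estimate and a residual SD."""
    out = []
    for start in range(0, len(y) - N + 1, S):
        w = y[start:start + N] - np.mean(y[start:start + N])
        phi_hat = np.dot(w[1:], w[:-1]) / np.dot(w[:-1], w[:-1])
        resid = w[1:] - phi_hat * w[:-1]
        out.append((start + N / 2.0, phi_hat, np.std(resid)))
    return np.array(out)

# estimates = rolling_window_estimates(series, N=80, S=30)
```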

The maximum likelihood estimates obtained using the Kalman filter are presented in Table 4. On the other hand, the theoretical SD for the MLE of $\tilde{a} = (\tilde{a}_0, \tilde{a}_1)$ is given by $\Gamma(\tilde{a})_{i,j} = \int_0^1 u^{i+j-2}/(1 - [\theta(u)]^{2})\,du$ for $i, j = 1, 2$, and for the LSE we use formula (20) with the autocovariance function given by (3). From this table, it can be observed that the t-statistics for $\widehat{\phi}(u)$ are highly significant, whereas for $\widehat{\theta}(u)$ the significant coefficient is the one associated with the slope effect, $a_3$. The estimated noise scale $\widehat{\sigma}(u)$ is significant in its first component, i.e. a constant variance. Fig. 3 displays both forecasts $\widehat{Y}_{1651:1693}$ and $\widehat{Y}_{1842:1860}$ and their respective 95% prediction bands. These multi-step-ahead predictions are based on the formula given in (16). The evolution of the prediction error SD is depicted in Fig. 4.

formula (20) with an autocovariance function given by (3). From this table, it can be observed that the t-statistics for  φ(u) are highly significant whereas for  θ (u), the significant coefficient is that associated with the slope effect, a3 . The estimated noise scale  σ (u) is significant in its first component, i.e. a constant variance. Fig. 3 displays both forecasts  Y1651:1693 and  Y1842:1860 and their respective 95% prediction bands. These multi-step-ahead predictions are based on the formula given in (16). The evolution of the prediction error SD is depicted in Fig. 4. The dashed


Table 4
Cave deposit data: least squares fit with locally stationary short-memory errors.

Parameter   Estimates   Standard deviation   t-value
a0           0.4750     0.0628                7.5597
a1           0.5012     0.0830                6.0338
a2           0.0356     0.0838                0.4244
a3          −0.5581     0.1394               −4.0027
b0           0.2234     0.0666                3.3537
b1           0.0608     0.1154                0.5270
µ            3.2658     0.0317               10.287
δ0           0.6373     0.6844                0.9311
α1           0.3677     0.1132                3.2471
β1           0.0715     0.2278                0.3139
α2           0.5789     0.1033                5.6032
β2          −0.0820     0.5045               −0.1624

Fig. 3. Cave deposit data. Prediction of a LSARMA model: multi-step forecasts of two blocks of missing values and 95% prediction bands.

The dashed line represents the evolution of $\sigma(Y_{t,T}) = \kappa(t,s)^{1/2}$ given by formula (3). The continuous line corresponds to the empirical SD of the prediction error, $\Delta_{t,T}^{1/2}$, while the dotted line represents the noise SD, $\sigma(u) = 0.2234 + 0.0608\,u$ for $u \in [0,1]$. From this figure it can be observed that $\Delta_{t,T}^{1/2}$ increases right after the beginning of each data gap and decays to $\sigma(u)$ as new observations become available.

The Kalman filter also provides the model residuals, $e_t = Y_{t,T} - \widehat{Y}_{t,T}$, along with their SD, $\Delta_{t,T}^{1/2}$. Fig. 5 shows three panels exploring the structure of the standardized residuals $r_t = e_t\Delta_{t,T}^{-1/2}$. Panel (a) displays the standardized residuals from the fitted LSARMA model, panel (b) shows the sample autocorrelation function (ACF), and panel (c) shows the Ljung–Box tests. From panel (b), there seem to be no significant autocorrelations in the residuals. This conclusion is formally supported by the Ljung–Box test, considering up to $K = 12$ windows; see panel (c). This panel indicates that the white noise hypothesis is not rejected for lag 30, at the 5% level of significance.

5.2. Tree-ring data

Ring counting is a typical procedure in studies of forest mass to determine the growth and yield of natural forests and forest plantations. This methodology is known as forest analysis, and can only be implemented in species growing in temperate regions, where it is easy to identify ring growth. In tropical climates, where there is little differentiation between the seasons, growth rates are constant, making it difficult to clearly differentiate spring and winter wood. Consequently, this data set can be used as a climate proxy and indicate the changes of temperature and precipitation conditions in paleoclimatology; cf. Tan et al. (2003). Fig. 6 plots the annual tree-ring width of Pinus longaeva from Mammoth Creek, Utah, from 0 AD to 1989 AD. This data set, available at the National Climatic Data Center, was reported in Graybill (1990). This series displays a seasonal component. Consequently, the following model for the Pinus longaeva series is proposed:


Fig. 4. Cave deposit data. Prediction of a LSARMA model: dashed line: standard deviation of $Y_{t,T}$; dotted line: SD of the noise $\sigma(t/T)$; continuous line: empirical prediction error SD, $\Delta_{t,T}^{1/2}$.

Fig. 5. Cave deposit data. Residuals analysis: (a) residuals from the fitted model, (b) sample ACF, and (c) Ljung–Box statistic.

$$Y_t = \mu + \delta_0 DU_t + \delta_1\,t\,DU_t + \sum_{j=1}^{2}\left[\alpha_j\sin(\omega_j t) + \beta_j\cos(\omega_j t)\right] + \varepsilon_t, \qquad (26)$$

where $DU_t = \mathbf{1}(t < 269)$ and $\varepsilon_t \sim (0, \sigma_\varepsilon^2)$. An analysis of the periodogram of the detrended data and of the autocorrelation function (ACF) reveals the presence of two plausible seasonal frequencies, $\omega_1 = 2\pi/120$ and $\omega_2 = 2\pi/150$. Table 5 shows the least squares estimates. Observe that all the regression coefficients are significant at the 5% significance level, except for $(\alpha_1, \beta_1)$. In particular, even though the LSE of the seasonal trend coefficients $\alpha_2$ and $\beta_2$ are very small, they are significant at the 5% significance level. Fig. 7 shows the sample ACF of the residuals of the estimated model in panel (a), and the corresponding variances of the sample mean in panel (b). The dashed line corresponds to the expected behavior for a short-memory case with blocks of $k$ observations, while the continuous line represents the expected behavior for a long-memory case. From both panels, this series seems to exhibit long-range dependence. In addition, a closer look at the sample ACF of the data reveals that the degree of persistence seems to vary over time. Indeed, Fig. 8 shows the sample autocorrelation of three segments of the sample: observations 1–500, observations 750–1250 and observations 1490–1990. This figure provides information for arguing possible changes in the degree of dependence.


Fig. 6. Tree-ring data.

Table 5
Tree-ring data: least squares fit.

Parameter   Estimates   Standard deviation   t-value
µ            1.0017     0.0084              118.962
δ0          −0.2214     0.0439               −5.043
δ1           2.4804     0.5524                4.491
α1          −0.0146     0.0112               −1.306
β1           0.0214     0.0111                1.925
α2          −0.0239     0.0112               −2.144
β2           0.0280     0.0112                2.508

Fig. 7. Tree-ring data. (a) Sample ACF, and (b) variance plot.

This is clear evidence of a non-stationary process. Therefore, it seems that the disturbance $\{\varepsilon_t\}$ in the linear regression model given in (26) has a time-varying long-memory structure, and the LSE estimates may not be appropriate. In order to handle these features, we use a locally stationary process, and we can then re-write the linear model as follows:

$$Y_{t,T} = \mu + \delta_0 DU_{t,T} + \delta_1\tfrac{t}{T}DU_{t,T} + \sum_{j=1}^{2}\left[\alpha_j\sin\!\left(\omega_j\tfrac{t}{T}\right) + \beta_j\cos\!\left(\omega_j\tfrac{t}{T}\right)\right] + \varepsilon_{t,T}, \qquad (27)$$



Fig. 8. Tree-ring data. Sample ACF: (a) observations 1–500, (b) observations 750–1250, and (c) observations 1490–1990.


Fig. 9. (a) Estimates of the long-memory parameter. (b) Estimates of the noise variance. In all panels the continuous line represents the locally stationary FN model, the horizontal dashed line indicates the stationary FN model and the dots represent the heuristic approach.

where $\{\varepsilon_{t,T}\}$ is a LSFN process, i.e.,

$$\varepsilon_{t,T} = (1 - B)^{-d(u)}\,\sigma(u)\,z_t,$$

for $t = 1, 2, \ldots, T$ and $u \in [0,1]$, where $d(u)$ is the time-varying long-memory parameter, $\sigma(u)$ is a scale factor and $\{z_t\}$ is a zero-mean and unit-variance white noise. Fig. 9 shows the heuristic, the stationary fractional noise and the locally stationary fractional noise model estimates of the long-memory parameter and the variance of the noise. From the figure we can suggest a linear and a quadratic function for $d(u)$ and $\sigma(u)$, respectively; i.e.,

$$d(u) = a_0 + a_1 u, \qquad \sigma(u) = b_0 + b_1 u + b_2 u^{2}. \qquad (28)$$

Table 6 reports the parameter estimates using the Kalman filter for the linear model in (27) with LSFN errors and truncation level $m = 30$. The standard deviations of the regression coefficients and the t-tests have been obtained using $\Gamma(a)_{i,j}$ and $\Gamma(b)_{i,j}$ for $d(u)$ and $\sigma(u)$, respectively. With respect to the LSE estimates, we have used Eq. (20) with the covariance structure given in (7). As we can observe in the table, the parameters $(a_0, a_1)$ and $(b_0, b_1, b_2)$ are statistically significant at the 5% level, and similarly for the mean $\mu$, while the coefficients $(\delta_0, \delta_1, \alpha_1, \beta_1, \alpha_2, \beta_2)$ are not significant.


Table 6
Tree-ring data: least squares fit with locally stationary long-memory errors.

Parameter   Estimates   Standard deviation   t-value
a0           0.3369     0.0350                9.8546
a1          −0.2190     0.0605               −3.8054
b0           0.3076     0.0157               21.4727
b1          −0.2919     0.0726               −2.4205
b2           0.3281     0.0714                3.0610
µ            1.0032     0.0327               30.7224
δ0          −0.1983     0.2317               −0.8560
δ1           2.3447     1.2681                1.8490
α1          −0.0048     0.0150               −0.3203
β1           0.0313     0.0187                1.6738
α2          −0.0209     0.0167               −1.2302
β2           0.0384     0.0213                1.8028

Table 7
Parameters estimated with truncation level m = 30.

Parameter   Estimates   Standard deviation   t-value
a0           0.3416     0.0350                9.7723
a1          −0.2124     0.0605               −3.5081
b0           0.3386     0.0156               21.6479
b1          −0.1621     0.0725               −2.2351
b2           0.2062     0.0713                2.8919
µ            0.9957     0.0491               20.2610

Fig. 10. Tree-ring data: residuals analysis: (a) residuals from the fitted model, (b) sample ACF, and (c) Ljung–Box statistic.

Accordingly, we can propose the following regression model for the tree-ring data:

$$Y_{t,T} = \mu + \varepsilon_{t,T}, \qquad (29)$$

where $\{\varepsilon_{t,T}\}$ is a LSFN process with time-varying long-memory parameter and scale factor as specified in (28). Table 7 reports the Kalman filter estimates for the model in (29). Note that, according to the fourth column of this table, the parameters are statistically significant at the 5% level. The residuals of the model, $e_t = Y_{t,T} - \widehat{Y}_{t,T}$, along with their SD, $\Delta_{t,T}^{1/2}$, are plotted in Fig. 10. In particular, the figure exhibits three panels exploring the structure of the standardized residuals $r_t = e_t\Delta_{t,T}^{-1/2}$: panel (a) displays the standardized residuals for the LSFN estimates, panel (b) shows the sample ACF, and panel (c) presents the Ljung–Box statistic. From panel (b), it seems that there are no significant autocorrelations in the residuals. This conclusion is formally supported by the Ljung–Box tests, considering up to $K = 12$ windows. Consequently, the white noise hypothesis cannot be rejected at the 5% significance level.
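The Ljung–Box diagnostics reported in panel (c) of Figs. 5 and 10 use the standard statistic; a minimal sketch of its computation from the standardized residuals (our own implementation, not the authors' code) follows.

```python
import numpy as np

def ljung_box(residuals, max_lag=30):
    """Q(h) = n(n+2) sum_{k=1}^{h} rho_k^2 / (n - k) for h = 1, ..., max_lag."""
    r = np.asarray(residuals) - np.mean(residuals)
    n = len(r)
    denom = np.dot(r, r)
    rho = np.array([np.dot(r[k:], r[:-k]) / denom for k in range(1, max_lag + 1)])
    return np.array([n * (n + 2) * np.sum(rho[:h] ** 2 / (n - np.arange(1, h + 1)))
                     for h in range(1, max_lag + 1)])

# Q = ljung_box(standardized_residuals, max_lag=30)
# Each Q[h-1] is compared with a chi-square quantile whose degrees of freedom
# account for the number of fitted parameters.
```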


Fig. 11. Tree-ring data: multi-step forecasts of the last 100 values and 95% prediction bands.

Fig. 12. Prediction of the LSFN model: dashed line: standard deviation of $Y_{t,T}$; dotted line: SD of the noise $\sigma(t/T)$; continuous line: empirical prediction error SD, $\Delta_{t,T}^{1/2}$.

In order to validate our estimates, we leave a data gap of a block of 100 observations from $t = 1889$ to $t = 1990$. Fig. 11 shows these observations (dotted line), together with the predictions and 95% prediction bands. These multi-step ahead predictors are based on the fitted model and on the observations $t = 1, 2, \ldots, 1888$; notice that most of the future observations fall inside the prediction bands. The evolution of the prediction error SD is depicted in Fig. 12. As with the cave deposit data, the dashed line represents the SD of the process, $\sigma(Y_{t,T})$, the dotted line corresponds to the noise scale SD, $\sigma(u) = 0.3386 - 0.1621\,u + 0.2062\,u^{2}$ with $u = t/T$, and the continuous line denotes the estimated SD of the prediction error, $\Delta_{t,T}^{1/2}$. From this figure, notice that the dashed line is an upper limit for the SD of the prediction error. In addition, observe that $\Delta_{t,T}^{1/2} = \sigma(t/T)$ for $t = 1, \ldots, 1888$, and that $\Delta_{t,T}^{1/2}$ increases right after the beginning of the data gap, i.e. for $t = 1889, \ldots, 1990$. Therefore, we can conclude that $\sigma(t/T) \le \Delta_{t,T}^{1/2} \le \sigma(Y_{t,T})$.


6. Final remarks

In this paper we propose a state space representation for a regression model with a non-stationary error component. The model presents trends in the exogenous variables and errors with short or long memory. We use the Kalman filter algorithm to obtain the one-step-ahead innovations that are used for maximizing the log-likelihood in order to estimate the parameters. The regression model with non-stationary components requires an infinite-dimensional state space representation in order to be handled. However, using an approximation through a truncated state space representation, we show by simulation that the parameters are accurately estimated. This alternative representation of the regression model with non-stationary components allows us to make predictions of future values of the series and of the underlying unobserved components; in addition, it deals with missing values without any additional assumption or procedure. These features are clear advantages over alternative procedures that deal with this kind of regression model. We apply the proposed procedure to two real time series: the first involves a time series consisting of speleothem layer thickness measurements, while the second illustration consists of pine tree-ring measurements. In both cases, the analysis of the data suggests that the appropriate model is a regression model with non-stationary ARFIMA errors in which the parameters and scale factor are time-varying. Moreover, the diagnostics of the resulting residuals clearly confirm the chosen models.

Acknowledgments

The authors would like to thank the referees for their valuable comments that led to a greatly improved presentation of the paper. Additionally, the first author would like to express his thanks for the support from ECOS-CONICYT C10E03, established by the Chilean Government. The second author gratefully acknowledges the financial support from Project FONDECYT STUDY GROUP CS. ECONOM/ADMI No. 11100100, established by the Chilean Government. Finally, the third author gratefully acknowledges partial financial support from DIUC N210.014.018-1.0, established by the Universidad de Concepción.

Appendix

Proof of Proposition 3. For the truncated regression model with error term as in (10), the observability operator is given by



$$O_{m+1} = \begin{pmatrix}
1 & \psi_1\!\left(\tfrac{t}{T}\right) & \psi_2\!\left(\tfrac{t}{T}\right) & \psi_3\!\left(\tfrac{t}{T}\right) & \cdots & \psi_m\!\left(\tfrac{t}{T}\right) \\
\psi_1\!\left(\tfrac{t}{T}\right) & \psi_2\!\left(\tfrac{t}{T}\right) & \psi_3\!\left(\tfrac{t}{T}\right) & \cdots & \psi_m\!\left(\tfrac{t}{T}\right) & 0 \\
\psi_2\!\left(\tfrac{t}{T}\right) & \psi_3\!\left(\tfrac{t}{T}\right) & \cdots & \psi_m\!\left(\tfrac{t}{T}\right) & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
\psi_m\!\left(\tfrac{t}{T}\right) & 0 & 0 & 0 & 0 & 0
\end{pmatrix},$$

where $O_{m+1} = (G^{\prime},\ F^{\prime}G^{\prime},\ F^{\prime 2}G^{\prime},\ \ldots,\ F^{\prime m}G^{\prime})^{\prime}$ and $G$ and $F$ are given by Eq. (11). The rank of the operator $O_{m+1}$ equals $m+1$ if and only if $\psi_m(\tfrac{t}{T}) \ne 0$. For $\psi_m(\tfrac{t}{T}) \ne 0$ the system is observable; hence we can determine the initial state $\xi_{0,T}$ from a trajectory of the observed process $\{y_{0,T}, y_{1,T}, \ldots\}$ in the absence of state or observational noise; see Brockwell and Davis (1996) for more details. For controllability we have

$$C_{m+1} = (H,\ FH,\ F^{2}H,\ \ldots,\ F^{m}H) = \mathrm{diag}\{1, \ldots, 1, 1\}, \qquad (30)$$

which is of full rank (equal to $m$) for all $m$. From Eq. (8), we write the state at time $t$ as

$$\xi_{t,T} = H z_{t-1} + F H z_{t-2} + F^{2} H z_{t-3} + \cdots = C F_{t-1},$$ where $F_{t-1} = \{\ldots, z_{t-2}, z_{t-1}\}$ is the history of the noise process at time $t$. Therefore, to reach a particular value $\xi_t$ from $F_{t-1}$ we have

$$F_{t-1} = (C^{\prime}C)^{-1}C^{\prime}\,\xi_{t,T}.$$


Theorem 2.3.3 in Hannan and Deistler (1988) establishes that observability and controllability are conditions to create a minimal state space system. If we consider the LSAR(1) process with $F_{t,T} = \phi(\tfrac{t}{T})$, $G_{t,T} = 1$ and $H = 1$, the observability and controllability operators are given by

$$O = \begin{pmatrix} 1 \\ \phi\!\left(\tfrac{t}{T}\right) \\ \phi^{2}\!\left(\tfrac{t}{T}\right) \\ \vdots \end{pmatrix} \qquad \text{and} \qquad C = \left(1,\ \phi\!\left(\tfrac{t}{T}\right),\ \phi^{2}\!\left(\tfrac{t}{T}\right),\ \ldots\right).$$

The system is controllable for all $\phi(\tfrac{t}{T})$, as $C$ has full rank, 1, for every $\phi(\tfrac{t}{T})$; it is observable only for $\phi(\tfrac{t}{T}) \ne 0$. Thus, it is a minimal state space representation if $\phi(\tfrac{t}{T}) \ne 0$. Similar arguments are valid for the LSAR($p$). $\square$

Remark A.1. Observe that the state space system (11) is strongly stable for all $x \in \mathbb{R}$, because it satisfies

$$F^{n}x = F^{n-1}Fx = F^{n-1}\begin{pmatrix} 0 & 0 & \cdots & 0 & 0 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}x,$$

where each application of the shift matrix $F$ pushes the entries of $x$ down one position, with $F^{n}x \to 0$ when $n \to \infty$. For exponential stability there must exist two constants, $c > 0$ and $\alpha > 0$, such that

$$\|F^{n}\| \le c\,e^{-\alpha n}. \qquad (31)$$

This means that

$$\|F^{n}\|_{\infty} = \max_{1 \le i \le n}\sum_{j=1}^{n}|a_{ij}| = \max_{1 \le i \le n}[0 + 0 + \cdots + 0] = 0 \le M^{n}, \qquad (32)$$

with 0 < M < 1 and n ∈ N, then

$$\|F^{n}\|_{\infty} \le e^{-n\alpha}.$$

Setting $\alpha = -\log M$ and $c = 1$ gives the stated inequality for all $n$.

References

Beran, J., 2009. On parameter estimation for locally stationary long-memory processes. Journal of Statistical Planning and Inference 139 (3), 900–915.
Brockwell, P.J., Davis, R.A., 1991. Time Series: Theory and Methods, second ed. Springer, New York.
Brockwell, P.J., Davis, R.A., 1996. Introduction to Time Series and Forecasting. Springer, New York.
Dahlhaus, R., 1996. On the Kullback–Leibler information divergence of locally stationary processes. Stochastic Processes and their Applications 62 (1), 139–168.
Dahlhaus, R., 1997. Fitting time series models to nonstationary processes. The Annals of Statistics 25 (1), 1–37.
Dahlhaus, R., 2000. A likelihood approximation for locally stationary processes. The Annals of Statistics 28 (6), 1762–1794.
Durbin, J., Koopman, S.J., 2001. Time Series Analysis by State Space Methods. Oxford University Press, Oxford.
Finh, A., Shaw, P., Weedon, G., Holmgren, K., 2001. Trace element variation in speleothem aragonite: potential for palaeoenvironmental reconstruction. Earth and Planetary Science Letters 186, 255–267.
Frisia, S., Borsato, A., Preto, N., McDermott, F., 2006. Grotta di Ernesto, Italy speleothem layer thickness data. IGBP PAGES/World Data Center for Paleoclimatology Data Contribution Series 2006-078.
Gradshteyn, I.S., Ryzhik, I.M., 2000. Table of Integrals, Series, and Products, sixth ed. Academic, San Diego, CA, pp. 64–66.
Graybill, D.A., 1990. Pinus longaeva tree ring data, Mammoth Creek, Utah.
Hannan, E.J., Deistler, M., 1988. The Statistical Theory of Linear Systems. Wiley, New York.
Harvey, A.C., 1989. Forecasting Structural Time Series and the Kalman Filter. Cambridge University Press, Cambridge.
Jensen, M., Witcher, B., 2000. Time-varying long memory in volatility: detection and estimation with wavelets. Technical Report, EURANDOM.
Manfred, Mudelsee, 2010. Climate Time Series Analysis. Atmospheric and Oceanographic Sciences Library, Springer.
Merger, S., 2009. Applications of state space models in finance. An empirical analysis of the time-varying relationship between macroeconomics. University of Gottingen.
Palma, W., 2007. Long-Memory Time Series: Theory and Methods. Wiley Series in Probability and Statistics, Wiley, Hoboken, NJ.
Palma, W., 2010. On the sample mean of locally stationary long-memory processes. Journal of Statistical Planning and Inference 140, 3764–3774.
Palma, W., Ferreira, G., 2010. Regression estimation with locally stationary long-memory errors. Journal of Multivariate Analysis 116, 14–24.
Palma, W., Olea, R., 2010. An efficient estimator for locally stationary Gaussian long-memory processes. Unpublished doctoral thesis, Department of Statistics, Pontificia Universidad Católica de Chile, Chile.
Palma, W., Olea, R., Ferreira, G., 2013. Estimation and forecasting of locally stationary processes. Journal of Forecasting 32, 86–96.
Priestley, M.B., 1965. Evolutionary spectra and non-stationary processes. Journal of the Royal Statistical Society, Series B, Statistical Methodology 27, 204–237.
Priestley, M.B., 1981. Spectral Analysis and Time Series, vol. 2. Academic Press, London.


Qin, X., Tan, M., Liu, T., Wang, X., Li, T., Lu, J., 1999. Spectral analysis of a 1000-year stalagmite lamina-thickness record from Shihua Cavern, Beijing, China, and its climatic significance. Holocene 9 (9), 689–694. Silverman, R., 1957. Locally stationary random processes. IRE Transactions on Information Theory 3 (3), 182–187. Tan, M., Liu, T.S., Hou, J., Qin, X., Zhang, H., Li, T., 2003. Cyclic rapid warming on centennial-scale revealed by a 2650-year stalagmite record of warm season temperature. Geophys. Res. Lett. 30 (12), 1617. http://dx.doi.org/10.1029/2003GL017352.