Journal of Statistical Planning and Inference 143 (2013) 827–841
Inference for linear and nonlinear stable error processes via estimating functions

A. Thavaneswaran (Department of Statistics, University of Manitoba, Canada), N. Ravishanker (Department of Statistics, University of Connecticut, 215 Glenbrook Road, Storrs, CT 06269, USA), Y. Liang (Department of Statistics, University of Manitoba, Canada)
Article history: Received 10 August 2011; Received in revised form 25 October 2012; Accepted 29 October 2012; Available online 6 November 2012.

Abstract

This paper describes an estimating function approach for parameter estimation in linear and nonlinear time series models with infinite variance stable errors. Joint estimates of location and scale parameters are derived for classes of autoregressive (AR) models and random coefficient autoregressive (RCA) models with stable errors, as well as for AR models with stable autoregressive conditionally heteroscedastic (ARCH) errors. Fast, on-line, recursive parametric estimation for the location parameter based on estimating functions is discussed using simulation studies. A real financial time series is also discussed in some detail.

Keywords: AR model; AR-ARCH model; RCA model; Recursive on-line estimation; Sine and cosine estimating functions; Stable distribution
1. Introduction

The occurrence of sharp spikes or occasional bursts of outlying observations, and heavy tailed behavior in time series data, is often attributed to the possibility of the underlying process having an infinite variance stable distribution; see Samorodnitsky and Taqqu (1994). There has been a great deal of interest in accurate modeling and inference for such time series over a wide range of applications; for example, see Mandelbrot (1963), Fama (1965), Stuck and Kleiner (1974), Adler et al. (1998), and references therein. These papers discuss statistical techniques for analyzing heavy tailed distributions and processes in applications such as communications, finance, etc. Although inferential theory and applications for finite variance processes are now well developed (see for instance, Li and Turtle, 2000; Bera et al., 2006; Hubalek and Posedel, 2011 for applications in finance), there are still several open questions in the corresponding theory for infinite variance stable processes, particularly in the area of parametric inference for nonlinear time series models. One difficulty in estimating time series models with infinite variance stable errors is that the density of the stable random variable is not available in closed form, which makes likelihood based inference cumbersome. Least squares estimation is also undesirable since the variance of the non-Gaussian stable distribution is infinite. Yohai and Maronna (1977) studied asymptotic properties of the usual least squares estimators of an autoregressive process driven by an infinite variance error process. Gross and Steiger (1979) studied least absolute deviations (LAD) estimates in infinite variance autoregressions, but no closed forms for the estimates were given. Thavaneswaran and Peiris (1999) discussed minimum dispersion parameter estimation in the Bayesian context for
* Corresponding author. Tel.: +1 860 486 4760; fax: +1 860 486 4113. E-mail address: [email protected] (N. Ravishanker).

http://dx.doi.org/10.1016/j.jspi.2012.10.014
regression models with infinite variance errors. Mikosch et al. (1995) studied Whittle estimates for autoregressive moving average (ARMA) processes with infinite variance errors, while Qiou and Ravishanker (1998) described Bayesian inference for such processes. Feuerverger and Mureika (1977) studied properties of empirical characteristic functions for independent observations and applied the approach to a test of symmetry. Yu (2004) and references therein also provide a detailed review of procedures based on the empirical characteristic function for independent observations and stationary time series. Small and McLeish (1994) first used estimating functions based on the characteristic function to study inference for stable distributions based on independent observations. For models with heavy tailed distributions, Thavaneswaran and Heyde (1999) discussed the superiority of the LAD estimating function over the least squares estimating function. In a recent paper, Merkouris (2007) studied parameter estimation for an AR(1) process using combinations of bounded sine and cosine estimating functions. However, there is still a gap in the time series literature in terms of effective use of estimating functions for inference for nonlinear time series models with stable errors. In this paper, we study joint estimation of location and scale parameters via estimating functions for classes of linear (AR) and nonlinear (RCA, AR-ARCH) time series with infinite variance stable errors. The contribution of this research lies in (a) our use of a class of bounded estimating functions for joint parameter estimation in linear and nonlinear time series models with stable errors (with stability index $0 < \lambda \le 2$), and (b) our derivation of fast, recursive, on-line estimation for location parameters in linear and nonlinear models. An outline of the article follows.
In Section 2, we give a brief review of the theory of estimating functions, and then derive a framework for parameter estimation in stable error time series models. For a class of estimating functions based on sine and cosine functions, we derive formulas for the optimal estimating function and the corresponding information matrix. In Section 3, we describe joint estimation for location and scale parameters for three classes of time series models with $\lambda$-stable errors. In Section 4, we present recursive formulas leading to fast, on-line parameter estimation for the location parameter when the scale is known, and demonstrate the accuracy and computational speed of our approach using simulation studies. In Section 5, we present an illustration using a time series of daily transaction volumes of the NASDAQ index.

2. Parametric inference via estimating functions

Suppose that $\{y_t, t = 1, \ldots, n\}$ is a realization of a discrete-time stochastic process, and suppose its distribution depends on a vector parameter $\theta$ belonging to an open subset $\Theta$ of the $p$-dimensional Euclidean space, with $p \ll n$. Let $(\Omega, \mathcal{F}, P_\theta)$ denote the underlying probability space, and let $\mathcal{F}_t^y$ be the $\sigma$-field generated by $\{y_1, \ldots, y_t\}$, $t \ge 1$. Let $h_t = h_t(y_1, \ldots, y_t, \theta)$, $1 \le t \le n$, be specified $q$-dimensional vectors that are martingale differences.

2.1. A review of estimating functions

Godambe (1985) first used the estimating function approach in inference for stochastic processes. Consider the class $\mathcal{M}$ of zero-mean and square integrable $p$-dimensional martingale estimating functions of the form
\[
\mathcal{M} = \Big\{ g_n(\theta) : g_n(\theta) = \sum_{t=1}^n a_{t-1} h_t \Big\},
\]
where the $a_{t-1}$ are $p \times q$ matrices depending on $y_1, \ldots, y_{t-1}$, $1 \le t \le n$. The estimating functions $g_n(\theta)$ are further assumed to be almost surely differentiable with respect to the components of $\theta$, and are such that $E[\partial g_n(\theta)/\partial\theta \mid \mathcal{F}_{n-1}^y]$ and $E[g_n(\theta)g_n(\theta)' \mid \mathcal{F}_{n-1}^y]$ are nonsingular for all $\theta \in \Theta$. Note that expectations are always taken with respect to $P_\theta$. An estimator of $\theta$ can be obtained by solving the estimating equations $g_n(\theta) = 0$. Furthermore, the $p \times p$ matrix $E[g_n(\theta)g_n(\theta)' \mid \mathcal{F}_{n-1}^y]$ is assumed to be positive definite for all $\theta \in \Theta$. Then, in the class $\mathcal{M}$ of all zero-mean and square integrable martingale estimating functions, the optimal estimating function $g_n^*(\theta)$, which maximizes the information
\[
I_{g_n}(\theta) = E\Big[\frac{\partial g_n(\theta)}{\partial\theta} \,\Big|\, \mathcal{F}_{n-1}^y\Big]' \big(E[g_n(\theta)g_n(\theta)' \mid \mathcal{F}_{n-1}^y]\big)^{-1} E\Big[\frac{\partial g_n(\theta)}{\partial\theta} \,\Big|\, \mathcal{F}_{n-1}^y\Big],
\]
is given by
\[
g_n^*(\theta) = \sum_{t=1}^n a_{t-1}^* h_t = \sum_{t=1}^n E\Big[\frac{\partial h_t}{\partial\theta} \,\Big|\, \mathcal{F}_{t-1}^y\Big]' \big(E[h_t h_t' \mid \mathcal{F}_{t-1}^y]\big)^{-1} h_t,
\]
and the corresponding optimal information reduces to $E[g_n^*(\theta)g_n^*(\theta)' \mid \mathcal{F}_{n-1}^y]$. The function $g_n^*(\theta)$ is also called the "quasi-score" and has properties similar to those of a score function, in the sense that $E[g_n^*(\theta)] = 0$ and $E[g_n^*(\theta)g_n^*(\theta)'] = -E[\partial g_n^*(\theta)/\partial\theta']$. This is a rather general result in the sense that, for its validity, we do not need to assume that the true underlying distribution belongs to the exponential family of distributions. The maximum correlation between the optimal estimating function and the true unknown score justifies the terminology "quasi-score" for $g_n^*(\theta)$. It is useful to note that the same procedure for deriving optimal estimating equations may be used irrespective of whether the time series is stationary or nonstationary. Moreover, the finite sample properties of the estimating functions remain the same, although
asymptotic inferential properties will differ. In the Appendix, we give a motivation for the estimating function approach, and discuss two examples of nonstationary (random coefficient autoregressive and doubly stochastic) time series.
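As a toy illustration of the optimality result (our own numerical sketch, not from the paper): for a Gaussian AR(1) model with $h_t = y_t - \phi y_{t-1}$, the weights $a_{t-1}^*$ reduce to $-y_{t-1}$, so the quasi-score is the least squares score and its root is the usual least squares estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a Gaussian AR(1) series: y_t = phi*y_{t-1} + e_t, e_t ~ N(0, 1).
phi_true, n = 0.6, 5000
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi_true * y[t - 1] + rng.standard_normal()

# With h_t = y_t - phi*y_{t-1}: E[dh_t/dphi | F_{t-1}] = -y_{t-1} and
# E[h_t^2 | F_{t-1}] = 1, so g*_n(phi) = -sum y_{t-1}(y_t - phi*y_{t-1}),
# whose root is the least squares estimate.
phi_hat = np.sum(y[:-1] * y[1:]) / np.sum(y[:-1] ** 2)

# The quasi-score vanishes at its root.
g_star = -np.sum(y[:-1] * (y[1:] - phi_hat * y[:-1]))
print(round(phi_hat, 2), abs(g_star) < 1e-6)
```

The same recipe (conditional derivative times inverse conditional variance) is what produces the sine and cosine weights in the stable-error models below.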
2.2. Estimating functions for time series with stable errors

For models with stable errors, estimating functions based on sines and cosines are natural, since closed form expressions are available for $E[\cos(uy_t)]$ and $E[\sin(uy_t)]$ through the characteristic function $E[\exp(iuy_t)] = E[\cos(uy_t)] + iE[\sin(uy_t)]$. Suppose first that $y_1, \ldots, y_n$ is a sequence of independent, identically distributed (i.i.d.) random variables following a symmetric stable distribution with real-valued location parameter $\theta$, positive scale parameter $c$, and stability index $\lambda$ with $0 < \lambda \le 2$. In our derivations, we have assumed that $\lambda$ is known. In practice, we estimate $\lambda$ by the quantile estimation method proposed by McCulloch (1986); see Section 5. The characteristic function of $y_t$ takes the form $E[\exp(iuy_t)] = \exp(iu\theta - |cu|^\lambda)$, for $u > 0$. The cases $\lambda = 1$ and $\lambda = 2$ correspond, respectively, to the Cauchy and normal distributions. Consider a bivariate estimating function of the form
\[
h_t'(\theta,c) = \Big( \sin\Big(u_1 \frac{y_t - \theta}{c}\Big), \; \cos\Big(u_2 \frac{y_t - \theta}{c}\Big) \Big)
\]
for estimating the location parameter $\theta$ and scale parameter $c$, where $u_1$ and $u_2$ are chosen to provide adequate information about the parameters, i.e., to maximize the information associated with the estimating functions (see Merkouris, 2007, p. 1996). The first component of this vector is an odd function about the median $\theta$, and corresponds to location estimation, while the second component is an even function and corresponds to estimating the scale parameter. The expectation of $h_t'(\theta,c)$ evaluated at $(\theta_1,c_1)$ is given by
\[
l_t'(\theta_1,c_1) = \Big( \exp\Big(-\Big|\frac{c_1 u_1}{c}\Big|^\lambda\Big) \sin\Big(u_1 \frac{\theta_1 - \theta}{c}\Big), \; \exp\Big(-\Big|\frac{c_1 u_2}{c}\Big|^\lambda\Big) \cos\Big(u_2 \frac{\theta_1 - \theta}{c}\Big) \Big).
\]
Therefore, the expectation of $h_t'(\theta,c)$ evaluated at $(\theta,c)$ is $l_t'(\theta,c) = (0, \exp(-|u_2|^\lambda))$, and its variance-covariance matrix evaluated at $(\theta,c)$ is a diagonal matrix with elements $[1 - \exp(-|2u_1|^\lambda)]/2$ and $[1 + \exp(-|2u_2|^\lambda)]/2 - \exp(-2|u_2|^\lambda)$.

We develop martingale estimating functions based on sines and cosines for parameter estimation in time series models with stable errors, including AR(p), AR(p) with ARCH(q) errors, and RCA(p) models. We assume throughout that $\{y_t\}$ is a scalar-valued process. The AR(p) process is defined as
\[
y_t = \phi'\mathbf{y}_{t-1} + \varepsilon_t, \qquad (2.1)
\]
where $\phi' = (\phi_1, \ldots, \phi_p)$, $\mathbf{y}_{t-1}' = (y_{t-1}, \ldots, y_{t-p})$, and the $\varepsilon_t$'s are i.i.d. random variables following a $\lambda$-stable distribution with location parameter zero and scale parameter $c$. The RCA(p) process is defined as
\[
y_t = (\phi + b_t)'\mathbf{y}_{t-1} + \varepsilon_t, \qquad (2.2)
\]
where $\phi = (\phi_1, \ldots, \phi_p)'$ and $b_t = (b_{t,1}, \ldots, b_{t,p})'$. Assume that $\{\varepsilon_t\}$ is a sequence of i.i.d. $\lambda$-stable random variables with location parameter zero and unknown scale parameter $c_\varepsilon(\phi)$. Assume that the components of $\{b_t\}$ are mutually independent, and are independent of $\{\varepsilon_t\}$. Assume that $\{b_{t,j}\}$ is a sequence of i.i.d. $\lambda$-stable random variables with location parameter zero and unknown scale parameter $\beta_j = c_{b,j}$. Let $\beta = (\beta_1, \ldots, \beta_p)' = (c_{b,1}, \ldots, c_{b,p})'$. Then $y_t$ follows a stable distribution with location parameter $\phi'\mathbf{y}_{t-1}$ and scale parameter $c_t(\phi,\beta)$, where $|c_t|^\lambda = \sum_{j=1}^p |\beta_j|^\lambda |y_{t-j}|^\lambda + |c_\varepsilon(\phi)|^\lambda$. The AR(p) process with ARCH(q) errors is defined by
\[
y_t = \phi'\mathbf{y}_{t-1} + c_t\varepsilon_t, \qquad (2.3)
\]
where $\phi = (\phi_1, \ldots, \phi_p)'$, $\mathbf{y}_{t-1} = (y_{t-1}, \ldots, y_{t-p})'$, $|c_t|^\lambda = \alpha_0 + \alpha_1|y_{t-1}|^\lambda + \cdots + \alpha_q|y_{t-q}|^\lambda$, and the $\varepsilon_t$'s have a $\lambda$-stable distribution with location parameter zero and scale parameter one. These models may be viewed as special cases of the generic form
\[
y_t = f(\mathcal{F}_{t-1}^y, \phi) + c(\mathcal{F}_{t-1}^y, \beta)\varepsilon_t, \qquad (2.4)
\]
where $f$ is a function of the location parameter $\phi$ and $c$ is a function of the scale parameter $\beta$. For the AR(p) model (2.1), $f(\mathcal{F}_{t-1}^y, \phi) = \phi'\mathbf{y}_{t-1}$ and $c(\mathcal{F}_{t-1}^y, \beta) = c$. For the RCA(p) model (2.2), $f(\mathcal{F}_{t-1}^y, \phi) = \phi'\mathbf{y}_{t-1}$ and $c(\mathcal{F}_{t-1}^y, \beta) = c_t$, where $|c_t|^\lambda = \sum_{j=1}^p |\beta_j|^\lambda |y_{t-j}|^\lambda + |c_\varepsilon(\phi)|^\lambda$. For the AR(p) model with stable ARCH(q) errors in (2.3), $f(\mathcal{F}_{t-1}^y, \phi) = \phi'\mathbf{y}_{t-1}$ and $c(\mathcal{F}_{t-1}^y, \beta) = c_t$ is given by $|c_t|^\lambda = \alpha_0 + \alpha_1|y_{t-1}|^\lambda + \cdots + \alpha_q|y_{t-q}|^\lambda$. The estimating functions for the parameters $\phi$ and $\beta$ are given by
\[
h_t'(\phi,\beta) = \Big\{ \sin\Big[u_1 \frac{y_t - f(\mathcal{F}_{t-1}^y,\phi)}{c(\mathcal{F}_{t-1}^y,\beta)}\Big], \; \cos\Big[u_2 \frac{y_t - f(\mathcal{F}_{t-1}^y,\phi)}{c(\mathcal{F}_{t-1}^y,\beta)}\Big] - \exp(-|u_2|^\lambda) \Big\},
\]
where $u_1$ and $u_2$ are chosen to provide adequate information about the location and scale parameters. Moreover, the expectation of $h_t'(\phi,\beta)$ at $(\phi_1,\beta_1)$ is given by
\[
l_t'(\phi_1,\beta_1) = \Big( \exp\Big[-\Big|\frac{c(\mathcal{F}_{t-1}^y,\beta_1)u_1}{c(\mathcal{F}_{t-1}^y,\beta)}\Big|^\lambda\Big] \sin\Big\{u_1 \frac{f(\mathcal{F}_{t-1}^y,\phi_1) - f(\mathcal{F}_{t-1}^y,\phi)}{c(\mathcal{F}_{t-1}^y,\beta)}\Big\}, \; \exp\Big[-\Big|\frac{c(\mathcal{F}_{t-1}^y,\beta_1)u_2}{c(\mathcal{F}_{t-1}^y,\beta)}\Big|^\lambda\Big] \cos\Big\{u_2 \frac{f(\mathcal{F}_{t-1}^y,\phi_1) - f(\mathcal{F}_{t-1}^y,\phi)}{c(\mathcal{F}_{t-1}^y,\beta)}\Big\} - \exp(-|u_2|^\lambda) \Big).
\]
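The characteristic-function identities that make these components martingale differences can be checked by simulation. A minimal sketch (our own illustration, not from the paper), drawing standard symmetric stable variates by the Chambers-Mallows-Stuck method and verifying that $E[\sin(uy_t)] = 0$ and $E[\cos(uy_t)] = \exp(-|u|^\lambda)$ when the location is zero and the scale is one:

```python
import numpy as np

rng = np.random.default_rng(1)

def sym_stable(lam, size, rng):
    """Standard symmetric lambda-stable draws (scale 1, characteristic
    function exp(-|u|^lam)) via the Chambers-Mallows-Stuck method."""
    v = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    if lam == 1.0:
        return np.tan(v)  # Cauchy case
    return (np.sin(lam * v) / np.cos(v) ** (1 / lam)
            * (np.cos((1 - lam) * v) / w) ** ((1 - lam) / lam))

lam, u, n = 1.5, 0.54, 400_000
x = sym_stable(lam, n, rng)

# E sin(u*x) = 0 and E cos(u*x) = exp(-|u|^lam): the identities behind
# the sine/cosine estimating functions.
print(np.mean(np.sin(u * x)), np.mean(np.cos(u * x)), np.exp(-u ** lam))
```

Here $u = 0.54$ is the $u_1$ value tabulated later for $\lambda = 1.5$; any fixed $u$ would do for the check.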
3. Joint estimation of location and scale parameters

For linear or nonlinear time series with stable errors, we assume that $h_t$ has two components based on sine and cosine functions.

3.1. AR(p) models with stable errors

The AR(p) process was defined in (2.1) as (Brockwell and Davis, 2006) $y_t = \phi'\mathbf{y}_{t-1} + \varepsilon_t$, where $\phi' = (\phi_1, \ldots, \phi_p)$, $\mathbf{y}_{t-1}' = (y_{t-1}, \ldots, y_{t-p})$, and the $\varepsilon_t$'s are i.i.d. random variables following a $\lambda$-stable distribution with location parameter zero and scale parameter $c$. In order to estimate the vector parameter $\theta = (\phi', c)'$, we consider a class of martingale differences
\[
h_t(\theta) = \Big\{ \sin\Big(u_1 \frac{y_t - \phi'\mathbf{y}_{t-1}}{c}\Big), \; \cos\Big(u_2 \frac{y_t - \phi'\mathbf{y}_{t-1}}{c}\Big) - \exp(-|u_2|^\lambda) \Big\}, \quad t = p+1, \ldots, n.
\]
The conditional expectation of $h_t$ at $\theta_1 = (\phi_1', c_1)'$ is given by
\[
l_t'(\theta_1) = \Big( \exp\Big(-\Big|\frac{c_1 u_1}{c}\Big|^\lambda\Big) \sin\Big(u_1 \frac{(\phi_1 - \phi)'\mathbf{y}_{t-1}}{c}\Big), \; \exp\Big(-\Big|\frac{c_1 u_2}{c}\Big|^\lambda\Big) \cos\Big(u_2 \frac{(\phi_1 - \phi)'\mathbf{y}_{t-1}}{c}\Big) - \exp(-|u_2|^\lambda) \Big).
\]
Then the optimal estimating function based on $h_t(\theta)$ is
\[
g_n^*(\theta) = \begin{pmatrix} \dfrac{2u_1 e^{-|u_1|^\lambda}}{c(1 - e^{-|2u_1|^\lambda})} \displaystyle\sum_{t=p+1}^n \mathbf{y}_{t-1} \sin\Big(u_1 \dfrac{y_t - \phi'\mathbf{y}_{t-1}}{c}\Big) \\[2ex] -\dfrac{2\lambda|u_2|^\lambda e^{-|u_2|^\lambda}}{c(1 + e^{-|2u_2|^\lambda} - 2e^{-2|u_2|^\lambda})} \displaystyle\sum_{t=p+1}^n \Big[\cos\Big(u_2 \dfrac{y_t - \phi'\mathbf{y}_{t-1}}{c}\Big) - e^{-|u_2|^\lambda}\Big] \end{pmatrix} \qquad (3.1)
\]
and the corresponding information matrix is
\[
I_{g_n^*} = \begin{pmatrix} \dfrac{2u_1^2 e^{-2|u_1|^\lambda}}{c^2[1 - e^{-|2u_1|^\lambda}]} \displaystyle\sum_{t=p+1}^n \mathbf{y}_{t-1}\mathbf{y}_{t-1}' & 0 \\[2ex] 0 & \dfrac{2n\lambda^2 u_2^{2\lambda} e^{-2|u_2|^\lambda}}{c^2(1 + e^{-|2u_2|^\lambda} - 2e^{-2|u_2|^\lambda})} \end{pmatrix}.
\]
Similar to the discussion in Merkouris (2007), for each $\lambda$, optimal values of $u_1$ and $u_2$ can be chosen to maximize the information. Specifically, we choose $u_1$ to maximize the expression $2u_1^2\exp(-2|u_1|^\lambda)/[1 - \exp(-|2u_1|^\lambda)]$, while $u_2$ is chosen to maximize the expression $2\lambda^2 u_2^{2\lambda}\exp(-2|u_2|^\lambda)/(1 + \exp(-|2u_2|^\lambda) - 2\exp(-2|u_2|^\lambda))$. These values of $u_1$ and $u_2$ are shown in Table 1 for different values of $\lambda$.

As an illustration, consider $n$ observations $y_1, \ldots, y_n$ from an AR(1) model $y_t = \phi y_{t-1} + \varepsilon_t$, where the $\varepsilon_t$'s are i.i.d. $\lambda$-stable random variables with location parameter zero and an unknown scale parameter $c$. Given $\lambda$, the optimal estimating function for $(\phi, c)$ based on the $h_t$'s is
\[
g_n^*(\theta) = \begin{pmatrix} \dfrac{2u_1 e^{-|u_1|^\lambda}}{c(1 - e^{-|2u_1|^\lambda})} \displaystyle\sum_{t=2}^n y_{t-1} \sin\Big(u_1 \dfrac{y_t - \phi y_{t-1}}{c}\Big) \\[2ex] -\dfrac{2\lambda|u_2|^\lambda e^{-|u_2|^\lambda}}{c(1 + e^{-|2u_2|^\lambda} - 2e^{-2|u_2|^\lambda})} \displaystyle\sum_{t=2}^n \Big[\cos\Big(u_2 \dfrac{y_t - \phi y_{t-1}}{c}\Big) - e^{-|u_2|^\lambda}\Big] \end{pmatrix}
\]
Table 1. Values of u1 and u2.

λ      u1     u2
0.10   5.0    0.31
0.20   5.0    0.54
0.30   5.0    0.65
0.40   5.0    0.71
0.50   3.62   0.75
0.60   2.06   0.77
0.70   1.42   0.78
0.80   1.10   0.79
0.90   0.92   0.80
1.00   0.80   0.80
1.10   0.71   0.79
1.20   0.65   0.79
1.30   0.61   0.78
1.40   0.57   0.77
1.50   0.54   0.76
1.60   0.51   0.75
1.70   0.48   0.73
1.80   0.44   0.69
1.90   0.39   0.64
2.00   0.01   0.01
and the corresponding information matrix is
\[
I_{g_n^*} = \begin{pmatrix} \dfrac{2u_1^2 e^{-2|u_1|^\lambda}}{c^2(1 - e^{-|2u_1|^\lambda})} \displaystyle\sum_{t=2}^n y_{t-1}^2 & 0 \\[2ex] 0 & \dfrac{2n\lambda^2 u_2^{2\lambda} e^{-2|u_2|^\lambda}}{c^2(1 + e^{-|2u_2|^\lambda} - 2e^{-2|u_2|^\lambda})} \end{pmatrix}.
\]
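The entries of Table 1 can be reproduced by maximizing the two information expressions numerically. A minimal sketch (our own illustration, using a plain grid search in place of a formal optimizer):

```python
import numpy as np

def opt_u(lam, grid=np.arange(0.001, 5.001, 0.001)):
    """Grid-search the u1 and u2 that maximize the two per-observation
    information expressions for a given stability index lam."""
    u = grid
    info1 = 2 * u**2 * np.exp(-2 * u**lam) / (1 - np.exp(-(2 * u)**lam))
    info2 = (2 * lam**2 * u**(2 * lam) * np.exp(-2 * u**lam)
             / (1 + np.exp(-(2 * u)**lam) - 2 * np.exp(-2 * u**lam)))
    return u[np.argmax(info1)], u[np.argmax(info2)]

for lam in (0.5, 1.0, 1.5):
    u1, u2 = opt_u(lam)
    print(lam, round(u1, 2), round(u2, 2))
```

For $\lambda = 1$ both expressions coincide (up to the factor $\lambda^2 u^{2\lambda} = u^2$), which is why Table 1 reports $u_1 = u_2 = 0.80$ there; for $\lambda = 2$ the first expression is maximized at the boundary $u \to 0$, which the table records as 0.01.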
3.2. RCA(p) models with stable errors

We describe estimating functions for the RCA model when the location and scale of the observed process depend on the same parameter. The RCA(p) process was defined in (2.2) as $y_t = (\phi + b_t)'\mathbf{y}_{t-1} + \varepsilon_t$, where $\phi = (\phi_1, \ldots, \phi_p)'$ and $b_t = (b_{t,1}, \ldots, b_{t,p})'$. Assume that $\{\varepsilon_t\}$ is a sequence of i.i.d. $\lambda$-stable random variables with location parameter zero and unknown scale parameter $c_\varepsilon(\phi)$, that the components of $\{b_t\}$ are mutually independent and independent of $\{\varepsilon_t\}$, and that $\{b_{t,j}\}$ is a sequence of i.i.d. $\lambda$-stable random variables with location parameter zero and unknown scale parameter $\beta_j = c_{b,j}$. Let $\beta = (\beta_1, \ldots, \beta_p)' = (c_{b,1}, \ldots, c_{b,p})'$. Then the distribution of $y_t$, conditional on the past, is stable with location parameter $\phi'\mathbf{y}_{t-1}$ and scale parameter $c_t(\phi,\beta)$, where $|c_t|^\lambda = \sum_{j=1}^p |\beta_j|^\lambda |y_{t-j}|^\lambda + |c_\varepsilon(\phi)|^\lambda$. This is a variation of the RCA process defined by Nicholls and Quinn (1980). In order to jointly estimate the vector parameter $\theta = (\phi', \beta')'$, we consider a class of estimating functions
\[
h_t'(\theta) = \Big\{ \sin\Big(u_1 \frac{y_t - \phi'\mathbf{y}_{t-1}}{c_t}\Big), \; \cos\Big(u_2 \frac{y_t - \phi'\mathbf{y}_{t-1}}{c_t}\Big) - \exp(-|u_2|^\lambda) \Big\}, \quad t = p+1, \ldots, n.
\]
The conditional expectation of $h_t$ at $\theta_1 = (\phi_1', \beta_1')'$ is given by
\[
l_t'(\theta_1) = \Big( \exp\Big(-\Big|\frac{c_t^1 u_1}{c_t}\Big|^\lambda\Big) \sin\Big(u_1 \frac{(\phi_1 - \phi)'\mathbf{y}_{t-1}}{c_t}\Big), \; \exp\Big(-\Big|\frac{c_t^1 u_2}{c_t}\Big|^\lambda\Big) \cos\Big(u_2 \frac{(\phi_1 - \phi)'\mathbf{y}_{t-1}}{c_t}\Big) - \exp(-|u_2|^\lambda) \Big),
\]
where $c_t^1 = c_t(\phi_1, \beta_1)$.

Theorem 1. The optimal estimating function based on the martingale differences $h_t$ is
\[
g_n^*(\theta) = \begin{pmatrix} \dfrac{2u_1 e^{-|u_1|^\lambda}}{1 - e^{-|2u_1|^\lambda}} \displaystyle\sum_{t=p+1}^n \dfrac{\mathbf{y}_{t-1}}{c_t} \sin\Big(u_1 \dfrac{y_t - \phi'\mathbf{y}_{t-1}}{c_t}\Big) \\[2ex] -\dfrac{2\lambda|u_2|^\lambda e^{-|u_2|^\lambda}}{1 + e^{-|2u_2|^\lambda} - 2e^{-2|u_2|^\lambda}} \displaystyle\sum_{t=p+1}^n \dfrac{1}{|c_t|^\lambda} \Big(|\beta_1|^{\lambda-1}\dfrac{\partial|\beta_1|}{\partial\beta_1}|y_{t-1}|^\lambda, \ldots, |\beta_p|^{\lambda-1}\dfrac{\partial|\beta_p|}{\partial\beta_p}|y_{t-p}|^\lambda\Big)' \Big[\cos\Big(u_2 \dfrac{y_t - \phi'\mathbf{y}_{t-1}}{c_t}\Big) - e^{-|u_2|^\lambda}\Big] \end{pmatrix} \qquad (3.2)
\]
and the corresponding information matrix is the symmetric matrix
\[
I_{g_n^*} = \begin{pmatrix} \dfrac{2u_1^2 e^{-2|u_1|^\lambda}}{1 - e^{-|2u_1|^\lambda}} \displaystyle\sum_{t=p+1}^n \dfrac{\mathbf{y}_{t-1}\mathbf{y}_{t-1}'}{c_t^2} & 0 \\[2ex] 0 & \dfrac{2\lambda^2 u_2^{2\lambda} e^{-2|u_2|^\lambda}}{1 + e^{-|2u_2|^\lambda} - 2e^{-2|u_2|^\lambda}} \displaystyle\sum_{t=p+1}^n \dfrac{d_t d_t'}{|c_t|^{2\lambda}} \end{pmatrix},
\]
where $d_t = (|\beta_1|^{\lambda-1}\,\partial|\beta_1|/\partial\beta_1\,|y_{t-1}|^\lambda, \ldots, |\beta_p|^{\lambda-1}\,\partial|\beta_p|/\partial\beta_p\,|y_{t-p}|^\lambda)'$.

Proof. The optimal estimating function based on $h_t(\theta)$ is given by
\[
g_n^*(\theta) = \sum_{t=p+1}^n \Big[\frac{\partial l_t(\theta_1)}{\partial\theta_1}\Big]_{\theta_1=\theta}' \big(E[h_t(\theta)h_t(\theta)' \mid \mathcal{F}_{t-1}^y]\big)^{-1} h_t(\theta).
\]
Evaluating the derivatives of $l_t(\theta_1)$ at $\theta_1 = \theta$ gives
\[
\Big[\frac{\partial l_t(\theta_1)}{\partial\phi_1}\Big]_{\theta_1=\theta} = \Big(\frac{u_1 e^{-|u_1|^\lambda}}{c_t}\mathbf{y}_{t-1}, \; 0\Big), \qquad \Big[\frac{\partial l_t(\theta_1)}{\partial\beta_1}\Big]_{\theta_1=\theta} = \Big(0, \; -\frac{\lambda|u_2|^\lambda e^{-|u_2|^\lambda}}{|c_t|^\lambda} d_t\Big),
\]
while $E[h_t(\theta)h_t(\theta)' \mid \mathcal{F}_{t-1}^y]$ is the diagonal matrix with entries $[1 - e^{-|2u_1|^\lambda}]/2$ and $[1 + e^{-|2u_2|^\lambda}]/2 - e^{-2|u_2|^\lambda}$. Substituting these expressions yields the stated form of $g_n^*(\theta)$. Hence, the corresponding information matrix can be computed by using the equation
\[
I_{g_n^*} = \sum_{t=p+1}^n \Big[\frac{\partial l_t(\theta_1)}{\partial\theta_1}\Big]_{\theta_1=\theta}' \big(E[h_t(\theta)h_t(\theta)' \mid \mathcal{F}_{t-1}^y]\big)^{-1} \Big[\frac{\partial l_t(\theta_1)}{\partial\theta_1}\Big]_{\theta_1=\theta}. \quad \square
\]
Optimal values of $u_1$ and $u_2$ are the same as those obtained for the AR models, and are shown in Table 1. As an illustration, consider $n$ observations $y_1, \ldots, y_n$ from an RCA(1) model. We can show that the optimal estimating function is
\[
g_n^*(\theta) = \begin{pmatrix} \dfrac{2u_1 e^{-|u_1|^\lambda}}{1 - e^{-|2u_1|^\lambda}} \displaystyle\sum_{t=2}^n \dfrac{y_{t-1}}{c_t} \sin\Big(u_1 \dfrac{y_t - \phi y_{t-1}}{c_t}\Big) \\[2ex] -\dfrac{2\lambda|u_2|^\lambda e^{-|u_2|^\lambda}\,|\beta|^{\lambda-1}\,\partial|\beta|/\partial\beta}{1 + e^{-|2u_2|^\lambda} - 2e^{-2|u_2|^\lambda}} \displaystyle\sum_{t=2}^n \dfrac{|y_{t-1}|^\lambda}{|c_t|^\lambda} \Big[\cos\Big(u_2 \dfrac{y_t - \phi y_{t-1}}{c_t}\Big) - e^{-|u_2|^\lambda}\Big] \end{pmatrix}
\]
and the corresponding information matrix is
\[
I_{g_n^*} = \begin{pmatrix} \dfrac{2u_1^2 e^{-2|u_1|^\lambda}}{1 - e^{-|2u_1|^\lambda}} \displaystyle\sum_{t=2}^n \dfrac{y_{t-1}^2}{c_t^2} & 0 \\[2ex] 0 & \dfrac{2\lambda^2 u_2^{2\lambda} e^{-2|u_2|^\lambda}\,\beta^{2\lambda-2}}{1 + e^{-|2u_2|^\lambda} - 2e^{-2|u_2|^\lambda}} \displaystyle\sum_{t=2}^n \dfrac{y_{t-1}^{2\lambda}}{c_t^{2\lambda}} \end{pmatrix}.
\]
3.3. AR(p) models with stable ARCH(q) errors

The AR(p) time series with stable ARCH(q) errors was defined in (2.3) as $y_t = \phi'\mathbf{y}_{t-1} + c_t\varepsilon_t$, where $\phi = (\phi_1, \ldots, \phi_p)'$, $\mathbf{y}_{t-1} = (y_{t-1}, \ldots, y_{t-p})'$, $|c_t|^\lambda = \alpha_0 + \alpha_1|y_{t-1}|^\lambda + \cdots + \alpha_q|y_{t-q}|^\lambda$, and the $\varepsilon_t$'s have a stable distribution with location parameter zero and scale parameter one. When $\lambda = 2$, the error term $c_t\varepsilon_t$ corresponds to an AR(p) model with normal ARCH(q) errors
(Engle, 1982). When $\lambda = 1$, $c_t$ corresponds to the scale parameter defined by Panorska et al. (1995). Let $\alpha = (\alpha_0, \alpha_1, \ldots, \alpha_q)'$, $Y_{t-1} = (1, |y_{t-1}|^\lambda, \ldots, |y_{t-q}|^\lambda)'$, and $r = \max\{p+1, q+1\}$. In order to jointly estimate the vector parameter $\theta = (\phi', \alpha')'$, we consider a class of estimating functions
\[
h_t'(\theta) = \Big\{ \sin\Big(u_1 \frac{y_t - \phi'\mathbf{y}_{t-1}}{c_t}\Big), \; \cos\Big(u_2 \frac{y_t - \phi'\mathbf{y}_{t-1}}{c_t}\Big) - \exp(-|u_2|^\lambda) \Big\}, \quad t = r, \ldots, n.
\]
The conditional expectation of $h_t$ at $\theta_1 = (\phi_1', \alpha_1')'$ is given by
\[
l_t'(\theta_1) = \Big( \exp\Big(-\frac{\alpha_1'Y_{t-1}|u_1|^\lambda}{|c_t|^\lambda}\Big) \sin\Big(u_1 \frac{(\phi_1 - \phi)'\mathbf{y}_{t-1}}{c_t}\Big), \; \exp\Big(-\frac{\alpha_1'Y_{t-1}|u_2|^\lambda}{|c_t|^\lambda}\Big) \cos\Big(u_2 \frac{(\phi_1 - \phi)'\mathbf{y}_{t-1}}{c_t}\Big) - \exp(-|u_2|^\lambda) \Big).
\]

Theorem 2. The optimal estimating function based on $h_t$ is
\[
g_n^*(\theta) = \begin{pmatrix} \dfrac{2u_1 e^{-|u_1|^\lambda}}{1 - e^{-|2u_1|^\lambda}} \displaystyle\sum_{t=r}^n \dfrac{\mathbf{y}_{t-1}}{c_t} \sin\Big(u_1 \dfrac{y_t - \phi'\mathbf{y}_{t-1}}{c_t}\Big) \\[2ex] -\dfrac{2|u_2|^\lambda e^{-|u_2|^\lambda}}{1 + e^{-|2u_2|^\lambda} - 2e^{-2|u_2|^\lambda}} \displaystyle\sum_{t=r}^n \dfrac{Y_{t-1}}{|c_t|^\lambda} \Big[\cos\Big(u_2 \dfrac{y_t - \phi'\mathbf{y}_{t-1}}{c_t}\Big) - e^{-|u_2|^\lambda}\Big] \end{pmatrix}
\]
and the corresponding information matrix is the symmetric matrix
\[
I_{g_n^*} = \begin{pmatrix} I_{\phi\phi} & I_{\phi\alpha} \\ I_{\alpha\phi} & I_{\alpha\alpha} \end{pmatrix},
\]
where
\[
I_{\phi\phi} = \frac{2u_1^2 e^{-2|u_1|^\lambda}}{1 - e^{-|2u_1|^\lambda}} \sum_{t=r}^n \frac{\mathbf{y}_{t-1}\mathbf{y}_{t-1}'}{c_t^2}, \qquad I_{\phi\alpha} = I_{\alpha\phi}' = 0, \qquad I_{\alpha\alpha} = \frac{2u_2^{2\lambda} e^{-2|u_2|^\lambda}}{1 + e^{-|2u_2|^\lambda} - 2e^{-2|u_2|^\lambda}} \sum_{t=r}^n \frac{Y_{t-1}Y_{t-1}'}{|c_t|^{2\lambda}}.
\]
Proof. The optimal estimating function based on $h_t(\theta)$ is given by
\[
g_n^*(\theta) = \sum_{t=r}^n \Big[\frac{\partial l_t(\theta_1)}{\partial\theta_1}\Big]_{\theta_1=\theta}' \big(E[h_t(\theta)h_t(\theta)' \mid \mathcal{F}_{t-1}^y]\big)^{-1} h_t(\theta),
\]
where
\[
\Big[\frac{\partial l_t(\theta_1)}{\partial\phi_1}\Big]_{\theta_1=\theta} = \Big(\frac{u_1 e^{-|u_1|^\lambda}}{c_t}\mathbf{y}_{t-1}, \; 0\Big), \qquad \Big[\frac{\partial l_t(\theta_1)}{\partial\alpha_1}\Big]_{\theta_1=\theta} = \Big(0, \; -\frac{|u_2|^\lambda e^{-|u_2|^\lambda}}{|c_t|^\lambda} Y_{t-1}\Big),
\]
and $E[h_t(\theta)h_t(\theta)' \mid \mathcal{F}_{t-1}^y]$ is the diagonal matrix with entries $[1 - e^{-|2u_1|^\lambda}]/2$ and $[1 + e^{-|2u_2|^\lambda}]/2 - e^{-2|u_2|^\lambda}$. Substituting gives the stated form of $g_n^*(\theta)$. The corresponding information matrix can be computed as
\[
I_{g_n^*} = \sum_{t=r}^n \Big[\frac{\partial l_t(\theta_1)}{\partial\theta_1}\Big]_{\theta_1=\theta}' \big(E[h_t(\theta)h_t(\theta)' \mid \mathcal{F}_{t-1}^y]\big)^{-1} \Big[\frac{\partial l_t(\theta_1)}{\partial\theta_1}\Big]_{\theta_1=\theta}. \quad \square
\]
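As a numerical sanity check (our own illustration, not from the paper), one can simulate an AR(1) model with ARCH(1) errors for $\lambda = 2$, where a standard 2-stable draw is $N(0,2)$, and verify that both components of $h_t$ average to zero at the true parameters; the $u_1$, $u_2$ below are arbitrary illustration values, not the tabulated optima.

```python
import numpy as np

rng = np.random.default_rng(2)

# AR(1)-ARCH(1) with lambda = 2: |c_t|^2 = a0 + a1*|y_{t-1}|^2, and eps_t
# standard 2-stable, i.e. N(0, 2) (characteristic function exp(-u^2)).
phi, a0, a1, n = 0.3, 1.0, 0.4, 200_000
lam, u1, u2 = 2.0, 0.5, 0.5
y = np.zeros(n)
c = np.zeros(n)
for t in range(1, n):
    c[t] = np.sqrt(a0 + a1 * y[t - 1] ** 2)
    y[t] = phi * y[t - 1] + c[t] * np.sqrt(2.0) * rng.standard_normal()

z = (y[1:] - phi * y[:-1]) / c[1:]   # standardized residuals at the truth

# Both components of h_t should have mean zero at the true parameters.
h1 = np.sin(u1 * z)
h2 = np.cos(u2 * z) - np.exp(-abs(u2) ** lam)
print(np.mean(h1), np.mean(h2))
```

The same check fails at wrong parameter values, which is what the estimating equations $g_n^*(\theta) = 0$ exploit.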
It is of interest to note that when the $\varepsilon_t$'s have a Cauchy or a normal distribution with density $f(\cdot)$, the conditional density function of $y_t$ given $\mathbf{y}_{t-1}$ and $Y_{t-1}$ is $c_t^{-1}f((y_t - \phi'\mathbf{y}_{t-1})/c_t)$, and the optimal estimating function turns out to be the score function $s_n(\theta)$,
\[
s_n(\theta) = \begin{pmatrix} -\displaystyle\sum_{t=r}^n \frac{\dot f(z_t)}{f(z_t)} \frac{\mathbf{y}_{t-1}}{c_t} \\[2ex] -\displaystyle\sum_{t=r}^n \Big(1 + z_t\frac{\dot f(z_t)}{f(z_t)}\Big) \frac{Y_{t-1}}{\lambda|c_t|^\lambda} \end{pmatrix}, \qquad z_t = \frac{y_t - \phi'\mathbf{y}_{t-1}}{c_t}.
\]
Let $i_f = \int [\dot f(z)/f(z)]^2 f(z)\,dz$; then the conditional Fisher information matrix $I_n(\theta)$ is given by
\[
I_n(\theta) = \begin{pmatrix} i_f \displaystyle\sum_{t=r}^n \dfrac{\mathbf{y}_{t-1}\mathbf{y}_{t-1}'}{c_t^2} & i_f \displaystyle\sum_{t=r}^n \dfrac{\mathbf{y}_{t-1}Y_{t-1}'}{c_t^2} \\[2ex] i_f \displaystyle\sum_{t=r}^n \dfrac{Y_{t-1}\mathbf{y}_{t-1}'}{c_t^2} & i_f \displaystyle\sum_{t=r}^n \dfrac{Y_{t-1}Y_{t-1}'}{c_t^2} \end{pmatrix},
\]
where $i_f = 1/2$ for the Cauchy density, and $i_f = 1$ for the normal density.

As an illustration, consider the AR(1) time series with ARCH(1) errors given by $y_t = \phi y_{t-1} + c_t\varepsilon_t$, where $|c_t|^\lambda = \alpha_0 + \alpha_1|y_{t-1}|^\lambda$, and the $\varepsilon_t$'s have a stable distribution with location parameter zero and scale parameter one. First, an ad hoc quantile estimator of the unknown $\lambda$ is used (McCulloch, 1986). Then, the three estimating equations shown below are solved for the parameters $\phi$, $\alpha_0$ and $\alpha_1$. The optimal estimating function for $(\phi, \alpha_0, \alpha_1)'$ based on the $h_t$'s has the form
\[
g_n^*(\theta) = \begin{pmatrix} \dfrac{2u_1 e^{-|u_1|^\lambda}}{1 - e^{-|2u_1|^\lambda}} \displaystyle\sum_{t=2}^n \dfrac{y_{t-1}}{c_t} \sin\Big(u_1 \dfrac{y_t - \phi y_{t-1}}{c_t}\Big) \\[2ex] -\dfrac{2|u_2|^\lambda e^{-|u_2|^\lambda}}{1 + e^{-|2u_2|^\lambda} - 2e^{-2|u_2|^\lambda}} \displaystyle\sum_{t=2}^n \dfrac{1}{|c_t|^\lambda} \Big[\cos\Big(u_2 \dfrac{y_t - \phi y_{t-1}}{c_t}\Big) - e^{-|u_2|^\lambda}\Big] \\[2ex] -\dfrac{2|u_2|^\lambda e^{-|u_2|^\lambda}}{1 + e^{-|2u_2|^\lambda} - 2e^{-2|u_2|^\lambda}} \displaystyle\sum_{t=2}^n \dfrac{|y_{t-1}|^\lambda}{|c_t|^\lambda} \Big[\cos\Big(u_2 \dfrac{y_t - \phi y_{t-1}}{c_t}\Big) - e^{-|u_2|^\lambda}\Big] \end{pmatrix}
\]
and the corresponding information matrix is a $3 \times 3$ symmetric matrix with components
\[
I_{11} = \frac{2u_1^2 e^{-2|u_1|^\lambda}}{1 - e^{-|2u_1|^\lambda}} \sum_{t=2}^n \frac{y_{t-1}^2}{c_t^2}, \qquad I_{22} = \frac{2u_2^{2\lambda} e^{-2|u_2|^\lambda}}{1 + e^{-|2u_2|^\lambda} - 2e^{-2|u_2|^\lambda}} \sum_{t=2}^n \frac{1}{|c_t|^{2\lambda}},
\]
\[
I_{23} = I_{32} = \frac{2u_2^{2\lambda} e^{-2|u_2|^\lambda}}{1 + e^{-|2u_2|^\lambda} - 2e^{-2|u_2|^\lambda}} \sum_{t=2}^n \frac{|y_{t-1}|^\lambda}{|c_t|^{2\lambda}}, \qquad I_{33} = \frac{2u_2^{2\lambda} e^{-2|u_2|^\lambda}}{1 + e^{-|2u_2|^\lambda} - 2e^{-2|u_2|^\lambda}} \sum_{t=2}^n \frac{|y_{t-1}|^{2\lambda}}{|c_t|^{2\lambda}},
\]
and $I_{ij} = 0$ otherwise.

4. Recursive estimation of location for known scale

One effective approach for parametric inference via estimating functions is recursive estimation. Recursive estimation for stable processes in the Bayesian context (Thavaneswaran and Peiris, 1999) was based on application of minimum dispersion methods, assuming the stability index $\lambda > 1$. Recursive estimation can be studied when $0 < \lambda \le 2$ by using the general estimating function theory (Thavaneswaran and Abraham, 1988), which provides a clear focus on particular classes of estimating functions and, within these, leads to optimal estimates with large information. This theory has not yet been used in inference for nonlinear time series models with stable errors, and it deserves attention both for its concentration on information issues and for its focus on the choice of an optimal family of estimating functions.

4.1. Recursive estimation of the location parameter in AR(p) models

We describe recursive estimation of $\phi$ for known scale $c$. Based on $g_n^*(\theta)$ in (3.1), the optimal estimate is obtained by solving the estimating equation
\[
K(u,\lambda) \sum_{t=p+1}^n \mathbf{y}_{t-1} \sin\Big(u \frac{y_t - \phi'\mathbf{y}_{t-1}}{c}\Big) = 0,
\]
where $K(u,\lambda) = 2u e^{-|u|^\lambda}/[c(1 - e^{-|2u|^\lambda})]$. Since the above equation is nonlinear, a recursive parameter estimation procedure is proposed by using a Taylor expansion and substituting the recursive estimate for $\phi$ at each step. The estimate based on the
first $t-1$ observations is given by
\[
\hat\phi_{t-1} = \Big\{ u\sum_{s=p+1}^{t-1} \frac{\mathbf{y}_{s-1}\mathbf{y}_{s-1}'}{c} \cos\Big[u\Big(\frac{y_s - \hat\phi_{s-1}'\mathbf{y}_{s-1}}{c}\Big)\Big] \Big\}^{-1} \Big\{ \sum_{s=p+1}^{t-1} \mathbf{y}_{s-1}\sin\Big[u\Big(\frac{y_s - \hat\phi_{s-1}'\mathbf{y}_{s-1}}{c}\Big)\Big] + u\sum_{s=p+1}^{t-1} \frac{\mathbf{y}_{s-1}\mathbf{y}_{s-1}'}{c} \hat\phi_{s-1}\cos\Big[u\Big(\frac{y_s - \hat\phi_{s-1}'\mathbf{y}_{s-1}}{c}\Big)\Big] \Big\}.
\]
When the $t$th observation becomes available, the estimate $\hat\phi_t$ has the same form with the sums running to $t$. Let
\[
J_t^{-1} = uK(u,\lambda) \sum_{s=p+1}^{t} \frac{\mathbf{y}_{s-1}\mathbf{y}_{s-1}'}{c} \cos\Big[u\Big(\frac{y_s - \hat\phi_{s-1}'\mathbf{y}_{s-1}}{c}\Big)\Big];
\]
then
\[
J_t^{-1} = J_{t-1}^{-1} + \frac{uK(u,\lambda)\mathbf{y}_{t-1}\mathbf{y}_{t-1}'}{c} \cos\Big[u\Big(\frac{y_t - \hat\phi_{t-1}'\mathbf{y}_{t-1}}{c}\Big)\Big],
\]
and
\[
\hat\phi_t - \hat\phi_{t-1} = J_t(J_t^{-1}\hat\phi_t - J_t^{-1}\hat\phi_{t-1}) = J_t\Big(J_t^{-1}\hat\phi_t - J_{t-1}^{-1}\hat\phi_{t-1} - \frac{uK(u,\lambda)\mathbf{y}_{t-1}\mathbf{y}_{t-1}'}{c}\cos\Big[u\Big(\frac{y_t - \hat\phi_{t-1}'\mathbf{y}_{t-1}}{c}\Big)\Big]\hat\phi_{t-1}\Big) = J_t K(u,\lambda)\mathbf{y}_{t-1}\sin\Big[u\Big(\frac{y_t - \hat\phi_{t-1}'\mathbf{y}_{t-1}}{c}\Big)\Big].
\]
It is easy to show that the recursive equations for $\phi$ become
\[
J_t = \Big\{ J_p + \frac{uK(u,\lambda)\mathbf{y}_{t-1}\mathbf{y}_{t-1}'}{c} \cos\Big[u\Big(\frac{y_t - \hat\phi_{t-1}'\mathbf{y}_{t-1}}{c}\Big)\Big] J_{t-1} \Big\}^{-1} J_{t-1}, \qquad (4.1)
\]
where $J_p$ is the $p \times p$ identity matrix, and
\[
\hat\phi_t = \hat\phi_{t-1} + J_t K(u,\lambda)\mathbf{y}_{t-1}\sin\Big[u\Big(\frac{y_t - \hat\phi_{t-1}'\mathbf{y}_{t-1}}{c}\Big)\Big]. \qquad (4.2)
\]
We provide explicit recursion formulas for estimating the AR(1) parameter $\phi$ when $c = 1$. Based on $g_n^*(\phi)$, we can show that
\[
J_t^{-1} = J_{t-1}^{-1} + uK(u,\lambda)y_{t-1}^2\cos[u(y_t - \hat\phi_{t-1}y_{t-1})],
\]
and
\[
\hat\phi_t - \hat\phi_{t-1} = J_t(J_t^{-1}\hat\phi_t - J_t^{-1}\hat\phi_{t-1}) = J_t(J_t^{-1}\hat\phi_t - J_{t-1}^{-1}\hat\phi_{t-1} - uK(u,\lambda)y_{t-1}^2\cos[u(y_t - \hat\phi_{t-1}y_{t-1})]\hat\phi_{t-1}) = J_t K(u,\lambda)y_{t-1}\sin[u(y_t - \hat\phi_{t-1}y_{t-1})].
\]
It is straightforward to show that the recursive equations are
\[
J_t = \frac{J_{t-1}}{1 + uK(u,\lambda)y_{t-1}^2\cos[u(y_t - \hat\phi_{t-1}y_{t-1})]J_{t-1}} \qquad (4.3)
\]
and
\[
\hat\phi_t = \hat\phi_{t-1} + J_t K(u,\lambda)y_{t-1}\sin[u(y_t - \hat\phi_{t-1}y_{t-1})]. \qquad (4.4)
\]
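A minimal sketch of the scalar recursion (4.3)-(4.4) (our own illustration, not the paper's Fortran 77 implementation): we use Gaussian errors, i.e. $\lambda = 2$ with known scale $c = 1$, where a standard 2-stable draw is $N(0,2)$, and the tabulated $u_1 = 0.01$.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate an AR(1) series with lambda = 2 stable errors of scale c = 1,
# i.e. N(0, 2) innovations (characteristic function exp(-u^2)).
phi_true, n = 0.5, 2000
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi_true * y[t - 1] + np.sqrt(2.0) * rng.standard_normal()

lam, u, c = 2.0, 0.01, 1.0           # u1 = 0.01 for lambda = 2 (Table 1)
K = 2 * u * np.exp(-u**lam) / (c * (1 - np.exp(-(2 * u)**lam)))

# On-line recursions (4.3)-(4.4): update J_t and phi_hat per observation.
J, phi_hat = 1.0, 0.0
for t in range(1, n):
    arg = u * (y[t] - phi_hat * y[t - 1])
    J = J / (1 + u * K * y[t - 1] ** 2 * np.cos(arg) * J)
    phi_hat = phi_hat + J * K * y[t - 1] * np.sin(arg)

print(round(phi_hat, 2))
```

For $\lambda = 2$ and small $u$, $\sin(ux) \approx ux$ and $\cos(ux) \approx 1$, so the recursion behaves like recursive least squares; for heavier tails the bounded sine and cosine terms damp the influence of outlying observations.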
We present results from simulated data to demonstrate the effectiveness of parameter estimation based on the estimating function approach. The overriding advantage is the speed of the computation needed to obtain highly accurate estimates. We generated M = 500 sets of data, each of size n = 1000, from an AR(1) process with $\lambda$-stable errors. We chose four values of $\phi$, viz., 0.5, 0.9, $-0.5$, and $-0.9$, and four values of $\lambda$, viz., 2.0, 1.5, 1.0, and 0.5. For each generated series $y_t$, we first obtained the residuals $\hat\varepsilon_t = y_t - \hat\phi_0 y_{t-1}$, where $\hat\phi_0$ is the sample serial correlation. We estimated $\lambda$ via the quantile estimate $\tilde\lambda$ (McCulloch, 1986). The estimate $\hat\phi_0$ also serves as the initial estimate to start the recursions in (4.3) and (4.4). These computations were run on a Unix system (Intel(R) Core(TM)2 Quad CPU Q6700 at 2.66 GHz, 4096 kB cache size). For the M = 500 simulated sets, the CPU time for the recursive estimation using Fortran 77 code was less than 30 s on this system. The empirical median as well as the 25th and 75th percentiles based on the M = 500 simulated sets are shown in Table 2 for $\tilde\lambda$ and $\hat\phi_n$. It may be seen that the recursive procedure yields an extremely accurate estimate $\hat\phi_n$ of $\phi$, for all the values of $\lambda$ and $\phi$.
Table 2. Percentiles of the distribution of recursive estimates from a stable AR(1) process: n = 1000, M = 500.

True λ      True φ   Estimate   25th      50th      75th
2.0          0.5     λ~          1.9143    2.0       2.0
(u1 = 0.01)  0.5     φ̂n         0.4802    0.4983    0.5183
             0.9     λ~          1.9111    2.0       2.0
             0.9     φ̂n         0.8897    0.8992    0.9085
            -0.5     λ~          1.9163    2.0       2.0
            -0.5     φ̂n        -0.5143   -0.50     -0.4815
            -0.9     λ~          1.9165    2.0       2.0
            -0.9     φ̂n        -0.9071   -0.8984   -0.8889
1.5          0.5     λ~          1.4657    1.5064    1.5471
(u1 = 0.54)  0.5     φ̂n         0.4660    0.4998    0.5424
             0.9     λ~          1.4639    1.5052    1.5473
             0.9     φ̂n         0.8940    0.8996    0.9038
            -0.5     λ~          1.4649    1.5068    1.5444
            -0.5     φ̂n        -0.5301   -0.5005   -0.4370
            -0.9     λ~          1.4664    1.5013    1.5429
            -0.9     φ̂n        -0.9080   -0.8998   -0.8917
1.0          0.5     λ~          0.9770    1.0016    1.0374
(u1 = 0.8)   0.5     φ̂n         0.0788    0.4810    0.7497
             0.9     λ~          0.9791    1.0093    1.0479
             0.9     φ̂n         0.7765    0.8995    0.9433
            -0.5     λ~          0.9772    1.0018    1.0360
            -0.5     φ̂n        -0.7230   -0.4952   -0.1574
            -0.9     λ~          0.9843    1.0112    1.0462
            -0.9     φ̂n        -1.0132   -0.9004   -0.8204
0.5          0.5     λ~          0.4855    0.4979    0.5271
(u1 = 3.62)  0.5     φ̂n         0.4727    0.4984    0.5124
             0.9     λ~          0.4441    0.5080    0.5703
             0.9     φ̂n         0.8939    0.8981    0.9018
            -0.5     λ~          0.4886    0.4990    0.5265
            -0.5     φ̂n        -0.5175   -0.5005   -0.4819
            -0.9     λ~          0.4844    0.5225    0.5690
            -0.9     φ̂n        -0.9003   -0.9001   -0.8952
4.2. Recursive estimation of the location parameter in RCA(p) models

Based on the optimal estimating function $g_n^*(\phi)$, which is the first component in the vector given in (3.2), the optimal estimate of $\phi$ is obtained by solving the estimating equation
\[
K(u,\lambda) \sum_{t=p+1}^n \frac{\mathbf{y}_{t-1}}{c_t} \sin\Big(u \frac{y_t - \phi'\mathbf{y}_{t-1}}{c_t}\Big) = 0,
\]
where $K(u,\lambda) = 2u\exp(-|u|^\lambda)/(1 - \exp(-|2u|^\lambda))$. A recursive parameter estimation procedure is proposed by using a Taylor expansion and substituting the recursive estimate for $\phi$ at each step. The estimate based on the first $t-1$ observations is given by
\[
\hat\phi_{t-1} = \Big\{ u\sum_{s=p+1}^{t-1} \frac{\mathbf{y}_{s-1}\mathbf{y}_{s-1}'}{c_s^2} \cos\Big[u\Big(\frac{y_s - \hat\phi_{s-1}'\mathbf{y}_{s-1}}{c_s}\Big)\Big] \Big\}^{-1} \Big\{ \sum_{s=p+1}^{t-1} \frac{\mathbf{y}_{s-1}}{c_s}\sin\Big[u\Big(\frac{y_s - \hat\phi_{s-1}'\mathbf{y}_{s-1}}{c_s}\Big)\Big] + u\sum_{s=p+1}^{t-1} \frac{\mathbf{y}_{s-1}\mathbf{y}_{s-1}'}{c_s^2}\hat\phi_{s-1} \cos\Big[u\Big(\frac{y_s - \hat\phi_{s-1}'\mathbf{y}_{s-1}}{c_s}\Big)\Big] \Big\}.
\]
When the $t$th observation becomes available, the estimate based on the first $t$ observations has the same form with the sums running to $t$. Let
\[
J_t^{-1} = uK(u,\lambda) \sum_{s=p+1}^{t} \frac{\mathbf{y}_{s-1}\mathbf{y}_{s-1}'}{c_s^2} \cos\Big[u\Big(\frac{y_s - \hat\phi_{s-1}'\mathbf{y}_{s-1}}{c_s}\Big)\Big];
\]
then
\[
J_t^{-1} = J_{t-1}^{-1} + \frac{uK(u,\lambda)\mathbf{y}_{t-1}\mathbf{y}_{t-1}'}{c_t^2} \cos\Big[u\Big(\frac{y_t - \hat\phi_{t-1}'\mathbf{y}_{t-1}}{c_t}\Big)\Big],
\]
and
\[
\hat\phi_t - \hat\phi_{t-1} = J_t(J_t^{-1}\hat\phi_t - J_t^{-1}\hat\phi_{t-1}) = J_t \frac{K(u,\lambda)\mathbf{y}_{t-1}}{c_t}\sin\Big[u\Big(\frac{y_t - \hat\phi_{t-1}'\mathbf{y}_{t-1}}{c_t}\Big)\Big].
\]
Now it is easily shown that the recursive equations become
\[
J_t = \Big\{ J_p + \frac{uK(u,\lambda)\mathbf{y}_{t-1}\mathbf{y}_{t-1}'}{c_t^2}\cos\Big[u\Big(\frac{y_t - \hat\phi_{t-1}'\mathbf{y}_{t-1}}{c_t}\Big)\Big] J_{t-1} \Big\}^{-1} J_{t-1} \qquad (4.5)
\]
and
\[
\hat\phi_t = \hat\phi_{t-1} + J_t \frac{K(u,\lambda)\mathbf{y}_{t-1}}{c_t}\sin\Big[u\Big(\frac{y_t - \hat\phi_{t-1}'\mathbf{y}_{t-1}}{c_t}\Big)\Big]. \qquad (4.6)
\]
The algorithm described in (4.5) and (4.6) gives the new estimate at time t as the old estimate at time t−1 plus an adjustment. Given initial values $\boldsymbol{\phi}_0$ and $J_0$, we can compute the estimate recursively. The recursive estimate $\hat{\boldsymbol{\phi}}_t$ in (4.6) is usually referred to as an 'on-line' estimate, and it is computationally feasible, especially when data arise sequentially.

To illustrate for the RCA(1) model, the optimal estimate is obtained by solving the estimating equation
$$\sum_{t=2}^{n} K(u,\lambda)\,\frac{y_{t-1}}{c_t}\,\sin\!\left(u\,\frac{y_t-\phi\,y_{t-1}}{c_t}\right)=0,$$
where $K(u,\lambda)=2u\,e^{-\vert u\vert^{\lambda}}/(1-e^{-\vert 2u\vert^{\lambda}})$. It is easily shown that when p = 1, the recursive equations become
$$J_t=\frac{J_{t-1}}{1+\dfrac{uK(u,\lambda)\,y_{t-1}^{2}}{c_t^{2}}\cos\!\left(u\,\dfrac{y_t-\hat{\phi}_{t-1}y_{t-1}}{c_t}\right)J_{t-1}} \qquad (4.7)$$
and
$$\hat{\phi}_t=\hat{\phi}_{t-1}+J_t\,\frac{K(u,\lambda)\,y_{t-1}}{c_t}\sin\!\left(u\,\frac{y_t-\hat{\phi}_{t-1}y_{t-1}}{c_t}\right). \qquad (4.8)$$
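The scalar recursions (4.7) and (4.8) can be sketched as below. The stable variate generator follows the standard Chambers–Mallows–Stuck construction for symmetric $\alpha$-stable noise (our addition, used here only to simulate data); all names are illustrative.

```python
import numpy as np

def rsym_stable(alpha, size, rng):
    """Symmetric alpha-stable variates via the Chambers-Mallows-Stuck method
    (beta = 0, unit scale); alpha != 1 is assumed for brevity."""
    V = rng.uniform(-np.pi / 2, np.pi / 2, size)
    W = rng.exponential(1.0, size)
    return (np.sin(alpha * V) / np.cos(V) ** (1 / alpha)
            * (np.cos((1 - alpha) * V) / W) ** ((1 - alpha) / alpha))

def K(u, lam):
    """K(u, lambda) = 2u exp(-|u|^lambda) / (1 - exp(-|2u|^lambda))."""
    return 2.0 * u * np.exp(-abs(u) ** lam) / (1.0 - np.exp(-abs(2 * u) ** lam))

def rca1_recursive(y, c, u, lam, phi0=0.0, J0=1.0):
    """Scalar recursions (4.7)-(4.8) run over the whole series."""
    phi, J = phi0, J0
    k = K(u, lam)
    for t in range(1, len(y)):
        arg = u * (y[t] - phi * y[t - 1]) / c[t]
        J = J / (1.0 + (u * k * y[t - 1] ** 2 / c[t] ** 2) * np.cos(arg) * J)  # (4.7)
        phi = phi + J * (k * y[t - 1] / c[t]) * np.sin(arg)                    # (4.8)
    return phi
```

With $\alpha=2$ the generator reduces to a Gaussian (variance 2), which gives a convenient sanity check of both the sampler and the recursion.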
The algorithm described in (4.7) and (4.8) gives the new estimate at time t as the old estimate at time t−1 plus an adjustment. Given initial values $\phi_0$ and $J_0$, we again compute the estimate recursively.

We present results from a simulation study for recursive estimation of $\phi$ for the stable RCA(1) process. We first generated M = 500 data sets, each of size n = 2500, from an RCA(1) process with $\lambda$-stable errors. We chose four values of $\phi$, viz., 0.5, 0.9, −0.5, and −0.9, and four values of $\lambda$, viz., 2.0, 1.5, 1.0, and 0.5. The values of $\sigma_b$ and $\sigma_e$ were set at 0.0005 and 0.05, respectively. For each generated series $y_t$, we first obtained the residuals $\hat{e}_t = y_t - \hat{\phi}_0 y_{t-1}$, where $\hat{\phi}_0$ is the sample serial correlation. We estimated $\lambda$ from the resulting residuals via quantile estimation (McCulloch, 1986), and then estimated $c_t$ by $\tilde{c}_t = (1/n)\sum_{t=1}^{n}\vert Y_t\vert^{\hat{\lambda}}$. The estimate $\hat{\phi}_0$ also serves as the initial estimate to start the recursions in (4.3) and (4.4). For M = 500 simulated sets, the CPU time for the recursive estimation using Fortran 77 code was under half a minute. For true $\phi$ values of 0.5, 0.9, −0.5, and −0.9, Table 3 shows the empirical median as well as the 25th and 75th percentiles of $\tilde{\lambda}$ and $\hat{\phi}_n$ based on M = 500 simulated data sets. We see that the recursive procedure yields an extremely accurate estimate $\hat{\phi}_n$ of $\phi$ for all true values of $\lambda$ and all true values of $\phi$.

5. Illustration: volumes of NASDAQ index

In this section, we carry out recursive estimation under an RCA(1) model for a financial time series. The data consist of n = 3279 records of daily transaction volumes of the NASDAQ index, from January 1990 to December 2002. This data set was discussed in Wang and Ghosh (2008), who used a Bayesian approach to fit an RCA(1) model.
Table 3
Percentiles of the distribution of recursive estimates from a stable RCA(1) process: n = 2500, M = 500.

True λ   True φ   Estimate   25th      50th      75th
2.0       0.5     λ̃          1.9404    2.0       2.0
                  φ̂_n        0.4802    0.4996    0.5183
          0.9     λ̃          1.9409    2.0       2.0
                  φ̂_n        0.8929    0.8996    0.9052
         −0.5     λ̃          1.9434    2.0       2.0
                  φ̂_n       −0.5097   −0.50     −0.4896
         −0.9     λ̃          1.9412    2.0       2.0
                  φ̂_n       −0.9062   −0.9005   −0.8941
1.5       0.5     λ̃          1.4794    1.5044    1.5301
                  φ̂_n        0.4939    0.4997    0.5059
          0.9     λ̃          1.4789    1.5014    1.5289
                  φ̂_n        0.8974    0.9000    0.9024
         −0.5     λ̃          1.4785    1.5043    1.5310
                  φ̂_n       −0.5063   −0.5001   −0.4938
         −0.9     λ̃          1.4803    1.5040    1.5305
                  φ̂_n       −0.9027   −0.9002   −0.8976
1.0       0.5     λ̃          0.9833    0.9991    1.0188
                  φ̂_n        0.4967    0.4999    0.5029
          0.9     λ̃          0.9805    0.9980    1.0223
                  φ̂_n        0.8992    0.8999    0.9008
         −0.5     λ̃          0.9841    0.9994    1.0438
                  φ̂_n       −0.5039   −0.4999   −0.4964
         −0.9     λ̃          0.9818    1.0001    1.0219
                  φ̂_n       −0.9008   −0.9000   −0.8990
0.5       0.5     λ̃          0.4742    0.4839    0.4920
                  φ̂_n        0.4053    0.4999    0.5345
          0.9     λ̃          0.3215    0.3882    0.4339
                  φ̂_n        0.8891    0.9000    0.9071
         −0.5     λ̃          0.4752    0.4847    0.5939
                  φ̂_n       −0.5384   −0.5000   −0.4460
         −0.9     λ̃          0.3502    0.3918    0.4424
                  φ̂_n       −0.9002   −0.8999   −0.8984
The top plot in Fig. 1 shows the daily data $X_t$ on a logarithmic scale; the plot exhibits an upward trend over time. The bottom plot in Fig. 1 shows the residuals $Y_t$ obtained by detrending the original data via the deterministic regression $X_t = b_0 + b_1 t + e_t$; i.e., $Y_t = X_t - \hat{b}_0 - \hat{b}_1 t$, where $\hat{b}_0 = 14.066$ and $\hat{b}_1 = 7.516\times 10^{-4}$ were the least squares estimates of $b_0$ and $b_1$, respectively. The estimated standard errors of the estimated regression intercept and slope were, respectively, $7.53\times 10^{-3}$ and $3.97\times 10^{-6}$, while the coefficient of determination was $R^2 = 0.916$.

We first fit the RCA(1) model with Gaussian errors to the series $(Y_1,\ldots,Y_n)$ using the approach based on estimating equations (Thavaneswaran and Abraham, 1988) in order to obtain consistent estimates of the $\phi_1$ parameter. Estimates of the scale coefficients $c_b$ and $c_e$ are obtained using the following steps, which are similar to those described in Pantula (1988). First, we run a regression through the origin of $Y_t$ on $Y_{t-1}$ and compute the least squares residuals $\tilde{e}_t$. Next, letting $\tilde{\eta}_0$ and $\tilde{\eta}_1$ denote the intercept and slope in the regression of $\tilde{e}_t^2$ on $\tilde{e}_{t-1}^2$, we compute the fits $\nu_t = \tilde{\eta}_0 + \tilde{\eta}_1 \tilde{e}_{t-1}^2$. Finally, we regress $\nu_t^{-1}\tilde{e}_t^2$ on $\nu_t^{-1}$ and $\nu_t^{-1}\tilde{e}_{t-1}^2$. The generalized least squares estimates in this model estimate $c_b$ and $c_e$ to be 0.1914 and 0.0307, respectively. Given these values, the recursive estimate of $\phi_1$ is $\hat{\phi}_1 = 0.4502$. These estimates were similar to those obtained by Wang and Ghosh (2008), and were obtained using our fast computational approach based on estimating equations. The residual series from the Gaussian RCA(1) fit and its normal Q–Q plot are shown in Fig. 2, and indicate the possibility of heavy-tailed behavior in the data. The estimate of the stability index of the residual series using the code from Nolan (1997) is $\hat{\lambda} = 1.6591$, again indicating possible heavy-tailed behavior.

We then fit an RCA(1) model with $\lambda$-stable errors to the series $(Y_1,\ldots,Y_n)$ using the recursive approach discussed in Section 4.2. Rather than estimating $c_b$ and $c_e$, we compute $c_t$ as $\mathrm{ave}(\vert Y_t\vert^{\lambda})$ and then obtain the estimated index of stability $\hat{\lambda} = 1.6481$. The optimal value of $u$ corresponding to the estimated $\lambda$ is 0.49, and the recursive estimate of $\phi$ is 0.5733.
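The detrending step and the computation of $c_t$ described above can be sketched as follows. The helper names are ours, and the estimate $\hat{\lambda}$ is assumed to come from a quantile-based estimator (McCulloch, 1986; Nolan, 1997), which is not reproduced here.

```python
import numpy as np

def detrend(x):
    """Least squares fit of x_t = b0 + b1 t + e_t; returns residuals Y_t and (b0, b1)."""
    t = np.arange(len(x), dtype=float)
    X = np.column_stack([np.ones_like(t), t])
    beta, *_ = np.linalg.lstsq(X, x, rcond=None)
    return x - X @ beta, beta

def through_origin_slope(y):
    """First Pantula-style step: regression of Y_t on Y_{t-1} through the origin."""
    return np.sum(y[:-1] * y[1:]) / np.sum(y[:-1] ** 2)

def scale_sequence(y, lam_hat):
    """c_t computed as ave(|Y_t|^lam_hat), as in Section 5 (constant over t)."""
    return np.full(len(y), np.mean(np.abs(y) ** lam_hat))
```

The residual series from `detrend` plays the role of $(Y_1,\ldots,Y_n)$, and `scale_sequence` supplies the $c_t$ used by the recursions of Section 4.2.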
[Fig. 1 here: two panels against Time — top: Log NASDAQ Volumes; bottom: Detrended Residual Series.]
Fig. 1. NASDAQ log volumes and detrended series from 1990 to 2002.

[Fig. 2 here: two panels — RCA(1) Residuals against Time; Sample Quantiles against Theoretical Quantiles.]
Fig. 2. RCA(1) residuals and its normal Q–Q plot.
6. Conclusions

In this paper, we have used a class of bounded estimating functions for joint estimation of the location and scale parameters in linear and nonlinear time series models with infinite variance stable errors (with stability index $0 < \lambda \le 2$). Our research fills a gap and adds to the literature on inference for such time series. It is useful to note that the same estimating equations are obtained irrespective of whether the time series is stationary or nonstationary. Although asymptotic inferential properties might differ, the finite sample properties that correspond to the theory of estimating equations are valid for both stationary and nonstationary processes. A significant contribution of our paper is the derivation of fast, recursive, on-line estimation for location parameters in linear and nonlinear time series models.
Acknowledgments The authors are grateful for reviewer suggestions. We thank John P. Nolan for the Fortran code for quantile estimation of the stability index l, and Dazhe Wang and Sujit Ghosh for providing the data set used in Section 5. The work of the first author was supported in part by NSERC.
Appendix A. Motivation for the estimating function approach

A.1. Random coefficient autoregressive models

Consider the RCA(1) model
$$y_t = (\theta + b_t)y_{t-1} + e_t, \qquad (A.1)$$
where $\{b_t\}$ and $\{e_t\}$ are uncorrelated zero mean processes with respective unknown variances $\sigma_b^2$ and $\sigma_e^2 = \sigma_e^2(\theta)$, where $\theta$ is an unknown location parameter. In (A.1), assume that $\sigma_b^2$ is known, and $\theta$ is the only parameter to be estimated. The conditional mean of $y_t$ is $\mu_t(\theta) = \theta y_{t-1}$. Let $m_t(\theta) = y_t - \mu_t(\theta)$, so that $\langle m\rangle_t = y_{t-1}^2\sigma_b^2 + \sigma_e^2(\theta)$. The conditional least squares
function and the associated information for estimating $\theta$ are given by
$$g_{\mathrm{CLS}}(\theta)=\sum_{t=1}^{n}y_{t-1}m_t \quad\text{and}\quad I_{\mathrm{CLS}}(\theta)=\frac{\left(\sum_{t=1}^{n}y_{t-1}^{2}\right)^{2}}{\sum_{t=1}^{n}y_{t-1}^{2}\langle m\rangle_t}.$$
The optimal martingale estimating function and the associated information based on $m_t(\theta)$ are given by
$$g_M^{*}(\theta)=\sum_{t=1}^{n}\frac{y_{t-1}m_t}{\langle m\rangle_t} \quad\text{and}\quad I_{g_M^{*}}(\theta)=\sum_{t=1}^{n}\frac{y_{t-1}^{2}}{\langle m\rangle_t}.$$
The inequality
$$\left(\sum_{t=1}^{n}y_{t-1}^{2}\langle m\rangle_t\right)\left(\sum_{t=1}^{n}\frac{y_{t-1}^{2}}{\langle m\rangle_t}\right)\ge\left(\sum_{t=1}^{n}y_{t-1}^{2}\right)^{2}$$
implies that $I_{\mathrm{CLS}}(\theta)\le I_{g_M^{*}}(\theta)$. Hence, the optimal estimating function is more informative than the conditional least squares function.

A.2. Doubly stochastic time series models

RCA models are special cases of what Tjøstheim (1986) refers to as doubly stochastic time series models, given by
$$y_t=\theta_t f(t,F_{t-1}^{y})+e_t.$$
Here, $\{\theta + b_t\}$ of (A.1) is replaced by a more general stochastic sequence $\{\theta_t\}$, while $y_{t-1}$ is replaced by a function of the past, viz., $F_{t-1}^{y}$. Suppose that $\{\theta_t\}$ is a moving average sequence of the form
yt ¼ y þ at þ at1 , where fat g are square integrable independent random variables with mean zero and variance s2a . We further assume that fet g and fat g are independent, so that E½yt 9F yt1 depends on the posterior mean ut ¼ E½at 9F yt and posterior variance vt ¼ E½ðat ut Þ2 9F yt of at. Under the normality assumption of fet g and fat g, and the initial condition y0 ¼ 0, ut and vt satisfy the following Kalman-like recursive algorithms (see Shiryayev, 1984, p. 439): ut ðyÞ ¼
s2a f ðt,F yt1 Þðyt ðy þ ut1 Þf ðt,F yt1 ÞÞ s2e ðyÞ þf 2 ðt,F yt1 Þðs2a þ vt1 Þ
and vt ðyÞ ¼ s2a
s4a f 2 ðt,F yt1 Þ , s2e ðyÞ þf 2 ðt,F yt1 Þðs2a þ vt1 Þ
where u0 ¼ 0 and v0 ¼ s2a . Hence, the conditional mean and variance of yt are given by
mt ðyÞ ¼ ðy þ ut1 ðyÞÞf ðt,F yt1 Þ and
s2t ðyÞ ¼ s2e ðyÞ þf 2 ðt,F yt1 Þðs2a þ vt1 ðyÞÞ which can be computed recursively. 2 For the sequence of martingale differences mt ðyÞ ¼ yt mt ðyÞ, we can derive that /mSt ¼ s2e ðyÞ þf ðt,F yt1 Þðs2a þ vt1 ðyÞÞ. The optimal estimating function and associated information based on mt ðyÞ are given by n X @ut1 ðyÞ mt g nm ðyÞ ¼ f ðt,F yt1 Þ 1 þ @y /mSt t¼1 and 2 @ut1 ðyÞ 2 f ðt,F yt1 Þ 1 þ @y Ig nm ðyÞ ¼ : /mS t t¼1 n X
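A single step of the Kalman-like recursions for $u_t(\theta)$ and $v_t(\theta)$ can be sketched as follows, with $u_0=0$ and $v_0=\sigma_a^2$; the function name and argument names are our own illustrative choices.

```python
def kalman_like_step(u_prev, v_prev, y_t, f_t, theta, sigma_a2, sigma_e2):
    """One step of the posterior-mean/variance recursions for the doubly
    stochastic model with MA(1) coefficient process (u_0 = 0, v_0 = sigma_a^2).
    f_t stands for f(t, F^y_{t-1}) evaluated at time t."""
    denom = sigma_e2 + f_t ** 2 * (sigma_a2 + v_prev)
    u_t = sigma_a2 * f_t * (y_t - (theta + u_prev) * f_t) / denom
    v_t = sigma_a2 - sigma_a2 ** 2 * f_t ** 2 / denom
    return u_t, v_t
```

When $f_t = 0$ no new information about $a_t$ arrives, so the posterior variance stays at $\sigma_a^2$; otherwise $v_t < \sigma_a^2$, mirroring the usual Kalman update.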
Then the inequality
$$\left(\sum_{t=1}^{n}f^{2}(t,F_{t-1}^{y})\left(1+\frac{\partial u_{t-1}(\theta)}{\partial\theta}\right)^{2}\langle m\rangle_t\right)\left(\sum_{t=1}^{n}\frac{f^{2}(t,F_{t-1}^{y})\left(1+\dfrac{\partial u_{t-1}(\theta)}{\partial\theta}\right)^{2}}{\langle m\rangle_t}\right)\ge\left(\sum_{t=1}^{n}f^{2}(t,F_{t-1}^{y})\left(1+\frac{\partial u_{t-1}(\theta)}{\partial\theta}\right)^{2}\right)^{2}$$
implies that
$$I_{\mathrm{CLS}}(\theta)=\frac{\left(\sum_{t=1}^{n}f^{2}(t,F_{t-1}^{y})\left(1+\frac{\partial u_{t-1}(\theta)}{\partial\theta}\right)^{2}\right)^{2}}{\sum_{t=1}^{n}f^{2}(t,F_{t-1}^{y})\left(1+\frac{\partial u_{t-1}(\theta)}{\partial\theta}\right)^{2}\langle m\rangle_t}\le I_{g_m^{*}}(\theta).$$
That is, the optimal linear estimating function $g_m^{*}(\theta)$ is more informative than the conditional least squares estimating function $g_{\mathrm{CLS}}(\theta)$. Moreover, writing $f$ for $f(t,F_{t-1}^{y})$, the relations
$$\frac{\partial u_t(\theta)}{\partial\theta}=-\frac{\sigma_a^{2}f^{2}\left(1+\dfrac{\partial u_{t-1}(\theta)}{\partial\theta}\right)\left(\sigma_e^{2}(\theta)+f^{2}(\sigma_a^{2}+v_{t-1}(\theta))\right)+\sigma_a^{2}f\left(y_t-f(\theta+u_{t-1}(\theta))\right)\left(\dfrac{\partial\sigma_e^{2}(\theta)}{\partial\theta}+f^{2}\dfrac{\partial v_{t-1}(\theta)}{\partial\theta}\right)}{\left(\sigma_e^{2}(\theta)+f^{2}(\sigma_a^{2}+v_{t-1}(\theta))\right)^{2}}$$
and
$$\frac{\partial v_t(\theta)}{\partial\theta}=\frac{\sigma_a^{4}f^{2}\left(\dfrac{\partial\sigma_e^{2}(\theta)}{\partial\theta}+f^{2}\dfrac{\partial v_{t-1}(\theta)}{\partial\theta}\right)}{\left(\sigma_e^{2}(\theta)+f^{2}(\sigma_a^{2}+v_{t-1}(\theta))\right)^{2}}$$
can be applied to calculate the estimating functions and associated information recursively.

References

Adler, R.J., Feldman, R., Taqqu, M.S. (Eds.), 1998. A User's Guide to Heavy Tails: Statistical Techniques for Analyzing Heavy Tailed Distributions and Processes. Birkhäuser, Boston.
Bera, A.K., Bilias, Y., Simlai, P., 2006. Estimating functions and equations: an essay on historical developments with applications to economics. In: Mills, T.C., Patterson, K. (Eds.), Palgrave Handbook of Econometrics, vol. I, pp. 427–476.
Brockwell, P.J., Davis, R.A., 2006. Time Series: Theory and Methods. Springer-Verlag, New York.
Engle, R., 1982. Autoregressive conditional heteroskedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50, 987–1007.
Fama, E., 1965. The behavior of stock market prices. Journal of Business 38, 34–105.
Feuerverger, A., Mureika, R.A., 1977. The empirical characteristic function and its applications. Annals of Statistics 5, 88–97.
Godambe, V.P., 1985. The foundations of finite sample estimation in stochastic processes. Biometrika 72, 319–328.
Gross, S., Steiger, W.L., 1979. Least absolute deviation estimates in autoregression with infinite variance. Journal of Applied Probability 16, 104–116.
Hubalek, F., Posedel, P., 2011. Joint analysis and estimation of stock prices and trading volume in Barndorff-Nielsen and Shephard stochastic volatility models. Quantitative Finance 11 (6), 174–186.
Li, D.X., Turtle, H.J., 2000. Semiparametric ARCH models: an estimating function approach. Journal of Business and Economic Statistics 18 (2), 174–186.
Mandelbrot, B., 1963. The variation of speculative prices. Journal of Business 36, 394–419.
McCulloch, J.H., 1986. Consistent estimators of stable distribution parameters. Communications in Statistics: Simulation and Computation 15 (4), 1109–1136.
Merkouris, T., 2007. Transform martingale estimating functions. Annals of Statistics 35, 1975–2000.
Mikosch, T., Gadrich, T., Klüppelberg, C., Adler, R.J., 1995. Parameter estimation for ARMA models with infinite variance innovations. Annals of Statistics 23, 305–326.
Nicholls, D.F., Quinn, B.G., 1980. The estimation of autoregressive models with random coefficients. Journal of Time Series Analysis 1, 37–46.
Nolan, J.P., 1997. Numerical calculation of stable densities and distribution functions. Communications in Statistics: Stochastic Models 13 (4), 759–774.
Panorska, A., Mittnik, S., Rachev, S.T., 1995. Stable GARCH models for financial time series. Applied Mathematics Letters 8, 33–37.
Pantula, S.G., 1988. Estimation of autoregressive models with ARCH errors. Sankhyā, Series B 50, 119–138.
Qiou, Z., Ravishanker, N., 1998. Bayesian inference for time series with stable innovations. Journal of Time Series Analysis 19, 235–249.
Samorodnitsky, G., Taqqu, M.S., 1994. Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance. Chapman & Hall, New York.
Shiryayev, A.N., 1984. Probability. Graduate Texts in Mathematics, vol. 95. Springer-Verlag, New York.
Small, C.C., McLeish, D.L., 1994. Hilbert Space Methods in Probability and Statistical Inference. Wiley, New York.
Stuck, B.W., Kleiner, B., 1974. A statistical analysis of telephone noise. The Bell System Technical Journal 53, 1263–1320.
Thavaneswaran, A., Abraham, B., 1988. Estimation of nonlinear time series models using estimating functions. Journal of Time Series Analysis 9, 99–108.
Thavaneswaran, A., Heyde, C.C., 1999. Prediction via estimating functions. Journal of Statistical Planning and Inference 77, 89–101.
Thavaneswaran, A., Peiris, S., 1999. Estimation for regression with infinite variance errors. Mathematical and Computer Modelling 29, 177–180.
Tjøstheim, D., 1986. Estimation in nonlinear time series models. Stochastic Processes and their Applications 21, 251–273.
Wang, D., Ghosh, S.K., 2008. Bayesian estimation and unit root tests for random coefficient autoregressive models. Model Assisted Statistics and Applications 3, 281–295.
Yohai, V., Maronna, R.A., 1977. Asymptotic behavior of least squares estimates for autoregressive processes with infinite variances. Annals of Statistics 5, 554–560.
Yu, J., 2004. Empirical characteristic function estimation and its applications. Econometric Reviews 23, 93–123.