Journal of Statistical Planning and Inference 173 (2007) 212 – 229 www.elsevier.com/locate/jspi
First-order random coefficient integer-valued autoregressive processes

Haitao Zheng^a, Ishwar V. Basawa^a, Somnath Datta^b,*

^a Department of Statistics, University of Georgia, Athens, GA 30602, USA
^b Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY 40202, USA

* Corresponding author. Tel.: +1 502 852 6376; fax: +1 502 852 3294. E-mail address: [email protected] (S. Datta).
Received 23 February 2005; accepted 14 December 2005 Available online 20 February 2006
Abstract

A first-order random coefficient integer-valued autoregressive (RCINAR(1)) model is introduced. Ergodicity of the process is established. Moments and autocovariance functions are obtained. Conditional least squares and quasi-likelihood estimators of the model parameters are derived and their asymptotic properties are established. The performance of these estimators is compared with the maximum likelihood estimator via simulation.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Models of count data; Thinning models; INAR models; Random coefficient models; Conditional least squares; Quasi-likelihood; Maximum likelihood; Asymptotic distributions
1. Introduction

Integer-valued time series data are fairly common in practice (e.g., monthly unemployment figures), but models and methods for their analysis are relatively new in the literature. Only a few broad classes of time series models have been developed recently for count data. See, for instance, Davis et al. (1999) for a recent review; see also MacDonald and Zucchini (1997). Two main classes are: (a) state-space models (using latent processes) and (b) thinning models. See Fukasawa and Basawa (2002) for an example of state-space modeling. The integer autoregressive (INAR) models belong to the class of thinning models.

A first-order autoregressive model for count (or integer-valued) data is defined through the "thinning" operator ∘, which is due to Steutel and van Harn (1979). Let X be an integer-valued random variable and α ∈ [0, 1]; then the thinning operator ∘ is defined as

    α ∘ X = Σ_{i=1}^{X} B_i,    (1)
where {B_i} is an i.i.d. Bernoulli sequence with P(B_i = 1) = α, independent of X. With this operator, the INAR(1) model is defined as

    X_t = α ∘ X_{t−1} + Z_t,   t ≥ 1,    (2)

where {Z_t} is a sequence of i.i.d. non-negative integer-valued random variables with mean λ and variance σ²_z, and {Z_t} is independent of X_0. Throughout the rest of the paper, {Z_t} will be referred to as innovations. This model shares several properties with the ordinary AR(1) model and has been discussed by McKenzie (1985a,b, 1986, 1987, 1988a,b), Al-osh and Alzaid (1987, 1991, 1992) and Alzaid and Al-osh (1988), among others. In particular, note that the parameter α in (2) plays the role of an autoregressive parameter.

In some situations, the parameter α may vary with time, and it may be random. For example, let X_t denote the number of terminally ill patients in the tth month. Here, X_t could potentially satisfy an INAR model where α ∘ X_{t−1} is the number of surviving patients from the previous month and Z_t stands for the patients newly admitted in the current month. The survival rate may be affected by various environmental factors, such as the quality of health care, the state of health of the patients, etc., and could vary randomly over time. Similarly, suppose X_t denotes the number of unemployed in the tth month. Then X_t can be modeled as the sum of the previously unemployed who remain unemployed, α ∘ X_{t−1}, and the newly unemployed Z_t. Again, the rate α may vary randomly over time, being affected by factors such as the state of the economy, productivity growth, etc.

In this paper, we extend the above model to a random coefficient model, where the fixed α is replaced by a random parameter α_t, the α_t being realizations of i.i.d. random variables taking values in the interval [0, 1). Our main objective is to investigate basic probabilistic and statistical properties of this model and inferential methods for the relevant parameters associated with the model.

The paper is organized as follows. In Section 2, the random coefficient model is described in detail and some statistical properties are established. In Section 3, we propose two estimation methods for the model parameters. In Section 4, we present some simulation results for the estimation methods. The parameter estimates are compared with the benchmark maximum likelihood (ML) estimates, which can be computed under a full parametric specification as in the simulation setting. We also compare these estimates with those obtained from the standard fixed coefficient INAR(1) model. The paper ends with a discussion section. All the proofs are deferred to the Appendix.
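To make the thinning mechanism concrete, the following sketch (Python, not part of the original paper) simulates the operator α ∘ X and a fixed-coefficient INAR(1) path. The Poisson choice for the innovations and the function names are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

def thin(alpha, x, rng):
    """Binomial thinning: alpha o x is a sum of x i.i.d. Bernoulli(alpha) variables."""
    return rng.binomial(x, alpha)

def simulate_inar1(alpha, lam, n, x0=0, rng=rng):
    """Simulate X_t = alpha o X_{t-1} + Z_t with Z_t ~ Poisson(lam) (illustrative choice)."""
    x = np.empty(n + 1, dtype=int)
    x[0] = x0
    for t in range(1, n + 1):
        x[t] = thin(alpha, x[t - 1], rng) + rng.poisson(lam)
    return x

path = simulate_inar1(alpha=0.5, lam=1.0, n=200)
```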
2. The first-order random coefficient integer-valued autoregressive model

A first-order random coefficient integer-valued autoregressive (RCINAR(1)) model is defined by the recursive equation

    X_t = α_t ∘ X_{t−1} + Z_t,   t ≥ 1,    (3)

where {α_t} is an i.i.d. sequence with cumulative distribution function (CDF) P_α on [0, 1); {Z_t} is an i.i.d. non-negative integer-valued sequence with probability mass function (PMF) f_z such that E(Z_t⁴) < ∞; X_0, {α_t} and {Z_t} are independent; and E(X_0²) < ∞. Let α = E(α_t), σ²_α = Var(α_t), λ = E(Z_t) and σ²_z = Var(Z_t), so that E(α_t²) = σ²_α + α²; all of these moments are assumed finite. It is easy to see that X_t is a Markov chain on {0, 1, 2, . . .} with transition probabilities

    P_{ij} = P(X_t = i | X_{t−1} = j) = Σ_{k=0}^{min(i,j)} (j choose k) f_z(i − k) ∫_0^1 α_1^k (1 − α_1)^{j−k} dP_α(α_1).    (4)
The Markov property follows from (3) and the fact that {α_t} is an i.i.d. sequence. Eq. (4) is obtained by noting that, conditional on α_t, α_t ∘ X_{t−1} is binomial with parameters (X_{t−1}, α_t); hence, conditional on (X_{t−1}, α_t ∘ X_{t−1}), the PMF of X_t evaluated at i is f_z(i − α_t ∘ X_{t−1}). Unconditioning then yields (4). The ergodicity property of the process is of independent interest, and it will be very useful in deriving the asymptotic properties of the parameter estimates. The moments and conditional moments will be useful in obtaining the appropriate estimating equations for parameter estimation.
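As an illustration of (4), the sketch below evaluates P_{ij} numerically. The choices P_α = Beta(2, 3) and f_z = Poisson(1) are assumptions made only for this example; the paper leaves both distributions unspecified.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad
from scipy.special import comb

def transition_prob(i, j, f_z, alpha_pdf):
    """P(X_t = i | X_{t-1} = j) from Eq. (4): sum over the surviving count k,
    integrating the binomial kernel against the distribution of alpha_1."""
    total = 0.0
    for k in range(min(i, j) + 1):
        integrand = lambda a: a**k * (1.0 - a)**(j - k) * alpha_pdf(a)
        integral, _ = quad(integrand, 0.0, 1.0)
        total += comb(j, k) * f_z(i - k) * integral
    return total

# Illustrative choices: alpha_1 ~ Beta(2, 3) and Z ~ Poisson(1)
alpha_pdf = stats.beta(2, 3).pdf
f_z = lambda z: stats.poisson(1.0).pmf(z)
p = transition_prob(i=2, j=3, f_z=f_z, alpha_pdf=alpha_pdf)
```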
Proposition 2.1. For t ≥ 1,

(i) E(X_t | X_{t−1}) = αX_{t−1} + λ.
(ii) E(X_t) = λ/(1 − α), if E(X_0) = λ/(1 − α).
(iii) Var(X_t | X_{t−1}, α_t) = α_t(1 − α_t)X_{t−1} + σ²_z.
(iv) Var(X_t | X_{t−1}) = σ²_α X²_{t−1} + (α(1 − α) − σ²_α)X_{t−1} + σ²_z.
(v) Var(X_t) = b/(1 − α² − σ²_α), where b = (λ/(1 − α))(α(1 − α) − σ²_α) + σ²_z + σ²_α λ²/(1 − α)², if Var(X_0) = b/(1 − α² − σ²_α).
(vi) Cov(X_{t+k}, X_t) = α^k Var(X_t), k ≥ 0. If Var(X_0) = b/(1 − α² − σ²_α), then ρ(k) = α^k, where ρ(k) = Corr(X_{t+k}, X_t).
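The stationary moments in Proposition 2.1 are easy to check numerically. The sketch below (an illustration, not from the paper) simulates a long RCINAR(1) path with Beta-distributed α_t and Poisson innovations, both assumed only for the example, and compares the sample mean and lag-1 autocorrelation with λ/(1 − α) and α.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_rcinar1(n, lam, alpha_sampler, rng):
    """X_t = alpha_t o X_{t-1} + Z_t with alpha_t drawn afresh at every step."""
    x = np.zeros(n + 1, dtype=int)
    for t in range(1, n + 1):
        a_t = alpha_sampler(rng)
        x[t] = rng.binomial(x[t - 1], a_t) + rng.poisson(lam)
    return x[1:]

lam = 1.0
alpha_sampler = lambda rng: rng.beta(2, 3)          # E(alpha_t) = 0.4 (illustrative choice)
x = simulate_rcinar1(100_000, lam, alpha_sampler, rng)

alpha_mean = 2 / (2 + 3)
print(x.mean(), lam / (1 - alpha_mean))              # both close to 1.667, as in (ii)
print(np.corrcoef(x[1:], x[:-1])[0, 1], alpha_mean)  # lag-1 autocorrelation close to alpha, as in (vi)
```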
Proposition 2.2. The process {X_t} is an irreducible, aperiodic and positive recurrent (and hence ergodic) Markov chain. Moreover, the stationary distribution of {X_t} is given by that of Σ_{k=2}^∞ (α_k ∘ · · · ∘ α_2 ∘ Z_k) + Z_1, where the series converges almost surely and also in L².

Corollary 2.1. The mean and variance of the stationary distribution are given by m_1 = λ/(1 − α) and σ²_x = b/(1 − α² − σ²_α), respectively.

Next, we consider the problem of estimating the first two moments of the random autoregressive coefficients α_t and of the innovations Z_t.
3. Estimation methods

For the RCINAR(1) model, our primary interest lies in estimating the mean autoregressive rate α = E(α_t) and the mean innovation λ = E(Z_t) from a sample (X_1, X_2, . . . , X_n). However, the corresponding variances σ²_α = Var(α_t) and σ²_z = Var(Z_t) need to be estimated as well. In this paper, we mainly consider two methods, namely, conditional least squares (CLS) and modified quasi-likelihood (MQL). An advantage of these methods is that they do not require specifying the exact family of distributions for the random autoregressive coefficients or for the innovations. As a result, they are more robust, albeit less efficient, than the ML. We will also consider the ML method, under a specific parametric model for the distributions of α_t and Z_t, in a simulation experiment; in fact, we use the ML method as a benchmark when comparing the CLS and MQL estimators in our simulation.

3.1. Conditional least squares (CLS)

Let

    S(β) = Σ_{t=1}^n (X_t − αX_{t−1} − λ)²,
with β = (α, λ)ᵀ, be the CLS criterion function. The CLS estimators of α and λ are obtained by minimizing S over β ∈ {0 ≤ α < 1, λ > 0} and are given by

    α̂ = [ Σ_{t=1}^n X_t X_{t−1} − n⁻¹ Σ_{t=1}^n X_{t−1} Σ_{t=1}^n X_t ] / [ Σ_{t=1}^n X²_{t−1} − n⁻¹ (Σ_{t=1}^n X_{t−1})² ],

    λ̂ = n⁻¹ ( Σ_{t=1}^n X_t − α̂ Σ_{t=1}^n X_{t−1} ).    (5)
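A direct implementation of (5) is immediate; the sketch below assumes the observations are held in a NumPy array x of length n + 1, with x[0] = X_0 (the function name is ours).

```python
import numpy as np

def cls_estimates(x):
    """Conditional least squares estimates (alpha_hat, lambda_hat) from Eq. (5)."""
    x = np.asarray(x, dtype=float)
    xt, xlag = x[1:], x[:-1]
    n = len(xt)
    num = np.sum(xt * xlag) - np.sum(xlag) * np.sum(xt) / n
    den = np.sum(xlag**2) - np.sum(xlag)**2 / n
    alpha_hat = num / den
    lambda_hat = (np.sum(xt) - alpha_hat * np.sum(xlag)) / n
    return alpha_hat, lambda_hat
```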
The following result establishes the asymptotic distribution of the CLS estimators.
Theorem 3.1. For the CLS estimators (α̂, λ̂) given by (5), we have

    √n ( (α̂, λ̂)ᵀ − (α, λ)ᵀ ) →_d N(0, V⁻¹ W V⁻¹),

where

    W = ( σ²_2   σ_12 )
        ( σ_12   σ²_1 ),

    σ²_1 = E((X_1 − αX_0 − λ)²),   σ²_2 = E(X_0²(X_1 − αX_0 − λ)²),   σ_12 = E(X_0(X_1 − αX_0 − λ)²),

    V⁻¹ = (m_2 − m_1²)⁻¹ (  1     −m_1 )
                         ( −m_1    m_2 ),

    m_j = E(X_1^j), j = 1, 2,   m_2 = m_1² + σ²_x,   σ²_x = Var(X_1) = m_2 − m_1² > 0,

and E denotes expectation with respect to the stationary distribution.

The consistency of α̂ and λ̂ follows readily from Theorem 3.1. Corollary 2.2 of Klimko and Nelson (1978) can also be applied here; the conditions of that corollary are easily verified and the details are omitted.

3.2. Modified quasi-likelihood (MQL)

Let θ = (σ²_α, γ, σ²_z)ᵀ, where σ²_α = Var(α_1), γ = α(1 − α) − σ²_α and σ²_z = Var(Z_1), and let β = (α, λ)ᵀ. Recall from Proposition 2.1 the expression for the one-step conditional variance,

    V_θ(X_t | X_{t−1}) := Var(X_t | X_{t−1}) = σ²_α X²_{t−1} + γ X_{t−1} + σ²_z.
A set of standard QL estimating equations takes the form

    Σ_{t=1}^n V_θ⁻¹(X_t | X_{t−1}) (X_t − αX_{t−1} − λ) = 0,
    Σ_{t=1}^n V_θ⁻¹(X_t | X_{t−1}) X_{t−1} (X_t − αX_{t−1} − λ) = 0.    (6)
Note that the presence of θ in the expression for the conditional variance makes the corresponding estimating equations complicated and intractable in the general case. Consequently, we propose substituting a suitable consistent estimator θ̂ of θ, obtained by other means, and then solving the resulting MQL estimating equations for the primary parameters of interest. This approach leads to the following closed-form estimator of β:

    (α̃, λ̃)ᵀ = [ Σ_t X²_{t−1} V̂_t⁻¹    Σ_t X_{t−1} V̂_t⁻¹ ]⁻¹ [ Σ_t X_{t−1} X_t V̂_t⁻¹ ]
               [ Σ_t X_{t−1} V̂_t⁻¹     Σ_t V̂_t⁻¹         ]   [ Σ_t X_t V̂_t⁻¹         ],    (7)

where, for brevity, V̂_t denotes V_θ̂(X_t | X_{t−1}) and all sums run over t = 1, . . . , n. A consistent estimator of θ that can be used in (7) is proposed next (following Schick, 1996).
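The MQL step (7) itself is just weighted least squares with weights 1/V_θ̂(X_t | X_{t−1}). A minimal sketch, assuming estimates of the three components of θ are already available (e.g. from (8) below); the function name is ours:

```python
import numpy as np

def mql_estimates(x, sigma2_alpha, gamma, sigma2_z):
    """Solve the 2x2 system in Eq. (7) for (alpha_tilde, lambda_tilde),
    using the plug-in conditional variance V(X_t | X_{t-1})."""
    x = np.asarray(x, dtype=float)
    xt, xlag = x[1:], x[:-1]
    w = 1.0 / (sigma2_alpha * xlag**2 + gamma * xlag + sigma2_z)  # 1 / V_theta_hat; assumed positive
    A = np.array([[np.sum(w * xlag**2), np.sum(w * xlag)],
                  [np.sum(w * xlag),    np.sum(w)]])
    b = np.array([np.sum(w * xlag * xt), np.sum(w * xt)])
    alpha_tilde, lambda_tilde = np.linalg.solve(A, b)
    return alpha_tilde, lambda_tilde
```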
Proposition 3.1. Let T(x) be a bounded function of x and let T̄ = n⁻¹ Σ_{t=1}^n T(X_{t−1}). Then the following estimators are consistent:

    σ̂²_α = [ Σ_{t=1}^n (T(X_{t−1}) − T̄)(X_t − α̂X_{t−1} − λ̂)² − (α̂ − α̂²) Σ_{t=1}^n (T(X_{t−1}) − T̄) X_{t−1} ] / [ Σ_{t=1}^n (T(X_{t−1}) − T̄)(X²_{t−1} − X_{t−1}) ],

    σ̂²_z = (1/n) Σ_{t=1}^n (X_t − α̂X_{t−1} − λ̂)² − (σ̂²_α/n) Σ_{t=1}^n (X²_{t−1} − X_{t−1}) − ((α̂ − α̂²)/n) Σ_{t=1}^n X_{t−1},

    γ̂ = α̂ − α̂² − σ̂²_α,    (8)

where α̂ and λ̂ are consistent estimators of α and λ. In practice, we can use the CLS estimators of α and λ.

We should point out that the above estimators of σ²_α and σ²_z cannot be guaranteed to be positive for all samples. Alternatively, one can use the ML estimators of σ²_α and σ²_z, under an assumed parametric model, for such samples; one such parametric model is to assume that Z follows a Poisson distribution. Statistical properties of such a hybrid estimator of σ²_α are studied in the simulation study of the next section.
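A sketch of the moment estimators (8), using the CLS estimates of α and λ and the bounded function T(x) = 1/(1 + (1 + x)²) adopted later in Section 4. No positivity correction is applied here, so negative values can occur, as noted above; the function name is ours.

```python
import numpy as np

def variance_component_estimates(x, alpha_hat, lambda_hat):
    """Moment estimators (8) of sigma2_alpha, sigma2_z and gamma = alpha(1-alpha) - sigma2_alpha."""
    x = np.asarray(x, dtype=float)
    xt, xlag = x[1:], x[:-1]
    T = 1.0 / (1.0 + (1.0 + xlag)**2)        # bounded T(x) used in the simulation study
    Tc = T - T.mean()                        # T(X_{t-1}) - T_bar
    resid2 = (xt - alpha_hat * xlag - lambda_hat)**2
    sigma2_alpha = (np.sum(Tc * resid2) - (alpha_hat - alpha_hat**2) * np.sum(Tc * xlag)) \
                   / np.sum(Tc * (xlag**2 - xlag))
    sigma2_z = resid2.mean() - sigma2_alpha * np.mean(xlag**2 - xlag) \
               - (alpha_hat - alpha_hat**2) * np.mean(xlag)
    gamma_hat = alpha_hat - alpha_hat**2 - sigma2_alpha
    return sigma2_alpha, sigma2_z, gamma_hat
```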
Asymptotic normality of the MQL estimators in (7) is established in the following theorem.

Theorem 3.2. The joint limit distribution of the MQL estimators (α̃, λ̃) given by (7) is

    √n ( (α̃, λ̃)ᵀ − (α, λ)ᵀ ) →_d N(0, T⁻¹(θ) Q(θ) T⁻¹(θ)),

where

    Q(θ) = ( T_2(θ)   T_3(θ) )
           ( T_3(θ)   T_1(θ) ),

    T⁻¹(θ) = (T_1(θ)T_2(θ) − T_3²(θ))⁻¹ (  T_1(θ)   −T_3(θ) )
                                        ( −T_3(θ)    T_2(θ) ),
with T_1(θ) = E[V_θ⁻¹(X_1 | X_0)], T_2(θ) = E[X_0² V_θ⁻¹(X_1 | X_0)] and T_3(θ) = E[X_0 V_θ⁻¹(X_1 | X_0)].

Note that the consistency of α̃ and λ̃ follows readily from the above result.

4. Simulation studies

Consider the model X_t = α_t ∘ X_{t−1} + Z_t, where {α_t} is an i.i.d. sequence with log(α_t/(1 − α_t)) = η + U_t, η = E(log(α_t/(1 − α_t))), and {U_t} is an i.i.d. sequence of normal random variables with mean 0 and variance σ²; {Z_t} is an i.i.d. Poisson sequence with mean λ. In the simulation, we fixed X_0 at 3. Fig. 1 shows a typical sample path from this model, for a sample size of 200, when η = 1 and σ = 10. We use this model to generate data and then apply the CLS, MQL and MLE methods to estimate the parameters, the MLE serving as a benchmark. The ML score equations under this assumed parametric model for the Z_t are given by

    Σ_{t=1}^n [ ∂/∂φ { Σ_{k=0}^{min(X_t, X_{t−1})} (X_{t−1} choose k) e^{−λ} λ^{X_t−k}/(X_t − k)! ∫_0^1 α_1^k (1 − α_1)^{X_{t−1}−k} dP_α(α_1) } ]
             / [ Σ_{k=0}^{min(X_t, X_{t−1})} (X_{t−1} choose k) e^{−λ} λ^{X_t−k}/(X_t − k)! ∫_0^1 α_1^k (1 − α_1)^{X_{t−1}−k} dP_α(α_1) ] = 0,
Table 1
Estimates of the parameters α and λ for σ = 1. Entries are (bias SE) over 500 replications; columns, in order: CLS, MQL, MLE, F-MQL, F-MLE.

η = 1, λ = 1 (α = 0.6967347, σ²_α = 0.03335209)
α:  n = 50   (−0.0889 0.1217)  (−0.0594 0.1219)  (−0.0211 0.0828)  (−0.0669 0.1204)  (−0.0650 0.0922)
    n = 100  (−0.0448 0.0878)  (−0.0274 0.0819)  (−0.0137 0.0534)  (−0.0318 0.0822)  (−0.0549 0.0589)
    n = 200  (−0.0238 0.0598)  (−0.0151 0.0550)  (−0.0065 0.0370)  (−0.0175 0.0557)  (−0.0478 0.0396)
λ:  n = 50   (0.2656 0.4024)   (0.1775 0.4009)   (0.0660 0.2687)   (0.1985 0.3975)   (0.1880 0.3158)
    n = 100  (0.1329 0.2784)   (0.0826 0.2632)   (0.0571 0.1786)   (0.0945 0.2635)   (0.1661 0.2116)
    n = 200  (0.0645 0.1807)   (0.0398 0.1676)   (0.0178 0.1125)   (0.0463 0.1697)   (0.1409 0.1273)

η = 1, λ = 2 (α = 0.6967347, σ²_α = 0.03335209)
α:  n = 50   (−0.0796 0.1200)  (−0.0473 0.1173)  (−0.0172 0.0802)  (−0.0575 0.1174)  (−0.0943 0.0921)
    n = 100  (−0.0439 0.0762)  (−0.0269 0.0732)  (−0.0110 0.0505)  (−0.0330 0.0737)  (−0.0799 0.0608)
    n = 200  (−0.0251 0.0620)  (−0.0143 0.0571)  (−0.0076 0.0417)  (−0.0184 0.0592)  (−0.0707 0.0465)
λ:  n = 50   (0.3815 0.6990)   (0.2585 0.6856)   (0.0783 0.4678)   (0.3164 0.6875)   (0.5300 0.5595)
    n = 100  (0.2417 0.4770)   (0.1733 0.4622)   (0.0703 0.3137)   (0.2079 0.4634)   (0.4713 0.3800)
    n = 200  (0.1399 0.3752)   (0.0913 0.3422)   (0.0496 0.2424)   (0.1153 0.3560)   (0.4175 0.2750)

η = −1, λ = 1 (α = 0.3032653, σ²_α = 0.03335209)
α:  n = 50   (−0.0432 0.1471)  (−0.0302 0.1490)  (0.0014 0.1340)   (−0.0367 0.1480)  (−0.0264 0.1377)
    n = 100  (−0.0200 0.1058)  (−0.0112 0.1038)  (0.0018 0.0939)   (−0.0157 0.1039)  (−0.0191 0.0946)
    n = 200  (−0.0119 0.0777)  (−0.0067 0.0753)  (−0.0015 0.0702)  (−0.0093 0.0754)  (−0.0188 0.0675)
λ:  n = 50   (0.0979 0.2635)   (0.0494 0.2620)   (0.0082 0.2446)   (0.0570 0.2636)   (0.0444 0.2510)
    n = 100  (0.0490 0.1677)   (0.0229 0.1664)   (0.0078 0.1523)   (0.0275 0.1669)   (0.0340 0.1562)
    n = 200  (0.0190 0.1169)   (0.0050 0.1145)   (0.0018 0.1072)   (0.0076 0.1145)   (0.0221 0.1058)

η = −1, λ = 2 (α = 0.3032653, σ²_α = 0.03335209)
α:  n = 50   (−0.0539 0.1337)  (−0.0368 0.1387)  (−0.0087 0.1238)  (−0.0462 0.1354)  (−0.0525 0.1206)
    n = 100  (−0.0197 0.1030)  (−0.0142 0.0973)  (−0.0055 0.0890)  (−0.0170 0.1000)  (−0.0341 0.0859)
    n = 200  (−0.0117 0.0760)  (−0.0063 0.0737)  (−0.0027 0.0677)  (−0.0088 0.0741)  (−0.0211 0.0638)
λ:  n = 50   (0.1330 0.4103)   (0.0862 0.4208)   (0.0188 0.3763)   (0.1082 0.4151)   (0.1354 0.3976)
    n = 100  (0.0550 0.3126)   (0.0379 0.2955)   (0.0223 0.2759)   (0.0456 0.3042)   (0.0991 0.2766)
    n = 200  (0.0331 0.2244)   (0.0183 0.2204)   (0.0169 0.2066)   (0.0243 0.2209)   (0.0552 0.1890)
where φ = (η, σ, λ)ᵀ and

    dP_α(α_1) = [ 1/(√(2π) σ α_1 (1 − α_1)) ] exp( −(log(α_1/(1 − α_1)) − η)² / (2σ²) ) dα_1.

The (univariate) integrals in the above expressions were computed using a standard integration routine in R. For comparison, we also use the corresponding methods for the fixed coefficient model, MQL and MLE (denoted by F-MQL and F-MLE in Tables 1 and 2), to estimate the parameters α and λ from the same data generated from the above model. For the fixed coefficient model, Var(X_t | X_{t−1}) = X_{t−1} α(1 − α) + σ²_z, and we use this expression in the MQL estimating equations. The conditional density f(X_t | X_{t−1}) for the fixed coefficient model is

    f(X_t | X_{t−1}) = Σ_{k=0}^{min(X_t, X_{t−1})} (X_{t−1} choose k) e^{−λ} λ^{X_t−k}/(X_t − k)! α^k (1 − α)^{X_{t−1}−k}.

The F-MLEs of α and λ are obtained by maximizing Σ_{t=1}^n log f(X_t | X_{t−1}).
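For the simulation model of this section (logit-normal α_t and Poisson Z_t), data can be generated as in the sketch below. The symbol η for the logit mean follows the reconstruction above, X_0 = 3 as stated, and the function name is ours.

```python
import numpy as np

rng = np.random.default_rng(2023)

def simulate_logitnormal_rcinar1(n, eta, sigma, lam, x0=3, rng=rng):
    """Generate X_1, ..., X_n from the Section 4 model:
    logit(alpha_t) = eta + U_t with U_t ~ N(0, sigma^2), and Z_t ~ Poisson(lam)."""
    x = np.empty(n + 1, dtype=int)
    x[0] = x0
    for t in range(1, n + 1):
        u = rng.normal(0.0, sigma)
        alpha_t = 1.0 / (1.0 + np.exp(-(eta + u)))   # inverse logit
        x[t] = rng.binomial(x[t - 1], alpha_t) + rng.poisson(lam)
    return x

x = simulate_logitnormal_rcinar1(n=200, eta=1.0, sigma=1.0, lam=1.0)
```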
Table 2
Estimates of the parameters α and λ for σ = 10. Entries are (bias SE) over 500 replications; columns, in order: CLS, MQL, MLE, F-MQL, F-MLE.

η = −1, λ = 1 (α = 0.4607841, σ²_α = 0.2093921)
α:  n = 50   (−0.0891 0.1770)  (−0.0302 0.1653)  (−0.0144 0.1253)  (−0.0649 0.1781)  (−0.1473 0.1324)
    n = 100  (−0.0510 0.1284)  (−0.0170 0.1129)  (−0.0021 0.0801)  (−0.0352 0.1233)  (−0.1317 0.0896)
    n = 200  (−0.0322 0.0979)  (−0.0078 0.0809)  (−0.0013 0.0546)  (−0.0213 0.0913)  (−0.1019 0.0584)
λ:  n = 50   (0.1681 0.2826)   (0.0506 0.2622)   (0.0173 0.1933)   (0.0993 0.2823)   (0.2754 0.2707)
    n = 100  (0.0985 0.2150)   (0.0322 0.1876)   (0.0050 0.1482)   (0.0580 0.2050)   (0.2488 0.1920)
    n = 200  (0.0577 0.1500)   (0.0131 0.1230)   (0.0035 0.0977)   (0.0318 0.1373)   (0.2065 0.1401)

η = −1, λ = 2 (α = 0.4607841, σ²_α = 0.2093921)
α:  n = 50   (−0.1220 0.1632)  (−0.0518 0.1575)  (−0.0224 0.1153)  (−0.0976 0.1713)  (−0.1927 0.1082)
    n = 100  (−0.0574 0.1361)  (−0.0194 0.1087)  (0.0087 0.0764)   (−0.0443 0.1318)  (−0.1595 0.0891)
    n = 200  (−0.0353 0.0991)  (−0.0108 0.0812)  (0.0037 0.0470)   (−0.0265 0.0947)  (−0.1569 0.0632)
λ:  n = 50   (0.3123 0.4652)   (0.1036 0.4373)   (0.0155 0.3000)   (0.2375 0.4782)   (0.5672 0.3236)
    n = 100  (0.1656 0.4148)   (0.0502 0.3229)   (0.0051 0.1835)   (0.1246 0.3967)   (0.5015 0.2267)
    n = 200  (0.1111 0.2969)   (0.0339 0.2306)   (−0.0098 0.1371)  (0.0825 0.2776)   (0.4606 0.1783)

η = 1, λ = 1 (α = 0.5391493, σ²_α = 0.2097591)
α:  n = 50   (−0.1063 0.1586)  (−0.0404 0.1476)  (−0.0212 0.1043)  (−0.0786 0.1581)  (−0.1798 0.1190)
    n = 100  (−0.0579 0.1262)  (−0.0162 0.1087)  (−0.0101 0.0723)  (−0.0385 0.1204)  (−0.1579 0.0890)
    n = 200  (−0.0338 0.0927)  (−0.0083 0.0715)  (−0.0085 0.0466)  (−0.0218 0.0849)  (−0.1483 0.0614)
λ:  n = 50   (0.2136 0.3031)   (0.0726 0.2790)   (0.0299 0.1972)   (0.1350 0.2966)   (0.3727 0.2986)
    n = 100  (0.1141 0.2244)   (0.0288 0.1891)   (0.0114 0.1313)   (0.0646 0.2103)   (0.3357 0.2025)
    n = 200  (0.0647 0.1660)   (0.0136 0.1281)   (0.0123 0.0909)   (0.0349 0.1492)   (0.3159 0.1398)

η = 1, λ = 2 (α = 0.5391493, σ²_α = 0.2097591)
α:  n = 50   (−0.1068 0.1619)  (−0.0396 0.1604)  (−0.0135 0.1096)  (−0.0780 0.1755)  (−0.1765 0.1143)
    n = 100  (−0.0619 0.1277)  (−0.0126 0.1097)  (−0.0116 0.0595)  (−0.0456 0.1247)  (−0.1922 0.0957)
    n = 200  (−0.0296 0.1025)  (−0.0034 0.0712)  (−0.0095 0.0403)  (−0.0199 0.0976)  (−0.1814 0.0778)
λ:  n = 50   (0.2618 0.5106)   (0.0901 0.4856)   (0.0358 0.3106)   (0.2034 0.5216)   (0.7586 0.3539)
    n = 100  (0.2122 0.4116)   (0.0408 0.3403)   (0.0324 0.1912)   (0.1566 0.3958)   (0.7173 0.2471)
    n = 200  (0.0970 0.3476)   (0.0058 0.2311)   (0.0186 0.1333)   (0.0624 0.3249)   (0.6955 0.1872)
We chose various combinations of the parameters η and λ for σ = 1 and 10. These, in turn, resulted in various values of α and σ²_α. From the above model,

    α = E(α_1) = ∫_0^1 [ 1/(√(2π) σ α_1 (1 − α_1)) ] exp( −(log(α_1/(1 − α_1)) − η)² / (2σ²) ) dα_1,

and σ²_α = E(α_1²) − α², where the first term can be computed similarly. For instance, if η = 1 and σ = 1, then one gets α ≈ 0.70 and σ²_α ≈ 0.03. We computed the empirical bias and the standard error (SE) based on 500 replications for each parameter combination. These values are reported within parentheses in Tables 1 and 2 in the format (BIAS SE); for example, (0.0062 0.0054) means that the bias is 0.0062 and the SE is 0.0054.
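The values of α and σ²_α quoted in the tables follow from one-dimensional numerical integration. The sketch below integrates over a standard normal variable rather than over dP_α directly, which is equivalent to the displayed integral after a change of variables; the R routine actually used by the authors is not shown, and the function name is ours.

```python
import numpy as np
from scipy.integrate import quad

def logitnormal_moments(eta, sigma):
    """alpha = E(alpha_1) and sigma2_alpha = Var(alpha_1) when logit(alpha_1) ~ N(eta, sigma^2)."""
    def moment(power):
        integrand = lambda z: (1.0 / (1.0 + np.exp(-(eta + sigma * z))))**power \
                              * np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
        value, _ = quad(integrand, -np.inf, np.inf)
        return value
    m1, m2 = moment(1), moment(2)
    return m1, m2 - m1**2

print(logitnormal_moments(1.0, 1.0))    # approximately (0.6967, 0.0334), cf. Table 1
print(logitnormal_moments(-1.0, 10.0))  # approximately (0.4608, 0.2094), cf. Table 2
```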
Table 3
Estimates of the parameter σ²_α for σ = 1. Entries in the Hybrid and MLE columns are (bias SE) over 500 replications; Percentage is the proportion of samples with a positive moment estimate of σ²_α.

η = 1, λ = 1 (α = 0.6967347, σ²_α = 0.03335209)
  n = 50   Hybrid (−0.0153 0.0524)   Percentage 0.686   MLE (−0.0067 0.0262)
  n = 100  Hybrid (−0.0058 0.0374)   Percentage 0.796   MLE (−0.0030 0.0206)
  n = 200  Hybrid (−0.0057 0.0270)   Percentage 0.868   MLE (−0.0026 0.0150)

η = 1, λ = 2 (α = 0.6967347, σ²_α = 0.03335209)
  n = 50   Hybrid (−0.0011 0.0293)   Percentage 0.890   MLE (−0.0034 0.0194)
  n = 100  Hybrid (−0.0005 0.0239)   Percentage 0.926   MLE (−0.0019 0.0147)
  n = 200  Hybrid (−0.0023 0.0163)   Percentage 0.958   MLE (−0.0020 0.0103)

η = −1, λ = 1 (α = 0.3032653, σ²_α = 0.03335209)
  n = 50   Hybrid (−0.0066 0.1142)   Percentage 0.618   MLE (−0.0036 0.0351)
  n = 100  Hybrid (−0.0070 0.0880)   Percentage 0.652   MLE (−0.0026 0.0301)
  n = 200  Hybrid (−0.0029 0.0678)   Percentage 0.692   MLE (−0.0022 0.0240)

η = −1, λ = 2 (α = 0.3032653, σ²_α = 0.03335209)
  n = 50   Hybrid (0.0030 0.0850)    Percentage 0.724   MLE (0.0010 0.0398)
  n = 100  Hybrid (0.0011 0.0623)    Percentage 0.765   MLE (−0.0010 0.0289)
  n = 200  Hybrid (−0.001 0.0456)    Percentage 0.788   MLE (−0.0016 0.0239)

Table 4
Estimates of the parameter σ²_α for σ = 10. Entries in the Hybrid and MLE columns are (bias SE) over 500 replications; Percentage is the proportion of samples with a positive moment estimate of σ²_α.

η = −1, λ = 1 (α = 0.4607841, σ²_α = 0.2093921)
  n = 50   Hybrid (−0.0338 0.0911)   Percentage 0.968   MLE (−0.0836 0.0433)
  n = 100  Hybrid (−0.0249 0.0704)   Percentage 0.99    MLE (−0.0711 0.0272)
  n = 200  Hybrid (−0.0146 0.0452)   Percentage 1       MLE (−0.0416 0.0153)

η = −1, λ = 2 (α = 0.4607841, σ²_α = 0.2093921)
  n = 50   Hybrid (−0.0226 0.0650)   Percentage 0.984   MLE (−0.0542 0.0356)
  n = 100  Hybrid (−0.0180 0.0490)   Percentage 1       MLE (−0.0476 0.0225)
  n = 200  Hybrid (−0.0093 0.0331)   Percentage 1       MLE (−0.0494 0.0075)

η = 1, λ = 1 (α = 0.5391493, σ²_α = 0.2097591)
  n = 50   Hybrid (−0.0230 0.0804)   Percentage 0.988   MLE (−0.0670 0.0310)
  n = 100  Hybrid (−0.0165 0.0572)   Percentage 0.998   MLE (−0.0653 0.0143)
  n = 200  Hybrid (−0.0097 0.0384)   Percentage 1       MLE (−0.0727 0.0057)

η = 1, λ = 2 (α = 0.5391493, σ²_α = 0.2097591)
  n = 50   Hybrid (−0.0219 0.0627)   Percentage 0.99    MLE (−0.0526 0.0339)
  n = 100  Hybrid (−0.0097 0.0417)   Percentage 1       MLE (−0.0680 0.0081)
  n = 200  Hybrid (−0.0090 0.0275)   Percentage 1       MLE (−0.0654 0.0039)
Recall that both the CLS and the MQL are primarily used for computing estimates of α and λ. However, a preliminary estimate of σ²_α (and similarly of σ²_z) is needed before the MQL method can be used to compute the estimates of α and λ. Recall that such an estimator was proposed in (8), where we use the CLS estimates of α and λ. Furthermore, we have used the function T(x) = 1/(1 + (1 + x)²), which is positive and bounded, in our calculations. Note that for finite sample sizes, there is a positive probability that the moment-based estimators of σ²_α and/or σ²_z are negative. As mentioned earlier, for such samples we use the MLEs obtained from a parametric (Poisson) model for Z. The statistical properties of this hybrid estimator of σ²_α are listed in Tables 3 and 4. We also report the percentage of positive moment estimates of σ²_α in the Percentage column of those tables.
[Fig. 1. A typical sample path from a RCINAR(1) model: X(t) plotted against t = 0, . . . , 200.]
From the above results, we can see that CLS and MQL are both good estimation methods, producing estimators whose bias and SEs are overall comparable to those of the benchmark ML method. This is especially true if, in addition, one factors in the fact that both methods are fast and easy to implement. It is perhaps not surprising that, under the correct parametric model, the MLE has the best performance in terms of both bias and MSE. On the other hand, as is to be expected, the simplest method, namely the CLS, had the largest MSE. However, overall, all three methods produced good estimators of the parameters, especially for larger sample sizes. Also, as expected, when σ²_α is not too small (e.g., σ²_α ≈ 0.2 for σ = 10), the MQL estimator obtained from a fixed coefficient model is substantially worse than that under the random coefficient model. Interestingly, the F-MQL, which uses the wrong variance formula, is still slightly better than the CLS. The F-MLE is the worst performer in terms of both bias and overall MSE. We should mention that for large σ, the percentage of negative estimates of σ²_α is very small.

It is to be noted that the CLS estimate remains the same under both the random coefficient model (true model) and the fixed coefficient model (wrong model), since the conditional expectation function is the same under the two models. Since the conditional variance function changes under the two models, the MQL estimate does change. However, the performance of the MQL estimate remains reasonably good under the wrong variance function, and its first-order efficiency under the wrong model is not affected significantly when σ²_α is not too large. Therefore, the MQL estimate is relatively robust against changes in the variance function as long as our parameters of interest (α and λ) are the mean parameters from the conditional expectation function, which is the case here. It may also be noted that the MQL is closer to the F-MQL than the MLE is to the F-MLE; in this sense, the MQL is more robust than the MLE with regard to model misspecification.
5. Summary and conclusions

In this paper, we have introduced a new model for count data by extending the fixed coefficient INAR(1) model: the autoregressive parameter is allowed to vary randomly over time. The stationarity and ergodicity of the process are established. Conditional least squares (CLS) and modified quasi-likelihood (MQL) estimators of the model parameters are derived and their asymptotic distributions are obtained. In the simulation study, we compare the CLS, MQL and ML estimators. The simulation results show that all three methods give good estimates for large samples. The CLS method can be used for its simplicity; however, the MQL method gives more efficient estimators than the CLS. Care must be exercised in computing the MQL estimates (a positive estimate of the variance parameter σ²_α must be substituted), as suggested in Section 4. The ML method is the most efficient (as can be expected), even though it is computationally the most intensive of the three methods. Another drawback of the ML method is that it requires specific knowledge of the distributions of α_t and Z_t. For instance, if a fixed coefficient INAR(1) model is used when, in fact, the data come from a random coefficient model, the ML method performs poorly, whereas the MQL method is more robust and gives good estimates even when the wrong model is used. On balance, therefore, we recommend the MQL method for the new model proposed in this paper.
Acknowledgements

We thank the two referees for their detailed and constructive suggestions.
Appendix

Proof of Proposition 2.1. Parts (i)–(iv) are straightforward to verify. We give below the proofs of (v) and (vi) only.

(v) We have

    Var(X_t) = Var(E(X_t | X_{t−1})) + E(Var(X_t | X_{t−1}))
             = α² Var(X_{t−1}) + E(σ²_α X²_{t−1} + (α(1 − α) − σ²_α)X_{t−1} + σ²_z)
             = α² Var(X_{t−1}) + σ²_α (Var(X_{t−1}) + E²(X_{t−1})) + (α(1 − α) − σ²_α)E(X_{t−1}) + σ²_z.

The conclusion follows by induction from the above equation with the initial values Var(X_0) = b/(1 − α² − σ²_α) and E(X_0) = λ/(1 − α).

(vi) By repeated application of (3), with t replaced by t + k, we have

    X_{t+k} = α_{t+k} ∘ · · · ∘ α_{t+1} ∘ X_t + Σ_{i=0}^{k−1} α_{t+k} ∘ · · · ∘ α_{t+k−i+1} ∘ Z_{t+k−i},

where, for i = 0, the term α_{t+k} ∘ · · · ∘ α_{t+k−i+1} ∘ Z_{t+k−i} is to be read as Z_{t+k}. Therefore,

    Cov(X_{t+k}, X_t) = Cov( α_{t+k} ∘ · · · ∘ α_{t+1} ∘ X_t + Σ_{i=0}^{k−1} α_{t+k} ∘ · · · ∘ α_{t+k−i+1} ∘ Z_{t+k−i}, X_t )
                      = Cov( α_{t+k} ∘ · · · ∘ α_{t+1} ∘ X_t, X_t )   (since X_t is independent of the second term)
                      = E[X_t (α_{t+k} ∘ · · · ∘ α_{t+1} ∘ X_t)] − E(α_{t+k} ∘ · · · ∘ α_{t+1} ∘ X_t) E(X_t).

By repeated conditioning, the first term equals α^k E(X_t²) and the second term equals α^k E²(X_t), leading to α^k Var(X_t) for the covariance term.

Proof of Proposition 2.2. From (4) and the assumption f_z > 0, we get P_{ij} > 0, and hence the process {X_t} is an irreducible and aperiodic Markov chain. Thus, it is either positive recurrent or
    lim_{n→∞} P^n_{ij} = 0    (9)

for all i, j; see, e.g., Theorem 4.3.3 in Ross (1996). Repeated application of (3), with t replaced by n, gives

    X_n = α_n ∘ · · · ∘ α_1 ∘ X_0 + Σ_{k=1}^{n−1} (α_n ∘ · · · ∘ α_{n−k+1} ∘ Z_{n−k}) + Z_n
        =_d α_n ∘ · · · ∘ α_1 ∘ X_0 + Σ_{k=1}^{n−1} (α_{k+1} ∘ · · · ∘ α_2 ∘ Z_{k+1}) + Z_1 =: Y_n, say,    (10)

where, for n = 1, Σ_{k=1}^{n−1} (α_n ∘ · · · ∘ α_{n−k+1} ∘ Z_{n−k}) = 0 = Σ_{k=1}^{n−1} (α_{k+1} ∘ · · · ∘ α_2 ∘ Z_{k+1}); the equality in distribution holds both unconditionally and conditionally given X_0 = j, say. The first term in (10) is o_p(1), again both conditionally given X_0 = j and unconditionally, since, as before, E(α_n ∘ · · · ∘ α_1 ∘ X_0) = α^n E(X_0) → 0.
Next, we show that Y_n converges almost surely. For all ε > 0 and m, n ∈ N_0,

    P( max_{1≤k≤m} |Y_n − Y_{n+k}| > ε )
      = P( max_{1≤k≤m} | α_n ∘ · · · ∘ α_1 ∘ X_0 − α_{n+k} ∘ · · · ∘ α_1 ∘ X_0 + Σ_{i=1}^{k} (α_{n+i} ∘ · · · ∘ α_2 ∘ Z_{n+i}) | > ε )
      ≤ P( |α_n ∘ · · · ∘ α_1 ∘ X_0 − α_{n+m} ∘ · · · ∘ α_1 ∘ X_0| + Σ_{k=1}^{m} (α_{n+k} ∘ · · · ∘ α_2 ∘ Z_{n+k}) > ε ),

since α ∘ X ≤ X, and

    E( α_n ∘ · · · ∘ α_1 ∘ X_0 − α_{n+m} ∘ · · · ∘ α_1 ∘ X_0 + Σ_{k=1}^{m} (α_{n+k} ∘ · · · ∘ α_2 ∘ Z_{n+k}) )
      ≤ α^n (1 − α^m) E(X_0) + λ α^n (1 − α^m)/(1 − α) → 0   as n → ∞.

Therefore, by the above arguments, V_n converges in distribution (both conditionally and unconditionally) to the same limit, where V_n = Σ_{k=1}^{n−1} (α_{k+1} ∘ · · · ∘ α_2 ∘ Z_{k+1}) + Z_1. Hence,

    lim_{n→∞} P(X_n = i | X_0 = j) = lim_{n→∞} P(Y_n = i | X_0 = j)
                                   = lim_{n→∞} P(V_n = i | X_0 = j)
                                   = lim_{n→∞} P(V_n = i)   (because V_n is independent of X_0)
                                   = lim_{n→∞} P(Y_n = i) = lim_{n→∞} P(X_n = i).    (11)
Suppose now that (9) holds. We then have

    lim_{n→∞} P^n_{ij} = 0 = P(Y = i),

by (11), for all i, j ≥ 0, where Y is the almost sure limit of Y_n. However, this contradicts the fact that P(Y < ∞) = 1. This proves the first part of the proposition. Almost sure convergence of V_n follows from the same arguments used above.

Now, we prove that {V_n, n ≥ 0} converges in L². In the proof, we need the following results:

    E[(α ∘ X)(β ∘ X)] = E( E[(α ∘ X)(β ∘ X) | X] ) = αβ E(X²),    (12)
    E[(α ∘ X)²] = E( E[(α ∘ X)² | X] ) = α(1 − α) E(X) + α² E(X²),    (13)

for all α, β ∈ [0, 1]. By (12) and repeated conditional expectation, for i > j,

    E[ (α_{n+i} ∘ · · · ∘ α_2 ∘ Z_{n+i})(α_{n+j} ∘ · · · ∘ α_2 ∘ Z_{n+j}) ]
      = E( E[ (α_{n+i} ∘ · · · ∘ α_2 ∘ Z_{n+i})(α_{n+j} ∘ · · · ∘ α_2 ∘ Z_{n+j}) | α_{n+j} ∘ · · · ∘ α_2 ∘ Z_{n+j}, α_{n+j} ∘ · · · ∘ α_2 ∘ Z_{n+i}, α_{n+i}, . . . , α_{n+j+1} ] )
      = E[ ( Π_{j≤k<i} α_{n+k+1} ) (α_{n+j} ∘ · · · ∘ α_2 ∘ Z_{n+j})(α_{n+j} ∘ · · · ∘ α_2 ∘ Z_{n+i}) ]
      = E[ ( Π_{l=2}^{n+j} α_l² ) ( Π_{j≤k<i} α_{n+k+1} ) λ² ] = c^{n+j−1} α^{i−j} λ²,

where 0 < c = E(α_1²) < 1.
Finally, using the above to handle the cross-product terms and (13) to calculate the second moments, we get

    E(|V_n − V_{n+m}|²) = E[ ( Σ_{k=1}^{m} α_{n+k} ∘ · · · ∘ α_2 ∘ Z_{n+k} )² ]
      ≤ Σ_{k=1}^{m} [ (n + k − 1)(α − c) α^{n+k−2} λ + c^{n+k−2} E(Z_1²) ] + 2 Σ_{m≥i>j≥1} c^{n+j−1} α^{i−j} λ²,    (14)

while

    Σ_{m≥i>j≥1} c^{n+j−1} α^{i−j} ≤ Σ_{m≥i>j≥1} α^{n+i−1} ≤ Σ_{i=1}^{m} i α^{n+i−1} → 0

as n → ∞. The convergence (to zero) of the first term in (14) is similar. Therefore, for all m > 0, E(|V_n − V_{n+m}|²) → 0 as n → ∞. This ends the proof.

Proof of Corollary 2.1. The proof follows from Proposition 2.2 by using (12) and (13).
Proof of Theorem 3.1. Here, we give a direct proof of this result. Alternatively, one can verify the regularity conditions of Klimko and Nelson (1978). Recall from the beginning of Section 3.1 that

    S(β) = Σ_{t=1}^n (X_t − αX_{t−1} − λ)²,

with β = (α, λ)ᵀ, and that ∂S(β)/∂β = 0 leads to the CLS estimator of β. Now, let F_n = σ{X_0, X_1, . . . , X_n} and

    M_n = −2⁻¹ ∂S(β)/∂λ = Σ_{t=1}^n (X_t − αX_{t−1} − λ),   M_0 = 0.

Then

    E(M_n | F_{n−1}) = E(M_{n−1} + (X_n − αX_{n−1} − λ) | F_{n−1}) = M_{n−1} + E(X_n − αX_{n−1} − λ | F_{n−1}) = M_{n−1},

i.e., {M_n, F_n, n ≥ 0} is a martingale. We have shown that V_n converges in L² and almost surely, which implies that {V_n², n ≥ 0} is uniformly integrable. Therefore (X_n − αX_{n−1} − λ)², n ≥ 1, is uniformly integrable. By Theorem 1.1 of Billingsley (1961),

    n⁻¹ Σ_{t=1}^n (X_t − αX_{t−1} − λ)² →_{a.s.} E((X_1 − αX_0 − λ)²) = σ²_1.

Thus, by Corollary 3.2 of Hall and Heyde (1980), the martingale CLT applies and we get (1/√n) M_n →_d N(0, σ²_1). Similarly, we can prove that M'_n = −2⁻¹ ∂S(β)/∂α = Σ_{t=1}^n X_{t−1}(X_t − αX_{t−1} − λ) is a martingale, that

    n⁻¹ Σ_{t=1}^n X²_{t−1}(X_t − αX_{t−1} − λ)² →_{a.s.} E(X_0²(X_1 − αX_0 − λ)²) = σ²_2,

and that (1/√n) M'_n →_d N(0, σ²_2).
In the same way, for any c = (c_1, c_2)ᵀ ∈ R²\{(0, 0)}, we have

    (1/√n) cᵀ (M'_n, M_n)ᵀ = (1/√n) Σ_{t=1}^n (c_1 X_{t−1} + c_2)(X_t − αX_{t−1} − λ)
                           →_d N( 0, E((c_1 X_0 + c_2)²(X_1 − αX_0 − λ)²) ).

Thus, by the Cramér–Wold device,

    (1/√n) (M'_n, M_n)ᵀ →_d N( (0, 0)ᵀ, W ),

where W is as given in the statement of the theorem, with σ_12 = E(X_0(X_1 − αX_0 − λ)²). By setting M'_n = 0 and M_n = 0, we recover the estimators
    α̂ = [ n Σ_{t=1}^n X_t X_{t−1} − Σ_{t=1}^n X_{t−1} Σ_{t=1}^n X_t ] / [ n Σ_{t=1}^n X²_{t−1} − (Σ_{t=1}^n X_{t−1})² ],
    λ̂ = n⁻¹ ( Σ_{t=1}^n X_t − α̂ Σ_{t=1}^n X_{t−1} ).

After some algebra, we have

    (α̂ − α, λ̂ − λ)ᵀ = n⁻¹ [ n⁻¹ Σ_{t=1}^n X²_{t−1} − (n⁻¹ Σ_{t=1}^n X_{t−1})² ]⁻¹
                       × (  1                       −n⁻¹ Σ_{t=1}^n X_{t−1} ) ( M'_n )
                         ( −n⁻¹ Σ_{t=1}^n X_{t−1}    n⁻¹ Σ_{t=1}^n X²_{t−1} ) ( M_n  ).

By the ergodicity of the process (see Proposition 2.2) and Theorem 1.1 of Billingsley (1961),

    [ n⁻¹ Σ_{t=1}^n X²_{t−1} − (n⁻¹ Σ_{t=1}^n X_{t−1})² ]⁻¹ (  1                       −n⁻¹ Σ_{t=1}^n X_{t−1} )
                                                            ( −n⁻¹ Σ_{t=1}^n X_{t−1}    n⁻¹ Σ_{t=1}^n X²_{t−1} )
      →_{a.s.} (m_2 − m_1²)⁻¹ (  1     −m_1 )  =  V⁻¹,
                              ( −m_1    m_2 )

where m_1 = lim_{n→∞} n⁻¹ Σ_{t=1}^n X_{t−1} and m_2 = lim_{n→∞} n⁻¹ Σ_{t=1}^n X²_{t−1}. Therefore,

    √n ( (α̂, λ̂)ᵀ − (α, λ)ᵀ ) →_d N(0, V⁻¹ W V⁻¹),

where W = ( σ²_2  σ_12 ; σ_12  σ²_1 ), as in the statement of the theorem.
Proof of Theorem 3.2. Let β = (α, λ)ᵀ and θ = (σ²_α, γ, σ²_z)ᵀ, where σ²_α = Var(α_1), γ = E(α_1(1 − α_1)) and σ²_z = Var(Z_1), so that

    V_θ(X_t | X_{t−1}) = Var(X_t | X_{t−1}) = σ²_α X²_{t−1} + γ X_{t−1} + σ²_z.

First, suppose θ is known, and consider the estimating functions

    S_n^{(1)}(β, θ) = Σ_{t=1}^n V_θ⁻¹(X_t | X_{t−1}) (X_t − αX_{t−1} − λ),
    S_n^{(2)}(β, θ) = Σ_{t=1}^n V_θ⁻¹(X_t | X_{t−1}) X_{t−1} (X_t − αX_{t−1} − λ).

We have

    E[ V_θ⁻¹(X_t | X_{t−1})(X_t − αX_{t−1} − λ) | F_{t−1} ] = V_θ⁻¹(X_t | X_{t−1}) E[(X_t − αX_{t−1} − λ) | F_{t−1}] = 0

and hence

    E[S_t^{(1)}(β, θ) | F_{t−1}] = S_{t−1}^{(1)}(β, θ).

Thus, {S_t^{(1)}(β, θ), F_t, t ≥ 0} is a martingale. By Theorem 1.1 of Billingsley (1961),

    n⁻¹ Σ_{t=1}^n V_θ⁻²(X_t | X_{t−1})(X_t − αX_{t−1} − λ)²
      →_{a.s.} E[V_θ⁻²(X_1 | X_0)(X_1 − αX_0 − λ)²] = E( E[V_θ⁻²(X_1 | X_0)(X_1 − αX_0 − λ)² | X_0] ) = E[V_θ⁻¹(X_1 | X_0)] = T_1(θ).

By the same argument as for the CLS estimators, we have

    (1/√n) S_n^{(1)}(β, θ) →_d N(0, T_1(θ)).

Similarly,

    n⁻¹ Σ_{t=1}^n V_θ⁻²(X_t | X_{t−1}) X²_{t−1} (X_t − αX_{t−1} − λ)²
      →_{a.s.} E[V_θ⁻²(X_1 | X_0) X_0² (X_1 − αX_0 − λ)²] = E( E[V_θ⁻²(X_1 | X_0) X_0² (X_1 − αX_0 − λ)² | X_0] ) = E[X_0² V_θ⁻¹(X_1 | X_0)] = T_2(θ)

and

    (1/√n) S_n^{(2)}(β, θ) →_d N(0, T_2(θ)).

Again by the Cramér–Wold device, for any c = (c_1, c_2)ᵀ with c_1, c_2 ∈ R not both 0, we have

    (1/√n) cᵀ (S_n^{(2)}(β, θ), S_n^{(1)}(β, θ))ᵀ →_d N( 0, E[V_θ⁻²(X_1 | X_0)(c_1 X_0 + c_2)²(X_1 − αX_0 − λ)²] ),

implying

    (1/√n) (S_n^{(2)}(β, θ), S_n^{(1)}(β, θ))ᵀ →_d N( (0, 0)ᵀ, Q(θ) ),    (15)

where T_3(θ) = E[V_θ⁻²(X_1 | X_0) X_0 (X_1 − αX_0 − λ)²] = E[V_θ⁻¹(X_1 | X_0) X_0].
Now, we replace V_θ⁻²(X_t | X_{t−1}) by V_θ̂⁻²(X_t | X_{t−1}), where θ̂ is a consistent estimator of θ. We then want

    (1/√n) (S_n^{(2)}(β, θ̂), S_n^{(1)}(β, θ̂))ᵀ →_d N( (0, 0)ᵀ, Q(θ) ).    (16)

To obtain this, we need to prove that

    (1/√n) S_n^{(i)}(β, θ̂) − (1/√n) S_n^{(i)}(β, θ) →_p 0,   i = 1, 2.    (17)

Let R_n(θ) = (1/√n) S_n^{(1)}(β, θ). Then, for all ε > 0 and δ > 0 such that θ − δ1 > 0, where 1 is the unit vector, we have

    P(|R_n(θ̂) − R_n(θ)| > ε) ≤ P(|σ̂²_α − σ²_α| > δ) + P(|γ̂ − γ| > δ) + P(|σ̂²_z − σ²_z| > δ) + P( sup_D |R_n(θ_1) − R_n(θ)| > ε ),

where θ_1 = (σ²_{α,1}, γ_1, σ²_{z,1})ᵀ and D = {|σ²_{α,1} − σ²_α| < δ, |γ_1 − γ| < δ, |σ²_{z,1} − σ²_z| < δ}. Since θ̂ is a consistent estimator of θ, we just need to prove that

    P( sup_D |R_n(θ_1) − R_n(θ)| > ε ) → 0.

By the Markov inequality,

    P( sup_D |R_n(θ_1) − R_n(θ)| > ε )
      ≤ (1/ε²) E[ sup_D (R_n(θ_1) − R_n(θ))² ]
      = (1/ε²) E[ sup_D n⁻¹ Σ_{t=1}^n ( V_{θ_1}⁻¹(X_t | X_{t−1}) − V_θ⁻¹(X_t | X_{t−1}) )² (X_t − αX_{t−1} − λ)² ]
      = (1/ε²) E[ sup_D ( V_{θ_1}⁻¹(X_1 | X_0) − V_θ⁻¹(X_1 | X_0) )² (X_1 − αX_0 − λ)² ]
      = (1/ε²) E[ sup_D ((σ²_{α,1} − σ²_α)X_0² + (γ_1 − γ)X_0 + (σ²_{z,1} − σ²_z))² (X_1 − αX_0 − λ)² / ( V²_{θ_1}(X_1 | X_0) V²_θ(X_1 | X_0) ) ]
      = (1/ε²) E[ sup_D ((σ²_{α,1} − σ²_α)X_0² + (γ_1 − γ)X_0 + (σ²_{z,1} − σ²_z))² / ( V²_{θ_1}(X_1 | X_0) V_θ(X_1 | X_0) ) ]
      ≤ (1/ε²) sup_D { (σ²_{α,1} − σ²_α)² c_1 + (γ_1 − γ)² c_2 + (σ²_{z,1} − σ²_z)² c_3
                       + 2c_4 |(σ²_{α,1} − σ²_α)(γ_1 − γ)| + 2c_5 |(σ²_{z,1} − σ²_z)(σ²_{α,1} − σ²_α)| + 2c_6 |(γ_1 − γ)(σ²_{z,1} − σ²_z)| }
      ≤ C δ²/ε²,

where the c's are finite moments and C is a positive constant. A similar argument applies to (1/√n) S_n^{(2)}(β, θ). Letting δ go to zero, we get our assertion, which in turn establishes (16).
Similarly, we have

    (1/n) Σ_{t=1}^n V_θ⁻¹(X_t | X_{t−1}) − (1/n) Σ_{t=1}^n V_θ̂⁻¹(X_t | X_{t−1}) →_p 0,
    (1/n) Σ_{t=1}^n V_θ⁻¹(X_t | X_{t−1}) X_{t−1} − (1/n) Σ_{t=1}^n V_θ̂⁻¹(X_t | X_{t−1}) X_{t−1} →_p 0,
    (1/n) Σ_{t=1}^n V_θ⁻¹(X_t | X_{t−1}) X²_{t−1} − (1/n) Σ_{t=1}^n V_θ̂⁻¹(X_t | X_{t−1}) X²_{t−1} →_p 0.

Therefore, by the above and Theorem 1.1 of Billingsley (1961), writing V̂_t = V_θ̂(X_t | X_{t−1}) and with all sums over t = 1, . . . , n,

    [ (1/n) Σ V̂_t⁻¹ · (1/n) Σ V̂_t⁻¹ X²_{t−1} − ( (1/n) Σ V̂_t⁻¹ X_{t−1} )² ]⁻¹ (  (1/n) Σ V̂_t⁻¹            −(1/n) Σ V̂_t⁻¹ X_{t−1}  )
                                                                               ( −(1/n) Σ V̂_t⁻¹ X_{t−1}    (1/n) Σ V̂_t⁻¹ X²_{t−1} )
      →_p (T_1(θ)T_2(θ) − T_3²(θ))⁻¹ (  T_1(θ)   −T_3(θ) )  =  T⁻¹(θ).
                                     ( −T_3(θ)    T_2(θ) )

After some algebra, we have

    (α̃ − α, λ̃ − λ)ᵀ = n⁻¹ [ (1/n) Σ V̂_t⁻¹ · (1/n) Σ V̂_t⁻¹ X²_{t−1} − ( (1/n) Σ V̂_t⁻¹ X_{t−1} )² ]⁻¹
                       × (  (1/n) Σ V̂_t⁻¹            −(1/n) Σ V̂_t⁻¹ X_{t−1}  ) ( S_n^{(2)}(β, θ̂) )
                         ( −(1/n) Σ V̂_t⁻¹ X_{t−1}    (1/n) Σ V̂_t⁻¹ X²_{t−1} ) ( S_n^{(1)}(β, θ̂) ).    (18)

Therefore, by (16) and (18),

    √n ( (α̃, λ̃)ᵀ − (α, λ)ᵀ ) →_d N(0, T⁻¹(θ) Q(θ) T⁻¹(θ)),

where Q(θ) is as given in the statement of the theorem. This completes the proof.
Proof of Proposition 3.1. Let

    A_n = (1/n) Σ_{t=1}^n (T(X_{t−1}) − T̄)(X_t − αX_{t−1} − λ)²,
    B_n = (1/n) Σ_{t=1}^n (T(X_{t−1}) − T̄) X²_{t−1},
    C_n = (1/n) Σ_{t=1}^n (T(X_{t−1}) − T̄) X_{t−1}.

By Theorem 1.1 of Billingsley (1961),

    A_n →_{a.s.} E[(T(X_0) − E(T(X_0)))(X_1 − αX_0 − λ)²] = σ²_α(γ_1 − γ_2) + (α − α²)γ_2,
    B_n →_{a.s.} γ_1,
    C_n →_{a.s.} γ_2,

where

    γ_1 = E[(T(X_∞) − E(T(X_∞))) X²_∞],   γ_2 = E[(T(X_∞) − E(T(X_∞))) X_∞],

and X_∞ denotes the limiting random variable corresponding to the stationary distribution of the process. Since α̂ →_p α and λ̂ →_p λ, the estimator σ̂²_α in (8) can be written, up to terms that vanish in probability, as [A_n − (α̂ − α̂²)C_n]/(B_n − C_n). Therefore,

    σ̂²_α →_p [ σ²_α(γ_1 − γ_2) + (α − α²)γ_2 − (α − α²)γ_2 ] / (γ_1 − γ_2) = σ²_α,   provided γ_1 ≠ γ_2.

In order to avoid the case γ_1 = γ_2, we need to assume that the limiting random variable X_∞ is not constant and does not take values only in {0, 1}, and that T(X_∞) is not almost surely equal to E(T(X_∞)).

Similar arguments lead to σ̂²_z →_p σ²_z and γ̂ →_p γ.
References

Al-osh, M.A., Alzaid, A.A., 1987. First order integer-valued autoregressive (INAR(1)) processes. J. Time Ser. Anal. 8, 261–275.
Al-osh, M.A., Alzaid, A.A., 1991. Binomial autoregressive moving average models. Stochastic Models 7, 261–282.
Al-osh, M.A., Alzaid, A.A., 1992. First order autoregressive time series with negative binomial and geometric marginals. Comm. Statist. Theory Methods 21, 2483–2492.
Alzaid, A.A., Al-osh, M.A., 1988. First order integer-valued autoregressive (INAR(1)) processes: distributional and regression properties. Statist. Neerlandica 42, 53–61.
Billingsley, P., 1961. Statistical Inference for Markov Processes. University of Chicago Press, Chicago.
Davis, R.A., Dunsmuir, W.T.M., Wang, Y., 1999. Modeling time series of count data. In: Ghosh, S. (Ed.), Asymptotics, Nonparametrics and Time Series. Marcel Dekker, New York, pp. 63–114.
Fukasawa, T., Basawa, I.V., 2002. Estimation for a class of generalized state-space time series models. Statist. Probab. Lett. 60, 459–473.
Hall, P., Heyde, C.C., 1980. Martingale Limit Theory and Its Application. Academic Press, New York.
Klimko, L.A., Nelson, P.I., 1978. On conditional least squares estimation for stochastic processes. Ann. Statist. 6, 629–642.
MacDonald, I.L., Zucchini, W., 1997. Hidden Markov and Other Models for Discrete-valued Time Series. Chapman & Hall, London.
McKenzie, E., 1985a. Contribution to the discussion of Lawrance and Lewis. J. Roy. Statist. Soc. B 47, 187–188.
McKenzie, E., 1985b. Some simple models for discrete variate time series. Water Resour. Bull. 21, 645–650.
McKenzie, E., 1986. Autoregressive moving-average processes with negative-binomial and geometric marginal distributions. Adv. Appl. Probab. 18, 679–705.
McKenzie, E., 1987. Innovation distributions for gamma and negative-binomial autoregressions. Scand. J. Statist. 14, 79–85.
McKenzie, E., 1988a. The distributional structure of finite moving-average processes. J. Appl. Probab. 25, 313–321.
McKenzie, E., 1988b. Some ARMA models for dependent sequences of Poisson counts. Adv. Appl. Probab. 20, 822–835.
Ross, S.M., 1996. Stochastic Processes, second ed. Wiley, New York.
Schick, A., 1996. √n-consistent estimation in a random coefficient autoregressive model. Austral. J. Statist. 38, 155–160.
Steutel, F.W., van Harn, K., 1979. Discrete analogues of self-decomposability and stability. Ann. Probab. 7, 893–899.