Journal of Statistical Planning and Inference 173 (2007) 212 – 229 www.elsevier.com/locate/jspi
First-order random coefficient integer-valued autoregressive processes

Haitao Zheng^a, Ishwar V. Basawa^a, Somnath Datta^b,*

^a Department of Statistics, University of Georgia, Athens, GA 30602, USA
^b Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY 40202, USA

* Corresponding author. Tel.: +1 502 852 6376; fax: +1 502 852 3294. E-mail address: [email protected] (S. Datta).
Received 23 February 2005; accepted 14 December 2005 Available online 20 February 2006
Abstract

A first-order random coefficient integer-valued autoregressive (RCINAR(1)) model is introduced. Ergodicity of the process is established. Moments and autocovariance functions are obtained. Conditional least squares and quasi-likelihood estimators of the model parameters are derived and their asymptotic properties are established. The performance of these estimators is compared with the maximum likelihood estimator via simulation.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Models of count data; Thinning models; INAR models; Random coefficient models; Conditional least squares; Quasi-likelihood; Maximum likelihood; Asymptotic distributions
1. Introduction

Integer-valued time series data are fairly common in practice (e.g., monthly unemployment figures), but models and methods for their analysis are relatively new in the literature. Only a few broad classes of time series models have been developed recently for count data. See, for instance, Davis et al. (1999) for a recent review; see also MacDonald and Zucchini (1997). Two main classes are: (a) state-space models (using latent processes) and (b) thinning models. See Fukasawa and Basawa (2002) for an example of state-space modeling. The integer autoregressive (INAR) models belong to the class of thinning models.

A first-order autoregressive model for count (or integer-valued) data is defined through the "thinning" operator ∘, which is due to Steutel and van Harn (1979). Let X be an integer-valued random variable and α ∈ [0, 1]; then the thinning operator ∘ is defined as

    α ∘ X = Σ_{i=1}^{X} B_i,    (1)
where {B_i} is an i.i.d. Bernoulli sequence with P(B_i = 1) = α, independent of X. With this operator, the INAR(1) model is defined as

    X_t = α ∘ X_{t−1} + Z_t,   t ≥ 1,    (2)

where {Z_t} is a sequence of i.i.d. non-negative integer-valued random variables with mean λ and variance σ²_z, and {Z_t} is independent of X_0. Throughout the rest of the paper, {Z_t} will be referred to as innovations. This model shares several properties with the ordinary AR(1) model and has been discussed by McKenzie (1985a,b, 1986, 1987, 1988a,b), Al-osh and Alzaid (1987, 1991, 1992) and Alzaid and Al-osh (1988), among others. In particular, note that the parameter α in (2) plays the role of an autoregressive parameter.

In some situations, the parameter α may vary with time, and it may be random. For example, let X_t denote the number of terminally ill patients in the tth month. Here, X_t could potentially satisfy an INAR model where α ∘ X_{t−1} is the number of surviving patients from the previous month and Z_t stands for the patients newly admitted in the current month. The survival rate may be affected by various environmental factors, such as the quality of health care, the state of health of the patients, etc., and could vary randomly over time. Similarly, suppose X_t denotes the number of unemployed in the tth month. Then X_t can be modeled as the sum of the previously unemployed who remain unemployed, α ∘ X_{t−1}, and the newly unemployed Z_t. Again, the rate α may vary randomly over time, being affected by factors such as the state of the economy, productivity growth, etc.

In this paper, we extend the above model to a random coefficient model, where the fixed α is replaced by a random parameter α_t, the α_t being realizations of i.i.d. random variables taking values in the interval [0, 1). Our main objective is to investigate basic probabilistic and statistical properties of this model and inferential methods for the relevant parameters associated with the model.

The paper is organized as follows. In Section 2, the random coefficient model is described in detail and some statistical properties are established. In Section 3, we propose two estimation methods for the model parameters. In Section 4, we present some simulation results for the estimation methods. The parameter estimates are compared with the benchmark maximum likelihood (ML) estimates, which can be computed under a full parametric specification as in the simulation setting. We also compare these estimates with those obtained from the standard fixed coefficient INAR(1) model. The paper ends with a discussion section. All the proofs are deferred to the Appendix.
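To make the thinning mechanism concrete, the following sketch (Python, not part of the original paper) simulates the operator α ∘ X and a fixed-coefficient INAR(1) path. The Poisson choice for the innovations and the function names are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

def thin(alpha, x, rng):
    """Binomial thinning: alpha o x is a sum of x i.i.d. Bernoulli(alpha) variables."""
    return rng.binomial(x, alpha)

def simulate_inar1(alpha, lam, n, x0=0, rng=rng):
    """Simulate X_t = alpha o X_{t-1} + Z_t with Z_t ~ Poisson(lam) (illustrative choice)."""
    x = np.empty(n + 1, dtype=int)
    x[0] = x0
    for t in range(1, n + 1):
        x[t] = thin(alpha, x[t - 1], rng) + rng.poisson(lam)
    return x

path = simulate_inar1(alpha=0.5, lam=1.0, n=200)
```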
2. The first-order random coefficient integer-valued autoregressive model

A first-order random coefficient integer-valued autoregressive (RCINAR(1)) model is defined by the recursive equation

    X_t = α_t ∘ X_{t−1} + Z_t,   t ≥ 1,    (3)

where {α_t} is an i.i.d. sequence with cumulative distribution function (CDF) P_α on [0, 1); {Z_t} is an i.i.d. non-negative integer-valued sequence with probability mass function (PMF) f_z such that E(Z_t⁴) < ∞; X_0, {α_t} and {Z_t} are independent; and E(X_0²) < ∞. Let α = E(α_t), σ²_α = Var(α_t), λ = E(Z_t) and σ²_z = Var(Z_t), so that E(α_t²) = σ²_α + α²; all of these moments are assumed finite. It is easy to see that X_t is a Markov chain on {0, 1, 2, . . .} with transition probabilities

    P_{ij} = P(X_t = i | X_{t−1} = j) = Σ_{k=0}^{min(i,j)} (j choose k) f_z(i − k) ∫_0^1 α_1^k (1 − α_1)^{j−k} dP_α(α_1).    (4)
The Markov property follows from (3) and the fact that {α_t} is an i.i.d. sequence. Eq. (4) is obtained by noting that, conditional on α_t, α_t ∘ X_{t−1} is binomial with parameters (X_{t−1}, α_t); hence, conditional on (X_{t−1}, α_t ∘ X_{t−1}), the PMF of X_t evaluated at i is f_z(i − α_t ∘ X_{t−1}). Unconditioning then yields (4). The ergodicity property of the process is of independent interest, and it will be very useful in deriving the asymptotic properties of the parameter estimates. The moments and conditional moments will be useful in obtaining the appropriate estimating equations for parameter estimation.
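As an illustration of (4), the sketch below evaluates P_{ij} numerically. The choices P_α = Beta(2, 3) and f_z = Poisson(1) are assumptions made only for this example; the paper leaves both distributions unspecified.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad
from scipy.special import comb

def transition_prob(i, j, f_z, alpha_pdf):
    """P(X_t = i | X_{t-1} = j) from Eq. (4): sum over the surviving count k,
    integrating the binomial kernel against the distribution of alpha_1."""
    total = 0.0
    for k in range(min(i, j) + 1):
        integrand = lambda a: a**k * (1.0 - a)**(j - k) * alpha_pdf(a)
        integral, _ = quad(integrand, 0.0, 1.0)
        total += comb(j, k) * f_z(i - k) * integral
    return total

# Illustrative choices: alpha_1 ~ Beta(2, 3) and Z ~ Poisson(1)
alpha_pdf = stats.beta(2, 3).pdf
f_z = lambda z: stats.poisson(1.0).pmf(z)
p = transition_prob(i=2, j=3, f_z=f_z, alpha_pdf=alpha_pdf)
```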
Proposition 2.1. For t ≥ 1,

(i) E(X_t | X_{t−1}) = αX_{t−1} + λ.
(ii) E(X_t) = λ/(1 − α), if E(X_0) = λ/(1 − α).
(iii) Var(X_t | X_{t−1}, α_t) = α_t(1 − α_t)X_{t−1} + σ²_z.
(iv) Var(X_t | X_{t−1}) = σ²_α X²_{t−1} + (α(1 − α) − σ²_α)X_{t−1} + σ²_z.
(v) Var(X_t) = b/(1 − α² − σ²_α), where b = (λ/(1 − α))(α(1 − α) − σ²_α) + σ²_z + σ²_α λ²/(1 − α)², if Var(X_0) = b/(1 − α² − σ²_α).
(vi) Cov(X_{t+k}, X_t) = α^k Var(X_t), k ≥ 0. If Var(X_0) = b/(1 − α² − σ²_α), then ρ(k) = α^k, where ρ(k) = Corr(X_{t+k}, X_t).
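The stationary moments in Proposition 2.1 are easy to check numerically. The sketch below (an illustration, not from the paper) simulates a long RCINAR(1) path with Beta-distributed α_t and Poisson innovations, both assumed only for the example, and compares the sample mean and lag-1 autocorrelation with λ/(1 − α) and α.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_rcinar1(n, lam, alpha_sampler, rng):
    """X_t = alpha_t o X_{t-1} + Z_t with alpha_t drawn afresh at every step."""
    x = np.zeros(n + 1, dtype=int)
    for t in range(1, n + 1):
        a_t = alpha_sampler(rng)
        x[t] = rng.binomial(x[t - 1], a_t) + rng.poisson(lam)
    return x[1:]

lam = 1.0
alpha_sampler = lambda rng: rng.beta(2, 3)          # E(alpha_t) = 0.4 (illustrative choice)
x = simulate_rcinar1(100_000, lam, alpha_sampler, rng)

alpha_mean = 2 / (2 + 3)
print(x.mean(), lam / (1 - alpha_mean))              # both close to 1.667, as in (ii)
print(np.corrcoef(x[1:], x[:-1])[0, 1], alpha_mean)  # lag-1 autocorrelation close to alpha, as in (vi)
```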
Proposition 2.2. The process {X_t} is an irreducible, aperiodic and positive recurrent (and hence ergodic) Markov chain. Moreover, the stationary distribution of {X_t} is given by that of Σ_{k=2}^∞ (α_k ∘ · · · ∘ α_2 ∘ Z_k) + Z_1, where the series converges almost surely and also in L².

Corollary 2.1. The mean and variance of the stationary distribution are given by m_1 = λ/(1 − α) and σ²_x = b/(1 − α² − σ²_α), respectively.

Next, we consider the problem of estimating the first two moments of the random autoregressive coefficients α_t and of the innovations Z_t.
3. Estimation methods

For the RCINAR(1) model, our primary interest lies in estimating the mean autoregressive rate α = E(α_t) and the mean innovation λ = E(Z_t) from a sample (X_1, X_2, . . . , X_n). However, the corresponding variances σ²_α = Var(α_t) and σ²_z = Var(Z_t) need to be estimated as well. In this paper, we mainly consider two methods, namely, conditional least squares (CLS) and modified quasi-likelihood (MQL). An advantage of these methods is that they do not require specifying the exact family of distributions for the random autoregressive coefficients or for the innovations. As a result, they are more robust, albeit less efficient, than the ML. We will also consider the ML method, under a specific parametric model for the distributions of α_t and Z_t, in a simulation experiment; in fact, we use the ML method as a benchmark when comparing the CLS and MQL estimators in our simulation.

3.1. Conditional least squares (CLS)

Let

    S(β) = Σ_{t=1}^n (X_t − αX_{t−1} − λ)²,
with β = (α, λ)ᵀ, be the CLS criterion function. The CLS estimators of α and λ are obtained by minimizing S over β ∈ {0 ≤ α < 1, λ > 0} and are given by

    α̂ = [ Σ_{t=1}^n X_t X_{t−1} − n⁻¹ Σ_{t=1}^n X_{t−1} Σ_{t=1}^n X_t ] / [ Σ_{t=1}^n X²_{t−1} − n⁻¹ (Σ_{t=1}^n X_{t−1})² ],

    λ̂ = n⁻¹ ( Σ_{t=1}^n X_t − α̂ Σ_{t=1}^n X_{t−1} ).    (5)
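A direct implementation of (5) is immediate; the sketch below assumes the observations are held in a NumPy array x of length n + 1, with x[0] = X_0 (the function name is ours).

```python
import numpy as np

def cls_estimates(x):
    """Conditional least squares estimates (alpha_hat, lambda_hat) from Eq. (5)."""
    x = np.asarray(x, dtype=float)
    xt, xlag = x[1:], x[:-1]
    n = len(xt)
    num = np.sum(xt * xlag) - np.sum(xlag) * np.sum(xt) / n
    den = np.sum(xlag**2) - np.sum(xlag)**2 / n
    alpha_hat = num / den
    lambda_hat = (np.sum(xt) - alpha_hat * np.sum(xlag)) / n
    return alpha_hat, lambda_hat
```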
The following result establishes the asymptotic distribution of the CLS estimators.
Theorem 3.1. For the CLS estimators (α̂, λ̂) given by (5), we have

    √n ( (α̂, λ̂)ᵀ − (α, λ)ᵀ ) →_d N(0, V⁻¹ W V⁻¹),

where

    W = ( σ²_2   σ_12 )
        ( σ_12   σ²_1 ),

    σ²_1 = E((X_1 − αX_0 − λ)²),   σ²_2 = E(X_0²(X_1 − αX_0 − λ)²),   σ_12 = E(X_0(X_1 − αX_0 − λ)²),

    V⁻¹ = (m_2 − m_1²)⁻¹ (  1     −m_1 )
                         ( −m_1    m_2 ),

    m_j = E(X_1^j), j = 1, 2,   m_2 = m_1² + σ²_x,   σ²_x = Var(X_1) = m_2 − m_1² > 0,

and E denotes expectation with respect to the stationary distribution.

The consistency of α̂ and λ̂ follows readily from Theorem 3.1. Corollary 2.2 of Klimko and Nelson (1978) can also be applied here; the conditions of that corollary are easily verified and the details are omitted.

3.2. Modified quasi-likelihood (MQL)

Let θ = (σ²_α, γ, σ²_z)ᵀ, where σ²_α = Var(α_1), γ = α(1 − α) − σ²_α and σ²_z = Var(Z_1), and let β = (α, λ)ᵀ. Recall from Proposition 2.1 the expression for the one-step conditional variance,

    V_θ(X_t | X_{t−1}) := Var(X_t | X_{t−1}) = σ²_α X²_{t−1} + γ X_{t−1} + σ²_z.
A set of standard QL estimating equations takes the form

    Σ_{t=1}^n V_θ⁻¹(X_t | X_{t−1}) (X_t − αX_{t−1} − λ) = 0,
    Σ_{t=1}^n V_θ⁻¹(X_t | X_{t−1}) X_{t−1} (X_t − αX_{t−1} − λ) = 0.    (6)
Note that the presence of θ in the expression for the conditional variance makes the corresponding estimating equations complicated and intractable in the general case. Consequently, we propose substituting a suitable consistent estimator θ̂ of θ, obtained by other means, and then solving the resulting MQL estimating equations for the primary parameters of interest. This approach leads to the following closed-form estimator of β:

    (α̃, λ̃)ᵀ = [ Σ_t X²_{t−1} V̂_t⁻¹    Σ_t X_{t−1} V̂_t⁻¹ ]⁻¹ [ Σ_t X_{t−1} X_t V̂_t⁻¹ ]
               [ Σ_t X_{t−1} V̂_t⁻¹     Σ_t V̂_t⁻¹         ]   [ Σ_t X_t V̂_t⁻¹         ],    (7)

where, for brevity, V̂_t denotes V_θ̂(X_t | X_{t−1}) and all sums run over t = 1, . . . , n. A consistent estimator of θ that can be used in (7) is proposed next (following Schick, 1996).
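The MQL step (7) itself is just weighted least squares with weights 1/V_θ̂(X_t | X_{t−1}). A minimal sketch, assuming estimates of the three components of θ are already available (e.g. from (8) below); the function name is ours:

```python
import numpy as np

def mql_estimates(x, sigma2_alpha, gamma, sigma2_z):
    """Solve the 2x2 system in Eq. (7) for (alpha_tilde, lambda_tilde),
    using the plug-in conditional variance V(X_t | X_{t-1})."""
    x = np.asarray(x, dtype=float)
    xt, xlag = x[1:], x[:-1]
    w = 1.0 / (sigma2_alpha * xlag**2 + gamma * xlag + sigma2_z)  # 1 / V_theta_hat; assumed positive
    A = np.array([[np.sum(w * xlag**2), np.sum(w * xlag)],
                  [np.sum(w * xlag),    np.sum(w)]])
    b = np.array([np.sum(w * xlag * xt), np.sum(w * xt)])
    alpha_tilde, lambda_tilde = np.linalg.solve(A, b)
    return alpha_tilde, lambda_tilde
```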
Proposition 3.1. Let T(x) be a bounded function of x and let T̄ = n⁻¹ Σ_{t=1}^n T(X_{t−1}). Then the following estimators are consistent:

    σ̂²_α = [ Σ_{t=1}^n (T(X_{t−1}) − T̄)(X_t − α̂X_{t−1} − λ̂)² − (α̂ − α̂²) Σ_{t=1}^n (T(X_{t−1}) − T̄) X_{t−1} ] / [ Σ_{t=1}^n (T(X_{t−1}) − T̄)(X²_{t−1} − X_{t−1}) ],

    σ̂²_z = (1/n) Σ_{t=1}^n (X_t − α̂X_{t−1} − λ̂)² − (σ̂²_α/n) Σ_{t=1}^n (X²_{t−1} − X_{t−1}) − ((α̂ − α̂²)/n) Σ_{t=1}^n X_{t−1},

    γ̂ = α̂ − α̂² − σ̂²_α,    (8)

where α̂ and λ̂ are consistent estimators of α and λ. In practice, we can use the CLS estimators of α and λ.

We should point out that the above estimators of σ²_α and σ²_z cannot be guaranteed to be positive for all samples. Alternatively, one can use the ML estimators of σ²_α and σ²_z, under an assumed parametric model, for such samples; one such parametric model is to assume that Z follows a Poisson distribution. Statistical properties of such a hybrid estimator of σ²_α are studied in the simulation study of the next section.
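A sketch of the moment estimators (8), using the CLS estimates of α and λ and the bounded function T(x) = 1/(1 + (1 + x)²) adopted later in Section 4. No positivity correction is applied here, so negative values can occur, as noted above; the function name is ours.

```python
import numpy as np

def variance_component_estimates(x, alpha_hat, lambda_hat):
    """Moment estimators (8) of sigma2_alpha, sigma2_z and gamma = alpha(1-alpha) - sigma2_alpha."""
    x = np.asarray(x, dtype=float)
    xt, xlag = x[1:], x[:-1]
    T = 1.0 / (1.0 + (1.0 + xlag)**2)        # bounded T(x) used in the simulation study
    Tc = T - T.mean()                        # T(X_{t-1}) - T_bar
    resid2 = (xt - alpha_hat * xlag - lambda_hat)**2
    sigma2_alpha = (np.sum(Tc * resid2) - (alpha_hat - alpha_hat**2) * np.sum(Tc * xlag)) \
                   / np.sum(Tc * (xlag**2 - xlag))
    sigma2_z = resid2.mean() - sigma2_alpha * np.mean(xlag**2 - xlag) \
               - (alpha_hat - alpha_hat**2) * np.mean(xlag)
    gamma_hat = alpha_hat - alpha_hat**2 - sigma2_alpha
    return sigma2_alpha, sigma2_z, gamma_hat
```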
Asymptotic normality of the MQL estimators in (7) is established in the following theorem.

Theorem 3.2. The joint limit distribution of the MQL estimators (α̃, λ̃) given by (7) is

    √n ( (α̃, λ̃)ᵀ − (α, λ)ᵀ ) →_d N(0, T⁻¹(θ) Q(θ) T⁻¹(θ)),

where

    Q(θ) = ( T_2(θ)   T_3(θ) )
           ( T_3(θ)   T_1(θ) ),

    T⁻¹(θ) = (T_1(θ)T_2(θ) − T_3²(θ))⁻¹ (  T_1(θ)   −T_3(θ) )
                                        ( −T_3(θ)    T_2(θ) ),
with T_1(θ) = E[V_θ⁻¹(X_1 | X_0)], T_2(θ) = E[X_0² V_θ⁻¹(X_1 | X_0)] and T_3(θ) = E[X_0 V_θ⁻¹(X_1 | X_0)].

Note that the consistency of α̃ and λ̃ follows readily from the above result.

4. Simulation studies

Consider the model X_t = α_t ∘ X_{t−1} + Z_t, where {α_t} is an i.i.d. sequence with log(α_t/(1 − α_t)) = η + U_t, η = E(log(α_t/(1 − α_t))), and {U_t} is an i.i.d. sequence of normal random variables with mean 0 and variance σ²; {Z_t} is an i.i.d. Poisson sequence with mean λ. In the simulation, we fixed X_0 at 3. Fig. 1 shows a typical sample path from this model, for a sample size of 200, when η = 1 and σ = 10. We use this model to generate data and then apply the CLS, MQL and MLE methods to estimate the parameters, the MLE serving as a benchmark. The ML score equations under this assumed parametric model for the Z_t are given by

    Σ_{t=1}^n [ ∂/∂φ { Σ_{k=0}^{min(X_t, X_{t−1})} (X_{t−1} choose k) e^{−λ} λ^{X_t−k}/(X_t − k)! ∫_0^1 α_1^k (1 − α_1)^{X_{t−1}−k} dP_α(α_1) } ]
             / [ Σ_{k=0}^{min(X_t, X_{t−1})} (X_{t−1} choose k) e^{−λ} λ^{X_t−k}/(X_t − k)! ∫_0^1 α_1^k (1 − α_1)^{X_{t−1}−k} dP_α(α_1) ] = 0,
Table 1
Estimates of the parameters α and λ for σ = 1. Entries are (bias SE) over 500 replications; columns, in order: CLS, MQL, MLE, F-MQL, F-MLE.

η = 1, λ = 1 (α = 0.6967347, σ²_α = 0.03335209)
α:  n = 50   (−0.0889 0.1217)  (−0.0594 0.1219)  (−0.0211 0.0828)  (−0.0669 0.1204)  (−0.0650 0.0922)
    n = 100  (−0.0448 0.0878)  (−0.0274 0.0819)  (−0.0137 0.0534)  (−0.0318 0.0822)  (−0.0549 0.0589)
    n = 200  (−0.0238 0.0598)  (−0.0151 0.0550)  (−0.0065 0.0370)  (−0.0175 0.0557)  (−0.0478 0.0396)
λ:  n = 50   (0.2656 0.4024)   (0.1775 0.4009)   (0.0660 0.2687)   (0.1985 0.3975)   (0.1880 0.3158)
    n = 100  (0.1329 0.2784)   (0.0826 0.2632)   (0.0571 0.1786)   (0.0945 0.2635)   (0.1661 0.2116)
    n = 200  (0.0645 0.1807)   (0.0398 0.1676)   (0.0178 0.1125)   (0.0463 0.1697)   (0.1409 0.1273)

η = 1, λ = 2 (α = 0.6967347, σ²_α = 0.03335209)
α:  n = 50   (−0.0796 0.1200)  (−0.0473 0.1173)  (−0.0172 0.0802)  (−0.0575 0.1174)  (−0.0943 0.0921)
    n = 100  (−0.0439 0.0762)  (−0.0269 0.0732)  (−0.0110 0.0505)  (−0.0330 0.0737)  (−0.0799 0.0608)
    n = 200  (−0.0251 0.0620)  (−0.0143 0.0571)  (−0.0076 0.0417)  (−0.0184 0.0592)  (−0.0707 0.0465)
λ:  n = 50   (0.3815 0.6990)   (0.2585 0.6856)   (0.0783 0.4678)   (0.3164 0.6875)   (0.5300 0.5595)
    n = 100  (0.2417 0.4770)   (0.1733 0.4622)   (0.0703 0.3137)   (0.2079 0.4634)   (0.4713 0.3800)
    n = 200  (0.1399 0.3752)   (0.0913 0.3422)   (0.0496 0.2424)   (0.1153 0.3560)   (0.4175 0.2750)

η = −1, λ = 1 (α = 0.3032653, σ²_α = 0.03335209)
α:  n = 50   (−0.0432 0.1471)  (−0.0302 0.1490)  (0.0014 0.1340)   (−0.0367 0.1480)  (−0.0264 0.1377)
    n = 100  (−0.0200 0.1058)  (−0.0112 0.1038)  (0.0018 0.0939)   (−0.0157 0.1039)  (−0.0191 0.0946)
    n = 200  (−0.0119 0.0777)  (−0.0067 0.0753)  (−0.0015 0.0702)  (−0.0093 0.0754)  (−0.0188 0.0675)
λ:  n = 50   (0.0979 0.2635)   (0.0494 0.2620)   (0.0082 0.2446)   (0.0570 0.2636)   (0.0444 0.2510)
    n = 100  (0.0490 0.1677)   (0.0229 0.1664)   (0.0078 0.1523)   (0.0275 0.1669)   (0.0340 0.1562)
    n = 200  (0.0190 0.1169)   (0.0050 0.1145)   (0.0018 0.1072)   (0.0076 0.1145)   (0.0221 0.1058)

η = −1, λ = 2 (α = 0.3032653, σ²_α = 0.03335209)
α:  n = 50   (−0.0539 0.1337)  (−0.0368 0.1387)  (−0.0087 0.1238)  (−0.0462 0.1354)  (−0.0525 0.1206)
    n = 100  (−0.0197 0.1030)  (−0.0142 0.0973)  (−0.0055 0.0890)  (−0.0170 0.1000)  (−0.0341 0.0859)
    n = 200  (−0.0117 0.0760)  (−0.0063 0.0737)  (−0.0027 0.0677)  (−0.0088 0.0741)  (−0.0211 0.0638)
λ:  n = 50   (0.1330 0.4103)   (0.0862 0.4208)   (0.0188 0.3763)   (0.1082 0.4151)   (0.1354 0.3976)
    n = 100  (0.0550 0.3126)   (0.0379 0.2955)   (0.0223 0.2759)   (0.0456 0.3042)   (0.0991 0.2766)
    n = 200  (0.0331 0.2244)   (0.0183 0.2204)   (0.0169 0.2066)   (0.0243 0.2209)   (0.0552 0.1890)
where φ = (η, σ, λ)ᵀ and

    dP_α(α_1) = [ 1/(√(2π) σ α_1 (1 − α_1)) ] exp( −(log(α_1/(1 − α_1)) − η)² / (2σ²) ) dα_1.

The (univariate) integrals in the above expressions were computed using a standard integration routine in R. For comparison, we also use the corresponding methods for the fixed coefficient model, MQL and MLE (denoted by F-MQL and F-MLE in Tables 1 and 2), to estimate the parameters α and λ from the same data generated from the above model. For the fixed coefficient model, Var(X_t | X_{t−1}) = X_{t−1} α(1 − α) + σ²_z, and we use this expression in the MQL estimating equations. The conditional density f(X_t | X_{t−1}) for the fixed coefficient model is

    f(X_t | X_{t−1}) = Σ_{k=0}^{min(X_t, X_{t−1})} (X_{t−1} choose k) e^{−λ} λ^{X_t−k}/(X_t − k)! α^k (1 − α)^{X_{t−1}−k}.

The F-MLEs of α and λ are obtained by maximizing Σ_{t=1}^n log f(X_t | X_{t−1}).
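For the simulation model of this section (logit-normal α_t and Poisson Z_t), data can be generated as in the sketch below. The symbol η for the logit mean follows the reconstruction above, X_0 = 3 as stated, and the function name is ours.

```python
import numpy as np

rng = np.random.default_rng(2023)

def simulate_logitnormal_rcinar1(n, eta, sigma, lam, x0=3, rng=rng):
    """Generate X_1, ..., X_n from the Section 4 model:
    logit(alpha_t) = eta + U_t with U_t ~ N(0, sigma^2), and Z_t ~ Poisson(lam)."""
    x = np.empty(n + 1, dtype=int)
    x[0] = x0
    for t in range(1, n + 1):
        u = rng.normal(0.0, sigma)
        alpha_t = 1.0 / (1.0 + np.exp(-(eta + u)))   # inverse logit
        x[t] = rng.binomial(x[t - 1], alpha_t) + rng.poisson(lam)
    return x

x = simulate_logitnormal_rcinar1(n=200, eta=1.0, sigma=1.0, lam=1.0)
```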
Table 2
Estimates of the parameters α and λ for σ = 10. Entries are (bias SE) over 500 replications; columns, in order: CLS, MQL, MLE, F-MQL, F-MLE.

η = −1, λ = 1 (α = 0.4607841, σ²_α = 0.2093921)
α:  n = 50   (−0.0891 0.1770)  (−0.0302 0.1653)  (−0.0144 0.1253)  (−0.0649 0.1781)  (−0.1473 0.1324)
    n = 100  (−0.0510 0.1284)  (−0.0170 0.1129)  (−0.0021 0.0801)  (−0.0352 0.1233)  (−0.1317 0.0896)
    n = 200  (−0.0322 0.0979)  (−0.0078 0.0809)  (−0.0013 0.0546)  (−0.0213 0.0913)  (−0.1019 0.0584)
λ:  n = 50   (0.1681 0.2826)   (0.0506 0.2622)   (0.0173 0.1933)   (0.0993 0.2823)   (0.2754 0.2707)
    n = 100  (0.0985 0.2150)   (0.0322 0.1876)   (0.0050 0.1482)   (0.0580 0.2050)   (0.2488 0.1920)
    n = 200  (0.0577 0.1500)   (0.0131 0.1230)   (0.0035 0.0977)   (0.0318 0.1373)   (0.2065 0.1401)

η = −1, λ = 2 (α = 0.4607841, σ²_α = 0.2093921)
α:  n = 50   (−0.1220 0.1632)  (−0.0518 0.1575)  (−0.0224 0.1153)  (−0.0976 0.1713)  (−0.1927 0.1082)
    n = 100  (−0.0574 0.1361)  (−0.0194 0.1087)  (0.0087 0.0764)   (−0.0443 0.1318)  (−0.1595 0.0891)
    n = 200  (−0.0353 0.0991)  (−0.0108 0.0812)  (0.0037 0.0470)   (−0.0265 0.0947)  (−0.1569 0.0632)
λ:  n = 50   (0.3123 0.4652)   (0.1036 0.4373)   (0.0155 0.3000)   (0.2375 0.4782)   (0.5672 0.3236)
    n = 100  (0.1656 0.4148)   (0.0502 0.3229)   (0.0051 0.1835)   (0.1246 0.3967)   (0.5015 0.2267)
    n = 200  (0.1111 0.2969)   (0.0339 0.2306)   (−0.0098 0.1371)  (0.0825 0.2776)   (0.4606 0.1783)

η = 1, λ = 1 (α = 0.5391493, σ²_α = 0.2097591)
α:  n = 50   (−0.1063 0.1586)  (−0.0404 0.1476)  (−0.0212 0.1043)  (−0.0786 0.1581)  (−0.1798 0.1190)
    n = 100  (−0.0579 0.1262)  (−0.0162 0.1087)  (−0.0101 0.0723)  (−0.0385 0.1204)  (−0.1579 0.0890)
    n = 200  (−0.0338 0.0927)  (−0.0083 0.0715)  (−0.0085 0.0466)  (−0.0218 0.0849)  (−0.1483 0.0614)
λ:  n = 50   (0.2136 0.3031)   (0.0726 0.2790)   (0.0299 0.1972)   (0.1350 0.2966)   (0.3727 0.2986)
    n = 100  (0.1141 0.2244)   (0.0288 0.1891)   (0.0114 0.1313)   (0.0646 0.2103)   (0.3357 0.2025)
    n = 200  (0.0647 0.1660)   (0.0136 0.1281)   (0.0123 0.0909)   (0.0349 0.1492)   (0.3159 0.1398)

η = 1, λ = 2 (α = 0.5391493, σ²_α = 0.2097591)
α:  n = 50   (−0.1068 0.1619)  (−0.0396 0.1604)  (−0.0135 0.1096)  (−0.0780 0.1755)  (−0.1765 0.1143)
    n = 100  (−0.0619 0.1277)  (−0.0126 0.1097)  (−0.0116 0.0595)  (−0.0456 0.1247)  (−0.1922 0.0957)
    n = 200  (−0.0296 0.1025)  (−0.0034 0.0712)  (−0.0095 0.0403)  (−0.0199 0.0976)  (−0.1814 0.0778)
λ:  n = 50   (0.2618 0.5106)   (0.0901 0.4856)   (0.0358 0.3106)   (0.2034 0.5216)   (0.7586 0.3539)
    n = 100  (0.2122 0.4116)   (0.0408 0.3403)   (0.0324 0.1912)   (0.1566 0.3958)   (0.7173 0.2471)
    n = 200  (0.0970 0.3476)   (0.0058 0.2311)   (0.0186 0.1333)   (0.0624 0.3249)   (0.6955 0.1872)
We chose various combinations of the parameters η and λ for σ = 1 and 10. These, in turn, resulted in various values of α and σ²_α. From the above model,

    α = E(α_1) = ∫_0^1 [ 1/(√(2π) σ α_1 (1 − α_1)) ] exp( −(log(α_1/(1 − α_1)) − η)² / (2σ²) ) dα_1,

and σ²_α = E(α_1²) − α², where the first term can be computed similarly. For instance, if η = 1 and σ = 1, then one gets α ≈ 0.70 and σ²_α ≈ 0.03. We computed the empirical bias and the standard error (SE) based on 500 replications for each parameter combination. These values are reported within parentheses in Tables 1 and 2 in the format (BIAS SE); for example, (0.0062 0.0054) means that the bias is 0.0062 and the SE is 0.0054.
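The values of α and σ²_α quoted in the tables follow from one-dimensional numerical integration. The sketch below integrates over a standard normal variable rather than over dP_α directly, which is equivalent to the displayed integral after a change of variables; the R routine actually used by the authors is not shown, and the function name is ours.

```python
import numpy as np
from scipy.integrate import quad

def logitnormal_moments(eta, sigma):
    """alpha = E(alpha_1) and sigma2_alpha = Var(alpha_1) when logit(alpha_1) ~ N(eta, sigma^2)."""
    def moment(power):
        integrand = lambda z: (1.0 / (1.0 + np.exp(-(eta + sigma * z))))**power \
                              * np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
        value, _ = quad(integrand, -np.inf, np.inf)
        return value
    m1, m2 = moment(1), moment(2)
    return m1, m2 - m1**2

print(logitnormal_moments(1.0, 1.0))    # approximately (0.6967, 0.0334), cf. Table 1
print(logitnormal_moments(-1.0, 10.0))  # approximately (0.4608, 0.2094), cf. Table 2
```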
Table 3
Estimates of the parameter σ²_α for σ = 1. Entries in the Hybrid and MLE columns are (bias SE) over 500 replications; Percentage is the proportion of samples with a positive moment estimate of σ²_α.

η = 1, λ = 1 (α = 0.6967347, σ²_α = 0.03335209)
  n = 50   Hybrid (−0.0153 0.0524)   Percentage 0.686   MLE (−0.0067 0.0262)
  n = 100  Hybrid (−0.0058 0.0374)   Percentage 0.796   MLE (−0.0030 0.0206)
  n = 200  Hybrid (−0.0057 0.0270)   Percentage 0.868   MLE (−0.0026 0.0150)

η = 1, λ = 2 (α = 0.6967347, σ²_α = 0.03335209)
  n = 50   Hybrid (−0.0011 0.0293)   Percentage 0.890   MLE (−0.0034 0.0194)
  n = 100  Hybrid (−0.0005 0.0239)   Percentage 0.926   MLE (−0.0019 0.0147)
  n = 200  Hybrid (−0.0023 0.0163)   Percentage 0.958   MLE (−0.0020 0.0103)

η = −1, λ = 1 (α = 0.3032653, σ²_α = 0.03335209)
  n = 50   Hybrid (−0.0066 0.1142)   Percentage 0.618   MLE (−0.0036 0.0351)
  n = 100  Hybrid (−0.0070 0.0880)   Percentage 0.652   MLE (−0.0026 0.0301)
  n = 200  Hybrid (−0.0029 0.0678)   Percentage 0.692   MLE (−0.0022 0.0240)

η = −1, λ = 2 (α = 0.3032653, σ²_α = 0.03335209)
  n = 50   Hybrid (0.0030 0.0850)    Percentage 0.724   MLE (0.0010 0.0398)
  n = 100  Hybrid (0.0011 0.0623)    Percentage 0.765   MLE (−0.0010 0.0289)
  n = 200  Hybrid (−0.001 0.0456)    Percentage 0.788   MLE (−0.0016 0.0239)

Table 4
Estimates of the parameter σ²_α for σ = 10. Entries in the Hybrid and MLE columns are (bias SE) over 500 replications; Percentage is the proportion of samples with a positive moment estimate of σ²_α.

η = −1, λ = 1 (α = 0.4607841, σ²_α = 0.2093921)
  n = 50   Hybrid (−0.0338 0.0911)   Percentage 0.968   MLE (−0.0836 0.0433)
  n = 100  Hybrid (−0.0249 0.0704)   Percentage 0.99    MLE (−0.0711 0.0272)
  n = 200  Hybrid (−0.0146 0.0452)   Percentage 1       MLE (−0.0416 0.0153)

η = −1, λ = 2 (α = 0.4607841, σ²_α = 0.2093921)
  n = 50   Hybrid (−0.0226 0.0650)   Percentage 0.984   MLE (−0.0542 0.0356)
  n = 100  Hybrid (−0.0180 0.0490)   Percentage 1       MLE (−0.0476 0.0225)
  n = 200  Hybrid (−0.0093 0.0331)   Percentage 1       MLE (−0.0494 0.0075)

η = 1, λ = 1 (α = 0.5391493, σ²_α = 0.2097591)
  n = 50   Hybrid (−0.0230 0.0804)   Percentage 0.988   MLE (−0.0670 0.0310)
  n = 100  Hybrid (−0.0165 0.0572)   Percentage 0.998   MLE (−0.0653 0.0143)
  n = 200  Hybrid (−0.0097 0.0384)   Percentage 1       MLE (−0.0727 0.0057)

η = 1, λ = 2 (α = 0.5391493, σ²_α = 0.2097591)
  n = 50   Hybrid (−0.0219 0.0627)   Percentage 0.99    MLE (−0.0526 0.0339)
  n = 100  Hybrid (−0.0097 0.0417)   Percentage 1       MLE (−0.0680 0.0081)
  n = 200  Hybrid (−0.0090 0.0275)   Percentage 1       MLE (−0.0654 0.0039)
Recall that both the CLS and the MQL are primarily used for computing estimates of α and λ. However, a preliminary estimate of σ²_α (and similarly of σ²_z) is needed before the MQL method can be used to compute the estimates of α and λ. Recall that such an estimator was proposed in (8), where we use the CLS estimates of α and λ. Furthermore, we have used the function T(x) = 1/(1 + (1 + x)²), which is positive and bounded, in our calculations. Note that for finite sample sizes, there is a positive probability that the moment-based estimators of σ²_α and/or σ²_z are negative. As mentioned earlier, for such samples we use the MLEs obtained from a parametric (Poisson) model for Z. The statistical properties of this hybrid estimator of σ²_α are listed in Tables 3 and 4. We also report the percentage of positive moment estimates of σ²_α in the Percentage column of those tables.
[Fig. 1. A typical sample path from a RCINAR(1) model: X(t) plotted against t = 0, . . . , 200.]
From the above results, we can see that CLS and MQL are both good estimation methods, producing estimators whose bias and SEs are overall comparable to those of the benchmark ML method. This is especially true if, in addition, one factors in the fact that both methods are fast and easy to implement. It is perhaps not surprising that, under the correct parametric model, the MLE has the best performance in terms of both bias and MSE. On the other hand, as is to be expected, the simplest method, namely the CLS, had the largest MSE. However, overall, all three methods produced good estimators of the parameters, especially for larger sample sizes. Also, as expected, when σ²_α is not too small (e.g., σ²_α ≈ 0.2 for σ = 10), the MQL estimator obtained from a fixed coefficient model is substantially worse than that under the random coefficient model. Interestingly, the F-MQL, which uses the wrong variance formula, is still slightly better than the CLS. The F-MLE is the worst performer in terms of both bias and overall MSE. We should mention that for large σ, the percentage of negative estimates of σ²_α is very small.

It is to be noted that the CLS estimate remains the same under both the random coefficient model (true model) and the fixed coefficient model (wrong model), since the conditional expectation function is the same under the two models. Since the conditional variance function changes under the two models, the MQL estimate does change. However, the performance of the MQL estimate remains reasonably good under the wrong variance function, and its first-order efficiency under the wrong model is not affected significantly when σ²_α is not too large. Therefore, the MQL estimate is relatively robust against changes in the variance function as long as our parameters of interest (α and λ) are the mean parameters from the conditional expectation function, which is the case here. It may also be noted that the MQL is closer to the F-MQL than the MLE is to the F-MLE; in this sense, the MQL is more robust than the MLE with regard to model misspecification.
5. Summary and conclusions

In this paper, we have introduced a new model for count data by extending the fixed coefficient INAR(1) model: the autoregressive parameter is allowed to vary randomly over time. The stationarity and ergodicity of the process are established. Conditional least squares (CLS) and modified quasi-likelihood (MQL) estimators of the model parameters are derived and their asymptotic distributions are obtained. In the simulation study, we compare the CLS, MQL and ML estimators. The simulation results show that all three methods give good estimates for large samples. The CLS method can be used for its simplicity; however, the MQL method gives more efficient estimators than the CLS. Care must be exercised in computing the MQL estimates (a positive estimate of the variance parameter σ²_α must be substituted), as suggested in Section 4. The ML method is the most efficient (as can be expected), even though it is computationally the most intensive of the three methods. Another drawback of the ML method is that it requires specific knowledge of the distributions of α_t and Z_t. For instance, if a fixed coefficient INAR(1) model is used when, in fact, the data come from a random coefficient model, the ML method performs poorly, whereas the MQL method is more robust and gives good estimates even when the wrong model is used. On balance, therefore, we recommend the MQL method for the new model proposed in this paper.
Acknowledgements

We thank the two referees for their detailed and constructive suggestions.
Appendix

Proof of Proposition 2.1. Parts (i)–(iv) are straightforward to verify. We give below the proofs of (v) and (vi) only.

(v) We have

    Var(X_t) = Var(E(X_t | X_{t−1})) + E(Var(X_t | X_{t−1}))
             = α² Var(X_{t−1}) + E(σ²_α X²_{t−1} + (α(1 − α) − σ²_α)X_{t−1} + σ²_z)
             = α² Var(X_{t−1}) + σ²_α (Var(X_{t−1}) + E²(X_{t−1})) + (α(1 − α) − σ²_α)E(X_{t−1}) + σ²_z.

The conclusion follows by induction from the above equation with the initial values Var(X_0) = b/(1 − α² − σ²_α) and E(X_0) = λ/(1 − α).

(vi) By repeated application of (3), with t replaced by t + k, we have

    X_{t+k} = α_{t+k} ∘ · · · ∘ α_{t+1} ∘ X_t + Σ_{i=0}^{k−1} α_{t+k} ∘ · · · ∘ α_{t+k−i+1} ∘ Z_{t+k−i},

where, for i = 0, the term α_{t+k} ∘ · · · ∘ α_{t+k−i+1} ∘ Z_{t+k−i} is to be read as Z_{t+k}. Therefore,

    Cov(X_{t+k}, X_t) = Cov( α_{t+k} ∘ · · · ∘ α_{t+1} ∘ X_t + Σ_{i=0}^{k−1} α_{t+k} ∘ · · · ∘ α_{t+k−i+1} ∘ Z_{t+k−i}, X_t )
                      = Cov( α_{t+k} ∘ · · · ∘ α_{t+1} ∘ X_t, X_t )   (since X_t is independent of the second term)
                      = E[X_t (α_{t+k} ∘ · · · ∘ α_{t+1} ∘ X_t)] − E(α_{t+k} ∘ · · · ∘ α_{t+1} ∘ X_t) E(X_t).

By repeated conditioning, the first term equals α^k E(X_t²) and the second term equals α^k E²(X_t), leading to α^k Var(X_t) for the covariance term.

Proof of Proposition 2.2. From (4) and the assumption f_z > 0, we get P_{ij} > 0, and hence the process {X_t} is an irreducible and aperiodic Markov chain. Thus, it is either positive recurrent or
    lim_{n→∞} P^n_{ij} = 0    (9)

for all i, j; see, e.g., Theorem 4.3.3 in Ross (1996). Repeated application of (3), with t replaced by n, gives

    X_n = α_n ∘ · · · ∘ α_1 ∘ X_0 + Σ_{k=1}^{n−1} (α_n ∘ · · · ∘ α_{n−k+1} ∘ Z_{n−k}) + Z_n
        =_d α_n ∘ · · · ∘ α_1 ∘ X_0 + Σ_{k=1}^{n−1} (α_{k+1} ∘ · · · ∘ α_2 ∘ Z_{k+1}) + Z_1 =: Y_n, say,    (10)

where, for n = 1, Σ_{k=1}^{n−1} (α_n ∘ · · · ∘ α_{n−k+1} ∘ Z_{n−k}) = 0 = Σ_{k=1}^{n−1} (α_{k+1} ∘ · · · ∘ α_2 ∘ Z_{k+1}); the equality in distribution holds both unconditionally and conditionally given X_0 = j, say. The first term in (10) is o_p(1), again both conditionally given X_0 = j and unconditionally, since, as before, E(α_n ∘ · · · ∘ α_1 ∘ X_0) = α^n E(X_0) → 0.
Next, we show that Y_n converges almost surely. For all ε > 0 and m, n ∈ N_0,

    P( max_{1≤k≤m} |Y_n − Y_{n+k}| > ε )
      = P( max_{1≤k≤m} | α_n ∘ · · · ∘ α_1 ∘ X_0 − α_{n+k} ∘ · · · ∘ α_1 ∘ X_0 + Σ_{i=1}^{k} (α_{n+i} ∘ · · · ∘ α_2 ∘ Z_{n+i}) | > ε )
      ≤ P( |α_n ∘ · · · ∘ α_1 ∘ X_0 − α_{n+m} ∘ · · · ∘ α_1 ∘ X_0| + Σ_{k=1}^{m} (α_{n+k} ∘ · · · ∘ α_2 ∘ Z_{n+k}) > ε ),

since α ∘ X ≤ X, and

    E( α_n ∘ · · · ∘ α_1 ∘ X_0 − α_{n+m} ∘ · · · ∘ α_1 ∘ X_0 + Σ_{k=1}^{m} (α_{n+k} ∘ · · · ∘ α_2 ∘ Z_{n+k}) )
      ≤ α^n (1 − α^m) E(X_0) + λ α^n (1 − α^m)/(1 − α) → 0   as n → ∞.

Therefore, by the above arguments, V_n converges in distribution (both conditionally and unconditionally) to the same limit, where V_n = Σ_{k=1}^{n−1} (α_{k+1} ∘ · · · ∘ α_2 ∘ Z_{k+1}) + Z_1. Hence,

    lim_{n→∞} P(X_n = i | X_0 = j) = lim_{n→∞} P(Y_n = i | X_0 = j)
                                   = lim_{n→∞} P(V_n = i | X_0 = j)
                                   = lim_{n→∞} P(V_n = i)   (because V_n is independent of X_0)
                                   = lim_{n→∞} P(Y_n = i) = lim_{n→∞} P(X_n = i).    (11)
Suppose now that (9) holds. We then have

    lim_{n→∞} P^n_{ij} = 0 = P(Y = i),

by (11), for all i, j ≥ 0, where Y is the almost sure limit of Y_n. However, this contradicts the fact that P(Y < ∞) = 1. This proves the first part of the proposition. Almost sure convergence of V_n follows from the same arguments used above.

Now, we prove that {V_n, n ≥ 0} converges in L². In the proof, we need the following results:

    E[(α ∘ X)(β ∘ X)] = E( E[(α ∘ X)(β ∘ X) | X] ) = αβ E(X²),    (12)
    E[(α ∘ X)²] = E( E[(α ∘ X)² | X] ) = α(1 − α) E(X) + α² E(X²),    (13)

for all α, β ∈ [0, 1]. By (12) and repeated conditional expectation, for i > j,

    E[ (α_{n+i} ∘ · · · ∘ α_2 ∘ Z_{n+i})(α_{n+j} ∘ · · · ∘ α_2 ∘ Z_{n+j}) ]
      = E( E[ (α_{n+i} ∘ · · · ∘ α_2 ∘ Z_{n+i})(α_{n+j} ∘ · · · ∘ α_2 ∘ Z_{n+j}) | α_{n+j} ∘ · · · ∘ α_2 ∘ Z_{n+j}, α_{n+j} ∘ · · · ∘ α_2 ∘ Z_{n+i}, α_{n+i}, . . . , α_{n+j+1} ] )
      = E[ ( Π_{j≤k<i} α_{n+k+1} ) (α_{n+j} ∘ · · · ∘ α_2 ∘ Z_{n+j})(α_{n+j} ∘ · · · ∘ α_2 ∘ Z_{n+i}) ]
      = E[ ( Π_{l=2}^{n+j} α_l² ) ( Π_{j≤k<i} α_{n+k+1} ) λ² ] = c^{n+j−1} α^{i−j} λ²,

where 0 < c = E(α_1²) < 1.
Finally, using the above to handle the cross-product terms and (13) to calculate the second moments, we get

    E(|V_n − V_{n+m}|²) = E[ ( Σ_{k=1}^{m} α_{n+k} ∘ · · · ∘ α_2 ∘ Z_{n+k} )² ]
      ≤ Σ_{k=1}^{m} [ (n + k − 1)(α − c) α^{n+k−2} λ + c^{n+k−2} E(Z_1²) ] + 2 Σ_{m≥i>j≥1} c^{n+j−1} α^{i−j} λ²,    (14)

while

    Σ_{m≥i>j≥1} c^{n+j−1} α^{i−j} ≤ Σ_{m≥i>j≥1} α^{n+i−1} ≤ Σ_{i=1}^{m} i α^{n+i−1} → 0

as n → ∞. The convergence (to zero) of the first term in (14) is similar. Therefore, for all m > 0, E(|V_n − V_{n+m}|²) → 0 as n → ∞. This ends the proof.

Proof of Corollary 2.1. The proof follows from Proposition 2.2 by using (12) and (13).
Proof of Theorem 3.1. Here, we give a direct proof of this result. Alternatively, one can verify the regularity conditions of Klimko and Nelson (1978). Recall from the beginning of Section 3.1 that

    S(β) = Σ_{t=1}^n (X_t − αX_{t−1} − λ)²,

with β = (α, λ)ᵀ, and that ∂S(β)/∂β = 0 leads to the CLS estimator of β. Now, let F_n = σ{X_0, X_1, . . . , X_n} and

    M_n = −2⁻¹ ∂S(β)/∂λ = Σ_{t=1}^n (X_t − αX_{t−1} − λ),   M_0 = 0.

Then

    E(M_n | F_{n−1}) = E(M_{n−1} + (X_n − αX_{n−1} − λ) | F_{n−1}) = M_{n−1} + E(X_n − αX_{n−1} − λ | F_{n−1}) = M_{n−1},

i.e., {M_n, F_n, n ≥ 0} is a martingale. We have shown that V_n converges in L² and almost surely, which implies that {V_n², n ≥ 0} is uniformly integrable. Therefore (X_n − αX_{n−1} − λ)², n ≥ 1, is uniformly integrable. By Theorem 1.1 of Billingsley (1961),

    n⁻¹ Σ_{t=1}^n (X_t − αX_{t−1} − λ)² →_{a.s.} E((X_1 − αX_0 − λ)²) = σ²_1.

Thus, by Corollary 3.2 of Hall and Heyde (1980), the martingale CLT applies and we get (1/√n) M_n →_d N(0, σ²_1). Similarly, we can prove that M'_n = −2⁻¹ ∂S(β)/∂α = Σ_{t=1}^n X_{t−1}(X_t − αX_{t−1} − λ) is a martingale, that

    n⁻¹ Σ_{t=1}^n X²_{t−1}(X_t − αX_{t−1} − λ)² →_{a.s.} E(X_0²(X_1 − αX_0 − λ)²) = σ²_2,

and that (1/√n) M'_n →_d N(0, σ²_2).
In the same way, for any c = (c_1, c_2)ᵀ ∈ R²\{(0, 0)}, we have

    (1/√n) cᵀ (M'_n, M_n)ᵀ = (1/√n) Σ_{t=1}^n (c_1 X_{t−1} + c_2)(X_t − αX_{t−1} − λ)
                           →_d N( 0, E((c_1 X_0 + c_2)²(X_1 − αX_0 − λ)²) ).

Thus, by the Cramér–Wold device,

    (1/√n) (M'_n, M_n)ᵀ →_d N( (0, 0)ᵀ, W ),

where W is as given in the statement of the theorem, with σ_12 = E(X_0(X_1 − αX_0 − λ)²). By setting M'_n = 0 and M_n = 0, we recover the estimators
    α̂ = [ n Σ_{t=1}^n X_t X_{t−1} − Σ_{t=1}^n X_{t−1} Σ_{t=1}^n X_t ] / [ n Σ_{t=1}^n X²_{t−1} − (Σ_{t=1}^n X_{t−1})² ],
    λ̂ = n⁻¹ ( Σ_{t=1}^n X_t − α̂ Σ_{t=1}^n X_{t−1} ).

After some algebra, we have

    (α̂ − α, λ̂ − λ)ᵀ = n⁻¹ [ n⁻¹ Σ_{t=1}^n X²_{t−1} − (n⁻¹ Σ_{t=1}^n X_{t−1})² ]⁻¹
                       × (  1                       −n⁻¹ Σ_{t=1}^n X_{t−1} ) ( M'_n )
                         ( −n⁻¹ Σ_{t=1}^n X_{t−1}    n⁻¹ Σ_{t=1}^n X²_{t−1} ) ( M_n  ).

By the ergodicity of the process (see Proposition 2.2) and Theorem 1.1 of Billingsley (1961),

    [ n⁻¹ Σ_{t=1}^n X²_{t−1} − (n⁻¹ Σ_{t=1}^n X_{t−1})² ]⁻¹ (  1                       −n⁻¹ Σ_{t=1}^n X_{t−1} )
                                                            ( −n⁻¹ Σ_{t=1}^n X_{t−1}    n⁻¹ Σ_{t=1}^n X²_{t−1} )
      →_{a.s.} (m_2 − m_1²)⁻¹ (  1     −m_1 )  =  V⁻¹,
                              ( −m_1    m_2 )

where m_1 = lim_{n→∞} n⁻¹ Σ_{t=1}^n X_{t−1} and m_2 = lim_{n→∞} n⁻¹ Σ_{t=1}^n X²_{t−1}. Therefore,

    √n ( (α̂, λ̂)ᵀ − (α, λ)ᵀ ) →_d N(0, V⁻¹ W V⁻¹),

where W = ( σ²_2  σ_12 ; σ_12  σ²_1 ), as in the statement of the theorem.
Proof of Theorem 3.2. Let β = (α, λ)ᵀ and θ = (σ²_α, γ, σ²_z)ᵀ, where σ²_α = Var(α_1), γ = E(α_1(1 − α_1)) and σ²_z = Var(Z_1), so that

    V_θ(X_t | X_{t−1}) = Var(X_t | X_{t−1}) = σ²_α X²_{t−1} + γ X_{t−1} + σ²_z.

First, suppose θ is known, and consider the estimating functions

    S_n^{(1)}(β, θ) = Σ_{t=1}^n V_θ⁻¹(X_t | X_{t−1}) (X_t − αX_{t−1} − λ),
    S_n^{(2)}(β, θ) = Σ_{t=1}^n V_θ⁻¹(X_t | X_{t−1}) X_{t−1} (X_t − αX_{t−1} − λ).

We have

    E[ V_θ⁻¹(X_t | X_{t−1})(X_t − αX_{t−1} − λ) | F_{t−1} ] = V_θ⁻¹(X_t | X_{t−1}) E[(X_t − αX_{t−1} − λ) | F_{t−1}] = 0

and hence

    E[S_t^{(1)}(β, θ) | F_{t−1}] = S_{t−1}^{(1)}(β, θ).

Thus, {S_t^{(1)}(β, θ), F_t, t ≥ 0} is a martingale. By Theorem 1.1 of Billingsley (1961),

    n⁻¹ Σ_{t=1}^n V_θ⁻²(X_t | X_{t−1})(X_t − αX_{t−1} − λ)²
      →_{a.s.} E[V_θ⁻²(X_1 | X_0)(X_1 − αX_0 − λ)²] = E( E[V_θ⁻²(X_1 | X_0)(X_1 − αX_0 − λ)² | X_0] ) = E[V_θ⁻¹(X_1 | X_0)] = T_1(θ).

By the same argument as for the CLS estimators, we have

    (1/√n) S_n^{(1)}(β, θ) →_d N(0, T_1(θ)).

Similarly,

    n⁻¹ Σ_{t=1}^n V_θ⁻²(X_t | X_{t−1}) X²_{t−1} (X_t − αX_{t−1} − λ)²
      →_{a.s.} E[V_θ⁻²(X_1 | X_0) X_0² (X_1 − αX_0 − λ)²] = E( E[V_θ⁻²(X_1 | X_0) X_0² (X_1 − αX_0 − λ)² | X_0] ) = E[X_0² V_θ⁻¹(X_1 | X_0)] = T_2(θ)

and

    (1/√n) S_n^{(2)}(β, θ) →_d N(0, T_2(θ)).

Again by the Cramér–Wold device, for any c = (c_1, c_2)ᵀ with c_1, c_2 ∈ R not both 0, we have

    (1/√n) cᵀ (S_n^{(2)}(β, θ), S_n^{(1)}(β, θ))ᵀ →_d N( 0, E[V_θ⁻²(X_1 | X_0)(c_1 X_0 + c_2)²(X_1 − αX_0 − λ)²] ),

implying

    (1/√n) (S_n^{(2)}(β, θ), S_n^{(1)}(β, θ))ᵀ →_d N( (0, 0)ᵀ, Q(θ) ),    (15)

where T_3(θ) = E[V_θ⁻²(X_1 | X_0) X_0 (X_1 − αX_0 − λ)²] = E[V_θ⁻¹(X_1 | X_0) X_0].
Now, we replace V_θ⁻²(X_t | X_{t−1}) by V_θ̂⁻²(X_t | X_{t−1}), where θ̂ is a consistent estimator of θ. We then want

    (1/√n) (S_n^{(2)}(β, θ̂), S_n^{(1)}(β, θ̂))ᵀ →_d N( (0, 0)ᵀ, Q(θ) ).    (16)

To obtain this, we need to prove that

    (1/√n) S_n^{(i)}(β, θ̂) − (1/√n) S_n^{(i)}(β, θ) →_p 0,   i = 1, 2.    (17)

Let R_n(θ) = (1/√n) S_n^{(1)}(β, θ). Then, for all ε > 0 and δ > 0 such that θ − δ1 > 0, where 1 is the unit vector, we have

    P(|R_n(θ̂) − R_n(θ)| > ε) ≤ P(|σ̂²_α − σ²_α| > δ) + P(|γ̂ − γ| > δ) + P(|σ̂²_z − σ²_z| > δ) + P( sup_D |R_n(θ_1) − R_n(θ)| > ε ),

where θ_1 = (σ²_{α,1}, γ_1, σ²_{z,1})ᵀ and D = {|σ²_{α,1} − σ²_α| < δ, |γ_1 − γ| < δ, |σ²_{z,1} − σ²_z| < δ}. Since θ̂ is a consistent estimator of θ, we just need to prove that

    P( sup_D |R_n(θ_1) − R_n(θ)| > ε ) → 0.

By the Markov inequality,

    P( sup_D |R_n(θ_1) − R_n(θ)| > ε )
      ≤ (1/ε²) E[ sup_D (R_n(θ_1) − R_n(θ))² ]
      = (1/ε²) E[ sup_D n⁻¹ Σ_{t=1}^n ( V_{θ_1}⁻¹(X_t | X_{t−1}) − V_θ⁻¹(X_t | X_{t−1}) )² (X_t − αX_{t−1} − λ)² ]
      = (1/ε²) E[ sup_D ( V_{θ_1}⁻¹(X_1 | X_0) − V_θ⁻¹(X_1 | X_0) )² (X_1 − αX_0 − λ)² ]
      = (1/ε²) E[ sup_D ((σ²_{α,1} − σ²_α)X_0² + (γ_1 − γ)X_0 + (σ²_{z,1} − σ²_z))² (X_1 − αX_0 − λ)² / ( V²_{θ_1}(X_1 | X_0) V²_θ(X_1 | X_0) ) ]
      = (1/ε²) E[ sup_D ((σ²_{α,1} − σ²_α)X_0² + (γ_1 − γ)X_0 + (σ²_{z,1} − σ²_z))² / ( V²_{θ_1}(X_1 | X_0) V_θ(X_1 | X_0) ) ]
      ≤ (1/ε²) sup_D { (σ²_{α,1} − σ²_α)² c_1 + (γ_1 − γ)² c_2 + (σ²_{z,1} − σ²_z)² c_3
                       + 2c_4 |(σ²_{α,1} − σ²_α)(γ_1 − γ)| + 2c_5 |(σ²_{z,1} − σ²_z)(σ²_{α,1} − σ²_α)| + 2c_6 |(γ_1 − γ)(σ²_{z,1} − σ²_z)| }
      ≤ C δ²/ε²,

where the c's are finite moments and C is a positive constant. A similar argument applies to (1/√n) S_n^{(2)}(β, θ). Letting δ go to zero, we get our assertion, which in turn establishes (16).
Similarly, we have

    (1/n) Σ_{t=1}^n V_θ⁻¹(X_t | X_{t−1}) − (1/n) Σ_{t=1}^n V_θ̂⁻¹(X_t | X_{t−1}) →_p 0,
    (1/n) Σ_{t=1}^n V_θ⁻¹(X_t | X_{t−1}) X_{t−1} − (1/n) Σ_{t=1}^n V_θ̂⁻¹(X_t | X_{t−1}) X_{t−1} →_p 0,
    (1/n) Σ_{t=1}^n V_θ⁻¹(X_t | X_{t−1}) X²_{t−1} − (1/n) Σ_{t=1}^n V_θ̂⁻¹(X_t | X_{t−1}) X²_{t−1} →_p 0.

Therefore, by the above and Theorem 1.1 of Billingsley (1961), writing V̂_t = V_θ̂(X_t | X_{t−1}) and with all sums over t = 1, . . . , n,

    [ (1/n) Σ V̂_t⁻¹ · (1/n) Σ V̂_t⁻¹ X²_{t−1} − ( (1/n) Σ V̂_t⁻¹ X_{t−1} )² ]⁻¹ (  (1/n) Σ V̂_t⁻¹            −(1/n) Σ V̂_t⁻¹ X_{t−1}  )
                                                                               ( −(1/n) Σ V̂_t⁻¹ X_{t−1}    (1/n) Σ V̂_t⁻¹ X²_{t−1} )
      →_p (T_1(θ)T_2(θ) − T_3²(θ))⁻¹ (  T_1(θ)   −T_3(θ) )  =  T⁻¹(θ).
                                     ( −T_3(θ)    T_2(θ) )

After some algebra, we have

    (α̃ − α, λ̃ − λ)ᵀ = n⁻¹ [ (1/n) Σ V̂_t⁻¹ · (1/n) Σ V̂_t⁻¹ X²_{t−1} − ( (1/n) Σ V̂_t⁻¹ X_{t−1} )² ]⁻¹
                       × (  (1/n) Σ V̂_t⁻¹            −(1/n) Σ V̂_t⁻¹ X_{t−1}  ) ( S_n^{(2)}(β, θ̂) )
                         ( −(1/n) Σ V̂_t⁻¹ X_{t−1}    (1/n) Σ V̂_t⁻¹ X²_{t−1} ) ( S_n^{(1)}(β, θ̂) ).    (18)

Therefore, by (16) and (18),

    √n ( (α̃, λ̃)ᵀ − (α, λ)ᵀ ) →_d N(0, T⁻¹(θ) Q(θ) T⁻¹(θ)),

where Q(θ) is as given in the statement of the theorem. This completes the proof.
Proof of Proposition 3.1. Let

    A_n = (1/n) Σ_{t=1}^n (T(X_{t−1}) − T̄)(X_t − αX_{t−1} − λ)²,
    B_n = (1/n) Σ_{t=1}^n (T(X_{t−1}) − T̄) X²_{t−1},
    C_n = (1/n) Σ_{t=1}^n (T(X_{t−1}) − T̄) X_{t−1}.

By Theorem 1.1 of Billingsley (1961),

    A_n →_{a.s.} E[(T(X_0) − E(T(X_0)))(X_1 − αX_0 − λ)²] = σ²_α(γ_1 − γ_2) + (α − α²)γ_2,
    B_n →_{a.s.} γ_1,
    C_n →_{a.s.} γ_2,

where

    γ_1 = E[(T(X_∞) − E(T(X_∞))) X²_∞],   γ_2 = E[(T(X_∞) − E(T(X_∞))) X_∞],

and X_∞ denotes the limiting random variable corresponding to the stationary distribution of the process. Since α̂ →_p α and λ̂ →_p λ, the estimator σ̂²_α in (8) can be written, up to terms that vanish in probability, as [A_n − (α̂ − α̂²)C_n]/(B_n − C_n). Therefore,

    σ̂²_α →_p [ σ²_α(γ_1 − γ_2) + (α − α²)γ_2 − (α − α²)γ_2 ] / (γ_1 − γ_2) = σ²_α,   provided γ_1 ≠ γ_2.

In order to avoid the case γ_1 = γ_2, we need to assume that the limiting random variable X_∞ is not constant and does not take values only in {0, 1}, and that T(X_∞) is not almost surely equal to E(T(X_∞)).

Similar arguments lead to σ̂²_z →_p σ²_z and γ̂ →_p γ.
References

Al-osh, M.A., Alzaid, A.A., 1987. First order integer-valued autoregressive (INAR(1)) processes. J. Time Ser. Anal. 8, 261–275.
Al-osh, M.A., Alzaid, A.A., 1991. Binomial autoregressive moving average models. Stochastic Models 7, 261–282.
Al-osh, M.A., Alzaid, A.A., 1992. First order autoregressive time series with negative binomial and geometric marginals. Comm. Statist. Theory Methods 21, 2483–2492.
Alzaid, A.A., Al-osh, M.A., 1988. First order integer-valued autoregressive (INAR(1)) processes: distributional and regression properties. Statist. Neerlandica 42, 53–61.
Billingsley, P., 1961. Statistical Inference for Markov Processes. University of Chicago Press, Chicago.
Davis, R.A., Dunsmuir, W.T.M., Wang, Y., 1999. Modeling time series of count data. In: Ghosh, S. (Ed.), Asymptotics, Nonparametrics and Time Series. Marcel Dekker, New York, pp. 63–114.
Fukasawa, T., Basawa, I.V., 2002. Estimation for a class of generalized state-space time series models. Statist. Probab. Lett. 60, 459–473.
Hall, P., Heyde, C.C., 1980. Martingale Limit Theory and Its Application. Academic Press, New York.
Klimko, L.A., Nelson, P.I., 1978. On conditional least squares estimation for stochastic processes. Ann. Statist. 6, 629–642.
MacDonald, I.L., Zucchini, W., 1997. Hidden Markov and Other Models for Discrete-valued Time Series. Chapman & Hall, London.
McKenzie, E., 1985a. Contribution to the discussion of Lawrance and Lewis. J. Roy. Statist. Soc. B 47, 187–188.
McKenzie, E., 1985b. Some simple models for discrete variate time series. Water Resour. Bull. 21, 645–650.
McKenzie, E., 1986. Autoregressive moving-average processes with negative-binomial and geometric marginal distributions. Adv. Appl. Probab. 18, 679–705.
McKenzie, E., 1987. Innovation distributions for gamma and negative-binomial autoregressions. Scand. J. Statist. 14, 79–85.
McKenzie, E., 1988a. The distributional structure of finite moving-average processes. J. Appl. Probab. 25, 313–321.
McKenzie, E., 1988b. Some ARMA models for dependent sequences of Poisson counts. Adv. Appl. Probab. 20, 822–835.
Ross, S.M., 1996. Stochastic Processes, second ed. Wiley, New York.
Schick, A., 1996. √n-consistent estimation in a random coefficient autoregressive model. Austral. J. Statist. 38, 155–160.
Steutel, F.W., van Harn, K., 1979. Discrete analogues of self-decomposability and stability. Ann. Probab. 7, 893–899.