A test of goodness of fit testing for stochastic intensities associated to counting processes


Statistics & Probability Letters 64 (2003) 287 – 292

Raúl Fierro

Instituto de Matemáticas, Universidad Católica de Valparaíso, Casilla 4059, Valparaíso, Chile

Received January 2003; received in revised form March 2003

Abstract

An asymptotic goodness of fit test for stochastic intensities associated to counting processes is derived. Moreover, this result is applied to the stochastic version of the Kermack and McKendrick model for a general epidemic. © 2003 Elsevier B.V. All rights reserved.

Keywords: Goodness of fit testing; Counting processes; Asymptotic distribution; Epidemic models

0167-7152/03/$ - see front matter. doi:10.1016/S0167-7152(03)00173-1

1. Introduction

In a variety of fields, models are described by means of a finite number of counting processes and their corresponding predictable stochastic intensities. The statistical analysis of certain characteristics of these intensities is treated in a number of references; for instance, Andersen et al. (1983), Jacobsen (1982), Karr (1991) and Prakasa Rao (1999) deal with methods for both point estimation and hypothesis testing based on point processes. We are concerned with the adequacy of these stochastic intensities as well. In fact, the differences between the counting processes and their predictable compensators are martingales which, in some sense, play the role of residuals in a regression model. Consequently, the adequacy of the stochastic intensities is closely related to the effect of these martingales being minimal. To this end, we define a test statistic consisting of a measure of these martingales, and the Laplace transform of its asymptotic distribution is derived. This fact enables us to state an asymptotic goodness of fit test for the stochastic intensities associated to their corresponding counting processes.

Our approach is different from those mentioned above. In fact, while the known methods emphasize models of multiplicative intensity, and their statistical inference is carried out on the parametric part of these intensities, we aim to test the whole form of the intensities, which includes testing the functional relation between the parametric and observable parts.

An application to a known epidemic model is introduced. In fact, we apply our main result to the stochastic version of the Kermack and McKendrick model for a general epidemic. This model assumes that its stochastic intensities have multiplicative form. Martingale theory and its asymptotic results are commonly applied to epidemic models based on point processes; see, for instance, Andersson and Britton (2000) and Becker (1993). Assuming that the multiplicative form of the stochastic intensities is correct for the model considered here, the methods introduced by these authors can be appropriate to make statistical inference on the parameters. However, we are interested in testing the correctness of both the parameters and the multiplicative form of the intensities of the model; for this reason, we recommend our main result to test whether the empirical information fits this model.

The paper is structured as follows. In Section 2 preliminaries are stated and the test statistic is defined. In Section 3 we state the main result. In Section 4 the main result is applied to the stochastic version of the Kermack and McKendrick model for a general epidemic. Finally, the proof of the main result is carried out in Section 5.

2. Preliminaries

Let $(\Omega, \mathcal{F}, P)$ and $\mathbb{F} = \{\mathcal{F}_t;\ t \ge 0\}$ be a probability space and an increasing family of $\sigma$-fields on $\Omega$, respectively, such that for each $t \ge 0$, $\mathcal{F}_t \subseteq \mathcal{F}$. On the stochastic basis $(\Omega, \mathcal{F}, P, \mathbb{F})$, we consider $k$ point processes $Z_1^n, \dots, Z_k^n$, $n \in \mathbb{N}$, without common jumps and such that for each $i \in \{1, \dots, k\}$, the predictable intensity of $Z_i^n$ is $n a_i^n$, where $a_1^n, \dots, a_k^n$ are $k$ nonnegative predictable processes. That is, for each $i \in \{1, \dots, k\}$,

$$Z_i^n(t) = n \int_0^t a_i^n(u)\,du + M_i^n(t),$$

where $M_1^n, \dots, M_k^n$ are $k$ martingales starting at zero without common jumps.

We are interested in testing the hypothesis that this model is correct. To this end, we propose to observe, for a stopping time $\tau(n)$, the counting processes $Z_1^n, \dots, Z_k^n$ along the time interval $[0, \tau(n)]$ and define the following test statistic:

$$S_n^2(\tau(n)) = \sum_{i=1}^{k} \int_0^{\tau(n)} m_i^n(t)^2\,dt,$$

where for each $i \in \{1, \dots, k\}$, $m_i^n$ denotes the martingale defined as

$$m_i^n(t) = \int_0^t \frac{I_+(a_i^n(u))\,dM_i^n(u)}{[n a_i^n(u)]^{1/2}}.$$

Here, $I_+ : \mathbb{R} \to \mathbb{R}$ is the indicator function of $]0, \infty[$, that is, $I_+(g) = 1$ if $g > 0$ and $I_+(g) = 0$ if $g \le 0$. Moreover, $I_+(0)/0$ is defined to be zero.

The martingales $m_1^n, \dots, m_k^n$ are a measure of the discrepancy between the observed processes and those being postulated by means of their intensities. Moreover, $S_n^2(\tau(n))$ is the integral of the squares of these martingales. Consequently, $S_n^2(\tau(n))$ is a suitable measure of the overall discrepancy between the empirical evidence observed along the time interval $[0, \tau(n)]$ and the posed stochastic model. However, the main justification for using this statistic is that its limiting distribution is independent of the statistical model, as we state in the next section.

3. Main result

Theorem 1. Let $T > 0$ and suppose that the following two conditions hold:

$$\tau(n) \xrightarrow{P} T. \tag{1.1}$$

There exist deterministic functions $a_1, \dots, a_k$ from $[0, \infty[$ to $]0, \infty[$ such that for each $i \in \{1, \dots, k\}$,

$$\sup_{0 \le t \le T} |a_i^n(t) - a_i(t)| \xrightarrow{P} 0. \tag{1.2}$$

Then, $(S_n^2(\tau(n));\ n \in \mathbb{N})$ converges in distribution to a random variable $S^2(T)$ having Laplace transform given by

$$\varphi_k(\theta) = E(\exp(-\theta S^2(T))) = 1/\cosh^{k/2}(T\sqrt{2\theta}) \quad (\theta > 0). \tag{1.3}$$

Remark. (1) A hypothesis test to decide whether or not to reject the validity of the model is performed in the following manner: given a significance level $\alpha \in\ ]0, 1[$, we choose $t_\alpha > 0$ so that $P(S^2(T) > t_\alpha) \le \alpha$. Then, we compare the statistic $S_n^2(\tau(n))$ with $t_\alpha$. If $S_n^2(\tau(n)) > t_\alpha$, we reject the hypothesis as false, while if $S_n^2(\tau(n)) \le t_\alpha$, we conclude that there is not sufficient evidence that the model is incorrect and, consequently, we accept the hypothesis.

(2) The moments of $S^2(T)$ can be calculated by means of (1.3). In particular, the mean and variance of $S^2(T)$ are given by $kT^2/2$ and $kT^4/3$, respectively.

4. An application to an epidemic model

Let us suppose that a closed and homogeneously mixing population of total size $n$ is subdivided into three classes of individuals: susceptibles, infectives and removed cases. Each infective remains in this state during a period of time and can infect susceptible individuals. After that period, the infective becomes a removed case playing no role in the infection process. This model is known as the general epidemic model, and was introduced by Kermack and McKendrick (1927). Both deterministic and stochastic versions of this model have received much attention in the literature and have been analyzed in some detail by Bailey (1975).

To define this model precisely, let us denote by $S^n(t)$, $I^n(t)$ and $R^n(t)$ the number of susceptibles, infectives and removed cases at time $t$, respectively. Hence, $S^n(t) + I^n(t) + R^n(t) = n$. Let $Z_1^n(t)$ and $Z_2^n(t)$ denote the number of infections and removed cases, respectively, during the time interval $[0, t]$. Then, for each $t \ge 0$,

$$S^n(t) = S^n(0) - Z_1^n(t), \qquad I^n(t) = I^n(0) + Z_1^n(t) - Z_2^n(t)$$

and

$$R^n(t) = R^n(0) + Z_2^n(t).$$

For each $n \in \mathbb{N}$, the processes $Z_1^n(t)$ and $Z_2^n(t)$ are assumed to be counting processes with no common jumps and transition probabilities given by

$$P(Z_1^n(t + \Delta t) = 1 + Z_1^n(t) \mid S^n(t), I^n(t)) = \frac{\beta}{n} S^n(t) I^n(t)\,\Delta t + o(\Delta t)$$

and

$$P(Z_2^n(t + \Delta t) = 1 + Z_2^n(t) \mid S^n(t), I^n(t)) = \gamma I^n(t)\,\Delta t + o(\Delta t),$$

where $\beta$ and $\gamma$ are positive constants. Other transition probabilities are assumed to be zero. Hence, we have

$$Z_1^n(t) = \frac{\beta}{n} \int_0^t S^n(u) I^n(u)\,du + M_1^n(t) \quad \text{and} \quad Z_2^n(t) = \gamma \int_0^t I^n(u)\,du + M_2^n(t),$$

where $M_1^n$ and $M_2^n$ are martingales starting at zero without common jumps.

We are interested in testing the hypothesis that $Z_1^n$ and $Z_2^n$ have indeed predictable stochastic intensities given by $\beta S^n I^n / n$ and $\gamma I^n$, respectively. To this end, we fix $T > 0$ and define $S_n^2(\tau(n))$ by means of

$$S_n^2(\tau(n)) = \int_0^{\tau(n)} \left\{ \left( \int_0^t \frac{I_+(S^n(u) I^n(u))\,dM_1^n(u)}{[\beta S^n(u) I^n(u)/n]^{1/2}} \right)^2 + \left( \int_0^t \frac{I_+(I^n(u))\,dM_2^n(u)}{[\gamma I^n(u)]^{1/2}} \right)^2 \right\} dt,$$

where $\tau(n) = \sup\{t \le T : \Delta Z_1^n(t) + \Delta Z_2^n(t) \ne 0\}$.

In order to state the next result, we fix $p$ such that $0 < p < 1$ and set $(x, y, z)$ as the unique solution to the following system of ordinary differential equations:

$$\dot{x} = -\beta x y, \qquad \dot{y} = \beta x y - \gamma y, \qquad \dot{z} = \gamma y,$$

where $x(0) = p$, $y(0) = 1 - p$ and for each $t \ge 0$, $x(t) + y(t) + z(t) = 1$.

Theorem 2. Suppose $(S^n(0)/n;\ n \in \mathbb{N})$ and $(I^n(0)/n;\ n \in \mathbb{N})$ converge in probability to $p$ and $1 - p$, respectively, as $n \to \infty$. Then, $(S_n^2(\tau(n));\ n \in \mathbb{N})$ converges in distribution to $S^2(T)$ as $n \to \infty$, where $S^2(T)$ is a random variable having distribution function $F$ given by

$$F(t) = 1 - \frac{4}{\pi} \sum_{j=1}^{\infty} \frac{(-1)^{j-1}}{2j - 1} \exp\left( \frac{-(2j - 1)^2 \pi^2 t}{8T^2} \right). \tag{2.1}$$

Proof. It follows from Theorem 2.1 of Chapter 11 in Ethier and Kurtz (1986) that

$$\tau(n) \xrightarrow{P} T, \qquad \sup_{0 \le t \le T} |S^n(t) I^n(t)/n^2 - x(t) y(t)| \xrightarrow{P} 0$$

and

$$\sup_{0 \le t \le T} |I^n(t)/n - y(t)| \xrightarrow{P} 0.$$

Since $0 < p < 1$, $x(t) > 0$ and $y(t) > 0$ for all $t \ge 0$. Consequently, by Theorem 1, $(S_n^2(\tau(n));\ n \in \mathbb{N})$ converges in distribution to a random variable $S^2(T)$ having Laplace transform

$$\varphi_2(\theta) = E(\exp(-\theta S^2(T))) = 1/\cosh(T\sqrt{2\theta}) \quad (\theta > 0).$$

The complex inversion formula can be evaluated for $\varphi_2(\theta)$ by means of the calculus of residues. This method enables us to conclude that the inverse Laplace transform $F(t)$ of $\varphi_2(\theta)$ is given by (2.1) and therefore, the proof is complete.

5. Proof of main result

For each $i \in \{1, \dots, k\}$ the predictable quadratic variation of $m_i^n$ is given by

$$\langle m_i^n \rangle(t) = \int_0^t I_+(a_i^n(u))\,du \quad (t \ge 0).$$

Let $m^n$ be the vectorial martingale defined as $m^n = (m_1^n, \dots, m_k^n)$, and let us prove that $(m^n;\ n \in \mathbb{N})$ converges in distribution to a standard $k$-dimensional Brownian motion. To this end, we apply Corollary 12 in Section II.5 of Rebolledo (1979). Since for each $u \ge 0$, $(a_i^n(u);\ n \in \mathbb{N})$ converges in probability to $a_i(u)$, it follows from (1.2) that for each $t \ge 0$, $(\langle m_i^n \rangle(t);\ n \in \mathbb{N})$ converges in probability to $t$. For each $\varepsilon > 0$, let

$$\sigma^\varepsilon(m^n)_t = \sum_{u \le t} \|\Delta m^n(u)\|^2 I_{\{\|\Delta m^n(u)\| > \varepsilon\}},$$

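where $\|\cdot\|$ and $I_A$ denote the Euclidean norm in $\mathbb{R}^k$ and the indicator function of the set $A$, respectively. Denoting by $\tilde{\sigma}^\varepsilon(m^n)$ the predictable compensator of $\sigma^\varepsilon(m^n)$, it suffices to prove that for each $t \ge 0$, $(\tilde{\sigma}^\varepsilon(m^n)_t;\ n \in \mathbb{N})$ converges in probability to zero. Let

$$H_i^n(u) = \frac{I_+(a_i^n(u))}{[n a_i^n(u)]^{1/2}}.$$

We have

$$\sigma^\varepsilon(m^n)_t = \sum_{i=1}^{k} \sum_{u \le t} |\Delta m_i^n(u)|^2 I_{\{|\Delta m_i^n(u)| > \varepsilon\}} = \sum_{i=1}^{k} \int_0^t H_i^n(u)^2 I_{\{H_i^n(u) > \varepsilon\}}\,dZ_i^n(u).$$

Thus, since $Z_i^n$ has compensator $n \int_0^t a_i^n(u)\,du$ and $H_i^n(u)^2\, n a_i^n(u) = I_+(a_i^n(u))$, it is easy to see that

$$\tilde{\sigma}^\varepsilon(m^n)_t = \sum_{i=1}^{k} \int_0^t I_{\{H_i^n(u) > \varepsilon\}}\,du.$$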
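The role of this compensator can be illustrated in a toy case. The sketch below assumes a single counting process with constant intensity $na$, $a > 0$, that is, a Poisson process; this special case and all function names are assumptions made for illustration and are not taken from the paper. In that case every jump of the normalized martingale has size $1/\sqrt{na}$, so the compensator of the large jumps vanishes as soon as $n > 1/(a\varepsilon^2)$, while the variance of the normalized martingale at time $T$ stays equal to $T$.

```python
import math
import random


def normalized_martingale_at(T, n, a, rng):
    """Return m^n(T) = (Z^n(T) - n*a*T) / sqrt(n*a) for a Poisson process Z^n
    with constant intensity n*a (toy case assumed for this sketch)."""
    rate = n * a
    count = 0
    t = rng.expovariate(rate)          # time of the first jump
    while t <= T:
        count += 1
        t += rng.expovariate(rate)     # independent exponential waiting times
    return (count - rate * T) / math.sqrt(rate)


def jump_size(n, a):
    """Size H^n = 1/sqrt(n*a) of every jump of m^n when a > 0."""
    return 1.0 / math.sqrt(n * a)


def large_jump_compensator(T, n, a, eps):
    """Integral over [0, T] of the indicator {H^n > eps} for constant a."""
    return T if jump_size(n, a) > eps else 0.0


def sample_variance(values):
    """Unbiased sample variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)
```

For instance, with `a = 2.0` and `eps = 0.1`, `large_jump_compensator(1.0, n, 2.0, 0.1)` equals `1.0` for `n = 10` and `0.0` for `n = 100`, while the sample variance of repeated draws of `normalized_martingale_at(1.0, 100, 2.0, rng)` should be close to `1.0`.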


From condition (1.2), the sequence $(\tilde{\sigma}^\varepsilon(m^n)_t;\ n \in \mathbb{N})$ converges in probability to zero. Therefore, $(m^n;\ n \in \mathbb{N})$ converges in distribution to a standard $k$-dimensional Brownian motion $W = (W_1, \dots, W_k)$. Since

$$S_n^2(\tau(n)) = \int_0^{\tau(n)} \|m^n(u)\|^2\,du,$$

and condition (1.1) holds, the sequence $(S_n^2(\tau(n));\ n \in \mathbb{N})$ converges in distribution to

$$S^2(T) = \int_0^T \|W(u)\|^2\,du.$$

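The Laplace transform of $S^2(T)$ is given by

$$\varphi_k(\theta) = E(\exp(-\theta S^2(T))) = E\left( \exp\left( -\theta \int_0^T W_1(u)^2\,du \right) \right) \cdots E\left( \exp\left( -\theta \int_0^T W_k(u)^2\,du \right) \right).$$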
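The factorization above rests on the independence of the coordinates of $W$ and can be checked numerically. The sketch below is a Monte Carlo illustration under stated assumptions: Brownian paths are simulated on a uniform grid, the time integral is approximated by the trapezoidal rule, and all function names and parameter choices are hypothetical. It also evaluates the closed form stated in (1.3) for comparison.

```python
import math
import random


def integral_of_squared_norm(k, T, steps, rng):
    """Trapezoidal approximation of the integral of ||W(u)||^2 over [0, T]
    for a k-dimensional standard Brownian motion W started at zero."""
    dt = T / steps
    sd = math.sqrt(dt)
    w = [0.0] * k
    prev = 0.0                     # ||W||^2 at the left end of the current step
    total = 0.0
    for _ in range(steps):
        w = [x + rng.gauss(0.0, sd) for x in w]
        cur = sum(x * x for x in w)
        total += 0.5 * (prev + cur) * dt
        prev = cur
    return total


def laplace_monte_carlo(theta, k, T, steps, paths, rng):
    """Monte Carlo estimate of E exp(-theta * integral of ||W||^2 over [0, T])."""
    acc = 0.0
    for _ in range(paths):
        acc += math.exp(-theta * integral_of_squared_norm(k, T, steps, rng))
    return acc / paths


def laplace_closed_form(theta, k, T):
    """Right-hand side of (1.3): cosh(T * sqrt(2 * theta)) ** (-k / 2)."""
    return math.cosh(T * math.sqrt(2.0 * theta)) ** (-k / 2.0)
```

With a few thousand paths, the estimate for `k = 2` should be close both to the square of the estimate for `k = 1` and to `laplace_closed_form(theta, 2, T)`, up to Monte Carlo and discretization error.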

By applying the Cameron and Martin formula (see, for instance, Karatzas and Shreve, 1991), we have

$$E\left( \exp\left( -\theta \int_0^T W_i(u)^2\,du \right) \right) = 1/\cosh^{1/2}(T\sqrt{2\theta}),$$

and therefore, $\varphi_k(\theta) = 1/\cosh^{k/2}(T\sqrt{2\theta})$ $(\theta > 0)$, which concludes the proof.

Acknowledgements

This research has been partially supported by Dirección General de Investigación de la Universidad Católica de Valparaíso and FONDECYT under grants No. 1.000270 and No. 1.030986.

References

Andersen, P.K., Borgan, O., Gill, R.D., Keiding, N., 1983. Statistical Models based on Counting Processes. Springer, New York.

Andersson, H., Britton, T., 2000. Stochastic Epidemic Models and their Statistical Analysis. Springer, New York.

Bailey, N.T.J., 1975. The Mathematical Theory of Infectious Diseases. Griffin and Co., London.

Becker, N., 1993. Parametric inference for epidemic models. Math. Biosci. 117, 239–251.

Ethier, S.N., Kurtz, T.G., 1986. Markov Processes. Characterization and Convergence. Wiley, New York.

Jacobsen, M., 1982. Statistical Analysis of Counting Processes. Springer, New York.

Karatzas, I., Shreve, S.E., 1991. Brownian Motion and Stochastic Calculus. Springer, New York.

Karr, A.F., 1991. Point Processes and their Statistical Inference. Marcel Dekker, New York.

Kermack, W.O., McKendrick, A.G., 1927. Contribution to the mathematical theory of epidemics. Proc. R. Soc. Lond. A 115, 700–721.

Prakasa Rao, B.L.S., 1999. Semimartingales and their Statistical Inference. Chapman & Hall/CRC, Boca Raton, FL.

Rebolledo, R., 1979. La méthode des martingales appliquée à l'étude de la convergence en loi de processus. Bull. de la Société Mathématique de France 62, 1–126.