Statistics & Probability Letters 48 (2000) 217 – 227
A semiparametric model for truncated and censored data
Liuquan Sun∗, Lixing Zhu
Institute of Applied Mathematics, Academia Sinica, Beijing 100080, People's Republic of China
Received March 1999; received in revised form August 1999
Abstract
In this paper we propose and study a semiparametric estimator of the survival function based on left truncated and right censored data. Uniform consistency and a functional central limit result for this estimator are established. In addition, this estimator is shown to be superior to the product-limit estimator in terms of asymptotic variance. © 2000 Elsevier Science B.V. All rights reserved.
MSC: 60F15; 62G05
Keywords: Truncated and censored data; Product-limit estimator; Maximum-likelihood estimation; Functional central limit theorem; Uniform consistency
∗ Corresponding author. E-mail address: [email protected] (L. Sun).
Research is supported by the Postdoctoral Programme Foundation and the National Natural Science Foundation of China and a CRCG grant of the University of Hong Kong.
0167-7152/00/$ - see front matter. PII: S0167-7152(99)00207-2

1. Introduction
Let $(X, T, Y)$ denote random variables, where $X$ is the variable of interest, called the lifetime variable, with continuous distribution function (d.f.) $F$; $T$ is the random left truncation time with arbitrary d.f. $G$; and $Y$ is the random right censoring time with arbitrary d.f. $H$. It is assumed that $X, T, Y$ are mutually independent and, without loss of generality, that they are nonnegative. In the random left truncation and right censoring (LTRC) model one observes $(Z, T, \delta)$ if $Z \ge T$, where $Z = X \wedge Y = \min(X, Y)$ and $\delta = I(X \le Y)$ is the indicator of censoring status. When $Z < T$ nothing is observed. Let $\alpha \equiv P(T \le Z) > 0$, and let $W$ denote the d.f. of $Z$, i.e., $1 - W = (1 - F)(1 - H)$. Let $(Z_i, T_i, \delta_i)$, $i = 1, 2, \ldots, n$, be an independent and identically distributed (i.i.d.) sample of $(Z, T, \delta)$ which one observes (i.e., $Z_i \ge T_i$). For any distribution function $L$, let $a_L = \inf\{x: L(x) > 0\}$ and $b_L = \sup\{x: L(x) < 1\}$ be the lower and upper endpoints of its support. Then, under the current model as discussed by Gijbels and Wang (1993) and Arcones and Gine (1995), we assume that $a_G \le a_W$ and $b_G \le b_W$, and that
$$\int_{a_W}^{\infty} \frac{dW(x)}{G^{3}(x)} < \infty \tag{1.1}$$
holds. Condition (1.1) requires, in a sense, that the left truncation is not heavy, as one would expect in order to obtain sensible estimates. Define
$$C(x) = P(T \le x \le Z \mid T \le Z) = \alpha^{-1} G(x)\,(1 - W(x-)),$$
$$W_1(x) = P(Z \le x, \delta = 1 \mid T \le Z) = \alpha^{-1} \int_{a_W}^{x} G(u)\,(1 - H(u-))\, dF(u).$$
It can be shown that the cumulative hazard function $\Lambda$ of $F$ satisfies
$$\Lambda(x) = \int_{a_W}^{x} \frac{dW_1(u)}{C(u)}.$$
Let $C_n(x)$ and $W_{1n}(x)$ be the empirical estimators of $C(x)$ and $W_1(x)$, respectively, i.e., $C_n(x) = n^{-1} \sum_{i=1}^{n} I(T_i \le x \le Z_i)$ and $W_{1n}(x) = n^{-1} \sum_{i=1}^{n} I(Z_i \le x, \delta_i = 1)$. Hence a natural estimator of $\Lambda$ is
$$\Lambda_n(x) = \int_{a_W}^{x} \frac{dW_{1n}(u)}{C_n(u)} = \sum_{i=1}^{n} \frac{I(Z_i \le x, \delta_i = 1)}{n C_n(Z_i)},$$
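which is comparable to the Nelson–Aalen estimator of the cumulative hazard function for right censored data. A one-to-one correspondence between $F$ and $\Lambda$ is as follows:
$$1 - F(x) = \exp(-\Lambda^{c}(x)) \prod_{u \le x} (1 - \Delta\Lambda(u)), \tag{1.2}$$
where $\Lambda^{c}(x)$ is the continuous part of $\Lambda$ and $\Delta\Lambda(u) = \Lambda(u) - \Lambda(u-)$ is the jump of $\Lambda$ at $u$. As a fully nonparametric estimator of $F$, the product-limit (PL) estimator $F_n^{pl}$ pertaining to $\Lambda_n$ is defined by
$$1 - F_n^{pl}(x) = \prod_{i:\, Z_i \le x} \left(1 - [n C_n(Z_i)]^{-1}\right)^{\delta_i},$$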
which received much attention in the literature (e.g., Tsai et al., 1987; Gu and Lai, 1990; Lai and Ying, 1991; Gijbels and Wang, 1993; Zhou, 1996). For right-censored data, Abdushukurov (1987), Cheng and Lin (1987) and Dikta (1998) studied estimators of the survival function for some semiparametric models. For left truncated data, Wang (1989) derived the maximum-likelihood estimator (MLE) of the survival function for a semiparametric model in which the truncation mechanism is parameterized. To motivate our semiparametric estimator, we observe that
$$W_1(x) = \int_{a_W}^{x} m(u)\, dW^{*}(u),$$
where
$$W^{*}(x) = P(Z \le x \mid T \le Z) = \alpha^{-1} \int_{a_W}^{x} G(u)\, dW(u)$$
and $m(x) = P(\delta = 1 \mid Z = x, T \le Z)$ denotes the conditional expectation of $\delta$ given $T \le Z = x$. The importance of $m(x)$ has been pointed out in Stute and Wang (1993) and Dikta (1998) for censored data. It can be checked that
$$\Lambda(x) = \int_{a_W}^{x} \frac{m(u)}{C(u)}\, dW^{*}(u). \tag{1.3}$$
Obviously, (1.3) can be used to define a great variety of semiparametric models by describing $m$ in some parametric form. An appropriate estimator of $F$ can then be obtained by estimating $m$ parametrically. This approach seems more flexible for practical purposes than a nonparametric model, and the corresponding estimator
of $F$ might also be more efficient than the PL estimator. In this article, as discussed by Dikta (1998), we assume that $m$ belongs to a parametric family, so that we can write $m(x) = m(x; \theta_0)$, where $m(\cdot\,; \cdot)$ is a known continuous function and $\theta_0 = (\theta_{01}, \ldots, \theta_{0k}) \in \Theta$ is an unknown parameter. Possible candidates for $m$ can be found in Cox and Snell (1989) and Dikta (1998). For the estimation of the parameter $\theta_0$ we use a maximum likelihood approach. The function $m$ may then be estimated parametrically through $\hat m_n(x) = m(x; \hat\theta_n)$, where $\hat\theta_n$ is the MLE of $\theta_0$. The semiparametric estimators of $\Lambda$ and $F$, respectively, are then given by
$$\hat\Lambda_n(x) = \int_{a_W}^{x} \frac{m(u; \hat\theta_n)}{C_n(u)}\, dW_n(u) = \sum_{i:\, Z_i \le x} \frac{m(Z_i; \hat\theta_n)}{n C_n(Z_i)}$$
and
$$1 - \hat F_n(x) = \prod_{i:\, Z_i \le x} \left(1 - \frac{m(Z_i; \hat\theta_n)}{n C_n(Z_i)}\right),$$
where $W_n(x) = n^{-1} \sum_{i=1}^{n} I(Z_i \le x)$. In the next section we state the main results of the paper. The proofs of the results are given in the third section. In particular, we show that our estimator is superior to the PL estimator in terms of asymptotic variance under the stated model assumptions. Apart from the technical details, the outlines of the proofs of the theorems are similar to those of Dikta (1998).

2. Main results
We begin this section with some general results on the consistency and asymptotic normality of the MLE for the parameter $\theta_0$. Similar to Stute (1992) and Dikta (1998), the likelihood function is given by
$$L_n(\theta) = \prod_{i=1}^{n} m(Z_i; \theta)^{\delta_i} (1 - m(Z_i; \theta))^{1 - \delta_i}.$$
Hence, the corresponding normalized log-likelihood function equals
$$l_n(\theta) = n^{-1} \sum_{i=1}^{n} \left[\delta_i \ln m(Z_i; \theta) + (1 - \delta_i) \ln(1 - m(Z_i; \theta))\right],$$
and the strong law of large numbers (SLLN) yields that, with probability one,
$$l_n(\theta) \to \int_{a_W}^{\infty} \left[m(x; \theta_0) \ln m(x; \theta) + (1 - m(x; \theta_0)) \ln(1 - m(x; \theta))\right] dW^{*}(x) := L_{W^{*}}(\theta_0; \theta). \tag{2.1}$$
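The estimators of Section 1 and the likelihood above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the logistic form of $m(x;\theta)$ is one possible parametric family chosen here for concreteness, plain gradient ascent stands in for whatever MLE solver one prefers, and the function names and the toy sample are hypothetical.

```python
import math
from typing import List, Sequence


def at_risk(z: float, T: Sequence[float], Z: Sequence[float]) -> int:
    """n * C_n(z): number of observed subjects with T_j <= z <= Z_j."""
    return sum(1 for t, zz in zip(T, Z) if t <= z <= zz)


def pl_survival(x: float, T, Z, delta) -> float:
    """Product-limit estimate of 1 - F(x): factors only at uncensored Z_i <= x."""
    surv = 1.0
    for z, d in zip(Z, delta):
        if d == 1 and z <= x:
            surv *= 1.0 - 1.0 / at_risk(z, T, Z)
    return surv


def m_logistic(x: float, theta: Sequence[float]) -> float:
    """Assumed family (illustrative): m(x; theta) = logistic(theta[0] + theta[1] * x)."""
    return 1.0 / (1.0 + math.exp(-(theta[0] + theta[1] * x)))


def score(theta, Z, delta) -> List[float]:
    """Gradient of l_n(theta); for the logistic family it is mean((delta_i - m_i) * (1, Z_i))."""
    n = len(Z)
    resid = [d - m_logistic(z, theta) for z, d in zip(Z, delta)]
    return [sum(resid) / n, sum(r * z for r, z in zip(resid, Z)) / n]


def fit_theta(Z, delta, step: float = 0.2, iters: int = 20000) -> List[float]:
    """Gradient ascent on the concave l_n(theta); a simple stand-in for an MLE solver."""
    theta = [0.0, 0.0]
    for _ in range(iters):
        g = score(theta, Z, delta)
        theta = [theta[0] + step * g[0], theta[1] + step * g[1]]
    return theta


def semiparametric_survival(x: float, T, Z, theta) -> float:
    """1 - F_hat_n(x): every Z_i <= x contributes a factor 1 - m(Z_i; theta) / (n C_n(Z_i))."""
    surv = 1.0
    for z in Z:
        if z <= x:
            surv *= 1.0 - m_logistic(z, theta) / at_risk(z, T, Z)
    return surv


# Hypothetical toy sample; T_i <= Z_i holds for every observed subject.
T = [0.1, 0.2, 0.3, 0.1, 1.2, 0.2, 1.6, 0.3]
Z = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
delta = [1, 1, 0, 1, 0, 1, 0, 0]

theta_hat = fit_theta(Z, delta)
for x in (1.0, 2.0, 4.0):
    print(x, pl_survival(x, T, Z, delta), semiparametric_survival(x, T, Z, theta_hat))
```

Note the structural difference between the two estimators: the PL estimator jumps only at uncensored observations, whereas the semiparametric estimator places a (smaller) jump at every observed $Z_i$, weighted by the fitted $m(Z_i;\hat\theta_n)$.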
We now state the conditions for consistency and asymptotic normality of $\hat\theta_n$. Basically, these conditions are adaptations of those which can be found in the usual MLE theory (see, for example, Perlman, 1972; Dikta, 1998). To simplify the notation we write $D_r m(x; \theta_0)$ for $[\partial m(x; \theta)/\partial\theta_r]|_{\theta = \theta_0}$ and $\mathrm{Grad}(m(x; \theta_0))$ for $(D_1 m(x; \theta_0), \ldots, D_k m(x; \theta_0))^{\top}$.
(A1) For each $\theta \ne \theta_0$,
$$\int_{a_W}^{\infty} I(m(x; \theta) = 0)\, m(x; \theta_0)\, dW^{*}(x) = 0 = \int_{a_W}^{\infty} I(m(x; \theta) = 1)\,(1 - m(x; \theta_0))\, dW^{*}(x)$$
and
$$\int_{a_W}^{\infty} I(B)\, dW^{*}(x) > 0, \quad \text{where } B = \{x: m(x; \theta) \ne m(x; \theta_0)\}.$$
(A2) There exists a measurable solution $\hat\theta_n \in \Theta$ of the equation $\mathrm{Grad}(l_n(\theta)) = 0$, which tends to $\theta_0$ in probability.
(A3) $\ln m(x; \theta)$ and $\ln(1 - m(x; \theta))$ possess continuous partial derivatives of second order with respect to $\theta$ at each $\theta \in \Theta$ and $x \ge 0$. Furthermore, $D_{r,s} \ln m(\cdot\,; \theta)$ and $D_{r,s} \ln(1 - m(\cdot\,; \theta))$ are measurable for each $\theta \in \Theta$, and there exist a neighborhood $V_1(\theta_0) \subset \Theta$ of $\theta_0$ and a measurable function $M_1$ such that $E M_1^{2}(Z_1) < \infty$ and $|D_{r,s} \ln m(x; \theta)| + |D_{r,s} \ln(1 - m(x; \theta))| \le M_1(x)$ for all $\theta \in V_1(\theta_0)$, $x \ge 0$, and $1 \le r, s \le k$.
(A4) For $1 \le r \le k$, $[D_r m(Z_1; \theta_0)/m(Z_1; \theta_0)]^{4}$ and $[D_r m(Z_1; \theta_0)/(1 - m(Z_1; \theta_0))]^{4}$ have finite expectation.
(A5) The matrix $I(\theta_0) = (\sigma_{rs})_{1 \le r, s \le k}$ is positive definite, where
$$\sigma_{rs} = E\left[\frac{D_r m(Z_1; \theta_0)\, D_s m(Z_1; \theta_0)}{m(Z_1; \theta_0)\,(1 - m(Z_1; \theta_0))}\right]. \tag{2.2}$$
(A6) $m(x; \theta)$ possesses continuous partial derivatives of second order with respect to $\theta$ at each $\theta \in \Theta$ and $x \ge 0$. Furthermore, $D_{r,s} m(\cdot\,; \theta)$ is measurable for each $\theta \in \Theta$, and there exist a neighborhood $V_2(\theta_0) \subset \Theta$ of $\theta_0$ and a measurable function $M_2$ such that $E M_2^{2}(Z_1) < \infty$ and $|D_{r,s} m(x; \theta)| \le M_2(x)$ for all $\theta \in V_2(\theta_0)$, $x \ge 0$, and $1 \le r, s \le k$; finally, $\mathrm{Grad}(m(\cdot\,; \theta_0))$ is bounded on $[a_W, b]$, where $a_W < b < b_W$.
(A7) For $1 \le r \le k$, $D_r m(\cdot\,; \theta_0)$ is Lipschitz continuous of order one on $[a_W, b]$, where $a_W < b < b_W$.

First, we derive strong consistency of the MLE. It constitutes an adaptation of Theorem 2.4 of Perlman (1972) or Theorem 2.1 of Dikta (1998) to the present situation.

Theorem 1. Let $\Theta \subset R^{k}$ be compact and $\theta_0 \in \Theta$. Assume that $L_{W^{*}}(\theta_0; \theta_0)$ is finite and that $m(\cdot\,; \cdot)$ is continuous. If (A1) holds, then $\hat\theta_n \to \theta_0$ a.s.

Next, we state the asymptotic normality of the MLE.

Theorem 2. Let $\Theta$ be a connected, open subset of $R^{k}$. If (A2)–(A5) are satisfied, then $n^{1/2}(\hat\theta_n - \theta_0) \to N_k(0, I^{-1}(\theta_0))$ in distribution, where $I(\theta_0) = (\sigma_{rs})_{1 \le r, s \le k}$ is defined in (2.2).

Uniform consistency of $\hat\Lambda_n$ and $\hat F_n$ is given in the following theorem.

Theorem 3. Let $\Theta$ be a connected, open subset of $R^{k}$. Assume that $\hat\theta_n \in \Theta$ is a measurable solution of the equation $\mathrm{Grad}(l_n(\theta)) = 0$ such that $\hat\theta_n \to \theta_0$ with probability one, and that $m(x; \theta)$ possesses continuous partial derivatives with respect to $\theta$ at each $\theta \in \Theta$ and $x \ge 0$. Furthermore, $D_r m(\cdot\,; \theta)$ is measurable for each $\theta \in \Theta$, and there exist a neighborhood $V(\theta_0) \subset \Theta$ of $\theta_0$ and a measurable function $M$ such that $|D_r m(x; \theta)| \le M(x)$ and $E M^{2}(Z_1) < \infty$ for all $\theta \in V(\theta_0)$, $x \ge 0$, and $1 \le r \le k$. Then for $a_W < b < b_W$,
$$\sup_{a_W \le x \le b} |\hat\Lambda_n(x) - \Lambda(x)| \to 0 \quad \text{a.s.} \tag{2.3}$$
and
$$\sup_{a_W \le x \le b} |\hat F_n(x) - F(x)| \to 0 \quad \text{a.s.} \tag{2.4}$$
The next theorem presents the functional central limit result for the processes $n^{1/2}(\hat\Lambda_n(x) - \Lambda(x))$ and $n^{1/2}(\hat F_n(x) - F(x))$.

Theorem 4. If $W$ is continuous and (A2)–(A7) are satisfied, then
(i) $n^{1/2}(\hat\Lambda_n(x) - \Lambda(x))$ converges weakly to a mean zero Gaussian process $\psi$ with covariance
$$\mathrm{Cov}(\psi(x), \psi(y)) = \int_{a_W}^{x} \frac{m(u; \theta_0)}{C^{2}(u)}\, dW_1(u) + \int_{a_W}^{x} \int_{a_W}^{y} \frac{\alpha(u, v)}{C(u) C(v)}\, dW^{*}(v)\, dW^{*}(u) \tag{2.5}$$
for $a_W \le x \le y \le b$, where $\alpha(u, v) = (\mathrm{Grad}(m(u; \theta_0)))^{\top} I^{-1}(\theta_0)\, \mathrm{Grad}(m(v; \theta_0))$.
(ii) $n^{1/2}(\hat F_n(x) - F(x))$ converges weakly to the mean zero Gaussian process $(1 - F)\psi$.

The following corollary shows that for each $a_W \le x \le b < b_W$, the asymptotic variance of the PL process is at least as large as that of $n^{1/2}(\hat F_n - F)$. For this, we denote the asymptotic variance of $n^{1/2}(F_n^{pl}(x) - F(x))$ by $V_p(x)$ and that of $n^{1/2}(\hat F_n(x) - F(x))$ by $V(x)$.

Corollary 5. Under the assumptions of Theorem 4, $V_p(x) - V(x) = (1 - F(x))^{2} r(x) \ge 0$ for $a_W \le x \le b < b_W$, where
$$r(x) = \int_{a_W}^{x} \frac{1 - m(u; \theta_0)}{C^{2}(u)}\, dW_1(u) - \int_{a_W}^{x} \int_{a_W}^{x} \frac{\alpha(u, v)}{C(u) C(v)}\, dW^{*}(v)\, dW^{*}(u). \tag{2.6}$$
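The algebra linking (2.5) and (2.6) can be written out as a short worked derivation. The sketch below assumes, as stated in the proof of Corollary 5, that $V_p(x) = (1-F(x))^2 \int_{a_W}^{x} C^{-2}(u)\, dW_1(u)$, and obtains $V(x)$ by setting $y = x$ in (2.5) and using Theorem 4(ii); no new result is claimed.

```latex
\begin{align*}
V(x) &= (1-F(x))^{2}\left[\int_{a_W}^{x}\frac{m(u;\theta_0)}{C^{2}(u)}\,dW_1(u)
        +\int_{a_W}^{x}\!\int_{a_W}^{x}\frac{\alpha(u,v)}{C(u)C(v)}\,dW^{*}(v)\,dW^{*}(u)\right],\\
V_p(x) &= (1-F(x))^{2}\int_{a_W}^{x}\frac{dW_1(u)}{C^{2}(u)},\\
V_p(x)-V(x) &= (1-F(x))^{2}\left[\int_{a_W}^{x}\frac{1-m(u;\theta_0)}{C^{2}(u)}\,dW_1(u)
        -\int_{a_W}^{x}\!\int_{a_W}^{x}\frac{\alpha(u,v)}{C(u)C(v)}\,dW^{*}(v)\,dW^{*}(u)\right]
        = (1-F(x))^{2}\,r(x).
\end{align*}
```

The nontrivial part of the corollary is therefore the inequality $r(x) \ge 0$, which the paper obtains along the lines of Corollary 2.7 of Dikta (1998).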
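3. Proofs of the main results
For two functions $\varphi_1, \varphi_2$ defined on $[0, \infty)$ with values in $[0, 1]$, which satisfy
$$\int_{a_W}^{\infty} I(\varphi_2 = 0)\, \varphi_1\, dW^{*} = 0 = \int_{a_W}^{\infty} I(\varphi_2 = 1)\,(1 - \varphi_1)\, dW^{*},$$
define
$$K_{W^{*}}(\varphi_1; \varphi_2) = \int_{a_W}^{\infty} \ln(\varphi_1/\varphi_2)\, \varphi_1\, dW^{*} + \int_{a_W}^{\infty} \ln((1 - \varphi_1)/(1 - \varphi_2))\,(1 - \varphi_1)\, dW^{*}, \tag{3.1}$$
where $0 \ln(0/0)$ and $0/0$ are taken to be zero. Similar to the corresponding arguments in usual MLE theory (see, e.g., Stute, 1992; Dikta, 1998), we get that $K_{W^{*}}$ is well defined with $K_{W^{*}}(\varphi_1; \varphi_2) \ge 0$ and
$$K_{W^{*}}(\varphi_1; \varphi_2) > 0 \iff \int_{a_W}^{\infty} I(\varphi_1 \ne \varphi_2)\, dW^{*} > 0. \tag{3.2}$$
Note that $K_{W^{*}}(m(\cdot\,; \theta_0); m(\cdot\,; \theta)) = L_{W^{*}}(\theta_0; \theta_0) - L_{W^{*}}(\theta_0; \theta)$, where $L_{W^{*}}$ is defined in (2.1). If $L_{W^{*}}(\theta_0; \theta_0)$ is finite and (A1) holds, then $L_{W^{*}}(\theta_0; \cdot)$ has a unique maximum at $\theta_0$.

Proofs of Theorems 1 and 2. The proofs are almost exactly the same as those of Theorems 2.1 and 2.3 of Dikta (1998), and hence are omitted.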
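The role of $K_{W^{*}}$ in identifying $\theta_0$ can be illustrated numerically. The following minimal sketch evaluates (3.1) for a discrete measure standing in for $W^{*}$; the function name `k_functional`, the three-point measure, and the two candidate functions are hypothetical and serve only to show that the functional vanishes when the two functions coincide and is positive when they differ on a set of positive mass.

```python
import math
from typing import Sequence


def k_functional(phi1: Sequence[float], phi2: Sequence[float], weights: Sequence[float]) -> float:
    """Discrete analogue of (3.1): phi1, phi2 are values at the atoms, weights their W*-masses.

    Terms with phi1 = 0 (resp. phi1 = 1) are skipped, matching the convention
    that 0 * ln(0/0) is taken to be zero.
    """
    total = 0.0
    for p, q, w in zip(phi1, phi2, weights):
        if p > 0.0:
            total += math.log(p / q) * p * w
        if p < 1.0:
            total += math.log((1.0 - p) / (1.0 - q)) * (1.0 - p) * w
    return total


# Hypothetical three-point measure and two candidate functions m(.; theta).
weights = [0.25, 0.5, 0.25]
m_true = [0.2, 0.5, 0.9]    # plays the role of m(.; theta_0)
m_other = [0.3, 0.5, 0.8]   # plays the role of m(.; theta), theta != theta_0

print(k_functional(m_true, m_true, weights))   # zero: the two functions coincide
print(k_functional(m_true, m_other, weights))  # strictly positive
```

Since $K_{W^{*}}(m(\cdot;\theta_0); m(\cdot;\theta)) = L_{W^{*}}(\theta_0;\theta_0) - L_{W^{*}}(\theta_0;\theta)$, positivity for every $\theta \ne \theta_0$ is exactly what makes $\theta_0$ the unique maximizer of the limiting log-likelihood.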
Proof of Theorem 3. It follows from a Taylor expansion that
$$m(x; \hat\theta_n) = m(x; \theta_0) + (\mathrm{Grad}(m(x; \theta_n^{*})))^{\top} (\hat\theta_n - \theta_0),$$
where $\theta_n^{*}$ lies in the line segment connecting $\hat\theta_n$ and $\theta_0$. Therefore,
$$\hat\Lambda_n(x) - \Lambda(x) = \left[\int_{a_W}^{x} \frac{m(u; \theta_0)}{C_n(u)}\, dW_n(u) - \int_{a_W}^{x} \frac{m(u; \theta_0)}{C(u)}\, dW_n(u)\right] + \left[\int_{a_W}^{x} \frac{m(u; \theta_0)}{C(u)}\, dW_n(u) - \int_{a_W}^{x} \frac{m(u; \theta_0)}{C(u)}\, dW^{*}(u)\right] + \int_{a_W}^{x} \frac{(\mathrm{Grad}(m(u; \theta_n^{*})))^{\top} (\hat\theta_n - \theta_0)}{C_n(u)}\, dW_n(u)$$
$$:= I_{n1}(x) + I_{n2}(x) + I_{n3}(x).$$
Let $G^{*}(x) = P(T \le x \mid T \le Z)$ and $G_n(x) = n^{-1} \sum_{i=1}^{n} I(T_i \le x)$. It can be checked that
$$\sup_{a_W \le x \le b} \frac{G^{*}(x)}{C(x)} < \infty \quad \text{and} \quad \sup_{a_W \le x \le b} \frac{W^{*}(x)}{C(x)} < \infty.$$
Note that $C(x) = G^{*}(x) - W^{*}(x-)$ and $C_n(x) = G_n(x) - W_n(x-)$. Then from Csaki (1975), we have that for any $\varepsilon > 0$,
$$\sup_{a_W \le x \le b} \frac{|C_n(x) - C(x)|}{C^{1/2}(x)} \le \sup_{a_W \le x \le b} \frac{|G_n(x) - G^{*}(x)|}{G^{*1/2}(x)} \sup_{a_W \le x \le b} \frac{G^{*1/2}(x)}{C^{1/2}(x)} + \sup_{a_W \le x \le b} \frac{|W_n(x-) - W^{*}(x-)|}{W^{*1/2}(x-)} \sup_{a_W \le x \le b} \frac{W^{*1/2}(x)}{C^{1/2}(x)} = o(n^{-1/2} (\log n)^{1+\varepsilon}) \quad \text{a.s.} \tag{3.3}$$
In view of (1.1), by Lemma 1 of Zhou (1996) (showing that $\sup_{i:\, Z_i \le b} C(Z_i)/C_n(Z_i) = O(\log n)$ a.s.) and the SLLN, we get that for any $\varepsilon > 0$,
$$\sup_{a_W \le x \le b} |I_{n1}(x)| \le \sup_{a_W \le x \le b} \frac{|C_n(x) - C(x)|}{C^{1/2}(x)} \max_{i:\, Z_i \le b} \frac{C(Z_i)}{C_n(Z_i)} \int_{a_W}^{b} \frac{dW_n(u)}{C^{3/2}(u)} = o(n^{-1/2} (\log n)^{2+\varepsilon}) \quad \text{a.s.}$$
The process $I_{n2}(x)$ is an empirical process over a VC class of functions with square integrable envelope, so it satisfies the LIL (e.g., Alexander and Talagrand, 1989), i.e., its sup over $a_W \le x \le b$ is a.s. of the order of $((\log\log n)/n)^{1/2}$. Similarly,
$$\sup_{a_W \le x \le b} |I_{n3}(x)| \to 0 \quad \text{a.s.},$$
which completes the proof of (2.3). As in the proofs of Lemmas 1.6 and 1.7 of Stute (1993), we obtain
$$\sup_{a_W \le x \le b} |1 - \hat F_n(x) - \exp(-\hat\Lambda_n(x))| = O(n^{-1}) \quad \text{a.s.} \tag{3.4}$$
Now, (2.4) follows from the continuity of $\exp$ together with (3.4) and (2.3). This completes the proof of Theorem 3.

The following lemma gives a representation of $\hat\Lambda_n - \Lambda$, which easily leads to the corresponding weak limit.
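Lemma 1. Under the assumptions of Theorem 3, if $W$ is continuous and (A6) holds, then
$$\hat\Lambda_n(x) - \Lambda(x) = \int_{a_W}^{x} \frac{m(u; \theta_0)}{C(u)}\, d[W_n(u) - W^{*}(u)] + \int_{a_W}^{x} \frac{m(u; \theta_0)\,(C(u) - C_n(u))}{C^{2}(u)}\, dW^{*}(u) + n^{-1} \sum_{i=1}^{n} \frac{\delta_i - m(Z_i; \theta_0)}{m(Z_i; \theta_0)\,(1 - m(Z_i; \theta_0))} \int_{a_W}^{x} \frac{\alpha(u, Z_i)}{C(u)}\, dW^{*}(u) + o_p(n^{-1/2})$$
uniformly in $a_W \le x \le b$.

Proof. A straightforward calculation shows
$$\frac{m(u; \hat\theta_n)}{C_n(u)} = \frac{m(u; \theta_0)}{C(u)} + \frac{m(u; \theta_0)\,(C(u) - C_n(u))}{C^{2}(u)} + \frac{m(u; \hat\theta_n) - m(u; \theta_0)}{C(u)} + \frac{m(u; \hat\theta_n)\,(C_n(u) - C(u))^{2}}{C_n(u)\, C^{2}(u)} + \frac{m(u; \hat\theta_n) - m(u; \theta_0)}{C^{2}(u)}\,(C(u) - C_n(u)) := \sum_{i=1}^{5} L_{ni}(u). \tag{3.5}$$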
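The five-term decomposition (3.5) is a pointwise algebraic identity, so it can be sanity-checked numerically. The snippet below is only such a check, with hypothetical scalar inputs standing in for $m(u;\hat\theta_n)$, $m(u;\theta_0)$, $C_n(u)$ and $C(u)$ at a fixed $u$; the function name is an assumption of this sketch.

```python
from typing import List


def decomposition_terms(m_hat: float, m0: float, c_n: float, c: float) -> List[float]:
    """Return [L_n1, ..., L_n5] from (3.5) evaluated at a single point u."""
    return [
        m0 / c,                                   # L_n1
        m0 * (c - c_n) / c**2,                    # L_n2
        (m_hat - m0) / c,                         # L_n3
        m_hat * (c_n - c) ** 2 / (c_n * c**2),    # L_n4 (second order in C_n - C)
        (m_hat - m0) * (c - c_n) / c**2,          # L_n5 (product of two small terms)
    ]


# Hypothetical values of (m_hat, m0, C_n, C) at some fixed u.
cases = [(0.62, 0.60, 0.48, 0.50), (0.30, 0.35, 0.21, 0.20), (0.90, 0.88, 0.75, 0.80)]
for m_hat, m0, c_n, c in cases:
    lhs = m_hat / c_n
    rhs = sum(decomposition_terms(m_hat, m0, c_n, c))
    print(f"{lhs:.12f} {rhs:.12f}")
```

The grouping explains the remainder of the proof: $L_{n1}$ and $L_{n2}$ carry the nonparametric part, $L_{n3}$ carries the parametric estimation error, and $L_{n4}$, $L_{n5}$ are higher-order terms shown to be negligible.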
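Note that
$$\sup_{a_W \le x \le b} \left|\int_{a_W}^{x} L_{n5}(u)\, dW_n(u)\right| \le \|C_n - C\| \int_{a_W}^{b} \frac{|m(u; \hat\theta_n) - m(u; \theta_0)|}{C^{2}(u)}\, dW_n(u).$$
A Taylor expansion together with (A2) and (A6) yields
$$\int_{a_W}^{b} \frac{|m(u; \hat\theta_n) - m(u; \theta_0)|}{C^{2}(u)}\, dW_n(u) \le \sup_{a_W \le x \le b} \|\mathrm{Grad}(m(x; \theta_0))\| \cdot \|\hat\theta_n - \theta_0\| \int_{a_W}^{b} \frac{dW_n(u)}{C^{2}(u)} + \frac{k}{2} \|\hat\theta_n - \theta_0\|^{2} \int_{a_W}^{b} \frac{M(u)}{C^{2}(u)}\, dW_n(u) + o_p(1).$$
It follows from the Cauchy–Schwarz inequality and the SLLN that
$$\int_{a_W}^{b} \frac{M(u)}{C^{2}(u)}\, dW_n(u) \le \left(\int_{a_W}^{b} M^{2}(u)\, dW_n(u)\right)^{1/2} \left(\int_{a_W}^{b} \frac{dW_n(u)}{C^{4}(u)}\right)^{1/2} < \infty \quad \text{a.s.}$$
Since $C_n - C$ is the difference of two empirical processes, the Dvoretzky–Kiefer–Wolfowitz bound and the consistency of $\hat\theta_n$ imply
$$\sup_{a_W \le x \le b} \left|\int_{a_W}^{x} L_{n5}(u)\, dW_n(u)\right| = o_p(n^{-1/2}). \tag{3.6}$$
Likewise,
$$\sup_{a_W \le x \le b} \left|\int_{a_W}^{x} L_{n4}(u)\, dW_n(u)\right| = O_p(n^{-1}).$$
Similar to Lemma 3.5 of Dikta (1998), we have
$$m(x; \hat\theta_n) - m(x; \theta_0) = n^{-1} \sum_{i=1}^{n} \frac{\delta_i - m(Z_i; \theta_0)}{m(Z_i; \theta_0)\,(1 - m(Z_i; \theta_0))}\, \alpha(x, Z_i) + R_n(x), \tag{3.7}$$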
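where $\alpha(x, y) = (\mathrm{Grad}(m(x; \theta_0)))^{\top} I^{-1}(\theta_0)\, \mathrm{Grad}(m(y; \theta_0))$ and, uniformly for $a_W \le x \le b < b_W$, $|R_n(x)| \le M(x)\, O_p(n^{-1}) + o_p(n^{-1/2})$. Thus,
$$\sup_{a_W \le x \le b} \left|\int_{a_W}^{x} L_{n3}(u)\, dW_n(u) - n^{-1} \sum_{i=1}^{n} \frac{\delta_i - m(Z_i; \theta_0)}{m(Z_i; \theta_0)\,(1 - m(Z_i; \theta_0))} \int_{a_W}^{x} \frac{\alpha(u, Z_i)}{C(u)}\, dW_n(u)\right| = o_p(n^{-1/2}).$$
Simple algebra shows
$$n^{-1} \sum_{i=1}^{n} \frac{\delta_i - m(Z_i; \theta_0)}{m(Z_i; \theta_0)\,(1 - m(Z_i; \theta_0))} \int_{a_W}^{x} \frac{\alpha(u, Z_i)}{C(u)}\, dW_n(u) = U_{n1}(x) - U_{n2}(x) - n^{-1}(U_{n1}(x) - U_{n2}(x)) + n^{-1} r_n(x),$$
where
$$U_{n1}(x) = \frac{1}{n(n-1)} \sum_{1 \le i \ne j \le n} \frac{\delta_i\, \alpha(Z_j, Z_i)}{m(Z_i; \theta_0)\, C(Z_j)}\, I(Z_j \le x),$$
$$U_{n2}(x) = \frac{1}{n(n-1)} \sum_{1 \le i \ne j \le n} \frac{(1 - \delta_i)\, \alpha(Z_j, Z_i)}{(1 - m(Z_i; \theta_0))\, C(Z_j)}\, I(Z_j \le x)$$
and
$$r_n(x) = n^{-1} \sum_{i=1}^{n} \frac{\delta_i - m(Z_i; \theta_0)}{m(Z_i; \theta_0)\,(1 - m(Z_i; \theta_0))}\, \frac{\alpha(Z_i, Z_i)}{C(Z_i)}\, I(Z_i \le x).$$
Let $\tilde Z_i = \delta_i Z_i$. Since $W$ is continuous,
$$\frac{\delta_i\, \alpha(x, Z_i)}{m(Z_i; \theta_0)} = \frac{I(\tilde Z_i > 0)\, \alpha(x, \tilde Z_i)}{m(\tilde Z_i; \theta_0)} \quad \text{a.s.}$$
Therefore,
$$U_{n1}(x) = \frac{1}{n(n-1)} \sum_{1 \le i \ne j \le n} h_1(Z_j, \tilde Z_i)\, I(Z_j \le x)$$
is a U-statistic process as studied in Stute (1994) with kernel
$$h_1(x, y) = \frac{I(y > 0)\, \alpha(x, y)}{m(y; \theta_0)\, C(x)}\, I(x \le b).$$
It follows from (A4), (A6) and the Cauchy–Schwarz inequality that
$$\int_{a_W}^{\infty} \int_{a_W}^{b} h_1^{2}(x, y)\, dW^{*}(x)\, dW_1(y) < \infty.$$
Hence Theorem 1.5 of Stute (1994) gives
$$U_{n1}(x) = \int_{a_W}^{\infty} \int_{a_W}^{x} h_1(u, v)\, dW^{*}(u)\, dW_{1n}(v) + \int_{a_W}^{\infty} \int_{a_W}^{x} h_1(u, v)\, dW_n(u)\, dW_1(v) - \int_{a_W}^{\infty} \int_{a_W}^{x} h_1(u, v)\, dW^{*}(u)\, dW_1(v) + r_{n1}(x),$$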
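where
$$\sup_{a_W \le x \le b} |r_{n1}(x)| = O_p(n^{-1}).$$
Similarly, we have
$$U_{n2}(x) = \int_{a_W}^{\infty} \int_{a_W}^{x} h_2(u, v)\, dW^{*}(u)\, dW_{2n}(v) + \int_{a_W}^{\infty} \int_{a_W}^{x} h_2(u, v)\, dW_n(u)\, dW_2(v) - \int_{a_W}^{\infty} \int_{a_W}^{x} h_2(u, v)\, dW^{*}(u)\, dW_2(v) + r_{n2}(x),$$
where
$$h_2(x, y) = \frac{I(y > 0)\, \alpha(x, y)}{(1 - m(y; \theta_0))\, C(x)}\, I(x \le b),$$
$$W_2(x) = P(Z \le x, \delta = 0 \mid T \le Z), \qquad W_{2n}(x) = n^{-1} \sum_{i=1}^{n} I(Z_i \le x, \delta_i = 0)$$
and
$$\sup_{a_W \le x \le b} |r_{n2}(x)| = O_p(n^{-1}).$$
Thus,
$$U_{n1}(x) - U_{n2}(x) = n^{-1} \sum_{i=1}^{n} \frac{\delta_i - m(Z_i; \theta_0)}{m(Z_i; \theta_0)\,(1 - m(Z_i; \theta_0))} \int_{a_W}^{x} \frac{\alpha(u, Z_i)}{C(u)}\, dW^{*}(u) + O_p(n^{-1})$$
uniformly on $a_W \le x \le b$. It can be checked that
$$\sup_{a_W \le x \le b} |U_{n1}(x) - U_{n2}(x)| = O_p(1) \quad \text{and} \quad \sup_{a_W \le x \le b} |r_n(x)| = O_p(1).$$
Therefore,
$$\sup_{a_W \le x \le b} \left|\int_{a_W}^{x} L_{n3}(u)\, dW_n(u) - n^{-1} \sum_{i=1}^{n} \frac{\delta_i - m(Z_i; \theta_0)}{m(Z_i; \theta_0)\,(1 - m(Z_i; \theta_0))} \int_{a_W}^{x} \frac{\alpha(u, Z_i)}{C(u)}\, dW^{*}(u)\right| = o_p(n^{-1/2}). \tag{3.8}$$
In view of (1.1), similar to the proof of Theorem 5.1 of Arcones and Gine (1995), we obtain
$$\sup_{a_W \le x \le b} \left|\int_{a_W}^{x} L_{n2}(u)\, dW_n(u) - \int_{a_W}^{x} \frac{m(u; \theta_0)\,(C(u) - C_n(u))}{C^{2}(u)}\, dW^{*}(u)\right| = O(n^{-1} \log\log n) \quad \text{a.s.} \tag{3.9}$$
Putting together (3.5)–(3.9), we complete the proof of Lemma 1.

Proof of Theorem 4. Define
$$\beta_n(x) = n^{1/2}(W_n(x) - W^{*}(x)), \qquad \gamma_n(x) = n^{1/2}(C_n(x) - C(x))$$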
and
$$\eta_n(x) = n^{-1/2} \sum_{i=1}^{n} \frac{\delta_i - m(Z_i; \theta_0)}{m(Z_i; \theta_0)\,(1 - m(Z_i; \theta_0))}\, \alpha(x, Z_i).$$
It follows from Lemma 1 that
$$n^{1/2}(\hat\Lambda_n(x) - \Lambda(x)) = \psi_n(x) + o_p(1) \tag{3.10}$$
uniformly on $a_W \le x \le b$, where
$$\psi_n(x) = \int_{a_W}^{x} \frac{m(u; \theta_0)}{C(u)}\, d\beta_n(u) + \int_{a_W}^{x} \frac{\eta_n(u)}{C(u)}\, dW^{*}(u) - \int_{a_W}^{x} \frac{\gamma_n(u)}{C^{2}(u)}\, dW_1(u) := \psi_{n1}(x) + \psi_{n2}(x) + \psi_{n3}(x).$$
In view of (A7), arguments similar to those used in the proofs of Lemma 3.13 and Theorem 2.5 of Dikta (1998) imply that $n^{1/2}(\hat\Lambda_n(x) - \Lambda(x))$ converges weakly to a centered Gaussian process $\psi$. A straightforward calculation shows that the covariance structure of $\psi$ has the form given in (2.5), which completes the proof of part (i). Part (ii) is immediate from part (i) and (3.4). The proof of Theorem 4 is completed.

Proof of Corollary 5. It is well known that for $a_W \le x \le b$,
$$V_p(x) = (1 - F(x))^{2} \int_{a_W}^{x} \frac{dW_1(u)}{C^{2}(u)}.$$
Thus, $V_p(x) - V(x) = (1 - F(x))^{2} r(x)$, where $r(x)$ is defined in (2.6). Similar to the proof of Corollary 2.7 in Dikta (1998), we get $r(x) \ge 0$, which completes the proof of Corollary 5.

Further reading
Beirlant et al., 1992; Billingsley, 1968; Struthers and Farewell, 1989; Wang et al., 1987.

Acknowledgements
The authors are extremely grateful to referees for many valuable comments and suggestions. Special thanks are due to one of the referees whose careful correction of the language greatly improved the presentation of the manuscript.

References
Abdushukurov, A.A., 1987. Nonparametric estimation in the proportional hazards model of random censorship. Akad. Nauk Uz Tashkent. VINITI No. 3448-V (in Russian).
Alexander, K., Talagrand, M., 1989. The law of the iterated logarithm for empirical processes on Vapnik–Cervonenkis classes. J. Multivariate Anal. 30, 155–166.
Arcones, M.A., Gine, E., 1995. On the law of the iterated logarithm for canonical U-statistics and processes. Stochastic Process. Appl. 58, 217–245.
Beirlant, J., Carbonez, A., van der Meulen, E., 1992. Long run proportional hazards models of random censorship. J. Statist. Plann. Inference 32, 25–44.
Billingsley, P., 1968. Convergence of Probability Measures. Wiley, New York.
Cheng, P.E., Lin, G.D., 1987. Maximum likelihood estimation of a survival function under the Koziol–Green proportional hazards model. Statist. Probab. Lett. 5, 75–80.
Cox, D.R., Snell, E.J., 1989. Analysis of Binary Data, 2nd Edition. Chapman & Hall, London.
Csaki, E., 1975. Some notes on the law of the iterated logarithm for empirical distribution function. In: Revesz, P. (Ed.), Colloq. Math. Soc. Janos Bolyai 11, Limit Theorems of Probability Theory. North-Holland, Amsterdam, pp. 45–58.
Dikta, G., 1998. On semiparametric random censorship models. J. Statist. Plann. Inference 66, 253–279.
Gijbels, I., Wang, J.L., 1993. Strong representations of the survival function estimator for truncated and censored data with applications. J. Multivariate Anal. 47, 210–229.
Gu, M.G., Lai, T.L., 1990. Functional laws of the iterated logarithm for the product-limit estimator of a distribution function under random censorship or truncation. Ann. Probab. 18, 160–189.
Lai, T.L., Ying, Z., 1991. Estimating a distribution function with truncated and censored data. Ann. Statist. 19, 417–442.
Perlman, M.D., 1972. On the strong consistency of approximate maximum likelihood estimates. Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1. University of California Press, Berkeley, CA, pp. 263–281.
Struthers, C.A., Farewell, V.T., 1989. A mixture model for times to AIDS data with left truncation and an uncertain origin. Biometrika 76, 814–817.
Stute, W., 1992. Strong consistency of the MLE under random censorship. Metrika 39, 257–267.
Stute, W., 1993. Almost sure representations of the product-limit estimator for truncated data. Ann. Statist. 21, 146–156.
Stute, W., 1994. U-statistic processes: a martingale approach. Ann. Probab. 22, 1725–1744.
Stute, W., Wang, J.L., 1993. The strong law under random censorship. Ann. Statist. 21, 1591–1607.
Tsai, W.Y., Jewell, N.P., Wang, M.C., 1987. A note on the product limit estimator under right censoring and left truncation. Biometrika 74, 883–886.
Wang, M.C., 1989. A semiparametric model for randomly truncated data. J. Amer. Statist. Assoc. 84, 742–748.
Wang, M.C., Jewell, N.P., Tsai, W.Y., 1987. Asymptotic properties of the product limit estimate under random truncation. Ann. Statist. 14, 1597–1605.
Zhou, Y., 1996. A note on the TJW product-limit estimator for truncated and censored data. Statist. Probab. Lett. 12, 381–387.