Statistics & Probability Letters 6 (1988) 419-426 North-Holland
May 1988
DENSITY ESTIMATION IN THE SIMPLE PROPORTIONAL HAZARDS MODEL
Sándor CSÖRGŐ *
Bolyai Institute, Szeged University, Szeged, Hungary
Jan MIELNICZUK
Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
Received April 1987
Revised September 1987
Abstract: In the simple proportional hazards model of random right censorship, the limiting variance $v^2_{ACL}(x)$ at $x$ of the kernel density estimator based on the Abdushukurov-Cheng-Lin estimator is shown to be equal to the corresponding variance pertaining to the Kaplan-Meier estimator times the expected proportion $p$ of uncensored observations. More surprisingly, for appropriate $p$, $v^2_{ACL}(x)$ is smaller than the asymptotic variance of the classical kernel estimator based on a complete sample, for any $x$ below the $(1 - e^{-1})$-quantile.
Keywords: random censorship, proportional hazards, density estimation, kernel, nearest neighbor, small asymptotic variance.
1. Introduction
Let $X_1, \ldots, X_n$ be independent real random variables with a distribution function $F$. These are censored on the right by the independent random variables $Y_1, \ldots, Y_n$ with distribution function $G$, so that the observations available are the pairs $(Z_k, \delta_k)$, where $Z_k = \min(X_k, Y_k)$ and $\delta_k = I\{Z_k = X_k\}$, $1 \le k \le n$, and where $I\{\cdot\}$ is the indicator function. In the usual general model of random censorship the only assumption is that the censored $X$ sequence and the censoring $Y$ sequence are independent and hence $1 - H(x) = P\{Z > x\} = (1 - F(x))(1 - G(x))$, $x \in \mathbb{R}$. The enormous literature on this model is centered around the celebrated Kaplan-Meier product-limit estimator $\hat{F}_n$ of $F$ defined as
$$1 - \hat{F}_n(x) = \prod_{i=1}^{j_n(x)} \left(\frac{n-i}{n-i+1}\right)^{\delta_{i,n}} \quad \text{if } x < Z_{n,n},$$
and $1 - \hat{F}_n(x) = 0$ if $x \ge Z_{n,n}$, where $Z_{1,n} \le \cdots \le Z_{n,n}$ are the order statistics of $Z_1, \ldots, Z_n$, and $\delta_{1,n}, \ldots, \delta_{n,n}$ denote the concomitant labelling of $\delta_1, \ldots, \delta_n$, and $j_n(t)$ is the value of $j$ such that $Z_{j,n} \le t < Z_{j+1,n}$.
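The product-limit formula above can be sketched in a few lines. The following is only an illustrative implementation under stated assumptions: the function name `kaplan_meier_survival` and the tie-breaking rule are not from the paper, and ties have probability zero when $F$ and $G$ are continuous.

```python
def kaplan_meier_survival(z, delta, x):
    """Product-limit estimate of 1 - F(x) from pairs (z[k], delta[k])."""
    n = len(z)
    # Order the pairs by observed time; delta travels along as the
    # concomitant label. If ties occur, uncensored points are placed
    # first (an assumption, not part of the text).
    pairs = sorted(zip(z, delta), key=lambda pair: (pair[0], -pair[1]))
    if x >= pairs[-1][0]:
        return 0.0  # convention of the text: 0 at or beyond Z_{n,n}
    survival = 1.0
    for i, (z_i, d_i) in enumerate(pairs, start=1):
        if z_i > x:
            break  # only indices i <= j_n(x) enter the product
        if d_i == 1:
            survival *= (n - i) / (n - i + 1)
    return survival


# Small hypothetical sample: the second observation is censored.
z = [1.0, 2.0, 3.0, 4.0]
delta = [1, 0, 1, 1]
print(kaplan_meier_survival(z, delta, 2.5))
```

Only uncensored order statistics contribute a factor, which is what the exponent $\delta_{i,n}$ expresses.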
* Supported by the Hungarian National Foundation for Scientific Research, Grant No. 1808.
0167-7152/88/$3.50 © 1988, Elsevier Science Publishers B.V. (North-Holland)
Statistics & Probability Letters, Volume 6, Number 6, May 1988

The simple proportional hazards model is an appealing special nonparametric-parametric model in which there exists a positive constant $c$, the censoring parameter, such that
$$1 - G(x) = (1 - F(x))^c, \quad x \in \mathbb{R}, \tag{1.1}$$
so that the hazard function of $G$ is $c$ times the hazard function of $F$. This model has been very frequently used for an easier and heuristically more clearly interpretable description of the behavior of the product-limit estimator and other estimators related to it. The presently available literature concerning the model is surveyed and discussed at length in S. Csörgő (1988). Under (1.1), the expected proportion $p = P\{\delta = 1\}$ of uncensored observations is $p = (1 + c)^{-1}$ and
$$1 - F(x) = (1 - H(x))^p, \quad x \in \mathbb{R}. \tag{1.2}$$
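The relations $p = (1 + c)^{-1}$ and (1.2) can be checked on simulated data. The sketch below assumes, purely for illustration, exponential lifetimes with rate 1 and exponential censoring times with rate $c$, which satisfy (1.1); the function name `simulate_ph_sample`, the seed and the sample size are hypothetical choices.

```python
import math
import random


def simulate_ph_sample(n, c, seed=0):
    """Draw (Z_k, delta_k), k = 1..n, with exponential X and Y."""
    rng = random.Random(seed)
    z, delta = [], []
    for _ in range(n):
        x = rng.expovariate(1.0)  # 1 - F(x) = exp(-x)
        y = rng.expovariate(c)    # 1 - G(x) = exp(-c x) = (1 - F(x))^c
        z.append(min(x, y))
        delta.append(1 if x <= y else 0)
    return z, delta


c = 1.5
p = 1.0 / (1.0 + c)               # expected proportion of uncensored points
z, delta = simulate_ph_sample(20000, c)
p_n = sum(delta) / len(delta)     # observed proportion

# Empirical check of (1.2) at one point x0, using the true p.
x0 = 0.5
h_n = sum(1 for z_k in z if z_k <= x0) / len(z)
print(p, p_n)
print(math.exp(-x0), (1.0 - h_n) ** p)
```

In both printed pairs the second number should be close to the first, up to sampling error of order $n^{-1/2}$.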
The turning point is Abdushukurov (1984) and Cheng and Lin (1984, 1987), who have independently pointed out that the natural (sufficient and maximum likelihood) estimator $F_n$ of $F$ defined by
$$1 - F_n(x) = (1 - H_n(x))^{p_n}, \quad x \in \mathbb{R}, \tag{1.3}$$
where $H_n(x) = n^{-1}\#\{1 \le k \le n: Z_k \le x\}$ and $p_n = n^{-1}\sum_{k=1}^n \delta_k$, is to be preferred to $\hat{F}_n$ under (1.1). It is of course of a much simpler structure, and the variance function of the mean-zero limiting Gaussian process of $n^{1/2}\{F_n(\cdot) - F(\cdot)\}$ is uniformly strictly smaller than that of $n^{1/2}\{\hat{F}_n(\cdot) - F(\cdot)\}$. (See also Cheng and Chang (1987) and S. Csörgő (1988) for a thorough discussion and Section 3 below.)

Suppose now that $F$ has a density function $f$ with respect to the Lebesgue measure on $\mathbb{R}$. From now on we assume that we are in the proportional hazards model (1.1). (For testing for this model see S. Csörgő (1989).) In this model the natural kernel-type estimator of $f(x)$ is
$$f_n(x) = \frac{1}{b_n}\int K\left(\frac{x-y}{b_n}\right) dF_n(y) = \frac{p_n}{n b_n}\sum_{i=1}^n K\left(\frac{x - Z_i}{b_n}\right)(1 - H_n(Z_i))^{p_n - 1}, \tag{1.4}$$
where $K(\cdot)$ is a given kernel density function and $b_n$ is a sequence of positive bandwidth numbers tending to zero as $n \to \infty$. Also, the natural nearest neighbor (NN) estimator for $f(x)$ is
$$f_n^*(x) = \frac{1}{R_n(x)}\int K\left(\frac{x-y}{R_n(x)}\right) dF_n(y) = \frac{p_n}{n R_n(x)}\sum_{i=1}^n K\left(\frac{x - Z_i}{R_n(x)}\right)(1 - H_n(Z_i))^{p_n - 1}, \tag{1.5}$$
where $R_n(x)/2$ is the random distance from $x$ to its $k_n$-th nearest neighbor in the sequence $Z_1, \ldots, Z_n$, where $k_n$ is a sequence of positive integers such that $k_n = o(n)$ as $n \to \infty$. Of course, the way in which these estimators arise is that the ACL estimator $F_n$ replaces the common empirical distribution function, used when the $X$-sample is fully observable, or the KM product-limit estimator $\hat{F}_n$, used in the general censorship model. For properties of the corresponding estimators $\hat{f}_n$ and $\hat{f}_n^*$ in the latter case we refer to Mielniczuk (1986, 1987) and the references therein.

The aims of the present paper are to study the asymptotic properties of the estimators $f_n$ and $f_n^*$ introduced in (1.4) and (1.5) and to compare these properties with the corresponding properties of $\hat{f}_n$ and $\hat{f}_n^*$ and of the ones with a fully observable $X$-sample. These are done in Sections 2 and 3, respectively. Quite unexpectedly it turns out that for some proportion $p < 1$, the asymptotic variance of $f_n$ is smaller than that of its classical analogue $f_n^0$ with fully observable $X$'s, uniformly on a large part of the support of $F$.

One can also consider the natural estimator $\tilde{f}_n(x) = p_n(1 - H_n(x))^{p_n - 1}h_n(x)$ of $f$, directly based on (1.2), where $h_n(x)$ is any estimator of $h(x) = H'(x)$ based on $Z_1, \ldots, Z_n$. The asymptotic results for $\tilde{f}_n$ can be easily derived from those for $h_n$. Of course, it turns out that $\tilde{f}_n$ is asymptotically equivalent to $f_n$ if $h_n$ is a kernel estimator and to $f_n^*$ if $h_n$ is a nearest neighbor estimator. However, contrary to the cases of $f_n$ and $f_n^*$, $\tilde{f}_n$ is always discontinuous.

At the time of the revision of this paper (August, 1987) the authors received Abdushukurov (1987). He investigates $f_n$ and $\tilde{f}_n$, with a kernel estimator $h_n$, and the corresponding hazard rate estimators. In the latter relation, see also Cheng (1987).
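To make the ACL estimator (1.3) and the kernel estimator (1.4) concrete, here is a minimal sketch. It rests on assumptions that are not in the paper: the Epanechnikov kernel is just one admissible choice, the function names are hypothetical, and $H_n$ is evaluated just to the left of each $Z_i$ so that the largest observation does not produce a zero base raised to the negative power $p_n - 1$. This changes each weight only by a term of order $n^{-1}$.

```python
import bisect


def epanechnikov(u):
    """A compactly supported kernel density on [-1, 1] (illustrative choice)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def acl_survival(z, delta, x):
    """ACL estimate of 1 - F(x), i.e. (1 - H_n(x)) ** p_n as in (1.3)."""
    n = len(z)
    h_n = bisect.bisect_right(sorted(z), x) / n  # H_n(x) = #{k: Z_k <= x} / n
    p_n = sum(delta) / n                         # proportion of uncensored points
    return (1.0 - h_n) ** p_n


def acl_kernel_density(z, delta, x, b, kernel=epanechnikov):
    """Kernel estimate in the spirit of (1.4) with bandwidth b > 0."""
    n = len(z)
    p_n = sum(delta) / n
    total = 0.0
    for rank, z_i in enumerate(sorted(z), start=1):
        weight = kernel((x - z_i) / b)
        if weight == 0.0:
            continue
        # Assumption (not in the text): H_n is taken just to the left of
        # Z_i, i.e. (rank - 1) / n, so the base below is never zero.
        h_left = (rank - 1) / n
        total += weight * (1.0 - h_left) ** (p_n - 1.0)
    return p_n * total / (n * b)


print(acl_survival([1.0, 2.0, 3.0, 4.0], [1, 0, 1, 1], 2.5))
print(acl_kernel_density([0.0, 1.0], [1, 0], 0.5, 1.0))
```

When no observation is censored, $p_n = 1$ and the weights $(1 - H_n)^{p_n - 1}$ equal one, so the sketch reduces to the classical kernel estimator.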
2. Results
Set $T_F = \inf\{x: F(x) = 1\} \le \infty$. Theorem 1 below summarizes our present knowledge about $f_n$. For the sake of simplicity of formulation, in the asymptotic confidence band result of part (iii) we assume that the support of the kernel density $K(\cdot)$ is contained in $[-1, 1]$. In preparation for this statement, we introduce the notation
$$K_0 = \int K^2(t)\,dt, \quad K_1 = \frac{K^2(1) + K^2(-1)}{2K_0}, \quad K_2 = \frac{\int (K'(t))^2\,dt}{2K_0}, \quad c_n = \frac{b - a}{b_n},$$
$$d_n = \begin{cases} (2\log c_n)^{1/2} + (2\log c_n)^{-1/2}\left(2^{-1}\log\log c_n + \log(K_1\pi^{-1/2})\right), & K_1 > 0, \\ (2\log c_n)^{1/2} + (2\log c_n)^{-1/2}\log\left(\pi^{-1}2^{-1/2}K_2^{1/2}\right), & K_1 = 0, \end{cases}$$
where $K'(\cdot)$ is the derivative of $K(\cdot)$, and
$$Q_n(x) = \frac{(nb_n)^{1/2}|f_n(x) - f(x)|}{\left(p_n(1 - H_n(x))^{p_n - 1}f(x)\int K^2(y)\,dy\right)^{1/2}}.$$
Also, convergence and order relations are meant as $n \to \infty$ if not specified otherwise, $\to_{\mathcal{D}}$ denotes convergence in distribution, "a.s." stands for "almost surely" and $N(m, v^2)$ is a normal random variable with mean $m$ and variance $v^2$.

Theorem 1. (i) If $K$ is bounded on a compact support, $b_n$ is such that $\sum_{n=1}^\infty \exp(-a n b_n) < \infty$ for every $a > 0$, and $x < T_F$ is a Lebesgue point of $f$, then $f_n(x) \to f(x)$ a.s.
(ii) If, in addition to the conditions in (i), $K$ is an even function, $b_n = o(n^{-1/5})$, $f$ has a second derivative which is bounded in a neighborhood of $x$ and $f(x) > 0$, then
$$(nb_n)^{1/2}(f_n(x) - f(x)) \to_{\mathcal{D}} N\left(0,\; p(1 - H(x))^{p-1}f(x)\int K^2(y)\,dy\right). \tag{2.1}$$
(iii) If $K$ is an even, differentiable function with support contained in $[-1, 1]$ such that $K'$ is bounded, $b_n = n^{-\alpha}$ with $1/5 < \alpha < 1/2$, $f$ is positive on $[a - \varepsilon, b + \varepsilon]$ and the second derivative $f''$ of $f$ is bounded on $[a - \varepsilon, b + \varepsilon]$, where $-\infty < a < b < T_F$ and $\varepsilon > 0$ is arbitrarily small, then for each $t \in \mathbb{R}$,
$$P\left\{(2\log c_n)^{1/2}\left(\sup_{a \le x \le b} Q_n(x) - d_n\right) \le t\right\} \to \exp(-2e^{-t}). \tag{2.2}$$
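The constants $K_0$, $K_1$, $K_2$, $c_n$ and $d_n$ entering the confidence band statement can be evaluated numerically for a given kernel. The sketch below is an illustration only: the Epanechnikov kernel, the midpoint integration rule and all function names are assumptions of this example, and the kernel is taken to be supported on $[-1, 1]$ as in part (iii).

```python
import math


def epanechnikov(u):
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def epanechnikov_deriv(u):
    return -1.5 * u if abs(u) <= 1.0 else 0.0


def integrate(func, lo, hi, steps=20000):
    """Midpoint rule; adequate for smooth integrands on a compact interval."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (k + 0.5) * width) for k in range(steps))


def band_constants(kernel, kernel_deriv, a, b, b_n):
    """Return (K_0, K_1, K_2, c_n, d_n); requires c_n = (b - a) / b_n > 1."""
    k0 = integrate(lambda t: kernel(t) ** 2, -1.0, 1.0)
    k1 = (kernel(1.0) ** 2 + kernel(-1.0) ** 2) / (2.0 * k0)
    k2 = integrate(lambda t: kernel_deriv(t) ** 2, -1.0, 1.0) / (2.0 * k0)
    c_n = (b - a) / b_n
    root = math.sqrt(2.0 * math.log(c_n))
    if k1 > 0.0:
        # Kernel does not vanish at the endpoints of its support.
        d_n = root + (0.5 * math.log(math.log(c_n))
                      + math.log(k1 / math.sqrt(math.pi))) / root
    else:
        # Kernel vanishes at +-1, as the Epanechnikov kernel does.
        d_n = root + math.log(math.sqrt(k2 / 2.0) / math.pi) / root
    return k0, k1, k2, c_n, d_n


print(band_constants(epanechnikov, epanechnikov_deriv, 0.0, 1.0, 0.01))
```

For this kernel $K(\pm 1) = 0$, so $K_1 = 0$ and the second branch of $d_n$ applies.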
Proof. (i) Since the support of $K$ is contained in some finite interval $[-a, a]$, it will be enough to deal with only those terms in the sum in (1.4) for which $x - ab_n \le Z_i \le x + ab_n$. Introduce the intermediate approximation
$$f_n^{(1)}(x) = \frac{p_n}{nb_n}\sum_{i=1}^n K\left(\frac{x - Z_i}{b_n}\right)(1 - H(Z_i))^{p-1}.$$
A routine application of the mean value theorem, the Glivenko-Cantelli theorem for $H_n - H$ and the inequality $|\log u| < u^{-1}$, $0 < u < 1$, show that, for any $t \in [x - ab_n, x + ab_n]$,
$$\left|(1 - H_n(t))^{p_n - 1} - (1 - H(t))^{p-1}\right| \le A_n(x)$$
a.s. for all $n$ sufficiently large, where
$$A_n(x) = (1 - H(x + ab_n))^{-2}\left\{4\sup_{x - ab_n \le y \le x + ab_n}|H_n(y) - H(y)| + |p_n - p|\right\}.$$
Therefore $|f_n(x) - f_n^{(1)}(x)| \le p_n A_n(x) h_n(x)$ a.s. for all $n$ large enough, where, by Devroye and Wagner (1979),
$$h_n(x) = \frac{1}{nb_n}\sum_{i=1}^n K\left(\frac{x - Z_i}{b_n}\right) \to h(x) \quad \text{a.s.}, \tag{2.3}$$
where $h(x) = H'(x)$, for which by (1.2) we have
$$f(x) = p(1 - H(x))^{p-1}h(x). \tag{2.4}$$
The Glivenko-Cantelli theorem and the strong law of large numbers hence imply that $f_n(x) - f_n^{(1)}(x) \to 0$ a.s. Now if
$$f_n^{(2)}(x) = \frac{p}{nb_n}\sum_{i=1}^n K\left(\frac{x - Z_i}{b_n}\right)(1 - H(Z_i))^{p-1},$$
then, since $(1 - H(Z_i))^{p-1} \ge (1 - H(x + ab_n))^{p-1} \ge c > 0$ for some $c$ for all large enough $n$, we see that $f_n^{(1)}(x) - f_n^{(2)}(x) \to 0$ a.s. Finally, if $f_n^{(3)}(x) = p(1 - H(x))^{p-1}h_n(x)$, where $h_n(x)$ is as in (2.3), then another application of the mean value theorem gives that $|f_n^{(2)}(x) - f_n^{(3)}(x)| \le p(1-p)B_n(x)h_n(x)$, where
$$B_n(x) = (1 - H(x + ab_n))^{p-2}\sup_{x - ab_n \le y \le x + ab_n}|H(y) - H(x)| \to 0 \tag{2.5}$$
by the continuity of $H$ at $x$. Hence (2.3) implies $f_n^{(2)}(x) - f_n^{(3)}(x) \to 0$ a.s., and the proof is complete.

(ii) The proof of this part goes through the same steps as the proof of part (i). Noting first that the condition $b_n = o(n^{-1/5})$ implies $b_n\log\log n \to 0$, we see that the Smirnov-Chung and the Hinchin laws of the iterated logarithm, applied respectively to $H_n - H$ and $p_n - p$, ensure that $(nb_n)^{1/2}|f_n(x) - f_n^{(1)}(x)| \to 0$ and $(nb_n)^{1/2}|f_n^{(1)}(x) - f_n^{(2)}(x)| \to 0$ a.s. Reasoning similarly to Section 2 of Parzen (1962), we see that
$$(nb_n)^{1/2}\left(f_n^{(2)}(x) - Ef_n^{(2)}(x)\right) \to_{\mathcal{D}} N\left(0,\; p(1 - H(x))^{p-1}f(x)\int K^2(y)\,dy\right), \tag{2.6}$$
where, in view of (2.4), we have
$$Ef_n^{(2)}(x) = \frac{1}{b_n}\int K\left(\frac{x-y}{b_n}\right)p(1 - H(y))^{p-1}h(y)\,dy = \frac{1}{b_n}\int K\left(\frac{x-y}{b_n}\right)f(y)\,dy.$$
Applying now a two-term Taylor expansion, using the conditions stipulated on $F$ and that $K$ is an even function, we find that $Ef_n^{(2)}(x) - f(x) = O(b_n^2)$.
This implies the claimed result on account of the condition $nb_n^5 \to 0$.

(iii) It is easy to see that it is enough to show the statement for $\bar{Q}_n(x)$ replacing $Q_n(x)$, where $\bar{Q}_n(x)$ is obtained from $Q_n(x)$ upon replacing $p_n$ by $p$ and $H_n$ by $H$. Then again, following the lines of the proof of part (i), it can be checked that
$$(2\log c_n)^{1/2}\sup_{a \le x \le b}|\bar{Q}_n(x) - Q_n^0(x)| \to 0 \quad \text{a.s.},$$
where
$$Q_n^0(x) = \frac{(nb_n)^{1/2}|h_n(x) - h(x)|}{\left(h(x)\int K^2(y)\,dy\right)^{1/2}}.$$
The result now follows by the Corollary of Bickel and Rosenblatt (1973).

The analogous summary for the NN estimator is the following.

Theorem 2. (i) If the support of $K$ is contained in $[-\frac{1}{2}, \frac{1}{2}]$ and $K(\lambda u) \ge K(u)$ for any $u$ in that support and any $0 \le \lambda \le 1$, $k_n$ is such that $\sum_{n=1}^\infty \exp(-ak_n) < \infty$ for every $a > 0$, and $x < T_F$ is a Lebesgue point of $f$ for which $f(x) > 0$, then $f_n^*(x) \to f(x)$ a.s.
(ii) If, in addition to the conditions in (i), $K$ is even, $k_n = o(n^{2/3})$, and $F$ has a second derivative bounded in a neighbourhood of $x$, then
$$k_n^{1/2}(f_n^*(x) - f(x)) \to_{\mathcal{D}} N\left(0,\; f^2(x)\int K^2(y)\,dy\right). \tag{2.7}$$
(iii) If $K$ is even with support in $[-1, 1]$ and a bounded second derivative in $(-1, 1)$, $k_n = [n^{\alpha}]$ with […] $< \alpha <$ […], and $f$ is as in part (iii) of Theorem 1, then (2.2) holds true with $c_n$ and $d_n$ redefined by setting $c_n = n(H(b) - H(a))/k_n$ and $Q_n(x)$ replaced by
$$Q_n^*(x) = \frac{k_n^{1/2}|f_n^*(x) - f(x)|}{\left(f^2(x)\int K^2(y)\,dy\right)^{1/2}}.$$
Proof. This is completely analogous to that of Theorem 1. The only real change is to use the condition $R_n(x) \to 0$ a.s. instead of $b_n \to 0$, which follows from the fact that $x$ belongs to the support of the measure induced by $H$ and the condition $k_n/n \to 0$. In part (i) the strong consistency of $h_n^*(x)$, defined by replacing $b_n$ by $R_n(x)$ in $h_n(x)$, follows from Devroye and Wagner (1979) through the Moore and Yackel (1977) equivalence theorem. To obtain a statement analogous to (2.5) in the proof of part (ii) it is enough to notice that $k_n^{1/2}R_n(x) = O(k_n^{3/2}n^{-1}) \to 0$ a.s. in view of the fact that $k_n(R_n(x)n)^{-1} \to h(x) > 0$ a.s., and that
$$\frac{p}{nR_n(x)}\sum_{i=1}^n K\left(\frac{x - Z_i}{R_n(x)}\right)\left\{(1 - H(Z_i))^{p-1} - (1 - H(x))^{p-1}\right\} = O(R_n(x)h_n^*(x)).$$
In the last step we use the results of Moore and Yackel (1976) on the asymptotic normality of NN estimators. Part (iii) is proved analogously to part (iii) of Theorem 1 using the result of M. Csörgő and Révész (1982).
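The nearest neighbor estimator (1.5) treated in Theorem 2 differs from the kernel estimator only in that the data-driven width $R_n(x)$ replaces $b_n$. The following sketch is illustrative and makes several assumptions not found in the paper: the uniform kernel on $[-\frac{1}{2}, \frac{1}{2}]$ is one kernel satisfying the support and monotonicity conditions of part (i), the function names are hypothetical, and, as a safeguard against a zero base with a negative exponent, $H_n$ is evaluated just to the left of each $Z_i$.

```python
def uniform_kernel(u):
    """Uniform density on [-1/2, 1/2]; K(lambda * u) >= K(u) for 0 <= lambda <= 1."""
    return 1.0 if abs(u) <= 0.5 else 0.0


def acl_nn_density(z, delta, x, k, kernel=uniform_kernel):
    """NN estimate in the spirit of (1.5) with k nearest neighbors (1 <= k <= n)."""
    n = len(z)
    p_n = sum(delta) / n
    # R_n(x) / 2 is the distance from x to its k-th nearest neighbor among
    # the Z's; the sketch assumes this distance is positive.
    r_n = 2.0 * sorted(abs(x - z_i) for z_i in z)[k - 1]
    total = 0.0
    for rank, z_i in enumerate(sorted(z), start=1):
        weight = kernel((x - z_i) / r_n)
        if weight == 0.0:
            continue
        # Assumption (not in the text): left limit (rank - 1) / n of H_n.
        h_left = (rank - 1) / n
        total += weight * (1.0 - h_left) ** (p_n - 1.0)
    return p_n * total / (n * r_n)


print(acl_nn_density([0.0, 1.0, 2.0, 3.0], [1, 1, 1, 1], 1.2, 2))
```

With no censoring and the uniform kernel, the sketch returns $k/(nR_n(x))$, the classical NN density estimate.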
3. Discussion

3.1. The first point is to compare the limiting variances arising in (2.1) and (2.7) with the corresponding variances for the estimators based on the KM product-limit estimator $\hat{F}_n$. Note first that if $\sigma^2_{ACL}(x)$ and $\sigma^2_{KM}(x)$ denote the variance functions of the limiting Gaussian processes of $n^{1/2}\{F_n(\cdot) - F(\cdot)\}$ and $n^{1/2}\{\hat{F}_n(\cdot) - F(\cdot)\}$ at $x < T_F$, respectively, then (cf. Abdushukurov (1984), Cheng and Lin (1984, 1987), or Cheng and Chang (1985), S. Csörgő (1987)) under (1.1) we have
$$\frac{\sigma^2_{ACL}(x)}{\sigma^2_{KM}(x)} = p + (1 - p)\frac{1 - H(x)}{H(x)}\log^2(1 - H(x)) < 1. \tag{3.1}$$
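The ratio in (3.1) depends on $x$ only through $H(x)$, so it is easy to tabulate. The helper below is a small illustrative sketch; the function name and the grid of values are choices of this example.

```python
import math


def variance_ratio_acl_km(h, p):
    """Right-hand side of (3.1) for H(x) = h in (0, 1) and proportion p."""
    return p + (1.0 - p) * (1.0 - h) / h * math.log(1.0 - h) ** 2


for h in (0.01, 0.25, 0.5, 0.75, 0.99):
    print(h, round(variance_ratio_acl_km(h, p=0.5), 4))
```

On such a grid the ratio stays strictly between $p$ and 1 and comes close to $p$ when $H(x)$ is near 0 or near 1.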
The smallest value $p$ of this ratio is obtained only as $x \downarrow t_F = \sup\{x: F(x) = 0\}$ or $x \uparrow T_F$. Now if we define $\hat{f}_n^*(x)$ by putting $\hat{F}_n$ in (1.5) in place of $F_n$ (with the slight modification that $R_n(x)/2$ is the distance from $x$ to its $k_n$-th nearest uncensored neighbor among $Z_1, \ldots, Z_n$), then Mielniczuk (1986) proves that under the same conditions as in Theorem 2(ii) we have (2.7) for $\hat{f}_n^*(x)$ replacing $f_n^*(x)$. This means that the price that we pay for the otherwise nice property of the NN estimators that their asymptotic variance does not depend on censoring is that the variance reduction appearing in (3.1) is ruined. The NN notion does not seem to be capable of using prior information about the censoring mechanism.

On the other hand, for $f_n(x)$ one can expect a reduction in the limiting variance relative to $\hat{f}_n(x)$. An explanation of this phenomenon is the following. Define $\hat{f}_n(x)$ by plugging $\hat{F}_n$ in (1.4) in place of $F_n$ and consider also the estimator
$$t_n(x) = (nb_n)^{-1}\sum_{i=1}^n \delta_i K((x - Z_i)/b_n)(1 - H_n(Z_i))^{p_n - 1}.$$
Since the model (1.1) is characterized by the independence of the $Z$'s and the $\delta$'s (cf. Sethuraman (1965)), we have $Et_n(x) = Ef_n(x)$ and it can be seen that the asymptotic variance of $t_n(x)$ is the same as that of $\hat{f}_n(x)$. Now our $f_n(x)$ is obtained from $t_n(x)$ by averaging out the $\delta_i$'s from the sum and this is what results in an asymptotically smaller variance.

The precise result is the following. Mielniczuk (1986) proves that, under the same conditions as in Theorem 1(ii), $(nb_n)^{1/2}(\hat{f}_n(x) - f(x)) \to_{\mathcal{D}} N(0, v^2_{KM}(x))$, where $v^2_{KM}(x) = f(x)(1 - G(x))^{-1}\int K^2(y)\,dy$, already under the general censorship model. Now if $v^2_{ACL}(x)$ denotes the asymptotic variance appearing in (2.1), then presently, under (1.1), we have $v^2_{ACL}(x)/v^2_{KM}(x) = p$, regardless of $x < T_F$. Thus, comparing with (3.1), we see that the asymptotic gain of using ACL-based kernel density estimators in the proportional hazards model rather than KM-based kernel estimators is uniformly greater than when estimating a survival function. The heavier the censorship, the larger this gain, in the clearest possible sense. Also, the kernel notion is sensitive to the censoring mechanism and can utilize prior information about it. We note that the optimal variance ratio $p$ has also been obtained by Cheng (1987) for the estimation of the hazard rate function (cf. also Abdushukurov (1987)).

Of course we neglect here the aspect of bias in finite samples. This is of necessity because not enough is known presently about the comparison of the corresponding mean square errors. See, however, Theorem 3 below.

3.2. Let us pretend now that even though we have a censored sample obtained under (1.1), by some miraculous happening we can have a glance into the unknown and get hold of the full sample $X_1, \ldots, X_n$ with sample distribution function $F_n^0$. Substituting the latter for $F_n$ in (1.4) and (1.5), we obtain the classic estimates $f_n^0(x)$ and $f_n^{0*}(x)$, respectively. Nothing changes with $f_n^{0*}$ because by the Moore and Yackel (1976) theorem we have (2.7) for $f_n^{0*}(x)$ just as well as for $f_n^*(x)$. This is of no surprise in view of the above. On the other hand, under the same conditions as in Theorem 1(ii), by (2.6) we have $(nb_n)^{1/2}(f_n^0(x) - f(x)) \to_{\mathcal{D}} N(0, v_0^2(x))$, where $v_0^2(x) = f(x)\int K^2(y)\,dy$. Thus
$$R^r(p, x) = v^2_{ACL}(x)/v_0^2(x) = p(1 - F(x))^{(p-1)/p}, \tag{3.2}$$
where the superscript $r$ refers to right censoring. Whence $R^r(p, x) < 1$ if and only if $F(x) < q(p) = 1 - p^{p/(1-p)}$. Since $q(p)$ is a strictly increasing function on $(0, 1)$ with $q(0+) = 0$ and $q(1-) = 1 - e^{-1}$, for any $x_0 < F^{-1}(1 - e^{-1})$ there exists a proportion $p_0 < 1$ such that for any proportion $p \ge p_0$ we have $v^2_{ACL}(x) < v_0^2(x)$ uniformly on $(-\infty, x_0]$. Here $F^{-1}$ is the inverse to $F$. In fact, for a given $x < F^{-1}(1 - e^{-1})$ the optimal proportion $p$ which minimizes $R^r(p, x)$ is $p = p(x) = -\log(1 - F(x)) < 1$, for which
$$R_0^r(F(x)) = R^r(-\log(1 - F(x)), x) = e(1 - F(x))\log\frac{1}{1 - F(x)}, \quad 0 \le F(x) < 1 - e^{-1}.$$

Fig. 1. [Figure not reproduced; axis label: $y = F(x)$.]
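The threshold $q(p)$, the ratio (3.2) and the optimal proportion can be explored numerically. The functions below are a minimal sketch in which the argument `big_f` stands for the value $F(x)$; all names are hypothetical and the chosen numbers serve only as an illustration.

```python
import math


def q(p):
    """Threshold q(p) = 1 - p^(p / (1 - p)) for 0 < p < 1."""
    return 1.0 - p ** (p / (1.0 - p))


def ratio_right(p, big_f):
    """Variance ratio (3.2) under right censoring, with big_f = F(x)."""
    return p * (1.0 - big_f) ** ((p - 1.0) / p)


def optimal_p(big_f):
    """Minimizing proportion -log(1 - F(x)); below 1 when F(x) < 1 - 1/e."""
    return -math.log(1.0 - big_f)


def optimal_ratio(big_f):
    """Minimal ratio e (1 - F(x)) log(1 / (1 - F(x)))."""
    return math.e * (1.0 - big_f) * math.log(1.0 / (1.0 - big_f))


big_f = 0.3
p_star = optimal_p(big_f)
print(p_star, ratio_right(p_star, big_f), optimal_ratio(big_f))
print(q(0.8), ratio_right(0.8, 0.5), ratio_right(0.8, 0.6))
```

The last line illustrates the equivalence stated above: with $p = 0.8$ the ratio is below 1 at $F(x) = 0.5 < q(0.8)$ and above 1 at $F(x) = 0.6 > q(0.8)$.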
So far we were talking about the simple proportional hazards model of random right censorship. Consider now the situation when the $X$'s are censored on the left so that the observations $(Z_k, \delta_k)$ are such that $Z_k = \max(X_k, Y_k)$ and $\delta_k = I\{Z_k = X_k\}$ is as before, $1 \le k \le n$. The simple left proportional hazards model is when $G(x) = (F(x))^c$, $x \in \mathbb{R}$, for some $c > 0$. The left ACL estimator is $F_n^l(x) = (H_n(x))^{p_n}$, $x \in \mathbb{R}$, and the results for it all remain valid if we change $H$ and $F$ to $1 - H$ and $1 - F$, respectively, in all the formulae (cf. S. Csörgő (1988)). Here we have to be above $t_F \ge -\infty$. In particular, using $F_n^l$, for the resulting kernel estimator $f_n^l$ we have (2.1) with $H(x)$ replacing $1 - H(x)$. The variance ratio corresponding to (3.2) is $R^l(p, x) = p(F(x))^{(p-1)/p}$. Here $R^l(p, x) < 1$ if and only if $F(x) > 1 - q(p)$, thus for any $x_0 > F^{-1}(e^{-1})$ there is a $p_0$ such that for any $p \ge p_0$ we have $R^l(p, x) < 1$ uniformly on $[x_0, \infty)$. For a given $x > F^{-1}(e^{-1})$ the optimal $p$ minimizing $R^l(p, x)$ is $p = p(x) = -\log F(x) < 1$, for which
$$R_0^l(F(x)) = R^l(-\log F(x), x) = eF(x)\log\frac{1}{F(x)}, \quad e^{-1} < F(x) \le 1.$$
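The optimal proportions in both the right and the left censored cases follow from one elementary computation. The derivation below is a worked sketch; the shorthand $u$ is introduced here and stands for $1 - F(x)$ under right censoring and for $F(x)$ under left censoring.

```latex
\begin{align*}
R(p) &= p\,u^{(p-1)/p} = p\,u\,u^{-1/p}, \qquad 0 < u < 1,\ 0 < p \le 1,\\
\log R(p) &= \log p + \log u - \frac{1}{p}\log u,\\
\frac{d}{dp}\log R(p) &= \frac{1}{p} + \frac{\log u}{p^{2}} = \frac{p + \log u}{p^{2}},\\
\text{so } R \text{ decreases for } p &< -\log u \text{ and increases for } p > -\log u,\\
R(-\log u) &= (-\log u)\,u\,\exp\!\left(\frac{-\log u}{-\log u}\right) = e\,u\,\log\frac{1}{u}.
\end{align*}
```

The minimizer $-\log u$ lies in $(0, 1)$ exactly when $u > e^{-1}$, which matches the restrictions $F(x) < 1 - e^{-1}$ and $F(x) > e^{-1}$ in the two cases. For $u \le e^{-1}$ the ratio decreases on all of $(0, 1]$ and its smallest value, 1, is attained at $p = 1$.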
The attractive shape of these optimal variance ratios is depicted in Figure 1. As a last result we state the following.
Theorem 3. If $K$ has a compact support and $x < T_F$, then […]

The proof is based on the equality
$$Ef_n(x) = \frac{1}{b_n}\int g_n(t)h(t)K\left(\frac{x - t}{b_n}\right)dt,$$
where
$$g_n(t) = E\left\{p_n\left(1 - \frac{1 + (n-1)H_{n-1}(t)}{n}\right)^{p_n - 1}\right\},$$
where $H_{n-1}$ is the empirical distribution function of $Z_1, \ldots, Z_{n-1}$ and the last equality follows by the independence of $(Z_1, \ldots, Z_n)$ and $(\delta_1, \ldots, \delta_n)$. Now if we can replace $g_n(t)$ by $g(t) = p(1 - H(t))^{p-1}$ in the integral then, since $g(t)h(t) = f(t)$, it becomes $Ef_n^0(x)$. So the problem is to prove that $\sup\{|g_n(t) - g(t)|: t \in U_x\} = O(n^{-1})$, where $U_x$ is an open neighborhood of $x$. The lengthier details of showing this are omitted. Since $\mathrm{Bias}(f_n^0(x)) = O(b_n^2)$ if $F$ and $K$ satisfy the assumptions of Theorem 1(ii), we conclude that the same result holds true for $f_n(x)$ provided that the sequence $\{n^{1/2}b_n\}$ is bounded away from zero.

Can artificial controlled proportional hazards censoring, created by a bootstrap procedure based on the above findings, be useful in complete-sample kernel density estimation?
References
Abdushukurov, A.A. (1984), On some estimates of the distribution function under random censorship, in: Conference of Young Scientists, Math. Inst. Acad. Sci. Uzbek SSR, Tashkent, VINITI No. 8756-V (in Russian).
Abdushukurov, A.A. (1987), Estimation of a probability density and the hazard rate function in the Koziol-Green model of random censorship, Izv. Akad. Nauk UzSSR Ser. Fiz.-Mat. Nauk 3, 3-10 (in Russian).
Bickel, P. and M. Rosenblatt (1973), On some global measures of the deviations of density function estimates, Ann. Statist. 1, 1071-1095.
Cheng, P.E. (1987), Hazard rate estimation under a simple proportional hazards model, Bull. Inst. Math. Acad. Sinica 15, 245-254.
Cheng, P.E. and Y.C. Chang (1985), PLE versus MLE of survivor functions under a simple proportional hazards model, J. Chinese Statist. Assoc. 23 (Special Issue), 57-73.
Cheng, P.E. and G.D. Lin (1987), Maximum likelihood estimation of survival function under the Koziol-Green proportional hazards model, Statist. Probab. Letters 5, 75-80.
Csörgő, M. and P. Révész (1982), An invariance principle for NN empirical density functions, in: B.V. Gnedenko, M.L. Puri and I. Vincze, eds., Colloquia Math. Soc. J. Bolyai 32, Nonparametric Statistical Inference (North-Holland, Amsterdam) pp. 151-170.
Csörgő, S. (1988), Estimation in the proportional hazards model of random censorship, Statistics, to appear.
Csörgő, S. (1989), Testing for the proportional hazards model of random censorship, in: Proc. Fourth Prague Symp. Asymp. Statist. (Charles University Press, Prague, to appear).
Devroye, L.P. and T.J. Wagner (1979), The L1 convergence of kernel density estimates, Ann. Statist. 7, 1136-1139.
Mielniczuk, J. (1986), Some asymptotic properties of kernel estimators of a density function in case of censored data, Ann. Statist. 14, 766-773.
Mielniczuk, J. (1987), Asymptotic confidence bands for densities based on nearest neighbor estimators under censoring, Statist. Probab. Letters 5, 125-128.
Moore, D.S. and J.W. Yackel (1976), Large sample properties of nearest neighbor density function estimates, in: S.S. Gupta and D.S. Moore, eds., Statistical Decision Theory and Related Topics (Academic, New York) pp. 269-279.
Moore, D.S. and J.W. Yackel (1977), Consistency properties of nearest neighbor density estimates, Ann. Statist. 5, 143-156.
Parzen, E. (1962), On estimation of a probability density function and mode, Ann. Math. Statist. 33, 1065-1076.
Sethuraman, J. (1965), On a characterization of the three limiting types of the extreme, Sankhyā Ser. A 27, 357-364.