Pergamon
Nonlinear Analysis, Theory, Methods & Applications, Vol. 30, No. 6, pp. 3539-3546, 1997
Proc. 2nd World Congress of Nonlinear Analysts
© 1997 Elsevier Science Ltd. Printed in Great Britain. All rights reserved
0362-546X/97 $17.00 + 0.00
PII: S0362-546X(97)00053-9

NONPARAMETRIC ESTIMATION WITH MEASUREMENT ERRORS IN TIME SERIES

D.A. IOANNIDES† and P.D. ALEVIZOS‡

†Department of Economics, University of Macedonia, 54006 Thessaloniki, Greece
‡Department of Mathematics, University of Patras, 26110 Rio-Patras, Greece
1. INTRODUCTION
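Let $\{X_j\}$, $j \ge 1$, be a stationary sequence of real-valued random variables (r.v.'s), and for each integer $p \ge 1$, let $f(\mathbf{x}) = f(x_1, \ldots, x_p)$ be the joint probability density function (p.d.f.) of the r.v.'s $X_1, \ldots, X_p$. Consider the deconvolution problem

$$Y_j = X_j + \varepsilon_j, \tag{1.1}$$

where the (noise) process $\{\varepsilon_j\}$, $j \ge 1$, consists of i.i.d. r.v.'s independent of the process $\{X_j\}$, $j \ge 1$, with known density $\tilde{\varepsilon}(x)$. Then the joint density function of $Y_1, \ldots, Y_p$ is given by the multidimensional convolution

$$g(\mathbf{y}) = \int_{\mathbb{R}^p} f(\mathbf{y} - \mathbf{u})\, \bar{\varepsilon}(\mathbf{u})\, d\mathbf{u}, \tag{1.2}$$

where

$$\bar{\varepsilon}(\mathbf{u}) = \prod_{j=1}^{p} \tilde{\varepsilon}(u_j). \tag{1.3}$$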
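To make model (1.1) concrete, the following is a minimal simulation sketch. It assumes, purely for illustration, a Gaussian AR(1) process for $X_j$ and Laplace measurement errors; neither choice is taken from the paper, and the names `simulate_model` and `sample_variance` as well as all parameter values are hypothetical.

```python
import random

def simulate_model(n, a=0.5, b=0.3, seed=0):
    """Simulate Y_j = X_j + eps_j, with X a stationary Gaussian AR(1) process
    (illustrative assumption) and eps i.i.d. Laplace(0, b), independent of X."""
    rng = random.Random(seed)
    # Start from the stationary law N(0, 1/(1 - a^2)) so that {X_j} is stationary.
    x = rng.gauss(0.0, (1.0 / (1.0 - a * a)) ** 0.5)
    xs, ys = [], []
    for _ in range(n):
        x = a * x + rng.gauss(0.0, 1.0)
        # The difference of two independent exponentials with mean b is Laplace(0, b).
        eps = rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
        xs.append(x)
        ys.append(x + eps)
    return xs, ys

def sample_variance(v):
    m = sum(v) / len(v)
    return sum((t - m) ** 2 for t in v) / (len(v) - 1)

xs, ys = simulate_model(20000)
# By independence, Var(Y) = Var(X) + Var(eps) = 1/(1 - a^2) + 2 b^2.
print(sample_variance(xs), sample_variance(ys))
```

Only the `ys` would be available in practice; the `xs` are returned here so that the effect of the noise on the variance can be inspected.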
In this paper we are interested in estimating nonparametrically the joint cumulative distribution function (c.d.f.) of the r.v.'s $X_1, \ldots, X_p$, $F(\mathbf{x}) = F(x_1, \ldots, x_p)$, on the basis of the r.v.'s $Y_1, \ldots, Y_n$, where it is assumed that $p < n$. The deconvolution model arises in biostatistics, economics, and various other fields. For example, $X_j$ may represent the true income of subject $j$, $Y_j$ the measured income, and $\varepsilon_j$ the measurement error. The interested reader may find in [11] additional examples and applications of this model. The nonparametric deconvolution estimator for the c.d.f. $F$ is related to the corresponding problem for the p.d.f. $f$, because the deconvoluting estimator of $F$ will be defined as the integral of the deconvoluting kernel density estimator. The literature on the nonparametric deconvolution problem for the p.d.f. is quite extensive. This problem was initially studied in [18], [21], [12], and [23], and then more systematically in [1], [4], [13], and [22]. A disadvantage in all of the above works is that the error variables follow a specified distribution.
Key words and phrases: Deconvolution, nonparametric estimation, distribution function, smoothness error distribution, $\rho$-mixing.
Recently, nonparametric deconvolution problems were studied in [5] with the error variables following one of two different types of distributions; namely, ordinary smooth and super smooth distributions. Following [5], the error variable $\varepsilon$ has an ordinary smooth distribution, if its characteristic function $\Phi_{\tilde{\varepsilon}}(\cdot)$ satisfies the conditions

$$d_0 |t|^{-\beta} \le |\Phi_{\tilde{\varepsilon}}(t)| \le d_1 |t|^{-\beta} \quad \text{as } t \to \infty,$$

for some positive constants $d_0$, $d_1$, $\beta$. On the other hand, the error variable $\varepsilon$ has a super smooth distribution, if its characteristic function $\Phi_{\tilde{\varepsilon}}(\cdot)$ satisfies the conditions

$$d_0 |t|^{\beta_0} \exp(-|t|^{\beta}/\gamma) \le |\Phi_{\tilde{\varepsilon}}(t)| \le d_1 |t|^{\beta_1} \exp(-|t|^{\beta}/\gamma) \quad \text{as } t \to \infty,$$

for some positive constants $d_0$, $d_1$, $\beta$, $\gamma$ and constants $\beta_0$, $\beta_1$. In [5], the optimal mean square convergence rates for estimating the p.d.f. and the c.d.f. were obtained, using kernel estimators. From that paper, one can conclude that the smoother the error distribution is, the harder the deconvolution problem will be. Subsequently, the asymptotic normality for deconvoluting kernel estimators and the best global rates of $L_p$-convergence, $1 \le p < \infty$, were obtained in [6] and [7], respectively. In addition, the $L_2$-convergence and the asymptotic normality of the same estimator were studied in [14] and [15], respectively, in the case of strong and $\rho$-mixing r.v.'s. Also, the $L_\infty$-convergence was obtained in [16], for the case of strong mixing r.v.'s. Nonparametric regression problems with errors-in-variables are closely related to the deconvolution problems for p.d.f. estimation. Such problems were studied in [8], [9], [10] and [17].

Unlike the nonparametric deconvolution problem for the p.d.f. and the regression function, little attention has been given to the corresponding problem for the c.d.f. However, in the absence of measurement errors, the asymptotic properties of a nonparametric estimator for the univariate c.d.f. were studied by several authors. Recently, for the i.i.d. case, it was shown in [2] that the nonparametric distribution estimator converges uniformly to the c.d.f. with a rate of order $(\log\log n / n)^{1/2}$, if the rate of the bandwidth convergence towards zero is not too slow. For multivariate i.i.d. observations, it was proved in [3] that the classical empirical distribution function converges uniformly to the c.d.f. with rate of order $(\log n / n)^{1/2}$.

First, we introduce some notation. To this end, denote the characteristic functions of the p.d.f.'s $f(\mathbf{x})$, $\tilde{\varepsilon}(x)$, $g(\mathbf{x})$, and $\bar{\varepsilon}(\mathbf{x})$ by $\Phi_f(\mathbf{t})$, $\Phi_{\tilde{\varepsilon}}(t)$, $\Phi_g(\mathbf{t})$, and $\Phi_{\bar{\varepsilon}}(\mathbf{t})$, respectively. Then

$$\Phi_g(\mathbf{t}) = \Phi_f(\mathbf{t})\, \Phi_{\bar{\varepsilon}}(\mathbf{t}). \tag{1.4}$$
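The two smoothness classes can be illustrated numerically. The sketch below uses two standard examples, chosen here only as illustrations: the Laplace distribution, whose characteristic function decays like $|t|^{-2}$ (ordinary smooth with $\beta = 2$), and the normal distribution, whose characteristic function decays exponentially (super smooth). The function names and the scale parameters are hypothetical.

```python
import math

def cf_laplace(t, b=0.5):
    """Characteristic function of the Laplace(0, b) density."""
    return 1.0 / (1.0 + (b * t) ** 2)

def cf_gauss(t, sigma=1.0):
    """Characteristic function of the N(0, sigma^2) density."""
    return math.exp(-0.5 * (sigma * t) ** 2)

for t in (10.0, 100.0, 1000.0):
    # Ordinary smooth with beta = 2: t^2 * |cf| settles near 1 / b^2 = 4.
    # Super smooth: even t^10 * |cf| collapses to zero.
    print(t, t ** 2 * cf_laplace(t), t ** 10 * cf_gauss(t))
```

The polynomial decay in the first case is what keeps the division by the noise characteristic function in a deconvoluting kernel under control.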
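Let $K(x)$ be a kernel density function with characteristic function $\Phi_K(t)$, and for $h_n > 0$, define the deconvoluting kernel function by

$$W_n(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx}\, \frac{\Phi_K(t)}{\Phi_{\tilde{\varepsilon}}(t/h_n)}\, dt. \tag{1.5}$$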
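As a sanity check on the deconvoluting kernel, the sketch below evaluates it numerically, taking (1.5) in its usual Fourier-inversion form $W_n(x) = (2\pi)^{-1}\int e^{-itx}\,\Phi_K(t)/\Phi_{\tilde{\varepsilon}}(t/h_n)\,dt$. It assumes a standard Gaussian kernel and Laplace$(0, b)$ errors, neither of which is prescribed by the paper. Under these assumptions $1/\Phi_{\tilde{\varepsilon}}(t/h) = 1 + (bt/h)^2$, which gives the closed form $W_n(x) = K(x) - (b/h)^2 K''(x)$. The function names are hypothetical.

```python
import math

def w_numeric(x, h, b, t_max=12.0, steps=2400):
    """Trapezoidal approximation of the Fourier integral defining W_n(x),
    for a standard Gaussian kernel and Laplace(0, b) errors (both assumptions)."""
    dt = t_max / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * dt
        # Phi_K(t) = exp(-t^2/2) and 1/Phi_eps(t/h) = 1 + (b t / h)^2.
        val = math.cos(t * x) * math.exp(-0.5 * t * t) * (1.0 + (b * t / h) ** 2)
        total += val if 0 < k < steps else 0.5 * val
    # The integrand is even in t, so integrate over [0, t_max] and double.
    return total * dt / math.pi

def w_closed_form(x, h, b):
    """W_n(x) = K(x) - (b/h)^2 K''(x), with K the standard normal density."""
    c = (b / h) ** 2
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi) * (1.0 - c * (x * x - 1.0))

for x in (0.0, 0.7, 2.0):
    print(x, w_numeric(x, h=0.4, b=0.3), w_closed_form(x, h=0.4, b=0.3))
```

The closed form takes negative values for large $|x|$, so $W_n$ is not itself a density even though $K$ is.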
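Set $\mathbf{K}(\mathbf{x}) = \prod_{j=1}^{p} K(x_j)$, and $\mathbf{W}_n(\mathbf{x}) = \prod_{j=1}^{p} W_n(x_j)$. Then, if $n = k_n p$, the nonparametric deconvoluting kernel density estimator for the joint p.d.f. of the r.v.'s $X_1, \ldots, X_p$, $p < n$, is defined by

$$\hat{f}_n(\mathbf{x}) = \frac{1}{k_n h_n^p} \sum_{j=1}^{k_n} \mathbf{W}_n\!\left(\frac{\mathbf{x} - \mathbf{Y}_j}{h_n}\right), \tag{1.6}$$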
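where $\mathbf{x} = (x_1, \ldots, x_p)'$, $\mathbf{Y}_j = (Y_{1+(j-1)p}, \ldots, Y_{p+(j-1)p})'$, and $h_n \downarrow 0$ as $n \to \infty$. To construct the estimator in (1.6), we have taken $n \equiv 0 \pmod{p}$ for the sake of simplicity. The function $\mathbf{W}_n(\mathbf{x})$ in (1.6), generally, is not integrable, and the nonparametric estimator for the c.d.f. $F(\mathbf{x})$ is defined by

$$\hat{F}_n(\mathbf{x}) = \int_{A_n} \hat{f}_n(\mathbf{u})\, d\mathbf{u}, \tag{1.7}$$

where $A_n = [-M_n, x_1] \times \cdots \times [-M_n, x_p]$, $\mathbf{x} = (x_1, \ldots, x_p)'$, and $M_n \to \infty$ as $n \to \infty$.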
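The truncated-integral estimator can be sketched in the simplest setting. The code below treats the univariate case $p = 1$ with i.i.d. standard normal $X_j$ (a special case of a stationary sequence), a standard Gaussian kernel, and Laplace$(0, b)$ errors; all of these are illustrative assumptions, as are the function names and the values of $h$, $b$, and $M$. Under these assumptions the deconvoluting kernel is $W(z) = \varphi(z)(1 + c - cz^2)$ with $c = (b/h)^2$, and its antiderivative is $\Phi(z) + c\,z\,\varphi(z)$, so the integral over $[-M, x]$ is available in closed form.

```python
import math
import random

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def integrated_kernel(z, c):
    """Antiderivative of W(z) = normal_pdf(z) * (1 + c - c z^2), c = (b/h)^2."""
    return normal_cdf(z) + c * z * normal_pdf(z)

def cdf_estimate(x, ys, h, b, m):
    """Univariate c.d.f. estimate: the density estimate integrated over [-m, x]."""
    c = (b / h) ** 2
    total = 0.0
    for y in ys:
        total += integrated_kernel((x - y) / h, c) - integrated_kernel((-m - y) / h, c)
    return total / len(ys)

def simulate_observations(n, b, seed=1):
    """Y = X + eps with X ~ N(0, 1) i.i.d. and eps ~ Laplace(0, b) (assumptions)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) + rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
            for _ in range(n)]

ys = simulate_observations(4000, b=0.3)
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
max_error = max(abs(cdf_estimate(x, ys, h=0.4, b=0.3, m=8.0) - normal_cdf(x)) for x in grid)
print(max_error)
```

Because the kernel takes negative values, the estimate need not be monotone or confined to $[0, 1]$; the sketch only illustrates the construction, not the rates.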
In this paper, we give conditions under which $\hat{F}_n$ converges to $F$ uniformly on compacts for $\rho$-mixing r.v.'s involving measurement errors, while the error variable follows an ordinary smooth distribution. Sharp rates are obtained using an exponential bound for a sum of $\rho$-mixing r.v.'s given in [20]. For the bandwidth choice of Theorem 2.2(ii), the resulting convergence rate is weaker than the optimal mean square error rate obtained in [5] for $p = 1$. However, in the absence of measurement errors ($\beta = 0$), the above rate is identical to that found in [3], although weaker than the one obtained in [2] for $p = 1$. The case where the error variable has a super smooth distribution can be treated similarly, but the rate to be obtained is not expected to be as good.

2. UNIFORM CONVERGENCE OF $\hat{F}_n$

Let $\mathcal{F}_i^k$ be the $\sigma$-field of events generated by the random vectors $\{\mathbf{X}_j, \boldsymbol{\varepsilon}_j,\ i \le j \le k\}$, where $\mathbf{X}_j = (X_{1+(j-1)p}, \ldots, X_{p+(j-1)p})'$ and $\boldsymbol{\varepsilon}_j = (\varepsilon_{1+(j-1)p}, \ldots, \varepsilon_{p+(j-1)p})'$. Then, for the sequence $\{\mathbf{X}_j, \boldsymbol{\varepsilon}_j\}$, define the maximal correlation $\rho(n)$, as follows:

$$\rho(n) = \sup\{ |\rho(\xi, \eta)| : \xi \text{ being } \mathcal{F}_1^k\text{-measurable},\ \eta \text{ being } \mathcal{F}_{k+n}^{\infty}\text{-measurable with } E|\xi|^2 < \infty,\ E|\eta|^2 < \infty,\ k \ge 1 \},$$

where $\rho(\xi, \eta)$ is the correlation coefficient of the r.v.'s $\xi$ and $\eta$. The sequence $\{\mathbf{X}_j, \boldsymbol{\varepsilon}_j\}$ is called $\rho$-mixing, if the mixing coefficients $\rho(n) \downarrow 0$, as $n \to \infty$. For more details on this subject, see, for example, [19]. To obtain uniform convergence of $\hat{F}_n$, we need the following assumptions on the process and on $\Phi_{\tilde{\varepsilon}}$ and $\Phi_K$.

Assumption (B1).
(i) The process $\{\mathbf{X}_j, \boldsymbol{\varepsilon}_j\}$, $j \ge 1$, is strictly stationary.
(ii) The process $\{\mathbf{X}_j, \boldsymbol{\varepsilon}_j\}$, $j \ge 1$, is $\rho$-mixing, and $\sum_{j=1}^{\infty} \rho(j) < \infty$.
(iii) Let $\alpha = \alpha(n)$, $\mu = \mu(n)$ be positive integers such that $2\alpha\mu = k_n$.
(iv) The mixing coefficient satisfies the requirement $\limsup\, [\,1 + \cdots\,] = C < \infty$.

Assumption (B2). The characteristic functions $\Phi_{\tilde{\varepsilon}}$ and $\Phi_K$ satisfy the requirements:
(i) $|\Phi_{\tilde{\varepsilon}}(t)| > 0$, for all $t \in \mathbb{R}$.
(ii) $|t|^{\beta} |\Phi_{\tilde{\varepsilon}}(t)| > d$, for large $t$, for some $\beta > 0$ and $d > 0$.
(iii) $\int_{-\infty}^{\infty} |t|^{\beta - 1} |\Phi_K(t)|\, dt < \infty$, for some $\beta > 1$.
(iv) $\int_{-\infty}^{\infty} |t|^{2\beta} |\Phi_K(t)|^2\, dt < \infty$, for some $\beta > 0$.

Assumption (B3). The kernel function $K$ is a bounded p.d.f. satisfying the condition $K(x) = O(|x|^{-1-\delta})$, for some $\delta > 0$.
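The summability condition in Assumption (B1)(ii) can be illustrated, though not verified, from data. For any stationary sequence, the absolute lag-$n$ correlation is a lower bound for the maximal correlation $\rho(n)$. The sketch below assumes a Gaussian AR(1) process with coefficient $a$ (a hypothetical example, not taken from the paper), whose lag-$n$ correlation is $a^n$, and compares sample lag correlations with that geometric sequence. It does not compute $\rho(n)$ itself.

```python
import random

def ar1_path(n, a, seed=0):
    """Stationary Gaussian AR(1) path X_j = a X_{j-1} + e_j (illustrative assumption)."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, (1.0 / (1.0 - a * a)) ** 0.5)
    path = []
    for _ in range(n):
        x = a * x + rng.gauss(0.0, 1.0)
        path.append(x)
    return path

def lag_correlation(v, lag):
    """Sample correlation between v[i] and v[i + lag]."""
    n = len(v)
    m = sum(v) / n
    var = sum((t - m) ** 2 for t in v) / n
    cov = sum((v[i] - m) * (v[i + lag] - m) for i in range(n - lag)) / n
    return cov / var

a = 0.6
path = ar1_path(50000, a)
sample = [lag_correlation(path, k) for k in range(1, 6)]
theory = [a ** k for k in range(1, 6)]
geometric_sum = a / (1.0 - a)  # sum over j >= 1 of a^j, finite for |a| < 1
print(sample, theory, geometric_sum)
```

A geometric decay of this kind is the typical situation in which a series of mixing coefficients converges.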
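PROPOSITION 2.1. Under Assumption (B2) and for any $\mathbf{x}_1, \mathbf{x}_2 \in \mathbb{R}^p$:

$$h_n^{(\beta+1)p}\, |\hat{F}_n(\mathbf{x}_1) - \hat{F}_n(\mathbf{x}_2)| \le c\, \|\mathbf{x}_1 - \mathbf{x}_2\|,$$

where $c > 0$ and $\|\mathbf{x}\| = \sum_{j=1}^{p} |x_j|$.

PROOF: Clearly, the stated inequality holds, since by Proposition 1 in [14], $h_n^{\beta} |W_n(u)| < \infty$.

PROPOSITION 2.2. Under Assumption (B3), and if $F(\mathbf{x})$ is Lipschitz of order 1, then

$$|E\hat{F}_n(\mathbf{x}_1) - E\hat{F}_n(\mathbf{x}_2)| \le c\, \|\mathbf{x}_1 - \mathbf{x}_2\|, \quad \text{for every } \mathbf{x}_1, \mathbf{x}_2 \in \mathbb{R}^p.$$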
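PROOF: By Fubini's theorem, the definition of $W_n(u)$, and the independence of the error variables $\varepsilon_j$ and $X_j$, we have that

$$E\hat{F}_n(\mathbf{x}_1) = \int_{\mathbb{R}^p} [F(\mathbf{x}_1 - \mathbf{y}) - F(-M_n - \mathbf{y})]\, \frac{1}{h_n^p}\, \mathbf{K}\!\left(\frac{\mathbf{y}}{h_n}\right) d\mathbf{y},$$

and that

$$|E\hat{F}_n(\mathbf{x}_1) - E\hat{F}_n(\mathbf{x}_2)| = \left| \int_{\mathbb{R}^p} (F(\mathbf{x}_1 - \mathbf{y}) - F(\mathbf{x}_2 - \mathbf{y}))\, \frac{1}{h_n^p}\, \mathbf{K}\!\left(\frac{\mathbf{y}}{h_n}\right) d\mathbf{y} \right| \le \frac{c\, \|\mathbf{x}_1 - \mathbf{x}_2\|}{h_n^p} \int_{\mathbb{R}^p} \mathbf{K}\!\left(\frac{\mathbf{y}}{h_n}\right) d\mathbf{y} \le c\, \|\mathbf{x}_1 - \mathbf{x}_2\|.$$

Let $S_n(\mathbf{x}) = \hat{F}_n(\mathbf{x}) - E\hat{F}_n(\mathbf{x})$, and

$$A_j(\mathbf{x}) = \cdots \int_{-M_n}^{\mathbf{x}} W_n(\cdots)\, ds.$$

Then

$$S_n(\mathbf{x}) = \cdots \sum_{j=1}^{k_n} A_j(\mathbf{x}). \tag{2.1}$$

For some increasing integer-valued sequences $\alpha = \alpha(n)$, $\mu = \mu(n)$, set $k_n = 2\alpha\mu$. Then the r.v.'s $A_j$ defined above can be grouped successively into $2\mu$ blocks of size $\alpha$, as follows:

$$U_{n,i}(\mathbf{x}) = \sum_{j=2(i-1)\alpha+1}^{(2i-1)\alpha} A_j(\mathbf{x}), \qquad V_{n,i}(\mathbf{x}) = \sum_{j=(2i-1)\alpha+1}^{2i\alpha} A_j(\mathbf{x}), \qquad \text{for } i = 1, \ldots, \mu. \tag{2.2}$$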
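The grouping in (2.2) is a purely combinatorial step and can be sketched directly. The function below, whose name is hypothetical, splits a list of $k_n = 2\alpha\mu$ terms into alternating blocks of size $\alpha$, with odd-numbered blocks summed into the $U$ terms and even-numbered blocks into the $V$ terms. The integer inputs stand in for the r.v.'s $A_j$.

```python
def split_into_blocks(a, alpha, mu):
    """Return block sums (U_1..U_mu, V_1..V_mu) for a list a of length 2*alpha*mu.

    U_i sums a_j for j = 2(i-1)alpha+1, ..., (2i-1)alpha,
    V_i sums a_j for j = (2i-1)alpha+1, ..., 2i*alpha (1-based indices).
    """
    if len(a) != 2 * alpha * mu:
        raise ValueError("expected exactly 2 * alpha * mu terms")
    u, v = [], []
    for i in range(1, mu + 1):
        start_u = 2 * (i - 1) * alpha   # 0-based start of the i-th odd block
        start_v = (2 * i - 1) * alpha   # 0-based start of the i-th even block
        u.append(sum(a[start_u:start_u + alpha]))
        v.append(sum(a[start_v:start_v + alpha]))
    return u, v

terms = list(range(1, 13))              # stand-ins for A_1, ..., A_12
u_blocks, v_blocks = split_into_blocks(terms, alpha=2, mu=3)
print(u_blocks, v_blocks)               # [3, 11, 19] [7, 15, 23]
```

Consecutive $U$ blocks are separated by a gap of $\alpha$ indices, which is what allows the mixing condition to be used when bounding the sum of the $U$ terms, and likewise for the $V$ terms.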
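By (2.2), relation (2.1) becomes

$$S_n(\mathbf{x}) = \cdots \left\{ \sum_{i=1}^{\mu} U_{n,i}(\mathbf{x}) + \sum_{i=1}^{\mu} V_{n,i}(\mathbf{x}) \right\} = \cdots \left\{ S_n(\mathbf{x}, 1) + S_n(\mathbf{x}, 2) \right\}, \tag{2.3}$$

where $S_n(\mathbf{x}, 1) = \sum_{i=1}^{\mu} U_{n,i}(\mathbf{x})$, $S_n(\mathbf{x}, 2) = \sum_{i=1}^{\mu} V_{n,i}(\mathbf{x})$. Then we have the following result.

LEMMA 2.1. Suppose Assumptions (B1)-(B2) are satisfied. Then:

$$E \exp(\lambda S_n(\mathbf{x}, 1)) \le \exp(\cdots\, E S_n^2(\mathbf{x}, 1))\, [\,1 + \cdots\,]^{\cdots},$$

where $0 < \lambda \le c \cdots$, $c > 0$.

PROOF: The first step is an application of Fubini's theorem and an integration. From this point on, the proof is identical to that of Lemma 3.2 in [20].

THEOREM 2.1. Suppose Assumptions (B1)-(B2) are satisfied. Then:

$$\sup_{\mathbf{x} \in I} |\hat{F}_n(\mathbf{x}) - E\hat{F}_n(\mathbf{x})| = O[\Psi(n)] \quad \text{a.s.},$$

with $\Psi(n) = \left( \dfrac{\log k_n}{\cdots} \right)^{1/2}$, and $I = [0, 1]^p$.

PROOF: Let

$$\cdots \tag{2.4}$$

Since $I$ is compact, it can be covered by finitely many cubes $I_\nu$, centered at $\mathbf{x}_\nu$ and with sides parallel to the coordinate axes. Next,

$$\sup_{\mathbf{x} \in I} |\hat{F}_n(\mathbf{x}) - E\hat{F}_n(\mathbf{x})| \le \max_{\nu} \sup_{\mathbf{x} \in I_\nu} |\hat{F}_n(\mathbf{x}) - \hat{F}_n(\mathbf{x}_\nu)| + \max_{\nu} \sup_{\mathbf{x} \in I_\nu} |E\hat{F}_n(\mathbf{x}_\nu) - E\hat{F}_n(\mathbf{x})| + \max_{\nu} |\hat{F}_n(\mathbf{x}_\nu) - E\hat{F}_n(\mathbf{x}_\nu)|.$$

For $\mathbf{x} \in I_\nu$, use Proposition 2.1 and relation (2.4) to obtain

$$|\hat{F}_n(\mathbf{x}) - \hat{F}_n(\mathbf{x}_\nu)| \le \frac{O[\cdots]}{h_n^{(\beta+1)p}} = O(\Psi(n)).$$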
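Therefore

$$\cdots$$

and

$$\cdots \tag{2.5}$$

By (2.3),

$$S_n(\mathbf{x}_\nu) = \hat{F}_n(\mathbf{x}_\nu) - E\hat{F}_n(\mathbf{x}_\nu) = \cdots \left\{ S_n(\mathbf{x}_\nu, 1) + S_n(\mathbf{x}_\nu, 2) \right\}.$$

Observe that

$$\cdots \tag{2.6}$$

By Lemma 2.1, and Assumption (B1)(iv),

$$\cdots \tag{2.7}$$

whereas by relation (1.8) in [19],

$$E S_n^2(\mathbf{x}_\nu, 1) = \alpha E A_1^2(\mathbf{x}_\nu) + 2 \sum_{1 \le i < j \le \alpha} \operatorname{cov}(A_i(\mathbf{x}_\nu), A_j(\mathbf{x}_\nu)) \le \alpha E A_1^2(\mathbf{x}_\nu) + 2 \sum_{1 \le i < j \le \alpha} \rho(j - i)\, E^{1/2} A_i^2(\mathbf{x}_\nu)\, E^{1/2} A_j^2(\mathbf{x}_\nu) \le 2\alpha \Big( 1 + \sum_{j=1}^{\infty} \rho(j) \Big) E A_1^2(\mathbf{x}_\nu). \tag{2.8}$$

Also, from the boundedness of $A_j(\mathbf{x})$,

$$E A_1^2(\mathbf{x}_\nu) \le c. \tag{2.9}$$

By (2.8) and (2.9), inequality (2.7) becomes

$$P[\,\cdots S_n(\mathbf{x}_\nu, 1) > \cdots\,] \le C \exp(\cdots) \exp(\cdots). \tag{2.10}$$

Now choose $\varepsilon_n = O(\cdots)^{1/2}$ and $\lambda = O(\varepsilon_n k_n h_n^{\beta p})$. Then by (2.10) and the Borel-Cantelli Lemma, it follows that

$$\sum_{n} P[\, \max_{\nu} |S_n(\mathbf{x}_\nu, 1)| > \cdots \,] < \infty. \tag{2.11}$$

Similarly,

$$\sum_{n} P[\, \max_{\nu} |S_n(\mathbf{x}_\nu, 2)| > \cdots \,] < \infty. \tag{2.12}$$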
Relations (2.11), (2.12), in combination with (2.6), yield relation (2.5).

REMARK 2.1. The rate $\Psi(n)$ is faster than the uniform convergence rate of the deconvoluting kernel p.d.f. estimator (see Theorem 2.2 in [16]).

In the following result, we assume that $\cdots < M_n$, $0 < h_n < 1$.

THEOREM 2.2. Suppose Assumptions (B1)-(B3) are satisfied. Then:
(i) If $F$ is Lipschitz of order 1, and its first absolute moment is finite,

$$\sup_{\mathbf{x} \in I} |\hat{F}_n(\mathbf{x}) - F(\mathbf{x})| = O[\Psi(n)] + O(h_n) \quad \text{a.s.}$$

(ii) If, in addition, $h_n = O[(\cdots)^{\cdots}]$,

$$\sup_{\mathbf{x} \in I} |\hat{F}_n(\mathbf{x}) - F(\mathbf{x})| = O[(\cdots)^{\cdots}] \quad \text{a.s.}$$
PROOF: (i) As in the proof of Proposition 2.2,

$$E\hat{F}_n(\mathbf{x}) - F(\mathbf{x}) = \int_{\mathbb{R}^p} [F(\mathbf{x} - \mathbf{y}) - F(\mathbf{x})]\, \frac{1}{h_n^p}\, \mathbf{K}\!\left(\frac{\mathbf{y}}{h_n}\right) d\mathbf{y} - \int_{\mathbb{R}^p} F(-M_n - h_n \mathbf{y})\, \mathbf{K}(\mathbf{y})\, d\mathbf{y}.$$

By the Lipschitz condition and Assumption (B3), the first integral is $O(h_n)$ uniformly in $\mathbf{x} \in I$. Next, using the assumptions made on $M_n$, the finiteness of the first absolute moment of $F$, and Assumption (B3), we obtain that $\left| \int_{\mathbb{R}^p} F(-M_n - h_n \mathbf{y})\, \mathbf{K}(\mathbf{y})\, d\mathbf{y} \right| = O(h_n)$. Combining the above relations with the fact that

$$\sup_{\mathbf{x} \in I} |\hat{F}_n(\mathbf{x}) - F(\mathbf{x})| \le \sup_{\mathbf{x} \in I} |\hat{F}_n(\mathbf{x}) - E\hat{F}_n(\mathbf{x})| + \sup_{\mathbf{x} \in I} |E\hat{F}_n(\mathbf{x}) - F(\mathbf{x})|,$$

we obtain the desired result.
(ii) Follows easily from part (i).

REMARK 2.2. In Theorem 2.2, we imposed the condition that the first absolute moment of the unknown c.d.f. is finite. In [5], an artificial condition on the left tail of the c.d.f. was introduced in order to establish an upper bound for the mean square error of $\hat{F}_n(\mathbf{x})$. We could have used a similar condition but have chosen not to do so. As mentioned in [5], p. 1264, the artificial assumption alluded to cannot be verified. Therefore, we considered it more appropriate to assume that the first absolute moment of the unknown c.d.f. is finite.

REFERENCES

1. CARROLL R.J. & HALL P., Optimal rates for deconvolving a density, J. Amer. Statist. Assoc., 83(1988), 1184-1186.
2. DEGENHARDT H.J.A., Strong limit theorems for the difference of the perturbed empirical distribution function and the classical empirical distribution function, Scand. J. Statist., 23(1996), 331-351.
3. DEVROYE L., A uniform bound for the deviation of empirical distribution functions, J. Multivariate Anal., 7(1977), 594-597.
4. DEVROYE L., Consistent deconvolution in density estimation, Canad. J. Statist., 17(1989), 235-239.
5. FAN J., On the optimal rates of convergence for nonparametric deconvolution problems, Ann. Statist., 19(1991a), 1257-1272.
6. FAN J., Asymptotic normality for deconvolution kernel density estimators, Sankhyā Ser. A, 53(1991b), 97-110.
7. FAN J., Global behavior of deconvolution kernel estimates, Statist. Sinica, 1(1991c), 541-551.
8. FAN J. & TRUONG Y.K., Nonparametric regression with errors in variables, Ann. Statist., 21(1993), 1900-1925.
9. FAN J., TRUONG Y.K. & WANG Y., Nonparametric function estimation involving errors-in-variables, Nonparametric functional estimation and related topics, G.G. Roussas (Editor), Kluwer Academic Publishers (NATO ASI Ser.), 1991, 613-627.
10. FAN J. & MASRY E., Multivariate regression estimation with error-in-variables, J. Multivariate Anal., 43(1992), 237-271.
11. FULLER W.A., Measurement error models, Wiley, New York, (1987).
12. KUELPS J.D., Estimation of the multidimensional probability density function, MRC Technical Report No. 1646, (1976), University of Wisconsin, Madison, Wisconsin.
13. LIU M.C. & TAYLOR R.L., A consistent nonparametric density estimator for the deconvolution problem, Canad. J. Statist., 17(1989), 427-438.
14. MASRY E., Multivariate probability density deconvolution estimators for stationary random process, IEEE Trans. Infor. Theory, IT-37(1991), 1105-1115.
15. MASRY E., Asymptotic normality for deconvolution estimators of multivariate densities of stationary process, J. Multivariate Anal., 47(1993a), 47-68.
16. MASRY E., Strong consistency and rates for deconvolution of multivariate densities of stationary process, Stochastic Process Appl., 47(1993b), 53-74.
17. MASRY E., Multivariate regression estimation with error-in-variables for stationary processes, Nonparametric Statist., 3(1993c), 13-16.
18. NADARAYA E., On nonparametric estimation of density function and regression, Theory Probab. Appl., 10(1965), 186-190.
19. ROUSSAS G.G. & IOANNIDES D.A., Moment inequalities for mixing sequences of random variables, Stochastic Anal. Appl., 5(1987a), 61-120.
20. ROUSSAS G.G. & IOANNIDES D.A., Probability bounds for sums in triangular arrays of random variables under mixing conditions, Statistical theory and data analysis II, K. Matusita (Editor), Elsevier Science Publishers, (1987b), 293-308.
21. SCHWARTZ S.C., On the estimation of a Gaussian convolution probability density, SIAM J. Appl. Math., 17(1967), 447-453.
22. STEFANSKI L. & CARROLL R.J., Deconvolving kernel density estimators, Statistics, 21(1991), 165-184.
23. WISE G.L., TRAGANIDIS A.P. & THOMAS J.B., The estimation of a probability density function from measurements corrupted by Poisson noise, IEEE Trans. Infor. Theory, IT-23(1977), 764-766.