Journal of Statistical Planning and Inference 141 (2011) 3475–3488
Conditional quantile estimation with auxiliary information for left-truncated and dependent data
Han-Ying Liang a,b,*, Jacobo de Uña-Álvarez b
a Department of Mathematics, Tongji University, Shanghai 200092, PR China
b Department of Statistics and OR, Facultad de Ciencias Económicas y Empresariales, Universidad de Vigo, Campus Lagoas-Marcosende, 36310 Vigo, Spain
Article info
Article history: Received 17 September 2008; received in revised form 2 May 2011; accepted 4 May 2011; available online 19 May 2011
Keywords: Asymptotic normality; Conditional quantile estimator; Truncated data; α-Mixing; Auxiliary information
Abstract
In this paper, the empirical likelihood method is used to define a new estimator of the conditional quantile in the presence of auxiliary information for the left-truncation model. The asymptotic normality of the estimator is established when the data exhibit some kind of dependence. It is assumed that the lifetime observations with multivariate covariates form a stationary α-mixing sequence. The result shows that the asymptotic variance of the proposed estimator is not larger than that of the standard kernel estimator. The finite sample behavior of the estimator is also investigated via simulations. © 2011 Elsevier B.V. All rights reserved.
1. Introduction
Let $Y$ be a response variable with continuous distribution function (df) $\tilde F(\cdot)$ and $X$ a random covariate vector taking its values in $\mathbb{R}^d$ ($d \ge 1$) with joint df $L(\cdot)$ and joint density $l(\cdot)$. Throughout the paper, $x = (x_1, \ldots, x_d) \in \mathbb{R}^d$. For any $x$, the conditional df of $Y$ given $X = x$ is $F(y|x) = \mathbf{E}[I(Y \le y)|X = x]$, which can be written as
\[ F(y|x) = \int_{-\infty}^{y} f(x,t)\,dt \Big/ l(x) \overset{\mathrm{def}}{=} F_1(x,y)/l(x), \tag{1.1} \]
where $f(\cdot,\cdot)$ is the probability density function of $(X,Y)$, and $l(\cdot)$ is assumed positive at $x$. In the context of regression, it is of interest to estimate $F(y|x)$ and/or the pertaining quantile function $\xi_p(x) = \inf\{y : F(y|x) \ge p\}$ for $p \in (0,1)$. Indeed, it is well known that conditional quantile functions, and especially conditional median functions, can give a good description of the data (see, e.g. Chaudhuri et al., 1997), with properties such as robustness to heavy-tailed error distributions and outliers. For independent and complete data, many authors considered this problem; see for example Mehra et al. (1991), Chaudhuri (1991), Fan et al. (1994), and Xiang (1996). Under censoring, Dabrowska (1992) established a Bahadur-type representation of the kernel quantile estimator; see also Van Keilegom and Veraverbeke (1998) for the fixed design regression framework or Iglesias-Pérez (2003) for the inclusion of left-truncation. Xiang (1995) obtained the deficiency of the sample quantile estimator with respect to a kernel estimator using coverage probability. Qin and Tsao (2003) studied empirical likelihood inference for median regression models and showed that the limiting distribution of the empirical ratio for the parameter vector estimate is a weighted sum of $\chi^2$ distributions. Ould-Saïd (2006) constructed a kernel estimator of the conditional quantile under an i.i.d. censorship model and established its strong uniform convergence rate; Liang and de Uña-Álvarez (2011) proved the strong uniform convergence and asymptotic normality of this estimator under dependence assumptions.

* Corresponding author at: Department of Mathematics, Tongji University, Shanghai 200092, PR China.
E-mail addresses: [email protected] (H.-Y. Liang), [email protected] (J. de Uña-Álvarez).
0378-3758/$ - see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.jspi.2011.05.012

In practice, the response variable $Y$, which is the variable of interest and is referred to hereafter as the lifetime, may be subject to right censoring and/or left-truncation. In this paper we are interested in the left-truncation model. Such data occur in astronomy, economics, epidemiology and biometry; see, e.g. Woodroofe (1985) and He and Yang (1994). Recently, Lemdani et al. (2009) studied asymptotic properties of the kernel conditional quantile estimator for the left-truncated model in the independent setting. The dependent data scenario is an important one in a number of applications with survival data. For example, when sampling clusters of individuals (family members, or repeated measurements on the same individual), lifetimes within clusters are typically correlated (see Kang and Koehler, 1997, or Cai et al., 2000). There is some literature devoted to the study of conditional quantile estimation under dependence. To mention some examples, Cai (2002) investigated the asymptotic normality and the weak convergence of a weighted Nadaraya–Watson conditional df and quantile estimator for α-mixing time series. Honda (2000) dealt with α-mixing processes and proved the uniform convergence and asymptotic normality of an estimate of $\xi_p(x)$ using the local polynomial fitting method. Ferraty et al. (2005) considered quantile regression under dependence when the conditioning variable is infinite dimensional. Nonparametric conditional median predictors for time series based on the double kernel method and the constant kernel method were proposed by Gannoun et al. (2003). A nice extension of the conditional quantile process theory to set-indexed processes under strong mixing was established in Polonik and Yao (2002).
Also, Zhou and Liang (2000) reported an asymptotic analysis of a kernel conditional median estimator for dependent data. Ould-Saïd et al. (2009) recently discussed strong uniform convergence, with rate, of the kernel conditional quantile estimator for left-truncated and dependent data. However, to the best of our knowledge, there have been no results dealing with the estimation of the conditional quantile for left-truncated and dependent data with auxiliary information.
It is assumed that some auxiliary information about the conditional distribution function is available in the sense that there exist $k$ ($k \ge 1$) functions $g_1(y), \ldots, g_k(y)$ such that
\[ M(x) = \mathbf{E}(g(Y)|X = x) = 0, \tag{1.2} \]
where $g(y) = (g_1(y), \ldots, g_k(y))^t$ is a $k$-dimensional vector. This model is of interest in many circumstances where only some partial information about the conditional distribution of the sample is known. For example, for given $x$, if the conditional mean $m(x) = \mathbf{E}(Y|X = x)$ is known, we have (1.2) by taking $g(y) = y - m(x)$; if the conditional distribution is symmetric about a known constant $y_0$ for a given $x$, then (1.2) holds for $g(y) = I(y \ge y_0) - \tfrac{1}{2}$; in case it is known that $\mathbf{P}(a < Y < b|X = x) = p_0$, we can let $g(y) = I(a < y < b) - p_0$.
The empirical likelihood method, as a nonparametric technique for constructing confidence regions, was introduced by Owen (1988, 1990). Chen and Qin (1993) showed that the empirical likelihood method can be naturally applied to make more accurate statistical inference in finite population estimation problems by employing auxiliary information efficiently. By using the empirical likelihood method, Zhang (1995) proposed a new class of more efficient M-estimators and quantile estimators in the presence of some auxiliary information, under a nonparametric setting; Qin and Wu (2001) considered the estimation of the conditional quantile with auxiliary information for complete samples. In addition, Shen and He (2007) studied empirical likelihood for the difference of quantiles under censorship; Tse (2005) considered quantile processes for left-truncated and right-censored data; Li et al. (1996) investigated nonparametric likelihood ratio confidence bands for quantile functions from incomplete survival data; Zhou et al. (2000) studied the estimation of a quantile function based on left-truncated and right-censored data by the kernel smoothing method. All of the above works related to the empirical likelihood method are devoted to the independent setting.
By using the empirical likelihood method, we define, in this paper, a new estimator of $\xi_p(x)$ in the presence of auxiliary information (1.2) for the left-truncation model, and establish the asymptotic normality of the estimator when the data exhibit some kind of dependence. It is assumed that the lifetime observations with multivariate covariates form a stationary α-mixing sequence. The result shows that the asymptotic variance of the proposed estimator is not larger than that of the standard kernel estimator.
Recall that a sequence $\{\xi_k, k \ge 1\}$ is said to be α-mixing if the α-mixing coefficient
\[ \alpha(n) := \sup_{k \ge 1} \sup\{|P(AB) - P(A)P(B)| : A \in \mathcal{F}_{n+k}^{\infty},\ B \in \mathcal{F}_1^{k}\} \]
converges to zero as $n \to \infty$, where $\mathcal{F}_l^m$ denotes the σ-algebra generated by $\xi_l, \xi_{l+1}, \ldots, \xi_m$ with $l \le m$. Among the various mixing conditions used in the literature, α-mixing is reasonably weak and is known to be fulfilled for many stochastic processes, including many time series models. Withers (1981) derived the conditions under which a linear process is α-mixing. In fact, under very mild assumptions, linear autoregressive and more generally bilinear time series models are strongly mixing with mixing coefficients decaying exponentially, i.e., $\alpha(n) = O(\rho^n)$ for some $0 < \rho < 1$. See Doukhan (1994, p. 99) for more details, and Cai and Kim (2003) for motivation in the scope of survival analysis.
The rest of the paper is organized as follows. Section 2 introduces the estimator of the conditional quantile. The main result is formulated in Section 3. A simulation study is presented in Section 4. Section 5 gives the proof of the main result. Some preliminary lemmas, which are used in the proof of the main result, are collected in Appendix A.
2. Estimator
Let $\{(X_k, Y_k, T_k), 1 \le k \le N\}$ be a sample from $(X, Y, T)$, where $T$ is the truncation variable. For the components of $(X, Y, T)$, in addition to the assumptions and notation for $X$ and $Y$ introduced above, we assume throughout that $T$ and $(X, Y)$ are independent, and that $T$ has continuous df $G$. Let $F(\cdot,\cdot)$ be the joint df of the random vector $(X, Y)$. Without loss of generality, we assume that both $Y$ and $T$ are non-negative random variables, as usual in survival analysis.
In the random left-truncation model, the lifetime $Y_i$ is interfered with by the truncation random variable $T_i$ in such a way that both $Y_i$ and $T_i$ are observable only when $Y_i \ge T_i$, whereas neither is observed if $Y_i < T_i$, for $i = 1, \ldots, N$, where $N$ is the potential sample size. Due to the occurrence of truncation, $N$ is unknown, and $n$, the size of the actually observed sample, is random with $n \le N$. Let $\theta = \mathbf{P}(Y \ge T)$ be the probability that the random variable $Y$ is observable. Note that $\theta = 0$ implies that no data can be observed, so we suppose throughout the paper that $\theta > 0$. Since $N$ is unobserved and $n$ is observed, our results are not stated with respect to the probability measure $\mathbf{P}$ (related to the $N$-sample) but involve the conditional probability $P$ with respect to the actually observed $n$-sample, i.e., $P$ always refers to the conditioning event $Y \ge T$. Also, $\mathbf{E}$ and $E$ will denote the expectation operators under $\mathbf{P}$ and $P$, respectively. In the sequel, the observed sample $\{(X_k, Y_k, T_k), 1 \le k \le n\}$ is assumed to be a stationary α-mixing sequence.
Following the idea of Lynden-Bell (1971), the nonparametric maximum likelihood estimators of the dfs $\tilde F$ and $G$ are given, respectively, by
\[ 1 - \tilde F_n(y) = \prod_{i:\,Y_i \le y}\left[1 - \frac{1}{nC_n(Y_i)}\right] \quad\text{and}\quad G_n(y) = \prod_{i:\,T_i > y}\left[1 - \frac{1}{nC_n(T_i)}\right], \]
where $C_n(y) = n^{-1}\sum_{i=1}^{n} I(T_i \le y \le Y_i)$. The estimator of $\theta$ is defined (cf. He and Yang, 1998) by $\theta_n = G_n(y)[1 - \tilde F_n(y)]/C_n(y)$ for all $y$ such that $C_n(y) \ne 0$.
For any df $W$, let $a_W = \inf\{y : W(y) > 0\}$ and $b_W = \sup\{y : W(y) < 1\}$ be its two endpoints. Since $T$ is independent of $(X, Y)$, the conditional joint distribution of $(X, Y, T)$ is
\[ H^*(x,y,t) = P(X \le x, Y \le y, T \le t) = \mathbf{P}(X \le x, Y \le y, T \le t \mid Y \ge T) = \theta^{-1}\int_{s \le x}\int_{a_G \le v \le y} G(v \wedge t)\,F(ds,dv). \]
Taking $t = +\infty$, the observed pair $(X, Y)$ then has the following df $F^*(\cdot,\cdot)$:
\[ F^*(x,y) = H^*(x,y,\infty) = \theta^{-1}\int_{s \le x}\int_{a_G \le v \le y} G(v)\,F(ds,dv), \]
which yields that
\[ F(dx,dy) = [\theta^{-1}G(y)]^{-1}F^*(dx,dy) \quad\text{for } y > a_G \tag{2.1} \]
and $L(x) = \theta\int_{s \le x}\int_{y \ge a_G}(1/G(y))\,F^*(ds,dy)$. Thus, the estimators of $L(x)$ and $F(\cdot,\cdot)$ are given, respectively, by
\[ L_n(x) = \frac{\theta_n}{n}\sum_{i=1}^{n}\frac{1}{G_n(Y_i)}I(X_i \le x) \]
and
\[ F_n(x,y) = \frac{\theta_n}{n}\sum_{i=1}^{n}\frac{1}{G_n(Y_i)}I(X_i \le x, Y_i \le y). \tag{2.2} \]
Note that in Eq. (2.2) and the forthcoming formulae, the sum is taken only over those $i$ such that $G_n(Y_i) \ne 0$. Note also that, by applying (2.1), (1.2) is equivalent to
\[ E(g(Y)(G(Y))^{-1}|X = x) = 0. \tag{2.3} \]
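As an illustration only, the product-limit formulas for $C_n$, $\tilde F_n$, $G_n$ and $\theta_n$ can be sketched in Python with NumPy. The function name `lynden_bell` and the toy data are assumptions made for this sketch and do not come from the paper; ties are not treated specially, and $\theta_n$ is evaluated at a point that is not an observed lifetime so that no left-limit convention matters.

```python
import numpy as np

def lynden_bell(y_obs, t_obs):
    """Product-limit estimators for left-truncated pairs with T_i <= Y_i.

    Hypothetical helper: returns the functions C_n, F_n (lifetime df),
    G_n (truncation df) and theta_n described in the text.
    """
    y = np.asarray(y_obs, dtype=float)
    t = np.asarray(t_obs, dtype=float)
    n = len(y)

    def C(s):
        # C_n(s) = n^{-1} * #{i : T_i <= s <= Y_i}
        return np.mean((t <= s) & (s <= y))

    # n*C_n at each observed Y_i and T_i; both are >= 1 because T_i <= Y_i
    nC_y = np.array([n * C(v) for v in y])
    nC_t = np.array([n * C(v) for v in t])

    def F_tilde(s):
        # 1 - F_n(s) = prod over {i : Y_i <= s} of (1 - 1/(n C_n(Y_i)))
        return 1.0 - np.prod(1.0 - 1.0 / nC_y[y <= s])

    def G(s):
        # G_n(s) = prod over {i : T_i > s} of (1 - 1/(n C_n(T_i)))
        return np.prod(1.0 - 1.0 / nC_t[t > s])

    def theta(s):
        # theta_n = G_n(s) [1 - F_n(s)] / C_n(s), for s with C_n(s) != 0
        return G(s) * (1.0 - F_tilde(s)) / C(s)

    return C, F_tilde, G, theta

# Toy data (assumed for illustration): four observed pairs with T_i <= Y_i
C, F_tilde, G, theta = lynden_bell([3.0, 5.0, 4.0, 6.0], [1.0, 2.0, 3.5, 0.5])
```

In this sketch an empty product equals 1, so `G(s)` is 1 to the right of the largest truncation time and `F_tilde(s)` is 0 to the left of the smallest lifetime.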
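To make use of (1.2), i.e., (2.3), we now introduce the empirical likelihood function $H = \prod_{i=1}^{n} p_i$, where $p_1, \ldots, p_n$ are subject to the restrictions
\[ p_i \ge 0, \qquad \sum_{i=1}^{n} p_i = 1, \qquad \sum_{i=1}^{n} p_i K\left(\frac{x - X_i}{h_n}\right) g(Y_i)(G_n(Y_i))^{-1} = 0, \]
and $K$ is some kernel function on $\mathbb{R}^d$, $(h_n)_{n \ge 1} \downarrow 0$ as $n \uparrow \infty$. The maximum of $H$ can be found via the method of Lagrange multipliers. It may be shown that $H_{\max} = \prod_{i=1}^{n} \hat p_i$, where
\[ \hat p_i = \frac{1}{n}\,\frac{1}{1 + \eta^t K\left(\frac{x - X_i}{h_n}\right) g(Y_i)(G_n(Y_i))^{-1}}, \qquad i = 1, \ldots, n, \tag{2.4} \]
and $\eta$ is the solution of the following equation:
\[ \sum_{i=1}^{n} \frac{K\left(\frac{x - X_i}{h_n}\right) g(Y_i)(G_n(Y_i))^{-1}}{1 + \eta^t K\left(\frac{x - X_i}{h_n}\right) g(Y_i)(G_n(Y_i))^{-1}} = 0. \tag{2.5} \]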
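The weights (2.4) and the multiplier equation (2.5) can be illustrated numerically. The following is a minimal sketch for the scalar case $k = 1$ only, assuming the terms $w_i = K((x - X_i)/h_n)\,g(Y_i)/G_n(Y_i)$ have already been computed; the function name `el_weights` and the bisection solver are choices made for this sketch, not the authors' implementation. It uses the fact that, for scalar $w_i$, the left-hand side of (2.5) is strictly decreasing in $\eta$ on the interval where all $1 + \eta w_i$ are positive.

```python
import numpy as np

def el_weights(w, tol=1e-12, max_iter=200):
    """Empirical likelihood weights for a scalar constraint (k = 1).

    Solves sum_i w_i / (1 + eta * w_i) = 0 for eta by bisection and returns
    p_i = 1 / (n * (1 + eta * w_i)) together with eta.
    """
    w = np.asarray(w, dtype=float)
    n = len(w)
    if not (w.min() < 0.0 < w.max()):
        raise ValueError("zero must lie strictly inside the range of w")
    # 1 + eta * w_i > 0 for all i  <=>  -1/max(w) < eta < -1/min(w)
    lo, hi = -1.0 / w.max(), -1.0 / w.min()
    shrink = 1e-10 * (hi - lo)
    lo, hi = lo + shrink, hi - shrink

    def score(eta):
        # left-hand side of the multiplier equation
        return np.sum(w / (1.0 + eta * w))

    # score is strictly decreasing on (lo, hi), positive near lo, negative near hi
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    eta = 0.5 * (lo + hi)
    p = 1.0 / (n * (1.0 + eta * w))
    return p, eta

# Toy constraint terms (assumed values, for illustration only)
w_demo = np.array([-1.0, 0.5, 2.0, -0.3, 0.0])
p_demo, eta_demo = el_weights(w_demo)
```

Observations with $w_i = 0$ (for instance, those outside the kernel window) simply receive weight $1/n$. For $k > 1$ a multivariate root finder such as Newton's method would be needed instead of bisection.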
Motivated by (2.2) and (2.4) we propose the following as an estimator for $F(y|x)$:
\[ F_n(y|x) = \frac{\sum_{i=1}^{n}\hat p_i (G_n(Y_i))^{-1} I(Y_i \le y) K\left(\frac{x - X_i}{h_n}\right)}{\sum_{i=1}^{n}\hat p_i (G_n(Y_i))^{-1} K\left(\frac{x - X_i}{h_n}\right)} = \frac{\theta h_n^{-d}\sum_{i=1}^{n}\hat p_i (G_n(Y_i))^{-1} I(Y_i \le y) K\left(\frac{x - X_i}{h_n}\right)}{\theta h_n^{-d}\sum_{i=1}^{n}\hat p_i (G_n(Y_i))^{-1} K\left(\frac{x - X_i}{h_n}\right)} := \frac{F_{1n}(x,y)}{l_n(x)} \tag{2.6} \]
with the convention $0/0 = 0$. Then a natural estimator of $\xi_p(x)$ is given by $\xi_{pn}(x) = \inf\{y : F_n(y|x) \ge p\}$.
3. Main result
In the sequel, let $C$ and $c$ denote generic finite positive constants, whose values may change from line to line; $\mathrm{supp}(l) = \{x \in \mathbb{R}^d \mid l(x) > 0\}$; $U(x)$ represents a neighborhood of $x$; $A_n = O(B_n)$ stands for $|A_n| \le C|B_n|$. All limits are taken as the sample size $n$ tends to $\infty$, unless specified otherwise. Put
\[ \mu(x) = \mathbf{E}\{[I(Y \le \xi_p(x)) - p]\,g(Y)(G(Y))^{-1}|X = x\}, \]
\[ \nu(x) = \mathbf{E}\{[I(Y \le \xi_p(x)) - p]^2 (G(Y))^{-1}|X = x\} \quad\text{and}\quad V(x) = \mathbf{E}\{g(Y)g^t(Y)(G(Y))^{-1}|X = x\}. \]
In order to formulate the main result, we need the following assumptions:
(A0) $a_G < a_{\tilde F}$ and $b_G < b_{\tilde F}$.
(A1) For all integers $j \ge 1$, the joint conditional density $l_j(\cdot,\cdot)$ of $X_1$ and $X_{j+1}$ exists on $\mathbb{R}^d \times \mathbb{R}^d$ and satisfies $l_j(s,t) \le C$ for $(s,t) \in U(x) \times U(x)$.
(A2) (i) The kernel $K(\cdot)$ is a bounded function with compact support on $\mathbb{R}^d$; (ii) $\int_{\mathbb{R}^d} K(x)\,dx = 1$; (iii) $\int_{\mathbb{R}^d} x_1^{i_1}\cdots x_d^{i_d} K(x)\,dx = 0$ for non-negative integers $i_1, \ldots, i_d$ with $i_1 + \cdots + i_d = 1$.
(A3) The sequence $\alpha(n)$ satisfies that (i) there exist positive integers $q := q_n$ such that $q = o((nh_n^d)^{1/2})$ and $\lim_{n \to \infty}(nh_n^{-d})^{1/2}\alpha(q) = 0$; (ii) there exist $r > 2$ and $\delta > 1 - 2/r$ such that $\sum_{l=1}^{\infty} l^{\delta}[\alpha(l)]^{1-2/r} < \infty$.
(A4) For $1 \le k, l \le k$, $j \ge 1$ and $(s,t) \in U(x) \times U(x)$, the conditional expectations satisfy (i) $E(|g_k(Y_1)g_l(Y_1)g_k(Y_{1+j})g_l(Y_{1+j})| \mid X_1 = s, X_{1+j} = t) < \infty$ and $E(|g_k(Y_1)g_k(Y_{1+j})| \mid X_1 = s, X_{1+j} = t) < \infty$; (ii) $E(|g_k(Y_1)g_l(Y_{1+j})| \mid X_1 = s, X_{1+j} = t) < \infty$, $E(|g_k(Y_1)| \mid X_1 = s, X_{1+j} = t) < \infty$ and $E(|g_k(Y_{1+j})| \mid X_1 = s, X_{1+j} = t) < \infty$.
(A5) For $1 \le k, l \le k$, $E(|g_k(Y)g_l(Y)|^r \mid X = s) < \infty$ and $E(|g_k(Y)|^r \mid X = s) < \infty$ for $s \in U(x)$, where $r$ is the same as in (A3).
(A6) The second partial derivatives of $M(s)$ and $l(s)$ are bounded in $U(x)$.
(A7) $\nu(s)$, $\mu(s)$ and $V(s)$ are continuous at $x$, and $V(x)$ is positive definite.
(A8) $F_1(\cdot,\cdot)$ has bounded partial derivatives of order 2 with respect to the first component in $U(x)$; $f(x,y)$ is continuous at $y = \xi_p(x)$.
(A9) $n^{-1}h_n^{-d(1+4/r)} = O(1)$ and $nh_n^{d+4} \to 0$, where $r$ is the same as in (A3).
Remark 3.1. (a) Condition $a_G < a_{\tilde F}$ in (A0) implies $G(Y) \ge G(a_{\tilde F}) > 0$, which ensures $G_n(Y_i) \ne 0$ eventually, so the given estimators are well defined for large $n$. Assumptions (A1) and (A4) are mainly technical; they are employed to simplify the calculation of covariances in the proof and are redundant in the independent setting. Conditions (A2), (A5) and (A7) are standard regularity conditions. Assumptions (A6) and (A8) allow us to apply the Taylor expansion in the proof.
(b) The role of assumption (A3) is to allow the use of Bernstein's big-block and small-block technique to prove asymptotic normality for an α-mixing sequence. Moreover, if $\alpha(n) = O(n^{-\gamma})$ for some $\gamma > 0$, then assumption (A3) implies restrictions on the degree of dependence of the observable sequence. Indeed, the conditions in (A3) can be satisfied easily: for example, choose $h_n^d = cn^{-\eta}$ for some $0 < \eta < 1$ and $q_n = (nh_n^d/\log n)^{1/2}$; then (A3) automatically holds if $\gamma$ is large enough, specifically $\gamma > \max\{(1+\eta)/(1-\eta),\ r(1+\delta)/(r-2)\}$ (note that $\gamma$ can be arbitrarily large if $\alpha(n) = O(\rho^n)$ for some $0 < \rho < 1$). Assumption (A9) is a technical condition.
Theorem 3.1. Let $x \in \mathrm{supp}(l)$. Suppose that (A0)–(A9) are satisfied and that $\alpha(n) = O(n^{-\gamma})$ for some $\gamma \ge r(r+2)/[2(r-2)]$. Then
\[ (nh_n^d)^{1/2}(\xi_{pn}(x) - \xi_p(x)) \xrightarrow{D} N(0, \sigma^2(x)), \]
where
\[ 0 < \sigma^2(x) = \frac{\Delta^2(x)}{f^2(x, \xi_p(x))} < \infty \]
and
\[ \Delta^2(x) := \theta\, l(x)\,[\nu(x) - \mu^t(x)V^{-1}(x)\mu(x)]\int_{\mathbb{R}^d} K^2(s)\,ds. \]
Remark 3.2. (a) Let $S_1(x,y) = \int_{t \le y} f(x,t)\,dt/G(t)$ and $S_2(x) = \int_{\mathbb{R}} f(x,t)\,dt/G(t)$; then
\[ \nu(x) = (l(x))^{-3}\,[S_1(x,\xi_p(x))\,l^2(x) + S_2(x)F_1^2(x,\xi_p(x)) - 2S_1(x,\xi_p(x))F_1(x,\xi_p(x))\,l(x)]. \]
By using Theorem 3.1 it is possible to construct confidence intervals for $\xi_p(x)$. For this purpose a plug-in estimate $\hat\sigma_n^2(x)$ of $\sigma^2(x)$ can be used. Having already defined the estimator $\theta_n$ of $\theta$ in Section 2, the estimators of $l(x)$, $F_1(x,y)$, $S_1(x,y)$, $S_2(x)$, $\mu(x)$, $V(x)$ and $f(x,y)$ are defined, respectively, by
\[ \hat l_n(x) = \frac{\theta_n}{h_n^d}\sum_{i=1}^{n}\hat p_i (G_n(Y_i))^{-1}K\left(\frac{x - X_i}{h_n}\right), \qquad \hat F_{1n}(x,y) = \frac{\theta_n}{h_n^d}\sum_{i=1}^{n}\hat p_i (G_n(Y_i))^{-1}I(Y_i \le y)K\left(\frac{x - X_i}{h_n}\right), \]
\[ \hat S_{1n}(x,y) = \frac{\theta_n}{nh_n^d}\sum_{i=1}^{n}\frac{I(Y_i \le y)}{G_n^2(Y_i)}K\left(\frac{x - X_i}{h_n}\right), \qquad \hat S_{2n}(x) = \frac{\theta_n}{nh_n^d}\sum_{i=1}^{n}\frac{1}{G_n^2(Y_i)}K\left(\frac{x - X_i}{h_n}\right), \]
\[ \hat\mu_n(x) = \frac{\theta_n}{nh_n^d\,\hat l_n(x)}\sum_{i=1}^{n}\frac{I(Y_i \le \xi_{pn}(x))\,g(Y_i)}{G_n^2(Y_i)}K\left(\frac{x - X_i}{h_n}\right), \qquad \hat V_n(x) = \frac{\theta_n}{nh_n^d\,\hat l_n(x)}\sum_{i=1}^{n}\frac{g(Y_i)g^t(Y_i)}{G_n^2(Y_i)}K\left(\frac{x - X_i}{h_n}\right) \]
and
\[ \hat f_n(x,y) = \frac{\hat F_{1n}(x, y + a_n) - \hat F_{1n}(x, y - a_n)}{2a_n}, \]
where $0 < a_n \to 0$. This yields the following confidence interval of asymptotic level $1 - \delta$ for $\xi_p(x)$:
\[ \left[\xi_{pn}(x) - \frac{u_{1-\delta/2}\,\hat\sigma_n(x)}{\sqrt{nh_n^d}},\ \ \xi_{pn}(x) + \frac{u_{1-\delta/2}\,\hat\sigma_n(x)}{\sqrt{nh_n^d}}\right], \]
where $u_{1-\delta/2}$ denotes the $1 - \delta/2$ quantile of the standard normal distribution.
(b) Without the auxiliary information (1.2), $F_n(y|x)$ in (2.6) reduces to the standard kernel estimator
\[ \bar F_n(y|x) = \sum_{i=1}^{n}(G_n(Y_i))^{-1}I(Y_i \le y)K\left(\frac{x - X_i}{h_n}\right)\Bigg/\sum_{i=1}^{n}(G_n(Y_i))^{-1}K\left(\frac{x - X_i}{h_n}\right) \]
of $F(y|x)$, which is just the non-smoothed version of the estimator constructed by Lemdani et al. (2009) and which has not been investigated in the literature under our assumptions; furthermore, $\xi_{pn}(x)$ reduces to the standard kernel estimator $\bar\xi_{pn}(x) = \inf\{y : \bar F_n(y|x) \ge p\}$ of $\xi_p(x)$. In this case, we have $\Delta^2(x) = \theta\,l(x)\,\nu(x)\int_{\mathbb{R}^d}K^2(s)\,ds$ in Theorem 3.1. Since $V(x)$ is positive definite, $\mu^t(x)V^{-1}(x)\mu(x) \ge 0$, so the proposed estimator is asymptotically at least as efficient as the standard kernel estimator, and strictly more efficient whenever $\mu(x) \ne 0$.
(c) In an i.i.d. setting without the auxiliary information (1.2), Lemdani et al. (2009) constructed a kernel estimator of $\xi_p(x)$ for the left-truncated model, and proved asymptotic normality of their estimator with asymptotic variance
\[ l^2(x)\,\tilde\sigma^2(x, \xi_p(x))/f^2(x, \xi_p(x)) = \theta\,l(x)\,\nu(x)\int_{\mathbb{R}^d}K^2(s)\,ds\Big/f^2(x, \xi_p(x)), \]
which is the asymptotic variance in Theorem 3.1 without the auxiliary information (1.2), where
\[ \tilde\sigma^2(x,y) = \frac{\theta\,[S_1(x,y)\,l^2(x) + S_2(x)F_1^2(x,\xi_p(x)) - 2S_1(x,y)F_1(x,y)\,l(x)]}{l^4(x)}\int_{\mathbb{R}^d}K^2(s)\,ds \]
(hereby correcting a typo in Lemdani et al., 2009 by replacing $\theta$ by $1/\theta$).
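To make the estimators of Remark 3.2 concrete, here is a minimal Python sketch of the weighted conditional df, the quantile obtained by inverting it, and the normal-approximation confidence interval. All function and argument names are assumptions made for this illustration. The values $G_n(Y_i)$ and, optionally, the empirical likelihood weights $\hat p_i$ are taken as inputs; with `p_hat=None` the code computes the standard kernel estimator $\bar F_n(y|x)$, and with the weights $\hat p_i$ it computes (2.6). The default kernel is the product quartic kernel used in Section 4, and the quantile inversion assumes a non-negative kernel.

```python
import numpy as np
from statistics import NormalDist

def quartic_product_kernel(u):
    """Product of (15/16)(1 - u_j^2)^2 I(|u_j| <= 1) over coordinates; u is (n, d)."""
    u = np.atleast_2d(np.asarray(u, dtype=float))
    k = (15.0 / 16.0) * (1.0 - u**2) ** 2 * (np.abs(u) <= 1.0)
    return k.prod(axis=1)

def _weights(x, X, G_at_Y, h, p_hat, kernel):
    # w_i = p_i * K((x - X_i)/h) / G_n(Y_i), summed only where G_n(Y_i) != 0
    X = np.asarray(X, dtype=float)            # shape (n, d)
    G_at_Y = np.asarray(G_at_Y, dtype=float)  # shape (n,)
    n = X.shape[0]
    p = np.full(n, 1.0 / n) if p_hat is None else np.asarray(p_hat, dtype=float)
    ok = G_at_Y > 0
    w = np.zeros(n)
    w[ok] = p[ok] * kernel((x - X[ok]) / h) / G_at_Y[ok]
    return w

def cond_cdf(y, x, X, Y, G_at_Y, h, p_hat=None, kernel=quartic_product_kernel):
    """Weighted estimate of F(y|x), with the convention 0/0 = 0."""
    Y = np.asarray(Y, dtype=float)
    w = _weights(x, X, G_at_Y, h, p_hat, kernel)
    den = w.sum()
    return 0.0 if den == 0 else float(w[Y <= y].sum() / den)

def cond_quantile(p, x, X, Y, G_at_Y, h, p_hat=None, kernel=quartic_product_kernel):
    """inf{y : F_n(y|x) >= p}; the infimum is attained at an observed Y_i."""
    Y = np.asarray(Y, dtype=float)
    w = _weights(x, X, G_at_Y, h, p_hat, kernel)
    total = w.sum()
    if total == 0:
        return float("nan")
    order = np.argsort(Y)
    cum = np.cumsum(w[order])
    idx = min(int(np.searchsorted(cum, p * total, side="left")), len(Y) - 1)
    return float(Y[order][idx])

def quantile_ci(xi_hat, sigma_hat, n, h, d, delta=0.05):
    """xi_hat -/+ u_{1-delta/2} * sigma_hat / sqrt(n h^d)."""
    half = NormalDist().inv_cdf(1.0 - delta / 2.0) * sigma_hat / np.sqrt(n * h**d)
    return xi_hat - half, xi_hat + half

# Toy check (assumed data): no truncation effect (G_n = 1), all X_i at x
X_demo = np.zeros((4, 1))
Y_demo = np.array([1.0, 2.0, 3.0, 4.0])
G_demo = np.ones(4)
x_demo = np.zeros(1)
median_demo = cond_quantile(0.5, x_demo, X_demo, Y_demo, G_demo, h=0.5)
```

The plug-in value `sigma_hat` is left as an input here; computing it would require the estimators $\hat\mu_n$, $\hat V_n$ and $\hat f_n$ listed in part (a).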
4. Simulation study
In this section, we carry out a simulation study to investigate the finite sample performance of the estimator $\xi_{pn}(x)$ of $\xi_p(x)$ in the case $d = 2$. In particular, we compare the mean squared errors of the new estimator (NE) $\xi_{pn}(x)$ and the standard kernel estimator (SKE) $\bar\xi_{pn}(x)$ (see Remark 3.2(b), in which auxiliary information is not used). We also examine the quality of the normal approximation for the new estimator $\xi_{pn}(x)$ through its histograms and normal-probability plots.
In order to obtain an α-mixing observed sequence $\{X_i, Y_i, T_i\}$ after truncation, we generate the observed data as follows:
(1) Drawing of $(X_1, Y_1, T_1)$:
Step 1. Draw $e_1 \sim N(1, 0.25^2)$ and take $X_1 = e_1$.
Step 2. Compute $Y_1$ from the model $Y_1 = 6.8 + 0.26X_1 + \epsilon_1$, where $\epsilon_1 \sim N(0, 0.1^2)$.
Step 3. Draw $T_1 \sim N(\mu, 1)$, where $\mu$ is adapted in order to get different values of $\theta$. If $Y_1 < T_1$, we reject the datum and go back to Step 2; this is repeated until $Y_1 \ge T_1$.
(2) Drawing of $(X_2, Y_2, T_2)$:
Step 4. Draw $X_2$ from the AR(1) model $X_2 = 0.5X_1 + e_2$, where $e_2 \sim N(1, 0.25^2)$.
Step 5. Compute $Y_2$ from the model $Y_2 = 6.8 - 0.17X_1^2 + 0.26X_2 + \epsilon_2$, where $\epsilon_2 \sim N(0, 0.1^2)$.
Step 6. Draw $T_2 \sim N(\mu, 1)$. If $Y_2 < T_2$, we reject the datum and go back to Step 5; this is repeated until $Y_2 \ge T_2$.
By replicating the process (2) above, we generate the observed data $(X_i, Y_i, T_i)$, $i = 1, \ldots, n$. The generating process shows that the covariate vector is $\mathbf{X}_i = (X_{i-1}, X_i)$, where $X_i = 0.5X_{i-1} + e_i$ and the $e_i$'s are i.i.d. random variables with distribution $N(1, 0.25^2)$,
\[ Y_i = 6.8 - 0.17X_{i-1}^2 + 0.26X_i + \epsilon_i \]
and $Y_i \ge T_i$, where the $\epsilon_i$ are i.i.d. random variables with distribution $N(0, 0.1^2)$ and $T_i \sim N(\mu, 1)$; here $\mu$ is adapted in order to get different values of $\theta$. Obviously, the regression function is $m(x) = \mathbf{E}(Y|\mathbf{X} = x) = 6.8 - 0.17x_1^2 + 0.26x_2$ at $x = (x_1, x_2)$. Assume that $m(1,1) = 6.89$ is known. For the proposed estimator, we employ the auxiliary information $g(y) = y - 6.89$ and the product kernel $K(x_1)K(x_2)$ with $K(x) = \frac{15}{16}(1 - x^2)^2 I(|x| \le 1)$, and we choose the bandwidth $h_n = n^{-1/8}$.
4.1. Comparison between the new and standard estimators
We draw random samples with sample size $n = 300$, 500 and 1000 from the model above and take different values of the non-truncation proportion: $\theta \approx 30\%$, 60% and 90%. In Table 1, we report the mean squared errors (MSE) of the estimators $\xi_{pn}(x)$ and $\bar\xi_{pn}(x)$ based on $M = 500$ replications. From Table 1, it can be seen that (i) the new estimator performs better than the standard kernel estimator; (ii) for the same sample size, the performance of the new estimator does not seem to be greatly affected by the percentage of truncated data $1 - \theta$. This is not surprising, since we repeat the data generation procedure until a given size $n$ is reached. Although truncation does not influence the final (or effective) sample size in our simulations, it may influence the amount of information carried by each single observation on the target. This explains the variation in the MSE of the two estimators with the truncation rate. The MSE of the SKE slightly decreases with the truncation rate, while (for specific sample sizes) the MSE of the NE shows a bathtub shape along the $\theta$ values; a possible explanation for this different behavior is that the relevance of the auxiliary information incorporated by the new estimator is influenced by the truncation.
In order to see what happens if the auxiliary information is misspecified, we report, in Table 2, the MSE of $\xi_{pn}(x)$ based on $M = 500$ replications for $\phi = 6.89$ (true value) and other (wrong) values of $\phi$ in the auxiliary information $g(y) = y - \phi$ when the non-truncation proportion is $\theta = 90\%$. Table 2 shows that for values of $\phi$ other than $\phi = 6.89$ (true value), the MSE of the new estimator may be greater than that of the standard kernel estimator. Indeed, the MSE of the new estimator grows as the specified $\phi$ moves away from the truth. However, when the degree of misspecification in the auxiliary information is low (e.g. $\phi = 6.87, 6.91$), the new estimator may still be preferable.
4.2. Asymptotic normality
In this subsection, we examine the quality of the normal approximation for the new estimator $\xi_{pn}(x)$ at $x = (1,1)$ through histograms and normal-probability plots. We draw $M$ independent $n$-samples. In Figs. 1 and 2, we plot the histograms and normal-probability plots for $\theta = 90\%$ based on $M = 500$ replications with sample size $n = 300$ and 900, respectively. From Figs. 1 and 2, it is seen that the quality of fit improves as the sample size $n$ increases.
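The data-generating mechanism of Steps 1 to 6 can be sketched in Python as follows. This is an illustrative reading of the steps, not the authors' code: the function name `simulate_truncated_sample`, the seed handling and the value `mu=5.0` in the usage line are assumptions (the paper only says that $\mu$ is tuned to reach a target $\theta$).

```python
import numpy as np

def simulate_truncated_sample(n, mu, seed=0):
    """Generate n observed triples following Steps 1-6 of the simulation design.

    Returns the scalar AR(1) series x, the lifetimes Y and the truncation
    times T, with Y[i] >= T[i] for every i.
    """
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    Y = np.empty(n)
    T = np.empty(n)
    for i in range(n):
        e = rng.normal(1.0, 0.25)  # innovation e_i ~ N(1, 0.25^2)
        if i == 0:
            x[i] = e                                   # Step 1
            mean_y = 6.8 + 0.26 * x[i]                 # Step 2
        else:
            x[i] = 0.5 * x[i - 1] + e                  # Step 4 (AR(1))
            mean_y = 6.8 - 0.17 * x[i - 1] ** 2 + 0.26 * x[i]  # Step 5
        while True:
            # redraw the error and the truncation time until Y >= T (Steps 3, 6)
            y = mean_y + rng.normal(0.0, 0.1)
            t = rng.normal(mu, 1.0)
            if y >= t:
                break
        Y[i], T[i] = y, t
    return x, Y, T

# Usage: bivariate covariates (X_{i-1}, X_i) are available from the second draw on
x_sim, Y_sim, T_sim = simulate_truncated_sample(50, mu=5.0, seed=1)
X_pairs = np.column_stack((x_sim[:-1], x_sim[1:]))  # pairs with Y_sim[1:], T_sim[1:]
```

Because rejected draws are discarded and redrawn, the function always returns exactly `n` observed triples, which matches the remark in Section 4.1 that truncation does not change the effective sample size in the simulations.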
Table 1
The MSE of $\xi_{pn}(x)$ and $\bar\xi_{pn}(x)$ with $p = 0.5$, $x = (1,1)$, $\xi_p(x) = 6.89$.

θ      n       SKE                NE
30%    300     1.3770 × 10^{-3}   2.1780 × 10^{-4}
30%    500     1.1730 × 10^{-3}   1.4840 × 10^{-4}
30%    1000    9.8860 × 10^{-4}   9.6600 × 10^{-5}
60%    300     1.6416 × 10^{-3}   2.1230 × 10^{-4}
60%    500     1.4808 × 10^{-3}   1.4630 × 10^{-4}
60%    1000    1.2370 × 10^{-3}   9.6800 × 10^{-5}
90%    300     1.9422 × 10^{-3}   2.1860 × 10^{-4}
90%    500     1.6498 × 10^{-3}   1.4500 × 10^{-4}
90%    1000    1.4662 × 10^{-3}   8.8300 × 10^{-5}
Table 2
The MSE of $\xi_{pn}(x)$ and $\bar\xi_{pn}(x)$ with $p = 0.5$, $x = (1,1)$, $\xi_p(x) = 6.89$, $\theta = 90\%$, $n = 200$, $g(y) = y - \phi$; the true value of $\phi$ is 6.89. The reported MSE values are in units of $10^{-3}$.

Specified φ          SKE       NE
6.81                 2.3477    14.1413
6.83                 2.3477    6.2845
6.85                 2.3477    2.4677
6.87                 2.3477    0.6904
6.89 (true value)    2.3477    0.2729
6.91                 2.3477    0.8865
6.93                 2.3477    2.6036
6.95                 2.3477    5.1192
6.98                 2.3477    9.0803
[Figure: histogram (left panel) and normal-probability plot (right panel; horizontal axis "Observed Data", vertical axis "Probability").]
Fig. 1. $\theta = 90\%$, $n = 300$, $M = 500$.
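5. Proof of Theorem 3.1
We observe that for $u \in \mathbb{R}$
\[ G_n(u) = P\left(\frac{(nh_n^d)^{1/2}(\xi_{pn}(x) - \xi_p(x))}{\sigma(x)} \le u\right) = P\big(p \le F_n(\sigma(x)u(nh_n^d)^{-1/2} + \xi_p(x)|x)\big) \]
and
\[ \Phi(u) - G_n(u) = P\big((nh_n^d)^{1/2}\,l(x)(\Delta(x))^{-1}[F_n(\sigma(x)u(nh_n^d)^{-1/2} + \xi_p(x)|x) - p] < 0\big) - \Phi(-u). \tag{5.1} \]
Put $w_i = K((x - X_i)/h_n)\,g(Y_i)(G(Y_i))^{-1}$, $w_{ni} = K((x - X_i)/h_n)\,g(Y_i)(G_n(Y_i))^{-1}$ and $\epsilon_n = \sigma(x)u(nh_n^d)^{-1/2}$. Then
\[ \hat p_i = \frac{1}{n(1 + \eta^t w_{ni})} = \frac{1}{n}\left(1 - \eta^t w_i + \eta^t w_i\,\frac{G_n(Y_i) - G(Y_i)}{G_n(Y_i)} + \frac{(\eta^t w_{ni})^2}{1 + \eta^t w_{ni}}\right) \]
and
\[
\begin{aligned}
&(nh_n^d)^{1/2}\,l(x)(\Delta(x))^{-1}[F_n(\sigma(x)u(nh_n^d)^{-1/2} + \xi_p(x)|x) - p] \\
&\quad = \frac{l(x)}{l_n(x)}\,\frac{1}{\sqrt{nh_n^d}\,\Delta(x)}\Bigg\{\sum_{i=1}^{n}\frac{\theta[I(Y_i \le \epsilon_n + \xi_p(x)) - p]}{1 + \eta^t w_{ni}}\left[\frac{1}{G_n(Y_i)} - \frac{1}{G(Y_i)}\right]K\left(\frac{x - X_i}{h_n}\right) \\
&\qquad + \sum_{i=1}^{n}\frac{\theta[I(Y_i \le \epsilon_n + \xi_p(x)) - p]}{G(Y_i)}K\left(\frac{x - X_i}{h_n}\right)(1 - \eta^t w_i) \\
&\qquad + \sum_{i=1}^{n}\frac{\theta[I(Y_i \le \epsilon_n + \xi_p(x)) - p]}{G(Y_i)}K\left(\frac{x - X_i}{h_n}\right)\left(\eta^t w_i\,\frac{G_n(Y_i) - G(Y_i)}{G_n(Y_i)} + \frac{(\eta^t w_{ni})^2}{1 + \eta^t w_{ni}}\right)\Bigg\} \\
&\quad := \frac{l(x)}{l_n(x)}\,\frac{1}{\sqrt{nh_n^d}\,\Delta(x)}\{I_{1n}(x) + I_{2n}(x) - I_{3n}(x) + I_{4n}(x) + I_{5n}(x)\}.
\end{aligned} \tag{5.2}
\]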
[Figure: histogram (left panel) and normal-probability plot (right panel; horizontal axis "Observed Data", vertical axis "Probability").]
Fig. 2. $\theta = 90\%$, $n = 900$, $M = 500$.
From (5.1) and (5.2), to prove Theorem 3.1 it suffices to show that
\[ l_n(x) \xrightarrow{P} l(x), \qquad \frac{1}{\sqrt{nh_n^d}\,\Delta(x)}\{I_{1n}(x) + I_{4n}(x) + I_{5n}(x)\} \xrightarrow{P} 0, \qquad \frac{1}{\sqrt{nh_n^d}\,\Delta(x)}\{I_{2n}(x) - I_{3n}(x)\} \xrightarrow{D} N(u, 1). \]
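The proofs are divided into the following lemmas.
Lemma 5.1. Let $x \in \mathrm{supp}(l)$. Suppose that (A0), (A1), (A2)(i)(iii), (A3)(ii) and (A4)–(A7) are satisfied and that $\alpha(n) = O(n^{-\gamma})$ for some $\gamma > 3$. Set $Q_n = (\theta/nh_n^d)\sum_{i=1}^{n} w_i w_i^t$. If $nh_n^{d+4} \to 0$ and $nh_n^d \to \infty$, then
\[ Q_n = l(x)V(x)\int_{\mathbb{R}^d}K^2(s)\,ds + o_p(1), \tag{5.3} \]
\[ \frac{\theta}{nh_n^d}\sum_{i=1}^{n}\frac{I(Y_i \le \epsilon_n + \xi_p(x)) - p}{G^2(Y_i)}K^2\left(\frac{x - X_i}{h_n}\right)g(Y_i) = l(x)\mu(x)\int_{\mathbb{R}^d}K^2(s)\,ds + o_p(1), \tag{5.4} \]
\[ \eta = O_p((nh_n^d)^{-1/2}), \qquad \max_{1 \le i \le n}|\eta^t w_{ni}| = o_p(1), \tag{5.5} \]
\[ \eta = \left[l(x)V(x)\int_{\mathbb{R}^d}K^2(u)\,du\right]^{-1}\frac{\theta}{nh_n^d}\sum_{i=1}^{n}w_i + o_p((nh_n^d)^{-1/2}). \tag{5.6} \]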
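Proof. We first prove (5.3). From (2.1), by using (A2)(i), (A6) and (A7) we have
\[
\begin{aligned}
EQ_n &= \frac{\theta}{nh_n^d}\sum_{i=1}^{n}E\left[K^2\left(\frac{x - X_i}{h_n}\right)\frac{g(Y_i)g^t(Y_i)}{G^2(Y_i)}\right] = \frac{\theta}{nh_n^d}\sum_{i=1}^{n}\int_{\mathbb{R}^d}\int_{\mathbb{R}}K^2\left(\frac{x - s}{h_n}\right)\frac{g(t)g^t(t)}{G^2(t)}\,F^*(ds,dt) \\
&= \frac{1}{h_n^d}\int_{\mathbb{R}^d}\int_{\mathbb{R}}K^2\left(\frac{x - s}{h_n}\right)\frac{g(t)g^t(t)}{G(t)}\,f(s,t)\,ds\,dt = \int_{\mathbb{R}^d}K^2(s)\,\mathbf{E}(g(Y)g^t(Y)(G(Y))^{-1}|X = x - h_n s)\,l(x - h_n s)\,ds \\
&\to l(x)V(x)\int_{\mathbb{R}^d}K^2(s)\,ds.
\end{aligned} \tag{5.7}
\]
Hence, in view of $Q_n = E(Q_n) + O_p(\sqrt{\mathrm{Var}(Q_n)})$, we need only verify that $\mathrm{Var}(Q_n) \to 0$.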
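For $1 \le k, l \le k$, we write
\[
\begin{aligned}
\mathrm{Var}(Q_n)_{kl} &= \left(\frac{\theta}{nh_n^d}\right)^2\Bigg\{\sum_{i=1}^{n}\mathrm{Var}\left(K^2\left(\frac{x - X_i}{h_n}\right)\frac{g_k(Y_i)g_l(Y_i)}{G^2(Y_i)}\right) \\
&\qquad + \sum_{i \ne j}\mathrm{Cov}\left(K^2\left(\frac{x - X_i}{h_n}\right)\frac{g_k(Y_i)g_l(Y_i)}{G^2(Y_i)},\ K^2\left(\frac{x - X_j}{h_n}\right)\frac{g_k(Y_j)g_l(Y_j)}{G^2(Y_j)}\right)\Bigg\} := Q_{1n} + Q_{2n}.
\end{aligned} \tag{5.8}
\]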
From (2.1), (A2)(i) and (A5) and (A6) we obtain that ( " # 2 ) 2 2 y2 4 xX gk ðYÞgl ðYÞ 2 xX gk ðYÞgl ðYÞ E K Q1n ¼ E K 2d hn hn G2 ðYÞ G4 ðYÞ nhn ( Z ! Z 2 ) gk2 ðYÞgl2 ðYÞ y2 1 1 gðYÞg t ðYÞ d 4 xs 2 xs X ¼ s lðsÞ ds X ¼ s ðsÞ ds K E K E ¼ ¼ Oððnhn Þ1 Þ: 2d hn hn GðYÞ G3 ðYÞ y2 Rd nhn y Rd ð5:9Þ Let
xi ¼ ðy=hdn ÞK 2 ðððxXi Þ=hn ÞÞgk ðYi Þgl ðYi Þ=G2 ðYi Þ. 8 1 < Q2n ¼ 2 n :
X
jijj r ½hd n
þ
Then
9 X = Covðxi , xj Þ: ; d
ð5:10Þ
jijj 4 ½hn
For i oj, applying (A1), (A2)(i) and (A4)–(A6) we have y2 xX xX g ðY Þg ðY Þg ðY Þg ðY Þ y2 xX g ðY Þg ðY Þ2 j k i l i k j l j i i k i l i 2 2 2 jCovðxi , xj Þj ¼ 2d E K 2d E K K hn hn hn hn G2 ðYi Þ G2 ðYi ÞG2 ðYj Þ hn y2 Z Z gk ðYi Þgl ðYi Þgk ðYj Þgl ðYj Þ xs 2 xt K2 K E ¼ 2d Xi ¼ s,Xj ¼ t lji ðs,tÞ ds dt hn Rd Rd hn hn G2 ðYi ÞG2 ðYj Þ Z 2 1 xs g ðYÞgl ðYÞ K2 E k 2d X ¼ s lðsÞ ds ¼ Oð1Þ: d hn GðYÞ hn R
ð5:11Þ
On the other hand, from Lemma A.2 it follows that jCovðxi , xj Þj r C½aðjiÞ12=r ðEjxi jr Þ2=r and ! r1 Z yr xXi gk ðYi Þgl ðYi Þ r y jgk ðYÞgl ðYÞjr 2r xs X ¼ s lðsÞ ds ¼ Oðhndðr1Þ Þ, Ejxi jr ¼ rd E K 2 ¼ K E hn hn G2r1 ðYÞ G2 ðYi Þ hn hdr Rd n which, together with (5.10) and (5.11), implies that 0 1 1 X dð12=rÞ d 1 d 1 12=r A ¼ Oððnhd Þ1 Þ ½aðlÞ Q2n ¼ Oððnhn Þ Þ þO@ðnhn Þ hn n
ð5:12Þ
l ¼ ½hd n dð12=rÞ P1 since hn ½aðlÞ12=r ¼ Oð1Þ from (A3)(ii). l ¼ ½hd n d Therefore, (5.9) and (5.12) show that VarðQn Þ-0 by nhn -1. Similarly, one can prove (5.4). Now, we verify (5.5). Write Z ¼ lb, where l Z0 and JbJ ¼ 1. From (2.5) we find ybt X ybt X n n n n wni lbt wni ybt X ybt X GðYi ÞGn ðYi Þ 0¼ d wni 1 wi þ d wi ¼ d ¼ d t t nh Gn ðYi Þ 1þ lb w 1 þ lb w nhn i ¼ 1 nhn i ¼ 1 nhn i ¼ 1 ni ni ni¼1 " # tX t 2 n n wi wti b ylb GðYi ÞGn ðYi Þ GðYi ÞGn ðYi Þ lb Qn b t y X Z þ 1þ 2 b w i t t d d 1 þ lmax1 r i r n jb wni j Gn ðYi Þ Gn ðYi Þ nhn i ¼ 1 1 þ lb wni nhn i ¼ 1 ! t n supy Z aF~ jGn ðyÞGðyÞj y X 2lb Qn b t j b w j þ i t GðaF~ Þsupy Z aF~ jGn ðyÞGðyÞj nhd i ¼ 1 1 þ lmax1 r i r n jb wni j n !2 supy Z aF~ jGn ðyÞGðyÞj lbt Qn b : ð5:13Þ GðaF~ Þsupy Z aF~ jGn ðyÞGðyÞj 1 þ lmax1 r i r n jbt wni j R t Eq. (5.3) implies that b Qn b Z t0 þ op ð1Þ where t0 is the smallest eigenvalue of lðxÞVðxÞ Rd K 2 ðsÞ ds. In view of (2.1), (A2)(i) (iii) and (A6), by using MðxÞ ¼ 0 and the Taylor expansion we have !
Z Z n y X y xXi 1 xs 1 E w E K ÞðGðY ÞÞ K E ðgðYÞjX ¼ sÞlðsÞ ds ¼ KðsÞMðxhn sÞlðxhn sÞ ds ¼ Oðh2n Þ: gðY ¼ ¼ i i i d hn hn hdn hdn Rd Rd nhn i ¼ 1
ð5:14Þ d P Varððy=nhn Þ ni¼ 1
Similarly to the evaluate as for (5.8) one can get d P d dþ4 that ðy=nhn Þ ni¼ 1 wi ¼ Op ððnhn Þ1=2 Þ by nhn -0. Note that
d wi Þ ¼ Oððnhn Þ1 Þ, which, d P t ðy=nhn Þ ni¼ 1 jb wi j ¼ Op ð1Þ.
together with (5.14), implies Therefore, from (5.13) and
Lemma A.4 we obtain that
\[ \frac{\lambda}{1 + \lambda\max_{1 \le i \le n}|\beta^t w_{ni}|} = O_p((nh_n^d)^{-1/2}). \tag{5.15} \]
Since $E|h_n^{-d/r}\beta^t w_i|^r = h_n^{-d}E|\beta^t w_i|^r < \infty$ from (A2)(i) and (A5), $\max_{1 \le i \le n}|\beta^t w_i| = o((nh_n^d)^{1/r})$ a.s. from the proof of Lemma 3 in Owen (1990). Hence, from Lemma A.4 we have
\[ \max_{1 \le i \le n}|\beta^t w_{ni}| \le \max_{1 \le i \le n}|\beta^t w_i|\left(1 + \frac{\sup_{y \ge a_{\tilde F}}|G_n(y) - G(y)|}{G(a_{\tilde F}) - \sup_{y \ge a_{\tilde F}}|G_n(y) - G(y)|}\right) = o_p((nh_n^d)^{1/r}). \]
Therefore, $nh_n^d \to \infty$ and (5.15) yield that $\lambda = O_p((nh_n^d)^{-1/2})$, $\eta = O_p((nh_n^d)^{-1/2})$ and
\[ \max_{1 \le i \le n}|\eta^t w_{ni}| = O_p((nh_n^d)^{-1/2})\,o_p((nh_n^d)^{1/r}) = o_p(1). \tag{5.16} \]
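Next, we prove (5.6). From (2.5) we write
\[
\begin{aligned}
0&=\frac{\theta}{nh_n^d}\sum_{i=1}^n\frac{w_{ni}}{1+\eta^\tau w_{ni}}
=\frac{\theta}{nh_n^d}\sum_{i=1}^n w_{ni}\Big(1-\eta^\tau w_{ni}+\frac{(\eta^\tau w_{ni})^2}{1+\eta^\tau w_{ni}}\Big)\\
&=\frac{\theta}{nh_n^d}\sum_{i=1}^n w_i-Q_n\eta+\frac{\theta}{nh_n^d}\sum_{i=1}^n w_i\frac{G(Y_i)-G_n(Y_i)}{G_n(Y_i)}\\
&\qquad-\frac{\theta}{nh_n^d}\sum_{i=1}^n w_iw_i^\tau\eta\Big\{2\,\frac{G(Y_i)-G_n(Y_i)}{G_n(Y_i)}+\Big(\frac{G(Y_i)-G_n(Y_i)}{G_n(Y_i)}\Big)^2\Big\}\\
&\qquad+\frac{\theta}{nh_n^d}\sum_{i=1}^n\frac{w_i(\eta^\tau w_i)^2}{1+\eta^\tau w_{ni}}\Big(1+\frac{G(Y_i)-G_n(Y_i)}{G_n(Y_i)}\Big)^3.
\end{aligned}
\]
While
\[
\Big\|\frac{\theta}{nh_n^d}\sum_{i=1}^n\frac{w_i(\eta^\tau w_i)^2}{1+\eta^\tau w_{ni}}\Big\|\le\frac{\theta}{nh_n^d}\sum_{i=1}^n\|w_i\|^3\|\eta\|^2|1+\eta^\tau w_{ni}|^{-1}=o_p((nh_n^d)^{-1/2}).
\]
Therefore, from (5.16) and Lemma A.4 it follows that
\[
\eta=Q_n^{-1}\frac{\theta}{nh_n^d}\sum_{i=1}^n w_i+o_p((nh_n^d)^{-1/2})
=\Big(l(x)V(x)\int_{\mathbb{R}^d}K^2(u)\,du\Big)^{-1}\frac{\theta}{nh_n^d}\sum_{i=1}^n w_i+o_p((nh_n^d)^{-1/2}).
\qquad\square
\]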
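The first-order expansion of the multiplier in (5.6) can be illustrated numerically. The following is only a minimal sketch under strong simplifying assumptions that are not those of the paper: the $w_i$ are scalar ($k=1$), simulated as i.i.d. Gaussian draws, and there are no kernel weights, truncation weights or dependence. The names `el_score`, `solve_multiplier` and `first_order` are hypothetical. The sketch solves $\sum_i w_i/(1+\eta w_i)=0$ by bisection and compares the root with the linearisation $\eta\approx(\overline{w^2})^{-1}\bar w$, the scalar analogue of $Q_n^{-1}\cdot(\text{mean of } w_i)$.

```python
import random

def el_score(eta, w):
    """Left-hand side of the multiplier equation: sum_i w_i / (1 + eta * w_i)."""
    return sum(wi / (1.0 + eta * wi) for wi in w)

def solve_multiplier(w, iters=200):
    """Bisection for the root of el_score on the set where all 1 + eta * w_i > 0."""
    if not (min(w) < 0.0 < max(w)):
        raise ValueError("0 must lie strictly inside the range of the w_i")
    shrink = 1.0 - 1e-10
    lo = -shrink / max(w)  # el_score is large and positive near -1/max(w)
    hi = -shrink / min(w)  # el_score is large and negative near -1/min(w)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if el_score(mid, w) > 0.0:  # el_score is strictly decreasing in eta
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def first_order(w):
    """Linearised multiplier: (mean of w_i^2)^(-1) * (mean of w_i)."""
    n = len(w)
    return (sum(w) / n) / (sum(wi * wi for wi in w) / n)

# Assumed toy data: small nonzero mean so that the multiplier is small.
random.seed(0)
w = [random.gauss(0.05, 1.0) for _ in range(2000)]
eta_hat = solve_multiplier(w)
eta_lin = first_order(w)
print(eta_hat, eta_lin)
```

In this toy setting the two values differ only by a second-order term in $\eta$, which mirrors the role of the $o_p((nh_n^d)^{-1/2})$ remainder in the lemma.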
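Lemma 5.2. Suppose that the assumptions in Lemma 5.1 hold. Then we have $l_n(x)\stackrel{P}{\longrightarrow}l(x)$.

Proof. We write
\[
\begin{aligned}
l_n(x)&=\frac{\theta}{nh_n^d}\sum_{i=1}^n\Big[\frac{1}{G_n(Y_i)}-\frac{1}{G(Y_i)}\Big]\frac{K\big(\frac{x-X_i}{h_n}\big)}{1+\eta^\tau w_{ni}}
+\frac{\theta}{nh_n^d}\sum_{i=1}^n\frac{1}{G(Y_i)}K\Big(\frac{x-X_i}{h_n}\Big)\\
&\qquad-\frac{\theta\eta^\tau}{nh_n^d}\sum_{i=1}^n\frac{K^2\big(\frac{x-X_i}{h_n}\big)g(Y_i)(G_n(Y_i))^{-1}}{G(Y_i)(1+\eta^\tau w_{ni})}\\
&:=D_{1n}(x)+D_{2n}(x)+D_{3n}(x).
\end{aligned}
\]
Note that, for $\varepsilon>0$,
\[
P\Big(\Big|\frac{\theta}{nh_n^d}\sum_{i=1}^n\frac{1}{G(Y_i)}K\Big(\frac{x-X_i}{h_n}\Big)\Big|>\varepsilon\Big)
\le\frac{\theta}{\varepsilon nh_n^d}\sum_{i=1}^nE\Big|\frac{1}{G(Y_i)}K\Big(\frac{x-X_i}{h_n}\Big)\Big|
=\frac{1}{\varepsilon}\int_{\mathbb{R}^d}|K(s)|\,l(x-h_ns)\,ds\to\frac{l(x)}{\varepsilon}\int_{\mathbb{R}^d}|K(s)|\,ds,
\]
which implies that
\[
\frac{\theta}{nh_n^d}\sum_{i=1}^n\frac{1}{G(Y_i)}K\Big(\frac{x-X_i}{h_n}\Big)=O_p(1).
\tag{5.17}
\]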
(A0) implies $0<G(a_{\tilde F})\le G(Y)\le 1$, so $G_n(Y)\ge G(a_{\tilde F})-\sup_{y\ge a_{\tilde F}}|G_n(y)-G(y)|$ and, from Lemma A.4, we have
\[
|G(Y)/G_n(Y)-1|\le\frac{\sup_{y\ge a_{\tilde F}}|G_n(y)-G(y)|}{G(a_{\tilde F})-\sup_{y\ge a_{\tilde F}}|G_n(y)-G(y)|}=O_p(n^{-1/2}).
\]
Therefore, by using Lemma A.4 and (5.5) one can verify that $D_{1n}(x)=O_p(n^{-1/2})=o_p(1)$.

From (2.1), (A2)(ii) and (A6) we have $E(D_{2n}(x))=\int_{\mathbb{R}^d}K(s)\,l(x-h_ns)\,ds\to l(x)$. Hence, similarly to the evaluation of (5.3), one can verify that $D_{2n}(x)=l(x)+o_p(1)$.

As to $D_{3n}(x)$, since $(\theta/nh_n^d)\sum_{i=1}^n(1/G(Y_i))K^2((x-X_i)/h_n)\,g(Y_i)(G(Y_i))^{-1}=O_p(1)$, from (5.5) we get $D_{3n}(x)=O_p((nh_n^d)^{-1/2})=o_p(1)$. $\square$

Lemma 5.3. Suppose that the assumptions in Lemma 5.1 hold. Then we have $\big(1/\sqrt{nh_n^d}\,\Delta(x)\big)\{I_{1n}(x)+I_{4n}(x)+I_{5n}(x)\}\stackrel{P}{\longrightarrow}0$.
Proof. By using Lemma A.4, from (5.5) and (5.17) we have
\[
\frac{1}{\sqrt{nh_n^d}\,\Delta(x)}|I_{1n}(x)|\le\frac{\sqrt{nh_n^d}\,\sup_{y\ge a_{\tilde F}}|G_n(y)-G(y)|}{\Delta(x)\big[G(a_{\tilde F})-\sup_{y\ge a_{\tilde F}}|G_n(y)-G(y)|\big]\big(1+\max_{1\le i\le n}|\eta^\tau w_{ni}|\big)}\cdot\frac{\theta}{nh_n^d}\sum_{i=1}^n\frac{1}{G(Y_i)}K\Big(\frac{x-X_i}{h_n}\Big)=O_p(h_n^{d/2})=o_p(1),
\]
\[
\frac{1}{\sqrt{nh_n^d}\,\Delta(x)}|I_{4n}(x)|\le\frac{\sqrt{nh_n^d}\,\sup_{y\ge a_{\tilde F}}|G_n(y)-G(y)|\,\max_{1\le i\le n}|\eta^\tau w_i|}{\Delta(x)\big[G(a_{\tilde F})-\sup_{y\ge a_{\tilde F}}|G_n(y)-G(y)|\big]}\cdot\frac{\theta}{nh_n^d}\sum_{i=1}^n\frac{1}{G(Y_i)}K\Big(\frac{x-X_i}{h_n}\Big)=o_p(h_n^{d/2})=o_p(1).
\]
Eq. (5.5), (A2)(i) and (A5) yield that
\[
\frac{1}{\sqrt{nh_n^d}\,\Delta(x)}|I_{5n}(x)|\le\frac{\sqrt{nh_n^d}\,\|\eta\|\max_{1\le i\le n}|\eta^\tau w_{ni}|}{\Delta(x)\big[G(a_{\tilde F})-\sup_{y\ge a_{\tilde F}}|G_n(y)-G(y)|\big]\big(1+\max_{1\le i\le n}|\eta^\tau w_{ni}|\big)}\cdot\frac{\theta}{nh_n^d}\sum_{i=1}^n\frac{|g_1(Y_i)|+\cdots+|g_k(Y_i)|}{G(Y_i)}K^2\Big(\frac{x-X_i}{h_n}\Big)=o_p(1).
\qquad\square
\]
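Lemma 5.4. Let $x\in\mathrm{supp}(l)$. Suppose that (A1)–(A9) are satisfied and that $\alpha(n)=O(n^{-\gamma})$ for some $\gamma\ge r(r+2)/[2(r-2)]$. Then $(nh_n^d)^{-1/2}(\Delta(x))^{-1}\{I_{2n}(x)-I_{3n}(x)\}\stackrel{D}{\longrightarrow}N(u,1)$.

Proof. From (5.4) to (5.6) one can write
\[
\frac{1}{nh_n^d}I_{3n}(x)=l(x)\mu^\tau(x)\int_{\mathbb{R}^d}K^2(s)\,ds\;\eta+(o_p(1))^\tau\eta
=\mu^\tau(x)V^{-1}(x)\frac{\theta}{nh_n^d}\sum_{i=1}^n w_i+o_p((nh_n^d)^{-1/2})
\]
and
\[
\begin{aligned}
\frac{1}{\sqrt{nh_n^d}\,\Delta(x)}\{I_{2n}(x)-I_{3n}(x)\}
&=\frac{1}{\sqrt{nh_n^d}\,\Delta(x)}\sum_{i=1}^n\theta\big\{[I(Y_i\le\varepsilon_n+\xi_p(x))-p]-\mu^\tau(x)V^{-1}(x)g(Y_i)\big\}K\Big(\frac{x-X_i}{h_n}\Big)(G(Y_i))^{-1}+o_p(1)\\
&:=n^{-1/2}\sum_{i=1}^nZ_{ni}+o_p(1),
\end{aligned}
\]
where
\[
Z_{ni}=\frac{\theta}{h_n^{d/2}\Delta(x)}\big\{[I(Y_i\le\varepsilon_n+\xi_p(x))-p]-\mu^\tau(x)V^{-1}(x)g(Y_i)\big\}K\Big(\frac{x-X_i}{h_n}\Big)(G(Y_i))^{-1}.
\]
Then, we need only to prove that $n^{-1/2}\sum_{i=1}^nEZ_{ni}\to u$ and $n^{-1/2}\sum_{i=1}^n(Z_{ni}-EZ_{ni})\stackrel{D}{\longrightarrow}N(0,1)$.

Step 1. We prove $n^{-1/2}\sum_{i=1}^nEZ_{ni}\to u$. Using (2.1), from (A2), (A6) and (A8) we have
\[
\begin{aligned}
n^{-1/2}\sum_{i=1}^nEZ_{ni}
&=\frac{n}{(nh_n^d)^{1/2}\Delta(x)}\int_{\mathbb{R}^d}K\Big(\frac{x-s}{h_n}\Big)\Big\{\int_0^{\varepsilon_n+\xi_p(x)}f(s,t)\,dt-p\,l(s)-\mu^\tau(x)V^{-1}(x)\int_0^{\infty}g(t)f(s,t)\,dt\Big\}ds\\
&=\frac{(nh_n^d)^{1/2}}{\Delta(x)}\int_{\mathbb{R}^d}K(s)\big[F_1(x-h_ns,\varepsilon_n+\xi_p(x))-F_1(x,\xi_p(x))\,l^{-1}(x)\,l(x-h_ns)\\
&\qquad\qquad-\mu^\tau(x)V^{-1}(x)M(x-h_ns)\,l(x-h_ns)\big]ds\\
&=\frac{(nh_n^d)^{1/2}}{\Delta(x)}\int_{\mathbb{R}^d}K(s)\big\{[F_1(x-h_ns,\varepsilon_n+\xi_p(x))-F_1(x,\varepsilon_n+\xi_p(x))]\\
&\qquad\qquad+[F_1(x,\varepsilon_n+\xi_p(x))-F_1(x,\xi_p(x))]+F_1(x,\xi_p(x))\,l^{-1}(x)[l(x)-l(x-h_ns)]\\
&\qquad\qquad-\mu^\tau(x)V^{-1}(x)M(x-h_ns)\,l(x-h_ns)\big\}ds\\
&=\frac{(nh_n^d)^{1/2}}{\Delta(x)}f(x,\xi_n(x))\,\varepsilon_n+O((nh_n^{d+4})^{1/2})\to\frac{\sigma(x)}{\Delta(x)}f(x,\xi_p(x))\,u=u
\end{aligned}
\]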
by $F(\xi_p(x)\mid x)=p$ and $M(x)=0$, where $\xi_n(x)$ is between $\xi_p(x)$ and $\varepsilon_n+\xi_p(x)$.

Step 2. We verify $n^{-1/2}\sum_{i=1}^n(Z_{ni}-EZ_{ni})\stackrel{D}{\longrightarrow}N(0,1)$. Note that (A3)(i) implies that there exists a sequence of positive integers $\delta_n\to\infty$ such that $\delta_nq=o((nh_n^d)^{1/2})$ and $\delta_n(nh_n^{-d})^{1/2}\alpha(q)\to 0$. Let $\pi:=\pi_n=[n/(p+q)]$ and $p:=p_n=[(nh_n^d)^{1/2}/\delta_n]$. Then a simple calculation shows that
\[
q/p\to 0,\quad\pi\alpha(q)\to 0,\quad\pi q/n\to 0,\quad p/n\to 0,\quad p/(nh_n^d)^{1/2}\to 0.
\tag{5.18}
\]
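The "simple calculation" behind (5.18) can be checked numerically for concrete sequences. The sketch below is illustrative only: the rates $h_n=n^{-1/4}$, $\delta_n=\log n$, $q_n=[n^{0.1}]$ with $d=1$, and the polynomial mixing coefficient $\alpha(q)=q^{-\gamma}$ with $\gamma=10$, are assumptions chosen for the example and are not taken from the paper. The function names are hypothetical.

```python
import math

def block_sizes(n, h, d, delta_n, q):
    """Large-block length p = [(n h^d)^{1/2} / delta_n] and block count pi = [n / (p + q)]."""
    p = int(math.sqrt(n * h ** d) / delta_n)
    pi = n // (p + q)
    return p, pi

def ratios_518(n, h, d, delta_n, q, gamma):
    """The five quantities in (5.18), with the assumed mixing rate alpha(q) = q^(-gamma)."""
    p, pi = block_sizes(n, h, d, delta_n, q)
    alpha_q = q ** (-gamma)
    root = math.sqrt(n * h ** d)
    return (q / p, pi * alpha_q, pi * q / n, p / n, p / root)

def illustrative_rates(n):
    """Hypothetical sequences (d = 1): h_n = n^(-1/4), delta_n = log n, q_n = [n^0.1]."""
    return n ** -0.25, math.log(n), int(n ** 0.1)

for n in (10**4, 10**6, 10**8):
    h, delta_n, q = illustrative_rates(n)
    print(n, ratios_518(n, h, 1, delta_n, q, gamma=10.0))
```

With these assumed rates every ratio shrinks as $n$ grows, which is the behaviour (5.18) asserts in the limit.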
Next we will employ Bernstein's big-block and small-block procedure. Partition the set $\{1,2,\ldots,n\}$ into $2\pi_n+1$ subsets with large blocks of size $p=p_n$ and small blocks of size $q=q_n$. Let $y_{mn}$, $y'_{mn}$, $y''_{\pi n}$ be defined as follows:
\[
y_{mn}=\sum_{i=k_m}^{k_m+p-1}(Z_{ni}-EZ_{ni}),\qquad
y'_{mn}=\sum_{j=l_m}^{l_m+q-1}(Z_{nj}-EZ_{nj}),\qquad
y''_{\pi n}=\sum_{k=\pi(p+q)+1}^{n}(Z_{nk}-EZ_{nk}),
\]
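where $k_m=(m-1)(p+q)+1$, $l_m=(m-1)(p+q)+p+1$, $m=1,\ldots,\pi$. Then
\[
n^{-1/2}\sum_{i=1}^n(Z_{ni}-EZ_{ni})=n^{-1/2}\Big\{\sum_{m=1}^{\pi}y_{mn}+\sum_{m=1}^{\pi}y'_{mn}+y''_{\pi n}\Big\}:=n^{-1/2}\{S'_n+S''_n+S'''_n\}.
\]
Hence, it suffices to show that
\[
n^{-1}E(S''_n)^2\to 0,\qquad n^{-1}E(S'''_n)^2\to 0,
\tag{5.19}
\]
\[
\mathrm{Var}(n^{-1/2}S'_n)\to 1,
\tag{5.20}
\]
\[
\Big|E\exp\Big(it\sum_{m=1}^{\pi}n^{-1/2}y_{mn}\Big)-\prod_{m=1}^{\pi}E\exp(itn^{-1/2}y_{mn})\Big|\to 0,
\tag{5.21}
\]
\[
g_n(\varepsilon)=\frac{1}{n}\sum_{m=1}^{\pi}Ey_{mn}^2I(|y_{mn}|>\varepsilon\sqrt{n})\to 0\quad\forall\varepsilon>0.
\tag{5.22}
\]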
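The index bookkeeping of the big-block and small-block partition used above can be sketched as follows. This is only an illustration of the indices $k_m$, $l_m$ and of the split of a sum into $S'_n$, $S''_n$, $S'''_n$; the function names `bernstein_blocks` and `split_sum` are hypothetical, and the input sequence stands in for the centred terms $Z_{ni}-EZ_{ni}$.

```python
def bernstein_blocks(n, p, q):
    """Index sets (1-based) of the large blocks, the small blocks and the remainder."""
    pi = n // (p + q)                        # number of (large, small) block pairs
    large, small = [], []
    for m in range(1, pi + 1):
        k_m = (m - 1) * (p + q) + 1          # first index of the m-th large block
        l_m = (m - 1) * (p + q) + p + 1      # first index of the m-th small block
        large.append(range(k_m, k_m + p))    # k_m, ..., k_m + p - 1
        small.append(range(l_m, l_m + q))    # l_m, ..., l_m + q - 1
    remainder = range(pi * (p + q) + 1, n + 1)
    return large, small, remainder

def split_sum(z, p, q):
    """Split sum(z) into the large-block, small-block and remainder parts."""
    large, small, remainder = bernstein_blocks(len(z), p, q)
    s1 = sum(z[i - 1] for block in large for i in block)
    s2 = sum(z[i - 1] for block in small for i in block)
    s3 = sum(z[i - 1] for i in remainder)
    return s1, s2, s3

large, small, rem = bernstein_blocks(23, 4, 2)
print([list(b) for b in large], [list(b) for b in small], list(rem))
```

For $n=23$, $p=4$, $q=2$ this gives $\pi=3$ large blocks, 3 small blocks and one remainder block, that is $2\pi+1=7$ subsets covering $\{1,\ldots,23\}$ exactly once.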
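We first establish (5.19). Obviously
\[
\frac{1}{n}E(S''_n)^2=\frac{1}{n}\sum_{m=1}^{\pi}\sum_{i=l_m}^{l_m+q-1}\mathrm{Var}(Z_{ni})+\frac{2}{n}\sum_{m=1}^{\pi}\sum_{l_m\le i<j\le l_m+q-1}\mathrm{Cov}(Z_{ni},Z_{nj})+\frac{2}{n}\sum_{1\le i<j\le\pi}\mathrm{Cov}(y'_{in},y'_{jn}):=J_{1n}(x)+J_{2n}(x)+J_{3n}(x).
\]
In view of (2.1), from (A7) and $EZ_{ni}=O(h_n^{d/2})$ we obtain that
\[
\begin{aligned}
\mathrm{Var}(Z_{ni})&=EZ_{ni}^2-(EZ_{ni})^2\\
&=\frac{\theta^2}{h_n^d\Delta^2(x)}E\Big[K^2\Big(\frac{x-X}{h_n}\Big)G^{-2}(Y)\big([I(Y\le\varepsilon_n+\xi_p(x))-p]-\mu^\tau(x)V^{-1}(x)g(Y)\big)^2\Big]+O(h_n^d)\\
&=\frac{\theta}{h_n^d\Delta^2(x)}\int_{\mathbb{R}^d}K^2\Big(\frac{x-s}{h_n}\Big)E\big([I(Y\le\varepsilon_n+\xi_p(x))-p]^2(G(Y))^{-1}\mid X=s\big)\,l(s)\,ds\\
&\qquad-\frac{\theta}{\Delta^2(x)}\,l(x)\mu^\tau(x)V^{-1}(x)\mu(x)\int_{\mathbb{R}^d}K^2(s)\,ds+o(1)\\
&=\frac{\theta\,l(x)[\nu(x)-\mu^\tau(x)V^{-1}(x)\mu(x)]}{\Delta^2(x)}\int_{\mathbb{R}^d}K^2(s)\,ds+o(1)=1+o(1),
\end{aligned}
\tag{5.23}
\]
which yields that $J_{1n}(x)=O(\pi q/n)=o(1)$ from (5.18). Since
\[
|J_{2n}(x)|\le\frac{2}{n}\sum_{1\le i<j\le n}|\mathrm{Cov}(Z_{ni},Z_{nj})|
\quad\text{and}\quad
|J_{3n}(x)|\le\frac{2}{n}\sum_{1\le i<j\le n}|\mathrm{Cov}(Z_{ni},Z_{nj})|,
\]
to prove $|J_{2n}(x)|=o(1)$ and $|J_{3n}(x)|=o(1)$, it suffices to show that
\[
\frac{1}{n}\sum_{1\le i<j\le n}|\mathrm{Cov}(Z_{ni},Z_{nj})|\to 0.
\tag{5.24}
\]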
Next, let $c_n$ (specified below) be a sequence of integers such that $c_n\to\infty$ and $c_nh_n^d\to 0$. We write
\[
\frac{1}{n}\sum_{1\le i<j\le n}|\mathrm{Cov}(Z_{ni},Z_{nj})|=\frac{1}{n}\Big(\sum_{0<j-i\le c_n}+\sum_{j-i>c_n}\Big)|\mathrm{Cov}(Z_{ni},Z_{nj})|.
\tag{5.25}
\]
From (A1), (A2)(i), (A4) and (A5) we find $|\mathrm{Cov}(Z_{ni},Z_{nj})|=O(h_n^d)$. Hence
\[
\frac{1}{n}\sum_{0<j-i\le c_n}|\mathrm{Cov}(Z_{ni},Z_{nj})|=O(c_nh_n^d)\to 0.
\tag{5.26}
\]
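On the other hand, from Lemma A.2 it follows that
\[
|\mathrm{Cov}(Z_{ni},Z_{nj})|\le C[\alpha(j-i)]^{1-2/r}(E|Z_{ni}|^r)^{2/r}.
\tag{5.27}
\]
It is easy to see that
\[
\begin{aligned}
E|Z_{ni}|^r&=\frac{\theta^r}{h_n^{dr/2}\Delta^r(x)}E\Big|\big\{[I(Y_i\le\varepsilon_n+\xi_p(x))-p]-\mu^\tau(x)V^{-1}(x)g(Y_i)\big\}K\Big(\frac{x-X_i}{h_n}\Big)(G(Y_i))^{-1}\Big|^r\\
&\le\frac{\theta^r2^{r-1}}{h_n^{dr/2}\Delta^r(x)G^{r-1}(a_{\tilde F})}\Big(E\Big\{\Big|K\Big(\frac{x-X}{h_n}\Big)\Big|^r(G(Y))^{-1}\Big\}+E\Big\{\Big|\mu^\tau(x)V^{-1}(x)g(Y)K\Big(\frac{x-X}{h_n}\Big)\Big|^r(G(Y))^{-1}\Big\}\Big)\\
&=O(h_n^{-d(r/2-1)}).
\end{aligned}
\tag{5.28}
\]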
Therefore, by choosing $c_n=h_n^{-d(1-2/r)/\delta}$, from (5.27) and (A3)(ii) it follows that
\[
\frac{1}{n}\sum_{j-i>c_n}|\mathrm{Cov}(Z_{ni},Z_{nj})|\le\frac{C}{n}\sum_{j=1}^{n}\sum_{j-i=c_n+1}^{n-1}[\alpha(j-i)]^{1-2/r}h_n^{-d(1-2/r)}
\le Ch_n^{-d(1-2/r)}\sum_{l=c_n+1}^{\infty}[\alpha(l)]^{1-2/r}
\le Cc_n^{-\delta}h_n^{-d(1-2/r)}\sum_{l=c_n}^{\infty}l^{\delta}[\alpha(l)]^{1-2/r}\to 0.
\tag{5.29}
\]
Thus, (5.24) is verified by (5.25), (5.26) and (5.29).

As to $n^{-1}E(S'''_n)^2$, from (5.23) and (5.24) we have
\[
\frac{1}{n}E(S'''_n)^2=\frac{1}{n}\sum_{i=\pi(p+q)+1}^{n}\mathrm{Var}(Z_{ni})+\frac{2}{n}\sum_{\pi(p+q)+1\le i<j\le n}\mathrm{Cov}(Z_{ni},Z_{nj})
\le C\,\frac{n-\pi(p+q)}{n}+\frac{2}{n}\sum_{1\le i<j\le n}|\mathrm{Cov}(Z_{ni},Z_{nj})|\to 0.
\]
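The choice $c_n=h_n^{-d(1-2/r)/\delta}$ in (5.29) can be checked by tracking exponents of $h_n$ with exact rational arithmetic. The sketch below is only a bookkeeping aid (the function name `cn_exponents` is hypothetical). Assuming $h_n\to 0$, it shows that the prefactor $c_n^{-\delta}h_n^{-d(1-2/r)}$ is identically 1, so that (5.29) is driven by the tail of the series, and that $c_nh_n^d\to 0$ corresponds to a positive exponent, which by this arithmetic happens exactly when $\delta>1-2/r$.

```python
from fractions import Fraction

def cn_exponents(d, r, delta):
    """Exponents of h_n when c_n = h_n^{-d(1 - 2/r)/delta}.

    Returns (e1, e2) with c_n * h_n^d = h_n^{e1} and
    c_n^{-delta} * h_n^{-d(1 - 2/r)} = h_n^{e2}.
    """
    d, r, delta = Fraction(d), Fraction(r), Fraction(delta)
    c_exp = -d * (1 - 2 / r) / delta          # c_n = h_n^{c_exp}
    e1 = c_exp + d                            # exponent of h_n in c_n * h_n^d
    e2 = -delta * c_exp - d * (1 - 2 / r)     # exponent of h_n in the prefactor of (5.29)
    return e1, e2

# Example values (assumed for illustration): d = 2, r = 4, so 1 - 2/r = 1/2.
print(cn_exponents(2, 4, 1))                 # delta = 1 > 1/2
print(cn_exponents(2, 4, Fraction(1, 4)))    # delta = 1/4 < 1/2
```

In the first example the exponent of $h_n$ in $c_nh_n^d$ is positive, so $c_nh_n^d\to 0$ as required; in the second it is negative, so that requirement would fail.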
Next we establish (5.20). Since $\pi p/n\to 1$, from (5.23) and (5.24) one can get $\mathrm{Var}(n^{-1/2}S'_n)\to 1$.

As to (5.21), according to Lemma A.1 we have
\[
\Big|E\exp\Big(it\sum_{m=1}^{\pi}n^{-1/2}y_{mn}\Big)-\prod_{m=1}^{\pi}E\exp(itn^{-1/2}y_{mn})\Big|\le 16\pi\alpha(q+1),
\]
which tends to zero by (5.18).

Finally, we establish (5.22). Taking $p=1+r/2$, $l=q=r$ in Lemma A.3, we have $\gamma\ge pq/[2(q-p)]=r(r+2)/[2(r-2)]$. Hence, in view of Lemma A.3 and (5.28), with $p$ again denoting the large-block size, we have
\[
Ey_{mn}^2I(|y_{mn}|>\varepsilon\sqrt{n})\le\varepsilon^{1-r/2}n^{-(r-2)/4}E|y_{mn}|^{1+r/2}\le Cn^{-(r-2)/4}p^{(1+r/2)/2}(E|Z_{n1}|^r)^{(1+r/2)/r}\le Cn^{-(r-2)/4}p^{(2+r)/4}h_n^{d(4-r^2)/4r}.
\]
Therefore, (A9) and $\delta_n\to\infty$ yield that
\[
g_n(\varepsilon)\le Cn^{-[(r-2)/4+1]}\pi p^{(2+r)/4}h_n^{d(4-r^2)/4r}\le C\delta_n^{-(r-2)/4}\big(n^{-1}h_n^{-d(1+4/r)}\big)^{(r-2)/8}\to 0.
\qquad\square
\]
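The exponent arithmetic in the verification of (5.22) is easy to get wrong, so the following sketch re-derives it with exact fractions. It is a check of the algebra only, under the substitutions stated in the text ($p=1+r/2$ and $q=r$ in Lemma A.3, $E|Z_{n1}|^r=O(h_n^{-d(r/2-1)})$, block size $p=(nh_n^d)^{1/2}/\delta_n$ and $\pi\approx n/p$, ignoring integer parts). The function names are hypothetical.

```python
from fractions import Fraction as Fr

def gamma_threshold(r):
    """pq / [2(q - p)] for the Lemma A.3 exponents p = 1 + r/2 and q = r."""
    p, q = 1 + Fr(r) / 2, Fr(r)
    return p * q / (2 * (q - p))

def moment_h_exponent(r, d):
    """Exponent of h_n in (E|Z_n1|^r)^((1 + r/2)/r), given E|Z_n1|^r = O(h_n^{-d(r/2 - 1)})."""
    return -Fr(d) * (Fr(r) / 2 - 1) * (1 + Fr(r) / 2) / Fr(r)

def final_exponents(r, d):
    """Exponents of (delta_n, n, h_n) in n^{-(r-2)/4} p^{(r-2)/4} h_n^{d(4-r^2)/(4r)},
    after substituting the block size p = (n h_n^d)^{1/2} / delta_n."""
    a = (Fr(r) - 2) / 4
    delta_exp = -a
    n_exp = -a + a / 2
    h_exp = a * Fr(d) / 2 + Fr(d) * (4 - Fr(r) ** 2) / (4 * Fr(r))
    return delta_exp, n_exp, h_exp

for r in (3, 4, 6):
    print(r, gamma_threshold(r), moment_h_exponent(r, 1), final_exponents(r, 1))
```

The printed values agree with the closed forms $r(r+2)/[2(r-2)]$, $d(4-r^2)/(4r)$ and with the exponents $-(r-2)/4$, $-(r-2)/8$ and $-d(1+4/r)(r-2)/8$ appearing in the final bound on $g_n(\varepsilon)$.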
Acknowledgments

The authors are grateful to the Editor, an Associate Editor, and two anonymous referees for providing a detailed list of comments which greatly improved the presentation of the paper. The authors were supported by Grant MTM2008-03129 of the Spanish Ministry of Science and Innovation, the Xunta de Galicia under the INBIOMED project (DXPCTSUG, Ref. 2009/063) and Project 10PXIB300068PR of the Xunta de Galicia, Spain, and also by the National Natural Science Foundation of China (10871146).

Appendix A

In this section, we give some preliminary lemmas, which have been used in Section 5.

Lemma A.1 (Volkonskii and Rozanov, 1959). Let $V_1,\ldots,V_m$ be $\alpha$-mixing random variables measurable with respect to the $\sigma$-algebras $\mathcal{F}_{i_1}^{j_1},\ldots,\mathcal{F}_{i_m}^{j_m}$, respectively, with $1\le i_1<j_1<\cdots<j_m\le n$, $i_{l+1}-j_l\ge w\ge 1$ and $|V_j|\le 1$ for $l,j=1,2,\ldots,m$. Then
\[
\Big|E\Big(\prod_{j=1}^mV_j\Big)-\prod_{j=1}^mEV_j\Big|\le 16(m-1)\alpha(w),
\]
where $\mathcal{F}_a^b=\sigma\{V_i,\ a\le i\le b\}$ and $\alpha(w)$ is the mixing coefficient.

Lemma A.2 (Hall and Heyde, 1980, Corollary A.2, p. 278). Suppose that $X$ and $Y$ are random variables such that $E|X|^p<\infty$, $E|Y|^q<\infty$, where $p,q>1$, $p^{-1}+q^{-1}<1$. Then
\[
|EXY-EXEY|\le 8\|X\|_p\|Y\|_q\Big\{\sup_{A\in\sigma(X),\,B\in\sigma(Y)}|P(A\cap B)-P(A)P(B)|\Big\}^{1-p^{-1}-q^{-1}}.
\]

Lemma A.3 (Shao and Yu, 1996, Theorem 4.1). Let $2<p<q\le\infty$ and $\{X_n,n\ge 1\}$ be an $\alpha$-mixing sequence of random variables with $EX_n=0$ and the mixing coefficients $\{\alpha(j)\}$. Assume that $\alpha(n)\le Cn^{-\gamma}$ for some $C>0$ and $\gamma>0$. If $\gamma\ge pq/[2(q-p)]$, then there exists $Q=Q(p,q,\gamma,C)<\infty$ such that $E|\sum_{i=1}^nX_i|^p\le Qn^{p/2}\max_{1\le i\le n}\|X_i\|_q^p$.

Lemma A.4 (Liang et al., in press). Suppose that $\alpha(n)=O(n^{-\gamma})$ for some $\gamma>3$. Then, under (A0) we have $\sup_{y\ge a_{\tilde F}}|G_n(y)-G(y)|=O_p(n^{-1/2})$.

References

Cai, T., Wei, L.J., Wilcox, M., 2000. Semiparametric regression analysis for clustered survival data. Biometrika 87, 867–878.
Cai, Z.W., 2002. Regression quantiles for time series. Econometric Theory 18, 169–192.
Cai, J., Kim, J., 2003. Nonparametric quantile estimation with correlated failure time data. Lifetime Data Anal. 9, 357–371.
Chaudhuri, P., 1991. Global nonparametric estimation of conditional quantile functions and their derivatives. J. Multivariate Anal. 39, 246–269.
Chaudhuri, P., Doksum, K., Samarov, A., 1997. On average derivative quantile regression. Ann. Statist. 25, 715–744.
Chen, J., Qin, J., 1993. Empirical likelihood estimation for finite populations and the effective usage of auxiliary information. Biometrika 80, 107–116.
Dabrowska, D., 1992. Nonparametric quantile regression with censored data. Sankhya 54, 252–259.
Doukhan, P., 1994. Mixing: Properties and Examples. Lecture Notes in Statistics, vol. 85. Springer, Berlin.
Fan, J., Hu, T.C., Truong, Y.K., 1994. Robust nonparametric function estimation. Scand. J. Statist. 21, 433–446.
Ferraty, F., Rabhi, A., Vieu, P., 2005. Conditional quantiles for dependent functional data with application to the climatic El Niño phenomenon. Sankhya 67, 378–398.
Gannoun, A., Saracco, J., Yu, K., 2003. Nonparametric prediction by conditional median and quantiles. J. Statist. Plan. Inference 117, 207–223.
Hall, P., Heyde, C.C., 1980. Martingale Limit Theory and its Application. Academic Press, New York.
He, S., Yang, G., 1994. Estimating a lifetime distribution under different sampling plan. In: Gupta, S.S., Berger, J.O. (Eds.), Statistical Decision Theory and Related Topics, vol. 5. Springer, Berlin Heidelberg, New York, pp. 73–85.
He, S., Yang, G., 1998. Estimation of the truncation probability in the random truncation model. Ann. Statist. 26, 1011–1027.
Honda, T., 2000. Nonparametric estimation of a conditional quantile for α-mixing processes. Ann. Inst. Statist. Math. 52, 459–470.
Iglesias-Pérez, M.C., 2003. Strong representation of a conditional quantile function estimator with truncated and censored data. Statist. Probab. Lett. 65, 79–91.
Kang, S.S., Koehler, K.J., 1997. Modification of the Greenwood formula for correlated failure times. Biometrics 53, 885–899.
Lemdani, M., Ould-Saïd, E., Poulin, N., 2009. Asymptotic properties of a conditional quantile estimator with randomly truncated data. J. Multivariate Anal. 100, 546–559.
Li, G., Hollander, M., McKeague, I.W., Yang, J., 1996. Nonparametric likelihood ratio confidence bands for quantile functions from incomplete survival data. Ann. Statist. 24, 628–640.
Liang, H.Y., de Uña-Álvarez, J., 2011. Asymptotic properties of conditional quantile estimator for censored dependent observations. Ann. Inst. Statist. Math. 63, 267–289.
Liang, H.Y., de Uña-Álvarez, J., Iglesias-Pérez, M.C. Local polynomial estimation of a conditional mean function with dependent truncated data. Test, in press, doi:10.1007/s11749-011-0234-6.
Lynden-Bell, D., 1971. A method of allowing for known observational selection in small samples applied to 3CR quasars. Monthly Notices Roy. Astronomical Soc. 155, 95–118.
Mehra, K.L., Rao, M.S., Upadrasta, S.P., 1991. A smooth conditional quantile estimator and related applications of conditional empirical processes. J. Multivariate Anal. 37, 151–179.
Ould-Saïd, E., 2006. A strong uniform convergence rate of kernel conditional quantile estimator under random censorship. Statist. Probab. Lett. 76, 579–586.
Ould-Saïd, E., Yahia, D., Necir, A., 2009. A strong uniform convergence rate of a kernel conditional quantile estimator under random left-truncation and dependent data. Electronic J. Statist. 3, 426–445.
Owen, A.B., 1988. Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75, 237–249.
Owen, A.B., 1990. Empirical likelihood confidence regions. Ann. Statist. 18, 90–120.
Polonik, W., Yao, Q., 2002. Set-indexed conditional empirical and quantile processes based on dependent data. J. Multivariate Anal. 80, 234–255.
Qin, G., Tsao, M., 2003. Empirical likelihood inference for median regression models for censored survival data. J. Multivariate Anal. 85, 416–430.
Qin, Y.S., Wu, Y., 2001. An estimator of a conditional quantile in the presence of auxiliary information. J. Statist. Plan. Inference 99, 59–70.
Shao, Q., Yu, H., 1996. Weak convergence for weighted empirical processes of dependent sequences. Ann. Probab. 24, 2098–2127.
Shen, J.S., He, S.Y., 2007. Empirical likelihood for the difference of quantiles under censorship. Statist. Papers 48, 437–457.
Tse, S.M., 2005. Quantile process for left truncated and right censored data. Ann. Inst. Statist. Math. 57, 61–69.
Van Keilegom, I., Veraverbeke, N., 1998. Bootstrapping quantiles in a fixed design regression model with censored data. J. Statist. Plan. Inference 69, 115–131.
Volkonskii, V.A., Rozanov, Y.A., 1959. Some limit theorems for random functions. Theory Probab. Appl. 4, 178–197.
Withers, C.S., 1981. Conditions for linear processes to be strong mixing. Z. Wahrsch. verw. Gebiete 57, 477–480.
Woodroofe, M., 1985. Estimating a distribution function with truncated data. Ann. Statist. 13, 163–177.
Xiang, X., 1995. Deficiency of samples quantile estimator with respect to kernel estimator for censored data. Ann. Statist. 23, 836–854.
Xiang, X., 1996. A kernel estimator of a conditional quantile. J. Multivariate Anal. 59, 206–216.
Zhang, B., 1995. M-estimation and quantile estimation in the presence of auxiliary information. J. Statist. Plan. Inference 44, 77–94.
Zhou, X., Sun, L.Q., Ren, H.B., 2000. Quantile estimation for left truncated and right censored data. Statist. Sinica 10, 1217–1229.
Zhou, Y., Liang, H., 2000. Asymptotic normality for L1 norm kernel estimator of conditional median under α-mixing dependence. J. Multivariate Anal. 73, 136–154.