Empirical likelihood for the parametric part in partially linear errors-in-function models



Statistics and Probability Letters 82 (2012) 63–66


Statistics and Probability Letters journal homepage: www.elsevier.com/locate/stapro

Zhensheng Huang

School of Mathematics, Hefei University of Technology, Hefei, 230009, PR China

Article info

Article history: Received 22 February 2011; received in revised form 17 August 2011; accepted 24 August 2011; available online 5 September 2011.

MSC: 62G15; 62J99

Keywords: Confidence region; Empirical likelihood; Errors in function; Partially linear model

Abstract

Partially linear errors-in-function models were proposed by Liang (2000), but inference for them has not been studied systematically. This article proposes an empirical likelihood method for constructing confidence regions for the parametric components. Under mild regularity conditions, a nonparametric version of Wilks' theorem is derived. Simulation studies show that the proposed empirical likelihood method provides narrower confidence regions, as well as higher coverage probabilities, than the traditional normal approximation method. © 2011 Elsevier B.V. All rights reserved.

0167-7152/$ – see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.spl.2011.08.020

1. Introduction

Liang (2000) proposed the partially linear errors-in-function model

\[ Y_i = g(T_i) + X_i^T \beta + \varepsilon_i, \qquad i = 1, \ldots, n, \tag{1} \]

where $g(\cdot)$ is an unknown univariate function, $\beta \in \mathbb{R}^p$ is an unknown parameter vector, $X_i \in \mathbb{R}^p$, $T_i \in \mathbb{R}$ and $Y_i \in \mathbb{R}$, and the error $\varepsilon_i$ is independent of $(X_i, T_i)$ with $E(\varepsilon_i) = 0$ and $\mathrm{Var}(\varepsilon_i) = \sigma^2$. In model (1) the covariates $T_i$ are assumed to be measured with error: we observe only their surrogates $W_i$, which may be expressed as

\[ W_i = T_i + U_i, \qquad i = 1, \ldots, n, \tag{2} \]

where the measurement errors $U_i$ are independent and identically distributed, independent of $(Y_i, X_i, T_i)$, with mean zero and covariance matrix $\Sigma_{uu}$. To ensure that model (1) is identifiable, we consider only the case in which $U$ has a known distribution. A more detailed discussion can be found in Fan and Truong (1993).

When the covariate $T$ is observed exactly, model (1) reduces to the partially linear model, which has attracted much attention because it combines the traditional linear regression model with nonparametric regression. See, for example, Engle et al. (1986), Speckman (1988), Härdle et al. (2000) and the references therein. For model (1), Liang (2000) applied the deconvolution method of Fan and Truong (1993) to obtain a root-n consistent estimator of $\beta$, which was shown to be asymptotically normal under suitable conditions. To the best of our knowledge, Liang (2000) was the first to explore model (1). However, little attention has been paid to inference for this model.

In this paper, we make statistical inference for the parametric components $\beta$, which are of primary interest in model (1). In principle, the asymptotic results of Liang (2000) can be used directly to construct asymptotically correct confidence regions for the parameters of interest. However, the finite-sample performance of these confidence regions may not be appealing, mainly because the asymptotic covariance matrix has a complex structure and has to be estimated by the plug-in method. To address this issue, we recommend using the empirical likelihood method to construct confidence regions for $\beta$.

The empirical likelihood method was introduced by Owen (1988), who pointed out that it has many appealing features. A detailed discussion of empirical likelihood is given in Owen (2001). Hjort et al. (2009) and Peng and Schick (2010) give extensions that allow for estimated constraints and for the number of constraints to go to infinity. We shall use a result in Peng and Schick (2010) on estimated constraints to obtain our main result. Their result avoids verifying the cumbersome condition that the convex hull generated by the random vectors used in the estimated constraints contains the origin as an interior point with probability tending to one. The verification of this condition has frequently been overlooked in the literature. Other applications of the empirical likelihood method can be found in Owen (1990, 1991), Chen (1994), Kolaczyk (1994), Wang and Li (2002), Zhu and Xue (2006) and Huang et al. (2010).

There is some literature on semiparametric inference for partially linear models based on the empirical likelihood method. Shi and Lau (2000) studied empirical likelihood inference for partially linear models with all covariates measured exactly. Lu (2009) developed empirical-likelihood-based inference for heteroscedastic partially linear models.
Cui and Kong (2006) used the empirical likelihood method for partially linear models in which the covariates of the linear component are measured with error. Motivated by these authors and by Liang (2000), in this paper we study empirical likelihood inference for the parameter $\beta$ in the partially linear model when the covariate $T$ is measured with error. The proposed empirical likelihood statistic is shown to be asymptotically a standard chi-square variable, and simulation studies are conducted to compare the proposed method with the traditional normal approximation method.

The rest of this paper is organized as follows. Section 2 defines the empirical likelihood ratio statistic for $\beta$, states the assumptions and main results, and constructs confidence regions for the parameters. Section 3 provides an example based on simulated data.

2. Empirical likelihood method

Assume that $(X_i, Y_i, W_i)$, $i = 1, \ldots, n$, are i.i.d. samples from model (1) with (2). Let $f_W(\cdot)$ and $f_T(\cdot)$ be the densities of $W$ and $T$, respectively. By the method of Stefanski and Carroll (1990) and Fan and Truong (1993), $f_T(\cdot)$ can be estimated by

\[ \hat f_n(t) = \frac{1}{nh} \sum_{j=1}^n K_n\left(\frac{t - W_j}{h}\right) \quad \text{with} \quad K_n(t) = \frac{1}{2\pi} \int_{\mathbb{R}^1} \exp(-\mathrm{i} s t)\, \frac{\phi_K(s)}{\phi_U(s/h)}\, ds, \]

where $K(\cdot)$ is a kernel function, $h$ is a bandwidth, $\phi_K(\cdot)$ is the Fourier transform of $K(\cdot)$, and $\phi_U(\cdot)$ is the characteristic function of the error variable $U$.

Following the arguments of Liang (2000), we define

\[ \omega_{ni}(t) = K_n\left(\frac{t - W_i}{h}\right) \Big/ \sum_{j} K_n\left(\frac{t - W_j}{h}\right) = \frac{1}{nh} K_n\left(\frac{t - W_i}{h}\right) \Big/ \hat f_n(t). \]

To define an empirical likelihood ratio statistic for $\beta$, we first briefly review the estimation procedure proposed by Liang (2000). Since $g(t) = E(Y - X^T \beta \mid T = t)$, if $\beta$ is known the estimator of $g(\cdot)$ can be defined as $\breve g(t; \beta) = \sum_{i=1}^n \omega_{ni}(t)(Y_i - X_i^T \beta)$. The estimator of $\beta$ can then be defined by minimizing $\sum_{i=1}^n \{Y_i - X_i^T \beta - \breve g(W_i; \beta)\}^2$. By the least-squares method, the estimator $\hat\beta$ of $\beta$ is

\[ \hat\beta = (\tilde X^T \tilde X)^{-1} \tilde X^T \tilde Y, \tag{3} \]

where $\tilde X = (\tilde X_1, \ldots, \tilde X_n)^T$ and $\tilde Y = (\tilde Y_1, \ldots, \tilde Y_n)^T$, with $\tilde X_i = X_i - \sum_{j=1}^n \omega_{nj}(W_i) X_j$ and $\tilde Y_i = Y_i - \sum_{j=1}^n \omega_{nj}(W_i) Y_j$. The final estimator of $g(\cdot)$ can then be defined as

\[ \hat g(t) = \sum_{i=1}^n \omega_{ni}(t)(Y_i - X_i^T \hat\beta). \tag{4} \]
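The estimation steps in (3) and (4) can be sketched numerically. The Python fragment below is a minimal illustration under stated assumptions, not the author's implementation: the function names (`weight_matrix`, `fit_beta`, `fit_g`) are hypothetical, the kernel is passed in as a callable, and the demonstration uses an ordinary quartic kernel, error-free covariates and a toy regression function purely as stand-ins. In the errors-in-function setting the deconvoluting kernel defined above would be supplied instead.

```python
import numpy as np

def quartic_kernel(u):
    """Ordinary quartic kernel, used here only as a stand-in (assumption)."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 15.0 / 16.0 * (1.0 - u**2) ** 2, 0.0)

def weight_matrix(w, h, kernel):
    """Return the n x n matrix whose entry [i, j] is omega_{nj}(W_i)."""
    k = kernel((w[:, None] - w[None, :]) / h)   # k[i, j] = K((W_i - W_j) / h)
    return k / k.sum(axis=1, keepdims=True)     # normalise each row to sum to one

def fit_beta(x, y, w, h, kernel):
    """Least-squares estimator (3) computed from the centred data."""
    omega = weight_matrix(w, h, kernel)
    x_tilde = x - omega @ x                     # X_i - sum_j omega_nj(W_i) X_j
    y_tilde = y - omega @ y                     # Y_i - sum_j omega_nj(W_i) Y_j
    beta_hat, *_ = np.linalg.lstsq(x_tilde, y_tilde, rcond=None)
    return beta_hat, x_tilde, y_tilde

def fit_g(t_grid, x, y, w, h, kernel, beta_hat):
    """Estimator (4) of g evaluated at the points in t_grid."""
    k = kernel((t_grid[:, None] - w[None, :]) / h)
    omega = k / k.sum(axis=1, keepdims=True)
    return omega @ (y - x @ beta_hat)

# Toy demonstration: p = 1, true beta = 1, no measurement error (W = T),
# and an arbitrary smooth g chosen only for illustration.
rng = np.random.default_rng(0)
n = 200
t = rng.normal(0.5, 0.2, n)
x = rng.normal(1.0, 0.5, (n, 1))
y = x[:, 0] * 1.0 + np.sin(2 * np.pi * t) + rng.normal(0.0, 0.4, n)
beta_hat, x_tilde, y_tilde = fit_beta(x, y, t, 0.15, quartic_kernel)
g_hat = fit_g(np.linspace(0.3, 0.7, 5), x, y, t, 0.15, quartic_kernel, beta_hat)
```

A deconvoluting kernel can take negative values, so the row sums used for normalisation may come close to zero for small bandwidths; a practical implementation would need to guard against that.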

By the definition of $\hat\beta$ in (3), $\hat\beta$ is the solution of the equation $\tilde X^T \tilde Y - \tilde X^T \tilde X \beta = 0$, which is equivalent to $\sum_{i=1}^n \tilde X_i (\tilde Y_i - \tilde X_i^T \beta) = 0$. Consequently, an empirical likelihood ratio function for $\beta$ can be defined as

\[ \tilde R_n(\beta) = \max\left\{ \prod_{i=1}^n n p_i : p_i \geq 0,\ \sum_{i=1}^n p_i = 1,\ \sum_{i=1}^n p_i \tilde\xi_i(\beta) = 0 \right\}, \tag{5} \]

where $\tilde\xi_i(\beta) = \tilde X_i (\tilde Y_i - \tilde X_i^T \beta)$.

To derive the nonparametric Wilks' theorem for $\tilde R_n(\beta)$, we require some regularity conditions, which are taken mainly from Liang (2000) with minor modifications and are listed here for easy reference. We identify the error distribution with its characteristic function $\phi_U(\cdot)$, and define $\gamma_j(t) = E(X_{ij} \mid T_i = t)$ and $V_{ij} = X_{ij} - \gamma_j(T_i)$, where $i = 1, \ldots, n$ and $j = 1, \ldots, p$.

Condition A:

(A1) The functions $g(\cdot)$ and $\gamma_j(\cdot)$ are Lipschitz continuous of order 1; $\sup_{0 \leq t \leq 1} E(\|X_1\|^3 \mid T = t) < \infty$ and $E(|\varepsilon|^3 + \|U\|^3) < \infty$; the matrix $\Sigma_1 = E(V_1 V_1^T)$ is positive definite, where $V_i = (V_{i1}, \ldots, V_{ip})^T$.

(A2) The kernel function $K(t)$ is a kth-order kernel, that is, $\int_{-\infty}^{\infty} K(t)\,dt = 1$, $\int_{-\infty}^{\infty} t^l K(t)\,dt = 0$ for $l = 1, \ldots, k - 1$, and $\int_{-\infty}^{\infty} t^l K(t)\,dt \neq 0$ for $l = k$.

(A3) The marginal density of the unobserved covariate $T$, say $f_T(\cdot)$, is bounded away from 0 on the interval $[0, 1]$; $f_T(\cdot)$ has a bounded kth derivative, where $k$ is a positive integer; the characteristic function $\phi_U(\cdot)$ does not vanish, and the distribution of $U$ is ordinary smooth or super smooth.

Condition B:

(B1) When $\phi_U(\cdot)$ is super smooth, the variables $X$ and $T$ are mutually independent, the function $\phi_K(t)$ has bounded support on $|t| \leq M_0$, and we take $h = c (\log n)^{-1/\delta}$ with $c > M_0 (2/\zeta)^{1/\delta}$, where $\delta$ and $\zeta$ are positive constants.

(B2) When $\phi_U(\cdot)$ is ordinary smooth, we take $h = d n^{-1/(2k + 2\delta + 1)}$ with $d > 0$ and $2k > 2\delta + 1$, and suppose that, as $t \to \infty$, $t^{\delta} \phi_U(t) \to c$ and $t^{\delta + 1} \phi_U'(t) = O(1)$ for some constant $c \neq 0$, and that

\[ \int_{-\infty}^{\infty} |t|^{\delta + 1} (\phi_K(t) + \phi_K'(t))\,dt < \infty, \qquad \int_{-\infty}^{\infty} |t^{\delta + 1} \phi_K(t)|^2\,dt < \infty. \]
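The two bandwidth rules in Condition B can be written out as a small sketch. The helper names below are hypothetical, and the numerical values of $c$, $d$, $k$, $\delta$, $\zeta$ and $M_0$ in the example calls are illustrative assumptions, not values taken from the paper.

```python
import math

def bandwidth_super_smooth(n, c, delta, zeta, m0):
    """h = c (log n)^(-1/delta), with c > M0 (2/zeta)^(1/delta) as in (B1)."""
    if c <= m0 * (2.0 / zeta) ** (1.0 / delta):
        raise ValueError("(B1) requires c > M0 * (2 / zeta) ** (1 / delta)")
    return c * math.log(n) ** (-1.0 / delta)

def bandwidth_ordinary_smooth(n, d, k, delta):
    """h = d n^(-1/(2k + 2 delta + 1)), with d > 0 and 2k > 2 delta + 1 as in (B2)."""
    if d <= 0 or 2 * k <= 2 * delta + 1:
        raise ValueError("(B2) requires d > 0 and 2k > 2 * delta + 1")
    return d * n ** (-1.0 / (2 * k + 2 * delta + 1))

# Illustrative constants only (assumptions for the sake of the example).
h_super = bandwidth_super_smooth(n=100, c=1.0, delta=2.0, zeta=8.0, m0=1.0)
h_ordinary = bandwidth_ordinary_smooth(n=100, d=1.0, k=4, delta=2.0)
```

The super smooth rule shrinks only at a logarithmic rate in $n$, whereas the ordinary smooth rule shrinks at a polynomial rate.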

Remark 1. The detailed definitions of super smooth and ordinary smooth distributions in (A3) can be found in Fan and Truong (1993).

Using arguments similar to those in Liang (2000), one verifies that

\[ \frac{1}{n} \sum_{i=1}^n |\tilde\xi_i(\beta) - V_i \varepsilon_i|^2 = o_p(1) \quad \text{and} \quad \frac{1}{n} \sum_{i=1}^n [\tilde\xi_i(\beta) - V_i \varepsilon_i] = o_p(n^{-1/2}). \]

Thus, in view of the discussion following Theorem 6.1 in Peng and Schick (2010), one obtains the following result.

Theorem 1. Suppose that Condition A and either (B1) or (B2) hold. If $\beta$ is the true value of the parameter, then

\[ -2 \log\{\tilde R_n(\beta)\} \xrightarrow{d} \chi_p^2, \tag{6} \]

where $\xrightarrow{d}$ stands for convergence in distribution and $\chi_p^2$ is a standard chi-square distribution with $p$ degrees of freedom.

Theorem 1 can be used to construct a confidence region for the parameter $\beta$. More precisely, for any $0 < \alpha < 1$, let $c_\alpha$ be such that $P(\chi_p^2 \leq c_\alpha) = 1 - \alpha$. Then a confidence region for $\beta$ with asymptotically correct coverage probability $1 - \alpha$ can be defined as $C_\alpha = \{\beta : -2 \log\{\tilde R_n(\beta)\} \leq c_\alpha\}$.

Corollary. Under the conditions of Theorem 1, we have

\[ P(\beta \in C_\alpha) \longrightarrow 1 - \alpha \quad \text{as } n \to \infty. \]
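Checking whether a candidate $\beta$ belongs to the confidence region requires evaluating the empirical likelihood ratio. A standard route (see Owen, 2001) uses the Lagrange dual: the maximising weights are $p_i = 1 / \{n (1 + \lambda^T \xi_i)\}$, where $\lambda$ solves $\sum_i \xi_i / (1 + \lambda^T \xi_i) = 0$, so that $-2 \log R = 2 \sum_i \log(1 + \lambda^T \xi_i)$. The sketch below solves for $\lambda$ with a damped Newton iteration. The function names, the synthetic centred data and the hard-coded chi-square quantile for $p = 1$ are assumptions made for illustration, and the iteration is only meaningful when the origin lies inside the convex hull of the $\xi_i$.

```python
import numpy as np

def neg2_log_el(xi, max_iter=50, tol=1e-10):
    """Return -2 log R for the constraint sum_i p_i xi_i = 0; xi has shape (n, p)."""
    xi = np.asarray(xi, dtype=float)
    lam = np.zeros(xi.shape[1])

    def objective(l):
        # Convex dual objective; infinite outside the region 1 + l' xi_i > 0.
        arg = 1.0 + xi @ l
        return -np.sum(np.log(arg)) if np.all(arg > 0) else np.inf

    for _ in range(max_iter):
        arg = 1.0 + xi @ lam
        grad = -np.sum(xi / arg[:, None], axis=0)
        if np.linalg.norm(grad) < tol:
            break
        hess = (xi / arg[:, None] ** 2).T @ xi
        step = np.linalg.solve(hess, -grad)
        scale = 1.0
        # Halve the Newton step until the objective does not increase.
        while objective(lam + scale * step) > objective(lam) and scale > 1e-12:
            scale *= 0.5
        lam = lam + scale * step
    return 2.0 * np.sum(np.log(1.0 + xi @ lam))

def el_statistic(beta, x_tilde, y_tilde):
    """-2 log R_n(beta) with xi_i(beta) = x_tilde_i (y_tilde_i - x_tilde_i' beta)."""
    resid = y_tilde - x_tilde @ beta
    return neg2_log_el(x_tilde * resid[:, None])

# Synthetic centred data (assumption): p = 1 and true beta = 1.
rng = np.random.default_rng(1)
x_tilde = rng.normal(0.0, 0.5, (100, 1))
y_tilde = x_tilde[:, 0] * 1.0 + rng.normal(0.0, 0.4, 100)
stat = el_statistic(np.array([1.0]), x_tilde, y_tilde)

CHI2_1_095 = 3.841458820694124   # 0.95 quantile of chi-square with 1 degree of freedom
in_region = stat <= CHI2_1_095   # True when beta = 1 lies in the 95% region
```

For $p = 1$ the region is an interval, so its end points can be located by evaluating `el_statistic` over a grid of candidate values and keeping those at or below the quantile.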

3. Simulation studies

To illustrate the finite-sample performance of the proposed method, we carried out a small simulation study. We considered two approaches: the empirical likelihood (EL) method of Section 2 and the normal approximation (NA) method proposed by Liang (2000); details of the latter can be found in the Theorem of Liang (2000). The simulation study compares the two methods in terms of coverage probabilities (CP) and average lengths (AL) of the confidence intervals. We also compare the results obtained when the measurement error is taken into account (CME) with those obtained when it is ignored (IME).

Consider the following partially linear errors-in-function model:

\[ Y_i = X_i^T \beta + g(T_i) + \varepsilon_i \quad \text{and} \quad W_i = T_i + U_i, \qquad i = 1, \ldots, n, \tag{7} \]

where $\beta = 1$, $X_i \sim N(1, 0.5^2)$, $T_i \sim N(0.5, 0.2^2)$, $\varepsilon_i \sim N(0, 0.4^2)$ and $g(t) = 6 t_+^3 (1 - t)_+^3$. Following the error distribution used in Fan and Truong (1993), we consider the normal error $U_i \sim N(0, \Sigma_{uu})$. We take $\Sigma_{uu} = 0.125^2$ and $0.250^2$ to represent different levels of measurement error, and suppose that the kernel function $K(\cdot)$ has Fourier transform $\phi_K(t) = (1 - t^2)_+^3$. Then

\[ K_n(t) = \frac{1}{\pi} \int_0^1 \cos(st) (1 - s^2)^3 \exp\left( \frac{\Sigma_{uu} s^2}{2 h^2} \right) ds. \]
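As a rough illustration of the design in (7), the sketch below draws one simulated sample and evaluates the kernel integral above by the trapezoidal rule. All function names are hypothetical, the grid size is an arbitrary choice, and the form of `g` follows our reading of the regression function in (7), which should be treated as an assumption.

```python
import numpy as np

def g(t):
    """Regression function as read from (7): 6 * t_+^3 * (1 - t)_+^3 (assumed reading)."""
    t = np.asarray(t, dtype=float)
    return 6.0 * np.maximum(t, 0.0) ** 3 * np.maximum(1.0 - t, 0.0) ** 3

def simulate(n, sigma_u, rng):
    """Draw (X, Y, W) from model (7) with beta = 1 and U ~ N(0, sigma_u^2)."""
    x = rng.normal(1.0, 0.5, n)
    t = rng.normal(0.5, 0.2, n)
    eps = rng.normal(0.0, 0.4, n)
    y = x * 1.0 + g(t) + eps
    w = t + rng.normal(0.0, sigma_u, n)   # surrogate of the unobserved T
    return x, y, w

def deconvoluting_kernel(t, h, sigma_u, n_grid=2001):
    """(1/pi) int_0^1 cos(st) (1 - s^2)^3 exp(sigma_u^2 s^2 / (2 h^2)) ds, trapezoidal rule."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    s = np.linspace(0.0, 1.0, n_grid)
    integrand = (np.cos(np.outer(t, s)) * (1.0 - s**2) ** 3
                 * np.exp(sigma_u**2 * s**2 / (2.0 * h**2)))
    ds = s[1] - s[0]
    return (integrand[:, 1:] + integrand[:, :-1]).sum(axis=1) * ds / 2.0 / np.pi

rng = np.random.default_rng(2012)
x, y, w = simulate(100, 0.125, rng)                       # Sigma_uu = 0.125^2
k_values = deconvoluting_kernel((0.5 - w) / 0.3, 0.3, 0.125)
```

The exponential factor grows quickly as $h$ decreases relative to the error standard deviation, which is one way to see why bandwidths for normal measurement error cannot be taken too small.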

This kernel function was also used in Liang (2000). In the IME case, we use the quartic kernel $\frac{15}{16} (1 - t^2)^2 I\{|t| \leq 1\}$.

In our simulations, the sample sizes were taken as 100, 200 and 300. We consider confidence intervals for $\beta$ and their coverage probabilities, with nominal level $1 - \alpha = 0.95$, each computed from 1000 runs. The leave-one-sample-out method is used to select the bandwidth $h$. This method has been widely applied in practice; see, for example, Fan and Huang (2005) and Fan and Gijbels (1996). The cross-validation score for $h$ can be defined as

\[ CV(h) = \frac{1}{n} \sum_{i=1}^n \left\{ Y_i - X_i^T \hat\beta_{-i} - \hat g_{-i}(W_i) \right\}^2, \]

where $\hat\beta_{-i}$ is computed from the data with the ith observation deleted and $\hat g_{-i}(\cdot)$ is the estimator defined in (4) with $\hat\beta$ replaced by $\hat\beta_{-i}$. All the numerical results are reported in Table 1.

Table 1. Coverage probabilities (CP) and average lengths (AL) of the confidence intervals based on the empirical likelihood (E) and the normal approximation (N), with nominal confidence level 0.95, in the CME and IME cases.

| Case | n | CP (E) | CP (N) | AL (E) | AL (N) |
|---|---|---|---|---|---|
| CME ($\Sigma_{uu} = 0.125^2$) | 100 | 0.923 | 0.916 | 0.475 | 0.482 |
| | 200 | 0.937 | 0.931 | 0.431 | 0.450 |
| | 300 | 0.943 | 0.940 | 0.259 | 0.293 |
| CME ($\Sigma_{uu} = 0.250^2$) | 100 | 0.910 | 0.905 | 0.516 | 0.537 |
| | 200 | 0.927 | 0.928 | 0.457 | 0.460 |
| | 300 | 0.935 | 0.931 | 0.301 | 0.312 |
| IME | 100 | 0.899 | 0.885 | 0.716 | 0.734 |
| | 200 | 0.914 | 0.910 | 0.630 | 0.694 |
| | 300 | 0.918 | 0.914 | 0.528 | 0.590 |

From the simulation results in Table 1, we draw the following conclusions:

(1) For both levels of measurement error, the coverage probabilities obtained by the EL and NA methods approach the nominal level 0.95, and the average lengths decrease, as n increases; the EL method generally performs better than the NA method in terms of coverage probability and average length of the confidence intervals.

(2) The coverage probabilities decrease and the average lengths increase as the measurement error variance $\Sigma_{uu}$ increases; our limited simulation study suggests that both the EL and NA methods perform better in the CME case than in the IME case.

Acknowledgments

The author thanks the Editor, the Associate Editor, and the referees for their constructive comments and helpful suggestions, which substantially improved an earlier version of this paper. This research was supported partially by the National Natural Science Foundation of China (grant 11101114).

References

Chen, S.X., 1994. Empirical likelihood confidence intervals for linear regression coefficients. Journal of Multivariate Analysis 49, 24–40.
Cui, H.J., Kong, E.F., 2006. Empirical likelihood confidence region for parameters in semi-linear errors-in-variables models. Scandinavian Journal of Statistics 33, 153–168.
Engle, R.F., Granger, C.W., Rice, J., Weiss, A., 1986. Semiparametric estimates of the relation between weather and electricity sales. Journal of the American Statistical Association 81, 310–320.
Fan, J., Truong, Y.K., 1993. Nonparametric regression with errors in variables. Annals of Statistics 21, 1900–1925.
Fan, J., Gijbels, I., 1996. Local Polynomial Modelling and its Applications. Chapman & Hall, London.
Fan, J., Huang, T., 2005. Profile likelihood inferences on semi-parametric varying-coefficient partially linear models. Bernoulli 11, 1031–1057.
Härdle, W., Liang, H., Gao, J., 2000. Partially Linear Models. Springer Physica-Verlag, Heidelberg.
Hjort, N.L., McKeague, I.W., Van Keilegom, I., 2009. Extending the scope of empirical likelihood. Annals of Statistics 37, 1079–1111.
Huang, Z.S., Zhou, Z.G., Jiang, R., Qian, W.M., Zhang, R., 2010. Empirical likelihood based inference for semiparametric varying coefficient partially linear models with error-prone linear covariates. Statistics & Probability Letters 80, 497–504.
Kolaczyk, E.D., 1994. Empirical likelihood for generalized linear models. Statistica Sinica 4, 199–218.
Liang, H., 2000. Asymptotic normality of parametric part in partially linear models with measurement error in the nonparametric part. Journal of Statistical Planning and Inference 86, 51–62.
Lu, X.W., 2009. Empirical likelihood for heteroscedastic partially linear models. Journal of Multivariate Analysis 100, 387–396.
Owen, A.B., 1988. Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75, 237–249.
Owen, A.B., 1990. Empirical likelihood ratio confidence regions. The Annals of Statistics 18, 90–120.
Owen, A.B., 1991. Empirical likelihood for linear models. The Annals of Statistics 19, 1725–1747.
Owen, A.B., 2001. Empirical Likelihood. Chapman & Hall CRC, London, Boca Raton, FL.
Peng, H., Schick, A., 2010. An empirical likelihood approach to goodness of fit testing. Technical Report, Department of Mathematical Sciences, Binghamton University. Available at http://www.math.binghamton.edu/anton/preprint.html.
Shi, J., Lau, T.S., 2000. Empirical likelihood for partially linear models. Journal of Multivariate Analysis 72, 132–149.
Speckman, P., 1988. Kernel smoothing in partial linear models. Journal of the Royal Statistical Society, Series B 50, 413–436.
Stefanski, L.A., Carroll, R.J., 1990. Deconvoluting kernel density estimators. Statistics 21, 169–184.
Wang, Q.H., Li, G., 2002. Empirical likelihood semiparametric regression analysis under random censorship. Journal of Multivariate Analysis 83, 469–486.
Zhu, L.X., Xue, L.G., 2006. Empirical likelihood confidence regions in a partially linear single-index model. Journal of the Royal Statistical Society, Series B 68, 549–570.