Journal of Statistical Planning and Inference
journal homepage: www.elsevier.com/locate/jspi
Estimation of general semi-parametric quantile regression
Yan Fan, Lixing Zhu*
Renmin University of China and Hong Kong Baptist University, Hong Kong
Article info
Article history: Received 14 March 2012; Received in revised form 10 November 2012; Accepted 13 November 2012
Keywords: Quantile regression; Single-index model; Outer-product gradient estimation (OPG); qOPG; Eigenvector

Abstract
Quantile regression, introduced by Koenker and Bassett (1978), produces a comprehensive picture of how a response variable depends on predictors. In this paper, we propose a general semi-parametric model, in which part of the predictors enter through a single index, to model the relationship between the conditional quantiles of the response and the predictors. Special cases are single-index models, partially linear single-index models and varying-coefficient single-index models. We propose the qOPG, a quantile regression version of the outer-product gradient estimation method (OPG; Xia et al., 2002), to estimate the single index. Large-sample properties, simulation results and a real-data analysis are provided to examine the performance of the qOPG.
© 2012 Published by Elsevier B.V.
1. Introduction
Regression is used to quantify the relationship between a response variable $Y$ and a $p$-variate covariate $X$ or, more specifically, $F_{Y|X}$, the conditional distribution of $Y$ given $X$. Unlike mean regression, which characterizes only the conditional mean of $F_{Y|X}$, quantile regression gives a more complete picture of the relationship between the response variable $Y$ and the covariate $X$. Quantile regression was first studied in the linear setting by Koenker and Bassett (1978) in their seminal work. Since then it has seen deep developments in theory, methodology and applications. For example, several nonparametric quantile regression models have been studied (Stone, 1977; Chaudhuri, 1991; Koenker et al., 1994; Yu and Jones, 1997, 1998). When some information on the form of the conditional quantile function is available, more efficient quantile regression models, such as semi-parametric models or nonparametric models with specific forms, are preferable (Gooijer and Zerom, 2003; Horowitz and Lee, 2005; Wu et al., 2010; Kong and Xia, 2011). For a more comprehensive review of quantile regression, see Koenker (2005) and references therein.

In this paper we propose a general semi-parametric quantile regression model. Suppose $Y$ is a response variable and $(X,Z)$ is a $(p+q)$-variate covariate with $X = (X^{(1)},\ldots,X^{(p)})^\top$ and $Z = (Z^{(1)},\ldots,Z^{(q)})^\top$. For $0 < \tau < 1$, we propose to model $Q_\tau(Y|X,Z)$, the $\tau$ quantile of $Y$ given $(X,Z)$, by a general semi-parametric model
$$Q_\tau(Y|X,Z) = G(X^\top\beta_0, Z), \tag{1}$$
where $G(\cdot,\cdot)$ denotes an unknown smooth function and $\beta_0$ is the unknown single-index coefficient. The dependence of $G$ and $\beta_0$ on $\tau$ is suppressed as long as this causes no ambiguity.
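For orientation, the special cases named in the abstract can be read off from model (1) by restricting the form of $G$. The display below is only a sketch: the component names $g$, $g_k$ and $\theta_0$ are borrowed from the simulation examples later in the paper and are not defined at this point of the text.

```latex
\begin{align*}
\text{single-index:} \quad & G(u,z) = g(u), \\
\text{partially linear single-index:} \quad & G(u,z) = g_0(u) + z^\top\theta_0, \\
\text{varying-coefficient single-index:} \quad & G(u,z) = g_0(u) + \sum_{k=1}^{q} g_k(u)\, z^{(k)},
\end{align*}
\qquad \text{with } u = x^\top\beta_0 .
```

In each case the covariate $X$ enters only through the scalar index $u$, which is what the estimation of $\beta_0$ exploits.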
* Corresponding author. E-mail address: [email protected] (L. Zhu).
0378-3758/$ - see front matter © 2012 Published by Elsevier B.V. http://dx.doi.org/10.1016/j.jspi.2012.11.005
Please cite this article as: Fan, Y., Zhu, L., Estimation of general semi-parametric quantile regression. Journal of Statistical Planning and Inference (2012), http://dx.doi.org/10.1016/j.jspi.2012.11.005
This semi-parametric model is rather general and flexible. It relates the covariate $X$ to the conditional quantile of $Y$ through a single index $X^\top\beta_0$. This allows the dimension of $X$ to be large, as the single index circumvents the so-called "curse of dimensionality" in high-dimensional analysis. A nonparametric regression function then links the conditional quantile with the single index $X^\top\beta_0$ and the other covariate $Z$. The model includes many commonly used models as special cases, such as single-index quantile regression models (Chaudhuri et al., 1997; Wu et al., 2010; Kong and Xia, 2011), partially linear quantile regression models (He and Shi, 1996; Lee, 2003), varying-coefficient quantile regression models (Honda, 2004; Kim, 2007; Cai and Xu, 2009) and partially linear varying-coefficient quantile regression models (Wang et al., 2009; Kai et al., 2011).

One important concern with model (1) is how to determine the covariates $X$ and $Z$ for given data. In general, two approaches are commonly used. If the data contain categorical or discrete explanatory variables, a canonical partition takes the continuous covariates as $X$ and the remaining categorical or discrete covariates as $Z$, as we do in our real-data analysis. In other cases, the partition can follow the meanings of the explanatory variables. For example, when analyzing circulatory and respiratory problems in Hong Kong, Xia and Härdle (2006) partitioned the explanatory variables into $X$ and $Z$ according to whether they stand for weather conditions or air pollutants.

The focus of this paper is on the estimation of $\beta_0$. For model identification, we assume that $\|\beta_0\| = 1$, where $\|\cdot\|$ denotes the Euclidean norm, and that the first component of $\beta_0$ is positive. This problem was first considered by Koenker and Bassett (1978) in linear quantile regression, and then by Chaudhuri et al. (1997) in single-index quantile regression, in which the average derivative estimation (ADE) method was proposed. However, the involvement of a high-dimensional kernel in ADE hinders its popularity. Another drawback of the ADE is that, if the expectation of the derivative of the conditional quantile function with respect to $X$ is zero, it fails in theory to provide a consistent estimate of $\beta_0$. Wu et al. (2010) proposed a two-stage minimization procedure, shown to be more efficient than the ADE, for the same purpose in single-index quantile regression.

In this paper we propose a quantile regression based outer-product gradient (qOPG) estimation procedure to estimate $\beta_0$ in the general semi-parametric model. The idea is motivated by the outer-product gradient (OPG) method of Xia et al. (2002), which produces efficient estimates of the single-index coefficient in mean regression. The raw qOPG estimate is based on local linear estimates involving a high-dimensional kernel function. To alleviate the negative effect of the high-dimensional kernel, we take the raw qOPG estimate as an initial value and propose a refined qOPG estimate that uses a lower-dimensional kernel. We show that the qOPG estimates are root-n consistent, an optimal convergence rate that is also achieved by the ADE estimate. Moreover, one advantage of the qOPG estimates over the ADE estimates is that they work even if the derivative of the conditional quantile function with respect to $X$ has mean zero, provided it does not vanish identically.

The rest of the paper is organized as follows. In Section 2, we present the proposed qOPG estimates and discuss their consistency. Simulations are conducted in Section 3 to provide evidence for the effectiveness of the qOPG estimate. Regularity conditions and technical proofs are relegated to the Appendix.
2. Methodology
2.1. The qOPG procedure
Suppose $G(u,z)$ is smooth enough. Let $G_1(u,z)$ and $G_2(u,z)$ denote the partial derivatives of $G(u,z)$ with respect to $u$ and $z$, respectively. Then $\partial G(X^\top\beta_0,Z)/\partial X = G_1(X^\top\beta_0,Z)\beta_0$. The observation that all such partial derivatives are parallel to the single-index direction $\beta_0$ motivates us to estimate $\beta_0$ through estimating the partial derivative $\partial G(X^\top\beta_0,Z)/\partial X$ or $E[\partial G(X^\top\beta_0,Z)/\partial X]$ (Chaudhuri et al., 1997).

A ready method for estimating $\partial G(X^\top\beta_0,Z)/\partial X$ is the local polynomial estimation procedure (Fan, 1992, 1993). Suppose that $\{(Y_i,X_i,Z_i),\ i=1,2,\ldots,n\}$ is a random sample from $(Y,X,Z)$. For $(X_i,Z_i)$ in the neighborhood of $(X_j,Z_j)$, a first-order Taylor approximation gives
$$G(X_i^\top\beta_0,Z_i) \approx a_j + b_j^\top(X_i - X_j) + c_j^\top(Z_i - Z_j),$$
where
$$a_j = G(X_j^\top\beta_0,Z_j), \qquad b_j = G_1(X_j^\top\beta_0,Z_j)\beta_0 \qquad \text{and} \qquad c_j = G_2(X_j^\top\beta_0,Z_j).$$
The local linear smoother defines the estimates of $a_j$, $b_j$ and $c_j$ by
$$(\hat a_j,\hat b_j,\hat c_j) = \arg\min_{a,b,c}\sum_{i=1}^{n}\rho_\tau(Y_i - a - b^\top X_{ij} - c^\top Z_{ij})\,w_{ij}, \tag{2}$$
where $\rho_\tau(u) = u(\tau - I(u<0))$ is the check function, $0<\tau<1$, and $w_{ij}\ge 0$ are some weights. Here and in what follows, $X_{ij} = X_i - X_j$ and $Z_{ij} = Z_i - Z_j$. We shall discuss the choices of $w_{ij}$ later.

Since all $b_j$ are parallel to the single-index direction $\beta_0$, the direction of any linear combination of the $\hat b_j$, or of their average, can be taken as an estimate of $\beta_0$. This is the main idea of the ADE (Chaudhuri et al., 1997). However, it fails when $E\,G_1(X^\top\beta_0,Z)$ is equal or close to zero, because the average of the $\hat b_j$ will also be close to zero and the information about $\beta_0$ in the average will be masked by random error. To overcome this dilemma and incorporate the information of all the $\hat b_j$'s, we propose to estimate $\beta_0$ by the eigenvector of
$$V = \frac{1}{n}\sum_{j=1}^{n}\hat b_j\hat b_j^\top$$
corresponding to its largest eigenvalue. To be consistent with the true single-index coefficient $\beta_0$, we choose the eigenvector whose first component is positive. We call this method the quantile regression based outer-product gradient (qOPG) estimation procedure. The same idea was employed by Xia et al. (2002) in the setting of mean regression, where it is called the outer-product gradient (OPG) method. Fan (1998) showed that the local linear estimate $\hat b_j$ under a general convex loss function is consistent for $b_j$ at fixed $(X_j,Z_j)$. Since the check function $\rho_\tau(\cdot)$ is convex, the matrix $V$ has approximately only one non-zero eigenvalue, and the corresponding eigenvector should be consistent for $\beta_0$.

2.2. Estimation of $\beta_0$
A simple choice of $w_{ij}$ leads to a raw estimate of $\beta_0$. The simplest choice of $w_{ij}$ is a high-dimensional kernel function. Let $m = p + q$.

Assumption 1. The weights $w_{ij} = K(X_{ij}/h, Z_{ij}/h)$, where $K(\cdot)$ is an $m$-dimensional continuous symmetric density and $h$ is a bandwidth. Furthermore, both of the following matrices are finite and positive definite:
$$C_1 = \iint tt^\top K(t,s)\,dt\,ds, \qquad C_2 = \iint tt^\top K^2(t,s)\,dt\,ds.$$

A special case of $K(x,z)$ is $K_1(\|x\|)K_2(\|z\|)$, where $K_i(\cdot)$ $(i=1,2)$ are any univariate density functions, for instance truncated normal and Epanechnikov kernels. The kernel function in Assumption 1 can be extended to accommodate a different bandwidth for each component of the covariates, but this would add considerable computational workload. Thus, in this paper we use the same bandwidth $h$ for both covariates $X$ and $Z$. Although such choices of $w_{ij}$ suffer from the "curse of dimensionality", they lead to consistent estimates of $\beta_0$, as shown in the following theorem.
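The local objective in (2), combined with product-kernel weights of the kind mentioned after Assumption 1, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the Epanechnikov kernel is one admissible choice among many, and only the evaluation of the objective is shown (minimizing it over $(a,b,c)$ would require a linear-programming or similar solver, which is not included).

```python
import math

def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def epanechnikov(t):
    """Univariate Epanechnikov density, supported on [-1, 1]."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0

def diff(u, v):
    return [a - b for a, b in zip(u, v)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def raw_weight(x_ij, z_ij, h):
    """Product-kernel weight K1(||X_ij|| / h) * K2(||Z_ij|| / h).

    Assumption of this sketch: both factors are Epanechnikov kernels; the
    missing normalising constant does not affect the minimiser of (2).
    """
    return epanechnikov(norm(x_ij) / h) * epanechnikov(norm(z_ij) / h)

def local_objective(a, b, c, j, Y, X, Z, h, tau):
    """Kernel-weighted check loss of (2) at the j-th design point."""
    total = 0.0
    for y_i, x_i, z_i in zip(Y, X, Z):
        x_ij, z_ij = diff(x_i, X[j]), diff(z_i, Z[j])
        w_ij = raw_weight(x_ij, z_ij, h)
        if w_ij > 0.0:  # points outside the bandwidth contribute nothing
            resid = y_i - a - dot(b, x_ij) - dot(c, z_ij)
            total += check_loss(resid, tau) * w_ij
    return total
```

Because the weight depends on the full Euclidean distance in $X$, few observations receive non-zero weight when $p$ is large, which is the dimensionality problem the refined estimate is meant to ease.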
Theorem 1. Assume conditions (A–C) in the Appendix and Assumption 1. Let $\hat b_j$ be defined in (2), and let $\hat\beta$ be the eigenvector of $V = (1/n)\sum_{j=1}^{n}\hat b_j\hat b_j^\top$ corresponding to the largest eigenvalue. As $n$ tends to infinity, if $0 < h \to 0$, $nh^{m+2}\to\infty$ and $nh^{m+4}\to 0$, then

(i) $\sqrt{nh^{m+2}}\,(\hat b_j - b_j) \xrightarrow{L} N\!\left(0,\ \dfrac{\tau(1-\tau)f(X_j,Z_j)}{f^2(a_j,X_j,Z_j)}\,C_1^{-1}C_2C_1^{-1}\right)$, and

(ii) $\sqrt{n}\,(\hat\beta - \beta_0) \xrightarrow{L} N(0,\Omega)$ with
$$\Omega = \tau(1-\tau)\{E\,G_1^2(X_1^\top\beta_0,Z_1)\}^{-1}E[H_x(X_1,Z_1)H_x^\top(X_1,Z_1)]$$
and
$$H_x(x,z) = \frac{\partial}{\partial x}\left[\frac{G_1(x^\top\beta_0,z)\,f(x,z)}{f(G(x^\top\beta_0,z),x,z)}\right].$$
The first result of Theorem 1 is a well-known property of local linear estimates. Although the local linear estimate of each $b_j$ converges at the very slow rate $(nh^{m+2})^{-1/2}$, result (ii) of the theorem reveals that the proposed qOPG estimate $\hat\beta$ achieves root-n consistency, an optimal rate that is also achieved by classical competitors such as those relying on the ADE method (Chaudhuri et al., 1997). Because all the estimates $\hat b_j$ hinge on a high-dimensional kernel function, so does $\hat\beta$. This may degrade the performance of $\hat\beta$; nevertheless, these raw estimates can be taken as initial values for refined estimates of $\beta_0$.
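The step that turns the local gradients $\hat b_j$ into $\hat\beta$, and the reason plain averaging can fail, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the gradients are synthetic (a scalar factor times $\beta_0$ plus Gaussian noise, with factors that cancel in pairs so that their mean is zero), the function names are hypothetical, and power iteration stands in for a general eigen-solver.

```python
import math
import random

def outer_product_matrix(grads):
    """V = (1/n) * sum_j b_j b_j^T for a list of gradient vectors."""
    n, p = len(grads), len(grads[0])
    return [[sum(b[r] * b[s] for b in grads) / n for s in range(p)]
            for r in range(p)]

def leading_eigenvector(V, n_iter=200):
    """Power iteration on a symmetric PSD matrix; first component made positive."""
    p = len(V)
    v = [1.0 / math.sqrt(p)] * p  # start vector, assumed not orthogonal to the target
    for _ in range(n_iter):
        w = [sum(V[r][s] * v[s] for s in range(p)) for r in range(p)]
        length = math.sqrt(sum(c * c for c in w))
        if length == 0.0:
            break
        v = [c / length for c in w]
    return v if v[0] >= 0 else [-c for c in v]

def mean_vector(grads):
    """Plain average of the gradients, the quantity ADE normalises."""
    n = len(grads)
    return [sum(b[r] for b in grads) / n for r in range(len(grads[0]))]

def toy_gradients(beta0, n_pairs=200, noise_sd=0.05, seed=0):
    """Synthetic gradients s_j * beta0 + noise whose factors s_j cancel in pairs."""
    rng = random.Random(seed)
    grads = []
    for _ in range(n_pairs):
        s = rng.uniform(0.5, 1.5)
        for sign in (1.0, -1.0):
            grads.append([sign * s * c + rng.gauss(0.0, noise_sd) for c in beta0])
    return grads
```

With `beta0 = [1/√5, 2/√5, 0]`, `leading_eigenvector(outer_product_matrix(toy_gradients(beta0)))` stays close to `beta0`, whereas `mean_vector` returns a vector that is almost pure noise, so its direction carries little information about the index.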
2.3. Refined estimate of $\beta_0$
If a good initial estimate of $\beta_0$, say $\tilde\beta$, is available, we can obtain a refined estimate using a lower-dimensional kernel than the one involved in the raw estimate.

Assumption 2. The weights $w_{ij} = K(X_{ij}^\top\tilde\beta/h, Z_{ij}/h)$, where $K(\cdot,\cdot)$ is a $(q+1)$-dimensional symmetric density function and $h$ is a bandwidth. Moreover, $K(\cdot,\cdot)$ (1) is Lipschitz continuous and (2) is such that both of the following matrices are finite and positive definite:
$$C_3 = \iint tt^\top K(t^\top\beta_0,s)\,dt\,ds, \qquad C_4 = \iint tt^\top K^2(t^\top\beta_0,s)\,dt\,ds.$$

Theorem 2. Assume conditions (A–C) in the Appendix. Suppose $\tilde\beta = \beta_0 + o_p(1)$ and Assumption 2 holds. Let $\hat b_j$ be defined in (2), and let $\hat\beta$ be the eigenvector of $V = (1/n)\sum_{j=1}^{n}\hat b_j\hat b_j^\top$ corresponding to the largest eigenvalue. As $n$ tends to infinity, if $0 < h \to 0$, $nh^{m+2}\to\infty$ and $nh^{m+4}\to 0$, then

(i) $\sqrt{nh^{m+2}}\,(\hat b_j - b_j) \xrightarrow{L} N\!\left(0,\ \dfrac{\tau(1-\tau)f(X_j,Z_j)}{f^2(a_j,X_j,Z_j)}\,C_3^{-1}C_4C_3^{-1}\right)$, and

(ii) $\sqrt{n}\,(\hat\beta - \beta_0) \xrightarrow{L} N(0,\Omega)$, where $\Omega$ is defined in Theorem 1.
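The practical effect of replacing the $m$-dimensional kernel by weights that depend on $X_{ij}$ only through the projected scalar $X_{ij}^\top\tilde\beta$ can be sketched as below. The product form of the kernel, the Epanechnikov choice and the function names are assumptions of this illustration; Assumption 2 only requires a $(q+1)$-dimensional symmetric density.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def epanechnikov(t):
    """Univariate Epanechnikov density, supported on [-1, 1]."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0

def refined_weight(x_ij, z_ij, beta_tilde, h):
    """w_ij = K1(X_ij^T beta_tilde / h) * K2(||Z_ij|| / h), a product-form sketch."""
    z_len = math.sqrt(dot(z_ij, z_ij))
    return epanechnikov(dot(x_ij, beta_tilde) / h) * epanechnikov(z_len / h)

def neighbour_counts(X, j, beta_tilde, h):
    """Points within h of X_j: in full Euclidean distance vs. along the index only."""
    full = along_index = 0
    for i, x_i in enumerate(X):
        if i == j:
            continue
        x_ij = [a - b for a, b in zip(x_i, X[j])]
        full += math.sqrt(dot(x_ij, x_ij)) <= h
        along_index += abs(dot(x_ij, beta_tilde)) <= h
    return full, along_index
```

For a unit vector $\tilde\beta$, the Cauchy-Schwarz inequality gives $|X_{ij}^\top\tilde\beta| \le \|X_{ij}\|$, so the index-based window never contains fewer observations than the full-dimensional one with the same bandwidth; `neighbour_counts` makes this comparison concrete for a given design.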
Theorem 2 indicates that, with the help of an initial consistent estimate of $\beta_0$, the qOPG method provides a refined consistent estimate of $\beta_0$ using lower-dimensional kernels. Meanwhile, the refined estimate of $\beta_0$ retains root-n consistency. If more rigorous constraints are put on $X$ and $Z$, a univariate kernel function is sufficient to serve this purpose.

Assumption 3. The weights $w_{ij} = K(X_{ij}^\top\tilde\beta/h)$, where $K(\cdot)$ is a univariate symmetric density function and $h$ is a bandwidth. Moreover, $K(\cdot)$ (1) is Lipschitz continuous and (2) is such that both of the following matrices are finite and positive definite:
$$C_5 = \int tt^\top K(t^\top\beta_0)\,dt, \qquad C_6 = \int tt^\top K^2(t^\top\beta_0)\,dt.$$

When the covariate $Z$ in $G(X^\top\beta_0,Z)$ disappears, the model reduces to the usual single-index quantile regression model, that is, the conditional $\tau$ quantile of $Y$ given $X$ has the form $G(X^\top\beta_0)$. The local linear smoother defines the estimates of $a_j = G(X_j^\top\beta_0)$ and $b_j = G'(X_j^\top\beta_0)\beta_0$ by
$$(\hat a_j,\hat b_j) = \arg\min_{a,b}\sum_{i=1}^{n}\rho_\tau(Y_i - a - b^\top X_{ij})\,w_{ij}. \tag{3}$$

We present the following corollary of Theorem 2 without proof.

Corollary 1. Assume conditions (A–C) in the Appendix with all the covariate $Z$ removed. Suppose $\tilde\beta = \beta_0 + o_p(1)$ and Assumption 3 holds. Let $\hat b_j$ be defined in (3) and let $\hat\beta$ be the eigenvector of $V = (1/n)\sum_{j=1}^{n}\hat b_j\hat b_j^\top$ corresponding to the largest eigenvalue. As $n$ tends to infinity, if $0 < h \to 0$, $nh^{p+2}\to\infty$ and $nh^{p+4}\to 0$, then
$$\sqrt{nh^{p+2}}\,(\hat b_j - b_j) \xrightarrow{L} N\!\left(0,\ \frac{\tau(1-\tau)f(X_j)}{f^2(a_j,X_j)}\,C_5^{-1}C_6C_5^{-1}\right),$$
and $\sqrt{n}\,(\hat\beta - \beta_0)$ has an asymptotically normal distribution $N(0,\Omega_1)$ with
$$\Omega_1 = \tau(1-\tau)\{E[G'(X_1^\top\beta_0)^2]\}^{-1}E[H_x(X_1)H_x^\top(X_1)]$$
and
$$H_x(x) = \frac{\partial}{\partial x}\left[\frac{G'(x^\top\beta_0)\,f(x)}{f(G(x^\top\beta_0),x)}\right].$$

Once we get an estimate of $\beta_0$, say $\hat\beta$, the nonparametric function $G(\cdot,\cdot)$ can be estimated with the local polynomial estimation method. For any covariate value $(X_0,Z_0)$, we can estimate $G(X_0^\top\beta_0,Z_0)$ by $\hat a_0$, where
$$(\hat a_0,\hat d_0,\hat c_0) = \arg\min_{a,d,c}\sum_{i=1}^{n}\rho_\tau(Y_i - a - d\,X_{i0}^\top\hat\beta - c^\top Z_{i0})\,w_{i0} \tag{4}$$
and the weights have the form $w_{i0} = K(X_{i0}^\top\hat\beta/h, Z_{i0}/h)$, with $X_{i0} = X_i - X_0$ and $Z_{i0} = Z_i - Z_0$.

3. Numerical studies

In this section, we report a simulation study designed to investigate the finite-sample performance of the proposed qOPG method. For comparison, we take the well-known ADE method (Chaudhuri et al., 1997) as a benchmark (denoted by qADE in our simulation results). Let $\hat b_j$ be the local linear gradient estimates of $G_1(X_j^\top\beta_0,Z_j)\beta_0$, as defined in (2). The qADE estimator is defined as the unit-length vector with positive first component that is parallel to $(1/n)\sum_{j=1}^{n}\hat b_j$. Taking this raw qADE estimator as an initial value of $\beta_0$, we can also define a refined qADE estimator in the same way as we define the refined qOPG estimator. The refining step may be iterated until the estimate converges. We present only the simulation results for the refined qOPG and qADE estimators.

To evaluate an index coefficient estimator $\hat\beta_\tau$, we consider the following criteria:
(1) bias and standard deviation (sd);
(2) estimation error (EE), $\mathrm{EE}(\hat\beta_\tau) = \sqrt{1 - |\hat\beta_\tau^\top\beta_\tau|}$ (see Kong and Xia, 2011), whose value lies between 0 and 1; the closer the value of $\mathrm{EE}(\hat\beta_\tau)$ is to 0, the better the estimator;
(3) Angle, $\mathrm{Angle}(\hat\beta_\tau) = (180/\pi)\arccos(|\hat\beta_\tau^\top\beta_\tau|)$, whose value lies between 0 and 90, with small values indicating good performance;
(4) mean absolute deviation (MAD), $E[\|\hat\beta_\tau - \beta_\tau\|_1]$, where $\|\cdot\|_p$ denotes the $p$-norm; and
(5) mean squared error (MSE), $E\|\hat\beta_\tau - \beta_\tau\|_2^2$.
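The evaluation criteria EE, Angle, MAD and MSE for an index coefficient estimate can be computed as in the sketch below. It assumes both vectors have unit length, approximates the expectations in MAD and MSE by averages over replicated estimates, and uses hypothetical function names.

```python
import math

def _abs_cos(beta_hat, beta):
    """|beta_hat^T beta|, clipped to [0, 1] to guard against rounding."""
    return min(1.0, abs(sum(a * b for a, b in zip(beta_hat, beta))))

def estimation_error(beta_hat, beta):
    """EE = sqrt(1 - |beta_hat^T beta|), between 0 and 1 for unit vectors."""
    return math.sqrt(1.0 - _abs_cos(beta_hat, beta))

def angle_degrees(beta_hat, beta):
    """Angle = (180 / pi) * arccos(|beta_hat^T beta|), between 0 and 90."""
    return math.degrees(math.acos(_abs_cos(beta_hat, beta)))

def l1_deviation(beta_hat, beta):
    """1-norm of the difference; its average over replications gives MAD."""
    return sum(abs(a - b) for a, b in zip(beta_hat, beta))

def squared_error(beta_hat, beta):
    """Squared 2-norm of the difference; its average over replications gives MSE."""
    return sum((a - b) ** 2 for a, b in zip(beta_hat, beta))

def summarize(estimates, beta):
    """Monte Carlo averages of the four criteria over replicated estimates."""
    n = len(estimates)
    return {
        "EE": sum(estimation_error(b, beta) for b in estimates) / n,
        "Angle": sum(angle_degrees(b, beta) for b in estimates) / n,
        "MAD": sum(l1_deviation(b, beta) for b in estimates) / n,
        "MSE": sum(squared_error(b, beta) for b in estimates) / n,
    }
```

Both EE and Angle depend on the estimate only through $|\hat\beta_\tau^\top\beta_\tau|$, so they are insensitive to the sign of the estimated direction, whereas MAD and MSE rely on the sign convention that the first component is positive.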
We generated data sets from the models given in Examples 1–3 with sample size $n = 200$. In each of the models, we considered four distributions of the random error $\varepsilon$: three symmetric distributions, namely (1) $N(0,1)$, the standard normal distribution, (2) $t(3)$, the t-distribution with 3 degrees of freedom, and (3) scaled Cauchy, the Cauchy distribution with scale parameter 0.02; and one asymmetric distribution, (4) Exp(1), the standard exponential distribution. The qOPG and qADE estimates of the index coefficient $\beta_0$ are computed and evaluated based on 200 data sets at the quantile levels $\tau = 0.1, 0.3, 0.5, 0.7$ and $0.9$.

Example 1. Consider a single index model
$$Y = g(X^\top\beta_0) + \sigma(X^\top\beta_0)\varepsilon,$$
where $g(u) = 4\exp(u^2)$, $\beta_0 \equiv (\beta_{01},\beta_{02},\beta_{03})^\top = (1,2,0)^\top/\sqrt{5}$ and all the components of $X$ come from a centered uniform distribution $U(0,1)$ (that is, shifted to have mean zero). The error weight $\sigma(u)$ is chosen to be $0.1\cos(10u)$. In this model, the conditional quantile of $Y$ given $X$, $Q_\tau(Y|X) = g(X^\top\beta_0) + \sigma(X^\top\beta_0)\varepsilon_\tau$, is an even function of $X$, where $\varepsilon_\tau$ is the $\tau$-quantile of the random error $\varepsilon$ and does not depend on $X$. Moreover, the distribution of $X$, and hence of $X^\top\beta_0$, is symmetric about zero. Thus the partial derivative $\partial Q_\tau(Y|X)/\partial X$ has mean zero regardless of the value of $\tau$. As discussed earlier, in this case the qOPG method works, whereas the qADE method fails.

Table 1 contains the evaluations of the qOPG and qADE estimates of the index coefficient $\beta_0$. As expected, the qOPG estimates are generally very close to the true value of $\beta_0$, whereas the qADE estimates perform unacceptably. Recall that the true value is $\beta_0 = (1,2,0)^\top/\sqrt{5} = (0.4472, 0.8944, 0)^\top$. For each component of $\beta_0$, the biases of the qOPG estimates are generally less than 0.02 and at most 0.0415. In contrast, the qADE estimates of $\beta_{02} = 0.8944$ usually have biases of more than 0.1, and the biases are as large as 0.5 or more when $\tau = 0.9$, regardless of the error distribution. Note that the standard deviations in Table 1 are much larger than the biases, so it is more reasonable to compare the MSE values. We find that the MSEs of the qADE estimator are, in almost all settings, more than 10 times as large as those of the qOPG estimator. The failure of the qADE estimator is clearly seen from the viewpoint of Angle. The qADE estimates generally have an angle of
Table 1
Comparison of the qOPG and qADE estimates for the $\beta$ in Example 1 (Heteroskedastic model). The true model is $Y = g(X^\top\beta_0) + \sigma(X^\top\beta_0)\varepsilon$ with $g(u) = 4\exp(u^2)$ and $\sigma(u) = 0.1\cos(10u)$. The columns $\hat\beta_{01}$, $\hat\beta_{02}$ and $\hat\beta_{03}$ report Bias (sd).

Error    τ    Method  β̂01             β̂02             β̂03             EE      Angle    MAD     MSE
N(0,1)   0.1  qOPG    0.0009(0.0126)  0.0006(0.0063)  0.0005(0.0091)  0.0104  0.8483   0.0600  0.0044
N(0,1)   0.1  qADE    0.0681(0.2760)  0.1168(0.2915)  0.0665(0.2954)  0.2591  20.0013  0.1963  0.0769
N(0,1)   0.3  qOPG    0.0013(0.0085)  0.0007(0.0042)  0.0003(0.0068)  0.0071  0.5815   0.0136  0.0003
N(0,1)   0.3  qADE    0.0563(0.2636)  0.1067(0.2726)  0.0743(0.2801)  0.2309  18.3604  0.1005  0.0364
N(0,1)   0.5  qOPG    0.0017(0.0087)  0.0009(0.0043)  0.0003(0.0064)  0.0072  0.5887   0.0245  0.0008
N(0,1)   0.5  qADE    0.0496(0.2911)  0.2200(0.4211)  0.1228(0.3342)  0.3046  23.1568  0.0950  0.0310
N(0,1)   0.7  qOPG    0.0048(0.0112)  0.0025(0.0057)  0.0002(0.0080)  0.0098  0.8005   0.0618  0.0042
N(0,1)   0.7  qADE    0.0885(0.3622)  0.3182(0.4492)  0.1971(0.4122)  0.4215  32.2566  0.1776  0.0561
N(0,1)   0.9  qOPG    0.0216(0.0169)  0.0114(0.0090)  0.0002(0.0145)  0.0214  1.7399   0.1463  0.0222
N(0,1)   0.9  qADE    0.0055(0.4279)  0.5348(0.4877)  0.2170(0.4479)  0.5979  48.7953  0.5145  0.3573
t(3)     0.1  qOPG    0.0007(0.0211)  0.0008(0.0106)  0.0001(0.0158)  0.0173  1.4046   0.0853  0.0090
t(3)     0.1  qADE    0.0374(0.3133)  0.1896(0.3498)  0.0448(0.3382)  0.3197  25.1410  0.2385  0.1059
t(3)     0.3  qOPG    0.0011(0.0099)  0.0006(0.0049)  0.0000(0.0082)  0.0085  0.6949   0.0183  0.0006
t(3)     0.3  qADE    0.0364(0.2422)  0.1274(0.3002)  0.0752(0.2995)  0.2412  19.3235  0.1047  0.0371
t(3)     0.5  qOPG    0.0024(0.0099)  0.0013(0.0050)  0.0002(0.0073)  0.0082  0.6675   0.0247  0.0008
t(3)     0.5  qADE    0.0573(0.3151)  0.1795(0.3346)  0.0916(0.3448)  0.2960  24.0164  0.0995  0.0337
t(3)     0.7  qOPG    0.0056(0.0127)  0.0030(0.0065)  0.0002(0.0101)  0.0116  0.9434   0.0664  0.0050
t(3)     0.7  qADE    0.0501(0.3486)  0.3372(0.4552)  0.1852(0.4138)  0.4258  33.2299  0.1854  0.0592
t(3)     0.9  qOPG    0.0217(0.0261)  0.0119(0.0138)  0.0018(0.0220)  0.0277  2.2467   0.1704  0.0307
t(3)     0.9  qADE    0.0055(0.4432)  0.5398(0.5127)  0.1581(0.4450)  0.5987  48.3541  0.5231  0.3689
Cauchy   0.1  qOPG    0.0000(0.0036)  0.0000(0.0017)  0.0002(0.0024)  0.0018  0.1518   0.0055  0.0001
Cauchy   0.1  qADE    0.0020(0.1050)  0.0199(0.1215)  0.0105(0.1068)  0.0610  5.0178   0.0458  0.0128
Cauchy   0.3  qOPG    0.0005(0.0021)  0.0002(0.0011)  0.0000(0.0013)  0.0016  0.1343   0.0057  0.0000
Cauchy   0.3  qADE    0.0319(0.2136)  0.1137(0.3170)  0.0809(0.2578)  0.1562  12.0795  0.0599  0.0283
Cauchy   0.5  qOPG    0.0019(0.0033)  0.0009(0.0016)  0.0000(0.0021)  0.0029  0.2349   0.0145  0.0002
Cauchy   0.5  qADE    0.0408(0.2496)  0.1730(0.3781)  0.0915(0.3197)  0.2381  18.2737  0.0768  0.0255
Cauchy   0.7  qOPG    0.0062(0.0050)  0.0032(0.0025)  0.0003(0.0038)  0.0062  0.5050   0.0366  0.0015
Cauchy   0.7  qADE    0.0718(0.3914)  0.3016(0.4304)  0.1506(0.3859)  0.3983  31.1650  0.1695  0.0572
Cauchy   0.9  qOPG    0.0322(0.0116)  0.0170(0.0063)  0.0009(0.0106)  0.0270  2.1888   0.1125  0.0137
Cauchy   0.9  qADE    0.0234(0.4327)  0.6019(0.5236)  0.2763(0.4476)  0.6358  49.4972  0.5051  0.3597
Exp(1)   0.1  qOPG    0.0008(0.0133)  0.0002(0.0066)  0.0001(0.0104)  0.0108  0.8769   0.0796  0.0111
Exp(1)   0.1  qADE    0.0650(0.2931)  0.1307(0.3147)  0.0674(0.2872)  0.2591  20.6101  0.1909  0.0813
Exp(1)   0.3  qOPG    0.0005(0.0076)  0.0003(0.0038)  0.0003(0.0057)  0.0063  0.5160   0.0441  0.0028
Exp(1)   0.3  qADE    0.0384(0.2714)  0.1470(0.3536)  0.0643(0.2699)  0.2349  17.4026  0.1253  0.0431
Exp(1)   0.5  qOPG    0.0020(0.0068)  0.0010(0.0034)  0.0000(0.0049)  0.0057  0.4615   0.0328  0.0016
Exp(1)   0.5  qADE    0.0605(0.3132)  0.2201(0.3929)  0.1497(0.3503)  0.3215  25.0005  0.1208  0.0428
Exp(1)   0.7  qOPG    0.0071(0.0093)  0.0037(0.0047)  0.0003(0.0067)  0.0090  0.7361   0.0379  0.0028
Exp(1)   0.7  qADE    0.0651(0.3714)  0.3433(0.4245)  0.2314(0.4255)  0.4527  36.5917  0.1701  0.0571
Exp(1)   0.9  qOPG    0.0415(0.0212)  0.0225(0.0118)  0.0027(0.0171)  0.0362  2.9401   0.1174  0.0161
Exp(1)   0.9  qADE    0.1150(0.4912)  0.5333(0.5053)  0.2363(0.4585)  0.6360  50.0953  0.5103  0.3675
[Figure 1: two panels titled "qOPG" and "qADE"; horizontal axis: single index (ticks from −0.4 to 0.4); vertical axis: estimate of g (ticks from 4.0 to 5.0).]
Fig. 1. Display of $g(u) = 4\exp\{u^2\}$ (dashed line) and its estimates (solid line) based on the qOPG estimate $(0.4544, 0.8907, 0.0148)^\top$ and the qADE estimate $(0.0803, 0.3987, 0.9136)^\top$ of $\beta_0$. The grid points are 200 points equally spaced in $[-0.5, 0.5]$.
more than 20° with the true direction $\beta_0$, and the angles are all more than 45° on average when $\tau = 0.9$. In comparison, the angles between the qOPG estimates and $\beta_0$ are no more than 3° on average, which is rather small. The other two criteria, MAD and EE, lead to similar comparisons of the two estimators. All this evidence indicates that the qOPG method provides a desirable estimate of the index coefficient, whereas the qADE method does not.

To see how much an index coefficient estimate affects the resulting conditional quantile function estimate, we generated a data set from Example 1 with normally distributed errors and computed estimates of $g(u)$, which is the link function of $Q_{0.5}(Y|X) = g(X^\top\beta_0)$. The function $g(u)$ is estimated at 200 equally spaced points, based on the qOPG and qADE estimates of $\beta_0$, respectively. The true function and the estimates of $g(u)$ are displayed in Fig. 1. The qOPG and qADE estimates are $(0.4544, 0.8907, 0.0148)$ and $(0.0803, 0.3987, 0.9136)$, respectively, in this situation ($\tau = 0.5$). Clearly, as the qOPG estimate is very close to the true value, the corresponding local linear estimate of $g(u)$ is also very close to the true function, while the severely inaccurate qADE estimate produces a far from acceptable nonparametric estimate of $g(u)$. We investigated many such data sets and found that this phenomenon occurs quite frequently. This once again demonstrates the superiority of the qOPG over the qADE.

Example 2. Consider a single index partial linear model
$$Y = g_0(X^\top\beta_0) + Z^\top\theta_0 + \sigma(X^\top\beta_0)\varepsilon,$$
where $\beta_0$ is the same as in Example 1 and $g_0(u) = \sin(\pi(u - B)/(B - A))$. The constants $A$ and $B$ are chosen to be $\sqrt{3}/2 \mp 1.645/\sqrt{12}$. All the components of $X$ and $Z$ are independent and identically distributed as a uniform distribution $U(0,1)$. We set $\theta_0 = (2,1)^\top/\sqrt{5}$ and the error weight $\sigma(u) = 0.1\exp\{(u - 0.6)^2\}$.

The model in Example 2 is slightly more complicated than that in Example 1, as it involves an additional term $Z^\top\theta_0$ and an additional covariate $Z$. It can be regarded as a two-index model, with $X^\top\beta_0$ and $Z^\top\theta_0$ being the two indices. Thus we may take either $\beta_0$ or $\theta_0$ as the parameter of interest, and accordingly take $Z$ or $X$ as the nuisance covariate. The biases and standard deviations of the estimates of $\beta_0$ and $\theta_0$ are presented in Table 2. For either parameter of interest, whatever the value of $\tau$ and the error distribution, we see in Table 2 that the biases and standard deviations of the qOPG and qADE estimates are very close to each other and to zero, although the qOPG method may have a slight advantage in general. This advantage is confirmed in Table 3 by the other four criteria, where the qOPG method produces smaller EE, Angle, MAD and MSE values than the qADE method in most cases.

In the above two examples, we also conducted simulations with $\sigma(u) = 0.1$. The simulation results are very similar to those given above and are therefore omitted to save space. Next we consider a model with this simple error weight, but where the model itself is much more complicated than those given in Examples 1 and 2.

Example 3. Suppose $(Y,X,Z)$ follows a varying-coefficient single index model
$$Y = g_0(X^\top\beta_0) + g_1(X^\top\beta_0)Z_1 + g_2(X^\top\beta_0)Z_2 + 0.1\varepsilon,$$
where $g_1(u) = 4[\exp\{(u - 0.6)^2\} - 1]$ and $g_2(u) = u\log(|2u| + 1)$. We choose $g_0(u)$, $Z$, $X$ and $\beta_0$ to be the same as those given in Example 2.

We have seen in Example 2 that the biases and standard deviations of both the qOPG and qADE estimates are generally close to zero if the model is not of the type in Example 1, so we omit these simulation results for this example. Table 4 reports the remaining four overall criteria for the two estimates of the index coefficient $\beta_0$. Again we find that the two estimators behave very similarly, and that the qOPG method has slight gains in EE, Angle, MAD and MSE
117 119 121 123
Y. Fan, L. Zhu / Journal of Statistical Planning and Inference ] (]]]]) ]]]–]]]
7
1
Table 2 Mean and standard error (in parentheses) of the qOPG and qADE estimates for the parameters b and y in Example 2 (Heteroskedastic model). The true
63
3
model is Y ¼ g 0 ðX > b0 Þþ Z > y0 þ sðX > b0 Þe with g 0 ðuÞ ¼ sinðpðuBÞ=ðBAÞÞ and sðuÞ ¼ 0:1 expfðu0:6Þ2 g.
65
5
t
Method
b^ 1
b^ 2
b^ 3
y^ 1
y^ 2
qOPG qADE qOPG qADE qOPG qADE qOPG qADE qOPG qADE
0.0382(0.0754) 0.0754(0.0932) 0.0066(0.0314) 0.0373(0.0471) 0.0000(0.0226) 0.0191(0.0321) 0.0019(0.0196) 0.0105(0.0252) 0.0025(0.0217) 0.0052(0.0269)
0.0118(0.0335) 0.0245(0.0379) 0.0018(0.0156) 0.0148(0.0206) 0.0009(0.0109) 0.0079(0.0146) 0.0016(0.0095) 0.0042(0.0119) 0.0020(0.0105) 0.0015(0.0129)
0.0034(0.0680) 0.0025(0.0845) 0.0031(0.0374) 0.0012(0.0509) 0.0043(0.0318) 0.0008(0.0351) 0.0041(0.0286) 0.0014(0.0291) 0.0025(0.0280) 0.0004(0.0319)
0.0030(0.0525) 0.0006(0.0533) 0.00250.0298) 0.0059(0.0307) 0.0048(0.0245) 0.0029(0.0224) 0.0058(0.0207) 0.0007(0.0189) 0.0003(0.0024) 0.0000(0.0021)
0.0089(0.1036) 0.0145(0.1059) 0.0003(0.0582) 0.0177(0.0637) 0.0065(0.0469) 0.0089(0.0456) 0.0093(0.0400) 0.0034(0.0374) 0.0049(0.0465) 0.0025(0.0423)
qOPG qADE qOPG qADE qOPG qADE qOPG qADE qOPG qADE
0.0486(0.0939) 0.0785(0.0980) 0.0123(0.0491) 0.0340(0.0532) 0.0042(0.0472) 0.0193(0.0359) 0.0020(0.0452) 0.0104(0.0294) 0.0012(0.0330) 0.0070(0.0383)
0.0126(0.0370) 0.0231(0.0384) 0.0048(0.1147) 0.0124(0.0233) 0.0010(0.0177) 0.0076(0.0168) 0.0019(0.0161) 0.0037(0.0143) 0.0026(0.0187) 0.0010(0.0188)
0.0088(0.0903) 0.0098(0.1049) 0.0023(0.0639) 0.0086(0.0593) 0.0052(0.0557) 0.0027(0.0399) 0.0037(0.0542) 0.0009(0.0371) 0.0031(0.0474) 0.0005(0.0500)
0.0037(0.0609) 0.0001(0.0612) 0.0055(0.0861) 0.0079(0.0338) 0.0023(0.0291) 0.0049(0.0257) 0.0035(0.0294) 0.0021(0.0267) 0.0088(0.0514) 0.0011(0.0363)
0.0139(0.1239) 0.0205(0.1183) 0.0054(0.0858) 0.0233(0.0702) 0.0003(0.0611) 0.0138(0.0522) 0.0022(0.0586) 0.0080(0.0502) 0.0056(0.0900) 0.0050(0.0726)
qOPG qADE qOPG qADE qOPG qADE qOPG qADE qOPG qADE
0.0484(0.0909) 0.0878(0.1082) 0.0046(0.0278) 0.0368(0.0460) 0.0012(0.0159) 0.0172(0.0276) 0.0008(0.0084) 0.0071(0.0142) 0.0019(0.0294) 0.0025(0.0127)
0.0155(0.0336) 0.0286(0.0371) 0.0012(0.0135) 0.0149(0.0199) 0.0010(0.0080) 0.0074(0.0129) 0.0005(0.0042) 0.0032(0.0068) 0.0001(0.0042) 0.0010(0.0049)
0.0014(0.0607) 0.0039(0.0759) 0.0011(0.0300) 0.0030(0.0465) 0.0008(0.0186) 0.0015(0.0303) 0.0002(0.0099) 0.0003(0.0157) 0.0019(0.0252) 0.0009(0.0135)
0.0028(0.0448) 0.0013(0.0505) 0.0015(0.0273) 0.0081(0.0289) 0.0004(0.0174) 0.0057(0.0180) 0.0000(0.0010) 0.0003(0.0009) 0.0007(0.0212) 0.0004(0.0121)
0.0175(0.0913) 0.0173(0.1011) 0.0074(0.0556) 0.0218(0.0606) 0.0009(0.0352) 0.0137(0.0379) 0.0004(0.0205) 0.0066(0.0203) 0.0000(0.0028) 0.0001(0.0019)
qOPG qADE qOPG qADE qOPG qADE qOPG qADE qOPG qADE
0.0449(0.0700) 0.0813(0.0853) 0.0088(0.0287) 0.0380(0.0399) 0.0010(0.0184) 0.0179(0.0252) 0.0024(0.0175) 0.0076(0.0205) 0.0010(0.0277) 0.0092(0.0347)
0.0158(0.0290) 0.0286(0.0316) 0.0032(0.0141) 0.0159(0.0176) 0.0000(0.0093) 0.0078(0.0120) 0.0016(0.0089) 0.0031(0.0101) 0.0016(0.0141) 0.0026(0.0172)
0.0057(0.0624) 0.0075(0.0765) 0.0002(0.0305) 0.0002(0.0440) 0.0000(0.0020) 0.0000(0.0030) 0.0006(0.0184) 0.0014(0.0261) 0.0027(0.0311) 0.0073(0.0428)
0.0054(0.0423) 0.0050(0.0448) 0.0024(0.0258) 0.0073(0.0265) 0.0001(0.0195) 0.0044(0.0185) 0.0000(0.0020) 0.0002(0.0017) 0.0004(0.0353) 0.0003(0.0301)
0.0222(0.0881) 0.0223(0.0923) 0.0087(0.0525) 0.0193(0.0550) 0.0018(0.0386) 0.0110(0.0373) 0.0020(0.0403) 0.0061(0.0352) 0.0064(0.0724) 0.0042(0.0604)
Row labels of the table above: error distributions N(0,1), t(3), Cauchy and Exp(1), each at $\tau$ = 0.1, 0.3, 0.5, 0.7 and 0.9.
compared with the qADE method. However, this slight superiority is not inherited in the estimation of the nonparametric conditional quantile functions. To assess an estimate $\hat g$ of the conditional quantile function $g$, we compute the mean absolute deviation (MAD), that is,

$$\mathrm{MAD}(\hat g) = \frac{1}{n_{\mathrm{grid}}} \sum_{t=1}^{n_{\mathrm{grid}}} |\hat g(u_t) - g(u_t)|,$$
where $\{u_t, t = 1, \ldots, n_{\mathrm{grid}}\}$ are the grid points at which the function $\hat g(\cdot)$ is evaluated. Here we set $n_{\mathrm{grid}} = 200$, and the grid points are evenly spaced over the interval $[0,1]$. The last three columns of Table 4 present the MAD values of the estimates of the $g_i$'s. The nonparametric function estimates based on the qOPG and qADE estimates of $\beta_0$ have almost the same estimation accuracy.

In all, if the partial derivative of the conditional quantile function with respect to $X$ has mean zero, then the qADE method fails to work, while the qOPG method still produces a good estimate of the index coefficient. Otherwise, the qOPG and qADE methods are generally comparable in terms of EE, Angle, MAD and MSE, although the qOPG method may have a slight superiority. According to these observations, it is preferable to apply the qOPG method rather than the qADE method in practice. We therefore use the qOPG method in the real-data analysis in the next section.
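The MAD criterion defined above can be sketched in a few lines. This is only an illustration under stated assumptions: the target function `g`, the estimate `g_hat` and the helper name `mean_absolute_deviation` are hypothetical, and the grid is taken to include both endpoints of $[0,1]$.

```python
import numpy as np

def mean_absolute_deviation(g_hat, g, n_grid=200, lower=0.0, upper=1.0):
    """MAD between an estimate g_hat and a target g on an evenly spaced grid."""
    u = np.linspace(lower, upper, n_grid)  # grid points u_1, ..., u_ngrid
    return float(np.mean(np.abs(g_hat(u) - g(u))))

# Hypothetical target and estimate, chosen only for illustration:
# the estimate is off by a constant 0.05 everywhere on the grid.
g = lambda u: np.sin(np.pi * u)
g_hat = lambda u: np.sin(np.pi * u) + 0.05

mad = mean_absolute_deviation(g_hat, g)
```

Because the error is constant in this toy case, the MAD equals that constant, which makes the function easy to sanity-check before applying it to real estimates.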
Table 3. Evaluations of the qOPG and qADE estimates for $\beta$ and $\theta$ in Example 2 (heteroskedastic model). The true model is $Y = g_0(X^\top\beta_0) + Z^\top\theta_0 + \sigma(X^\top\beta_0)\varepsilon$ with $g_0(u) = \sin(\pi(u - B)/(B - A))$ and $\sigma(u) = 0.1\exp\{(u - 0.6)^2\}$.

                         $\beta$                            $\theta$
Error    $\tau$  Method  EE      Angle   MAD     MSE     EE      Angle   MAD     MSE
N(0,1)   0.1     qOPG    0.0705  5.7187  0.1497  0.0130  0.0661  5.3655  0.1249  0.0135
         0.1     qADE    0.0956  7.7619  0.2013  0.0235  0.0668  5.4224  0.1260  0.0142
         0.3     qOPG    0.0316  2.5671  0.0664  0.0026  0.0363  2.9486  0.0690  0.0042
         0.3     qADE    0.0508  4.1256  0.1096  0.0068  0.0402  3.2596  0.0757  0.0053
         0.5     qOPG    0.0225  1.8242  0.0461  0.0016  0.0299  2.4290  0.0570  0.0028
         0.5     qADE    0.0318  2.5781  0.0679  0.0029  0.0289  2.3459  0.0547  0.0026
         0.7     qOPG    0.0195  1.5830  0.0411  0.0013  0.0262  2.1286  0.0500  0.0021
         0.7     qADE    0.0247  2.0055  0.0524  0.0017  0.0240  1.9489  0.0455  0.0017
         0.9     qOPG    0.0214  1.7408  0.0445  0.0013  0.0298  2.4152  0.0566  0.0027
         0.9     qADE    0.0266  2.1572  0.0554  0.0019  0.0267  2.1712  0.0508  0.0022
t(3)     0.1     qOPG    0.0840  6.8258  0.1758  0.0208  0.0732  5.9506  0.1377  0.0191
         0.1     qADE    0.1040  8.4465  0.2167  0.0287  0.0732  5.9447  0.1377  0.0180
         0.3     qOPG    0.0410  3.3415  0.0920  0.0197  0.0507  4.1398  0.0956  0.0147
         0.3     qADE    0.0558  4.5289  0.1172  0.0082  0.0473  3.8371  0.0890  0.0066
         0.5     qOPG    0.0272  2.2163  0.0584  0.0056  0.0351  2.8529  0.0665  0.0045
         0.5     qADE    0.0361  2.9313  0.0766  0.0035  0.0338  2.7407  0.0638  0.0035
         0.7     qOPG    0.0256  2.0847  0.0546  0.0052  0.0331  2.6849  0.0627  0.0043
         0.7     qADE    0.0310  2.5128  0.0646  0.0025  0.0301  2.4454  0.0571  0.0032
         0.9     qOPG    0.0325  2.6383  0.0684  0.0036  0.0531  4.3151  0.1010  0.0108
         0.9     qADE    0.0405  3.2886  0.0838  0.0043  0.0443  3.5988  0.0839  0.0065
Cauchy   0.1     qOPG    0.0699  5.6729  0.1478  0.0156  0.0582  4.7220  0.1096  0.0106
         0.1     qADE    0.0967  7.8600  0.2018  0.0273  0.0646  5.2406  0.1216  0.0130
         0.3     qOPG    0.0270  2.1951  0.0570  0.0018  0.0347  2.8162  0.0656  0.0038
         0.3     qADE    0.0490  3.9771  0.1039  0.0062  0.0399  3.2390  0.0751  0.0050
         0.5     qOPG    0.0159  1.2963  0.0338  0.0006  0.0219  1.7798  0.0416  0.0015
         0.5     qADE    0.0291  2.3586  0.0614  0.0022  0.0247  2.0071  0.0467  0.0019
         0.7     qOPG    0.0085  0.6952  0.0179  0.0001  0.0130  1.0576  0.0247  0.0005
         0.7     qADE    0.0142  1.1584  0.0304  0.0005  0.0132  1.0761  0.0251  0.0005
         0.9     qOPG    0.0065  0.5308  0.0138  0.0015  0.0094  0.7704  0.0181  0.0012
         0.9     qADE    0.0072  0.5913  0.0155  0.0003  0.0080  0.6558  0.0153  0.0005
Exp(1)   0.1     qOPG    0.0653  5.3016  0.1390  0.0119  0.0568  4.6134  0.1069  0.0100
         0.1     qADE    0.0896  7.2780  0.1900  0.0215  0.0606  4.9222  0.1141  0.0110
         0.3     qOPG    0.0278  2.2576  0.0595  0.0020  0.0329  2.6737  0.0623  0.0034
         0.3     qADE    0.0461  3.7383  0.0994  0.0055  0.0368  2.9863  0.0694  0.0041
         0.5     qOPG    0.0181  1.4744  0.0381  0.0008  0.0242  1.9653  0.0460  0.0018
         0.5     qADE    0.0284  2.3091  0.0601  0.0020  0.0251  2.0349  0.0474  0.0018
         0.7     qOPG    0.0168  1.3662  0.0356  0.0007  0.0249  2.0182  0.0472  0.0020
         0.7     qADE    0.0222  1.8009  0.0463  0.0012  0.0217  1.7633  0.0412  0.0015
         0.9     qOPG    0.0276  2.2382  0.0580  0.0019  0.0454  3.6867  0.0859  0.0065
         0.9     qADE    0.0364  2.9566  0.0769  0.0034  0.0369  2.9944  0.0699  0.0045
4. A real-data analysis
In this section, we apply the qOPG method to the 2004 new car and truck data, which were analyzed by Zhu et al. (2012), to study which factors influence the manufacturer's suggested retail price. The data-set is available from http://www.amstat.org/publications/jse/datasets/04cars.dat. The variable of interest is the suggested retail price (MSRP), whose logarithm is taken as the response $Y$. As in Zhu et al. (2012), we choose from the data-set seven attributes of vehicles that possibly affect the suggested price as covariates: the engine size (X1), the number of cylinders (X2), horsepower (X3), average city miles per gallon (X4, MPG), average highway MPG (X5), weight (X6) and wheel base (X7). The data-set consists of 428 observations, 16 of which have missing values. We remove the observations with missing values from our subsequent analysis. The remaining data-set has a total of eight variables and 412 observations. Beforehand, both the response $Y$ and the covariate $X$ are standardized so that they have mean zero and unit variance. We fit the data by a pure single-index quantile regression model. This is equivalent to the model $G(X^\top\beta_0)$ with $X = (X_1, X_2, X_3, X_4, X_5, X_6, X_7)^\top$. The index parameter estimates for $\tau = 0.1, 0.3, 0.5, 0.7, 0.9$ are similar. For example, when $\tau = 0.5$, the index parameter estimate is $\hat\beta_0 = (0.3269, 0.1949, 0.5213, 0.4145, 0.3830, 0.4695, 0.2109)$. This indicates that among the seven candidate attributes of vehicles, horsepower and number of cylinders are the most and
Table 4. Evaluations of the qOPG and qADE estimates for $\beta$ in Example 3.

                         $\beta$                               $g$ (MAD)
Error    $\tau$  Method  EE      Angle   MAD     MSE        $g_0$   $g_1$   $g_2$
N(0,1)   0.1     qOPG    0.0136  1.1027  0.0287  0.0004     0.1597  0.1222  0.1086
         0.1     qADE    0.0144  1.1710  0.0303  0.0005     0.1606  0.1256  0.1089
         0.3     qOPG    0.0105  0.8512  0.0222  0.0002     0.1033  0.1125  0.1033
         0.3     qADE    0.0113  0.9193  0.0240  0.0003     0.1031  0.1123  0.1033
         0.5     qOPG    0.0108  0.8818  0.0231  0.0003     0.0775  0.1038  0.0992
         0.5     qADE    0.0117  0.9521  0.0249  0.0003     0.0786  0.1096  0.1034
         0.7     qOPG    0.0118  0.9609  0.0252  0.0003     0.0852  0.1077  0.1023
         0.7     qADE    0.0127  1.0322  0.0270  0.0004     0.0844  0.1068  0.0989
         0.9     qOPG    0.0143  1.1664  0.0303  0.0005     0.1325  0.1225  0.1129
         0.9     qADE    0.0158  1.2840  0.0334  0.0006     0.1310  0.1231  0.1117
t(3)     0.1     qOPG    0.0241  1.9601  0.0509  0.0015     0.2212  0.1846  0.1641
         0.1     qADE    0.0258  2.0953  0.0543  0.0017     0.2203  0.1859  0.1652
         0.3     qOPG    0.0139  1.1304  0.0293  0.0005     0.1253  0.1364  0.1150
         0.3     qADE    0.0154  1.2559  0.0329  0.0006     0.1260  0.1377  0.1164
         0.5     qOPG    0.0116  0.9456  0.0243  0.0003     0.0908  0.1237  0.1031
         0.5     qADE    0.0128  1.0421  0.0272  0.0004     0.0908  0.1228  0.1034
         0.7     qOPG    0.0134  1.0919  0.0283  0.0004     0.1089  0.1345  0.1144
         0.7     qADE    0.0151  1.2242  0.0321  0.0005     0.1069  0.1320  0.1138
         0.9     qOPG    0.0244  1.9818  0.0516  0.0015     0.1953  0.1767  0.1618
         0.9     qADE    0.0258  2.0962  0.0549  0.0017     0.1945  0.1745  0.1626
Cauchy   0.1     qOPG    0.0080  0.6521  0.0169  0.1858e-3  0.0753  0.0810  0.0405
         0.1     qADE    0.0099  0.8079  0.0208  0.2768e-3  0.0764  0.0826  0.0410
         0.3     qOPG    0.0032  0.2626  0.0067  0.2767e-5  0.0395  0.0517  0.0217
         0.3     qADE    0.0044  0.3593  0.0092  0.5300e-5  0.0404  0.0523  0.0221
         0.5     qOPG    0.0024  0.2017  0.0052  0.1585e-5  0.0320  0.0476  0.0227
         0.5     qADE    0.0035  0.2858  0.0074  0.3276e-5  0.0316  0.0464  0.0234
         0.7     qOPG    0.0036  0.2935  0.0076  0.3545e-5  0.0305  0.0505  0.0284
         0.7     qADE    0.0050  0.4107  0.0107  0.7102e-5  0.0319  0.0514  0.0293
         0.9     qOPG    0.0083  0.6764  0.0175  0.6142e-3  0.0480  0.0793  0.0516
         0.9     qADE    0.0129  1.0625  0.0276  4.4303e-3  0.0502  0.0812  0.0530
Exp(1)   0.1     qOPG    0.0080  0.6512  0.0170  0.0001     0.0614  0.0863  0.0527
         0.1     qADE    0.0100  0.8157  0.0213  0.0002     0.0625  0.0869  0.0530
         0.3     qOPG    0.0066  0.5423  0.0143  0.0001     0.0670  0.0857  0.0597
         0.3     qADE    0.0078  0.6397  0.0167  0.0001     0.0678  0.0872  0.0599
         0.5     qOPG    0.0089  0.7243  0.0191  0.0002     0.0970  0.0986  0.0733
         0.5     qADE    0.0100  0.8182  0.0216  0.0002     0.0981  0.1005  0.0734
         0.7     qOPG    0.0129  1.0504  0.0273  0.0004     0.1443  0.1199  0.0961
         0.7     qADE    0.0140  1.1358  0.0296  0.0005     0.1441  0.1188  0.0967
         0.9     qOPG    0.0222  1.8021  0.0463  0.0012     0.2339  0.1663  0.1451
         0.9     qADE    0.0234  1.9029  0.0497  0.0014     0.2338  0.1687  0.1436
Fig. 2. Histogram of the response and plot of $\hat G$. [Left panel: histogram of log(price), horizontal axis "log of price", vertical axis "Frequency". Right panel: log(price) plotted against the index.]
least important influences, respectively, on MSRP. Although our index parameter estimate is slightly different from that of Zhu et al. (2012), we arrive at the same conclusion. With the index parameter estimate at $\tau = 0.5$, we can estimate the nonparametric function $G(\cdot)$ with the local linear estimation procedure. See Fig. 2. An alternative analysis is motivated by the observation that the covariate X2, i.e. the number of cylinders, is clearly a categorical variable, while the other six covariates are all continuous. We therefore fit a single-index model for each group of data, grouped by the values of X2. The frequency table of X2 is as follows:
X2 Frequency
1 2
3 1
4 129
5 7
6 188
8 83
12 2
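The grouping step can be sketched as follows, using the frequencies listed above. The cut-off `MIN_GROUP_SIZE` and the helper `groups_to_fit` are hypothetical and are not taken from the paper; they only illustrate one way to decide which cylinder groups are large enough for a separate fit.

```python
# Frequency table of X2 (number of cylinders) as reported above.
cylinder_counts = {1: 2, 3: 1, 4: 129, 5: 7, 6: 188, 8: 83, 12: 2}

# Hypothetical cut-off for a group to be fitted on its own (an assumption).
MIN_GROUP_SIZE = 30

def groups_to_fit(counts, min_size):
    """Return the sorted group labels with at least min_size observations."""
    return sorted(label for label, n in counts.items() if n >= min_size)

total = sum(cylinder_counts.values())       # total number of observations
selected = groups_to_fit(cylinder_counts, MIN_GROUP_SIZE)
```

With any cut-off between 8 and 83 observations the same three groups are retained, so the choice of threshold is not delicate for these data.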
The groups with X2 = 1, 3, 5, 12 have so few observations that no good estimates of the index parameter $\beta_0$ can be expected. We compute the index parameter estimates for the groups with X2 = 4, 6, 8 and $\tau = 0.1, 0.3, 0.5, 0.7, 0.9$, and find that the
Fig. 3. Histogram of the response and plot of $\hat G$. [Panels for X2 = 4, X2 = 6 and X2 = 8. For each group: histogram of log(price), vertical axis "Frequency", and scatter plot of log(price) against the index.]
estimate for each group does not vary much as $\tau$ changes. The three index parameter estimates with $\tau = 0.5$ are presented in the following table as representatives.

Group     The qOPG estimate of $\beta_0$
X2 = 4    (0.1834, 0.4968, 0.4591, 0.4000, 0.5581, 0.1926)
X2 = 6    (0.3431, 0.5936, 0.2946, 0.3388, 0.5188, 0.2427)
X2 = 8    (0.0788, 0.4219, 0.2307, 0.6999, 0.4488, 0.2667)

It can be seen that the attribute that affects MSRP most is weight for 4-cylinder automobiles, horsepower for 6-cylinder automobiles, and average highway MPG for 8-cylinder automobiles. Clearly the three index parameter estimates differ from each other, which is reasonable since the MSRP has a roughly increasing trend as the number of cylinders increases. See Fig. 3. Fig. 3 also gives the local linear estimate of the nonparametric function $G(\cdot)$ for each group at $\tau = 0.05$, 0.5 and 0.95. The curves are again different across groups for fixed $\tau$, although they are all roughly decreasing functions.
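The reading of the group-wise estimates can be reproduced with a short sketch that picks the covariate with the largest coefficient magnitude. It assumes that the six coordinates are ordered as (X1, X3, X4, X5, X6, X7), i.e. the original covariates with X2 removed; this ordering and the helper `most_influential` are assumptions made for illustration.

```python
import numpy as np

# Assumed coordinate order: the seven covariates with X2 (cylinders) removed.
covariates = ["engine size (X1)", "horsepower (X3)", "city MPG (X4)",
              "highway MPG (X5)", "weight (X6)", "wheel base (X7)"]

# Group-wise qOPG estimates at tau = 0.5, as printed in the table above.
estimates = {
    4: [0.1834, 0.4968, 0.4591, 0.4000, 0.5581, 0.1926],
    6: [0.3431, 0.5936, 0.2946, 0.3388, 0.5188, 0.2427],
    8: [0.0788, 0.4219, 0.2307, 0.6999, 0.4488, 0.2667],
}

def most_influential(beta):
    """Name of the covariate whose index coefficient is largest in magnitude."""
    return covariates[int(np.argmax(np.abs(beta)))]

top = {group: most_influential(beta) for group, beta in estimates.items()}
```

Ranking by magnitude is meaningful here because the covariates were standardized to unit variance before fitting.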
Uncited references

Fan et al. (2003) and Zou and Yuan (2008).

Appendix A. Technical conditions and proofs
To prove the theorems given in this paper, we impose, for technical convenience, the following conditions, which may be weakened.

(A) The support of the kernel function $K$ is a compact set.
(B) The conditional $\tau$ quantile of $Y$ given $(X,Z) = (x,z)$, $G(x^\top\beta_0, z)$, has continuous and bounded second partial derivatives.
(C) Let $f(x,z)$ and $f(y,x,z)$ be the joint density functions of $(X,Z)$ and $(Y,X,Z)$, respectively. Also let $f(y|x,z)$ and $F(y|x,z)$ denote the conditional density and distribution functions of $Y$ given $(X,Z)$. The function $f(y,x,z)$ is Lipschitz continuous and bounded away from zero and infinity on its support. The functions $f(x,z)$ and $F(y|x,z)$ are Lipschitz continuous in $x$, $y$ and $z$. The partial derivative of $f(y|x,z)$ with respect to $(y,x,z)$ is continuous and bounded.

We make some comments on these conditions. Condition (A) is a commonly adopted condition for kernel functions in local polynomial estimation. See, for example, Fan, Hu and Truong (1994) and Wu et al. (2010). Condition (B) is used to exclude those conditional quantile functions $G(x^\top\beta_0, z)$ that are not smooth enough. Functions satisfying condition (B) can be approximated locally by their second-order Taylor expansions. Under condition (C), the usual kernel estimates of functions such as $f(x,z)$ and $f(y|x,z)$ are pointwise consistent. Condition (C), together with condition (B), guarantees that the function $H_x(x,z) = (\partial/\partial x)[G_1(x^\top\beta_0, z)/f(G(x^\top\beta_0, z)|x,z)]$ is bounded. Therefore the random vector $H_x(X,Z)$ has a finite second moment, which further implies the root-$n$ consistency of the proposed qOPG estimate of $\beta$. See the proof of result (ii) of Theorem 1. Condition (C) can be appropriately relaxed if it can be guaranteed that $H_x(X,Z)$ has a finite second moment.
A.1. Proof of Theorem 1
Proof of part (i) of Theorem 1. It is sufficient to prove the asymptotic normality of the following local linear estimate at a given point $(X_0, Z_0)$:

$$(\hat a, \hat b, \hat c) = \arg\min_{a,b,c} \sum_{i=1}^{n} \rho_\tau(Y_i - a - b^\top X_{i0} - c^\top Z_{i0})\, K(X_{i0}/h, Z_{i0}/h). \tag{A.1}$$

For ease of exposition, we define the necessary notation. Let $m = p + q$. Write $Q_i = (1, X_{i0}^\top/h, Z_{i0}^\top/h)^\top$ and $\alpha = (a_0, b_0^\top h, c_0^\top h)^\top$, where $a_0 = G(X_0^\top\beta_0, Z_0)$, $b_0 = G_1(X_0^\top\beta_0, Z_0)\beta_0$ and $c_0 = G_2(X_0^\top\beta_0, Z_0)$. Let $\varepsilon_i = Y_i - Q_i^\top\alpha$ and $\hat\theta = [\sqrt{nh^m}(\hat a - a_0), \sqrt{nh^m}(\hat b - b_0)^\top h, \sqrt{nh^m}(\hat c - c_0)^\top h]^\top$. Then $\hat\theta$ is the minimizer of

$$L(\theta) = \sum_{i=1}^{n} \{\rho_\tau(\varepsilon_i - Q_i^\top\theta/\sqrt{nh^m}) - \rho_\tau(\varepsilon_i)\}\, K(X_{i0}/h, Z_{i0}/h). \tag{A.2}$$

Let $\eta_i = I(\varepsilon_i \le 0) - \tau$. With the help of the following equality (Knight, 1998)

$$\rho_\tau(x - y) - \rho_\tau(x) = y\{I(x \le 0) - \tau\} + \int_0^y \{I(x \le z) - I(x \le 0)\}\, dz,$$

we have

$$L(\theta) = (nh^m)^{-1/2} \sum_{i=1}^{n} \eta_i Q_i^\top\theta\, K(X_{i0}/h, Z_{i0}/h) + B_n, \tag{A.3}$$

where $B_n = \sum_{i=1}^{n} K(X_{i0}/h, Z_{i0}/h)\,\xi_i$ and $\xi_i = \int_0^{Q_i^\top\theta/\sqrt{nh^m}} (I\{\varepsilon_i \le z\} - I\{\varepsilon_i \le 0\})\, dz$. Write $\mathcal{D} = \{X_i, Z_i : i = 1, \ldots, n\}$. According to condition (A), the conditional density of $\varepsilon_i$ given $\mathcal{D}$ is $f(Q_i^\top\alpha + \varepsilon\,|\,X_i, Z_i)$. For fixed $\theta$, we have

$$E(\xi_i|\mathcal{D}) = \int_0^{Q_i^\top\theta/\sqrt{nh^m}} (Q_i^\top\theta/\sqrt{nh^m} - t)\, f(Q_i^\top\alpha + t\,|\,X_i, Z_i)\, dt = \int_0^{Q_i^\top\theta/\sqrt{nh^m}} (Q_i^\top\theta/\sqrt{nh^m} - t)\, f(Q_i^\top\alpha\,|\,X_i, Z_i)\, dt\,(1 + o_p(1)) = \frac{1}{2nh^m} f(Q_i^\top\alpha\,|\,X_i, Z_i)(Q_i^\top\theta)^2 (1 + o_p(1)).$$

Similarly we have $E(\xi_i^2|\mathcal{D}) = (1/(3(nh^m)^{3/2}))\, f(Q_i^\top\alpha\,|\,X_i, Z_i)(Q_i^\top\theta)^3 (1 + o_p(1))$. Therefore the conditional mean of $B_n$ is

$$E(B_n|\mathcal{D}) = \sum_{i=1}^{n} K(X_{i0}/h, Z_{i0}/h)\, \frac{1}{2nh^m} f(Q_i^\top\alpha\,|\,X_i, Z_i)(Q_i^\top\theta)^2 (1 + o_p(1)) = \frac{1}{2}\theta^\top S_n \theta\,(1 + o_p(1)).$$

It can be verified that the conditional variance of $B_n$ satisfies

$$\mathrm{Var}(B_n|\mathcal{D}) \le \sum_{i=1}^{n} K^2(X_{i0}/h, Z_{i0}/h)\, E(\xi_i^2|\mathcal{D}) = \frac{1}{3\sqrt{(nh^m)^3}} \sum_{i=1}^{n} K^2(X_{i0}/h, Z_{i0}/h)\, f(Q_i^\top\alpha\,|\,X_i, Z_i)(Q_i^\top\theta)^3 (1 + o_p(1)) = O_p((nh^m)^{-1/2}),$$

where the second equality holds because of condition (B). When $nh^m \to \infty$, we have $\mathrm{Var}(B_n|\mathcal{D})/(E(B_n|\mathcal{D}))^2 = o_p(1)$, which implies $B_n = E(B_n|\mathcal{D}) + o_p(1)$. Then Eq. (A.3) can be rewritten as

$$L_n(\theta) = A_n^\top\theta + \tfrac{1}{2}\theta^\top S_n \theta + R_n,$$

where $A_n = (nh^m)^{-1/2} \sum_{i=1}^{n} \eta_i Q_i K(X_{i0}/h, Z_{i0}/h)$ and $R_n = o_p(1)$. According to the quadratic approximation lemma of Hjort and Pollard (1993), we have

$$\hat\theta = -S_n^{-1} A_n + o_p(1).$$

The remaining task is to prove the asymptotic normality of $\hat\theta$ based on the large-sample properties of $S_n$ and $A_n$. The matrix $S_n$ can be approximated as

$$S_n = \frac{1}{nh^m} \sum_{i=1}^{n} K(X_{i0}/h, Z_{i0}/h)\, f(Q_i^\top\alpha\,|\,X_i, Z_i)\, [1, X_{i0}^\top/h, Z_{i0}^\top/h]^\top [1, X_{i0}^\top/h, Z_{i0}^\top/h] = f(a_0, X_0, Z_0)\, C_1^* (1 + o_p(1)),$$

where $C_1^* = \mathrm{diag}\{1, C_1, \iint s s^\top K(t,s)\, dt\, ds\}$ because of the symmetry of $K(t,s)$. It is clear that $A_n$ is a sum of independent and identically distributed random elements. Furthermore, $A_n$ has an asymptotically normal distribution because its variance is bounded, as is shown below. Note that $E(\eta_i|\mathcal{D}) = F(Q_i^\top\alpha\,|\,X_i, Z_i) - \tau$ and $\mathrm{Var}(\eta_i|\mathcal{D}) = F(Q_i^\top\alpha\,|\,X_i, Z_i)(1 - F(Q_i^\top\alpha\,|\,X_i, Z_i))$. Therefore

$$E(A_n|X_i, Z_i) = (nh^m)^{-1/2} \sum_{i=1}^{n} (F(Q_i^\top\alpha\,|\,X_i, Z_i) - \tau)\, Q_i K(X_{i0}/h, Z_{i0}/h).$$

As $h \to 0$ and $nh^{m+4} \to 0$, $E(E(A_n|X_i, Z_i)) \to 0$. The variance of $A_n$ is

$$\mathrm{Var}(A_n) = \frac{1}{nh^m} \sum_{i=1}^{n} F(Q_i^\top\alpha\,|\,X_i, Z_i)(1 - F(Q_i^\top\alpha\,|\,X_i, Z_i))\, Q_i Q_i^\top K^2(X_{i0}/h, Z_{i0}/h)$$
$$= \iint F(a_0 + h b_0^\top t + h c_0^\top s\,|\,X_0 + th, Z_0 + sh)\,(1 - F(a_0 + h b_0^\top t + h c_0^\top s\,|\,X_0 + th, Z_0 + sh))\,(1, t^\top, s^\top)^\top (1, t^\top, s^\top)\, K^2(t,s)\, f(X_0 + th, Z_0 + sh)\, dt\, ds$$
$$= \tau(1 - \tau)\, f(X_0, Z_0)\, C_2^* (1 + o_p(1)),$$

where $C_2^* = \mathrm{diag}\{1, C_2, \iint s s^\top K^2(t,s)\, dt\, ds\}$. Thus $\hat\theta$ has an asymptotically normal distribution with mean 0 and variance matrix $\tau(1 - \tau)\, C_1^{*-1} C_2^* C_1^{*-1} f(X_0, Z_0)/f^2(a_0, X_0, Z_0)$. Therefore the sub-vector $\sqrt{nh^{m+2}}(\hat b - b_0)$ of $\hat\theta$ also has an asymptotically normal distribution, with mean zero and variance obtained accordingly. $\square$
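Knight's identity, on which the decomposition (A.3) rests, can be checked numerically. The sketch below assumes the usual check loss $\rho_\tau(u) = u\{\tau - I(u < 0)\}$ and approximates the integral by a midpoint rule; the function names and the test points are illustrative choices, not part of the paper.

```python
import numpy as np

def rho(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - float(u < 0))

def knight_rhs(x, y, tau, n_steps=100_000):
    """y*(1{x<=0} - tau) + integral from 0 to y of (1{x<=z} - 1{x<=0}) dz."""
    edges = np.linspace(0.0, y, n_steps + 1)
    mid = 0.5 * (edges[:-1] + edges[1:])            # midpoints of the cells
    integrand = (x <= mid).astype(float) - float(x <= 0)
    integral = integrand.sum() * (y / n_steps)      # signed step handles y < 0
    return y * (float(x <= 0) - tau) + integral

def knight_lhs(x, y, tau):
    """rho_tau(x - y) - rho_tau(x)."""
    return rho(x - y, tau) - rho(x, tau)

# Largest discrepancy over a few (x, y, tau) combinations.
points = [(0.7, 1.5, 0.3), (-0.4, -1.0, 0.3), (0.2, -0.9, 0.8), (-1.1, 2.0, 0.5)]
max_gap = max(abs(knight_lhs(x, y, t) - knight_rhs(x, y, t)) for x, y, t in points)
```

The two sides agree up to the discretization error of the integral, which is of the order of the step length.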
Proof of part (ii) of Theorem 1. At a given point $(X_j, Z_j)$, we have approximated $\hat b_j$ in the proof of part (i) up to $o_p((nh^{m+2})^{-1/2})$. Let $\varepsilon_{ij} = Y_i - Q_{ij}^\top\alpha_j$ and $\eta_{ij} = I(\varepsilon_{ij} \le 0) - \tau$ with $Q_{ij} = (1, X_{ij}^\top/h, Z_{ij}^\top/h)^\top$ and $\alpha_j = (a_j, b_j^\top h, c_j^\top h)^\top$. The proof of part (i) implies that

$$\hat b_j = b_j - \frac{C_1^{-1}}{nh\, f(a_j, X_j, Z_j)} \sum_{i=1}^{n} \eta_{ij} K_h(X_{ij}, Z_{ij}) \frac{X_{ij}}{h} [1 + o_p(1)] = b_j - \frac{C_1^{-1}}{nh\, f(a_j, X_j, Z_j)} \sum_{i=1}^{n} \eta_i^* K_h(X_{ij}, Z_{ij}) \frac{X_{ij}}{h} [1 + o_p(1)],$$

where $K_h(\cdot) = K(\cdot/h)/h^m$ and $\eta_i^* = I(Y_i < a_i) - \tau$ with $a_i = G(X_i^\top\beta_0, Z_i)$. Noting $b_j = G_1(X_j^\top\beta_0, Z_j)\beta_0$, we can approximate $V = (1/n)\sum_{j=1}^{n} \hat b_j \hat b_j^\top$ as

$$V = \frac{1}{n} \sum_{j=1}^{n} G_1^2(X_j^\top\beta_0, Z_j)\, \beta_0\beta_0^\top - (\beta_0 M_n^\top + M_n \beta_0^\top)[1 + o_p(1)],$$

where

$$M_n = \frac{1}{n^2 h} \sum_{j=1}^{n} \frac{G_1(X_j^\top\beta_0, Z_j)}{f(a_j, X_j, Z_j)} \sum_{i=1}^{n} \eta_i^*\, C_1^{-1} X_{ij} h^{-1} K_h(X_{ij}, Z_{ij}).$$

We shall prove later that

$$M_n = n^{-1} \sum_{i=1}^{n} \eta_i^* H_x(X_i, Z_i) + o_p(n^{-1/2}), \tag{A.4}$$

where the random vector

$$H_x(x,z) = \frac{\partial}{\partial x}\left[\frac{G_1(x^\top\beta_0, z)\, f(x,z)}{f(G(x^\top\beta_0, z), x, z)}\right]$$

has a finite second moment according to Conditions (B) and (C). Thus $M_n = O_p(n^{-1/2})$, and it is easy to verify that $\sqrt{n} M_n$ has an asymptotically normal distribution $N(0, \tau(1 - \tau) E[H_x(X_1, Z_1) H_x^\top(X_1, Z_1)])$. It then follows that

$$V = E[G_1^2(X_1^\top\beta_0, Z_1)]\, \beta_0\beta_0^\top - (\beta_0 M_n^\top + M_n\beta_0^\top) + o_p(n^{-1/2})$$
$$= E\{G_1^2(X_1^\top\beta_0, Z_1)\}\, [\beta_0 - \{E[G_1^2(X_1^\top\beta_0, Z_1)]\}^{-1/2} M_n + o_p(n^{-1/2})]\, [\beta_0 - \{E[G_1^2(X_1^\top\beta_0, Z_1)]\}^{-1/2} M_n + o_p(n^{-1/2})]^\top.$$

This immediately implies

$$\hat\beta = \beta_0 - \{E[G_1^2(X_1^\top\beta_0, Z_1)]\}^{-1/2} M_n + o_p(n^{-1/2}) \tag{A.5}$$

and that $\sqrt{n}(\hat\beta - \beta_0)$ has an asymptotically normal distribution $N(0, \Omega)$ with

$$\Omega = \tau(1 - \tau)\{E G_1^2(X_1^\top\beta_0, Z_1)\}^{-1} E[H_x(X_1, Z_1) H_x^\top(X_1, Z_1)].$$

The remaining part of the proof is to show Eq. (A.4). To derive the dominant part of $M_n$, it is sufficient to derive that of $a^\top M_n$ for any $p$-dimensional non-random vector $a$. We rewrite

$$a^\top M_n = \frac{1}{n^2 h} \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{G_1(X_j^\top\beta_0, Z_j)}{f(a_j, X_j, Z_j)}\, a^\top C_1^{-1} X_{ij} h^{-1} \eta_i^* K_h(X_{ij}, Z_{ij}) = \sum_{1 \le i < j \le n} \xi_n(W_i, W_j),$$

where $\xi_n(W_i, W_j) = \zeta_n(W_i, W_j) + \zeta_n(W_j, W_i)$ is the symmetrized kernel of a U-statistic, with

$$\zeta_n(W_1, W_2) = \frac{1}{n^2 h}\, \frac{G_1(X_1^\top\beta_0, Z_1)}{f(a_1, X_1, Z_1)}\, a^\top C_1^{-1} X_{21} h^{-1} \eta_2^* K_h(X_{21}, Z_{21})$$

and $W_i = (Y_i, X_i, Z_i)$. It is easy to verify that $E\xi_n(W_i, W_j) = E\zeta_n(W_i, W_j) = 0$. Let $\varphi_n(x) = E\xi_n(W_1, x) = E\zeta_n(W_1, x)$ and $P_n = (n - 1)\sum_{i=1}^{n} \varphi_n(W_i)$. According to the usual Hoeffding decomposition of $M_n$ (Serfling, 1980), we have

$$E(a^\top M_n - P_n)^2 = \frac{n(n-1)}{2}[E\xi_n^2(W_1, W_2) - 2E\varphi_n^2(W_1)] \le \frac{n(n-1)}{2} E\xi_n^2(W_1, W_2).$$

It follows by conditioning on $(X_1, Z_1, X_2, Z_2)$ that

$$E\xi_n^2(W_1, W_2) \le 4E\zeta_n^2(W_1, W_2) \le \frac{4}{n^4 h^2} E\left\{\frac{G_1(X_1^\top\beta_0, Z_1)}{f(a_1, X_1, Z_1)}\, a^\top C_1^{-1} X_{21} h^{-1} \eta_2^* K_h(X_{21}, Z_{21})\right\}^2 = O(n^{-4} h^{-m-2}).$$

This implies that $E(a^\top M_n - P_n)^2 = O(n^{-2} h^{-m-2})$. Because $nh^{m+2} \to \infty$, we have $a^\top M_n = P_n + o_p(n^{-1/2})$. To further approximate $P_n$, we study $\varphi_n(x) = E\zeta_n(W_1, x)$. Recall that $f(x,z)$ denotes the joint density function of $(X,Z)$. It follows that

$$\varphi_n(W_2) = \eta_2^* \iint \{(n^2 h)^{-1} H(x,z)\, a^\top C_1^{-1} (X_2 - x) h^{-1} K_h(X_2 - x, Z_2 - z)\}\, dx\, dz,$$

where $H(X_1, Z_1) = f(X_1, Z_1)\, G_1(X_1^\top\beta_0, Z_1)/f(a_1, X_1, Z_1)$. Let $H_x$ and $H_z$ denote the first-order partial derivatives of $H$ with respect to $x$ and $z$, respectively. A first-order Taylor expansion of $H(x,z)$ at $(X_2, Z_2)$ gives

$$\varphi_n(W_2) = \eta_2^* \iint \frac{1}{n^2 h} [H(X_2, Z_2) + H_x(X_2, Z_2)^\top (x - X_2) + H_z(X_2, Z_2)^\top (z - Z_2)]\, a^\top C_1^{-1} (X_2 - x) h^{-1} K_h(X_2 - x, Z_2 - z)\, dx\, dz\, \{1 + o(1)\}.$$

By careful study of the above integral, we find that the first and third terms are equal to zero, because the kernel function is symmetric. Thus we finally get

$$\varphi_n(W_2) = \frac{\eta_2^*}{n^2}\, a^\top C_1^{-1} \iint \{(X_2 - x)(X_2 - x)^\top h^{-2} K_h(X_2 - x, Z_2 - z)\}\, dx\, dz\; H_x(X_2, Z_2)\{1 + o(1)\} = \frac{\eta_2^*}{n^2}\, a^\top H_x(X_2, Z_2)\{1 + o(1)\}.$$

The last equality holds because the integral is equal to $C_1$. Therefore $a^\top M_n$ can be further approximated by

$$a^\top M_n = P_n + o_p(n^{-1/2}) = a^\top \frac{1}{n} \sum_{i=1}^{n} \eta_i^* H_x(X_i, Z_i) + o_p(n^{-1/2}).$$

As this approximation holds for any vector $a$, it follows that

$$M_n = \frac{1}{n} \sum_{i=1}^{n} \eta_i^* H_x(X_i, Z_i) + o_p(n^{-1/2}).$$

This completes the proof. $\square$
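The role of the matrix $V = (1/n)\sum_j \hat b_j \hat b_j^\top$ in the argument above can be illustrated with a small simulation. This is a sketch under assumptions: the local gradient estimates are generated artificially as $G_1 \beta_0$ plus Gaussian noise rather than by local linear quantile regression, and the index estimate is taken to be the leading eigenvector of $V$, as in outer-product-gradient methods. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 4

# True index direction, normalized to unit length.
beta0 = np.array([1.0, 2.0, -1.0, 0.5])
beta0 /= np.linalg.norm(beta0)

# Artificial gradient estimates b_j = G1_j * beta0 + noise (an assumption:
# in practice the b_j would come from local linear quantile fits).
g1 = rng.uniform(0.5, 1.5, size=n)
b_hat = g1[:, None] * beta0 + 0.05 * rng.standard_normal((n, p))

# Outer-product matrix V = (1/n) * sum_j b_j b_j^T.
V = b_hat.T @ b_hat / n

# eigh returns eigenvalues in ascending order, so the last column of the
# eigenvector matrix belongs to the largest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(V)
beta_hat = eigvecs[:, -1]

# An eigenvector is only identified up to sign; align it with beta0.
if beta_hat @ beta0 < 0:
    beta_hat = -beta_hat

cosine = float(beta_hat @ beta0)   # close to 1 when the direction is recovered
```

The leading eigenvalue is close to the average of $G_1^2$, while the remaining eigenvalues reflect only the noise, which is why the leading eigenvector isolates the index direction.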
A.2. Proof of Theorem 2
Proof of part (i) of Theorem 2. Consider the local linear estimate at the point $(X_0, Z_0)$:

$$(\hat a, \hat b, \hat c) = \arg\min_{a,b,c} \sum_{i=1}^{n} \rho_\tau(Y_i - a - b^\top X_{i0} - c^\top Z_{i0})\, K(X_{i0}^\top\tilde\beta/h, Z_{i0}/h).$$

With the same notation as in the proof of Theorem 1, $\hat\theta$ is the minimizer of

$$\sum_{i=1}^{n} \{\rho_\tau(\varepsilon_i - (nh^m)^{-1/2} Q_i^\top\theta) - \rho_\tau(\varepsilon_i)\}\, K(X_{i0}^\top\tilde\beta/h, Z_{i0}/h) = L_n(\theta) + D_n,$$

where

$$L_n(\theta) = \sum_{i=1}^{n} \{\rho_\tau(\varepsilon_i - (nh^m)^{-1/2} Q_i^\top\theta) - \rho_\tau(\varepsilon_i)\}\, K(X_{i0}^\top\beta_0/h, Z_{i0}/h),$$

$$D_n = \sum_{i=1}^{n} \{\rho_\tau(\varepsilon_i - (nh^m)^{-1/2} Q_i^\top\theta) - \rho_\tau(\varepsilon_i)\}\, K(X_{i0}^\top\beta_0/h, Z_{i0}/h)\, C_i X_{i0}^\top(\tilde\beta - \beta_0)/h,$$

with the $C_i$'s being uniformly bounded due to the Lipschitz continuity of $K$. With the same approximation scheme as in the proof of Theorem 1, we have

$$L_n(\theta) = \tilde A_n^\top\theta + \tfrac{1}{2}\theta^\top \tilde S_n \theta + R_n,$$

where $\tilde S_n = (1/nh^m)\sum_{i=1}^{n} K(X_{i0}^\top\beta_0/h, Z_{i0}/h)\, f(Q_i^\top\alpha\,|\,X_i, Z_i)\, Q_i Q_i^\top$, $R_n = o_p(1)$ and $\tilde A_n = (1/\sqrt{nh^m})\sum_{i=1}^{n} K(X_{i0}^\top\beta_0/h, Z_{i0}/h)\, \eta_i Q_i$. Similarly, it can be verified that for each fixed $\theta$, $D_n = O_p(\|\tilde\beta - \beta_0\|) = o_p(1)$. This together with the quadratic approximation lemma implies $\hat\theta = -\tilde S_n^{-1} \tilde A_n + o_p(1)$. As $h \to 0$ and $nh^m \to \infty$, the matrix $\tilde S_n$ can be approximated as $\tilde S_n = f(a_0, X_0, Z_0)\, C_3^* (1 + o_p(1))$ with $C_3^* = \mathrm{diag}\{1, C_3, \iint s s^\top K(t^\top\beta_0, s)\, dt\, ds\}$, and $\tilde A_n$ is asymptotically distributed as $N(0, \tau(1 - \tau)\, C_4^* f(X_0, Z_0))$ with $C_4^* = \mathrm{diag}\{1, C_4, \iint s s^\top K^2(t^\top\beta_0, s)\, dt\, ds\}$. The rest of the proof is similar to that of Theorem 1 and is omitted.

Proof of part (ii) of Theorem 2. According to part (i) of this theorem,

$$\hat b_j = b_j - \frac{C_3^{-1}}{nh\, f(a_j, X_j, Z_j)} \sum_{i=1}^{n} K_h(X_{ij}^\top\beta_0, Z_{ij})\, \eta_i^* X_{ij} h^{-1} [1 + o_p(1)].$$

It follows immediately that

$$V = \frac{1}{n} \sum_{j=1}^{n} \hat b_j \hat b_j^\top = \frac{1}{n} \sum_{j=1}^{n} G_1^2(X_j^\top\beta_0, Z_j)\, \beta_0\beta_0^\top - (\beta_0 \tilde M_n^\top + \tilde M_n \beta_0^\top)[1 + o_p(1)],$$

where

$$\tilde M_n = \frac{1}{n^2 h} \sum_{j=1}^{n} \frac{G_1(X_j^\top\beta_0, Z_j)}{f(a_j, X_j, Z_j)} \sum_{i=1}^{n} C_3^{-1} X_{ij} h^{-1} \eta_i^* K_h(X_{ij}^\top\beta_0, Z_{ij}).$$

Similarly it can be shown that $\tilde M_n = n^{-1}\sum_{i=1}^{n} \eta_i^* H_x(X_i, Z_i) + o_p(n^{-1/2})$. Consequently we have $\hat\beta = \beta_0 - \{E[G_1^2(X_1^\top\beta_0, Z_1)]\}^{-1/2} \tilde M_n + o_p(n^{-1/2})$, and $\sqrt{n}(\hat\beta - \beta_0)$ has the same limiting distribution as in result (ii) of Theorem 1. $\square$
Acknowledgment
The research described here was supported by a grant from the University Grants Council of Hong Kong, Hong Kong. The authors thank the editor, the associate editor and a referee for their constructive comments and suggestions, which led to a significant improvement of the earlier version of the paper, particularly an improvement of the theoretical results.
References
Cai, Z., Xu, X., 2009. Nonparametric quantile estimations for dynamic smooth coefficient models. Journal of the American Statistical Association 104, 371–383.
Chaudhuri, P., 1991. Nonparametric estimates of regression quantiles and their local Bahadur representation. Annals of Statistics 19, 760–777.
Chaudhuri, P., Doksum, K., Samarov, A., 1997. On average derivative quantile regression. Annals of Statistics 25, 715–744.
Fan, J., 1992. Design-adaptive nonparametric regression. Journal of the American Statistical Association 87, 998–1004.
Fan, J., 1993. Local linear regression smoothers and their minimax efficiency. Annals of Statistics 21, 196–216.
Fan, J., Yao, Q., Cai, Z., 2003. Adaptive varying-coefficient linear models. Journal of the Royal Statistical Society Series B 65, 57–80.
Gooijer, J.G., Zerom, D., 2003. On additive conditional quantiles with high-dimensional covariates. Journal of the American Statistical Association 98, 135–146.
He, X., Shi, P., 1996. Bivariate tensor-product B-splines in a partly linear model. Journal of Multivariate Analysis 58, 162–181.
Hjort, N.L., Pollard, D., 1993. Asymptotics for minimisers of convex process. Technical Report, Yale University.
Honda, T., 2004. Quantile regression in varying coefficient models. Journal of Statistical Planning and Inference 121, 113–125.
Horowitz, J.L., Lee, S., 2005. Nonparametric estimation of an additive quantile regression model. Journal of the American Statistical Association 100, 1238–1249.
Kai, B., Li, R., Zou, H., 2011. New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Annals of Statistics 39, 305–332.
Kim, M., 2007. Quantile regression with varying coefficients. Annals of Statistics 35, 92–108.
Knight, K., 1998. Limiting distributions for L1 regression estimates under general conditions. Annals of Statistics 26, 755–770.
Koenker, R., 2005. Quantile Regression. Cambridge University Press, New York.
Koenker, R., Bassett, G., 1978. Regression quantiles. Econometrica 46, 33–50.
Koenker, R., Ng, P., Portnoy, S., 1994. Quantile smoothing splines. Biometrika 81, 673–680.
Kong, E., Xia, Y., 2011. A single-index quantile regression model and its estimation. Econometric Theory, preprint.
Lee, S., 2003. Efficient semiparametric estimation of a partially linear quantile regression model. Econometric Theory 19, 1–31.
Serfling, R.J., 1980. Approximation Theorems of Mathematical Statistics. Wiley, New York.
Stone, C.J., 1977. Consistent nonparametric regression, with discussion. Annals of Statistics 5, 595–645.
Wang, H.J., Zhu, Z.Y., Zhou, J.H., 2009. Quantile regression in partially linear varying coefficient models. Annals of Statistics 37, 3841–3866.
Wu, T.Z., Yu, K., Yu, Y., 2010. Single-index quantile regression. Journal of Multivariate Analysis 101, 1607–1621.
Xia, Y., Härdle, W., 2006. Semi-parametric estimation of partially linear single-index models. Journal of Multivariate Analysis 97, 1162–1184.
Xia, Y., Tong, H., Li, W.K., Zhu, L., 2002. An adaptive estimation of dimension reduction space. Journal of the Royal Statistical Society Series B 64, 363–410.
Yu, K., Jones, M.C., 1997. A comparison of local constant and local linear regression quantile estimation. Computational Statistics and Data Analysis 25, 159–166.
Yu, K., Jones, M.C., 1998. Local linear quantile regression. Journal of the American Statistical Association 93, 228–237.
Zhu, L.P., Huang, M., Li, R.Z., 2012. Semiparametric quantile regression with high-dimensional covariates. Statistica Sinica 22, 1379–1401.
Zou, H., Yuan, M., 2008. Composite quantile regression and the oracle model selection theory. Annals of Statistics 36, 1108–1126.
Please cite this article as: Fan, Y., Zhu, L., Estimation of general semi-parametric quantile regression. Journal of Statistical Planning and Inference (2012), http://dx.doi.org/10.1016/j.jspi.2012.11.005