Economics Letters 144 (2016) 33–36
Contents lists available at ScienceDirect
Economics Letters journal homepage: www.elsevier.com/locate/ecolet
Identification in nonseparable models with measurement errors and endogeneity✩

Yingyao Hu a, Ji-Liang Shiu b, Tiemen Woutersen c,∗
a Johns Hopkins University, United States
b Renmin University of China, China
c University of Arizona, United States

Article info
Article history: Received 2 August 2015; received in revised form 13 March 2016; accepted 10 April 2016; available online 20 April 2016.
JEL classification: C20, J20, C27
Keywords: Nonclassical measurement error; Measurement error and endogeneity; Labor supply elasticity

Abstract
Economic variables are often measured with errors and may be endogenous. This paper extends Chesher (2003) and gives new identification results for the ratio of partial effects in a class of nonseparable index models with measurement error and endogeneity. The identification restrictions include a triangular system and the derivative of some conditional mean functions being nonzero. An example that motivates the paper is identification of the labor supply elasticity. © 2016 Elsevier B.V. All rights reserved.
1. Introduction

Economic variables are often measured with errors and may be endogenous. For example, when estimating the labor supply elasticity, it is likely that both the wage and the number of hours worked are measured with error.¹ In this paper, we use the estimation of the labor supply elasticity as a leading example, but the procedure applies more generally. We consider models with measurement error and endogeneity. In particular, let

\[ y = m(\theta x^{*} + w,\, w',\, \eta), \tag{1} \]
\[ x = x^{*} + \varepsilon, \tag{2} \]
\[ x^{*} = g(z, w, w', u), \tag{3} \]
where we observe y, x, z, w, w′. The model involves a dependent variable y, a true regressor x∗, a mismeasured regressor x, correctly measured regressors w and w′, and an instrument z, while the unobservables are η, ε, and u. We do not assume that u and (z, w) are independent. Moreover, we do not impose a restriction on the dimension of the disturbance η in the regression function or on w′. The true regressor x∗ is endogenous in the sense that it is determined by g(z, w, w′, u) while u and η are generally correlated. The measurement error ε may be correlated with η, so that the results of Hu (2008) and Hu and Schennach (2008) do not apply.

✩ We thank an editor and a referee for excellent comments and we thank Maddie Pickens for research assistance.
∗ Correspondence to: Department of Economics, Eller College of Management, University of Arizona, P.O. Box 210108, Tucson, AZ 8572, United States. E-mail address: [email protected] (T. Woutersen).
¹ Borjas (2009) gives an overview of empirical studies that estimate the labor supply elasticity and also discusses the problems caused by measurement error.
http://dx.doi.org/10.1016/j.econlet.2016.04.009
0165-1765/© 2016 Elsevier B.V. All rights reserved.

Three related papers are Chesher (2003), Abrevaya et al. (2010) and Hu et al. (2015). Our results differ from Chesher (2003) in that we allow for measurement error, but we have less to say about quantiles. Abrevaya et al. (2010) consider estimation up to scale in the transformation model in order to test the null hypothesis of no causal effect. Our work differs from theirs in that we allow the regressors to be measured with error and the model to be nonseparable, and our identification result immediately implies a test in a more general framework. Hu et al. (2015) consider a separable model and provide a literature review.

All the identification results in this paper are constructive in the sense that they suggest an estimator, which we demonstrate through simulations. The rest of the paper is organized as follows. In Section 2 we extend the model of Eqs. (1)–(3) with a triangular system and present the identification result. Section 3 contains results from Monte Carlo experiments and Section 4 concludes. The
detailed mathematical proof of the main result is provided in the Appendix.

2. Identification in nonseparable models containing a triangular system

We are interested in the parameter θ of the nonseparable regression model,

\[ y = m(\theta x^{*} + w,\, w',\, \eta), \tag{4} \]
\[ x = x^{*} + \varepsilon, \tag{5} \]
\[ x^{*} = g(z, w, w', u), \tag{6} \]

where we observe y, x, z, w, w′. As discussed in the introduction, Hu et al. (2015) assume that the observable variables are independent of all error terms, i.e. (z, w) ⊥ (η, ε, u) in Eqs. (4)–(6). However, in many cases in economics, establishing marginal independence in Eq. (6), such as (z, w) ⊥ u, may require undesirably strong assumptions.² A worker's labor supply function is a motivating example of a nonseparable model that illustrates this case. Suppose the worker's labor supply function can be written as Eq. (4). The dependent variable y could denote hours worked, x∗ the unobserved true wage rate, and ε the measurement error. In addition, w could be a measure of worker productivity, z education, and w′ a list of covariates such as demographics, housing, etc. While the error term η can be interpreted as unobserved health status, the error term u can represent job training possibilities at firms. In this empirical setting, the parameter of interest θ is the ratio of the partial effects of the true wage rate and of the measure of worker productivity on hours worked. While Eq. (4) is the hours worked equation, Eq. (6) is the wage equation. In this case, a potential correlation between education and job training possibilities, or between worker productivity and job training possibilities, would violate the marginal independence assumption (z, w) ⊥ u. To allow (z, w) and u to be correlated, we extend the nonseparable model (4)–(6) with the following marginal independence and triangular system:
Assumption 2.1. The variables z, w, w′ are jointly independent of the error terms (η, ε), i.e.

\[ (z, w, w') \perp (\eta, \varepsilon). \]

Each of z and w contains a continuous element, and the functions m(·, ·, ·) and g(·, …, ·) are differentiable. In addition, the function m(·, ·, ·) is strictly monotone in η and its derivative with respect to its first argument is nonzero, i.e. ∂m(·, ·, ·)/∂ψ ≠ 0, where ψ denotes the first argument of m(·, ·, ·). The function g(·, …, ·) is strictly monotone in u.

These monotonicity conditions can be illustrated by the motivating example of labor supply and the wage equation. In particular, holding other factors fixed, good health may increase the hours worked and job training possibilities may increase the unobserved true wage rate.

Assumption 2.2. Let v denote an unobservable error term. Let

\[ w = h_1(z, w', u, v), \tag{7} \]
\[ z = h_2(w', v), \tag{8} \]

where h1 and h2 are differentiable, h1 is strictly monotone in its argument u, and h2 is strictly monotone in its argument v. No restrictions are imposed on v.
If we interpret the unobservable error term v as ability, then the motivating example satisfies these monotonicity assumptions because ability increases the observable wage rate and education ceteris paribus. The modeling in Eq. (7) allows for correlation between w and u.

Assumption 2.3. Assume the cumulative distribution functions F_{Y|Z,W,W′}(y|z, w, w′) and F_{X|Z,W,W′}(x|z, w, w′) are differentiable with respect to z and w. For some (ȳ, x̄) and (z, w, w′) let

\[
\frac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial z}\,\frac{\partial F_{X|Z,W,W'}(\bar x)}{\partial w}
- \frac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial w}\,\frac{\partial F_{X|Z,W,W'}(\bar x)}{\partial z} \neq 0.
\]

Also, at the point at which these derivatives are evaluated, (z, w, w′), let (i) the density of (z, w, w′) be strictly positive or (ii) the conditional density of (z, w) given w′ be strictly positive and P(W′ = w′) > 0.

Our main result can be stated as follows; we leave the proof to the Appendix.

Theorem 2.1. Suppose that Assumptions 2.1, 2.2, and 2.3 hold. Then the parameter of interest θ is identified. In particular, for any (ȳ, x̄) and (z, w, w′) at which the condition in Assumption 2.3 holds,

\[
\theta = \frac{\dfrac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial z}\,\dfrac{\partial F_{X|Z,W,W'}(\bar x)}{\partial x}}
{\dfrac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial z}\,\dfrac{\partial F_{X|Z,W,W'}(\bar x)}{\partial w}
- \dfrac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial w}\,\dfrac{\partial F_{X|Z,W,W'}(\bar x)}{\partial z}}. \tag{9}
\]
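As a sanity check on the ratio in Theorem 2.1, the following sketch works through a linear special case. The coefficients a, b, c, the composite errors ν and ξ, and their densities f_ν and f_ξ are hypothetical names introduced only for this illustration. The check also assumes that (z, w) is independent of (u, η, ε) and omits w′, which is stronger than what the theorem requires.

```latex
% Hypothetical linear special case, with a \neq 0 and b \neq 0.
% Assumption for illustration only: (z, w) independent of (u, \eta, \varepsilon).
\begin{align*}
y &= a(\theta x^{*} + w) + \eta, \qquad x^{*} = b z + c w + u, \qquad x = x^{*} + \varepsilon, \\
F_{Y|Z,W}(y \mid z, w) &= F_{\nu}\bigl(y - a\theta(bz + cw) - aw\bigr), \qquad \nu = a\theta u + \eta, \\
F_{X|Z,W}(x \mid z, w) &= F_{\xi}(x - bz - cw), \qquad \xi = u + \varepsilon, \\
\frac{\partial F_{Y|Z,W}}{\partial z} &= -a\theta b\, f_{\nu}, \qquad
\frac{\partial F_{Y|Z,W}}{\partial w} = -a(\theta c + 1)\, f_{\nu}, \\
\frac{\partial F_{X|Z,W}}{\partial x} &= f_{\xi}, \qquad
\frac{\partial F_{X|Z,W}}{\partial z} = -b\, f_{\xi}, \qquad
\frac{\partial F_{X|Z,W}}{\partial w} = -c\, f_{\xi}, \\
\frac{\partial F_{Y|Z,W}}{\partial z}\frac{\partial F_{X|Z,W}}{\partial w}
 - \frac{\partial F_{Y|Z,W}}{\partial w}\frac{\partial F_{X|Z,W}}{\partial z}
 &= a\theta b c\, f_{\nu} f_{\xi} - a b(\theta c + 1)\, f_{\nu} f_{\xi} = -ab\, f_{\nu} f_{\xi}, \\
\frac{(-a\theta b\, f_{\nu})\, f_{\xi}}{-ab\, f_{\nu} f_{\xi}} &= \theta .
\end{align*}
```

In this special case the denominator is nonzero whenever a, b and the two densities are nonzero at the evaluation point, which is what the condition in Assumption 2.3 amounts to here.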
In order to estimate θ we have to use estimators for the derivatives of the cumulative distribution functions. This introduces two types of bias in the estimate of θ: approximation bias in these derivatives, and ratio bias, since the denominator in Eq. (9) may be close to zero. To mitigate these biases, in the estimation section we propose to use a weighted average over sample points. Also, the condition in Assumption 2.3 is expressed in terms of distribution functions, so the restriction can be tested. In particular, if Assumption 2.3 holds for more than one value of (ȳ, x̄), then we can compute an empirical estimator of θ for different values of (ȳ, x̄) to test the model.

² For example, Newey (2001) argues that independence between z, w and u is a strong assumption.

3. Estimation and simulation

Under the assumptions of Theorem 2.1, the parameter of interest θ can be identified and written in terms of the first derivatives of the distribution functions F_{X|Z,W,W′} and F_{Y|Z,W,W′}. From an i.i.d. random sample {y_i, x_i, z_i, w_i, w_i′}, i = 1, …, n, the empirical estimator for F_{Y|Z,W,W′} is

\[
\hat F(y \mid z, w, w') = \frac{1}{n} \sum_{i=1}^{n} \Big\{ \text{number of } y_1, \ldots, y_n \text{ less than } y \text{ such that }
|z_i - z| < \sigma \cdot \mathrm{std}(z),\ |w_i - w| < \sigma \cdot \mathrm{std}(w),\ |w_i' - w'| < \sigma \cdot \mathrm{std}(w') \Big\}, \tag{10}
\]

where std(z), std(w), and std(w′) are the standard deviations of z_i, w_i, and w_i′ in the sample, respectively, and σ is a bandwidth that converges to zero as the sample size increases. This implies that a consistent estimator for the derivative of F_{Y|Z,W,W′} with respect to z can be obtained by

\[
\frac{\partial \hat F_{Y|Z,W,W'}(y, z, w, w')}{\partial z}
= \frac{1}{2\delta}\Big[ \hat F(y \mid z + \delta, w, w') - \hat F(y \mid z - \delta, w, w') \Big]
\]

using a small value of δ. A similar formula can be applied to estimate F_{X|Z,W,W′} and its derivatives.
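The windowed estimator and its central difference can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names `local_cdf` and `dcdf_dz` are hypothetical, the count is divided by the number of observations inside the window (an assumption about how the count in Eq. (10) is normalized), and an empty window returns NaN by an assumed convention.

```python
import numpy as np


def local_cdf(y0, z0, w0, wp0, y, z, w, wp, sigma):
    """Windowed empirical CDF of y at y0 for covariates near (z0, w0, wp0).

    An observation is inside the window when each covariate lies within
    sigma sample standard deviations of the evaluation point, as in Eq. (10).
    Assumption: the count is divided by the window size, not by n.
    """
    inside = (
        (np.abs(z - z0) < sigma * np.std(z))
        & (np.abs(w - w0) < sigma * np.std(w))
        & (np.abs(wp - wp0) < sigma * np.std(wp))
    )
    if not inside.any():
        return np.nan  # assumed convention for an empty window
    return float(np.mean(y[inside] < y0))


def dcdf_dz(y0, z0, w0, wp0, y, z, w, wp, sigma, delta):
    """Central difference of local_cdf in the z direction with step delta."""
    upper = local_cdf(y0, z0 + delta, w0, wp0, y, z, w, wp, sigma)
    lower = local_cdf(y0, z0 - delta, w0, wp0, y, z, w, wp, sigma)
    return (upper - lower) / (2.0 * delta)
```

Derivatives with respect to w follow by shifting `w0` instead of `z0`, and the derivative of the distribution of x with respect to x by shifting the threshold `y0` with x passed in place of y.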
Table 1
Estimations of single linear index models (n = 1000).

                             Separable estimator   Nonseparable estimator
                             (θ = −1)              (θ = −1)
Simulation I     Mean        −1.029                −0.987
                 Median      −1.080                −0.960
                 Std. dev.    0.623                 0.351
Simulation II    Mean        −0.225                −0.979
                 Median      −0.216                −0.932
                 Std. dev.    0.075                 0.183
Simulation III   Mean         0.282                −1.062
                 Median      −0.076                −1.060
                 Std. dev.    4.730                 0.433
Simulation IV    Mean        −0.303                −1.069
                 Median      −0.301                −1.018
                 Std. dev.    0.195                 0.180

Note: Standard deviations of the parameters are computed as the standard deviation of the estimates across 100 simulations and are called (simulation) standard deviations. The Separable Estimator is the local polynomial regression method in Hu et al. (2015), while the Nonseparable Estimator is the proposed estimator in Section 3. The abbreviated names reflect whether the function g in Eq. (3) is nonseparable or not.
Because the formula for θ in Eq. (9) holds over the domain of (y, x, z, w, w′), we can use

\[
\theta = \int \cdots \int
\frac{\dfrac{\partial F_{Y|Z,W,W'}(y)}{\partial z}\,\dfrac{\partial F_{X|Z,W,W'}(x)}{\partial x}}
{\dfrac{\partial F_{Y|Z,W,W'}(y)}{\partial z}\,\dfrac{\partial F_{X|Z,W,W'}(x)}{\partial w}
- \dfrac{\partial F_{Y|Z,W,W'}(y)}{\partial w}\,\dfrac{\partial F_{X|Z,W,W'}(x)}{\partial z}}
\; \omega(y, x, z, w, w')\, dy\, dx\, dz\, dw\, dw',
\]

where ω(y, x, z, w, w′) is a weighting function that is zero if the denominator in the last equation is zero. If we use the empirical distribution of (y, x, z, w, w′) as our weighting function and the denominator is nonzero, a sample counterpart estimator of θ is given in Box I. Because the estimator is constructed using empirical estimators for the distribution functions and their derivatives, the proposed estimator should be consistent under general conditions such as those stated by Hu et al. (2015).

We illustrate that the identification results are constructive and investigate the finite sample properties of the estimator through a Monte Carlo simulation. In particular, we conducted four sets of simulations. In these experiments, the data generating process (DGP) is such that w and u are correlated. We assume that the variables x, w′, u, v, and η are generated using a Gaussian process with mean zero and unit variance. The four sets of simulations for the functions m, g, h1, and h2 are given in Box II. These simulations are intended to capture different degrees of separability in the functions m and g. In Simulation I, m and g are both separable; in Simulations II and III, g is nonseparable; and in Simulation IV, m and g are both nonseparable. We use 100 replications for each experiment with a sample size of 1000, and report the mean, median, and standard deviation for the Separable Estimator in Hu et al. (2015) and the Nonseparable Estimator (the estimator suggested by the identification results).

The estimation results are summarized in Table 1 and show that the Nonseparable Estimator performs well in all experiments. As expected, the Separable Estimator in Hu et al. (2015) exhibits larger bias in Simulations II, III and IV. When m and g are both separable in Simulation I, both estimators have small bias. However, the Nonseparable Estimator suggested in this paper has a lower standard deviation.

4. Conclusion

Many empirical problems in economics have some variables that are measured with error and/or are endogenous. This paper gives new identification results for the ratio of partial effects in a class of nonseparable index models with measurement error and
endogeneity. The identification restrictions include a triangular system and the derivative of some conditional mean function being nonzero. The results are more general than Chesher (2003) in some aspects since we allow for measurement error but the results are more restrictive in other aspects. Measurement error and endogeneity are features of many empirical problems and we discussed the identification of the labor supply elasticity as an example.

Appendix. Proof of Theorem 2.1

With the help of Eqs. (7)–(8) from Assumption 2.2, the model (4)–(6) can be divided into two triangular systems, (A) and (B):

\[
\begin{aligned}
y &= m(\theta g(z, w, w', u) + w,\, w',\, \eta), \\
w &= h_1(z, w', u, v), \\
z &= h_2(w', v),
\end{aligned} \tag{A}
\]

and

\[
\begin{aligned}
x &= g(z, w, w', u) + \varepsilon, \\
w &= h_1(z, w', u, v), \\
z &= h_2(w', v).
\end{aligned} \tag{B}
\]
We apply the identification technique developed in Chesher (2003) to these triangular systems. Since h2 is strictly monotone in v by Assumption 2.2, for a given w′ the inverse of h2 exists; we denote it by h̃2(w′, z) = v and call it the inverse of h2 conditional on w′. Substituting it into h1 gives w = h1(z, w′, u, h̃2(w′, z)) conditional on z, w′. That h1 is strictly monotone in u implies that the inverse function h̃1(z, w′, w) = u exists. Plugging this into m gives y = m(θ g(z, w, w′, h̃1(z, w′, w)) + w, w′, η) conditional on z, w, and w′. Define n1(z, w, w′, η) = m(θ g(z, w, w′, h̃1(z, w′, w)) + w, w′, η). Taking derivatives with respect to z and w on both sides of the above equation, and using the independence assumption (z, w, w′) ⊥ η in Assumption 2.1, we obtain

\[
\frac{\partial n_1(z, w, w', \eta)}{\partial w}
= \frac{\partial m(\theta g(z, w, w', \tilde h_1(z, w', w)) + w,\, w',\, \eta)}{\partial \psi}
\times \left[ \theta \left( g_w + g_u \frac{\partial \tilde h_1}{\partial w} \right) + 1 \right], \tag{11}
\]
\[
\frac{\partial n_1(z, w, w', \eta)}{\partial z}
= \frac{\partial m(\theta g(z, w, w', \tilde h_1(z, w', w)) + w,\, w',\, \eta)}{\partial \psi}
\times \theta \left( g_z + g_u \frac{\partial \tilde h_1}{\partial z} \right). \tag{12}
\]

Because ∂m(·, ·, ·)/∂ψ ≠ 0 from Assumption 2.1, we can cross multiply the two equations above to obtain

\[
\frac{\partial n_1(z, w, w', \eta)}{\partial w}\, \theta \left( g_z + g_u \frac{\partial \tilde h_1}{\partial z} \right)
= \frac{\partial n_1(z, w, w', \eta)}{\partial z} \left[ \theta \left( g_w + g_u \frac{\partial \tilde h_1}{\partial w} \right) + 1 \right]. \tag{13}
\]

Using the independence assumption (z, w, w′) ⊥ η again, together with the strict monotonicity of n1(z, w, w′, η) in η, which follows from the monotonicity of m in η in Assumption 2.1, yields

\[
F_{\eta}(\bar\eta) = F_{\eta|Z,W,W'}(\bar\eta) = \Pr(\eta < \bar\eta \mid Z, W, W')
= \Pr\big(n_1(z, w, w', \eta) < n_1(z, w, w', \bar\eta) \mid Z, W, W'\big)
\]
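\[
\hat\theta = \frac{1}{n} \sum_{i}
\frac{\dfrac{\partial \hat F_{Y|Z,W,W'}(y_i, z_i, w_i, w_i')}{\partial z}\,\dfrac{\partial \hat F_{X|Z,W,W'}(x_i, z_i, w_i, w_i')}{\partial x}}
{\dfrac{\partial \hat F_{Y|Z,W,W'}(y_i, z_i, w_i, w_i')}{\partial z}\,\dfrac{\partial \hat F_{X|Z,W,W'}(x_i, z_i, w_i, w_i')}{\partial w}
- \dfrac{\partial \hat F_{Y|Z,W,W'}(y_i, z_i, w_i, w_i')}{\partial w}\,\dfrac{\partial \hat F_{X|Z,W,W'}(x_i, z_i, w_i, w_i')}{\partial z}}
\]

Box I.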
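The averaging step in Box I can be sketched in Python as below. The sketch assumes the five derivative estimates at each sample point have already been computed, for example by finite differences of a windowed empirical distribution function. The function name `averaged_ratio` and the trimming threshold `tol` are hypothetical; `tol` stands in for a weighting function that is zero where the denominator vanishes, and points with undefined derivative estimates are dropped by the same rule.

```python
import numpy as np


def averaged_ratio(dFy_dz, dFy_dw, dFx_dx, dFx_dz, dFx_dw, tol=1e-8):
    """Average the pointwise ratio of Eq. (9) over sample points.

    Each argument holds one derivative estimate per sample point.
    Assumption: points whose denominator is within tol of zero, or whose
    estimates are NaN, receive zero weight.
    """
    dFy_dz, dFy_dw, dFx_dx, dFx_dz, dFx_dw = (
        np.asarray(a, dtype=float)
        for a in (dFy_dz, dFy_dw, dFx_dx, dFx_dz, dFx_dw)
    )
    numer = dFy_dz * dFx_dx
    denom = dFy_dz * dFx_dw - dFy_dw * dFx_dz
    keep = np.isfinite(numer) & (np.abs(denom) > tol)
    if not keep.any():
        return np.nan  # no usable sample point
    return float(np.mean(numer[keep] / denom[keep]))
```

Note that this averages over the retained points only, whereas Box I as written divides by n; the two coincide when no point is trimmed.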
Simulation I:
  m(θx∗ + w, w′, η) = 0.1(θx∗ + w + w′) + η,
  g(z, w, w′, u) = 0.1(z + w + w′ + u),
  h1(z, w′, u, v) = z + w′ + u + v,
  h2(w′, v) = 0.1(w′ + v).
Simulation II:
  m(θx∗ + w, w′, η) = 5 exp(θx∗ + w + w′) + η,
  g(z, w, w′, u) = 0.5(z + w + w′)² · u,
  h1(z, w′, u, v) = (z + w′)² + u + v,
  h2(w′, v) = 0.5(w′² + v).
Simulation III:
  m(θx∗ + w, w′, η) = 0.5(θx∗ + w + w′)³ + η,
  g(z, w, w′, u) = 0.5 exp(z + w + w′) · u,
  h1(z, w′, u, v) = 0.5(z + w′)³ + u + v,
  h2(w′, v) = 0.1(w′³ + v).
Simulation IV:
  m(θx∗ + w, w′, η) = 0.1(θx∗ + w + w′)³ · η,
  g(z, w, w′, u) = 0.5(z + w + w′)² · u,
  h1(z, w′, u, v) = (z + w′)² + u + v,
  h2(w′, v) = 0.3(w′² + v).

Box II.
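To make the Simulation I design in Box II concrete, the following Python sketch draws one sample from it. It is an illustration under stated assumptions rather than the authors' code: the function name `simulate_design_one` and the seed are hypothetical, the primitives w′, u, v, η and the measurement error ε are assumed to be independent standard normal draws, and θ = −1 is taken from Table 1.

```python
import numpy as np


def simulate_design_one(n=1000, theta=-1.0, seed=0):
    """Draw one sample of size n from the Simulation I design in Box II."""
    rng = np.random.default_rng(seed)
    # Assumption: independent standard normal primitives, including eps.
    wp, u, v, eta, eps = (rng.standard_normal(n) for _ in range(5))
    z = 0.1 * (wp + v)                           # z = h2(w', v)
    w = z + wp + u + v                           # w = h1(z, w', u, v)
    x_star = 0.1 * (z + w + wp + u)              # x* = g(z, w, w', u)
    x = x_star + eps                             # mismeasured regressor
    y = 0.1 * (theta * x_star + w + wp) + eta    # y = m(theta x* + w, w', eta)
    return {"y": y, "x": x, "z": z, "w": w, "wp": wp,
            "x_star": x_star, "u": u, "v": v, "eta": eta, "eps": eps}
```

Because u enters h1 directly, w and u are correlated in the generated sample, which is the feature the experiments are designed to have. The latent variables are returned only so that the construction can be checked.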
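Since y = n1(z, w, w′, η), the last probability in the chain of equalities above equals

\[
\Pr\big(y < n_1(z, w, w', \bar\eta) \mid Z, W, W'\big) = F_{Y|Z,W,W'}\big(n_1(z, w, w', \bar\eta)\big).
\]

Differentiating the above expression with respect to w, and noting that F_η(η̄) does not depend on w, yields

\[
0 = \frac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial w}
+ \frac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial y}\, \frac{\partial n_1(z, w, w', \bar\eta)}{\partial w},
\]

where ȳ = n1(z, w, w′, η̄). It follows that

\[
\frac{\partial n_1(z, w, w', \bar\eta)}{\partial w}
= - \left( \frac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial y} \right)^{-1} \frac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial w}. \tag{14}
\]

Similarly, we have

\[
\frac{\partial n_1(z, w, w', \bar\eta)}{\partial z}
= - \left( \frac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial y} \right)^{-1} \frac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial z}. \tag{15}
\]

Hence we can express the derivative terms of n1(z, w, w′, η) in Eq. (13) in terms of the first derivatives of the cumulative distribution function F_{Y|Z,W,W′}. Next, we apply the above derivation to the triangular system (B). Denote

\[
n_2(z, w, w', \varepsilon) = g(z, w, w', \tilde h_1(z, w', w)) + \varepsilon.
\]

The derivatives of n2(z, w, w′, ε) with respect to z and w are:

\[
\frac{\partial n_2(z, w, w', \varepsilon)}{\partial w} = g_w + g_u \frac{\partial \tilde h_1}{\partial w}, \tag{16}
\]
\[
\frac{\partial n_2(z, w, w', \varepsilon)}{\partial z} = g_z + g_u \frac{\partial \tilde h_1}{\partial z}. \tag{17}
\]

Applying a derivation similar to that of Eqs. (14) and (15) in system (A) to system (B), we have

\[
\frac{\partial n_2(z, w, w', \bar\varepsilon)}{\partial w}
= - \left( \frac{\partial F_{X|Z,W,W'}(\bar x)}{\partial x} \right)^{-1} \frac{\partial F_{X|Z,W,W'}(\bar x)}{\partial w}, \tag{18}
\]
\[
\frac{\partial n_2(z, w, w', \bar\varepsilon)}{\partial z}
= - \left( \frac{\partial F_{X|Z,W,W'}(\bar x)}{\partial x} \right)^{-1} \frac{\partial F_{X|Z,W,W'}(\bar x)}{\partial z}, \tag{19}
\]

where x̄ = n2(z, w, w′, ε̄). Combining those results and plugging them into Eq. (13) results in

\[
\frac{- \left( \dfrac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial y} \right)^{-1} \dfrac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial w}}
{- \left( \dfrac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial y} \right)^{-1} \dfrac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial z}}
= \frac{\theta \left[ - \left( \dfrac{\partial F_{X|Z,W,W'}(\bar x)}{\partial x} \right)^{-1} \dfrac{\partial F_{X|Z,W,W'}(\bar x)}{\partial w} \right] + 1}
{\theta \left[ - \left( \dfrac{\partial F_{X|Z,W,W'}(\bar x)}{\partial x} \right)^{-1} \dfrac{\partial F_{X|Z,W,W'}(\bar x)}{\partial z} \right]}.
\]

This gives our desired identification result,

\[
\theta = \frac{\dfrac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial z}\,\dfrac{\partial F_{X|Z,W,W'}(\bar x)}{\partial x}}
{\dfrac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial z}\,\dfrac{\partial F_{X|Z,W,W'}(\bar x)}{\partial w}
- \dfrac{\partial F_{Y|Z,W,W'}(\bar y)}{\partial w}\,\dfrac{\partial F_{X|Z,W,W'}(\bar x)}{\partial z}},
\]

because Assumption 2.3 guarantees that the denominator in the last expression is nonzero.

References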
Abrevaya, J., Hausman, J.A., Khan, S., 2010. Testing for causal effects in a generalized regression model with endogenous regressors. Econometrica 78, 2043–2061.
Borjas, G.J., 2009. Labor Economics, fifth ed. McGraw-Hill/Irwin.
Chesher, A., 2003. Identification in nonseparable models. Econometrica 71 (5), 1405–1441.
Hu, Y., 2008. Identification and estimation of nonlinear models with misclassification error using instrumental variables: A general solution. J. Econometrics 144, 27–61.
Hu, Y., Schennach, S.M., 2008. Instrumental variable treatment of nonclassical measurement error models. Econometrica 76, 195–216.
Hu, Y., Shiu, J., Woutersen, T., 2015. Identification and estimation of single index models with measurement error and endogeneity. Econom. J. 18 (3), 347–362.
Newey, W., 2001. Flexible simulated moment estimation of nonlinear errors-in-variables models. Rev. Econ. Stat. 83 (4), 616–627.