Journal of Econometrics 56 (1993) 169-188. North-Holland
Some aspects of measurement error in a censored regression model
When a linear model with a censored dependent variable is written as a regression model, the model is nonlinear in the explanatory variables. This means that when the explanatory variables are measured with error, estimation by instrumental variables will not lead to consistent estimates of the parameters. My approach to the problem involves assuming that the variables measured with error can be represented by a reduced form equation and estimating the parameters of the censored regression model by the least absolute deviations estimator. This estimator is based on the median of the distribution of the dependent variable rather than the mean, and the median is not affected by censoring in the lower tail. Hence the estimator is consistent under general conditions. I also consider the problem of testing for the presence of measurement error.
Correspondence to: Andrew A. Weiss, Department of Economics, University of Southern California, Los Angeles, CA 90089-0253, USA.
* I would like to thank Paul Ruud and a referee for helpful comments. Any errors are my own.
0304-4076/93/$05.00 © 1993-Elsevier Science Publishers B.V. All rights reserved
1. Introduction
In this paper, I consider the estimation of a censored regression model (CRM) in which the explanatory variables are measured with error. To see why measurement error (ME) is a problem in this model, recall first the structure of the linear regression model with ME. In terms of observables, the ME leads to correlation between the regressors and errors, and as a result, OLS is inconsistent. Some additional information is needed to consistently estimate the parameters, and often this takes the form of a set of instrumental variables (IV's). The IV's are uncorrelated with the errors and IV estimation exploits this to give consistent estimates. Unfortunately, in the CRM the same approach does not give consistent estimates: The CRM leads to a regression model that is nonlinear in the variables and as a result, the zero correlation cannot be exploited. To be more concrete, suppose that the model is given by
$$y_i^* = x_i'\beta_0 + \varepsilon_i, \tag{1.1}$$
in the usual notation ($\beta_0$ is $k \times 1$). Rather than $x_i$, the variables $\tilde{x}_i$ are observed, where
$$\tilde{x}_i = x_i + \eta_i, \tag{1.2}$$
and $\eta_i$ is the ME. Then
$$y_i^* = \tilde{x}_i'\beta_0 + (\varepsilon_i - \eta_i'\beta_0), \tag{1.3}$$
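and under the usual assumptions, $E[\tilde{x}_i(\varepsilon_i - \eta_i'\beta_0)] = -E[\eta_i\eta_i']\beta_0 \neq 0$. If an ($l \times 1$) set of instruments, $z_i$, is available such that $E[z_i\varepsilon_i] = 0$ and $E[z_i\eta_i'] = 0$, then $z_i$ is uncorrelated with the error in (1.3). The IV estimator utilizes this by finding the value of $\beta$ setting $\sum_{i=1}^n (y_i^* - \tilde{x}_i'\beta)z_i = 0$ (when $l = k$). In the CRM, the variable
$$y_i = \max(0, y_i^*) \tag{1.4}$$
is observed, rather than $y_i^*$. As is well-known, if the $\varepsilon_i$ are i.i.d. $N(0, \sigma_0^2)$, then
$$y_i = \Phi(x_i'\beta_0/\sigma_0)\,x_i'\beta_0 + \sigma_0\,\phi(x_i'\beta_0/\sigma_0) + \nu_i, \tag{1.5}$$
where $\phi$ and $\Phi$ are the p.d.f. and c.d.f. of a standard Normal random variable and $E[\nu_i|x_i] = 0$ [e.g., T. Amemiya (1984, eq. 15)]. The nonlinear IV estimator again makes the residuals uncorrelated with the instruments [e.g., Y. Amemiya (1985)]. In (1.5), it finds $\beta$, $\sigma$ such that
$$\sum_{i=1}^n \big(y_i - \Phi(x_i'\beta/\sigma)\,x_i'\beta - \sigma\,\phi(x_i'\beta/\sigma)\big)z_i = 0. \tag{1.6}$$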
Replacing $x_i$ by $\tilde{x}_i$ and substituting (1.2) into this expression shows that $\eta_i$ enters nonlinearly and therefore that IV is inconsistent even if $E[z_i\eta_i'] = 0$. This argument implies that the IV's must satisfy stronger conditions than those in the linear regression model if consistent estimates of the parameters are to be obtained. For example, suppose the IV's allow the representation
$$x_i = \pi_0'z_i + u_i, \tag{1.7}$$
where $E[u_i|z_i] = 0$ and $\pi_0$ is ($l \times k$) and has rank $k$. Of course, any elements of $\tilde{x}_i$ not measured with error can serve as their own instruments. In any case, with (1.7), the model can be written as a CRM with endogenous explanatory variables, and numerous consistent estimators have been proposed for this model [e.g., T. Amemiya (1979), Newey (1987b)]. Eliminating $x_i$ from (1.1) and (1.7) gives (1.3) and
$$\tilde{x}_i = \pi_0'z_i + (u_i + \eta_i). \tag{1.8}$$
The endogeneity of $\tilde{x}_i$ results from the correlation between the error terms in (1.3) and (1.8). Also, substituting (1.8) into (1.3) gives
$$y_i^* = z_i'\pi_0\beta_0 + (\varepsilon_i + u_i'\beta_0) \tag{1.9}$$
$$\quad\; = z_i'\alpha_0 + (\varepsilon_i + u_i'\beta_0), \tag{1.10}$$
where $\alpha_0 = \pi_0\beta_0$, which is the reduced form equation for $y_i^*$. The estimates of $\beta_0$ are obtained from (1.8) and either (1.9) or (1.10). Estimating (1.8) by OLS gives an estimate of $\pi_0$. Then applying tobit maximum likelihood (ML) to (1.9) with this estimate in place of $\pi_0$ gives an estimate of $\beta_0$. The rank condition on $\pi_0$ ensures that the variables in (1.9), $\pi_0'z_i$, are linearly independent. Alternatively, let $\hat\alpha_{ML}$ denote the tobit ML estimate of $\alpha_0$ in (1.10). Then the estimate of $\beta_0$ is obtained from $\alpha_0 = \pi_0\beta_0$ by solving
$$\min_\beta\,(\hat\alpha - \hat\pi\beta)'\,W\,(\hat\alpha - \hat\pi\beta), \tag{1.11}$$
with $\hat\alpha = \hat\alpha_{ML}$, where $\hat\pi$ is the OLS estimate of $\pi_0$ and $W$ is some positive definite matrix. In this case, the rank condition ensures that a unique solution is obtained. Also, with the appropriate choice of $W$, the second estimator is more efficient than the first [T. Amemiya (1979)].
Unfortunately, the consistency of these estimators is contingent upon the normality and homoskedasticity of the errors, $\varepsilon_i$ and $u_i$. In general, nonnormality or heteroskedasticity of the errors leads to inconsistency of the tobit ML estimator. More robust estimators are needed in these cases, and two classes have been proposed. These differ in their treatment of the unknown density of the errors: They either form estimates of this density or do not explicitly depend on it. The former include the estimators proposed by Duncan (1986) and Horowitz (1988), while the latter include the semiparametric estimators considered by Newey (1985b). Special cases of the latter include symmetrically censored least squares (SCLS) [Powell (1986)] and censored least absolute deviations (CLAD) [Powell (1984)]. None of these estimators require normality of the errors, although the density-based estimators require identically distributed errors and SCLS requires symmetry of the errors. In this paper, the CLAD estimator is used in place of the tobit ML estimator.
The rest of the paper begins, in section 2, with descriptions of the estimators and some useful assumptions. The properties of the estimators are studied in
section 3, while in section 4, some tests for the presence of ME are considered. The paper finishes with concluding comments in section 5. Proofs are given in the Mathematical appendix.
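The reason a median-based estimator survives censoring can be seen in a very small numerical sketch. The data below are hypothetical and chosen only for illustration; the sample size is odd so that the sample median is a single order statistic. Censoring at zero is a nondecreasing transformation, so it commutes with the median but not with the mean.

```python
import numpy as np

# Hypothetical latent outcomes y* (illustrative values, not from the paper).
y_star = np.array([-3.0, -1.0, 0.5, 2.0, 4.0])

# Observed outcomes under censoring at zero, y = max(0, y*).
y = np.maximum(0.0, y_star)

# The median passes through the censoring: med(y) = max(0, med(y*)).
median_latent = np.median(y_star)
median_observed = np.median(y)

# The mean does not: censoring the lower tail pulls the mean upward.
mean_latent = y_star.mean()
mean_observed = y.mean()

print(median_latent, median_observed)
print(mean_latent, mean_observed)
```

In this sketch the two medians coincide while the observed mean exceeds the latent mean, which is the property that CLAD exploits.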
2. Estimators and assumptions
CLAD, like all least absolute deviations estimation, is based on minimizing the sum of absolute deviations of the dependent variable about an expression for the median of the dependent variable, conditional on the independent variables. From (1.4) and (1.9) the observed dependent variable is $y_i$, and following Powell (1984), when the median of $\varepsilon_i + u_i'\beta_0$ given $z_i$ is zero, the median of $y_i$ given $z_i$ is $\max(0, z_i'\pi_0\beta_0)$. The first estimator involves substituting $\hat\pi$ for $\pi_0$ in (1.9) and applying CLAD:
$$\hat\beta_{IV} = \arg\min_\beta \sum_{i=1}^n |y_i - \max(0, z_i'\hat\pi\beta)|. \tag{2.1}$$
The second estimator first applies CLAD to (1.10) to obtain
$$\hat\alpha_{IV} = \arg\min_\alpha \sum_{i=1}^n |y_i - \max(0, z_i'\alpha)|, \tag{2.2}$$
and then solves (1.11) with $\hat\alpha = \hat\alpha_{IV}$ to obtain the estimator of $\beta_0$. This gives
$$\hat\beta_W = (\hat\pi'W\hat\pi)^{-1}\hat\pi'W\hat\alpha_{IV}. \tag{2.3}$$
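As a rough illustration of the second estimator, the sketch below codes the CLAD criterion used in (2.1) and (2.2) and the closed-form minimum distance step (2.3). The function names and all numerical inputs are hypothetical, and the minimization of the CLAD criterion itself (a nonconvex problem) is deliberately left out; only the criterion and the algebra of (2.3) are shown.

```python
import numpy as np

def clad_objective(coef, y, regressors):
    """Sum of absolute deviations of y about max(0, regressors @ coef)."""
    return np.abs(y - np.maximum(0.0, regressors @ coef)).sum()

def minimum_distance_beta(pi_hat, alpha_hat, W):
    """Closed form of (2.3): (pi' W pi)^{-1} pi' W alpha."""
    lhs = pi_hat.T @ W @ pi_hat      # (k x k)
    rhs = pi_hat.T @ W @ alpha_hat   # (k,)
    return np.linalg.solve(lhs, rhs)

# Hypothetical just-identified case (l = k = 2): any positive definite W
# gives the same answer, the solution of pi_hat @ beta = alpha_hat.
pi_sq = np.array([[2.0, 0.0], [1.0, 1.0]])
alpha_sq = pi_sq @ np.array([1.0, -0.5])
W_any = np.array([[2.0, 0.5], [0.5, 1.0]])
beta_ils = minimum_distance_beta(pi_sq, alpha_sq, W_any)

# Hypothetical over-identified case (l = 3, k = 2) with W = I:
# ordinary least squares of alpha_hat on the columns of pi_hat.
pi_tall = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
alpha_tall = np.array([1.0, 2.0, 2.0])
beta_ols = minimum_distance_beta(pi_tall, alpha_tall, np.eye(3))
```

Using `np.linalg.solve` rather than forming the inverse explicitly is a numerical convenience and does not change the estimator.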
Special cases of (2.3) include when $l = k$ and when $W = I_l$, where $I_l$ is the $l$-dimensional identity matrix. In the former, the number of instruments equals the number of variables measured with error and $\hat\beta_W$ reduces to the indirect least squares estimator, $\hat\pi^{-1}\hat\alpha_{IV}$. In the latter, (2.3) gives the OLS estimator, $(\hat\pi'\hat\pi)^{-1}\hat\pi'\hat\alpha_{IV}$. Note also that the estimation of $\pi_0$ by OLS is optimal when the $u_i$ are normally distributed. In other situations, an estimator such as LAD equation by equation could be used in place of $\hat\pi$.
The first assumption ensures that the median of $y_i$ is, in fact, $\max(0, z_i'\pi_0\beta_0)$. Assume that
$$E[\psi(\varepsilon_i + u_i'\beta_0)|z_i] = 0, \tag{2.4}$$
where $\psi(x) = \mathrm{sign}(x)$. From Powell (1984) the scores for CLAD in (2.2) are given by $\sum_{i=1}^n h_i(\alpha)$, where
$$h_i(\alpha) = 1(z_i'\alpha > 0)\,z_i\,\psi(y_i - z_i'\alpha). \tag{2.5}$$
Hence, under (2.4), $E[h_i(\alpha_0)] = 0$ and it may be expected that $\hat\alpha_{IV}$ is consistent. Eq. (2.5) also shows why an estimator of the form of (1.6) does not lead to a consistent estimator. This would solve $\sum_{i=1}^n 1(z_i'\hat\pi\beta > 0)\,\psi(y_i - \tilde{x}_i'\beta)\,z_i = 0$. But any value of $\beta$ giving $1(z_i'\hat\pi\beta > 0) = 0$ for all $i$ satisfies these conditions.
Next, following Powell (1984), for the consistency and asymptotic normality of the CLAD estimator in (2.2), conditions are imposed on the density of the errors and the behavior of the regressors. Let $f_i(\cdot|z)$ denote a conditional density of $\varepsilon_i + u_i'\beta_0$ given $z_i$. Assume that, uniformly in $i$, $f_i(\cdot|z)$ is bounded from above, bounded from below at zero by a positive number, and Lipschitz continuous. Lipschitz continuity implies that for any $\lambda_1$, $\lambda_2$, $|f_i(\lambda_1|z) - f_i(\lambda_2|z)| \le L|\lambda_1 - \lambda_2|$, where $L$ is a constant that does not depend upon $i$. Next, define
$$A_1(\zeta) = \lim_{n\to\infty} 2n^{-1}\sum_{i=1}^n E[1(z_i'\pi_0\beta_0 > \zeta)\,f_i(0|z)\,z_iz_i'], \tag{2.6}$$
and let $A_1 = A_1(0)$. Assume that for some $\zeta > 0$, $A_1(\zeta)$ is positive definite. This implies that $A_1$ is also positive definite. The last condition for CLAD is a technical condition on the behavior of $z_i'\pi\beta$ for $\pi\beta$ near $\pi_0\beta_0$. Assume that there exists $\zeta_0 > 0$ such that for each $i$ and $0 \le \zeta < \zeta_0$, $E[1(|z_i'\pi\beta| \le \|z_i\|\zeta)\,\|z_i\|^r] \le K\zeta$ for $r = 0, 1, 2$, $\|\pi - \pi_0\| < \zeta_0$, $\|\beta - \beta_0\| < \zeta_0$, and some constant $K > 0$ that does not depend on $i$.
Consider now the estimation of $\pi_0$. From (1.8),
$$\mathrm{vec}(\hat\pi - \pi_0) = (I_k \otimes (Z'Z)^{-1})\sum_{i=1}^n (I_k \otimes z_i)(u_i + \eta_i), \tag{2.7}$$
where $Z' = (z_1, \ldots, z_n)$. Assume that $E(u_i) = E(\eta_i) = 0$ and that $u_i$ and $\eta_i$ are independent of $z_i$. Under the standard conditions,
$$n^{1/2}\,\mathrm{vec}(\hat\pi - \pi_0) \xrightarrow{d} N(0, V_{u+\eta} \otimes A_z^{-1}), \tag{2.8}$$
where $V_{u+\eta} = E[(u_i + \eta_i)(u_i + \eta_i)']$, $i = 1, 2, \ldots, n$, $A_z = \mathrm{plim}_{n\to\infty}(Z'Z/n)$, and $V_{u+\eta}$ and $A_z$ are assumed to be positive definite. The extension to allowing for heteroskedasticity in $u_i$ and $\eta_i$ is straightforward [e.g., White (1980)].
The last set of terms are those appearing in the estimation of $\beta_0$ in (2.1). Based on the score vectors for $\beta_0$ and $\pi_0$, define
$$b_i = \begin{pmatrix} h_i(\alpha_0) \\ (I_k \otimes z_i)(u_i + \eta_i) \end{pmatrix}. \tag{2.9}$$
Let $B_1 = \lim_{n\to\infty} \mathrm{var}(n^{-1/2}\sum_{i=1}^n b_i)$ and assume that $B_1$ is positive definite. $B_1$ is given by
(2.10)
Other conditions needed for the derivations of the properties of the estimators are left implicit, although many can be inferred from the discussion. For example, the required moments of the various random variables are assumed to exist and, to simplify the notation, various averages of expectations are assumed to converge to finite limits. Also, certain matrices are assumed to be positive definite and the vector of true parameters is assumed to be interior to a compact parameter space. The different random variables are assumed to have densities with respect to Lebesgue measure, and these densities satisfy dominance and differentiability conditions similar to those given in assumptions 4 and 5 of Newey (1985a). Other assumptions needed in particular estimation methods and tests will be introduced later.
3. Properties of the estimators
The properties of $\hat\beta_{IV}$ are considered first. $\hat\beta_{IV}$ and $\hat\beta_W$ are compared during the subsequent discussion of $\hat\alpha_{IV}$ and $\hat\beta_W$.
Expanding the expression for $V_{IV}$ gives
where the $B_{ij}$ are the blocks of $B_1$ corresponding to the partition in (2.9). The first term in $V_{IV}$ is simply the covariance matrix for the CLAD estimator of $\beta_0$ assuming that $\pi_0$ is known. The other terms are therefore due to the estimation of $\pi_0$. Notice also that if $B_{12} = 0$, i.e., the first-order conditions for CLAD are orthogonal to those for the estimation of $\pi_0$, then $V_{IV}$ simplifies to only two terms. If the errors are normally distributed, this type of orthogonality can be obtained by adding the conditional expectation of $\varepsilon_i + u_i'\beta_0$ given $u_i + \eta_i$ to the right-hand side of (1.9), and implies that the ML estimator is efficient [e.g., Newey (1987b)]. However, it seems unlikely to occur other than by such a construction.
Hypothesis testing based on $\hat\beta_{IV}$ requires an estimate of $V_{IV}$ and, despite their complexity, each of the terms in $V_{IV}$ is readily estimated. For example,
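(3.4)
and
(3.5)
where $\hat V_{u+\eta} = n^{-1}\sum_{i=1}^n \hat v_i\hat v_i'$ and $\hat v_i = \tilde{x}_i - \hat\pi'z_i$; and following Powell (1984), $A_1$ may be estimated by the kernel estimator,
$$\hat A_1 = 2(\hat k_n n)^{-1}\sum_{i=1}^n 1(z_i'\hat\pi\hat\beta_{IV} > 0)\,1(0 < \hat u_i < \hat k_n)\,z_iz_i', \tag{3.6}$$
where $\hat u_i = y_i - z_i'\hat\pi\hat\beta_{IV}$, $\hat k_n = k_0 n^{-0.2}\,\mathrm{median}(\hat u_i : \hat u_i > 0,\ z_i'\hat\pi\hat\beta_{IV} > 0)$, and $k_0$ is a suitable constant.
Consider now the estimation of $\alpha_0$. It is clear from (2.1) and (2.2) that the properties of $\hat\alpha_{IV}$ are analogous to those of $\hat\beta_{IV}$ when the estimation of $\pi_0$ is ignored and $z_i$ replaces $\pi_0'z_i$ as the vector of right-hand side variables:
Theorem 2 (properties of $\hat\alpha_{IV}$).
$$n^{1/2}(\hat\alpha_{IV} - \alpha_0) \xrightarrow{d} N(0, A_1^{-1}B_{11}A_1^{-1}). \tag{3.8}$$
$$n^{1/2}(\hat\alpha_{IV} - \hat\pi\beta_0) = [A_1^{-1} : -\beta_0' \otimes A_z^{-1}]\,n^{-1/2}\sum_{i=1}^n b_i + o_p(1). \tag{3.11}$$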
The asymptotic distribution of $\hat\beta_W$ follows from (3.9) and (3.11). The most interesting case is when $W = \Sigma^{-1}$, where $\Sigma$ is the asymptotic covariance matrix of $n^{1/2}(\hat\alpha_{IV} - \hat\pi\beta_0)$. That is, $\Sigma = A_1^{-1}B_2A_1^{-1}$, where $B_2$ was defined in Theorem 1. Let $\hat\Sigma$ be an estimator of $\Sigma$ and let $\hat\beta_\Sigma = (\hat\pi'\hat\Sigma^{-1}\hat\pi)^{-1}\hat\pi'\hat\Sigma^{-1}\hat\alpha_{IV}$. As in T. Amemiya (1978, 1979) and Newey (1985b), $\hat\beta_\Sigma$ minimizes the asymptotic covariance of $\hat\beta_W$ [see (1.11)]. Since (3.9) and (3.10) imply that $\hat\beta_{IV}$ is also in the class $\hat\beta_W$, with $W = A_1$, it follows that $\hat\beta_\Sigma$ is asymptotically efficient relative to $\hat\beta_{IV}$.
$$n^{1/2}(\hat\beta_\Sigma - \beta_0) \xrightarrow{d} N(0, (\pi_0'\Sigma^{-1}\pi_0)^{-1}). \tag{3.13}$$
Presumably, since the asymptotic covariance matrix of $\hat\pi$ positively affects $B_2$ and hence $\Sigma$, an efficient estimator of $\pi$ would be useful. Note also that since $B_2$ depends on $\beta_0$, a preliminary estimator of $\beta_0$, such as the OLS estimator, $(\hat\pi'\hat\pi)^{-1}\hat\pi'\hat\alpha_{IV}$, is needed in the estimation of $\Sigma$.
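A kernel estimator of the type used for $A_1$ in this section can be sketched in a few lines. The code below is a minimal sketch under stated assumptions: it follows the uniform-kernel form attributed to Powell (1984), with a bandwidth equal to a constant times $n^{-0.2}$ times the median of the positive residuals on the uncensored region. The function name `kernel_A1`, the argument `index_coef` (standing in for whatever coefficient vector defines the fitted index, for example $\hat\pi\hat\beta_{IV}$), and the data are all hypothetical.

```python
import numpy as np

def kernel_A1(y, Z, index_coef, k0=1.0):
    """Uniform-kernel estimate 2/(k_n n) * sum 1(index>0) 1(0<u<k_n) z z'."""
    n = len(y)
    index = Z @ index_coef           # fitted index z_i' * coefficient
    resid = y - index                # residuals u_i
    keep = (resid > 0) & (index > 0)
    if not keep.any():
        raise ValueError("no positive residuals with a positive index")
    # Bandwidth: k0 * n^(-0.2) * median of the retained residuals.
    bandwidth = k0 * n ** (-0.2) * np.median(resid[keep])
    weight = (keep & (resid < bandwidth)).astype(float)
    return 2.0 / (bandwidth * n) * (Z * weight[:, None]).T @ Z

# Hypothetical data: intercept plus one instrument.
Z_demo = np.column_stack([np.ones(5), np.arange(1.0, 6.0)])
y_demo = np.array([1.1, 2.5, 2.0, 4.2, 6.0])
A1_hat = kernel_A1(y_demo, Z_demo, np.array([0.0, 1.0]))
```

The constant `k0` plays the role of the "suitable constant" in the text; its choice is left to the user and is not addressed here.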
4. Testing for ME
Recall that the endogeneity of $\tilde{x}_i$ arose from correlation between the errors in eqs. (1.3) and (1.8). This resulted from the presence of $\eta_i$ in both error terms, although clearly, it will also result from correlation between $\varepsilon_i$ and $u_i$. The latter is usually associated with endogeneity of the explanatory variables [e.g., T. Amemiya (1979)]. In the absence of both sources of correlation, $\tilde{x}_i$ is exogenous, and the model becomes
$$y_i^* = \tilde{x}_i'\beta_0 + \varepsilon_i, \tag{4.1}$$
and
$$\tilde{x}_i = \pi_0'z_i + u_i. \tag{4.2}$$
If $E[\psi(\varepsilon_i)|\tilde{x}_i] = 0$ and conditional densities of the $\varepsilon_i$ satisfy continuity and boundedness conditions like those for $\varepsilon_i + u_i'\beta_0$ above, then $\beta_0$ may be consistently estimated by CLAD of $y_i$ on $\tilde{x}_i$ and IV estimation is unnecessary. Let $\hat\beta_{CLAD}$ denote the CLAD estimator. From Powell (1984),
$$n^{1/2}(\hat\beta_{CLAD} - \beta_0) \xrightarrow{d} N(0, A_{CLAD}^{-1}B_{CLAD}A_{CLAD}^{-1}), \tag{4.3}$$
where
$$A_{CLAD} = \lim_{n\to\infty} 2n^{-1}\sum_{i=1}^n E[1(\tilde{x}_i'\beta_0 > 0)\,f_{\varepsilon i}(0|\tilde{x})\,\tilde{x}_i\tilde{x}_i'], \tag{4.4}$$
$$B_{CLAD} = \lim_{n\to\infty} n^{-1}\sum_{i=1}^n E[1(\tilde{x}_i'\beta_0 > 0)\,\tilde{x}_i\tilde{x}_i'], \tag{4.5}$$
and $f_{\varepsilon i}(\cdot|\tilde{x})$ is a conditional density of $\varepsilon_i$ given $\tilde{x}_i$. The estimator of $A_{CLAD}$ is equivalent to $\hat A_1$ in eq. (3.6).
Now, many forms of misspecification can lead to $E[\psi(\varepsilon_i)|\tilde{x}_i] \neq 0$. The focus here is on the problem of testing for ME and correlation between $\varepsilon_i$ and $u_i$, and two approaches are considered. First, following Smith and Blundell (1986) and Newey (1987a), for example, the ME and correlation between $\varepsilon_i$ and $u_i$ can be tested by adding the residuals from the estimation of (4.2) to the right-hand side of (4.1), applying CLAD to (4.1) and testing whether or not the coefficients on the residuals are jointly equal to zero. [Of course, if some elements of $\tilde{x}_i$ serve as their own instruments, the corresponding residuals are not included in (4.2).] Second, the consistency of $\hat\beta_{CLAD}$ can be tested by the Hausman test [Hausman (1978)] comparing $\hat\beta_{CLAD}$ and $\hat\beta_\Sigma$. $\hat\beta_\Sigma$ is consistent even if ME or correlation between $\varepsilon_i$ and $u_i$ imply that $E[\psi(\varepsilon_i)|\tilde{x}_i] \neq 0$. In this case, $(\hat\beta_{CLAD} - \hat\beta_\Sigma)$ is likely to be large. $\hat\beta_\Sigma$ is used rather than $\hat\beta_{IV}$ because it is asymptotically more efficient.
Before considering the tests in detail, it is important to note that because the ME and correlation between $\varepsilon_i$ and $u_i$ are generally indistinguishable, they cannot be tested separately. On the other hand, if $\varepsilon_i$ and $u_i$ are independent, the tests are specifically for ME. Another special case is when both are present but $E[\psi(\varepsilon_i)|\tilde{x}_i] = 0$ still holds. In this case, the consistency of $\hat\beta_{CLAD}$ is not affected.
The first test is based on the estimation problem
$$\min_{\beta,\rho}\sum_{i=1}^n |y_i - \max(0, \tilde{x}_i'\beta + \hat u_i'\rho)|, \tag{4.6}$$
where $\hat u_i = \tilde{x}_i - \hat\pi'z_i$, and the test statistic
$$\hat\xi_\rho = n\hat\rho'\hat C_\rho^{-1}\hat\rho, \tag{4.7}$$
where $\hat\rho$ is the estimated coefficient on $\hat u_i$ in (4.6) and $\hat C_\rho$ is an estimate of $C_\rho$, the asymptotic covariance matrix of $\hat\rho$. $C_\rho$ is given by the block corresponding to $\hat u_i$ of the matrix $A_3^{-1}B_3A_3^{-1}$, where
$$A_3 = \lim_{n\to\infty} 2n^{-1}\sum_{i=1}^n E[1(\tilde{x}_i'\beta_0 > 0)\,f_{\varepsilon i}(0|\tilde{x}, u)\,(\tilde{x}_i', u_i')'(\tilde{x}_i', u_i')], \tag{4.8}$$
$$B_3 = \lim_{n\to\infty} n^{-1}\sum_{i=1}^n E[1(\tilde{x}_i'\beta_0 > 0)\,(\tilde{x}_i', u_i')'(\tilde{x}_i', u_i')], \tag{4.9}$$
and $f_{\varepsilon i}(\cdot|\tilde{x}, u)$ is a conditional density of $\varepsilon_i$ given $\tilde{x}_i$ and $u_i$. Assume that $\tilde{x}_i$, $u_i$, and $\varepsilon_i$ satisfy sufficient conditions for the consistency and asymptotic normality of the CLAD estimation in (4.6). In particular, $E[\psi(\varepsilon_i)|\tilde{x}_i, u_i] = 0$.
Theorem 5 (test for correlation).
$$\hat\xi_\rho \xrightarrow{d} \chi_k^2. \tag{4.10}$$
The test statistic not exceeding the critical value from the $\chi_k^2$ is taken as evidence that $\hat u_i$ can be omitted from (4.6), resulting in CLAD of $y_i$ on $\tilde{x}_i$.
Further insight is obtained by deriving a Lagrange multiplier (LM) form of the test. Following Weiss (1991), this is based on an asymptotic linearization of the gradient from (4.6). Let $u_i(\pi) = \tilde{x}_i - \pi'z_i$ and $\rho_0 = 0$.
Lemma 6 (asymptotic linearization of the gradient). Under the same conditions as Theorem 5, for any $M > 0$,
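Next, from (4.3), $n^{1/2}(\hat\beta_{CLAD} - \beta_0) = O_p(1)$; from Powell (1984, eq. A.13), $n^{-1/2}\sum_{i=1}^n 1(\tilde{x}_i'\hat\beta_{CLAD} > 0)\,\tilde{x}_i\,\psi(y_i - \tilde{x}_i'\hat\beta_{CLAD}) = o_p(1)$; and from (2.8), $n^{1/2}\mathrm{vec}(\hat\pi - \pi_0) = O_p(1)$. Hence, setting $\rho = 0$ in (4.11) implies that
$$n^{-1/2}\sum_{i=1}^n 1(\tilde{x}_i'\hat\beta_{CLAD} > 0)\,\hat u_i\,\psi(y_i - \tilde{x}_i'\hat\beta_{CLAD})$$
$$= (RA_3^{-1}R')^{-1}RA_3^{-1}\,n^{-1/2}\sum_{i=1}^n 1(\tilde{x}_i'\beta_0 > 0)\begin{pmatrix}\tilde{x}_i \\ u_i\end{pmatrix}\psi(\varepsilon_i) + o_p(1)$$
$$\xrightarrow{d} N\big(0, (RA_3^{-1}R')^{-1}RA_3^{-1}B_3A_3^{-1}R'(RA_3^{-1}R')^{-1}\big), \tag{4.12}$$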
where $R = (0 : I_k)$. This gives the form of the LM test. Under ME alternatives, $\varepsilon_i$ and $u_i$ are contaminated by the ME and $(\tilde{x}_i', u_i')'$ and $\psi(\varepsilon_i)$ are correlated. Also, the LM test statistic simplifies greatly if $f_{\varepsilon i}(0|\tilde{x}, u) = f_\varepsilon(0|\tilde{x}, u)$ for all $i$: The covariance matrix in (4.12) reduces to $(RB_3^{-1}R')^{-1}$ and the LM test statistic can be written as $n$ times the $R^2$ from the regression of $\psi(y_i - \tilde{x}_i'\hat\beta_{CLAD})$ on $1(\tilde{x}_i'\hat\beta_{CLAD} > 0)(\tilde{x}_i', \hat u_i')'$. Moreover, the estimation of $f_\varepsilon(0|\tilde{x}, u)$ is not required.
The asymptotic equivalence of the two versions of the test also follows from (4.11). The first term in (4.11) is $o_p(1)$ when evaluated at the CLAD estimates from (4.6) and hence, using (4.12),
$$n^{1/2}\hat\rho = RA_3^{-1}R'\,n^{-1/2}\sum_{i=1}^n 1(\tilde{x}_i'\hat\beta_{CLAD} > 0)\,\hat u_i\,\psi(y_i - \tilde{x}_i'\hat\beta_{CLAD}) + o_p(1). \tag{4.13}$$
The difference between the tests is that the LM test uses estimates of $A_3$ and $B_3$ based on CLAD of $y_i$ on $\tilde{x}_i$, whereas in (4.7) these estimates are based on CLAD of $y_i$ on $\tilde{x}_i$ and $\hat u_i$.
The form of the Hausman test is obtained from a linearization of the gradient for $\hat\beta_{CLAD}$:
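Lemma 7 (asymptotic linearization of the gradient). Under the same conditions as (4.3), for any $M > 0$,
$$\sup\Big\| n^{-1/2}\sum_{i=1}^n 1(\tilde{x}_i'\beta > 0)\,\tilde{x}_i\,\psi(y_i - \tilde{x}_i'\beta) - n^{-1/2}\sum_{i=1}^n 1(\tilde{x}_i'\beta_0 > 0)\,\tilde{x}_i\,\psi(y_i - \tilde{x}_i'\beta_0) + A_{CLAD}\,n^{1/2}(\beta - \beta_0)\Big\| = o_p(1), \tag{4.14}$$
where the supremum is over $n^{1/2}\|\beta - \beta_0\| \le M$.
The next result gives the Hausman test:
(4.15)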
(4.16)
Hence, the test is sensitive to large values of $n^{-1/2}\sum_{i=1}^n 1(\tilde{x}_i'\hat\beta_\Sigma > 0)\,\tilde{x}_i\,\psi(y_i - \tilde{x}_i'\hat\beta_\Sigma)$. As this is simply the vector of first-order conditions from CLAD of $y_i$ on $\tilde{x}_i$, evaluated at $\hat\beta_\Sigma$, the test will generally have power if $E[\psi(\varepsilon_i)|\tilde{x}_i] \neq 0$. Note also that $(\hat\beta_{CLAD} - \hat\beta_\Sigma)$ can be identically equal to zero: The first-order conditions for CLAD are not continuous and $\hat\beta_\Sigma$ can be another solution to the CLAD problem of $y_i$ on $\tilde{x}_i$. In large samples, this would occur with probability approaching zero when the ME or correlation imply that $\hat\beta_{CLAD}$ is inconsistent for $\beta_0$.
Now, because of the complexity of the covariance matrices in the test statistics, a direct comparison of the properties of the LM and Hausman tests is not undertaken. On the other hand, it is straightforward to compare the 'numerators' in the test statistics, $n^{-1/2}\sum_{i=1}^n 1(\tilde{x}_i'\hat\beta_{CLAD} > 0)\,\hat u_i\,\psi(y_i - \tilde{x}_i'\hat\beta_{CLAD})$ and $n^{1/2}(\hat\beta_{CLAD} - \hat\beta_\Sigma)$ or, equivalently, $n^{-1/2}\sum_{i=1}^n 1(\tilde{x}_i'\hat\beta_\Sigma > 0)\,\tilde{x}_i\,\psi(y_i - \tilde{x}_i'\hat\beta_\Sigma)$. The former is represented in (4.12) and from (4.14) the latter can be written as
$$n^{-1/2}\sum_{i=1}^n 1(\tilde{x}_i'\beta_0 > 0)\,\tilde{x}_i\,\psi(y_i - \tilde{x}_i'\beta_0) - A_{CLAD}\,n^{1/2}(\hat\beta_\Sigma - \beta_0) + o_p(1). \tag{4.17}$$
In (4.17), only the second term on the right-hand side is sensitive to ME. The others are sensitive to correlation between the IV's and the errors and, hence, test the validity of the instruments. If $z_i$ and $\varepsilon_i$ are correlated, then $z_i$ is endogenous and (1.7) with this choice of $z_i$ is not the true reduced form for $x_i$. This means that for (local) ME alternatives in which the instruments are valid, only the second term in (4.17) will contribute to the mean of the numerator in the test statistic. Hence, since all three terms will contribute to the denominator variance, the LM test should be more powerful.
One strategy is to use the two tests together to determine which estimation technique is appropriate. Irrespective of the outcome in the Hausman test, the LM test statistic exceeding its critical value seems to rule out the use of CLAD. If the Hausman test does not exceed its critical value, the conclusions are that the Hausman test lacks power against the ME or correlation between $\varepsilon_i$ and $u_i$, that the instruments are valid, and that IV is the appropriate estimation technique. If the Hausman test does exceed its critical value, the next step is to test for correlation between $z_i$ and $\varepsilon_i$. If this is present, the reduced form should be respecified. The LM test statistic not exceeding its critical value suggests that CLAD can be used, although if the Hausman test statistic exceeds its critical value, it is prudent to investigate the reduced form to ensure that the Hausman test statistic is large due to incorrectly chosen instruments, rather than ME or correlation between $\varepsilon_i$ and $u_i$.
Finally, the correlation between $z_i$ and $\varepsilon_i$ may be tested using the first-order conditions for $\hat\beta_{IV}$ from (2.1). In particular, since $n^{-1/2}\sum_{i=1}^n \hat\pi'h_i(\hat\pi\hat\beta_{IV}) = o_p(1)$, the estimator $\hat\beta_{IV}$ uses $k$ linear combinations of the $l$ moment conditions $E[h_i(\alpha_0)] = 0$. Provided $l > k$, other linear combinations can be used to test the model. Let $L$ be an ($s \times l$) constant matrix with rank $s$.
Theorem 9 (test of moment conditions). If $l > k$, then
(4.18)
See Newey (1985a, lemma 3) for one choice of g-inverse. If $l = k$ (or otherwise), other moment conditions implied by (2.4), such as $E[1(z_i'\alpha_0 > 0)\,z_i^{(2)}\,\psi(y_i - z_i'\alpha_0)] = 0$, where $z_i^{(2)}$ contains squared elements of $z_i$, could be used to test the model.
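The simplified LM statistic described in this section, $n$ times the $R^2$ from regressing $\psi(y_i - \tilde{x}_i'\hat\beta_{CLAD})$ on $1(\tilde{x}_i'\hat\beta_{CLAD} > 0)(\tilde{x}_i', \hat u_i')'$, can be sketched as follows. This is an illustration under assumptions rather than the paper's own code: the $R^2$ is taken to be the uncentered one, the CLAD estimate `beta_clad` and the reduced-form residuals `U_hat` are assumed to have been computed elsewhere, and the function name and data are hypothetical.

```python
import numpy as np

def lm_statistic(y, X, U_hat, beta_clad):
    """n times the (assumed uncentered) R^2 of psi on the trimmed regressors."""
    index = X @ beta_clad
    psi = np.sign(y - index)                      # psi(y_i - x_i' beta)
    active = (index > 0).astype(float)            # 1(x_i' beta > 0)
    regressors = np.hstack([X, U_hat]) * active[:, None]
    total = psi @ psi
    if total == 0:
        raise ValueError("all residual signs are zero")
    coef, *_ = np.linalg.lstsq(regressors, psi, rcond=None)
    fitted = regressors @ coef
    return len(y) * (fitted @ fitted) / total

# Hypothetical data in which psi lies in the column space of the regressors,
# so the uncentered R^2 is one and the statistic equals n.
X_demo = np.column_stack([np.ones(4), np.arange(1.0, 5.0)])
U_demo = np.array([[0.5], [-0.5], [0.2], [-0.2]])
y_lm = np.array([2.0, 3.0, 4.0, 5.0])
stat_demo = lm_statistic(y_lm, X_demo, U_demo, np.array([1.0, 0.0]))
```

Under the conditions stated in the text, the statistic would be compared with a chi-squared critical value whose degrees of freedom equal the number of columns of `U_hat`.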
5. Concluding comments
I have discussed one approach to the estimation and testing of a CRM in which the explanatory variables are measured with error. Because of the inherent nonlinearities in the model, no simple solution seems to be available. I have taken an instrumental variables approach, although this requires the assumption that the instruments are related to the variables measured with error via a reduced form equation. This is stronger than the equivalent assumption in the linear model, where the instruments need only be uncorrelated with the errors. Unfortunately, the nonlinearities also mean that the asymptotic covariance matrix of the IV estimator has a complex form. This complicates the forms of the test statistics, although with some care the various terms can still be estimated.
Next, I stress that if the regularity conditions for estimators such as ML or SCLS are satisfied, they should be more efficient than CLAD and hence should be used. On the other hand, if the conditions are not met, they may be inconsistent. Therefore, relative to these other estimators, CLAD represents an estimator of last resort. Whether or not conditions such as normality or symmetry are satisfied is an empirical question, and tests for these are described in T. Amemiya (1984) and Newey (1987a), for example.
Finally, I mention another aspect of ME in the CRM: ME in the left-hand side variable. Stapleton and Young (1984) have considered this, and in particular the case when the observations are correctly classified as censored or not, but the value of the dependent variable is observed with error. With normally distributed equation errors $\varepsilon_i$, the usual ML estimator is inconsistent. But when the ME has mean zero, (1.5) is not affected and NLLS is consistent. Of course, NLLS will be inconsistent if the equation errors are not normal. Similarly, the SCLS estimator will be inconsistent if the errors are not symmetrically distributed. This would seem likely in most cases since the observed values of the dependent variable must be positive. The LAD framework requires that the median of $y_i$ is $\max(0, x_i'\beta_0)$. Again, this requires strong conditions on the distributions of the ME's, suggesting that CLAD will also be inconsistent in most cases. Studying the estimation of this model is a topic for future research.
Mathematical appendix
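Proof of Theorem 1
(i) Weak consistency. Let
$$S_n(\pi, \beta) = n^{-1}\sum_{i=1}^n \big(|y_i - \max(0, z_i'\pi\beta)| - |y_i - \max(0, z_i'\pi_0\beta_0)|\big).$$
Then
$$|S_n(\hat\pi, \beta) - S_n(\pi_0, \beta)| \le n^{-1}\sum_{i=1}^n |\max(0, z_i'\hat\pi\beta) - \max(0, z_i'\pi_0\beta)| \xrightarrow{p} 0$$
uniformly in $\beta$. Defining $\beta_n^* = \arg\min_\beta S_n(\pi_0, \beta)$, this implies that $\beta_n^* - \hat\beta_{IV} \xrightarrow{p} 0$. The consistency of $\beta_n^*$ for $\beta_0$ follows from Powell (1984, thm. 1).
(ii) Asymptotic normality. Appending the first-order conditions for $\hat\pi$ to those for $\hat\beta_{IV}$ and following the approach in Powell (1984, eq. A.16) gives
$$\begin{pmatrix} n^{-1}\sum \pi'E[h_i(\pi\beta)] \\ n^{-1}\sum E[(I_k \otimes z_i)(\tilde{x}_i - \pi'z_i)] \end{pmatrix} = -\Lambda_n(\beta, \pi)\begin{pmatrix} \beta - \beta_0 \\ \mathrm{vec}(\pi - \pi_0) \end{pmatrix}, \tag{A.1}$$
where $\Lambda_n(\beta, \pi) = (\Lambda_n(\beta, \pi)_{ij})$,
$$\Lambda_n(\beta, \pi)_{11} = 2n^{-1}\sum \pi'E[1(z_i'\pi\beta > 0)\,z_i\,f_i(\tau_i\lambda_i|z_i)\,z_i']\pi,$$
$$\Lambda_n(\beta, \pi)_{12} = 2n^{-1}\sum \pi'E[1(z_i'\pi\beta > 0)\,z_i\,f_i(\tau_i\lambda_i|z_i)\,(\beta_0' \otimes z_i')],$$
$\lambda_i = z_i'(\pi\beta - \pi_0\beta_0)$, and the $\tau_i$ lie between zero and one. Following Powell (1984, eq. A.18), $\Lambda_n(\beta, \pi)$ is within $O(1)\,\|((\beta - \beta_0)', (\mathrm{vec}(\pi - \pi_0))')\|$ of a positive definite matrix, and the left-hand side of (A.1) satisfies condition (N.3.i) of Powell (1984)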
or Huber (1967). The remaining conditions in Huber (1967) are also satisfied, and therefore,
(A.2)
But from (A.1),
Hence, (A.2) may be rewritten as
(A.3)
But $\Lambda_n(\hat\beta_{IV}, \hat\pi)_{11} \xrightarrow{p} \pi_0'A_1\pi_0$ and $\Lambda_n(\hat\beta_{IV}, \hat\pi)_{12} \xrightarrow{p} \pi_0'A_1(\beta_0' \otimes I_l)$. Hence, the right-hand side of (A.3) is within $o_p(1)$ of $\pi_0'[I_l : -A_1(\beta_0' \otimes A_z^{-1})][n^{-1/2}\sum b_i]$. Applying the CLT to $n^{-1/2}\sum b_i$ completes the proof.
Proof of Theorem 2
Immediate from Powell (1984).
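$$n^{1/2}(\hat\alpha_{IV} - \hat\pi\beta_0) = n^{1/2}(\hat\alpha_{IV} - \alpha_0) - n^{1/2}(\hat\pi - \pi_0)\beta_0$$
$$= A_1^{-1}n^{-1/2}\sum h_i(\pi_0\beta_0) - (\beta_0' \otimes I_l)\,n^{1/2}\mathrm{vec}(\hat\pi - \pi_0) + o_p(1)$$
$$= A_1^{-1}[I_l : -A_1(\beta_0' \otimes A_z^{-1})]\,n^{-1/2}\sum b_i + o_p(1).$$
The first part of the second equality follows because $n^{-1/2}\sum h_i(\alpha)$ satisfies the following asymptotic linearization result: For any $M > 0$,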
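The proof of this is equivalent to that of lemma 7 of Weiss (1991). Use lemma 3 of Huber (1967). Next, subtracting $(\hat\pi'W\hat\pi)^{-1}\hat\pi'W\hat\pi\beta_0$ from both sides of (2.3) implies that
$$n^{1/2}(\hat\beta_W - \beta_0) = (\hat\pi'W\hat\pi)^{-1}\hat\pi'W\,n^{1/2}(\hat\alpha_{IV} - \hat\pi\beta_0)$$
$$= (\pi_0'W\pi_0)^{-1}\pi_0'W\,n^{1/2}(\hat\alpha_{IV} - \hat\pi\beta_0) + o_p(1),$$
since $n^{1/2}(\hat\alpha_{IV} - \hat\pi\beta_0)$ is $O_p(1)$. Next, from eq. (A.4)
$$\hat\pi'n^{-1/2}\sum h_i(\hat\pi\hat\beta_{IV}) = \hat\pi'n^{-1/2}\sum h_i(\pi_0\beta_0) - \hat\pi'A_1\hat\pi\,n^{1/2}(\hat\beta_{IV} - \beta_0) - \hat\pi'A_1(\beta_0' \otimes I_l)\,n^{1/2}\mathrm{vec}(\hat\pi - \pi_0) + o_p(1), \tag{A.5}$$
or, since $\hat\pi'n^{-1/2}\sum h_i(\hat\pi\hat\beta_{IV}) = o_p(1)$:
$$n^{1/2}(\hat\beta_{IV} - \beta_0) = (\pi_0'A_1\pi_0)^{-1}\pi_0'n^{-1/2}\sum_{i=1}^n h_i(\pi_0\beta_0) - (\pi_0'A_1\pi_0)^{-1}\pi_0'A_1(\beta_0' \otimes I_l)\,n^{1/2}\mathrm{vec}(\hat\pi - \pi_0) + o_p(1)$$
$$= (\pi_0'A_1\pi_0)^{-1}\pi_0'A_1\Big\{A_1^{-1}n^{-1/2}\sum_{i=1}^n h_i(\pi_0\beta_0) - (\beta_0' \otimes I_l)\,n^{1/2}\mathrm{vec}(\hat\pi - \pi_0)\Big\} + o_p(1)$$
$$= (\pi_0'A_1\pi_0)^{-1}\pi_0'A_1\,n^{1/2}(\hat\alpha_{IV} - \hat\pi\beta_0) + o_p(1).$$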
Proof of Theorem 4
Immediate from Theorem 2, Lemma 3, and the consistency of the obvious estimator of $\Sigma$.
Proof of Theorem 5
First, as in the proof of Theorem 1,
$$\Big|n^{-1}\sum_{i=1}^n |y_i - \max(0, \tilde{x}_i'\beta + \hat u_i'\rho)| - n^{-1}\sum_{i=1}^n |y_i - \max(0, \tilde{x}_i'\beta + u_i'\rho)|\Big| \le \|\rho\|\,n^{-1}\sum_{i=1}^n \|\hat u_i - u_i\| \xrightarrow{p} 0$$
uniformly in $\beta$, $\rho$. Hence,
uniformly in $\beta$, $\rho$; and following Powell (1984, thm. 1), the second term (minus $n^{-1}\sum_{i=1}^n E[|y_i - \max(0, \tilde{x}_i'\beta_0 + u_i'\rho_0)|]$) is uniquely minimized at $\beta = \beta_0$ and $\rho = 0$, and $\hat\rho \xrightarrow{p} 0$. Next, appending the first-order conditions for $\hat\pi$ to those from (4.6) gives
$\Lambda_n(\beta, \rho, \pi)_{31} = \Lambda_n(\beta, \rho, \pi)_{32} = 0$, $\Lambda_n(\beta, \rho, \pi)_{33} = I_k \otimes A_z$, $\lambda_i = \tilde{x}_i'(\beta - \beta_0) + u_i(\pi_0)'\rho + (u_i(\pi) - u_i(\pi_0))'\rho$, and the $\tau_i$ lie between zero and one. Following the proof of Theorem 1 and noting that $\rho_0 = 0$ gives
where $\hat\beta_0$ is the estimate of $\beta_0$ from (4.6). The result follows.
Proof of Lemma 6
See the proof of eq. (A.4) in Theorem 3.
Proof of Lemma 7
See the proof of eq. (A.4) in Theorem 3.
Proof of Theorem 8
Write $n^{1/2}(\hat\beta_{CLAD} - \hat\beta_\Sigma) = n^{1/2}(\hat\beta_{CLAD} - \beta_0) - n^{1/2}(\hat\beta_\Sigma - \beta_0)$. But from (4.14) evaluated at $\hat\beta_{CLAD}$,
Then (3.9) with $W = \Sigma^{-1}$ implies the result.
Proof of Theorem 9
As in (A.5),
$$n^{-1/2}\sum h_i(\hat\pi\hat\beta_{IV}) = n^{-1/2}\sum h_i(\pi_0\beta_0) - A_1\hat\pi\,n^{1/2}(\hat\beta_{IV} - \beta_0) - A_1(\beta_0' \otimes I_l)\,n^{1/2}\mathrm{vec}(\hat\pi - \pi_0) + o_p(1). \tag{A.6}$$
The rest of the proof follows that of Newey (1985a, thm. 1): The right-hand side of eq. (A.6) premultiplied by $L$ is asymptotically equivalent to $LA_\Lambda[I_l : -A_1(\beta_0' \otimes A_z^{-1})](n^{-1/2}\sum_{i=1}^n b_i)$. The limiting $\chi^2$ distribution with $\mathrm{rank}(V_L)$ degrees of freedom then follows since $n^{-1/2}\sum b_i \xrightarrow{d} N(0, B_1)$. That $\mathrm{rank}(V_L) = \mathrm{rank}[\pi_0 : L'] - k$ follows because $\pi_0$ is a basis for the nullspace of $A_\Lambda'$ and $\mathrm{rank}(V_L) = \mathrm{rank}(A_\Lambda'L')$. See Newey (1985a, lemma A.5). [The result is also equivalent to theorem 1 of Newey (1985a) with his $W = \pi_0(\pi_0'A_1\pi_0)^{-1}\pi_0'$ and his $H = A_1\pi_0$.]
References
Amemiya, T., 1978, The estimation of a simultaneous equation generalized probit model, Econometrica 46, 1193-1205.
Amemiya, T., 1979, The estimation of a simultaneous equation tobit model, International Economic Review 20, 169-181.
Amemiya, T., 1984, Tobit models: A survey, Journal of Econometrics 24, 3-61.
Amemiya, Y., 1985, Instrumental variable estimator for the nonlinear errors in variables model, Journal of Econometrics 28, 273-289.
Duncan, G.M., 1986, A semiparametric censored regression estimator, Journal of Econometrics 32, 5-34.
Hausman, J.A., 1978, Specification tests in econometrics, Econometrica 46, 1251-1272.
Horowitz, J.L., 1988, Semiparametric M-estimation of censored linear regression models, Advances in Econometrics: Nonparametric and Robust Estimation 7, 45-83.
Huber, P.J., 1967, The behavior of maximum likelihood estimates under nonstandard conditions, in: L.M. Le Cam and J. Neyman, eds., Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, Vol. 1 (University of California Press, Berkeley, CA).
Newey, W.K., 1985a, Generalized method of moments specification testing, Journal of Econometrics 29, 229-256.
Newey, W.K., 1985b, Semiparametric estimation of limited dependent variable models with endogenous explanatory variables, Annales de l'INSEE 59/60, 219-237.
Newey, W.K., 1987a, Specification tests for distributional assumptions in the tobit model, Journal of Econometrics 34, 125-146.
Newey, W.K., 1987b, Efficient estimation of limited dependent variable models with endogenous explanatory variables, Journal of Econometrics 36, 231-250.
Powell, J.L., 1984, Least absolute deviations estimation for the censored regression model, Journal of Econometrics 25, 303-325.
Powell, J.L., 1986, Symmetrically trimmed least squares estimation for tobit models, Econometrica 54, 1435-1460.
Smith, R. and R. Blundell, 1986, An exogeneity test for the simultaneous equation tobit model with an application to labor supply, Econometrica 54, 679-685.
Stapleton, D.C. and D.J. Young, 1984, Censored normal regression with measurement error on the dependent variable, Econometrica 52, 757-760.
Weiss, A.A., 1991, Estimating nonlinear dynamic models using least absolute error estimation, Econometric Theory 7, 46-68.
White, H., 1980, A heteroscedastic-consistent covariance matrix and a direct test for heteroscedasticity, Econometrica 48, 421-448.