Journal
of Econometrics
48 (1991) 29-55.
North-Holland
Specification testing and quasi-maximumlikelihood estimation* Jeffrey M. Wooldridge Massachusetts
Received
Institute
February
of Technology,
1989, final version
Cambridge,
received
MA 02139, USA
January
1990
This paper develops robust, regression-based conditional moment tests for models estimated by quasi-maximum-likelihood using a density in the linear exponential family. The tests are relatively simple to compute, while being robust to distributional assumptions other than those being explicitly tested. The conditional mean tests assume only that the conditional mean is correctly specified under the null hypothesis, but are asymptotically equivalent to standard forms of the tests if the conditional variance is correctly specified. Tests for conditional variance misspecification assume only that the first two moments are correctly specified under the null hypothesis. An empirical application to individual arrest data is included.
1. Introduction Many economic hypotheses can be formulated in terms of the conditional expectation E(y,lx,) of one set of variables yt given a set of predetermined variables x,. If the conditional expectation is parameterized by a finitedimensional vector, standard econometric techniques can be used to estimate E(y,Ix,). Because the parameters are usually given economic interpretations, it is important to have methods of testing whether or not the parameterized model is correctly specified for E(y,Ix,). Tests for misspecification can be derived by nesting the model within a more general model, by examining its behavior in light of a nonnested alternative, or by comparing different estimators of the parameters of interest. This paper develops a class of Newey’s (1985a) and Tauchen’s (1985) conditional moment (CM) tests that are explicitly designed to detect misspec*This is a revised version of MIT Department of Economics Working Paper No. 479. I wish to thank Dan McFadden, Danny Quah, Jerry Hausman, Lars Hansen, Tom Stoker, two anonymous referees, an associate editor, and the participants of the Harvard/MIT and Yale econometrics workshops for helpful comments. Thanks also to Jeff Grogger for providing the data set.
0304-4076/91/$03.500
1991-Elsevier
Science
Publishers
B.V. (North-Holland)
30
J.M. Wooldridge, Specification testing
ification of a conditional expectation or conditional variance. The framework considered is broad enough to contain the three kinds of tests mentioned in the previous paragraph, and allows for time-series as well as cross-section observations. Broadly speaking, the setup here is encompassed by White (1987), who extends the Newey-Tauchen framework to include conditional moment tests for time-series as well as cross-section data. But by focusing on a class of moment restrictions intended to detect misspecification of the conditional mean or conditional variance, and by restricting attention to quasi-likelihoods in the linear exponential family CLEF), simple forms of these tests are available without imposing auxiliary assumptions on the conditional density under the null hypothesis. If, for example, the conditional mean is the object of interest, then a test which further assumes that the second moment is correctly specified will generally have the wrong asymptotic size for testing the relevant null hypothesis. In addition, the power of conditional mean tests can be adversely affected if the auxiliary assumptions do not hold under alternatives. A useful feature of the robust tests derived here is their computational simplicity. Only first derivatives of mean and variance functions are needed, and regression forms are available. Moreover, the robust tests remain asymptotically equivalent to the nonrobust versions under conditions when the latter are valid. In other words, robustness is obtained without sacrificing
asymptotic efficiency. In the remainder of the paper, section 2 introduces the general setup and briefly discusses some useful properties of LEF distributions. Section 3 derives the conditional mean test statistics and presents several examples. Section 4 considers testing the conditional variance assumption, and section 5 contains an empirical application to individual arrest data. Section 6 contains some concluding remarks. 2. Notation
and setup
Let {(y,,z,): t=1,2,...) b e a sequence of observable random vectors with y, K X 1 and z, L X 1. The variables y, are the dependent or endogenous variables specified by the researcher. In a time-series context, interest lies in explaining y, in terms of the explanatory variables z, and past values of y, and z,. Thus, let x,=(z,,y,_r,z,_ ,,..., y,, z,) denote the predetermined variables (elements of contemporaneous z, may be excluded from x, without altering the subsequent results). For cross-section applications, set X, = z, and assume independence across t. The conditional distribution of yt given X, always exists and is denoted D,(. Ix,). Under weak conditions on D,(. Ix,) there exists a conditional density p,(* Ix,) with respect to a a-finite measure Y,. Because (by definition) the
31
J.M. Wooldridge, Specification testing
stochastic behavior of (z,} is not of interest, the conditional densities a,(. Ix,) describe the relevant dynamic behavior of {y,]. The class of densities for p,(. Ix,) is denoted (f,(. Ix,, mf,qt)] and is assumed to be a member of the linear exponential family. In particular,
(2.1) where m, is K x 1, vt is JX 1, cc.1 is K X 1, and a(.) and b(.) are scalars. The function m,(x,> is the expectation associated with the density f,(* Jxr, m,, rlt). This entails the restriction m,(x,>‘v,c(m(x,>,771(Xt))
= - %l+,(~,)?7,(~t))>
(2.2)
for all x,. Following Gourieroux, Monfort, and Trognon (1984a) [hereafter GMT (1984a)], the LEF family is parameterized through the K X 1 conditional mean function,
{m,(x,,(Y):
a EL4
and the J x 1 nuisance
CR’},
parameter
(2.3) function,
In the context of section 3, correct specification of the conditional mean, E(yt(x,)=m,(x,,a,)
specification i.e.,
forsome
(Y,EA,
means
correct
t=1,2,...
.
(dynamic)
(2.5)
As GMT (1984a) have shown in the case of independent observations and as White (1989) has shown in a more general dynamic setting, the LEF class of densities has the useful property of consistently estimating the parameters of a correctly specified conditional expectation despite misspecification of other aspects of the conditional distribution. The nuisance parameter y may be assigned any value in r; more generally, it may be replaced by an estimator TT - r: 3 0, where I?:} cr. The ‘T’ subscript on the plim of q, is generally required because fT need not be estimating any ‘true’ parameters (parameters of interest) even when (2.5) holds. For example, qT could be an estimator from a misspecified parameterized conditional variance equation in the context of weighted nonlinear least squares. When the data are strictly stationary, 7; does not depend on T. In section 4, when the conditional variance specification is tested, the plim of qT
32
J.M. Wooldridge, Specification testing
is ya, where (a,,, 7”) indexes the conditional variance of y, given the null hypothesis. The QMLE 6$. maximizes the quasi-log-likelihood function
where
d,(y,, xt; y, a) is the conditional
x, under
log-likelihood
[Note that b(y,,nt(xr,TT)) does not appear as it does not affect the optimization problem.] For simplicity, the argument x, is omitted wherever convenient. Let SJLY) = Va,dt(yF, a) be the 1 X P score. Suppressing the dependence of sTI on yf, the score can be expressed as
(2.6) where ml(~) = m,(x,, a), nl(y) = n,(x,, y), e,(a) = yt - m,(a) is the K X 1 residual function, C,((Y, y) = C,(x,, (Y,y) = [V,,&n,(a), ~Jy))l-~ is K X K, and M,(cu) = Mt(x,, (Y) = Vpz,(a) is K X P. The second equality follows from (2.2). If the mean is correctly specified then E(y,lx,) = m,(x,, a,) and the true errors ep 3 e,(cyJ are defined. Consequently, because A4,(aO) and C,((Y,, y:) depend only on x,,
E[s,(dlx,] = 0,
(2.7)
provided that (2.5) holds. This demonstrates that (sp = s,(cy,J: t = 1,2,. . .I is a martingale difference sequence with respect to the a-fields (a(~,, x,): Fisher consistency of the QMLE r = 1,2,. . .}. Eq. (2.7) also establishes (when jr. is replaced by its plim or by any fixed value) and is the basis for the GMT (1984a) results for dynamic models. It can also be shown [see GMT (1984a)] that if the conditional mean is correctly specified, then
E[ h,( a,)lx,] = -M;‘cf-‘A&“,
(2.8)
J. M. Wooldridge. Specification
33
testing
where h,(a) = vas,(a) is the Hessian and values with a ‘0’ superscript evaluated at (Ye or ((.u,, 7:). The conditional variance of the score is = M;‘C:‘-‘@C;‘-IMP,
V(s;jx,)
are
(2.9)
where 0:’ = 0:(x,) E V( yI lx,) is the true conditional covariance matrix of y, given x,. GMT (1984a) show that C,(a, 7) is the covariance matrix associated with the density f,(yt(x,, WZ~(CY>, ~~(~11 if the latter is nonsingular. The conditional information equality holds provided fq=
C,(~“,Y$
(2.10)
t= l,...,T,
and this is the case if the assumed density [evaluated at (a”, r$!)] has variance corresponding to the actual conditional variance of yI. In general,
A;=
-T-l
2 E[h,(a,,,y;)]
= T-’
t=l
;
E[M:“C,‘-‘M,O]
1=1
and
B+ T-’ ; E[s,(cr,,y~)‘s,(a,,y~)] t=1 = T-’
5 E[ M:“C:-‘e:e:“C;-IMP] t=1
differ even when the conditional mean is correctly specified, information equality fails. Nevertheless, fi
so that the in distribuA: and BF
and
where all hatted quantities are evaluated at &T or (&r.,+r.), robust classical inference is fairly straightforward for this class of QMLE’s. The next section derives computationally simple, robust specification tests for a variety of testing procedures.
34
J.M. Wooldridge, Specification testing
An alternative to QML estimation of (Ye is a generalized method of moments (GMM) approach, as in Hansen (1982) and Newey (1985b). In fact, if (2.10) fails, then it is possible to construct GMM estimators that are asymptotically more efficient than the QMLE’s. Mullahy (1988) takes this approach in deriving specification tests for count models. The intent of the present paper is to offer a method for obtaining computationally simple, robust versions of popular specification tests. 3. Conditional
mean tests
3.1. General results This section focuses on a class of specification departures from the hypothesis H,:
E(yJx,)=m,(x,,a,)
forsome
tests designed to detect
t=1,2,...
~,EA,
. (3.1)
Let e,(a) = yI - m,(a) be the K x 1 residual function, and the true errors under H,. Suppose that A,
E[4( a”,
~;)‘c,(~“~Y:!)p’e,(ao)]
= 0,
t=
let ep = e,(aJ be a K X Q matrix LYand rr. If (3.1) existence of the
1,2 3.. . ,
(3.2)
where A,(a, TTT) = A,(x,, (Y,x) and C,(a, y) = [V,&n,(a), ~Jy)>l-’ is the covariance matrix associated with the density fr(ytIxt, m,(a), qt(y)). The misspecification indicator A, is allowed to depend on (Y” and on the limiting value of some nuisance parameter estimator +,. The nuisance parameter 7 might include the variance parameters y. In the context of testing in the presence of nonnested alternatives, +T would contain estimates from the alternative model. As pointed out by Newey (1985a,b), Tauchen (1985), and White (1987), (3.2) suggests basing a test of (3.1) on a quadratic form in the Q X 1 vector (3.3)
It is readily seen that the asymptotic distribution of (3.3) does not depend on - yF> = O,(l) and fi(+?, - rF> = that of +r or 7i, p rovided that fi(f,
J.M. Wooldridge, Specification testing
35
Op(l), which is typically the case. From the point of view of producing tests with correct asymptotic size, C,(&,, qT)-’ could be absorbed into the indicator A, as they both depend only on X, and estimated parameters. However, the structure in (3.3) leads to robust conditional mean tests with the additional important property of being asymptotically equivalent to more traditional nonrobust forms when the latter are valid. More will be said on this below. If &, is the QMLE, then, under suitable regularity conditions, it is straightforward to establish that T-‘j2
5
k&‘C,
5
M(0,
DD,V;D+‘)
(3.4)
I=1
under H,,, where
v+
T
T
T-’
c
V([
cp;,s;‘]) = T-'
r=1
c
E([cp,O,sj’]‘[cp:,$j),
(3.6)
t-1
and cp: = e:lC, ‘-‘A” Eq. (3.4) can be used as the basis for testing the correct specification of th: conditional mean, with the resulting statistics being robust to misspecification of other aspects of the conditional distribution of y, given x,. All that one needs are consistent estimators of 0: and V,“, and these are available from (3.51, (3.61, tt, hT, qT, and 7i,. A method for computing the test statistic that requires only auxiliary least-squares regressions can reduce the computational burden while broadening the scope of applications. Because (3.3) is of the form covered by Wooldridge (1990, procedure 2.1), it is straightforward to derive a robust, regression-based test. Briefly, the idea is as follows [see Wooldridge (1990) for a more general discussion]. Recall that the first-order condition for the QMLE is 0. t=
the statistic CT=, A;c,
Therefore,
(3.7)
1
‘e^, is identical to
T
- &?,&T)]‘C,“2e^, = 5 (A, - R$,)Y, t=1
= f: R;e’,, t=1
(3.8)
J. M. Wooldridge, Specification testing
36
where
A, = ?,‘/*A,,
it?* = Ctl/*nfjr,
is the P x Q matrix on tit:
of regression
-’
l?* = A, -tit&,,
coefficients
e’, = Ct1/‘6,,
from a matrix
regression
and
&,
of A,
T
c
AT&i,.
(3.9)
t=1
The statistic in (3.8) first orthogonalizes A, with respect to kt before constructing the misspecification indicator. Note that 6, is an estimator of V(y,Ix,) if the second moment is correctly specified, but not in general. The usefulness of this orthogonalization is that, under the regularity conditions in Wooldridge (1990, theorem 2.11, it follows that
T-l’*
;
I?$,=
T-l’*
f: [C~-1’2(A~-A4~G0,)]‘C~-1’2e;+o,(l) t=1
t=1
= T-l/*
c R:‘eP t=1
(3.10)
+ op( 1),
where -’ G”T=
i
E
Cf-‘/‘A:,
M,*
E( M;‘C,o-‘A:),
t=1
t=1
AT
T
c
E( M,o’C;PIM;)
E
Ct(-1/2Mf,
RF
E
AT
-
M,*Gt,
ey
s
Cf-l/*ey,
and
values with ‘0’ superscripts are evaluated at (a,, yt). Under H,, a consistent estimator of the asymptotic covariance matrix of the right-hand side of (3.10) (and therefore of the left-hand side as well) is
T-’
f: (I@,)(
(3.11)
I?;:&)‘.
t=1
Eqs. (3.10) and (3.11) lead to the following Wooldridge (1990, procedure 2.1).
procedure.
It is a special
case of
estimates qT and 7i,, Procedure 3.1. (i) Compute the nuisance parameter the QMLE &T (or some other fi-consistent estimator of LYE),the weighted gradient of the regression function residuals e’,,= C;, 1/2t1, the weighted kt = C;‘/‘n/rt, and the weighted misspecification indicator A, = Ct-‘/‘At.
J. M. Wooldridge, Specification testing
(ii) Perform
A,
on
the matrix
regression
A?,,
t= l,...,T,
and save the K X Q matrix R, = ii, - I%?$,, (iii> Perform 1
residuals, t= l,...,T.
the OLS regression
on
?;t?‘,,
t= l,...,T,
and use TR: = T - SSR as asymptotically ,Y; under H,, where uncentered r-squared and SSR is the sum of squared residuals. a;!?:, is a 1 x Q vector. n Procedure
37
3.1 requires
(3.12) Ri is the Note that
that the matrix
T
T-’
c
E( RF’ePe,*‘R:)
I=1
be positive definite uniformly in T (for T sufficiently large). This condition would fail if the weighted indicator AT [evaluated at (a,, rF,rF)] contains redundancies or if it contains elements that are linearly related to M,*. Procedure 3.1 can still be used but the degrees of freedom in the chi-square distribution must be appropriately reduced. Procedure 3.1 extends the regression-based, heteroskedasticity-robust test suggested by Davidson and MacKinnon (1985) in the context of nonlinear regression with independent errors and unconditional heteroskedasticity. It provides a simple method of testing specification of the conditional mean for a broad class of multivariate models estimated by QMLE, without imposing the additional assumption that the conditional variance of y, is correctly specified under the null hypothesis. Dynamic as well as static forms of variance misspecification are allowed. Moreover, as noted above, if the second moment of yl is correctly specified, then the weighting matrix Cp-’ is the inverse of the conditional variance of y, given x,. This ensures that Procedure 3.1 produces a test that is asymptotically equivalent to nonrobust forms of the test if the variance is correctly specified [see Wooldridge (1990, theorem 2.2)1. The regression appearing in (3.12) is similar to auxiliary regressions appearing in Newey (1985a), Tauchen (19851, White (1987), and elsewhere, but
38
there
J. M. Wooldridge, Specijication testing
is an important
difference.
Consider
the regressions
(3.14) where the multivariate regression in (3.14) (when K > 1) is carried out by stacking the data and using OLS. If V(y,lx,> = C:, then the TR: from (3.13) and KTRZ from (3.14) are asymptotically distributed as xi under H,. If the conditional second moment of yr is misspecified, i.e., V(y,Ix,) f C:, then neither of these statistics generally has an asymptotic xi distribution and they are no longer asymptotically equivalent. The statistic derived from (3.12) has a limiting xi distribution under H,, whether or not the second moment is correctly specified. If the conditional mean is the object of interest and the researcher is at all uncertain about the conditional variance assumption, then Procedure 3.1 offers a viable alternative. In addition, it is asymptotically equivalent under H,, and local alternatives to the statistics based on (3.13) and (3.14) under correct specification of the conditional variance. The QMLE framework underlying Procedure 3.1 allows consideration of a broad class of conditional mean tests used by applied econometricians. Examples of how to choose A, in some familiar cases are given in the next subsection. 3.2. Examples of conditional mean tests This section presents some applications of Procedure 3.1. Included are robust LM tests, robust, regression forms of Hausman tests, and an extension of the Davidson-MacKinnon test in the presence of nonnested alternatives. 3.1 (LM test). Consider LM tests for exclusion restrictions after E(y,lx,) has been estimated by QMLE using the LEF indexed by (a, b, c>. Let p,(x,, a, p) denote the K x 1 unrestricted mean function where (Y is P X 1, p is Q x 1, and assume that E(y,lx,) = p&x,, (Y”, PO) for some (Y” and PO. The null hypothesis is
Example
H,:
pa = 0.
The LM approach
where
leads to a statistic
qT is JT-consistent
based
for some {r;l,
upon
the Q x 1 vector
&- is the QMLE
of
(~0
obtained
J.M. Wooldridge,Specificationtesting
39
under the assumption that pa = 0, and e^, = yt - m,(x,, 6,) = y, p,(x,, &., 0). Let V&i,, V&, denote the K X P and K X Q gradients of pUr with respect to cy, p, respectively, evaluated at (&$.,O)l. In the notation of Procedure 3.1, m,(a) = CL~(LY, 01, MI(a) = Vp,(cu) = Vapu,(a, 01, and A,(cu,r) = Vppt(a,O). A test of H, which is robust to violation of the variance assumption V(xflxl) = C,(a,, 7:) i! obtainednfrom Procedure 3.1 with e’, = c,-1/2e^,, i$?t = Ctl/‘M1, and A, = C,‘/2A, = Ctm’/*V’fi,. For example, in a univariate Poisson regression model, k, = +zI. In a dynamic context with y, a scalar a robust test for AR(Q) serial correlation is obtained by setting ii, = (f?_,, e^,_,,. .., e^,_,). Hausman, Hall, and Griliches (1984) estimate a panel data model in which y, is a vector of count variables and E(y,,lx,) = exp(x;,cYJ, where xti is a P x 1 subvector of x, = z,. (Unfortunately, the t here corresponds to the cross-section dimension and i = 1,. . . , K to the time-series dimension.) The gradient of the K x 1 conditional mean function is the K X P matrix Under the assumption that ytilx, M,(a) = [exp(x,i~)x,,, . . . , exp(xtKa)xrK]‘. N Poisson(exp(x;;a,)> and that the elements of y, are independently distributed conditional on x!, C,((Y, y) = C,(a) = diag[exp(x:,a), . . . , exp(x:,a)l. QMLE under this assumption consistently estimates (~a provided that of yz need not have a E(ylilx,) = exp(x;&, i = 1,. . . , K; the elements Poisson distribution nor must they be independently distributed. The robust standard errors discussed in section 2 can be used to construct appropriate estimates of precision, and Procedure 3.1 can be applied to compute various robust LM tests for exclusion and other restrictions. The negative binomial extension used by Hausman, Hall, and Griliches (1984) involves allowing C, to depend on additional parameters y. n Example
3.2 (Hausman test for a conditional mean). Suppose that, in the spirit of the Hausman (1978) methodology, two estimators of the conditional mean parameters (Ye are compared in order to detect misspecification of the regression function. Because the QMLE’s considered here yield consistent estimators of the conditional mean, it is natural to base a test on the difference of two such estimators. A regression form of the test can be derived which does not require either estimator to be the efficient QMLE. Also, only one of the QMLE’s actually needs to be computed. Suppose that sr. is the QMLE from an LEF indexed by (a,,b,,c,) and nuisance parameters nl,(r,), and a second estimator is to be used from the LEF (a,, b,, c,) with nuisance parameters ~~~(-y~). A statistic that is asymptotically equivalent to the Hausman test which directly compares the two estimators and assumes that (a,, b,, cl> is correctly specified is obtained by setting (3.15)
.I.M. Wooldridge,Specificationtesting
40
in regression (3.13) or regression (3.14) [see, misspecification indicator is just a particular regression function. A robust test is easily 3.1. There, r = (r;, 7;)’ and c(m, 7) = c,(m, nuisance ct;‘/‘Gr. &2&t1
e.g., Ruud (1984)l. Note that the weighting of the gradient of the obtained by applying Procedure vr). Let ?rl and ?r2 denote the
parameter estimators, and set 2, = y, - m,(&,), e’, = ?t;1’221, $t = The weighted misspecification indicator is A, = ef11/2A, = l$
For ‘ionketeness, consider the case that yt is a vector of count variables in a panel data setting and the researcher postulates an exponential conditional mean function for each element of y,: H,:
E( ~ti IX,) = exp( x:iao) 3
i=l
,‘..,
K,
(3.16)
where xti is a P x 1 subvector of X, = z,. As in Example 3.1, suppose that the estimation procedure is motivated by the assumption that each element of yt has a conditional Poisson distribution and that the elements are independent conditional on x,. This is a fairly strong assumption, but assume for now that interest lies only in testing Ha. Under Ha, Poisson QMLE and multivariate NLS, using, say, a scalar covariance matrix, both consistently estimate (Ye and can be used to form a Hausman-type test. To apply Procedure 3.1, set c,, = diag[exp(x;,a^,), . . . , exp(x;,&,)], et2 = ZK, and Zk?t = where &, is the Poisson QMLE (or any [exp(x:,&-)x,,, . . . , exp(x;,cy,)x,,]‘, n-consistent estimator of (~a). Hausman, Ostro, and Wise (1984) (HOW) apply a regression method in which they perform the NLS estimation but assume that the Poisson assumption is correct under the null. If the Poisson assumption is incorrect, the HOW approach leads to a test with incorrect asymptotic size for testing H,. In addition, the HOW procedure is inconsistent for testing the Poisson distributional specification in the following sense: if the conditional mean is correctly specified but the Poisson assumption is violated, the HOW test statistic has a well-defined limiting distribution (which is not x;), rather than tending in probability to infinity. Because this result extends to all Hausman tests based on two QMLE’s that are derived from the LEF class of densities, the Hausman test in the present context should be viewed only as a test of conditional mean misspecification. A procedure that is useful for testing distributional assumptions beyond the first moment is derived in the next section. For the Poisson case, it leads to comparing the estimated conditional mean and another estimate of the conditional variance. n Let yt be Example 3.3 (Davidson-MacKinnon nonnested hypotheses test). a K x 1 vector and let X, be the predetermined variables as defined in
.I.M. Wooldtidge, Specification testing
section
2. Consider
two competing
models
41
for E(y,Ix,):
H,:
E(ytlx,)=m,(x,,a,),
some
a,eA,
t=1,2
,...,
H,:
E(Y~Ix~)=P~(x,,&),
some
&=B,
t=1,2
,....
The DM test looks for departures from H, in the direction of H,. Let ST denote the QMLE from the LEF (a,, b,, c,); let & be the QMLE from the LEF (a,, b,, c2). Under H,, Qnr converges to (Y,, and p^, generally converges to a nonstochastic sequence {/3:} cB. The e,stimated ‘nuisance parameters’ ,. in this setup are 7i,- (p+.,$$.1,++2)‘, where fT, is the estimated nuisance parameter vector indexing vl, and yT2 is the estimated nuisance parameter indexing qtz. A DM-type test checks for nonzero correlation between the residuals P, = yI - m,(x,, &T) and a particular weightin,, 1~of the difference in the estimated regression functions: the misspecification indicator is the K X 1 vector
where et1 is the estimated variance function for the LEF (a,, b,, c,>, and e,, is the estimated variance function from the L,EF (a,, b,, cl>. The weighted quantities are tit = Ct;‘/2Gr, V(y,lxt) = Cz is maintained, run the stacked regression _
4
c, = et; ‘12e^,, and A, = ei,/2e,1<;, - &,I. If then a regression test is available from (3.14):
_
on M,, 4,
t= l,...,T,
and use KTRZ as asymptotically xf under H,. This is the LM form of the usual Davidson-Ma&&non (1981) test (which has been extended here to multivariate linear exponential families). If the variance assumption is violated under H,, then this approach generally leads to inference with the wrong asymptotic size. The robust statistic obtained from Procedure 3.1 is straightforward to compute. w 4. Testing the conditional
variance
assumption
4.1. General results The tests developed in section 3 are explicitly designed to detect misspecification of the conditional mean, while allowing the conditional variance to be misspecified. In section 3, two regression forms of the conditional mean tests were presented [(3.13) and (3.1411 that could be guaranteed to have a
42
J.h4. Wooldridge, Specification testing
limiting chi-square distribution only if the variance of the assumed distribution matches the true conditional variance of y,. More precisely, the regression tests in (3.13) and (3.14) effectively take the null hypothesis to be Hb:
E(y,lx,)
=m,(x,,q,)
forsome
cu,EA,
and
some
V(y,lx,)
= Cf(~t,q,,~O)3
t=1,2
yoEr,
,....
(4.1)
Recall that C,(x,, (Y,y) = [Vmct(mt(xt, a), q,(x,, y))l-’ is the variance associated with the LEF density. The parameter vector yO is no longer indexed by T because it now contains ‘true’ parameters which, along with ~ya, index the conditional variance of yt under the null. For the validity of the regression tests in (3.13) and (3.14), the correct specification of the second moment is usually needed to consistently estimate the covariance matrix which appears in the conditional moment test statistic. Because violation of the distributional assumption does not lead to inconsistent estimates of a correctly specified mean, the CM tests for the conditional mean based on (3.13) or (3.14) are inconsistent for the alternative H;:
E(y,lx,)
=m,(x,,(~a)
for some
CQEA,
but
(4.2) V(y,lx,)#CC,(a,,y)
forall
yET,
t=1,2
,....
This is a fairly obvious assertion, but it is worth stating explicitly. This section focuses on constructing tests that have power against Hi. The null hypothesis is now H&, but the tests constructed in this section are intended to primarily test for conditional variance misspecification, presumably after the conditional mean has been found to be adequate upon applying the methods of section 3. As in section 3, a broad class of tests is obtained by forming appropriate orthogonality conditions. For motivational purposes, first consider White’s (1982) information matrix test for the parameters (Y. In particular, a test is based on two consistent estimators of the asymptotic covariance matrix of &, under the hypothesis that the first two conditional moments are correctly specified. Let &r and fr be the estimators of (~a and yO under H& The White test is based on the difference
(4.3) I=1
wAhere all quantities
r=1
are
evaluated at 6, or (Sr,fT) (recall that gt and on Tr). If the first two conditional moments of specified then the difference in (4.3) tends to zero in
M, = V’&m,(ti,> do not depend yt given X, are correctly
J.M. Wooldridge, Specification
testing
43
probability as T + m, and the standard errors for &, which are computed under the information matrix equality will be asymptotically valid. If either the conditional mean or variance is misspecified, then (4.3) will typically not converge to zero (although there are directions of misspecification for which this statistic will still tend to zero). Thus, a test of correctness of the conditional variance can be based on a suitably standardized version of (4.3). A useful expression for the vectorized tth difference in (4.3) is (4.4)
where
In what follows, the misspecification indicator A, = [tit @ tit] for the White test can be replaced by any K2 X Q matrix A,(x,, 8, 71, where 8 = ((Y’,y’) indexes the variance (a still indexes the mean) and r is a vector of nuisance parameters. The subsequent results apply to the general case as well as to White’s information matrix test. Standard approaches to operationalizing (4.6) require the limiting distribution of the statistic
which generally depends on the asymptotic distribution of fi
is standard because the information equality holds. However, if parameters y are present in addition to the mean parameters (Y, then the strategy for conditional variance diagnostics becomes more complicated. The reason is that qT might come from a variety of sources, depending on the particular application. Although the analysis could proceed by assuming a particular asymptotic representation for fi($, - y,J, the scope would necessarily be limited while producing statistics that are burdensome to compute. It is much easier to simply purge from the test statistic the effect of the asymptotic distribution fi(f,yO). There are many ways to proceed; the approach adopted here is motivated by expression (4.5) and the leading case of multivariate normality when the mean and variance parameters are variation-free. Consequently, when applied to the normal case, the procedure
J. M. Wooldridge, Specification testing
44
presented below is robust to nonnormality, while remaining asymptotically equivalent to nonrobust LM-type statistics under normality. This discussion of pr. is irrelevant in models such as Poisson, Geometric, and Exponential regression, where (Ye indexes the variance as well as the mean under H& To proceed, the purging procedure suggested by Wooldridge (1990, theorem 2.1) is applied to the subvector qr. of 6,. Let Qt(x,, 0) = 0’ vec[Ct(a, -y)l be the K 2 x N gradient of the stacked conditional variance function with respect to the parameters y and let V&xI, 19)= Va vec[Ct(a, y)] be the K2 X P gradient with respect to (Y.The first step is to construct the K 2 X Q residuals, say i,, from the regression A,
on
6r,
t=l
‘...,
T,
where A, = 2; ‘/‘A, and 6’, = _$; ‘I24 (If &l contains redundant columns, these are simply omitted from the regr&sion.) The modified statistic is
where 6, = it- 1/2$t. By arguments entirely analogous to Wooldridge (1990, theorem 2.1) it follows that, under Hb,
T-l’*
&,&=T-1’2 t=1
5
([At(e,,~r)-~t(Bo)K~]‘~t(Bo)-’
t=1
where
Thus, the asymptotic distribution of (4.8) depends only on the asymptotic distribution of @CL;, - LYJ. It follows immediately that if C1(~, y) depends only on y, then T
T
T-l’*
c t=1
I?;$, = T-l’*
c
R~‘&+
+ op( 1))
(4.10)
f=l
where R*t = SF- ‘12(Ay - @FKF) and +,* = xp-‘/‘4p. To handle the case where C,(CY,y> depends on LY,define $, = $:a, and it = $, - ita, ‘jr, where
J.M. Wooldtidge, Specification testing
the 1 X P score fl is defined
in section
2 and
f=
r=1
I
Note that T-‘/*C~= ,( = Tp ‘/2c~= ,Gr whenever as argued in the appendix,
6,
is the QMLE.
~,&=T-‘~‘; (cp:‘-$A;-‘J,o)+O,(l)
T-l/*
I=1
under H;,. Thus, the regression 1
45
on
This discussion
Further,
(4.11)
I=1
an asymptotically
6,
xi
statistic
is obtained
as T - SSR from
t= l,...,T.
can be summarized
with the following
procedure.
Procedure 4.1. (i> Compute the QMLE &-, and estimators 9, and 6, such that fi(qT - yo) = QJl) and fi(kT - rt> = O,(l) under H,$ Construct 6, = 2, ‘/*dl, 6, G 2; ‘/* c$~, $! E ,$- ‘I* $, , and A E _$- ‘/‘ii (ii-a) If C(((Y, 7) on &l and save the depend on y, let d, (ii-b) If C,((Y, r>
depends on y, perform the multivariate regressfbn of A, K2 X Q matrix of residuals, say 2,. If C,((Y, 7) does not = A,. If C,(cr, y) does not depend on (Y, set & = &;fiz,. depends on (Y, define
,. n 5, = C&R, - &i,‘J,. (iii)
Perform
the OLS regression
1 on i,,
t=l
,...,
and use TR: = T - SSR as xi.
T, Note that f, is 1 X Q.
n
Procedure 4.1 provides a fairly general method for testing correct specification of the conditional variance. When the variance does not depend on the mean parameters, only least-squares regressions are needed to compute the statistic. The leading example of this is (multivariate) nonlinear regression with an estimated weighting matrix. Because of the nature of the first-order condition for the normal MLE, Procedure 4.1 leads to a test which is robust to nonnormality but asymptotically optimal under normality; this follows from Wooldridge (1990, theorem 2.2).
46
.I.M. Wooldridge, Specification testing
Another broadly applicable case is when the same parameters index the mean and the variance, so that the regression in step (ii-a) is omitted. In this case, even though the test assumes only that E(y,Ix,) and V(y,]x,) are correctly specified under H,, Procedure 4.1 produces a test which is asymptotically equivalent to maximum-likelihood procedures - such as the Newey-Tauchen-White test - when the distribution is correctly specified in its entirety. This is simply because T-‘/*CT_ 1,$,= T-l’* CT= 1$,. It might be useful to note that, in order to obtain a test with the correct asymptotic size under Hb, the mean parameters (Y can always be included in y, so that 6 can be constructed from the residuals E, and fi, in (ii-a); (ii-b) can be skipped. However, when only the mean parameters index the variance, this would result in a test with less asymptotic power than the NTW statistic when the entire distribution is correctly specified. Because the parameters y,, need not be estimated by an efficient procedure, Procedure 4.1 can be applied to such problems as testing the second moment of the negative binomial generalization of the Poisson regression model as developed by GMT (1984b). The strength of this approach is that any \lT-consistent estimator of y0 can be used, but at the same time the efficiency of the QMLE for (Ye (in the class of WNLS-type estimators) is exploited. In such cases there are no obvious existing tests to compare to Procedure 4.1. 4.2. Examples of conditional uariance tests Example 4.1. Consider applying Procedure 4.1 to testing for various forms of heteroskedasticity in a univariate nonlinear model that has been estimated by NLS. In this case y = a*, C,((Y, 7) = u*, and $,(a, y) = et((Y)2 - u*, so that @&(w,y) = 1. For the White (1980) tezt fpr heteroskedasticity, the indicator A, is nonredundant elements of vec[M,M,l = vec[Vam,(x,, &)lVamr(q, &)I. Engle’s (1982) test for ARCH(Q) takes A, = (2:-i, . . . , e^:_& In either case, Procedure 4.1 leads to the regression 1
on
(f?f--&+)(A,-XT),
(4.12)
where 6: = T-‘C~=,e^~ and XT = T-‘CT_,I1”,. This procedure is asymptotically equivalent to the regression form of the White test or Engle’s ARCH test under the additional assumption that E[(ef)41x,] is constant. Interestingly, the slight modification in (4.12) (which is the demeaning of the indicator /i,) yields an asymptotically chi-square distributed statistic without the additional assumption of constant fourth moment for ep. For the White test in a linear time-series model, the demeaning of the indicators yields a statistic that is asymptotically equivalent to Hsieh’s (1983) suggestion for a robust form of the White test, while being easier to compute. n
J. M. Wooldridge, Specification testing
47
Table 1 Summary
statistics;
number
of observations
Variable
Mean
NARR86 PCNV ASL TOTTIME PTlME86
0.404 0.358 0.632 0.839 0.387 2.309 0.550 2.25 1 0.161 0.218 0.363
Q86 INC86 DURATION BLACK HISPANIC BORN60
= 2725.
Example 4.2. Consider a Poisson regression model where x, is assumed to have first two moments matching a Poisson mean E(y,lx,) =m,(x,,qJ. The White 1M test can be assumption that V( y,Ix,> = E( y, lx,). In this case there are A, is a 1 X Q e, = Ar, I& = ti, = v$zE,, and the indicator dundant elements of vec(A?;I,‘k,). This is used in Procedure
5. Empirical
Standard deviation 0.859 0.395 3.508 4.607 1.950 1.610 0.666 4.607 0.368 0.413 0.481
y, conditional on distribution with used to test the no parameters Y, vector of nonre4.1. n
application
Grogger (1990) has recently estimated a model relating the number of arrests in a particular year to individual characteristics and measures of crime deterrents. The variable to be explained is NARB86, the number of times an individual was arrested in 1986. The explanatory variables include PCNV, the proportion of prior adult arrests that resulted in convictions, AVGSEN, the average length of sentence served since age 18 (in months), PTIME86, the number of 1986 months spent in prison, TOTTZME, the number of months spent in prison since age 18. Grogger (1990) also includes variables to measure labor market opportunities: Q86, the number of quarters in 1986 during which some earnings were reported, INC86, reported earnings in 1986 (in tens of thousands), and DURATION, the number of quarters since the individual last had positive earnings or was released from prison, whichever was most recent. BLACK and HISPANIC are qualitative variables, and BORN60 is a qualitative variable equal to unity if the person was born after 1960. The data are a random sample of 2725 men from arrest records maintained by the California Department of Justice; a larger sample is described more fully in Grogger (1990). Summary statistics for this sample are in table 1.
48
J. M. Wooldridge, Specification testing
It is natural to estimate a model for NARR86 which exploits its nonnegative integer nature. Here, two models are estimated: a Poisson model with C,(x,, (.u) = m,(x,, a) and a Geometric model with C,(x,, cu>= m,(x,, a> + m,(x,, (Y)*. In both cases m,(x,, cu) = exp(xja), where x, is the P X 1 vector of explanatory variables. There are certain issues concerning the explanatory variables that deserve a more in-depth discussion than can be provided in this paper. In particular, because the individual chooses the levels of legal and illegal activity jointly, variables such as INC86 and Q86 are simultaneously determined along with NARR86, rather than being set exogenously. Grogger (1990) provides some discussion of the simultaneity issue. The analysis here follows Grogger’s lead in conditioning on these variables. Grogger estimates a negative binomial model with an exponential mean. Because he estimates the mean and variance parameters simultaneously, the estimates of the mean parameters are not guaranteed to be consistent in the presence of variance or other distributional misspecification. The Poisson and Geometric models are used here because of the robustness of the conditional mean estimates. Table 2 presents the results for the Poisson regression model and table 3 contains the results for the Geometric model. (The GAUSS code for computing the estimates and test statistics is available on request from the author.) The first column in each table contains the results analogous to Grogger’s model. The standard errors computed from the inverse of the estimated expected Hessian, Al; l/T, are in parentheses, and the robust standard errors, computed from A^; ‘h,a; l/T, are in brackets; see (2.11) and (2.12). Although several of the parameters estimates are similar in the two tables, there are some notable differences in magnitudes for some of the key variables. First, in both models the variable PCNV is highly significant whether one uses the usual or robust standard errors in computing the t-statistic. However, the estimate from the Geometric model differs from that in the Poisson model by over 17 percent. Also, the coefficients on 1NC86 and PTZME86 in the Geometric model differ from those in the Poisson by almost 10 percent. These differences, some of which are economically significant, suggest that the conditional mean might be misspecified. To formally test this proposition, regression-based Hausman statistics were computed for each model; these are given at the bottom of each table. The Hausman test for the Poisson model tests against the Geometric estimator, and vice versa. From Example 3.2, the Hausman test for the Poisson model is obtained by setting kf =jptx;, A, = [ j,/(j, + f,?)]jtx;, and e^,= yr -$,, where the 9 are the Poisson fitted values. The weighted quantities are &r = IL?,/ / y^, = fix:, A, = A,/ fi the Hausman e,
= [fi/(j, + $:>]v^,x;, and e’, = e^,/ a. The nonrobust test is obtained as TRff from the regression on
Izi,, A,.
form of
(5.1)
Table 2 Poisson
estimates;
number
of observations
= 2725.”
Model 1
Variable ONE
-0.551 (0.078) [0.1051
PCNV
- 0.403 (0.085) [0.1011
PCNV 2
-
-0.616 (0.080) [0.1041 1.185 (0.283) LO.3731
-
ASL
TOTTIME
PTIME86
PTIME86 2
Model 2
- 1.833 (0.307) LO.4301
- 0.024 (0.020) [0.024]
- 0.024 (0.022) [0.0281
0.024 (0.015) [0.0201
0.011 (0.016) [0.023]
-0.101 (0.021) [0.023]
0.700 (0.093) [0.0981
-
-0.106 (0.017) [0.0161
Q86
-0.051 (0.031) [0.0381
- 0.003 (0.035) [0.0401
INC86
- 0.806 (0.104) [O. 1231
- 1.197 (0.183) [0.1791
INCM
-
2
DURATION
0.206 (0.061) [0.0501
- 0.008 (0.007) [0.0081
-0.016 (0.007) [0.0091
BLACK
0.663 CO.0741 [0.1001
i:::;;,’
HISPANIC
0.499 (0.074) [0.092]
0.420 (0.075) [0.092]
- 0.053 (0.064) [0.081]
- 0.099 (0.064) [O.OSO]
BORN60
Log-likelihood r-squared Hausman statistic Robust Hausman statistic ZM statistic Robust IM statistic ‘Standard
errors
computed
0.592
- 2248.02 0.078 64.52 56.11 62.99 32.01 from Hessian
in parentheses,
- 2166.07 0.122 41.16 34.62 44.96 23.35 robust
standard
errors
in brackets.
Table 3 Geometric
estimates;
Variable
number
of observations
= 2725.”
Model 1
Model 2
ONE
- 0.523 (0.096) [0.1051
- 0.635 (0.100) [0.1051
PCNV
- 0.484 (0.102) LO.0991
1.095 (0.352) io.3731
-
PCNV 2
ASL
- 0.016 (0.026) [0.024]
TOTTIME
- 0.018 (0.029) [0.032]
0.019 (0.020) [0.0211
PTlME86
PTlME86
- 1.793 (0.376) [0.428]
0.011 (0.022) LO.0291
-0.110 (0.025) [0.023]
0.749 (0.137) [0.096]
-
’
-0.113 (0.023) [0.016]
- 0.061
- 0.006 (0.043) [0.0401
(0.038) [0.0381 INC86
- 0.767 (0.116) [0.123]
- 1.210 (0.212) LO.1751
INC86 2
0.213 (0.069) [0.049]
DURATION
- 0.006 (0.008) [0.0081
- 0.013 (0.009) [0.0091
BLACK
0.656 (0.094) [0.098]
0.608 (0.095) [0.0981
HISPANIC
0.502 (0.090) [0.092]
0.413 (0.093) [0.0931
- 0.047 (0.078) [0.0801
- 0.085 (0.080) [0.0791
BORN60
Log-likelihood r-squared Hausman statistic Robust Hausman statistic IM statistic Robust IM statistic aStandard
errors
computed
-2 1157.52 0.075 69.40 57.52 49.81 39.52 from Hessian
in parentheses,
- 2101.72 0.117 26.13 29.63 66.10 44.54 robust
standard
errors
in brackets.
-.mauquou wmy!u8!s blleD!urouosa palyuro laylm3 am alay] leq~ /(Idur! &es -sa3au lou saop uo!pataal leyspels aqL ~a8.w~ L~qeuossa~ so suogetiasqo 30 mqumu ayl icq~ palou aq plnoqs I! ‘s[aAaI amzw@$s p.wpue~s ueq~ .rapuIs ISIS am asaqj apqM ‘~9oog moqe s! lapow 3yauxoa~ aql 103 3~~1 apqM ‘~ooo*o vtoqe 30 arqekd E seq uo!ssalZa_t uosyod aql 103 3gsya~s ut?rusneH lsnqol aql :I lapour 01 pa.wdwoD Qeytmqns paseamy alley sanlm-d aqL *uoywgpads IDalloD lapun p1X z se AIIwgo3dwrCse palnqr.I$srp MOU am . szqsye~s aqL ‘1 IapouI .103 se palndwo3 alam sisal uewsneH .lapotu IDallo:, aql s! I! $eq~ saawe.nm% Fhuq~ou lnq ‘(L1pzguelsqns sawamy pa.renbs-l aqi leql alou) 1 [apow ueql lallaq elep Jsa.m aql suydxa iClu!el.ta3 z lapo~ wmugsa 1 laporu aq~ u10.13lualedde IOU am leql suogwldw! hqod pgualod amq sllnsal asaqL wmla~ %uqsy!u_up siCeIds!p alqe!.m (983NI) awocy aqi ‘pmq lay10 aqi uo 5qiuoru $E inoqe 01 in0 (aa@au ueqi Jaqiw) aA!gsod s! i3aga pamugsa aqi inq ‘sum1a.I %h!wamy shldsrp os~e ‘9ggwlLd ‘waga alzyz~ammu! ayi %rywmu ajqe!.ntA ayl ‘1 lapour u10.13 pau!eiqo ()p’oql!M pa.mduroD ‘g9’0s! z lapour u10.13 pali?uysa Ai!3ysela-!tuas aqi ‘s.0 = ANsd J! ‘uo!ssa.Gal uoss!od aqi u! ‘aldmxa 10d ‘1 [apow ~1013 paieuysa layi ueqi la%el qmw s! iDaga iuallalap aqi ‘sale1 UO!~!AUO:, Iopd q%!q 1o.g ~alqe!.wA suo~l3!~uo~ lo!ld 30 a%zluazIad aqi 30 amalmap 01 umla~ S.yeamy ue p2aha.I Zm3d pue ANsd uo siuapgao3 aql ‘ICI%uyza.IaluI .lapotu s,laSBoJt) ~1013 palyruo sag!.wauy.tou luel~odur! aq 01 waas amyi ‘snqJ yapow 3piawoaE) pue uossrod aqi qioq u! (slo.kIa p~epuels lsnqo1 aql uo paseq) p ueql lalea.# ~gs!lels-~ e seq surlal palmbs aql 30 ‘ID‘eg ‘98gJQl&j pue ‘983NI ‘m3d ‘sa]q”!J”” lueDy!u@ hwysyws aqi 30 qaea 103 mai 3gwpenb e sapnlm! alqel qma 30 OMI uurnIo3 .sa[qr! -!.nz~ (a,yelqenbuou) aqi JO qma 01 lgadsal ql!M sa!l!cyse~a-yas ~ue~suo~ saumsm 1 lapon ~o,(~~o) ueql ssal s! anlm-d Supuodsalloa aql ill.9~ s! ‘[apour uosslod ayi 103 ~gspeis uerusneH isnqol aqi ‘asayi JO isaIpms aqL .pay!Dadss!ur s! ueau pmoypuo~ aql leql lsa%ns s~yyw~s aql JO 11~ .sysye~s lsnqol aql (z’s) pue ~!ls~ws $snqo.xuou aql sampold (l’s) uo!s -salSax ‘:X’Jz,r[ ‘8/( ‘6 + T)] = ‘v pue %Q,[(‘6 + I)/‘41 = ‘I4 ‘z,,(;6 + ‘6) aqi uIol3 + 1) E ‘v ijxJte~ ‘w uaqJ mo!ssa.Ga.I zya&oaf) /‘,a s ‘_a ‘:x’g’tj pue sanlm paijg aqi aq ‘,a pm ‘6 ia1 ‘japow 3yatuoag aqi lod ~suo~~~puo:, aql30 uoy~y~ads lDallo3 Japun l;X /C~pqoldurhse s!
spmp!sal maw
(Z’S)
‘)f ‘,a
uo
1
uo!ssa.G!al ayi ~uol3 gss -J uaqi :‘m uo ‘v uo!ssaBal aql uro.13 ‘H spznprsal 11 x 1 aql sugqo 1s.y ‘ploy 01 uo~ldurnss~ asue!.mA aqi a.ynbal- IOU saop qD!qfi ‘Isa1 isnqol aqL maw aqi symba amymt aql pm payDads hpza.uo3 s! ueatu pmoypuoz aql 31 ‘_1X hip -go~duIAsc s! &J ‘1 lapotu u! salqe!.m hroleueldxa uaAala ale alay astmaa
52
.I.M. Wooldridge, Specification testing
Mostly for curiosity’s sake, a form of the information matrix (ZM) statistic is also reported in tables 2 and 3. The indicator A, is chosen to yield a test with as many degrees of freedom as the Hausman test. In particular, A, = $x; is either 1 x 11 or 1 X 14. In the notation of Procedure 4.1, d, = y^, and 9, = jtx; for the Poisson model and e, = 9, + 9: and !& = (jJ in the Geometric case. Also, 6, = e*:/j, - 1 for the Poisson and 4, + 9:) - 1 for the Geometric. Because there are no additional parameters, the Newey-Tauchen-White regression can be used to a nonrobust ZM statistic. The NTW regression is simply
1 on
it, 4$,
t=
l,...,T,
+ 29:)x; = e^:/($, variance compute
(5.3)
where s*(=ehtetlGt, $, = $,A,, and A, =srx: (Poisson) or A, z [j,/ (1 + ft)]x; (Geometric). TRZ = TR - SSR from (5.3) is asymptotically chisquare if the entire distribution is correctly specified. If the conditional mean is misspecified, the 04 test in this context is only distracting. Consequently, one would not bother to compute the ZM statistics for model 1 after the Hausman test reveals misspecification of E(y,Ix,) (even though they have been computed here). For model 2, suppose that the researcher concludes that E( y,lx,) is correctly specified (based on the robust Hausman statistics). Then the nonrobust ZM test soundly rejects both the Geometric and Poisson specifications. However, the robust ZM statistics support the Poisson variance specification over the Geometric. For the Geometric, the robust ZM statistic is 44.54, with a p-value less than (0.1)“. The robust ZM statistic for the Poisson model is 23.35, with a p-value greater than 0.06; this is in contrast to the nonrobust statistic, which is 44.96. It is difficult to know whether the nonrobust ZM statistic is so much larger because higher-order (e.g., third) moment assumptions of the Poisson distribution are violated by these data, or simply that the NTlV regression produces a statistic with poor finite-sample properties; Bollerslev and Wooldridge (1988) present simulation evidence of the latter problem. Interestingly, the conclusion that the data support the Poisson distribution for model 2 is contrary to the values of the likelihood functions. Probably neither distribution is entirely correct; fortunately, it is not necessary for either to be correct to estimate and test hypotheses about E(y,]xt). 6. Concluding
remarks
This paper has developed a general class of specification tests for possibly dynamic multivariate models which impose under the null only the hypotheses being tested (correctness of the conditional mean or correctness of the conditional mean and conditional variance). The examples in sections 3 and 4 cover only a few of the possible applications. Multivariate linear exponential families encompass a variety of models used by applied econometricians.
J. M. Wooldridge, Specification testing
53
The general approach used here has several other applications. As noted in sections 3 and 4, the QMLE &r can be replaced by any fi-consistent estimator of (Y”. This generality is useful in situations where the conditional mean parameters are estimated using a method different from QMLE. The current approach is inherently limited by its focus on the first two conditional moments. In many cases the entire distribution of yf conditional on x, is of interest; in fact, for censored and truncated regression models, it does not make sense - at least in a parametric framework - to search for conditional mean tests that are robust to conditional variance misspecification in the underlying latent variable model. Semi-nonparametric techniques, such as those discussed by Gallant and Tauchen (1988), might be useful in these cases. However, the distributional theory for these methods has not yet been worked out in sufficient generality.
Appendix This appendix sketches a proof of the validity of eq. (4.11). For notational simplicity, the case K = 1 is explicitly considered. Also, the uniform weak law of large numbers (UWLLN) and central limit theorem are appealed to wherever appropriate; see Wooldridge (1990) for regularity conditions in these contexts. First, from arguments analogous to Wooldridge (1990, theorem 2.1),
T-“2
i ([n,(e,,ar)-~,(~(~)K~l’~,(e,,)-’ t= 1
;Zt;&=T-“2 t=1
+op(lL
(A.11
where Kf is given by (4.9) and R,* = J$-‘/‘
T-112
&= f=l
T-II2
i r=l
(&-$A;-‘J;)
+O#).
(A.4
J.M. Wooldridge, Specification testing
54
Along with (A.l),
(A.2) implies
that, under
Hb,
x&(e”) -1’2R:, -,@O;‘J;) A standard
mean value expansion
about
+ o,(l).
(Y” yields
i (P;‘C:)t=l
-fi(&,-a”)‘T-’
"*RT
+
h;AO,-‘J;)
+0,(l)
(A.31
5 (Tt*'R:
-fi(&,-a,)‘T-’
+h;&‘J;)
+o#),
t=l where q* ,_.$0-1/2~~ and ho = Vas,(ao). and, by definition, J: = T-’ EFS,E(Vr*‘R:).
T-’
5 E('Pt*'R,*
+h$4O,-'J,O)
However, T-‘CT=,E
= -At,
=0,
t=1 so if the WLLN
T-’
holds,
5 ('&*'RF
+h;A$-'J;)
=0,(l).
t=1 a,) = O,(l), this establishes (4.11). From ($ll), a conAlong with fi(&,sistent estimator of the asymptotic variance of T- “*CT= ,tt is simply obtained as TR: = an asymptotic xi statistic is therefore T-‘C;=,j;i,; T - SSR from the regression 1 on [,, t = 1,. . . , T.
J. M. Wooldridge, Specification testing
55
References Bollerslev, T. and J.M. Wooldridge, 1988, Quasi-maximum likelihood estimation of dynamic models with time-varying covariances, Working paper no. 505 (Department of Economics, Massachusetts Institute of Technology, Cambridge, MA). Davidson, R. and J.G. MacKinnon, 1981, Several tests of model specification in the presence of alternative hypotheses, Econometrica 49, 781-793. Davidson, R. and J.G. MacKinnon, 1985, Heteroskedasticity-robust tests in regression directions, Annales de I’INSEE 59/60, 183-218. Engle, R.F., 1982, Autoregressive conditional heteroskedasticity with estimates of United Kingdom inflation, Econometrica 50, 987-1008. Gallant, A.R. and G. Tauchen, 1988, Seminonparametric estimation of conditionally constrained heterogeneous processes: Asset pricing applications, Mimeo. (Duke University, Durham, NC). Gourieroux, C., A. Monfort, and A. Trognon, 1984a, Pseudo-maximum likelihood methods: Theory, Econometric 52, 681-700. Gourieroux, C., A. Monfort, and A. Trognon, 1984b, Pseudo-maximum likelihood methods: Applications, Econometrica 52, 701-720. Grogger, J., 1990, Further evidence on the economic model of crime: Estimates from arrestee data, Economic Inquiry, forthcoming. Hansen, L.P., 1982, Large sample properties of generalized method of moments estimators, Econometrica 50, 1029-1054. Hausman, J.A., 1978, Specification tests in econometrics, Econometrica 46, 1251-1271. Hausman, J.A., B. Hall, and Z. Griliches, 1984, Econometric models for count data with an application to the patents-R&D relationship, Econometrica 52, 909-938. Hausman, J.A., B.D. Ostrow, and D.A. Wise, 1984, Air pollution and lost work, Working paper no. 1263 (National Bureau of Economic Research, Cambridge, MA). Hsieh, D.A., 1983, A heteroskedasticity-consistent covariance matrix estimator for time series regressions, Journal of Econometrics 22, 281-290. Mullahy, J., 1988, Moment-based tests for Poisson and related count data models, Mimeo. (Yale University, New Haven, CT). Newey, W.K., 1985a, Maximum likelihood specification testing and conditional moment tests, Econometrica 53, 1047-1070. Newey, W.K., 1985b, Generalized method of moments specification testing, Journal of Econometrics 29, 229-256. Ruud, P.A., 1984, Tests of specification in econometrics, Econometric Reviews 3, 211-242. Tauchen, G., 1985, Diagnostic testing and evaluation of maximum likelihood models, Journal of Econometrics 30, 415-443. White, H., 1980, A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity, Econometrica 48, 817-838. White, H., 1982, Maximum likelihood estimation of misspecified models, Econometrica 50, l-26. White, H., 1987, Specification testing in dynamic models, in: T.F. Bewley, ed., Advances in econometrics - Fifth world congress, Vol. I (Cambridge University Press, New York, NY) l-58. White, H., 1989, Estimation, inference and specification analysis (Cambridge University Press, New York, NY) in press. Wooldridge, J.M., 1990, A unified approach to robust, regression-based specification tests, Econometric Theory 6, 17-43.