Economics Letters 41 (1993) 231-234
0165-1765/93/$06.00 © 1993 Elsevier Science Publishers B.V. All rights reserved

A simple consistent specification test

Stephen G. Donald
Department of Economics, College of Business Administration, University of Florida, Gainesville, FL 32611, USA

Received 8 December 1992
Accepted 1 March 1993
Abstract

This paper proposes a simple consistent model specification test for parametric models against non-parametric alternatives. The test statistic is based on a comparison of the randomly weighted sum of squared residuals of the parametric estimate of the model and the sum of squared residuals from a series-based non-parametric estimate of the model.
1. Introduction

Suppose that $y$ is generated by the model

$$y_i = f(x_i) + u_i$$

for some unspecified function $f$. One is interested in determining if $f$ belongs to some parametric family denoted by

$$\mathcal{G}(\alpha) = \{ g(x, \alpha) : \alpha \in A \subset \mathbb{R}^p,\ A \text{ compact},\ p \text{ finite} \},$$

such that the null hypothesis implied is essentially $H_0: f \in \mathcal{G}(\alpha)$ against the weakly specified alternative that $H_1: f \notin \mathcal{G}(\alpha)$. Define the distance between the parametric function and the true function as

$$d(f, g) = \min_{\alpha \in A} \int \big( f(x) - g(x, \alpha) \big)^2 \, dP(x),$$

where $P(x)$ is the probability measure for $x$. Of course $d(f, g) = 0$ if and only if for some $\alpha \in A$, $f(x) = g(x, \alpha)$ almost everywhere $P$. We may state the null hypothesis equivalently as $H_0: d(f, g) = 0$ and the alternative as $H_1: d(f, g) > 0$, and our aim is to obtain a test that is consistent against this alternative.

The test will be based on a comparison of sums of squared residuals where the squared residuals under the null are given randomly generated weights. One advantage of the test is that the data may be heteroskedastic, unlike tests based on sample splitting, as in Yatchew (1988) and Wooldridge (1989). A second advantage is that the model may contain discrete as well as continuous regressors, and the restriction on the growth rate of the number of terms in the series regression is weaker than needed for the tests in Hong and White (1991). Finally, a series-based method of estimation is used to non-parametrically estimate the sum of squared residuals, so that the method is easy to use in practice.
2. The test statistic

We make the following assumptions on the data-generating process and the parametric model.
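Assumption 1. Assume that (i) $x_i$ are i.i.d. bounded random variables, (ii) $f$ is a bounded function over the support of $x_i$, and (iii) $u_i$ are independent random variables with $E(u_i \mid x_i) = 0$ and $E(|u_i|^{4+\delta})$ uniformly bounded for some $\delta > 0$.

Assumption 2. (i) Assume that we have an estimator $\hat\alpha$ of $\alpha$ such that $\hat\alpha \xrightarrow{p} \alpha^*$, with the minimand $\alpha^* \in \operatorname{int} A$, and

$$\sqrt{N}\,(\hat\alpha - \alpha^*) \xrightarrow{d} N(0, V_\alpha),$$

where $V_\alpha$ is a finite positive definite matrix. (ii) Assume that $g$ is continuously differentiable and that both $g(\alpha, x)$ and $(\partial/\partial\alpha)\, g(\alpha, x)$ have fourth moments with uniform bounds for all $\alpha$ in an open neighbourhood of $\alpha^*$.

Define the parametric residuals by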
$$\hat u_i = y_i - g(\hat\alpha, x_i).$$

The test is based on a comparison of a randomly weighted sum of squared parametric residuals with a sum of squared non-parametric residuals. The mean of the randomly weighted sum of squared parametric residuals is given by

$$\tilde\sigma^2(w) = \frac{1}{N} \sum_{i=1}^{N} w_i \hat u_i^2,$$

where $w_i$ are a set of random variables that are independent and identically distributed with mean 1 and variance $\sigma_w^2 > 0$, bounded away from 0 and $\infty$. The non-parametric residuals are obtained by series-based non-parametric estimation of the relationship between $y$ and $x$ given by

$$y_i = \sum_{k=1}^{K} \theta_k \psi_k(x_i) + u_i,$$

where $K$ denotes the number of terms to be used in estimation, and $\psi_k$ are basis functions. The results will allow for the Fourier Flexible Functional form [see Gallant (1981)], power series and interaction splines [see Newey (1990)] to form the $\psi_k$ functions.

The following regularity conditions on $f$ and $K$ are assumed. The notation $d_f$ is used to denote either the degree of smoothness [see Donald and Newey (1992)] or the Sobolev smoothness index of the function $f$ [see Andrews (1991)]. Also let the dimension of the regressor vector be denoted by $r$.

Assumption 3. Assume that $K = cN^{1/2 - \gamma}$ for some positive finite constant $c$ and $0 < \gamma < 1/2 - r/(4 d_f)$.

For a $\gamma$ satisfying Assumption 3 to exist we need that $r/(4 d_f) < 1/2$, which places restrictions on the degree of smoothness of $f$ relative to the dimension of $x$. The mean of the sum of squared non-parametric residuals is given by

$$\hat\sigma^2 = \frac{1}{N}\, y' M y,$$
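where $M = I - \Psi(\Psi'\Psi)^{-}\Psi'$ is the usual idempotent matrix formed using the regressors, which are the basis functions of $x$. Note that a generalized inverse may be used so that we do not require the basis functions to be orthogonal with respect to the unknown distribution of the regressors. This avoids the need to worry about the eigenvalues of the second moment matrix of the basis functions as in Hong and White (1991). One nice feature of this is that we may have discrete and continuous regressors and these may be treated symmetrically in forming the basis functions. The test statistic is based on

$$\hat t = \tilde\sigma^2(w) - \hat\sigma^2.$$

The following result characterizes the behavior of the test statistic under both the null and alternative hypotheses.

Theorem 1. Given Assumptions 1-3,

(i) Under $H_0: d(f, g) = 0$,

$$\frac{\sqrt{N}\,\hat t}{V(t)^{1/2}} \xrightarrow{d} N(0, 1).$$

(ii) Under $H_1: d(f, g) > 0$,

$$\frac{\sqrt{N}\,\hat t}{V(t)^{1/2}} \xrightarrow{p} \infty,$$

where $V(t)$ may be estimated consistently under the null by

$$\hat V(t) = \sigma_w^2\, \frac{1}{N} \sum_{i=1}^{N} \hat u_i^4.$$

Proof. (i) Under $H_0$, given Assumptions 1-3 it is easy to show that

$$\tilde\sigma^2(w) = \frac{1}{N} \sum_{i=1}^{N} w_i u_i^2 + o_p(N^{-1/2})$$

and

$$\hat\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} u_i^2 + o_p(N^{-1/2}),$$

so that

$$\frac{\sqrt{N}\,\hat t}{V(t)^{1/2}} = \frac{1}{\sqrt{N}\, V(t)^{1/2}} \sum_{i=1}^{N} u_i^2 (w_i - 1) + o_p(1) \xrightarrow{d} N(0, 1)$$

by the Liapunov Central Limit Theorem.

(ii) Using the same arguments as in (i),

$$\frac{\sqrt{N}\,\hat t}{V(t)^{1/2}} = \frac{1}{\sqrt{N}\, V(t)^{1/2}} \sum_{i=1}^{N} u_i^2 (w_i - 1) + \frac{1}{\sqrt{N}\, V(t)^{1/2}} \sum_{i=1}^{N} \big( f(x_i) - g(\alpha^*, x_i) \big)^2 w_i + O_p(1) \xrightarrow{p} \infty,$$

since

$$\frac{1}{N} \sum_{i=1}^{N} \big( f(x_i) - g(\alpha^*, x_i) \big)^2 w_i \xrightarrow{p} d(f, g) > 0.$$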
Note that both results hold when $V(t)$ is replaced by the consistent estimator $\hat V(t)$. This suggests that a simple consistent test of $d(f, g) = 0$ can be performed by a one-sided asymptotic t-test using $\sqrt{N}\,\hat t / \hat V(t)^{1/2}$. Note that the limiting distribution is degenerate when $\sigma_w^2 = 0$. A simple choice for the $w_i$ would be to generate a sample of $N$ observations from a uniform distribution over the interval $[1 - \gamma, 1 + \gamma]$, where $\gamma < 1$. In such a case, $\sigma_w^2 = \gamma^2/3$. Note also that it would be easy to allow for heterogeneous $x_i$ in the results and also limited forms of dependence.
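The steps of the test can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the author's implementation: the function names (`power_series_basis`, `series_spec_test`, `simulate`), the scalar regressor, the power-series basis, the linear null model fitted by OLS, and the choice $K = N^{1/4}$ are all hypothetical choices made for the example. The argument `half_width` plays the role of the half-width of the uniform weight interval, so the weight variance is `half_width**2 / 3`.

```python
import math
import numpy as np


def power_series_basis(x, K):
    """Return the N x K matrix with columns 1, x, ..., x^(K-1) (scalar x)."""
    return np.vander(x, N=K, increasing=True)


def series_spec_test(y, x, g_fitted, K, half_width=0.5, seed=None):
    """Return sqrt(N) * t_hat / sqrt(V_hat) and its one-sided N(0,1) p-value.

    g_fitted holds the fitted values g(alpha_hat, x_i) of the parametric model.
    """
    rng = np.random.default_rng(seed)
    N = y.shape[0]

    # Random weights: uniform on [1 - half_width, 1 + half_width],
    # so the mean is 1 and the variance is half_width**2 / 3.
    w = rng.uniform(1.0 - half_width, 1.0 + half_width, size=N)
    sigma_w2 = half_width ** 2 / 3.0

    # Randomly weighted mean of squared parametric residuals.
    u_hat = y - g_fitted
    s2_param = np.mean(w * u_hat ** 2)

    # Mean of squared series residuals. lstsq returns the minimum-norm
    # least-squares solution, so a rank-deficient basis is handled in the
    # same way as with a generalized inverse.
    Psi = power_series_basis(x, K)
    coef, *_ = np.linalg.lstsq(Psi, y, rcond=None)
    s2_series = np.mean((y - Psi @ coef) ** 2)

    t_hat = s2_param - s2_series
    V_hat = sigma_w2 * np.mean(u_hat ** 4)
    stat = math.sqrt(N) * t_hat / math.sqrt(V_hat)
    p_value = 0.5 * math.erfc(stat / math.sqrt(2.0))  # upper-tail N(0,1)
    return stat, p_value


def simulate(quadratic_coef, N=2000, seed=0):
    """Hypothetical design: linear null model fitted by OLS, x uniform on [-1, 1]."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=N)
    y = 1.0 + x + quadratic_coef * x ** 2 + rng.standard_normal(N)
    X = np.column_stack([np.ones(N), x])
    alpha_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    K = max(3, int(N ** 0.25))  # K = c * N^(1/2 - gamma) with c = 1, gamma = 1/4
    return series_spec_test(y, x, X @ alpha_hat, K, seed=seed + 1)


stat_null, p_null = simulate(quadratic_coef=0.0)  # linear model is correct
stat_alt, p_alt = simulate(quadratic_coef=2.0)    # linear model is misspecified
print(f"null: stat={stat_null:.2f}, p={p_null:.3f}")
print(f"alt:  stat={stat_alt:.2f}, p={p_alt:.3g}")
```

In this design the null is rejected at the 5% level when the statistic exceeds the one-sided critical value 1.645. A larger `half_width` raises the weight variance and hence $\hat V(t)$, so the choice trades off the scale of the studentization against the requirement that the weights stay positive.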
References

Andrews, D.W.K., 1991, Asymptotic normality of series estimators for nonparametric and semiparametric regression models, Econometrica 59, 307-345.
Donald, S.G. and W.K. Newey, 1992, Series estimation of semilinear models, mimeo.
Gallant, A.R., 1981, On the bias in flexible functional forms and an essentially unbiased form, Journal of Econometrics 15, 211-245.
Hong, Y. and H. White, 1991, Consistent specification testing via nonparametric series regression, UCSD Discussion Paper 91-139.
Newey, W.K., 1990, Series estimation of regression functionals, unpublished manuscript, MIT.
Wooldridge, J., 1989, Some results on specification testing against nonparametric alternatives, MIT Department of Economics Working Paper.
Yatchew, A.J., 1988, Nonparametric regression tests based on an infinite dimensional least squares procedure, University of Toronto Department of Economics Working Paper.