Economics Letters 29 (1989) 31-35 North-Holland
ON THE CALCULATION OF THE INFORMATION MATRIX TEST IN THE NORMAL LINEAR REGRESSION MODEL
Alastair HALL *
North Carolina State University, Raleigh, NC 27695-8110, USA
Received 19 April 1988
Accepted 7 September 1988
In this paper we propose a simplified method of calculating the information matrix test in the normal linear regression model. A simulation study demonstrates that our test has considerably better size properties than the Chesher (1983)-Lancaster (1984) version of this test. Our test converges to its asymptotic distribution by n = 600, whereas for the same model Taylor (1987) reports that the Chesher-Lancaster test statistic has not converged to its asymptotic distribution by n = 8000.
1. Introduction
White (1982) introduced the information matrix test (IMT) as a method of testing the adequacy of a model estimated by maximum likelihood. The IMT is based on the fact that if the model is correctly specified, the Hessian and outer product forms of Fisher's information matrix are asymptotically equivalent. The original version of White's IMT is unappealing because its covariance matrix depends on the third derivative of the log-likelihood function. Chesher (1983) and Lancaster (1984) proposed a simplified method of calculating the IMT which is computationally convenient as (i) it only depends on the first two derivatives of the log-likelihood function and (ii) it can be calculated as $nR^2$ from an auxiliary regression.
Recently Taylor (1987) presented simulation results for both a simple linear regression model and a simple censored normal model, which demonstrate that the Chesher-Lancaster version of the IMT has poor size properties even in samples of size 1000 and more. This poor performance has led Orme (1987) and Kennan and Neumann (1987) to re-examine the IMT. Both papers argue that the size problems of the Chesher-Lancaster test are due in part to a numerical artifact but are also to be expected because of the method used to estimate the covariance matrix. The covariance matrix of the IMT depends on higher order population moments of the process, and the Chesher-Lancaster version of the test estimates these higher order moments by their sample analogs. Citing arguments by Kendall and Stuart (1977) and Davidson and MacKinnon (1983), Orme and Kennan and Neumann demonstrate that the use of this covariance matrix estimate is inefficient and may slow down the convergence of the IMT to its asymptotic distribution.
Orme (1987) proposes a new $nR^2$ method of calculating the IMT and presents simulation evidence for the censored normal model which shows the test is very close to its asymptotic $\chi^2$ distribution by a sample size of 500. Unfortunately this procedure has the same drawback as White's original test because its covariance matrix estimator depends on the third derivative of the log-likelihood function.
* Part of this work was undertaken during the tenure of a Social Science Research Council Studentship. I am grateful to Trevor Breusch, Grayham Mizon and Ray O'Brien for valuable discussions on this work.
0165-1765/89/$3.50 © 1989, Elsevier Science Publishers B.V. (North-Holland)
A. Hall / Calculation of information matrix test
In this paper we present a simplified method of calculating the IMT in the normal linear regression model. This method follows directly from Hall's (1987) demonstration that in this model the IMT is asymptotically equivalent to the sum of three statistics. It is shown here that this decomposition of the IMT facilitates its calculation via two auxiliary regressions. This test procedure is not subject to the criticisms of the Chesher-Lancaster approach outlined above. The results from a simulation study suggest that the size properties of our test statistic are a considerable improvement on those of the Chesher-Lancaster test.
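The equality on which the IMT rests can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it assumes an i.i.d. $N(\mu, \sigma^2)$ model without regressors, and the function name `im_discrepancy` and the sample sizes are hypothetical choices. It averages the Hessian plus the outer product of the scores at the maximum-likelihood estimates; the result should be close to zero for normal data and away from zero for uniform data, which have the wrong kurtosis.

```python
import numpy as np

def im_discrepancy(y):
    """Average of (Hessian + outer product of scores) for an i.i.d. N(mu, s2)
    model, evaluated at the maximum-likelihood estimates (a 2x2 matrix)."""
    y = np.asarray(y, dtype=float)
    u = y - y.mean()                      # deviations from the ML mean
    s2 = np.mean(u**2)                    # ML estimate of the variance
    # Per-observation scores with respect to (mu, s2).
    score = np.column_stack([u / s2, -0.5 / s2 + u**2 / (2 * s2**2)])
    opg = (score.T @ score) / len(y)      # outer-product form
    # Average Hessian of the log-likelihood with respect to (mu, s2).
    hess = np.array([
        [-1.0 / s2,            -np.mean(u) / s2**2],
        [-np.mean(u) / s2**2,  0.5 / s2**2 - np.mean(u**2) / s2**3],
    ])
    return hess + opg

rng = np.random.default_rng(0)
normal_gap = im_discrepancy(rng.normal(size=200_000))
# Uniform data with unit variance: same mean and variance, different kurtosis.
uniform_gap = im_discrepancy(rng.uniform(-np.sqrt(3), np.sqrt(3), size=200_000))
print(np.round(normal_gap, 3))
print(np.round(uniform_gap, 3))
```

In this sketch the misspecification shows up in the variance-variance element, which depends on the fourth moment of the data.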
2. A simplified method of calculating the information matrix test
Let
(i) $y_t \sim N(x_t'\beta, \sigma^2)$, $t = 1, 2, \ldots, n$;
(ii) $L_t$ be the log-likelihood function of $y_t$ conditional on the vector of $p$ exogenous variables $x_t$;
(iii) $\hat\theta$ be the maximum-likelihood estimator of $\theta' = (\beta', \sigma^2)$, and $\hat u_t = y_t - x_t'\hat\beta$;
(iv) $d_t(\theta)$ is a $(p+1)(p+2)/2$ vector whose $s$th element $d_{ts}(\theta)$ is
$$ d_{ts}(\theta) = \frac{\partial^2 L_t(\theta)}{\partial\theta_i\,\partial\theta_j} + \frac{\partial L_t(\theta)}{\partial\theta_i}\,\frac{\partial L_t(\theta)}{\partial\theta_j}, $$
where $j \le i$, $i = 1, 2, \ldots, p+1$ and $s = 1, 2, \ldots, (p+1)(p+2)/2$.
Following Hall (1987) we consider the case where the researcher may only be interested in testing part of the information matrix identity, and so base the IMT on
$$ D_n(\hat\theta) = n^{-1}\sum_{t=1}^{n} S\,d_t(\hat\theta), \tag{1} $$
where $S$ is a $q \times (p+1)(p+2)/2$ selection matrix. As in Hall (1987) we define $S = \mathrm{diag}[S_1, S_2, S_3]$, a block diagonal matrix in which $S_1$ is $q_1 \times p(p+1)/2$, $S_2$ is $q_2 \times p$ and $S_3$ is $1 \times 1$. The IMT is therefore
$$ T = n\,D_n(\hat\theta)'\,[V_n(\hat\theta)]^{-1}\,D_n(\hat\theta), \tag{2} $$
where $V_n(\hat\theta)$ is a consistent estimator of the covariance matrix of $D_n(\hat\theta)$. Under the null hypothesis that the model is correctly specified $T$ converges in distribution to $\chi^2_q$. Hall (1987) demonstrates that if the model is correctly specified then $T$ asymptotically decomposes into the sum of the following three independent quadratic forms:
(i)
$$ T_{1n} = \Big[\sum_{t=1}^{n} (\hat u_t^2 - \hat\sigma^2)\,\xi_t'\Big] S_1' \Big[S_1 \sum_{t=1}^{n} \xi_t \xi_t'\, S_1'\Big]^{-1} S_1 \Big[\sum_{t=1}^{n} \xi_t\,(\hat u_t^2 - \hat\sigma^2)\Big] \Big/ 2\hat\sigma^4, $$
where $\xi_t$ is a $p(p+1)/2$ vector consisting of the lower triangular elements of $(x_t x_t' - n^{-1}\sum_{t=1}^{n} x_t x_t')$;
Inspection of $T_{1n}$ and $T_{2n}$ demonstrates the following:
(a) $T_{1n} = ESS_1/2\hat\sigma^4$, where $ESS_1$ is the explained sum of squares from the regression of $\hat u_t^2 - \hat\sigma^2$ on $S_1\xi_t$.
(b) $T_{2n} = ESS_2/6\hat\sigma^6$, where $ESS_2$ is the explained sum of squares from the regression of $\hat u_t^3$ on $x_t$.
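The scaling constants $2\hat\sigma^4$ and $6\hat\sigma^6$ in (a) and (b) can be checked with a short worked calculation. This is an illustrative derivation from the normal log-likelihood, not reproduced from the paper. Writing $u_t = y_t - x_t'\beta$ and $L_t = -\tfrac12\log(2\pi\sigma^2) - u_t^2/2\sigma^2$, the scores are

$$ \frac{\partial L_t}{\partial\beta} = \frac{x_t u_t}{\sigma^2}, \qquad \frac{\partial L_t}{\partial\sigma^2} = -\frac{1}{2\sigma^2} + \frac{u_t^2}{2\sigma^4}, $$

and adding the second derivatives to the products of scores gives the three blocks of $d_t(\theta)$:

$$ \frac{x_t x_t'\,(u_t^2 - \sigma^2)}{\sigma^4}, \qquad \frac{x_t\,(u_t^3 - 3\sigma^2 u_t)}{2\sigma^6}, \qquad \frac{u_t^4 - 6\sigma^2 u_t^2 + 3\sigma^4}{4\sigma^8}. $$

Under normality each block has zero mean, and

$$ \mathrm{Var}(u_t^2 - \sigma^2) = 3\sigma^4 - \sigma^4 = 2\sigma^4, \qquad \mathrm{Var}(u_t^3 - 3\sigma^2 u_t) = 15\sigma^6 - 18\sigma^6 + 9\sigma^6 = 6\sigma^6. $$

Because the least-squares residuals satisfy $\sum_t x_t \hat u_t = 0$, the term $3\hat\sigma^2 \hat u_t$ drops out of the sample sum, which is consistent with (b) using the regression of $\hat u_t^3$ alone on $x_t$.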
This suggests that in large samples the IMT can then be calculated as
$$ IMT^* = T_{1n} + T_{2n} + T_{3n}. \tag{3} $$
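The calculation in (a), (b) and (3) can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the author's code: the function name `imt_star` is hypothetical, $S$ is taken to be the identity, each explained sum of squares is computed as the uncentered sum of squared fitted values, and $T_{3n}$ is assumed to be the usual kurtosis statistic $[\sum_t(\hat u_t^4 - 3\hat\sigma^4)]^2 / 24n\hat\sigma^8$, a form that is not given in the text above.

```python
import numpy as np

def imt_star(y, X):
    """Return (T1, T2, T3, IMT*) for the normal linear regression of y on X.

    X is the n x p regressor matrix (include a column of ones for an
    intercept). S is taken to be the identity matrix (an assumption).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n, p = X.shape

    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    u = y - X @ beta                      # least-squares residuals
    s2 = np.mean(u**2)                    # ML variance estimate (divides by n)

    def ess(target, Z):
        """Uncentered explained sum of squares from regressing target on Z."""
        fitted = Z @ np.linalg.lstsq(Z, target, rcond=None)[0]
        return float(fitted @ fitted)

    # xi_t: lower-triangular elements of x_t x_t' minus their sample mean.
    rows, cols = np.tril_indices(p)
    xi = X[:, rows] * X[:, cols]
    xi = xi - xi.mean(axis=0)

    t1 = ess(u**2 - s2, xi) / (2 * s2**2)     # (a): ESS_1 / (2 sigma^4)
    t2 = ess(u**3, X) / (6 * s2**3)           # (b): ESS_2 / (6 sigma^6)
    # ASSUMPTION: standard kurtosis statistic, not stated in the text above.
    t3 = float(np.sum(u**4 - 3 * s2**2) ** 2 / (24 * n * s2**4))
    return t1, t2, t3, t1 + t2 + t3

# Illustrative data: an intercept and two standard normal regressors.
rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X.sum(axis=1) + rng.normal(size=n)
t1, t2, t3, total = imt_star(y, X)
print(t1, t2, t3, total)
```

With an intercept in `X`, the element of $\xi_t$ built from the constant alone is identically zero; the least-squares routine used here tolerates the resulting rank deficiency, so no column needs to be dropped by hand.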
It is worth noting that IMT* is not subject to the criticisms leveled at the Chesher-Lancaster IMT. In particular, the estimate of the covariance matrix of $n^{1/2} D_n(\hat\theta)$ is not the sample covariance matrix but an estimate of $\lim_{n\to\infty} n^{-1}\sum_{t=1}^{n} S\,E[d_t(\theta_0)\,d_t(\theta_0)']\,S'$. The arguments of Davidson and MacKinnon (1983) suggest the use of such a covariance matrix estimator should improve the finite sample performance of the test.
To assess the size properties of IMT* we performed a simple simulation study. We examined the case where $\{y_i\}$ is a sequence of independent normal random variables generated as follows:
$$ y_i = \sum_{j=0} x_{ji} + u_i, \qquad i = 1, 2, \ldots, n, \tag{4} $$
where $x_{0i} = 1$ for all $i$, $x_{ji} \sim IN(0, 1)$ for $j \ge 1$, $u_i \sim IN(0, 1)$, $E[x_{ji}x_{ki}] = 0$ for $j \ne k$, and $E(x_{ji}u_i) = 0$.
Tables 1 and 2 contain the empirical size of the IMT* statistic when $S$ is the identity matrix for $n$ = 100, 200, 300, 400, 500 and 600 and $k$ = 2, 5. The case $k = 2$ is included for direct comparison with Taylor's (1987) results for the Chesher-Lancaster version of the IMT. The case $k = 5$ is included to see whether the rate of convergence to the asymptotic $\chi^2$ distribution deteriorates as the number of regressors increases. The figures in tables 1 and 2 denote the empirical size of the four test statistics in 1000 replications. This is calculated as the empirical rejection frequency when the test statistics are compared to the $100\alpha\%$ significance level critical points of their asymptotic $\chi^2$ distribution.
The results suggest that in both the $k = 2$ and $k = 5$ cases, the IMT* has converged to its asymptotic $\chi^2$ distribution by $n = 600$. Note that in both cases the correct size of the test is included in a 95% confidence interval centered on the empirical size, using the normal approximation to the binomial distribution. For the $k = 2$ case this is a considerable improvement over the Chesher-Lancaster IMT: Taylor (1987) reports empirical sizes of 0.300 and 0.220 for $\alpha = 0.10$ and $\alpha = 0.05$ respectively when $n = 1000$.
We also report the empirical size of each of the components of IMT*, because Hall (1987) demonstrates that these components contain tests already familiar in the literature. For instance, $T_1$ is asymptotically equivalent to White's (1980) direct test for heteroscedasticity under normality. $T_2$ contains the Bowman and Shenton (1975) test for skewness and $T_3$ is the test for non-normal kurtosis proposed by the same authors. The results suggest that for $T_1$, $\alpha$ is not in the 95% confidence interval centered on the empirical size described above for $n < 600$ and $k$ = 2 or 5. For $T_2$, $\alpha$ is in such an interval for $n < 400$ when $k = 2$, but only in such an interval for $n = 600$ when $k = 5$. However, for $T_3$, $\alpha$ is in a 95% confidence interval for $n > 400$ when $k = 5$, but only for $n = 600$ when $k = 2$.
A complete analysis of the IMT requires an examination of its power properties. Hall (1987) considers the asymptotic power of the IMT and further work is needed to examine the power
0.10
Test statistic
L-d
n
0.086
600
Test statistic
n
0.01 0.023 0.032 0.016 0.018 0.019 0.012
0.05 0.067 0.058 0.038 0.057 0.058 0.050
0.102
0.075
0.099
0.099
0.109
100
200
300
400
500
600
0.012
0.008 0.017
0.013 0.011 0.012
0.01
0.111
0.043
0.045 0.031
0.043 0.040 0.039
0.05
0.10
n
size, k = S
Empirical
T I”
0.080 0.072
400 500
Table
0.069 0.073 0.070
100 200 300
2
r,,
size, k = 2.
1
Empirical
Table
0.069 0.062
0.107 0.096 0.088
0.057
0.051
0.082
0.123
0.105
0.078
0.05
0.053
0.055 0.056
0.052 0.044 0.057
0.05
0.111
0.10
TZn
0.113
0.098 0.094
0.078 0.082 0.092
0.10
TZn
0.014
0.024
0.019
0.027
0.043
0.047
0.01
0.013
0.013 0.010
0.018 0.021 0.019
0.01
0.083
0.101
0.086
0.066
0.065
0.068
0.10
T3”
0.098
0.092 0.076
0.057 0.067 0.076
0.10
T3”
0.037
0.050
0.041
0.029
0.028
0.031
0.05
0.043
0.039 0.036
0.030 0.033 0.036
0.05
0.007
0.012
0.011
0.007
0.009
0.016
0.01
0.011
0.009 0.008
0.013 0.007 0.012
0.01
0.064 0.070 0.064
0.108 0.100
0.055
0.086
0.087
0.05
0.052
0.047 0.051
0.052 0.051 0.049
0.05
0.103
0.088
0.131
0.122
0.10
IMT*
0.085
0.091 0.079
0.077 0.077 0.072
0.10
IMT *
0.016
0.030
0.032
0.021
0.054
0.047
0.01
0.012
0.017 0.018
0.024 0.023 0.019
0.01
properties in finite samples. Such a study is beyond the scope of the present paper and is left for future research.
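The confidence-interval criterion used in the simulation discussion can be sketched as follows. This is a minimal illustration: the function name `nominal_size_in_interval` and the rejection frequency 0.085 are hypothetical, and treating Taylor's reported size of 0.300 as if it came from 1000 replications is an assumption made only for the example.

```python
import math

def nominal_size_in_interval(empirical_size, alpha, replications, z=1.96):
    """Check whether alpha lies in the 95% interval centered on the empirical
    size, using the normal approximation to the binomial distribution."""
    half_width = z * math.sqrt(
        empirical_size * (1.0 - empirical_size) / replications
    )
    return abs(empirical_size - alpha) <= half_width

# Hypothetical rejection frequency of 0.085 in 1000 replications, alpha = 0.10.
print(nominal_size_in_interval(0.085, 0.10, 1000))
# Size of 0.300 at alpha = 0.10; the 1000 replications are assumed here.
print(nominal_size_in_interval(0.300, 0.10, 1000))
```

The interval is centered on the empirical size rather than on $\alpha$, following the description in the text, so its width depends on the observed rejection frequency.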
References
Bowman, K.O. and L.R. Shenton, 1975, Omnibus test contours for departures from normality based on $\sqrt{b_1}$ and $b_2$, Biometrika 62, 243-250.
Chesher, A., 1983, The information matrix test: Simplified calculation via a score test interpretation, Economics Letters 13, 45-48.
Davidson, R. and J.G. MacKinnon, 1983, The small sample performance of the Lagrange multiplier test, Economics Letters 12, 269-275.
Hall, A.R., 1987, The information matrix test for the linear model, Review of Economic Studies 54, 257-263.
Kennan, J. and G. Neumann, 1987, A Monte Carlo study of the size of the information matrix test, Unpublished mimeo (Department of Economics, University of Iowa, Iowa City, IA).
Lancaster, T., 1984, The covariance matrix of the information matrix test, Econometrica 52, 1051-1053.
Orme, C., 1987, The small sample performance of the information matrix test, Unpublished memo (Department of Economics, University of York, York).
Taylor, L., 1987, The size bias of White's information matrix test, Economics Letters 24, 63-68.
White, H., 1980, A heteroscedasticity-consistent covariance matrix estimator and a direct test for heteroscedasticity, Econometrica 48, 817-838.
White, H., 1982, Maximum likelihood estimation of misspecified models, Econometrica 50, 1-26.