A unified approach to estimation and orthogonality tests in linear single-equation econometric models

Journal of Econometrics 44 (1990) 41-66. North-Holland

M. Hashem PESARAN
Trinity College, Cambridge, United Kingdom
University of California, Los Angeles, CA 90024-1477, USA

Richard J. SMITH
University of Manchester, Manchester, United Kingdom

Maximum-likelihood estimation is considered for a generalisation of the model of Anderson and Rubin (1949) in which the exogenous variables in the structural equation may not be included in the reduced-form equations. Classical and specification tests are derived for orthogonality hypotheses. A necessary and sufficient condition for their equivalence is presented. The classical tests are compared using Bahadur's asymptotic relative efficiency criterion. It is shown that a generalisation of the Durbin-Wu-Hausman $T_2$ statistic is asymptotically Bahadur-efficient.

*The authors are grateful to Les Godfrey, Alberto Holly, Peter Phillips, and anonymous referees for their helpful comments on an earlier version of this paper.

0304-4076/90/$3.50 © 1990, Elsevier Science Publishers B.V. (North-Holland)

1. Introduction

Following the pioneering work of Reiersol (1941, 1945) and Geary (1949), the method of instrumental variables has played an important role in the estimation of stochastic relations in circumstances where the 'orthogonality condition' is violated; that is, when some or all of the explanatory variables are correlated with the disturbance. The application of the instrumental-variables (IV) technique is, however, subject to two important considerations: (i) whether the chosen instruments are 'admissible' in the sense of being uncorrelated with the equation disturbance, and (ii) whether the available admissible set of instruments is used 'efficiently' if there are more instruments than the minimum number required for a consistent estimation of the unknown parameters.

Both of these issues have been extensively researched in the literature. The problem of the efficient use of instruments has been considered by Durbin (1954), Sargan (1958, 1959), and Hansen (1982); and the issue of instrument admissibility has been examined, primarily in the context of simultaneous-equations systems, by a number of authors, notably Wu (1973, 1974), Revankar and Hartley (1973), Revankar (1978), Hausman (1978), Hwang (1980), Richard (1980, 1984), Engle (1982), Holly (1982a, b), Smith (1983b), and Lubrano, Pierse, and Richard (1986).

In this paper we present a unified treatment of the dual problems of instrument admissibility and instrument efficiency and provide an asymptotic power comparison of the various exogeneity tests proposed in the literature, using Bahadur's (1960, 1967) criteria of asymptotic relative efficiency. Bahadur's approach is particularly useful here as it allows for a relatively simple method of comparing test statistics that are asymptotically indistinguishable by the more familiar Pitman local power criterion.

Our unification is based on a likelihood framework that differs in two respects from the conventional formulation of Anderson and Rubin (1949) exploited extensively in the literature. Firstly, it is more general as it permits the endogenous variables in the reduced-form model to be determined by different sets of exogenous variables, a situation that frequently arises in rational-expectations models [cf. Turkington (1985) and Pesaran (1986)]. Secondly, our choice of the factorisation of the likelihood function and the associated parameterisation allows important simplifications in the derivation of the maximum-likelihood (ML) estimators and in the generation and comparison of the various classical and Wu-Hausman specification tests proposed in the literature for testing the independence of some or all of the stochastic regressors and the equation disturbance.

The plan of the paper is as follows: Section 2 sets out the general likelihood framework and describes how it differs from the conventional framework of Anderson and Rubin (1949). Section 3 derives the ML estimators and provides a generalisation of the K-class estimator.
Section 4 deals with the issue of instrument admissibility and derives classical tests for the independence of a subset of stochastic regressors and the equation disturbance. A comparison of classical and specification test statistics is provided in section 5, where the conditions under which these two types of tests are asymptotically equivalent are discussed. Section 6 gives an analysis of the asymptotic power characteristics of the various classical exogeneity test statistics using Bahadur's approximate slope approach.

2. Formulation of the problem: A likelihood framework

Consider the linear stochastic relation

$$y_t = \alpha' x_t + \beta' z_t + u_t = \gamma' w_t + u_t, \qquad t = 1, \ldots, n, \tag{2.1}$$

where $\alpha$ and $\beta$ are $k$- and $r$-vectors of fixed unknown structural parameters; or in matrix notation,

$$y = X\alpha + Z\beta + u = W\gamma + u. \tag{2.1'}$$

The vector of stochastic variables $x_t$ is generated according to the relation

$$x_t = B' h_t + v_t, \qquad t = 1, \ldots, n, \tag{2.2}$$

or

$$X = HB + V. \tag{2.2'}$$

Eq. (2.2) generalises the more familiar Anderson and Rubin (1949) limited-information simultaneous-equations model in allowing $z_t$ and $h_t$ to be $r$- and $s$-vectors of possibly distinct elements. In addition to the usual limited-information simultaneous-equations system where $z_t \subset h_t$, model (2.1)-(2.2) arises in other circumstances, notably in rational-expectations (RE) models. For example, the RE model

$$y_t = \alpha' x_t^e + \beta' z_t + \varepsilon_t, \tag{2.3}$$

where $x_t$ is determined as in (2.2) and $x_t^e = B'h_t = x_t - v_t$, can be written in the form of (2.1) with $u_t = \varepsilon_t - \alpha' v_t$. [See, inter alia, Pagan (1984) and Pesaran (1987, ch. 7).] In the RE contexts in which (2.2) and (2.3) represent behavioural equations for economic agents with heterogeneous information, $z_t$ need not be contained in $h_t$. Another case of model (2.1) and (2.2) in which $z_t$ is not fully included in $h_t$ occurs when $y_t$ is assumed to Granger 'noncause' $x_t$; that is, when $z_t$ includes lagged values of $y_t$, whereas $h_t$ does not.

The analysis that follows is based on the following assumptions:

Assumption 1. Conditional on $z_t$, $h_t$, and $\Omega_{t-1}$, the $(k+1) \times 1$ disturbance vector $\xi_t = (u_t, v_t')'$ is distributed as $N(0, \Sigma_{\xi\xi})$, where $\Sigma_{\xi\xi}$ is a nonsingular matrix and $\Omega_{t-1}$ is the information set on $\{y_s, x_s, z_s, h_s\}$ prior to period $t$.

Assumption 2. The parameters $\gamma$, $B$, and $\Sigma_{\xi\xi}$ are identified and $B$ has full column rank.

Assumption 3. Observation matrices $Z$ and $H$ are each of full column rank and the moment matrices $\Sigma_{zz} = E(Z'Z/n)$ and $\Sigma_{hh} = E(H'H/n)$ are positive definite.

Assumption 1 can be relaxed in a number of ways to allow for serial correlation and conditional heteroskedasticity in the $\{\xi_t\}$ process. The normality assumption is also not essential for the asymptotic theory developed below. Under Assumption 1, we have the following:

$$u_t \mid z_t, h_t, \Omega_{t-1} \sim N(0, \sigma^2), \tag{2.4a}$$

$$x_t \mid u_t, z_t, h_t, \Omega_{t-1} \sim N(B'h_t + \delta u_t, \Psi), \tag{2.4b}$$

where $\Sigma_{\xi\xi}$ is partitioned conformably with $\xi_t = (u_t, v_t')'$, $\delta = \sigma^{-2}\Sigma_{vu}$, and $\Psi = \Sigma_{vv} - \sigma^{-2}\Sigma_{vu}\Sigma_{uv}$.¹ This factorisation has been exploited by Richard (1984) and Lubrano et al. (1986) and independently in Pesaran (1984).

Let $\theta = (\gamma', \delta', \sigma^2, (\operatorname{vec} B)', (\operatorname{vec} \Psi)')'$ be the unknown parameter vector of model (2.4) and denote its log-likelihood function for the $n$ observations on $y_t$, $x_t$, $z_t$, $h_t$, and $\Omega_{t-1}$ by $l(\theta)$. Then under Assumptions 1-3 we have

$$l(\theta) = \sum_{t=1}^{n} f(\xi_t \mid z_t, h_t, \Omega_{t-1}),$$

where $f(\xi_t \mid z_t, h_t, \Omega_{t-1})$ is the conditional log-density function of $\xi_t$.² Now for the system (2.1)-(2.2) in matrix notation, we obtain

$$l(\theta) \propto -\frac{n}{2}\log\sigma^2 - \frac{1}{2}(y - W\gamma)'(y - W\gamma)/\sigma^2 - \frac{n}{2}\log|\Psi| - \frac{1}{2}\operatorname{tr}\{\Psi^{-1}(X - HB - u\delta')'(X - HB - u\delta')\}.$$

¹The more conventional framework of Anderson and Rubin (1949) exploited by many authors [see, inter alia, Holly and Sargan (1982) and Smith (1983b)] would replace (2.4) by $y_t \mid x_t, z_t, h_t, \Omega_{t-1} \sim N(\gamma' w_t + \eta' v_t, \omega^2)$, $x_t \mid z_t, h_t, \Omega_{t-1} \sim N(B'h_t, \Sigma_{vv})$, where $\eta = \Sigma_{vv}^{-1}\Sigma_{vu}$ and $\omega^2 = \sigma^2 - \Sigma_{uv}\Sigma_{vv}^{-1}\Sigma_{vu}$. However, as will be evident from later sections, formulation (2.4) allows substantial simplifications in the derivations of the ML estimators and the generation of orthogonality tests.

²Here we have followed the convention and set the likelihood for $\{\xi_s, s \le 0\}$ equal to unity. See Hall and Heyde (1980, p. 157).
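The factorisation in (2.4) can be checked numerically. The following is a minimal sketch, assuming a small illustrative covariance matrix for $\xi_t = (u_t, v_t')'$; the function names `factorise` and `rebuild` and the example matrix are hypothetical and do not come from the paper.

```python
import numpy as np

def factorise(sigma_xi):
    """Split Sigma_xi, ordered as (u, v')', into (sigma2, delta, psi) as in (2.4)."""
    sigma_xi = np.asarray(sigma_xi, dtype=float)
    sigma2 = sigma_xi[0, 0]        # var(u_t)
    sigma_vu = sigma_xi[1:, 0]     # cov(v_t, u_t), a k-vector
    sigma_vv = sigma_xi[1:, 1:]    # var(v_t)
    delta = sigma_vu / sigma2      # delta = sigma^{-2} Sigma_vu
    psi = sigma_vv - np.outer(sigma_vu, sigma_vu) / sigma2
    return sigma2, delta, psi

def rebuild(sigma2, delta, psi):
    """Recover Sigma_xi from the (sigma2, delta, psi) parameterisation."""
    k = len(delta)
    out = np.empty((k + 1, k + 1))
    out[0, 0] = sigma2
    out[1:, 0] = out[0, 1:] = sigma2 * delta
    out[1:, 1:] = psi + sigma2 * np.outer(delta, delta)
    return out

# Illustrative (assumed) covariance matrix with k = 2 stochastic regressors.
sigma_xi_example = np.array([[2.0, 0.6, -0.4],
                             [0.6, 1.5, 0.3],
                             [-0.4, 0.3, 1.0]])
sigma2, delta, psi = factorise(sigma_xi_example)
```

The mapping between $\Sigma_{\xi\xi}$ and $(\sigma^2, \delta, \Psi)$ is one-to-one, so the orthogonality hypothesis on the covariance $\Sigma_{vu}$ can be stated directly as a restriction on $\delta$.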

3. Maximum-likelihood and other estimators

Let the ML estimator of $\theta$ be $\hat\theta$; then the first-order conditions $\partial l(\hat\theta)/\partial\theta = 0$ yield

$$W'\hat u = \hat\sigma^2\, W'(X - H\hat B - \hat u\hat\delta')\hat\Psi^{-1}\hat\delta, \tag{3.1}$$

$$(X - H\hat B)'\hat u = \hat\delta(\hat u'\hat u), \tag{3.2}$$

$$\hat\sigma^2 = \hat u'\hat u/n, \tag{3.3}$$

$$\hat B = (H'H)^{-1}H'(X - \hat u\hat\delta'), \tag{3.4}$$

$$n\hat\Psi = (X - H\hat B - \hat u\hat\delta')'(X - H\hat B - \hat u\hat\delta'), \tag{3.5}$$

where $\hat u = y - W\hat\gamma$. Combining (3.2) and (3.4) yields

$$n\hat\Psi = X'(X - H\hat B - \hat u\hat\delta'). \tag{3.6}$$

Now using this result together with the equations in (3.1) corresponding to $X$ we have

$$n\hat\sigma^2\hat\delta = X'\hat u. \tag{3.7}$$

This result, as will be seen in section 4, forms the basis of most of the orthogonality tests proposed in the literature. Eqs. (3.7) and (3.2) can also be used to obtain

$$\hat B'H'\hat u = 0, \tag{3.8}$$

which forms the basis of the various IV estimators for $\gamma$ proposed in the literature. For example, when $Z$ is in $H$, Sargan's (1958) generalised IV estimator results by replacing $\hat B$ in (3.8) with the moment estimator for $B = \Sigma_{hh}^{-1}\Sigma_{hx}$, namely $(H'H)^{-1}H'X$. In the general case when $Z$ may not be in $H$, the ML estimator of $\gamma$ satisfies the relation

$$W'(\hat Q_h + \hat\mu^2 S_h)\hat u = 0, \tag{3.9}$$

where

$$\hat Q_h = P_h - \hat\rho^2 I_n, \tag{3.10a}$$

$$P_h = H(H'H)^{-1}H' = I_n - M_h, \tag{3.10b}$$

$$\hat\rho^2 = \hat u'P_h\hat u/\hat u'\hat u, \tag{3.10c}$$

$$S_h = M_h - M_hX(X'M_hX)^{-1}X'M_h, \tag{3.10d}$$

$$\hat\lambda^2 = \hat u'S_h\hat u/\hat u'\hat u, \tag{3.10e}$$

$$\hat\mu^2 = (1 - \hat\rho^2)/\hat\lambda^2. \tag{3.10f}$$

See appendix A for a derivation. Eq. (3.9) is a generalised K-class estimator generating equation for $\gamma$, which may be written as

$$\hat\gamma = (W'\hat R_hW)^{-1}W'\hat R_hy, \tag{3.11}$$

where

$$\hat R_h = \hat Q_h + \hat\mu^2 S_h. \tag{3.12}$$

Because of the dependence of $\hat R_h$ on $\hat\theta$, the computation of the ML estimators needs to be done iteratively. However, an asymptotically fully efficient estimator for $\gamma$ is provided by the following two-step procedure.

Step 1. Compute Sargan's (1958) generalised IV estimator $\bar\gamma = (W'P_hW)^{-1}W'P_hy$, the IV residuals $\bar u = y - W\bar\gamma$, and $\bar\rho^2$, $\bar\lambda^2$, and $\bar\mu^2$, by replacing $\hat u$ in (3.10) with $\bar u$.

Step 2. Compute $\bar R_h$ via (3.12) using the estimates from step 1, and hence the two-step estimator,³

$$\hat\gamma_{2S} = (W'\bar R_hW)^{-1}W'\bar R_hy.$$

³Notice that $W'(\hat R_h - \bar R_h)W/n = o_p(1)$, $W'(\hat R_h - \bar R_h)y/n^{1/2} = o_p(1)$ as $\bar\rho^2 - \hat\rho^2 = o_p(n^{-1/2})$ and $\bar\lambda^2 - \hat\lambda^2 = o_p(1)$. Step 2 may be further simplified by using the asymptotically equivalent form of $\bar R_h$, namely $P_h + \bar\lambda^{-2}S_h$, thus allowing the computation of the two-step estimator to be carried out on most econometric packages by means of auxiliary regressions.
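The two-step procedure can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the helper names `proj`, `two_step_estimator`, and `simulate` are hypothetical, and the simulated design (one stochastic regressor, with one element of $z_t$ excluded from $h_t$) is an assumed example rather than anything taken from the paper. Dense $n \times n$ projection matrices are formed for readability, not efficiency.

```python
import numpy as np

def proj(A):
    """Orthogonal projector onto the column space of A (full column rank assumed)."""
    return A @ np.linalg.pinv(A)

def two_step_estimator(y, X, Z, H):
    """Return (GIV estimate, two-step estimate) of gamma in y = W gamma + u, W = (X, Z)."""
    n = len(y)
    W = np.hstack([X, Z])
    P_h = proj(H)                     # (3.10b)
    M_h = np.eye(n) - P_h
    S_h = M_h - proj(M_h @ X)         # (3.10d)

    # Step 1: generalised IV estimator and the scalars in (3.10c), (3.10e), (3.10f).
    gamma_iv = np.linalg.solve(W.T @ P_h @ W, W.T @ P_h @ y)
    u = y - W @ gamma_iv
    rho2 = (u @ P_h @ u) / (u @ u)
    lam2 = (u @ S_h @ u) / (u @ u)
    mu2 = (1.0 - rho2) / lam2

    # Step 2: plug the step-1 scalars into (3.12) and apply (3.11).
    R_h = P_h - rho2 * np.eye(n) + mu2 * S_h
    gamma_2s = np.linalg.solve(W.T @ R_h @ W, W.T @ R_h @ y)
    return gamma_iv, gamma_2s

def simulate(n=800, seed=0):
    """Assumed design: z2 is in the structural equation but not in H."""
    rng = np.random.default_rng(seed)
    z1, h1, h2, h3 = rng.standard_normal((4, n))
    z2 = 0.8 * h1 + rng.standard_normal(n)
    u = rng.standard_normal(n)
    v = 0.5 * u + rng.standard_normal(n)      # x is correlated with u through v
    x = z1 + h1 + h2 + h3 + v
    y = 1.0 * x + 0.5 * z1 - 0.5 * z2 + u     # true gamma = (1.0, 0.5, -0.5)
    return y, x[:, None], np.column_stack([z1, z2]), np.column_stack([z1, h1, h2, h3])

y, X, Z, H = simulate()
gamma_iv, gamma_2s = two_step_estimator(y, X, Z, H)
```

In this assumed design both estimates should lie close to the true coefficient vector; the second step only changes the metric from $P_h$ to $\bar R_h$.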

To consider the relationship between $\hat\gamma$ and other estimators in the literature it is useful to examine the concentrated log-likelihood function

$$l(\hat\gamma) \propto -\frac{n}{2}\{\log|X'M_hX/n| - \log(1 - \hat\rho^2) + \log(\hat\sigma^2\hat\lambda^2)\},$$

derived as (A.5) in appendix A. In the standard case when $Z$ is in $H$, we have

$$n\hat\sigma^2\hat\lambda^2 = \hat u'S_h\hat u = y'S_hy,$$

for all values of $\gamma$, and thus $l(\hat\gamma)$ will be a monotonically decreasing function in $\hat\rho^2$. Minimising $\hat\rho^2$ in terms of $\gamma$ gives precisely the standard limited-information maximum-likelihood (LIML) estimator derived by Anderson and Rubin (1949). Moreover, since $\operatorname{plim}(n^{1/2}\hat\rho^2) = 0$, $\hat\rho^2$ may be set to zero, which yields Sargan's generalised IV estimator. Durbin's (1954) estimator maximizes the angle between $(y - W\gamma)$ and the space spanned by the columns of $H$, which is equivalent to minimising $\rho^2 = u'P_hu/u'u$ with respect to $\gamma$. When $Z$ is in $H$, Durbin's estimator is identical to the LIML estimator. In the general case where $Z$ may not be in $H$, instead of minimising $\rho^2$ the relevant criterion for the derivation of the ML estimator of $\gamma$ is to maximise the ratio $(1 - \rho^2)/\sigma^2\lambda^2$ or equivalently $u'M_hu/(u'u)(u'S_hu)$ in terms of $\gamma$.

4. Orthogonality tests

The issue of instrument admissibility and the associated problem of testing for lack of correlation between regressors and the disturbances have attracted a good deal of attention in the literature. These tests are generally referred to as orthogonality tests, although in the simultaneous-equations systems' context they are also known as exogeneity tests. The precise connection between these two types of orthogonality tests will be explored in section 5. In its general form the orthogonality hypothesis concerns the $k_2$-subvector $\delta_2$ of $\delta = (\delta_1', \delta_2')'$ defined in (2.4), namely,

$$H_0:\ \delta_2 = \operatorname{plim}(X_2'u/u'u) = 0, \tag{4.1}$$

to be tested against $H_1$: $\delta_2 \neq 0$, where $X = (X_1, X_2)$ is partitioned conformably with $\delta$. In the context of the RE model (2.2)-(2.3) this hypothesis allows for the replacement of RE variables $x_{t2}^e$ by their realization without a loss of efficiency.

4.1. Wald (W) test statistics

The Wald statistic is based on the estimator of $\delta_2$ given by (3.7) and examines the significance of $\hat\delta_2$. Partitioning $V$ and $B$ conformably with $X = (X_1, X_2)$, since $V_2 = X_2 - HB_2$, using (3.8) we can write

$$\hat\delta_2 = \hat V_2'\hat u/\hat u'\hat u. \tag{4.2}$$

Consider

$$n^{-1/2}\hat V_2'\hat u = n^{-1/2}V_2'(I_n - W(W'R_hW)^{-1}W'R_h)u + o_p(1). \tag{4.3}$$

As shown in appendix B, under $H_0$, $n^{-1/2}\hat V_2'\hat u$ has a limiting normal distribution with mean zero and variance matrix given from (B.5) in moment form as

$$\operatorname{var}(n^{-1/2}\hat V_2'\hat u) = \sigma^2 \operatorname*{plim}_{n\to\infty}[V_2'(I_n + W(W'R_hW)^{-1}W')V_2/n]. \tag{4.4}$$

Substituting estimators and sample values where appropriate in (4.4) we may state:

Proposition 4.1. A Wald statistic for $H_0$: $\delta_2 = 0$ is given by

$$W_{ML} = n\,\hat u'\hat P_{v_2}[I_n - W(W'\hat R_hW + W'\hat P_{v_2}W)^{-1}W']\hat P_{v_2}\hat u/\hat u'\hat u, \tag{4.5}$$

where $\hat P_{v_2} = \hat V_2(\hat V_2'\hat V_2)^{-1}\hat V_2'$. Under $H_0$: $\delta_2 = 0$, $W_{ML}$ has a limiting $\chi^2$ distribution with $k_2$ degrees of freedom. The statistic $W_{ML}$ may be simply computed as $n$ times the difference in uncentred $R^2$ from the LS regression of $\hat u$ on $\hat V_2$ and the double-length IV regression (in which $0_n$ denotes a zero vector of order $n$).⁴

Consider the more familiar LIML case where $Z$ is included in $H$. Now $\hat Q_h$ replaces $\hat R_h$ in (4.5). In this case it is further possible to define an alternative Wald statistic by replacing $\hat Q_h$, $\hat V_2$, and $\hat u$ in (4.5) with $P_h$, $M_hX_2$, and $y - W\bar\gamma$, respectively. Defining the projection matrix $P_{h,x_2} = P_h + M_hX_2(X_2'M_hX_2)^{-1}X_2'M_h$ and the generalised IV estimator under $H_0$ as $(W'P_{h,x_2}W)^{-1}W'P_{h,x_2}y$, these substitutions result in:

⁴The uncentred $R^2$ of the IV regression of $y$ on $X$ in the metric $P$ is defined by $R^2 = (y'y - e'Pe)/y'y$, where $e = y - X(X'PX)^{-1}X'Py$.

Corollary 4.2. When $Z$ is included in $H$, the Wald statistic based on IV estimation under $H_1$ for the test of $H_0$: $\delta_2 = 0$ can be computed as the Wald statistic (denoted by $W_{2S}$) for the test of $d_2 = 0$ in the IV regression

$$y = Wc + (M_hX_2)d_2 + u, \tag{4.6}$$

in the metric $P_{h,x_2}$. Setting $d_2 = 0$, the IV regression yields the generalised IV (GIV) estimator for $\gamma$ under $H_0$, whereas when $M_hX_2$ is included, the GIV estimator under $H_1$, namely $\bar\gamma$, will result.

The artificial IV regression in this corollary is in fact the same as that suggested by Newey (1985). The corollary also represents a simple generalisation of the expanded regression test of $\delta = 0$ discussed in Hausman (1978) and Nakamura and Nakamura (1981). When testing for the orthogonality of the whole set $X$, the relevant metric becomes $P_{h,x}$ and since $Z$ is assumed to be in $H$, the results of the IV and LS estimation of (4.6) will coincide. Furthermore, in this case the $F$ statistic for the joint test of the significance of $M_hX$ in (4.6) will be numerically equal to Wu's (1973) $T_2$ statistic [cf. Smith (1983a)]. Seen from this perspective, the tests in Proposition 4.1 and in Corollary 4.2 may also be regarded as an extension of Wu's 'exogeneity test' [cf. Smith (1985)]. In fact, the test of $\delta_2 = 0$ based on (4.6) is identical to the Wald version of Holly's (1982b) extended regression [see Smith (1983b)]. But unlike in Holly's procedure, it is unnecessary to compute the special matrix $M_hX(X'M_hX)^{-1}$.

It is also worth noting that one may choose to estimate $\sigma^2$ in the IV regression (4.6) by either $\bar u'\bar u/n$ or $\bar v'\bar v/n$, where $\bar v = y - W\bar\gamma - M_hX_2\bar d_2$ and $\bar d_2 = (X_2'M_hX_2)^{-1}X_2'M_h(y - W\bar\gamma)$, to give valid Wald statistics. Clearly the statistic using the latter estimator, denoted $W_{IV}$, gives a computationally more attractive statistic than $W_{2S}$.
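For the special case described above, where $Z$ is in $H$ and the whole set $X$ is tested so that IV and LS estimation of (4.6) coincide, the expanded regression test reduces to an ordinary $F$ test on the added regressors $M_hX$. The sketch below illustrates that case only. The function names and the simulated design are hypothetical, and the denominator degrees of freedom ($n$ minus the number of regressors in the expanded regression) are an assumption about the usual form of the $F$ statistic, not a detail taken from the paper.

```python
import numpy as np

def resid_ss(y, A):
    """Residual sum of squares from the LS regression of y on the columns of A."""
    coef = np.linalg.lstsq(A, y, rcond=None)[0]
    e = y - A @ coef
    return float(e @ e)

def expanded_regression_F(y, X, Z, H):
    """F statistic for d = 0 in the LS regression y = W c + (M_h X) d + u, with Z in H."""
    n, k = X.shape
    W = np.hstack([X, Z])
    # M_h X: residuals from the LS regression of X on the instruments H.
    M_h_X = X - H @ np.linalg.lstsq(H, X, rcond=None)[0]
    A = np.hstack([W, M_h_X])
    rss_restricted = resid_ss(y, W)        # d = 0 imposed
    rss_unrestricted = resid_ss(y, A)
    dof = n - A.shape[1]
    return ((rss_restricted - rss_unrestricted) / k) / (rss_unrestricted / dof)

def simulate(delta, n=500, seed=0):
    """Assumed design: delta controls the correlation between x and u."""
    rng = np.random.default_rng(seed)
    z, h1, h2 = rng.standard_normal((3, n))
    u = rng.standard_normal(n)
    v = delta * u + rng.standard_normal(n)
    x = z + h1 + h2 + v
    y = x + 0.5 * z + u
    return y, x[:, None], z[:, None], np.column_stack([z, h1, h2])

F_endogenous = expanded_regression_F(*simulate(delta=0.5))
F_exogenous = expanded_regression_F(*simulate(delta=0.0))
```

With the assumed design, the statistic should be large when $\delta \neq 0$ and of ordinary $F$ magnitude when $\delta = 0$.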

4.2. Lagrange multiplier (LM) statistic

From (3.2) the score vector with respect to $\delta$ is $\Psi^{-1}(X - HB - u\delta')'u$. Denoting estimation under $H_0$: $\delta_2 = 0$ by tilde, it is easily seen that the relevant score vector for the LM statistic is given by (4.7), where $\tilde\Psi_{22} = \tilde V_2'\tilde V_2/n$ and $\tilde V_2 = M_hX_2$.⁵

⁵For ML estimation under $H_0$, the model (2.1')-(2.2') is rewritten as $y = W\gamma + u$, $X_1 = H(B_1 - B_2L) + X_2L + E_1$, treating $Z$, $H$, and $X_2$ as exogenous, where $L = \Sigma_{v_2v_2}^{-1}\Sigma_{v_2v_1}$. The corresponding equations to (3.9) and (3.10) are

$$W'(\tilde Q_{h,x_2} + \tilde\mu^2 S_h)\tilde u = 0, \qquad \tilde Q_{h,x_2} = P_{h,x_2} - \tilde\rho^2 I_n,$$

$$\tilde\rho^2 = \tilde u'P_{h,x_2}\tilde u/\tilde u'\tilde u, \qquad \tilde\mu^2 = (1 - \tilde\rho^2)/\tilde\lambda^2, \qquad \tilde\lambda^2 = \tilde u'S_h\tilde u/\tilde u'\tilde u.$$

To calculate the asymptotic variance matrix for the score in (4.7) from standard ML theory we merely use the inverse of the asymptotic variance matrix given in (4.4). After cancellation of $\Psi_{22} = \Sigma_{v_2v_2}$ under $H_0$ the relevant matrix will be estimated by (4.8), where $\tilde\sigma^2 = \tilde u'\tilde u/n$, $\tilde R_{h,x_2} = \tilde Q_{h,x_2} + [(1 - \tilde\rho^2)/\tilde\lambda^2]S_h$, and $\tilde Q_{h,x_2} = P_{h,x_2} - \tilde\rho^2 I_n$.⁶ Thus, combining (4.7) and (4.8) we may state:

Proposition 4.3. An LM statistic for $H_0$: $\delta_2 = 0$ is given by

LM,, = nii’v2

[

?i

i

I,, - W( W’l?h.x,

(4.9)

Under $H_0$: $\delta_2 = 0$, $LM_{ML}$ has a limiting $\chi^2$ distribution with $k_2$ degrees of freedom.

A simple regression version of $LM_{ML}$ of (4.9) is given by:

Corollary 4.4. An LM statistic for $H_0$: $\delta_2 = 0$ may be computed as the LM test for $d_2 = 0$ in the IV regression

$$y = Wc + (M_hX_2)d_2 + u, \tag{4.10}$$

with respect to the metric $\tilde R_{h,x_2}$ or $\tilde R^*_{h,x_2} = P_{h,x_2} + (1/\tilde\lambda^2)S_h$.

Proof. See appendix C.1.⁷

When $Z$ is in $H$, the relevant metric for regression (4.10) is $\tilde Q_{h,x_2} = P_{h,x_2} - \tilde\rho^2 I_n$. The consequent LM statistic is a natural generalisation of the LM versions of statistics discussed by Wu (1973), Hausman (1978), and Nakamura and Nakamura (1981) for the full-set $X$ regressor case. From the orthogonality of $\tilde W = (\tilde X_1, X_2, Z)$ and $\tilde u$, where $\tilde X_1 = H(\tilde B_1 - \tilde B_2\tilde L) + X_2\tilde L$ (from footnote 5), the exact LM statistic is given as $n$ times the uncentred $R^2$ from the LS regression of $\tilde u$ on $(W, M_hX_2)$ [cf. Engle (1982) and Smith (1983b)]. Using the GIV metric under $H_0$, $P_{h,x_2}$, in (4.10) gives the LM version of Corollary 4.2 due to Newey (1985), which is also the LM version of Holly's (1982) statistic [see Smith (1983b)].

⁶Under $H_0$: $\delta_2 = 0$, $\hat R_h = P_h - \hat\rho^2 I_n + [(1 - \hat\rho^2)/\hat\lambda^2]S_h$ and $\hat P_{v_2} = P_{h,x_2} - P_h$. Thus $\hat R_h + \hat P_{v_2} = \hat R_{h,x_2}$.

⁷Note that the IV regression setting $d_2 = 0$ returns $\tilde\gamma = (W'\tilde R_{h,x_2}W)^{-1}W'\tilde R_{h,x_2}y$ or an asymptotically equivalent estimator if $\tilde R^*_{h,x_2}$ is used. If $R_{h,x_2}$ or $R^*_{h,x_2}$ are evaluated at $\hat\rho^2$ and $\hat\lambda^2$, a similar result occurs. Also the IV regression with $d_2 \neq 0$ returns an asymptotically equivalent estimator to $\hat\gamma = (W'\hat R_hW)^{-1}W'\hat R_hy$. This result is obtained because $\hat R_{h,x_2}\hat V_2(\hat V_2'\hat R_{h,x_2}\hat V_2)^{-1}\hat V_2'\hat R_{h,x_2} = (1 - \hat\rho^2)\hat P_{v_2}$ from appendix B.2 and $\hat R_{h,x_2} - \hat P_{v_2} = \hat R_h$ from footnote 6. In fact the Wald statistic for $d_2 = 0$ derived from such an IV regression will differ from those in section 4.1 by an asymptotically negligible, $o_p(1)$, term.

4.3. Likelihood ratio (LR) statistic

For completeness, we present the LR statistic for $H_0$ [see also Lubrano et al. (1986) and Richard (1984)]. The likelihood value under $H_0$ is given by (4.11); see footnote 5. Together with that under $H_1$ given from section 3, the LR statistic may be stated in:

Proposition 4.5. The LR statistic for the test of $H_0$: $\delta_2 = 0$ is given by

$$LR_{ML} = n\log\left[\frac{1 - \hat\rho^2}{1 - \tilde\rho^2}\right] + n\log\left[\frac{\tilde\sigma^2\tilde\lambda^2}{\hat\sigma^2\hat\lambda^2}\right], \tag{4.12}$$

which under $H_0$: $\delta_2 = 0$ has a limiting $\chi^2$ distribution with $k_2$ degrees of freedom.⁸

⁸When $Z$ is in $H$, we have $\hat\lambda^2 = y'S_hy/n\hat\sigma^2$ and $\tilde\lambda^2 = y'S_hy/n\tilde\sigma^2$ and $LR_{ML}$ reduces to $n\log[(1 - \hat\rho^2)/(1 - \tilde\rho^2)]$, namely Hwang's (1980) LR statistic. See also Smith (1984).

5. Comparisons between classical and specification test statistics

For the purposes of comparison of the classical test (CT) statistics for $H_0$: $\delta_2 = 0$ and specification test (ST) statistics we assume that there exist prior exclusion restrictions on $X$ in (2.1'):

$$y = XS\alpha_* + Z\beta + u = W_*\gamma_* + u, \tag{5.1}$$

where $S = \operatorname{diag}(S_1, S_2)$ is a $(k, k_*)$ selection matrix where $S_1$ and $S_2$ are $(k_1, k_{1*})$ and $(k_2, k_{2*})$ matrices, respectively, and $W_* = (XS, Z)$, $\gamma_* = (\alpha_*', \beta')'$

(the analysis of the previous sections is fundamentally unchanged except we now maintain $\alpha = S\alpha_*$). Such prior restrictions might arise in practice if some potential instruments do not appear in (2.1). [See Hausman and Taylor (1981).] Given that $\gamma_*$ is the parameter vector of interest, Hausman's (1978) ST statistic may be expressed as

$$ST = n(\hat\gamma_* - \tilde\gamma_*)'\{\operatorname{var}[n^{1/2}(\hat\gamma_* - \gamma_*)] - \operatorname{var}[n^{1/2}(\tilde\gamma_* - \gamma_*)]\}^{-}(\hat\gamma_* - \tilde\gamma_*), \tag{5.2}$$

where $\hat\gamma_* = (W_*'\hat R_hW_*)^{-1}W_*'\hat R_hy$ and $\tilde\gamma_* = (W_*'\tilde R_{h,x_2}W_*)^{-1}W_*'\tilde R_{h,x_2}y$. Maintaining $\operatorname{plim}\hat\gamma_* = \gamma_*$, the implicit null and alternative hypotheses of ST are $H_0^*$: $\operatorname{plim}\tilde\gamma_* = \gamma_*$ and $H_1^*$: $\operatorname{plim}\tilde\gamma_* \neq \gamma_*$, respectively. [See Hausman and Taylor (1981), Holly (1982), and Ruud (1984).] Consequently, $H_0 \subseteq H_0^*$ and $H_1^* \subseteq H_1$. Re-expressing $H_0^*$: $\operatorname{plim}(W_*'R_{h,x_2}u/n) = 0$ immediately reveals that ST examines the validity of $R_{h,x_2}W_*$ as instrumental variables in (5.1). To compare $H_0$ with $H_0^*$, the precise form of $H_0^*$ is straightforwardly calculated as in (5.3), after some cancellations. Note that the statements

$$\tilde\gamma_* \xrightarrow{p} \gamma_* \quad\text{and}\quad n^{1/2}\tilde\rho^2 \xrightarrow{p} 0$$

are logically equivalent, allowing us to drop terms involving $\tilde\rho^2$, which gives $\operatorname{plim}(Z'R_{h,x_2}u/n) = 0$. The expression (5.3) allows us to relate the independence hypothesis $H_0$ and the instrument validity hypothesis $H_0^*$:

Proposition 5.1. In the context of (2.1')-(2.2'), the hypotheses $H_0$: $\delta_2 = 0$ and $H_0^*$: $\operatorname{plim}\tilde\gamma_* = \gamma_*$ are equivalent if and only if $S_2 = I_{k_2}$ or, equivalently, if $X_2$ is included fully in (2.1').

Proof. See appendix C.2.

Note that no requirement is made regarding $S_1$, that is, which columns of $X_1$ are included in (2.1'). Now an asymptotic equivalence in a local power sense occurs between CT and ST statistics if and only if $H_0$ and $H_0^*$ (and thus $H_1$ and $H_1^*$) coincide. [See Hausman and Taylor (1982, proposition 2.7).]

Corollary 5.2. The CT statistics for $H_0$ of section 4 and the ST statistic for $H_0^*$ based on $n^{1/2}(\hat\gamma_* - \tilde\gamma_*)$ are asymptotically equivalent in a local power sense if and only if there are no a priori restrictions on the coefficients of $X_2$ in (2.1').

Noting $\operatorname{var}[n^{1/2}(\hat\gamma_* - \gamma_*)] = \sigma^2[\operatorname{plim}(W_*'R_hW_*/n)]^{-1}$ and $\operatorname{var}[n^{1/2}(\tilde\gamma_* - \gamma_*)] = \sigma^2[\operatorname{plim}(W_*'R_{h,x_2}W_*/n)]^{-1}$, in order to preserve the positive semidefiniteness of $\{\cdot\}$ in (5.2) common estimators of $\sigma^2$, $\rho^2$ (or zero), and $\lambda^2$ should be used. Thus Wald and LM versions of ST use $\hat\sigma^2$, $\hat\rho^2$, $\hat\lambda^2$ and $\tilde\sigma^2$, $\tilde\rho^2$, $\tilde\lambda^2$, respectively, giving, for example, $R_{h,x_2} - R_h = P_{h,x_2} - P_h \ge 0$. A simple version of ST follows:

Proposition 5.3. A simple version of ST for $H_0^*$ is

$$ST = \tilde u'R_hW_*[W_*'(R_h - R_hW_*(W_*'R_{h,x_2}W_*)^{-1}W_*'R_h)W_*]^{-}W_*'R_h\tilde u/\sigma^2. \tag{5.4}$$

A Wald version, ST(W), uses $\hat\sigma^2$, $\hat\rho^2$, and $\hat\lambda^2$, whereas a Lagrange multiplier version, ST(LM), uses $\tilde\sigma^2$, $\tilde\rho^2$, and $\tilde\lambda^2$. Both these versions may be computed appropriately as tests for the exclusion of $(P_{h,x_2} - P_h)W_*$ in the IV regression

$$y = W_*c_* + (P_{h,x_2} - P_h)W_*d_* + v, \tag{5.5}$$

employing the metric $R_{h,x_2}$.

Proof. See appendix C.3.

Note that $(P_{h,x_2} - P_h)W_*$ may be computed as the predictor of $W_*$ in the LS regression of $W_*$ on $M_hX_2$.⁹ Moreover, it is easily seen that the regressions in (5.5) excluding and including $(P_{h,x_2} - P_h)W_*$ provide $H_0$-efficient (LM and Wald versions) and $H_1$-efficient (Wald version only) estimators of $\gamma_*$, respectively. Comparing (4.10) and (5.5), we also have:

Corollary 5.4. ST(LM) and ST(W) are algebraically identical to the LM statistic of (4.10) and the Wald statistic given in footnote 7, using $R_h$ and $R_{h,x_2}$ defined in Proposition 5.3, if and only if $X_2$ is fully included in (2.1').

Proof. See appendix C.4.

⁹Using $(P_{h,x_2} - P_h)W$, the predictor of $W$ in the LS regression of $W$ on $M_hX_2$, gives identical statistics.

The above results generalise the discussion in Newey (1985) and Smith (1983b). When $Z$ is in $H$, and generalized IV estimation is used, that is, $\rho^2 = 0$, regression (5.5) reduces to that suggested by Newey (1985).

6. Asymptotic relative efficiency of orthogonality tests by Bahadur's approach

As is already stated in Corollary 5.2, when there are no a priori restrictions on the coefficients of $X_2$ in (2.1') all the various CT statistics and the ST statistics proposed for testing the orthogonality hypothesis, $H_0$, have the same asymptotic power function under local alternatives. As a result the familiar Pitman local asymptotic theory cannot be used to distinguish between these tests. An alternative asymptotic procedure would be to adopt Bahadur's (1960, 1967) criterion of asymptotic relative efficiency, where for a fixed alternative hypothesis the test procedures are compared according to the rate at which their size tends to zero as the sample size increases. In this framework, one test is said to be (asymptotically) Bahadur-efficient relative to another one if its 'approximate slope' is greater. Under fairly general conditions Bahadur shows that the approximate slope of a test statistic, say $T$, which under the null $H_0$ is asymptotically distributed as a $\chi^2$ variate, is given by $\xi_T = \operatorname{plim}(n^{-1}T \mid H_1)$, where $H_1$ stands for the alternative hypothesis.

The calculation of the approximate slope of the CT statistic for $H_0$: $\delta_2 = 0$ is a straightforward, albeit tedious, matter. The approximate slope of the Wald statistic, $W_{ML}$, of (4.5) in the general case is given by¹⁰

where (6.la) @ = ii22- tit_

+B’B,,[~Z,‘-(X,,+r)-‘]B,,B,

9, = plim( X’MLX/n) = Z,, + B’( Zhh - E,,Z,‘Z,,)

The above result simplifies considerably ‘OThe details of the derivation given in appendix D.

of the various

(6.lb)

B,

in the simultaneous-equations

approximate

slopes reported

(6.ld)

case

in this section

are

M. H. Pewrun and R.J. Smith, Estimation

where

Z is included

und orthogonality tests

55

in H. In this case we have r = 0, and

A further simplification also results when the orthogonality hypothesis of interest involves all the regressors in X. In this case 52,,_ defined in (6.le) needs to be replaced by O,., = plim( X’M,, xX/n) = 0, and (6.2) becomes ti,+/ = a%‘( n,’

- a-1)

8,

(6.3)

where s2, = z,,,, = plim( X’M,,X/n). But since Q2,- 52, = B’{ 1hh - z,z~z;‘~,,}

B,

then t,,, 2 0. The strict inequality 5, > 0 holds in situations where B # 0, and 6 # 0 and columns of H can not be expressed as exact linear combinations of the columns of Z. This last condition also ensures the consistency of the Wald test of H,: 6 = 0 against H,: S # 0 for all B # 0. It is now easy to show that the approximate slope of the W,, statistic defined in Corollary 4.2 is also given by (6.2). The approximate slope of the WI, statistic in Corollary 4.2, which is based on the IV regression (4.6), is slightly different from (6.1) and is given by

where the expression tw,, is the same as that given by (6.2). The difference between the approximate slopes of W,, and WI, arises because of the different estimates of u2 that are used in their construction. In Wzs, u2 is estimated by ii’ii/n, while in Wrv, u2 is estimated by tYG/n. (See section 4.1.) It is now easily seen that when Z is in H, tw,, > tw,, = Ew, and the WI, statistic based on the IV regression (4.6) is asymptotically more efficient than the other forms of the Wald statistic for the test of 6, = 0. This is a fortunate result as WI, is also easier to compute by means of standard computer packages than the other W, statistics. An explicit expression for the approximate slope of the LM statistic given by (4.9) does not seem to be possible. But, as shown in appendix D, when Z is in H, the expression for the approximate slope of the LM version of the test of d, = 0 in (4.10) is given by [LM = [,*,/(l - $), where

56

M. II. Pesurun crnd R.J. Smith.

Estimution

and Lv2sis already defined by (6.2). The considerably when the null hypothesis under this case we have

und orthogonulity

tests

expression for tLM simplifies consideration is H,: 6 = 0. In

o’S’(ii?,’ - a,‘)s ‘5l.M= Finally,

1-

a26’fr’6 2

(6.6)

.

using (4.12) for the approximate

slope of the LR statistic,

we have

where c$, pi, and X2, stand for the probability limits of a’*, fi*, and x2 under Hi. In general, derivation of an explicit expression for these probability limits does not seem to be possible, but as shown in appendix D, when Z is in H, we have tLR = -log(l - pg). The expression for pi in general depends on 6, and 6, in a highly nonlinear manner. But when the null hypothesis is H,: 6 = 0, tLR becomes

(6.7)

A comparison of the approximate slopes for the various by (6.3) (6.4). (6.6) and (6.7) allows us to state:”

CT statistics

given

Proposition 6.1. When Z is included in H, the approximate slopes of the CT statistics for the test sf 6 = 0 satisfy the following inequalities:

Proof.

See appendix

C.5.

The results of this section also suggest that the WI, statistic of Corollary 4.2, which provides a natural extension of the Wu-Hausman statistic T2, is asymptotically Bahadur-efficient, besides being relatively easy to compute on standard software econometrics packages. “Notice

that for H,,: 6 = 0. cw,” in (6.4) simplifies &+,v = 0?6’( Q,,

’ - Qua’)“/{ 1 - 0%‘0,‘6}.

to

M. H. Pewrun

urld R.J. Smith. Estimation

und orthogonaliy

tests

57

7. Concluding remarks The likelihood model employed generalises that of Anderson and Rubin (1949) to include the case in which the endogenous variables in the reducedform model may be determined by a different set of exogenous variables to that in the structural equation. The ML estimator for the structural form coefficients is derived and shown to have the familiar generalized K-class structure. There is some efficacy in considering ML-based procedures in that they dominate others according to second-order efficiency arguments. A simple two-step estimator is also proposed which is asymptotically first-order efficient. In the usual simultaneous-equations context of Anderson and Rubin (1949) an algebraic equivalence is obtained between the LIML and Durbin’s (1954) IV estimators providing a likelihood-based rationale for Durbin’s approach to the problem of the efficient use of surplus instruments. In the general framework of the paper, classical tests (Lagrange Multiplier, Likelihood Ratio, and Wald) are derived for the independence of a subset of structural stochastic regressors and disturbance. Specification tests of the Hausman and Taylor (1981) type are also obtained and compared with the classical tests. An asymptotic equivalence between the tests occurs if and only if the ST regressors are fully included in the structural equation. Simple expanded regression versions of these test statistics are presented which are thereby natural generalisations of those discussed by, inter alia, Hausman (1978) Engle (1982) and Newey (1985). Bahadur’s asymptotic relative efficiency criterion is used to compare the various classical tests which are indistinguishable by the Pitman local power criterion. In particular, the generalisation of the Durbin-Wu-Hausman T, statistic is shown to be asymptotically Bahadur-efficient. 
Although the analysis is performed under the assumption of normality, this assumption is not crucial to either the asymptotic results for the ML estimator or those for the regression forms of the test statistics. The framework adopted here naturally allows for the introduction of serial correlation and conditional heteroskedasticity in the disturbances. This will be the subject of a further paper.

Appendix A

Derivation of the generalised K-class estimator generating eq. (3.9) and the concentrated log-likelihood function

Substitution of (3.4) in (3.8) yields

$$X'P_h\hat u = \hat\mu^2 X'\hat u, \tag{A.1}$$

where P,, and /’ are defined in (3.10). Using (A.l) in (3.6) and denoting the


moment-matrix

estimator


of D = Z,, - Z,,,E;jz,,

by d, we have

j = fi - a^‘(1 - fi’)&,

(A.2)

where fi=X’M,,X/n

M,,=I,,-Ph.

and

Now using the normal from (3.4), we have

eq. (3.1) corresponding

Z’ii = o^‘Z’M,( X-

ii&)j-‘8.

This result can be further s^= X’M$/{

to 2, after substitution

for l?

(A.3)

simplified

by noting

from (3.7) and (A.l)

that

nS2(1 - fi2)},

and using (A.2),

C’

After

=

fi-’

some routine

+

a^2p

_

1 _

a^2(1

algebra

s’)fi-ls^s^f-1 _

{P/(1

{P/(1-~‘)},

-$*)}I$-%=si-18,

CP(l - s”)&&‘s^= Thus,

*

we now have

C?2(1-;2)C-t8=1[l-

fi2)&jz-16”

{(l-

b2)/P}

- 1.

eq. (A.3) gives Z’&ii

+ (1 - 6’) Z’S&@

= 0,

(A.4)

where &, S,, F2, and fi’ are defined in (3.10). But, since S,X= 0, combining (A.l) and (A.4), we have (3.9) in the text. To derive the concentrated log-likelihood function we first note that I(8) Now

a - ilog(B’ljl).


Hence

l(6)

a -

iI0gltij+tlog(l

(A-5)

- fi2) - 510g(S2i2).

Appendix B

The asymptotic distribution of $n^{1/2}(\hat V_2'\hat u/n - \sigma^2\delta_2)$

Define $E_2$ by

$$V_2 = u\delta_2' + E_2, \tag{B.1}$$

so that u and E, are thus uncorrelated, R,, = P,, + is,,

and note that

+ o,(n-‘/2),

(B.2)

where we have defined A2 = plim p = [I + u~S’I/-~CS-~. it is straightforward to show that v;l;/PI From

Using (4.3) and (B.2)

= a%, + o,(l).

(B.1) and (B.3) we consider the asymptotic distribution of

$$n^{1/2}(\hat V_2'\hat u/n - \sigma^2\delta_2) = n^{1/2}\left[(V_2'u/n - \sigma^2\delta_2) - (V_2'W/n)(W'R_hW/n)^{-1}W'R_hu/n\right] + o_p(1).$$

Noting that the odd-order moments of $V_2$ and $u$ are zero consequent on the joint normality assumption, we may show that $n^{1/2}(V_2'u/n - \sigma^2\delta_2)$ and $W'R_hu/n^{1/2}$ are asymptotically uncorrelated using (B.1) and (B.2). However, again from (B.1), $n^{1/2}(V_2'u/n - \sigma^2\delta_2)$ has asymptotic variance $\sigma^2\Sigma_{22} + \sigma^4\delta_2\delta_2'$, where $\Sigma_{22} = n^{-1}E(V_2'V_2)$.
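The variance claim for $n^{1/2}(V_2'u/n - \sigma^2\delta_2)$ can be checked by simulation in the scalar case. The sketch below assumes the decomposition (B.1) with independent normal $u$ and $E_2$, and reads the asymptotic variance as $\sigma^2\Sigma_{22} + \sigma^4\delta_2^2$; the function name and all parameter values are hypothetical.

```python
import numpy as np


def simulated_and_theoretical_variance(sigma2=1.5, delta=0.6, omega=0.8,
                                       n=200, reps=10000, seed=0):
    """Compare the Monte Carlo variance of sqrt(n)*(v'u/n - sigma2*delta)
    with sigma2*Sigma22 + sigma2**2 * delta**2 in the scalar case."""
    rng = np.random.default_rng(seed)
    u = rng.normal(scale=np.sqrt(sigma2), size=(reps, n))
    e = rng.normal(scale=np.sqrt(omega), size=(reps, n))
    v = u * delta + e  # scalar analogue of (B.1); u and e are independent
    stat = np.sqrt(n) * ((v * u).mean(axis=1) - sigma2 * delta)
    Sigma22 = omega + sigma2 * delta**2  # variance of v
    theory = sigma2 * Sigma22 + sigma2**2 * delta**2
    return float(stat.var()), float(theory)
```

With the default values the theoretical variance is 2.82, and the simulated variance should be close to it up to Monte Carlo error.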

n1j2( V2’l;/n - u2S2)

(

[ V
03.4)


where var[ IZ-‘/~W’R,~U] Note

= a2 plim( W’R,W/n).

that var[n-‘/2(f-u)]

See, for example,

=a*[plim(W’R,W/n)]-‘,

Turkington

plim(I/,‘W/n) Therefore,

under

--)

= (Z,,2L,,0).

03.6)

H,: S, = 0, we have

, i a*h

N 0,

(1985). Also

1)21,2

+

(~U,I,,O)[plim(W'R,W/n)l

-1(zli2i130)‘}). (B-7)

Appendix C

C.1. Proof of Corollary 4.4

and

under

local

alternatives

to H,,

p’*= ~,(n-‘/~),

rlz,.Kz = o,(n-‘/2). C.2.

Proof of Proposition

Clearly

H, implies

5.1

H,* or H, L H,*. As

is full column

we require

rank,

S, = Ix for H,* to imply H, or H,* G H,.

W’l?T,,xlii = 0,

I?h,x, -

C.3. Proof of Proposition 5.3

To obtain (5.4) from (5.2) we note the invariance of (5.2), as formulated in the proposition, to the choice of g-inverse. Hence $A^{-1}B^{-}A^{-1}$ is a g-inverse for $ABA$ if $A$ is nonsingular, which gives the form (5.4) for $A = (W_2'R_hW_2)^{-1}$ and $B$ defined as $[\,\cdot\,]$. Now Rll. .,m. \ - PII) = 4,x, - R, which justifies the IV regression form (5.5) for statistic (5.4).

C.4. Proof of Corollary 5.4

Necessity follows immediately from Proposition 5.1. For sufficiency, firstly note that ST of (5.4) is invariant to the choice of g-inverse [Rao and Mitra (1971, lemma 2.2.2)]. Secondly,

R/l,JP,LV2 - pJw=

An

appropriate

W(W’R,7,

g-inverse

(Ph.,,- P,)W=

V2[(~;V2)-1~;w],

is the block corresponding bordered by zeroes which

YzW)-1W’)~2]~‘,

to X,, [ fi(l,l will reproduce LM

and W,

C.5. Proof of Proposition 6.1

Let $x = (1 - \rho_2)/(1 - \rho_1)$, where $\rho_1 = \sigma^2\delta'\Omega_1^{-1}\delta$ and $\rho_2 = \sigma^2\delta'\Omega_2^{-1}\delta$. Since $1 \ge \rho_1 \ge \rho_2 \ge 0$, then

$$x - 1 \ge \log x \ge (x - 1)/x \ge 0,$$

which upon substituting for $x$ and using (6.3), (6.4), (6.6) and (6.7) yields the desired result.
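The proof rests on the elementary inequality $x - 1 \ge \log x \ge (x - 1)/x \ge 0$ for $x \ge 1$. A small numerical sketch of that ordering follows. Reading the three terms as the approximate slopes of the Wald, LR and LM statistics is an interpretation on our part, and the function name `slope_ordering` and the requirement that the larger quantity be strictly below one are assumptions made so that $x$ is finite.

```python
import math


def slope_ordering(rho1, rho2):
    """Return (x - 1, log x, (x - 1)/x) for x = (1 - rho2)/(1 - rho1).

    Requires 0 <= rho2 <= rho1 < 1, so that x >= 1 and is finite.
    """
    if not (0.0 <= rho2 <= rho1 < 1.0):
        raise ValueError("need 0 <= rho2 <= rho1 < 1")
    x = (1.0 - rho2) / (1.0 - rho1)
    return x - 1.0, math.log(x), (x - 1.0) / x


# Example: the three terms are weakly decreasing and non-negative.
upper, middle, lower = slope_ordering(0.5, 0.2)
assert upper >= middle >= lower >= 0.0
```

When the two quantities coincide, $x = 1$ and all three terms are zero, so the ordering holds with equality.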

Appendix D

Derivation of approximate slopes

D.1. Wald statistics

Using results written as n -‘w,,

in section 4.1, the Wald statistic

=

for testing

H,:

6, = 0 can be

ir’v;[e;2/v2 + @w( W’R,W)-1wP2] -lv$,ti%. (D.1)


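Under $H_1\colon \delta_2 \ne 0$, as $n \to \infty$ we have $\operatorname{plim}(n^{-1}\hat u'\hat u) = \sigma^2$, $\operatorname{plim}(n^{-1}\hat V_2'\hat u) = \sigma^2\delta_2$, and $\operatorname{plim}(n^{-1}\hat V_2'\hat V_2) = \Sigma_{22}$. $\operatorname{plim}(n^{-1}V_2'W)$ and $\operatorname{plim}(n^{-1}W'R_hW)$ are given in (B.6) and (B.5), respectively. Using these results in (D.1) and employing Rao (1973, example 2.9) to invert the probability limit of the matrix (normalised by $n^{-1}$) inside the square brackets in (D.1), we obtain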

where @ is defined in (6.lb) To obtain the approximate that, under H,, plim( K’ii’ii) plim(, To obtain

in the text. slope for W,, defined

in Corollary

4.2, first note

= u2,

- %+,,,a) _ = u%;Z,f,$,.

plim( n-‘ii’$,ii),

note that

and since P,,, ,., = P,, + F,j2,then

(D.2)

CD.31

where

a,, , z is defined swLs = o-‘(plim(

by (6.le) in the paper.

Hence

~YF~,~ii/n) - plim( ii’p$/n))

or

To derive the approximate (iYi2/YB)W,,, then t WW= u *t W,,/plim(

slope

ii’;/,

).

of

WI,

in Corollary

4.2, since

WI, =


But i;=y-

Wf-M,X,d;=

W(y-f)-

+zd”z+u

Therefore, noting that, under H,, 7 and vz are consistent V2, respectively, we have

) = u 2 - 2u

plim( i’C// where

estimators

*d;S, + d;Z1D11,2d2,

of y and

CD.51

d, = plim( d;l H,). Now using (6.5),

from which it immediately in (D.5) yields

follows that d, = a*.E&&.

Substituting

this result

and therefore

where tw,, is given by (D.4).

D.2. The LM statistic

The LM statistic (G’C/ti’ti) W,,. which

that we are concerned with here is given by LM = is the LM version of the statistic for testing d, = 0 in (4.6). The expression for li is defined by ti = y - WY, where T= (W’P,,. ,,W)~‘W’P,,, \->y.Writing ti = W(y - -+) + u, we have plim( z.i'ti/n ) = u 2 - 2fJ2(KO)17

+ 77’qJJ,

where 77= plim( 3 - y)

=

(

plim

(

n~‘W’P,,,,~W))-l(plim(n’W’P~,.~u)].

(D-7)


But using (D.2) and (D.3) in the above, we obtain

where

Hence,

5/M= EW’J(~ - 4).

(D.8)

D.3. The LR statistic

The LR statistic for the test of 6, = 0 is given by (4.12). Noting H,, plim( b2) = 0, plim( i2) = 1 - a2S’G’i ‘6, then

that under

0.9)

where $\sigma_1^2$, $\lambda_1^2$, and $\mu_1^2$ are the probability limits of $\hat\sigma^2$, $\hat\lambda^2$, and $\hat\mu^2$ under $H_1$, respectively. When $Z$ is in $H$, it follows from footnote 8 that (D.9) reduces to $c_{LR} = -\log(1 - \mu_1^2)$. Unfortunately, the derivation of an explicit expression for $\mu_1^2$ does not seem to be possible. However, in the simple case where the null hypothesis of interest is $H_0\colon \delta = 0$ and $Z$ is included in $H$, using the result 1-

fi2= ii’M,,, $i/iX,


where

ii are the LS residuals plim( n -‘ii’M,,, $) plim( K’ii’ii)


from (2.17, and noting


that

= a*(1 - a%?;‘S),

= a2 - 2a2(6’,0)q

+ q’Z,,q,

and q = plim( 7 we have l-p;‘,=(l and hence ,$,.R = log{ (1 - a%%y’8)/(1

- a%‘q’S)}.

References

Anderson, T.W. and H. Rubin, 1949, Estimation of the parameters of a single equation in a complete system of stochastic equations, Annals of Mathematical Statistics 20, 46-63.
Bahadur, R.R., 1960, Stochastic comparison of tests, Annals of Mathematical Statistics 31, 276-295.
Bahadur, R.R., 1967, Rates of convergence of estimates and test statistics, Annals of Mathematical Statistics 38, 303-324.
Durbin, J., 1954, Errors in variables, Review of the International Statistical Institute 22, 23-32.
Engle, R.F., 1982, A general approach to Lagrange multiplier model diagnostics, Journal of Econometrics 20, 83-104.
Geary, R.C., 1949, Determination of linear relations between systematic parts of variables with errors of observations, the variances of which are unknown, Econometrica 17, 30-59.
Hall, P. and C.C. Heyde, 1980, Martingale limit theory and its application (Academic Press, New York, NY).
Hansen, L.P., 1982, Large sample properties of generalised methods of moments estimators, Econometrica 50, 1029-1054.
Hausman, J.A., 1978, Specification tests in econometrics, Econometrica 46, 1251-1271.
Hausman, J.A. and W.E. Taylor, 1981, A generalised specification test, Economics Letters 8, 239-245.
Hausman, J.A. and W.E. Taylor, 1982, Comparing specification tests and classical tests, Mimeo. (M.I.T., Cambridge, MA).
Holly, A., 1982a, A remark on Hausman’s specification test, Econometrica 50, 749-759.
Holly, A., 1982b, A simple procedure for testing whether a subset of endogenous variables is independent of the disturbance term in a structural equation, Cahiers de recherches economiques no. 8209 (Universite de Lausanne, Lausanne).
Holly, A. and J.D. Sargan, 1982, Testing for exogeneity within a limited information framework, Cahiers de recherches economiques no. 8204 (Universite de Lausanne, Lausanne).
Hwang, H.-S., 1980, Tests of independence between a subset of stochastic regressors and disturbance, International Economic Review 21, 749-760.
Lubrano, M., R.G. Pierse, and J.-F. Richard, 1986, Stability of a U.K. money demand equation: A Bayesian approach to testing exogeneity, Review of Economic Studies LIII, 603-634.
Nakamura, A. and M. Nakamura, 1981, On the relationships between several specification tests presented by Durbin, Wu and Hausman, Econometrica 49, 1583-1588.
Newey, W.K., 1985, Generalised method of moments specification testing, Journal of Econometrics 29, 229-256.
Pagan, A.R., 1984, Econometric issues in the analysis of regressions with generated regressors, International Economic Review 25, 221-247.
Pesaran, M.H., 1984, A general likelihood approach to the instrumental variable estimation and tests of misspecifications, presented at the Australasian meeting of the Econometric Society, 1984, Sydney.
Pesaran, M.H., 1986, Two-step, instrumental variable and maximum likelihood estimation of multivariate rational expectations models, Mimeo. (Trinity College, Cambridge), presented at the European meeting of the Econometric Society, 1986, Budapest.
Pesaran, M.H., 1987, The limits to rational expectations (Basil Blackwell, Oxford).
Rao, C.R., 1973, Linear statistical inference and its applications (Wiley, New York, NY).
Rao, C.R. and S.K. Mitra, 1971, Generalised inverse of matrices and its applications (Wiley, New York, NY).
Reiersol, O., 1941, Confluence analysis by means of lag moments and other methods of confluence analysis, Econometrica 9, 1-24.
Reiersol, O., 1945, Confluence analysis by means of instrumental sets of variables, Arkiv for Mathematik, Astronomi och Fysik 32, 1-119.
Revankar, N.S., 1978, Asymptotic relative efficiency analysis of certain tests of independence in structural systems, International Economic Review 19, 165-179.
Revankar, N.S. and M.J. Hartley, 1973, An independence test and conditional unbiased predictions in the context of simultaneous equations systems, International Economic Review 14, 625-631.
Richard, J.-F., 1980, Models with several regimes and changes in exogeneity, Review of Economic Studies XLVII, 1-20.
Richard, J.-F., 1984, Classical and Bayesian inference in incomplete simultaneous equation models, Ch. 4 in: D.F. Hendry and K.F. Wallis, eds., Econometrics and quantitative economics (Basil Blackwell, Oxford).
Ruud, P.A., 1984, Tests of specification in econometrics, Econometric Reviews 3, 211-242.
Sargan, J.D., 1958, The estimation of economic relationships using instrumental variables, Econometrica 26, 393-415.
Sargan, J.D., 1959, The estimation of relationships with autocorrelated residuals by the use of instrumental variables, Journal of the Royal Statistical Society B 21, 91-105.
Smith, R.J., 1983a, On the classical nature of the Wu-Hausman statistics for the independence of stochastic regressors and disturbance, Economics Letters 11, 357-364.
Smith, R.J., 1983b, Limited information classical tests for the independence of stochastic variables and disturbance of a single linear stochastic simultaneous equation, Discussion paper no. ES142 (University of Manchester, Manchester), presented at the European meeting of the Econometric Society, 1983, Pisa.
Smith, R.J., 1984, A note on likelihood ratio tests for the independence between a subset of stochastic regressors and disturbance, International Economic Review 25, 263-269.
Smith, R.J., 1985, Wald tests for the independence of stochastic variables and disturbance of a single linear stochastic simultaneous equation, Economics Letters 17, 87-90.
Turkington, D.A., 1985, A note on two-stage least squares, three-stage least squares and maximum likelihood estimation in an expectations model, International Economic Review 26, 507-510.
Wu, D.-M., 1973, Alternative tests of independence between stochastic regressors and disturbance, Econometrica 41, 733-750.
Wu, D.-M., 1974, Alternative tests of independence between stochastic regressors and disturbance: Finite sample results, Econometrica 42, 529-546.