Testing for conditional heteroskedasticity with misspecified alternative hypotheses


Journal of Econometrics 82 (1997) 63-80

Naorayex K. Dastoor

Department of Economics, University of Alberta, Edmonton, Alberta, Canada T6G 2H4

Received 1 July 1994; received in revised form 1 December 1996

Abstract

For the multiple linear regression model, this paper examines the asymptotic behaviour of some tests for conditional heteroskedasticity when an alternative hypothesis is misspecified. In the standard likelihood-based framework, a Lagrange multiplier statistic based on a correctly specified alternative is ranked above one based on an overspecified alternative. It is shown that this ranking does not hold for statistics that are robust to the form of conditional heterokurticity. It is also shown that the ranking obtained in the standard likelihood-based framework does not depend on a joint density being specified in its entirety; the same ranking can be obtained when a type of information matrix equality holds. © 1997 Elsevier Science S.A.

Keywords: Asymptotic local power; Conditional heterokurticity; Conditional heteroskedasticity; Lagrange multiplier statistic; Misspecified alternative
JEL classification: C12

¹ This paper is a revised and abridged version of Dastoor (1995). The author is grateful to Adolf Buse, Judith A. Giles and two referees for helpful comments and/or discussions on previous versions of this paper. The author especially thanks Referee 1 for numerous helpful suggestions; in particular, for suggesting that an information matrix type equality be rewritten as in (4.1), for the projection arguments on (4.4) and for comments that inspired the formulation and analysis of the statistic $K_Z^{ab}$.

0304-4076/97/$17.00 © 1997 Elsevier Science S.A. All rights reserved
PII S0304-4076(96)01804-0

1. Introduction

For the multiple linear regression model, numerous test statistics are now available to researchers who wish to test the hypothesis of conditional homoskedasticity against the alternative of conditional heteroskedasticity; most of the relevant references can be found in Judge et al. (1985, Ch. 11) and, more recent ones, in Pagan and Pak (1993). However, it is probably safe to conjecture that the subset of most commonly used statistics includes among others: the Lagrange multiplier (LM) statistic proposed by Godfrey (1978) and Breusch and Pagan (1979); Koenker's (1981) modification of the LM statistic; and a statistic proposed by White (1980b, Eq. (3)) which, as shown by Waldman (1983), is actually a special case of that proposed by Koenker (1981). The LM statistic is constructed under the assumption of conditional normality while the Koenker (1981) statistic is a more-robust version in the sense that it can also be valid under the weaker assumption of conditional homokurticity. Recently, as an extension of the Davidson and MacKinnon (1985b) approach, Wooldridge (1990) proposed a general method that provides a statistic that is even more robust than that of Koenker (1981) in the sense of allowing for conditional heterokurticity; also see Wooldridge (1991a, b).

This paper examines the behaviour of the LM, Koenker (1981) and Wooldridge (1990) statistics when the alternative hypothesis used to carry out the tests is possibly misspecified. In a standard likelihood-based framework where a joint density is specified in its entirety and where the standard regularity conditions hold, Saikkonen (1989) used the criterion of asymptotic relative efficiency to study the asymptotic properties of the classical (LM, likelihood ratio (LR) and Wald) statistics in the presence of misspecification. The results of Saikkonen (1989) show that, in the standard likelihood-based framework, an LM statistic based on a correctly specified alternative hypothesis is ranked above one based on an overspecified alternative since it has greater asymptotic local power. The main result in this paper is that this ranking does not hold in the Wooldridge (1990) framework, where conditional heterokurticity is permitted.
It is also shown that the ranking obtained in the standard likelihood-based framework does not depend on a joint density being specified in its entirety; the same ranking can be obtained under the assumption that a type of information matrix equality holds.

The next section states the assumptions of the model, presents the LM, Koenker (1981) and Wooldridge (1990) families of statistics, and attempts to clarify some anomalies concerning 'the' White test. In Section 3, the results for the Wooldridge (1990) statistics are derived and compared to the corresponding ones from Saikkonen (1989). Section 4 defines a type of information matrix equality and examines the behaviour of a statistic that contains the LM and Koenker (1981) statistics as special cases. Finally, some concluding remarks are presented in Section 5.

2. The model and statistics

Consider the model $y_t = x_t^T\beta + u_t$ (with $t = 1, 2, \ldots, n$ throughout) written as

$$y = X\beta + u, \tag{2.1}$$


where $x_t^T$ is the $t$th row of $X$, the $n \times k$ matrix of observations on the independent variables, $y_t$ is the $t$th element of $y$, the $n \times 1$ vector of observations on the dependent variable, $\beta$ is a $k \times 1$ vector of unknown parameters, and $u_t$ is the $t$th element of $u$, the $n \times 1$ vector of disturbance terms. Following Pagan and Pak (1993), the tests to be considered will be embedded in the unifying framework of 'conditional moment tests' as discussed by Newey (1985), Tauchen (1985) and White (1987). Therefore, let the vector of conditioning variables, $w_t$, be defined as containing the algebraically distinct elements of $x_t$, $q_t$, and $z_t$, excluding unity, where $q_t$ and $z_t$ are some variables to be used below. Although it is possible for some variables in $w_t$ to be redundant in terms of conditioning per se, it is, nonetheless, notationally convenient to define $w_t$ in this manner. Now let $\sigma_t^2 = E[u_t^2 \mid w_t]$; then the hypothesis of primary interest is that of conditional homoskedasticity under which $E[u_t^2 \mid w_t]$ is a constant. In the next section, the analysis will be carried out under a drifting data-generating process (DGP) which is defined by the following maintained assumptions that include conditional heteroskedasticity given by the sequence

$$\sigma_t^2 = \delta_0 + n^{-1/2} q_t^T\delta, \tag{2.2}$$

where $(\delta_0, \delta^T)^T$ is an $(s + 1) \times 1$ vector of constants with $\delta_0$ a scalar and $q_t^T$ is the $t$th row of the $n \times s$ matrix $Q$.²

Assumption 1. (a) The model is given by (2.1) and (2.2); (b) $E[u_t \mid w_t] = 0$; (c) $\sigma_t^2 > 0$; and (d) $V[u_t^2 \mid w_t] > 0$.

Assumption 2. (a) For each $n$, $\{(y_t, w_t^T): t = 1, 2, \ldots, n\}$ is a sequence of independently and identically distributed vector random variables; and (b) $E[f_t f_t^T]$ is a positive definite matrix where $f_t$ is a column vector containing the algebraically distinct elements of $w_{*t} w_{*t}^T$, with $w_{*t}^T = (1, w_t^T)$.

Assumption 3. (a) $E[|f_{ti}|^{2+\varepsilon}]$ and $E[|f_{ti} u_t^2|^{2+\varepsilon}]$ are uniformly bounded for some $\varepsilon > 0$ where $f_{ti}$ is the $i$th element of $f_t$; and (b) $\lim E[(u_t^2 - \delta_0)^2 w_{*t} w_{*t}^T]$ exists and is a positive definite matrix.
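To make the drifting DGP in (2.1) and (2.2) concrete, the following is a minimal Python sketch, assuming NumPy. The function name `simulate_drifting_dgp`, the uniform design for $q_t$ and the normal disturbances are illustrative assumptions only; the maintained assumptions above do not require conditional normality.

```python
import numpy as np

def simulate_drifting_dgp(n, beta, delta0, delta, rng):
    """Draw (y, X, Q, sigma2) with Var(u_t | q_t) = delta0 + n^{-1/2} q_t' delta, as in (2.2).

    Hypothetical helper: the regressor design and the normal errors are
    illustrative choices, not part of the maintained assumptions.
    """
    beta = np.asarray(beta, dtype=float)
    delta = np.asarray(delta, dtype=float)
    k, s = len(beta), len(delta)
    X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
    Q = rng.uniform(0.0, 1.0, size=(n, s))        # bounded q_t (illustrative)
    sigma2 = delta0 + Q @ delta / np.sqrt(n)      # conditional variances from (2.2)
    if np.any(sigma2 <= 0):
        raise ValueError("sigma_t^2 must be positive (Assumption 1(c))")
    u = rng.normal(size=n) * np.sqrt(sigma2)      # E[u_t | w_t] = 0 by construction
    y = X @ beta + u
    return y, X, Q, sigma2

y, X, Q, sigma2 = simulate_drifting_dgp(
    n=400, beta=[1.0, 2.0], delta0=1.0, delta=[2.0, 0.0],
    rng=np.random.default_rng(1),
)
```

As $n$ grows, the heteroskedastic component $n^{-1/2} q_t^T\delta$ shrinks, so the simulated variances approach the constant $\delta_0$; setting `delta` to zeros gives conditional homoskedasticity exactly.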

The first assumption is standard and, given the definitions of $w_t$ and $f_t$, the two remaining assumptions succinctly present conditions that are sufficient for (and help to simplify) the derivation of the ensuing results. In particular, Assumption 2(a) follows Newey (1985) and, as there, the dependence of the data (and parameters) on $n$ is suppressed for notational convenience unless necessary. Assumption 2(b) ensures the existence and positive definiteness of moment matrices such as $E[w_{*t} w_{*t}^T]$ and $E[g_t g_t^T]$ where $g_t$ is a column vector containing the distinct elements of $(z_t - E[z_t])(z_t - E[z_t])^T$. Using the Cauchy-Schwarz and Minkowski inequalities, Assumption 3(a) provides all the moments required to apply a law of large numbers and a central limit theorem, and the existence of the limit in Assumption 3(b) helps to simplify the analysis. By specifying appropriate weaker assumptions on the conditional means and conditional variances of the disturbance terms, the analysis can be extended to cater for more complex models such as time series models with lagged dependent variables. However, for the Wooldridge (1990) statistics, the results obtained below will also apply to more general models so the restrictive maintained assumptions above help to simplify the ensuing analysis.

² The term 'drifting DGP' is borrowed from Davidson and MacKinnon (1993) and is used simply to denote a DGP that drifts in an appropriate manner; see Dastoor (1990) and Davidson and MacKinnon (1993, p. 409) for a discussion of the terminology used to denote different types of drifting DGPs.

Now suppose that the alternative hypothesis of conditional heteroskedasticity that is tested for is specified as

$$\sigma_t^2 = z_{*t}^T\alpha_* = \alpha_0 + z_t^T\alpha, \tag{2.3}$$

where $\alpha_* = (\alpha_0, \alpha^T)^T$ is an $(r + 1) \times 1$ vector of unknown parameters with $\alpha_0$ a scalar, $z_{*t}^T = (1, z_t^T)$ is the $t$th row of the $n \times (r + 1)$ matrix $Z_* = [e, Z]$, $e$ is an $n \times 1$ vector of ones, and $z_t$ may or may not equal $q_t$. Then $\alpha = 0$ under the hypothesis of conditional homoskedasticity. The three families of statistics to be considered are the LM statistic proposed by Godfrey (1978) and Breusch and Pagan (1979) and the statistics proposed by Koenker (1981) and Wooldridge (1990, p. 34); for reasons stated at the end of this section, the analysis of the latter two families will also provide the behaviour of the statistics proposed by White (1980b) and Hsieh (1983), and see Dastoor (1995, Section 5.3) for a discussion of a fourth family of statistics proposed by Bera and Yoon (1993). Let the Koenker (1981) and Wooldridge (1990) statistics be denoted by $K$ and $W$, respectively, and note that '$W$' does not refer to a Wald statistic. Henceforth, as in Saikkonen (1989), $C$ will be used as a generic letter to denote any one of $LM$, $K$, and $W$ unless stated otherwise. Also, 'Koenker' and 'Wooldridge' will refer to Koenker (1981) and Wooldridge (1990) respectively.

Let $M = I_n - e(e^Te)^{-1}e^T$, $\hat{\beta} = (X^TX)^{-1}X^Ty$, $\hat{u}_t = y_t - x_t^T\hat{\beta}$, and $\hat{v}$ be an $n \times 1$ vector with $t$th element $\hat{v}_t = \hat{u}_t^2$. The three statistics can then be written in generic form as

$$C_Z = (\sqrt{n}\hat{\alpha})^T \hat{D}_C^{-1} (\sqrt{n}\hat{\alpha}), \tag{2.4}$$

where

$$\hat{\alpha} = (Z^TMZ)^{-1}Z^TM\hat{v}, \qquad \hat{D}_{LM} = 2\hat{\sigma}^4 (n^{-1}Z^TMZ)^{-1}, \tag{2.5}$$

$$\hat{D}_K = \hat{\gamma}_K (n^{-1}Z^TMZ)^{-1},$$

and

$$\hat{D}_W = (n^{-1}Z^TMZ)^{-1}\, n^{-1}Z^TM\hat{\Gamma}MZ\, (n^{-1}Z^TMZ)^{-1}$$

with $\hat{\sigma}^2 = n^{-1}e^T\hat{v}$, $\hat{\gamma}_K = n^{-1}\hat{v}^TM\hat{v}$, $\hat{\Gamma} = \mathrm{diag}\{\hat{\gamma}_1, \hat{\gamma}_2, \ldots, \hat{\gamma}_n\}$ and $\hat{\gamma}_t = (\hat{u}_t^2 - \hat{\sigma}^2)^2$. Each of the three statistics is a quadratic form in $\sqrt{n}\hat{\alpha}$ and the matrix of each quadratic form is, under appropriate assumptions, a consistent estimator of the inverse of the asymptotic variance-covariance matrix of $\sqrt{n}\hat{\alpha}$.

Before considering the appropriate assumptions, two points should be noted. First, Godfrey (1978) specified the alternative hypothesis as $\sigma_t^2 = \exp(z_{*t}^T\alpha_*)$ in deriving the above $LM_Z$ statistic and Breusch and Pagan (1979) demonstrated that the same statistic is applicable for the alternative $\sigma_t^2 = h(z_{*t}^T\alpha_*)$ for appropriate $h(\cdot)$. Subsequently, Godfrey and Wickens (1982) showed that (2.3) and $\sigma_t^2 = h(z_{*t}^T\alpha_*)$ are 'locally equivalent alternatives'. Therefore, (2.3) is not particularly restrictive since it should be viewed as a linear approximation to certain nonlinear forms of conditional heteroskedasticity that depend on $z_t$ only. Second, and more important, the term 'null hypothesis' will be used in the sense of White (1980b, p. 823); i.e., a null hypothesis consists of all the assumptions/hypotheses initially used to derive the known asymptotic distribution of a test statistic. The null hypothesis of any test statistic can then be partitioned into three subsets: the hypothesis of primary interest, the maintained assumptions, and the auxiliary assumptions, if any. In the framework considered here, the hypothesis of primary interest is that of conditional homoskedasticity. The maintained assumptions are those which are always assumed to hold; i.e., Assumptions 1-3 so, given Assumption 1(b), the analysis here does not consider the case where the conditional mean is misspecified. For a given statistic, a particular assumption is classified as auxiliary if another statistic is robust to violations of this assumption so such a classification is an equivalent view of robustness, defined appropriately.
Therefore, if a null hypothesis is false then the maintained assumptions still hold but the hypothesis of primary interest and/or the auxiliary assumptions do not hold.

Given $Z$, the hypothesis of primary interest and the maintained assumptions are the same for each of $LM_Z$, $K_Z$, and $W_Z$. The matrices of the quadratic forms in (2.4) only differ as a result of any auxiliary assumptions imposed on the behaviour of $u_t$. In the case of $LM_Z$, Godfrey (1978) and Breusch and Pagan (1979) imposed the auxiliary assumption of conditional normality. In modifying $LM_Z$, Koenker proposed a more-robust statistic that allows for at least nonnormality in the sense that if $\alpha = 0$ in (2.3) then $K_Z$ does not impose the restriction that $E[u_t^4 \mid w_t] = 3\alpha_0^2$, as is done by the auxiliary assumption of normality. Instead, Koenker imposed the weaker auxiliary assumption of conditional homokurticity; i.e., given a sample size $n$, $E[u_t^4 \mid w_t]$ is the same constant for all $t$. The statistic proposed by Wooldridge is 'completely robust' in the sense that it allows for conditional heterokurticity (i.e., given a sample size $n$,


$E[u_t^4 \mid w_t] = \mu_{4t}$) so $W_Z$ does not require an auxiliary assumption in addition to the maintained assumptions. Let $H_0(C_Z)$ denote the null hypothesis of $C_Z$; then $C_Z \overset{a}{\sim} \chi^2(r)$ under $H_0(C_Z)$ as $\hat{D}_C$ is then a consistent estimator of the asymptotic variance-covariance matrix of $\sqrt{n}\hat{\alpha}$. Conditional on $w_t$, homoskedasticity and normally distributed $u_t$ imply homokurtic $u_t$, and $W_Z$ allows for conditional heterokurticity so $H_0(LM_Z) \subset H_0(K_Z) \subset H_0(W_Z)$; cf. Wooldridge (p. 22). Therefore, the differences between these three statistics can be viewed (equivalently to robustness) as resulting from each statistic being proposed for a different null hypothesis. While it is useful to recognize explicitly the different null hypotheses involved, it should also be noted that White (1987) and Wooldridge argue that it is preferable to use a statistic that is robust to violations of an auxiliary assumption of another statistic. In particular, Wooldridge (p. 19) argues that statistics based on auxiliary assumptions 'can result in inference with the wrong asymptotic size while having no systematic power for testing the auxiliary assumptions that are imposed' as part of the null hypothesis.

In concluding this section, the following comments are an attempt to clarify some anomalies concerning 'the' White test and its extensions; thereby, providing the reasons alluded to in the earlier statement concerning the behaviour of the White (1980b) and Hsieh (1983) statistics. First, it is important to note that White (1980b) actually proposed two statistics which are not always distinguished in the literature.³ To be more precise, let $\psi_t$ contain the distinct elements of $x_t x_t^T$, excluding unity; then the second of the two statistics proposed by White (1980b) is algebraically equivalent to $K_\Psi$, a result shown by Waldman (1983). As in the case of $K_Z$, this White (1980b, Eq.
(3)) statistic is based on the auxiliary assumption of conditional homokurticity and is the one commonly used in carrying out what is normally referred to as 'the' White test. In extending the analysis of White (1980b) to include time series regression, Hsieh (1983, p. 287) claimed to have also 'extended White's test of heteroscedasticity to allow for heterokurtic disturbances', and Wooldridge (p. 34) noted that in 'the case of the White test in a linear time series model, ... [$W_\Psi$ is] ... a statistic that is asymptotically equivalent to Hsieh's (1983) suggestion for a robust form of the White test'. However, it should be noted that the first White (1980b, Eq. (1)) statistic does allow for conditional heterokurticity. Also, it can be shown that this White statistic is algebraically equivalent to $W_\Psi$ so $W_Z$ generalizes (for arbitrary $Z$) a White (1980b) statistic that is already robust to the form of conditional heterokurticity. Given appropriate assumptions, the Hsieh (1983) statistic can be generalized to cater for arbitrary $Z$ so let such a statistic be denoted $HS_Z$. Using the artificial regression in Pagan and Hall (1983, Eq. (27)), it can be shown that $HS_Z$ is a Wald-type statistic while $W_Z$ is an LM-type statistic; see Dastoor (1995, Section 5.1). The analysis of Gallant and White (1988, p. 129) then shows that $HS_Z$ and $W_Z$ are asymptotically equivalent under a more general framework than that considered here. Therefore, the asymptotic behaviour of the White (1980b) and Hsieh (1983) statistics can easily be ascertained from the results obtained below for $K_Z$ and $W_Z$.

³ For example, in Amemiya (1985, p. 200), the theoretical discussion is with respect to the first White (1980b, Eq. (1)) statistic but the computational argument is with respect to the second White (1980b, Eq. (3)) statistic. Also, in the second term on the right-hand side of the equation for A given by Amemiya (1985, p. 200), the exponent of T should read '-3' instead of '-2'.
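As an informal illustration of the generic form (2.4) with the variance estimators in (2.5), the three statistics can be computed directly from squared OLS residuals. The following is a minimal Python sketch, assuming NumPy; the function name `het_test_statistics` and the simulated data are hypothetical and are not taken from the paper.

```python
import numpy as np

def het_test_statistics(y, X, Z):
    """Return (LM_Z, K_Z, W_Z) as quadratic forms in sqrt(n)*alpha_hat, per (2.4)-(2.5).

    Z holds the test variables z_t WITHOUT a column of ones; the projection
    M = I - e(e'e)^{-1}e' is applied by centring the columns of Z.
    """
    n = len(y)
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    v_hat = (y - X @ beta_hat) ** 2                 # squared OLS residuals
    sigma2_hat = v_hat.mean()                       # n^{-1} e'v
    Zc = Z - Z.mean(axis=0)                         # M Z
    A_inv = np.linalg.inv(Zc.T @ Zc / n)            # (n^{-1} Z'MZ)^{-1}
    alpha_hat = np.linalg.solve(Zc.T @ Zc, Zc.T @ v_hat)
    gamma_t = (v_hat - sigma2_hat) ** 2             # diagonal of Gamma_hat
    gamma_K = gamma_t.mean()                        # n^{-1} v'Mv
    D_LM = 2.0 * sigma2_hat ** 2 * A_inv
    D_K = gamma_K * A_inv
    B = (Zc * gamma_t[:, None]).T @ Zc / n          # n^{-1} Z'M Gamma M Z
    D_W = A_inv @ B @ A_inv
    s = np.sqrt(n) * alpha_hat
    quad = lambda D: float(s @ np.linalg.solve(D, s))
    return quad(D_LM), quad(D_K), quad(D_W)

# Illustrative data generated under conditional homoskedasticity.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Z = rng.normal(size=(n, 2))
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)
lm, k, w = het_test_statistics(y, X, Z)
```

In this sketch $K_Z$ coincides with $n$ times the $R^2$ from regressing $\hat{u}_t^2$ on a constant and $z_t$, which is the usual way of computing Koenker's statistic; only the middle matrix changes across the three statistics.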

3. Some asymptotic local power comparisons

The effect of using a misspecified alternative hypothesis could be examined by subjecting $C_Z$ to conditional heteroskedasticity generated by (2.3) with $z_t = q_t$ but this would lead to situations where $C_Z$ will have a degenerate asymptotic distribution. Therefore, as conventionally done, the ensuing analysis will be carried out under the drifting DGP as defined in the previous section. Since the hypothesis of primary interest is that of conditional homoskedasticity and as (2.2) will hold throughout, henceforth, the hypothesis of primary interest will be stated as $\delta = 0$. Hence, the drifting DGP contains as a special case the hypothesis of primary interest and, as the sample size increases, it approaches this hypothesis of primary interest from the direction of (2.3) with $z_t = q_t$. Normally, a drifting DGP is formulated in terms of regions local (or close) to a null hypothesis but here the three statistics have different nulls so (2.2) considers regions local to the hypothesis of primary interest which (given the maintained assumptions) is a common component of the null hypotheses of the $C_Z$ statistics. Following Davidson and MacKinnon (1987), a sequence that approaches the hypothesis of primary interest from an arbitrary direction could also be considered. However, the analysis is carried out under (2.2) in an attempt to provide the local behaviour of a statistic that uses the $Z$ variables when, in effect, the conditional heteroskedasticity is a function of the $Q$ variables. Therefore, (2.3) with $z_t = q_t$ will be referred to as the 'correct' alternative hypothesis. Let $H_W$ denote the framework where only the maintained Assumptions 1-3 hold; i.e., the heterokurticity-robust framework of Wooldridge provides the drifting DGP $H_W$.
The following theorem then provides three results which can be obtained as special cases of the analysis of Gallant and White (1988).⁴ However, a direct approach is used to provide a proof of the theorem since this approach is much simpler than one that evaluates the numerous general expressions of Gallant and White (1988) for the special case being considered here. The proof of a theorem is relegated to the appendix and the proof of a proposition will be omitted since it should be fairly obvious.

⁴ For the concentrated optimand given by $n^{-1}(\hat{v} - Z\alpha)^TM(\hat{v} - Z\alpha)$, $-2n^{-1}Z^TM\hat{v}$ is the gradient vector evaluated at the restricted estimate $\alpha = 0$ so, by noting that $\hat{\alpha}$ in (2.5) is a nonsingular transformation of this gradient vector, Theorems 7.6 and 7.7 of Gallant and White (1988) are relevant for the first result in the following theorem and their Theorems 6.3(a) and 7.9 are relevant for the second and third results, respectively.

Theorem 1. Under $H_W$:

(a) $\sqrt{n}\hat{\alpha} \overset{a}{\sim} N\big(\{E[\tilde{z}_t\tilde{z}_t^T]\}^{-1}E[\tilde{z}_t q_t^T]\delta,\ \{E[\tilde{z}_t\tilde{z}_t^T]\}^{-1}\Omega\{E[\tilde{z}_t\tilde{z}_t^T]\}^{-1}\big)$ where $\tilde{z}_t = z_t - E[z_t]$ and

$$\Omega = \lim E[(u_t^2 - \delta_0)^2\tilde{z}_t\tilde{z}_t^T]; \tag{3.1}$$

(b) $\hat{D}_W = \{E[\tilde{z}_t\tilde{z}_t^T]\}^{-1}\Omega\{E[\tilde{z}_t\tilde{z}_t^T]\}^{-1} + o_p(1)$; and

(c) $W_Z \overset{a}{\sim} \chi^2(r, \lambda(W_Z))$ where

$$\lambda(W_Z) = \delta^TE[q_t\tilde{z}_t^T]\,\Omega^{-1}E[\tilde{z}_t q_t^T]\delta. \tag{3.2}$$

Result (a) of this theorem provides the asymptotic distribution of $\sqrt{n}\hat{\alpha}$ for arbitrary $Z$ and this will also be used later to examine the behaviour of $LM_Z$ and $K_Z$. Result (b) shows that $\hat{D}_W$ is a consistent estimator of the asymptotic variance-covariance matrix of $\sqrt{n}\hat{\alpha}$ even in the presence of misspecification (i.e., $Z \neq Q$), and result (c) shows that $W_Z$ is asymptotically distributed as a $\chi^2$ variate for all $Z$. The null hypothesis of $W_Z$ is $H_0(W_Z)$: $\delta = 0$ (and the maintained assumptions) so $W_Z \overset{a}{\sim} \chi^2(r)$ under $H_0(W_Z)$.

Result (c) also shows that, in the presence of misspecification ($Z \neq Q$), it is possible for $W_Z$ to have asymptotic local power equal to size even if $\delta \neq 0$. For example, if $r < s$ then $\mathrm{rank}(E[\tilde{z}_t q_t^T]) < s$ so there exist certain values of $\delta \neq 0$ for which $\lambda(W_Z) = 0$ and, for these values of $\delta$, $W_Z$ will have asymptotic local power equal to size. Such values of $\delta$ will belong to the 'implicit null hypothesis' of $W_Z$ as defined by Davidson and MacKinnon (1985a, 1987); see Dastoor (1990) for a more detailed discussion of concepts such as an implicit null hypothesis. However, if $Z$ is such that $\mathrm{rank}(E[\tilde{z}_t q_t^T]) = s$ then $W_Z$ will have asymptotic local power greater than size for all $\delta \neq 0$. This may suggest the inclusion of as many variables as possible in $Z$ but, as will be seen, this suggestion could also lead to a loss of asymptotic local power under some circumstances.

Theorem 1 provides results for arbitrary $Z$ so, in principle, it can be used to compare the behaviour of $W_Z$ (for $Z \neq Q$) with that of $W_Q$, a Wooldridge statistic based on the correct alternative hypothesis. However, a particular inequality between $\lambda(W_Z)$ and $\lambda(W_Q)$ cannot hold for all $Z \neq Q$ so, in order to use the results of Das Gupta and Perlman (1974), some special cases of interest are considered. Let $Q_-$, $Q \equiv [Q_-, Q_1]$, and $Q_+ \equiv [Q, Q_2]$ be matrices of dimensions $n \times s_-$, $n \times s$, and $n \times s_+$, respectively.
Throughout, the subscripts '$-$', '$Q$' and '$+$' will then denote quantities which use (as the $Z$ variables) $Q_-$, $Q$, and $Q_+$, respectively, so $W_-$, $W_Q$, and $W_+$ correspond to cases where the


alternative hypothesis in (2.3) is underspecified, correctly specified, and overspecified, respectively. Then it can be shown that

$$\lambda(W_+) \geq \lambda(W_Q) \geq \lambda(W_-). \tag{3.3}$$

The proof of these inequalities (provided in the appendix) shows that an analogy can be drawn with the result which states that an increase in the number of instruments can lead to more efficient generalized method of moments estimators; e.g., see Davidson and MacKinnon (1993, Section 17.4). Note that, as mentioned earlier, (3.3) shows that a particular inequality between $\lambda(W_Z)$ and $\lambda(W_Q)$ cannot hold for all $Z \neq Q$.

Now let $P_C(\tau)$ denote the asymptotic local power of the statistic $\tau$ under $H_C$ and also let $P_C(\tau_1) \gtrless P_C(\tau_2)$ denote the case where the statistics $\tau_1$ and $\tau_2$ cannot be ranked in terms of asymptotic local power. Given the first inequality in (3.3) and the fact that $s_+ > s$, the following proposition follows from the results of Das Gupta and Perlman (1974).

Proposition W. Under $H_W$: $P_W(W_Q) \gtrless P_W(W_+)$.

This proposition shows that (in the Wooldridge framework) $W_Q$ and $W_+$ cannot be ranked in terms of asymptotic local power so, as mentioned earlier, there exist values of $\delta \neq 0$ for which the inclusion of $Q_2$ in $Q_+$ leads to a loss of asymptotic local power; of course, there also exist values of $\delta \neq 0$ for which the inclusion of $Q_2$ leads to a gain in asymptotic local power. Also, (with reference to the paragraph following Theorem 1) $Z = Q_+$ gives $\mathrm{rank}(E[\tilde{z}_t q_t^T]) = s$ so both $W_Q$ and $W_+$ will have asymptotic local power greater than size for all $\delta \neq 0$.

In a standard likelihood-based framework where a joint density is specified in its entirety and where the standard regularity conditions hold, Saikkonen (1989) (hereafter 'Saikkonen') used the criterion of asymptotic relative efficiency (ARE) to compare the behaviour of the classical (LM, LR, and Wald) statistics in the presence of misspecification. In this standard likelihood framework, it is easily seen that (when comparing two statistics with the same null but different alternative hypotheses) the ARE criterion in Eq.
(10) of Saikkonen provides the same qualitative conclusion as that based on a direct comparison of asymptotic local power. Therefore, the results of Saikkonen (when specialized to $LM_Q$ and $LM_+$) can be compared to those in Proposition W. However, first it is important to note that Saikkonen considered two cases when an alternative hypothesis is overspecified. Below, $LM_+$ corresponds to the first case of Saikkonen (p. 359) in the sense that $LM_+$ is a test for zero restrictions on the coefficients associated with $Q_+$. The second case in Section 4 of Saikkonen corresponds to the situation where, although the alternative hypothesis is overspecified with $Z = Q_+ = [Q, Q_2]$, the test is for zero restrictions on the coefficients associated with $Q$ only. This second case is not considered here since in such a formulation the hypothesis of primary interest would, in addition to conditional homoskedasticity,


include conditional heteroskedasticity that is a function of the variables in $Q_2$ only.

Although the drifting DGP actually considered by Saikkonen does not specify a particular distributional assumption, in the present context, let the Saikkonen framework be denoted by

$H_{LM}$: $H_W$ holds with $u_t \mid w_t \sim N(0, \sigma_t^2)$.

Either directly or by using (6) in Saikkonen, it can be shown that $LM_Z \overset{a}{\sim} \chi^2(r, \lambda(LM_Z))$ under $H_{LM}$ where $\lambda(LM_Z) = (2\delta_0^2)^{-1}\delta^TE[q_t\tilde{z}_t^T]\{E[\tilde{z}_t\tilde{z}_t^T]\}^{-1}E[\tilde{z}_t q_t^T]\delta$. It can also be shown that

$$\lambda(LM_+) = \lambda(LM_Q) \geq \lambda(LM_-); \tag{3.4}$$

cf. Saikkonen (pp. 359 and 362) for the equality and inequality in (3.4), respectively. Given the equality in (3.4) and the fact that $s_+ > s$, the results of Das Gupta and Perlman (1974) show that the conclusion of Saikkonen based on the ARE criterion can also be stated in a form comparable to Proposition W as follows.

Proposition LM. Under $H_{LM}$: $P_{LM}(LM_Q) > P_{LM}(LM_+)$ if $\delta \neq 0$.

The earlier comments concerning $\mathrm{rank}(E[\tilde{z}_t q_t^T])$ also apply here so, under $H_{LM}$, $LM_Q$ and $LM_+$ will have asymptotic local power greater than size for all $\delta \neq 0$.

Propositions W and LM provide different conclusions. In the Saikkonen framework, the inclusion of $Q_2$ in $Q_+$ always leads to a loss of asymptotic local power when the hypothesis of primary interest does not hold, whereas this is not the case in the Wooldridge framework. In the next section, it shall be shown that the crucial factor that provides these different conclusions is the presence or absence of a type of information matrix equality.

In concluding this section, it should be noted that when $C_Q$ and $C_-$ are compared under $H_C$, each of the three frameworks provides the same conclusion that such statistics cannot be ranked in terms of asymptotic local power. To see this, first note that the inequality in (3.4) and $s > s_-$ imply $P_{LM}(LM_Q) \gtrless P_{LM}(LM_-)$. Then note that the drifting DGP considered by Koenker, denoted $H_K$, is such that $H_{LM} \subseteq H_K \subseteq H_W$. Since $LM_Q$ and $LM_-$ cannot be ranked under $H_{LM}$ and as $LM_Z$, $K_Z$ and $W_Z$ are asymptotically equivalent under the most restrictive framework $H_{LM}$, it follows that $C_Q$ and $C_-$ also cannot be ranked under the less restrictive framework $H_C$ where $C = K, W$. An intuitive interpretation, for observing the same conclusion across the three frameworks, is as that provided by Saikkonen (pp. 361-362); i.e., under $H_C$, $P_C(C_-) > P_C(C_Q)$ is possible if $Q_-$ captures some essential features of $Q$, and $P_C(C_-) < P_C(C_Q)$ is also possible if $Q_-$ fails to capture the essential features of $Q$.
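The contrast between Propositions W and LM rests on how the power of a noncentral $\chi^2$ test depends on its degrees of freedom and its noncentrality parameter. The following is a minimal Python sketch using only the standard library; the function names are hypothetical, the closed form used for the central $\chi^2$ tail restricts the sketch to even degrees of freedom, and the numerical values of the noncentrality parameters are purely illustrative.

```python
import math

def chi2_sf_even(x, df):
    """Upper tail of a central chi-square with EVEN df (closed-form Poisson sum)."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

def critical_value(df, alpha=0.05):
    """Solve chi2_sf_even(c, df) = alpha by bisection."""
    lo, hi = 0.0, 200.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_sf_even(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def local_power(df, lam, alpha=0.05, terms=200):
    """P(chi2(df, lam) > c_alpha), using the Poisson(lam/2) mixture representation."""
    c = critical_value(df, alpha)
    weight = math.exp(-lam / 2.0)           # Poisson probability of j = 0
    power = 0.0
    for j in range(terms):
        power += weight * chi2_sf_even(c, df + 2 * j)
        weight *= (lam / 2.0) / (j + 1)     # next Poisson probability
    return power

# Correct alternative: s = 2 degrees of freedom, noncentrality 5 (illustrative).
p_correct = local_power(df=2, lam=5.0)
# Overspecified, same noncentrality (the equality in (3.4)): power falls.
p_over_same = local_power(df=4, lam=5.0)
# Overspecified, larger noncentrality (allowed by (3.3)): power can rise.
p_over_larger = local_power(df=4, lam=10.0)
```

With equal noncentrality, the extra degrees of freedom can only reduce power, which is the situation of Proposition LM. When overspecification also raises the noncentrality parameter, as (3.3) permits, the comparison can go either way, which is the situation of Proposition W.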

4. A type of information matrix equality

The standard regularity conditions imposed in the likelihood framework of Saikkonen ensure that the usual information matrix (IM) equality holds. Since $H_W$ does not specify a likelihood function, let an IM-type equality be defined as

$$\lim E[(u_t^2 - \delta_0)^2\tilde{z}_t\tilde{z}_t^T] = \kappa^2E[\tilde{z}_t\tilde{z}_t^T], \tag{4.1}$$

where $\kappa^2 = \lim E[(u_t^2 - \delta_0)^2]$; cf. the IM-type equality $B_n = 2\sigma_0^2A_n$ in White and Domowitz (1984, p. 154) which is in terms of all the parameters of the model while (4.1) is defined with respect to only the parameter to be tested, $\alpha$. For the concentrated optimand $n^{-1}(\hat{v} - Z\alpha)^TM(\hat{v} - Z\alpha)$, loosely speaking, (4.1) states that (at $\alpha = 0$) the asymptotic Hessian is proportional to the asymptotic variance-covariance matrix of the gradient vector; i.e., $E[\tilde{z}_t\tilde{z}_t^T]$ must be proportional to $\Omega$ in (3.1). The IM-type equality is specific to the model considered here but similar equalities can also be obtained for other cases of interest, one of which is discussed in Dastoor (1995, Section 5.4).

To explain the relevance of this IM-type equality, the behaviour of $K_Z$ and $LM_Z$ is now examined in greater detail by considering the statistic

$$K_Z^{ab} = (\sqrt{n}\hat{\alpha})^T[\hat{\gamma}(n^{-1}Z^TMZ)^{-1}]^{-1}(\sqrt{n}\hat{\alpha}), \tag{4.2}$$

where $\hat{\gamma} = an^{-1}\hat{v}^T\hat{v} + b\hat{\sigma}^4$ and the known scalars $a$ and $b$ are such that $(a, b) \in \mathcal{A}$ with

$$\mathcal{A} = \{(a, b) \mid a = 0,\ b > 0 \text{ and finite}\} \cup \{(1, -1)\}.$$
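The role of the pair $(a, b)$ in (4.2) can be checked numerically: the scalar $\hat{\gamma}$ is the only ingredient that changes across members of the family. The following is a minimal Python sketch, assuming NumPy; the function name `k_ab_statistic` is hypothetical and the input `v_hat` is an arbitrary positive vector standing in for squared OLS residuals.

```python
import numpy as np

def k_ab_statistic(v_hat, Z, a, b):
    """K_Z^{ab} of (4.2) with gamma_hat = a * n^{-1} v'v + b * sigma_hat^4."""
    v_hat = np.asarray(v_hat, dtype=float)
    Z = np.asarray(Z, dtype=float)
    sigma2_hat = v_hat.mean()                                   # n^{-1} e'v
    gamma_hat = a * np.mean(v_hat ** 2) + b * sigma2_hat ** 2
    Zc = Z - Z.mean(axis=0)                                     # M Z
    ZtZ = Zc.T @ Zc
    alpha_hat = np.linalg.solve(ZtZ, Zc.T @ v_hat)
    # (sqrt(n) a)' [gamma (Z'MZ/n)^{-1}]^{-1} (sqrt(n) a) = a' Z'MZ a / gamma
    return float(alpha_hat @ ZtZ @ alpha_hat / gamma_hat)

rng = np.random.default_rng(0)
n = 500
Z = rng.normal(size=(n, 2))
v_hat = rng.chisquare(1, size=n)    # stand-in for squared residuals (illustrative)

K = k_ab_statistic(v_hat, Z, a=1, b=-1)   # gamma_hat = n^{-1} v'Mv
LM = k_ab_statistic(v_hat, Z, a=0, b=2)   # gamma_hat = 2 * sigma_hat^4
```

With $(a, b) = (1, -1)$ the scalar is $n^{-1}\hat{v}^T\hat{v} - \hat{\sigma}^4 = n^{-1}\hat{v}^TM\hat{v}$, the sample variance of $\hat{v}_t$, and with $a = 0$ the statistic is $2/b$ times the $(0, 2)$ member.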

By construction, $K_Z^{ab}$ yields as special cases $K_Z$ and any statistic that is proportional to $LM_Z$ with a fixed factor of proportionality. In particular, if $(a, b) = (1, -1), (0, 2)$ then $\hat{\gamma} = \hat{\gamma}_K, 2\hat{\sigma}^4$ so $K_Z^{ab} = K_Z, LM_Z$, respectively. Basically, with $a = 0$, $K_Z^{ab} = 2LM_Z/b$ is an appropriate statistic whenever an auxiliary assumption is such that $b = (E[u_t^4 \mid w_t]/\sigma_t^4) - 1$ is a constant. For example, if an auxiliary assumption states that the conditional distribution of $u_t$ is a $t(m)$ distribution for given $m > 4$ then, with $a = 0$ and $b = 2(m - 1)/(m - 4)$, $K_Z^{ab} = (m - 4)LM_Z/(m - 1) \overset{a}{\sim} \chi^2(r)$ under its null hypothesis where conditional homoskedasticity and Assumptions 1-3 hold with conditional $t(m)$ disturbances. In passing, it
$\mu_{4t} = E[u_t^4 \mid w_t]$ and $\tilde{z}_t = z_t - E[z_t]$. To simplify the analysis, let $\mathrm{rank}(G_*) = p + 1 < n$; basically, $g_t$ must contain the distinct elements of $\mathrm{vech}(\tilde{z}_t\tilde{z}_t^T)$ so the requirement of full column rank saves having to use a selector matrix throughout. Since the span of $G_*$ is a subspace of $R^n$, there exists an $n \times (n - p - 1)$ matrix $A$ such that the columns of $[G_*, A]$ form a basis of $R^n$ with $G_*^TA = 0$. For appropriate choices of a scalar $\phi_{0n}$ and vectors $\phi_n$ and $\phi_{An}$, a decomposition of $\mu_4$ is then given by

$$\mu_4 = \phi_{0n}e + G\phi_n + A\phi_{An}. \tag{4.3}$$

Being an identity, (4.3) does not alter the nature of $H_W$. Note that the auxiliary assumption of conditional homokurticity can now be stated as $\phi_n = 0$ and $\phi_{An} = 0$ since (given a sample size $n$) $\mu_{4t} = \phi_{0n}$ is then the same constant for all $t$; whereas, conditional homoskedasticity and conditional normality imply $\phi_{0n} = 3\delta_0^2$, $\phi_n = 0$ and $\phi_{An} = 0$. Also note that, in the present context, the drifting DGP considered by Koenker is basically given by

$H_K$: $H_W$ holds with $\phi_n = O_p(n^{-1/2})$ and $\phi_{An} = 0$.

Result (a) of Theorem 1 and (4.2) show that, under $H_W$, $K_Z^{ab}$ will be distributed as a quadratic form in asymptotically normal variates and not necessarily as a $\chi^2$ variate; whereas $W_Z$ is always asymptotically distributed as a $\chi^2$ variate. Therefore, the following theorem provides two necessary and sufficient conditions under which $K_Z^{ab}$ is asymptotically distributed as a $\chi^2$ variate. Note that, since premultiplication of (4.3) by $G_z^T$ yields $(\phi_{0n}, \phi_{gn}^T)^T = (n^{-1}G_z^T G_z)^{-1} n^{-1} G_z^T \mu_4$, the maintained assumptions ensure the existence of $\phi_0 = \mathrm{plim}\{\phi_{0n}\}$ and $\phi_g = \mathrm{plim}\{\phi_{gn}\}$, which are required in the following theorem.

Theorem 2. Under $H_W$, the following statements are equivalent:

(a) $\phi_g = 0$ and $(1 - a)\phi_0 = (1 + b)\delta_0^2$;

(b) $\Omega = (\phi_0 - \delta_0^2)E[\tilde z_t \tilde z_t^T]$ and $(1 - a)\phi_0 = (1 + b)\delta_0^2$; and

(c) $K_Z^{ab} \overset{a}{\sim} \chi^2(r, \lambda(K_Z^{ab}))$ and $(1 - a)\phi_0 = (1 + b)\delta_0^2$, where

$\lambda(K_Z^{ab}) = (a\phi_0 + b\delta_0^2)^{-1}\delta^T E[q_t \tilde z_t^T]\{E[\tilde z_t \tilde z_t^T]\}^{-1}E[\tilde z_t q_t^T]\delta$.    (4.4)
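The second part of the conditions in Theorem 2, $(1 - a)\phi_0 = (1 + b)\delta_0^2$, can be read as an equation for $b$ once $a$ and the ratio $\phi_0/\delta_0^2$ are fixed. The following minimal sketch, in which the function names `implied_b` and `t_kurtosis_ratio` are hypothetical and not from the paper, solves for $b$ and checks the conditional normal, conditional $t(m)$ and Koenker cases discussed in the text.

```python
from fractions import Fraction


def implied_b(a, kurtosis_ratio):
    """Solve (1 - a) * phi0 = (1 + b) * delta0**2 for b.

    kurtosis_ratio is phi0 / delta0**2, i.e. E[u^4] / (E[u^2])**2.
    (Hypothetical helper; the name is an assumption of this sketch.)
    """
    return (1 - a) * kurtosis_ratio - 1


def t_kurtosis_ratio(m):
    """E[u^4] / (E[u^2])**2 for a Student t(m) variate, finite only for m > 4."""
    if m <= 4:
        raise ValueError("the fourth moment of t(m) requires m > 4")
    return Fraction(3 * (m - 2), m - 4)


# Conditional normality (ratio 3) with a = 0 gives the LM case.
b_normal = implied_b(0, Fraction(3))

# Conditional t(5) disturbances with a = 0.
b_t5 = implied_b(0, t_kurtosis_ratio(5))

# Koenker case a = 1: the condition holds with b = -1 for any ratio.
b_koenker = implied_b(1, t_kurtosis_ratio(5))

# Factor applied to LM_Z when a = 0, namely 2 / b, for t(10) disturbances.
scale_t10 = Fraction(2) / implied_b(0, t_kurtosis_ratio(10))
```

For $a = 0$ the helper reproduces $b = 2(m - 1)/(m - 4)$ and the scaling $(m - 4)/(m - 1)$ quoted for the $t(m)$ example; exact rational arithmetic is used so that the comparison with those closed forms is not affected by rounding.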

Condition (a) of this theorem provides the implicit auxiliary assumptions in the sense that they are necessary and sufficient for $K_Z^{ab}$ to be asymptotically distributed as a $\chi^2$ variate. In the case of a Koenker statistic, the implicit auxiliary assumption is given by only the first part as the second part is an identity when $(a, b) = (1, -1)$. In the case of a statistic that is proportional to $LM_Z$ (with a fixed factor of proportionality), the second part is a further implicit auxiliary assumption that basically specifies a restriction between the fourth and second conditional moments of $u_t$; e.g., in the case of $LM_Z$ where $(a, b) = (0, 2)$, this second part specifies $\phi_0 = 3\delta_0^2$ for which conditional normality is sufficient


but not necessary. Condition (b) is in terms of the IM-type equality in (4.1) so $K_Z$ will be asymptotically distributed as a $\chi^2$ variate if and only if (4.1) holds where $\kappa$ is unrestricted. In the case of a statistic that is proportional to $LM_Z$ (i.e., $(a, b) = (0, b)$), the second part of condition (b) requires (4.1) to hold with $\kappa = b\delta_0^2$.

For given $(a, b) \in \mathcal{A}$, let $H_{IM}(a, b)$ denote a particular framework/member from the family of drifting DGPs given by

$H_{IM}$: $H_W$ holds with the IM-type equality in statement (b) of Theorem 2

or, equivalently, where $H_W$ holds with statement (a) of Theorem 2. Some further points can now be noted. First, $H_{IM}(a, b)$ can be viewed as an implicit framework in the sense that it is one under which $K_Z^{ab}$ is asymptotically distributed as a $\chi^2$ variate so, as defined by Davidson and MacKinnon (1985a, 1987), the implicit null hypothesis of $K_Z^{ab}$ is given by $H_{IM}(a, b)$ with $\delta = 0$. Second, Koenker showed that $K_Q \overset{a}{\sim} \chi^2(s, \lambda(K_Q))$ under $H_K$ so (as $Z$ is arbitrary and $H_K \subset H_{IM}(1, -1)$ asymptotically) Theorem 2 provides a more general result that $K_Z \overset{a}{\sim} \chi^2(r, \lambda(K_Z))$ under $H_{IM}(1, -1)$, the implicit framework of Koenker.⁵ Third, $W_Z$ and $K_Z^{ab}$ are asymptotically equivalent under $H_{IM}(a, b)$ as (3.2) reduces to (4.4) in this case. Therefore, under the drifting DGP $H_W$, the completely-robust statistic of Wooldridge will have the same limiting distribution as that of the less-robust statistic $K_Z^{ab}$ when the latter's implicit auxiliary assumptions hold.⁶ Fourth, for given $(a, b) \in \mathcal{A}$, basic least-squares projection arguments show that the noncentrality parameter in (4.4) is maximized if $z_t$ contains $q_t$ so with $Z = Q_+, Q$, in turn, it is easily seen that $\lambda(K_+^{ab}) = \lambda(K_Q^{ab}) = (a\phi_0 + b\delta_0^2)^{-1}\delta^T E[\tilde q_t \tilde q_t^T]\delta$. Then Theorem 2 and $s_+ > s$ provide the following proposition which contains the result in Proposition LM as a special case.

This proposition shows that all the frameworks in $H_{IM}$ provide the same conclusion that (when the hypothesis of primary interest does not hold) a statistic based on the correct alternative hypothesis has greater asymptotic local power than one based on an overspecified alternative; in particular, the frameworks of Saikkonen and Koenker yield the same conclusion.⁷ It is important to note that the analysis of Saikkonen does not provide the behaviour of $K_Z$ under $H_K$. Being a modification of $LM_Z$, $K_Z$ is an LM-type (and not an LM) statistic as it is not formally constructed from a likelihood function, so the Koenker framework cannot be viewed as being likelihood-based in the usual sense where a joint density is specified in its entirety. Hence, this proposition also shows that the ranking of $LM_Q$ and $LM_+$ obtained under $H_{LM}$ does not depend on a joint density being specified in its entirety; the same ranking can be obtained under the assumption that an IM-type equality holds. Therefore, the crucial factor that accounts for the different conclusions in Propositions W and IM is the assumption that an IM-type equality holds in Proposition IM.

An intuitive interpretation of the different conclusions in Propositions W and IM can be based on the notion that a statistic constructed with relevant information only would be expected to have greater asymptotic local power than one formulated with both relevant and irrelevant information; i.e., basically, the inclusion of irrelevant information is expected to hinder the performance of a statistic. Given (2.2), the conditional heteroskedasticity is basically a function of $Q$ only, so any information contained in $Q_2$ (in $Q_+ = [Q, Q_2]$) is irrelevant to the form of heteroskedasticity. In (4.3), $\mathrm{plim}\,\phi_{gn} \equiv \phi_g = 0$ basically ensures that any information contained in $Q_2$ is also asymptotically irrelevant to the form of heterokurticity. Therefore, under any framework in $H_{IM}$, $Q_2$ is not permitted to contain any relevant information so its inclusion results in a loss of asymptotic local power. However, in the Wooldridge framework where $\phi_g = 0$ is not required to hold, $Q_2$ is permitted to contain information relevant to the form of heterokurticity. Therefore, although $Q_2$ contains information irrelevant to the form of heteroskedasticity, its inclusion in the Wooldridge framework could provide greater or lesser asymptotic local power depending on whether or not $Q_2$ contains any information that is asymptotically relevant to the form of heterokurticity.

⁵ See Whang and Andrews (1993, Footnote 2, p. 294) for a correction to the theorem presented by Koenker. In revising the present paper, the author became aware of a recent paper by Godfrey (1996) which also states this correction and which provides the behaviour of $K_Z$ under (basically) $H_K$ and not $H_{IM}(1, -1)$.

⁶ In particular, if only $\phi_g = 0$ holds then $W_Z \overset{a}{\sim} K_Z \overset{a}{\sim} 2\delta_0^2(\phi_0 - \delta_0^2)^{-1} LM_Z \overset{a}{\sim} \chi^2(r, \lambda(K_Z))$; whereas, if $\phi_g = 0$ and $\phi_0 = 3\delta_0^2$ then $W_Z \overset{a}{\sim} K_Z \overset{a}{\sim} LM_Z \overset{a}{\sim} \chi^2(r, \lambda(LM_Z))$.

⁷ Using the above least-squares projection arguments for the non-centrality parameter in (4.4), it can also be shown that, under $H_{IM}(a, b)$, $K_Q^{ab}$ and $K_-^{ab}$ cannot be ranked in terms of asymptotic local power.
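The loss of asymptotic local power from an overspecified alternative under an IM-type equality comes from a standard property of the noncentral $\chi^2$ distribution: for a fixed noncentrality parameter, rejection probability falls as the degrees of freedom rise (cf. Das Gupta and Perlman, 1974). The sketch below illustrates this numerically. It is only an illustration under stated assumptions: the function names are hypothetical, the degrees of freedom are restricted to even values so that the central $\chi^2$ survival function has a closed form, and the noncentral distribution is evaluated through its Poisson-mixture representation.

```python
import math


def chi2_sf_even(x, dof):
    """Survival function of a central chi-square with even dof (closed form)."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this sketch only handles positive even dof")
    half = x / 2.0
    term, total = 1.0, 0.0
    for j in range(dof // 2):
        total += term            # adds (x/2)**j / j!
        term *= half / (j + 1)
    return math.exp(-half) * total


def critical_value(dof, alpha=0.05):
    """Upper-alpha critical value of the central chi-square, by bisection."""
    lo, hi = 0.0, 200.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even(mid, dof) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def ncx2_power(dof, noncentrality, alpha=0.05, terms=200):
    """P(noncentral chi-square(dof, noncentrality) > critical value).

    Uses the Poisson mixture: weights Poisson(noncentrality / 2) on
    central chi-squares with dof + 2i degrees of freedom.
    """
    crit = critical_value(dof, alpha)
    half = noncentrality / 2.0
    weight = math.exp(-half)     # Poisson probability of i = 0
    power = 0.0
    for i in range(terms):
        power += weight * chi2_sf_even(crit, dof + 2 * i)
        weight *= half / (i + 1)
    return power


# Same noncentrality, correct (2 dof) versus overspecified (4 dof) alternative.
power_correct = ncx2_power(2, 5.0)
power_overspecified = ncx2_power(4, 5.0)
size_check = ncx2_power(2, 0.0)
```

With the noncentrality held fixed, `power_correct` exceeds `power_overspecified`, which is the mechanism behind the ranking under $H_{IM}$. In the Wooldridge framework the noncentrality parameter itself can change when $Q_2$ is added, so no such ranking is implied.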

5. Concluding remarks

For the case of testing for conditional heteroskedasticity in the linear regression model, this paper has shown that (when based on a correctly specified and overspecified alternative hypothesis, respectively) the two Wooldridge statistics cannot be ranked in terms of asymptotic local power. This is in contrast to the results of Saikkonen (1989) which showed that, in a standard likelihood-based framework, corresponding statistics can be ranked since an LM statistic based on a correctly specified alternative has greater asymptotic local power than one based on an overspecified alternative. By examining the behaviour of a statistic that contained the LM and Koenker (1981) statistics as special cases, it was also shown that the ranking obtained in the standard likelihood-based framework does not depend on a joint density being specified in its entirety; the same


ranking can be obtained under the assumption that a type of information matrix equality holds. Given the preponderant status accorded to 'unified' theories, it is appropriate to conclude with two observations concerning the generality of the results obtained. First, consider a more general model that contains the framework examined above as a special case and one that is neither likelihood-based nor requires an IM-type equality to hold. Such a more general model would simply be obtained by specifying less restrictive assumptions on the conditional means and conditional variances of the disturbances. Then, as the Wooldridge statistics cannot be ranked in terms of asymptotic local power in the special case considered above, it can easily be asserted that corresponding statistics based on the general procedure proposed by Wooldridge (Theorem 2.1) also cannot be ranked; this does not of course preclude the possibility that, in other special cases, corresponding Wooldridge statistics could be ranked. Second, here attention has been restricted to tests for conditional heteroskedasticity that are completely robust to the form of conditional heterokurticity. However, by going back a moment, it can be seen that analogous results will hold for tests of the conditional mean that are completely robust to the specification of the conditional variance. Therefore, although this paper has only considered the special case of testing for conditional heteroskedasticity in the linear regression model, corresponding results for completely-robust tests also hold in more general models and in other testing situations of the kind considered in the unified framework of Wooldridge.

Appendix

Throughout the Appendix, '$\sum$' will denote summation over $t = 1, 2, \ldots, n$.

Proof of Theorem 1. Following Amemiya (1977), let $v_1$, $v_2$ and $v_3$ be $n \times 1$ vectors with $t$th elements $v_{1t} = u_t^2 - \delta_0 - n^{-1/2}q_t^T\delta$, $v_{2t} = u_t x_t^T(\hat\beta - \beta)$ and $v_{3t} = (\hat\beta - \beta)^T x_t x_t^T(\hat\beta - \beta)$, respectively. Then, as $\hat u_t^2 = (u_t - x_t^T(\hat\beta - \beta))^2$, $\hat v = e\delta_0 + n^{-1/2}Q\delta + v_1 - 2v_2 + v_3$, which can be substituted into $\hat\delta_* = (Z_*^T Z_*)^{-1}Z_*^T\hat v$ to yield

$\sqrt{n}(\hat\delta_* - e_0\delta_0) = (n^{-1}Z_*^T Z_*)^{-1}n^{-1}Z_*^T Q\delta + (n^{-1}Z_*^T Z_*)^{-1}n^{-1/2}Z_*^T v_1 + (n^{-1}Z_*^T Z_*)^{-1}n^{-1/2}Z_*^T(v_3 - 2v_2)$,

where $e = Z_* e_0$ and $e_0 = (1, 0, 0, \ldots, 0)^T$ is an $(r + 1) \times 1$ vector. The first term converges to $\xi_* = \{E[z_{*t}z_{*t}^T]\}^{-1}E[z_{*t}q_t^T]\delta$, a central limit theorem (e.g., White, 1984, Theorem 5.11) with the Cramér-Wold device can be applied to the second term, and the last term is $o_p(1)$, cf. Amemiya (1977, results (A.6) and (A.9)). Hence, $\sqrt{n}(\hat\delta_* - e_0\delta_0) \overset{d}{\to} N(\xi_*, \Psi_*)$ where $\Psi_* = \{E[z_{*t}z_{*t}^T]\}^{-1}\Omega_*\{E[z_{*t}z_{*t}^T]\}^{-1}$


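and $\Omega_* = \lim V[z_{*t}v_{1t}] = \lim E[(u_t^2 - \delta_0)^2 z_{*t}z_{*t}^T]$. Part (a) then follows as $\sqrt{n}\hat\delta$ and $\{E[\tilde z_t\tilde z_t^T]\}^{-1}E[\tilde z_t q_t^T]\delta$ are the last $r$ elements of $\sqrt{n}(\hat\delta_* - e_0\delta_0)$ and $\xi_*$, respectively, and as $\{E[\tilde z_t\tilde z_t^T]\}^{-1}\Omega\{E[\tilde z_t\tilde z_t^T]\}^{-1}$ is the bottom right partition of $\Psi_*$. Since $\hat\beta$ is a strongly consistent estimator of $\beta$, it follows from White (1980a, Lemmas 2.3 and 2.6) that $n^{-1}Z_*^T\hat F Z_* = n^{-1}\sum(\hat u_t^2 - \hat\sigma^2)^2 z_{*t}z_{*t}^T = \Omega_* + o_p(1)$, which gives part (b) as $\{E[\tilde z_t\tilde z_t^T]\}^{-1}\Omega\{E[\tilde z_t\tilde z_t^T]\}^{-1}$ and $\hat D_n$ are the bottom right partitions of $\Psi_*$ and $(n^{-1}Z_*^T Z_*)^{-1}n^{-1}Z_*^T\hat F Z_*(n^{-1}Z_*^T Z_*)^{-1}$, respectively. Finally, part (c) follows from parts (a) and (b) by noting from (2.4) that $W_Z = (\sqrt{n}\hat\delta)^T\hat D_n^{-1}(\sqrt{n}\hat\delta)$.

Proof of (3.3). Let $B = [I_s, 0]^T$ be an $s_+ \times s$ matrix so that $q_t^T = q_{+t}^T B$ where $q_{+t}^T = (q_t^T, q_{2t}^T)$ is the $t$th row of $Q_+ = [Q, Q_2]$, and also let $\tilde q_{+t} = q_{+t} - E[q_{+t}]$, $D = E[\tilde q_{+t}q_t^T]$, and $\Phi = \Omega_+$ where $\Omega_+$ is given by (3.1) with $\tilde z_t = \tilde q_{+t}$. Then, by noting that $\tilde q_t = B^T\tilde q_{+t}$, and by setting $\tilde z_t = \tilde q_t, \tilde q_{+t}$, in turn, in each of (3.1) and (3.2), it can be shown that

$\lambda(W_+) - \lambda(W_Q) = \delta^T[D^T\Phi^{-1}D - D^T B(B^T\Phi B)^{-1}B^T D]\delta$.    (A.1)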
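The bracketed matrix in (A.1) has the same form as the difference that arises when moment conditions are dropped in generalized method of moments estimation, and it is positive semidefinite whenever $\Phi$ is positive definite. The following minimal sketch evaluates it in the smallest nontrivial case, $s = 1$ and $s_+ = 2$ with $B = (1, 0)^T$ and $\delta = 1$. The function name and the numerical inputs are assumptions chosen for illustration; they are not taken from the paper.

```python
from fractions import Fraction


def noncentrality_gain(d, phi):
    """Scalar version of the bracket in (A.1) for s = 1, s_+ = 2, B = (1, 0)^T.

    d   : pair (d1, d2), the 2 x 1 matrix D
    phi : 2 x 2 positive definite matrix Phi, given as nested pairs
    Returns D' Phi^{-1} D - D'B (B' Phi B)^{-1} B'D.
    """
    d1, d2 = d
    (p11, p12), (p21, p22) = phi
    det = p11 * p22 - p12 * p21
    if p11 <= 0 or det <= 0:
        raise ValueError("phi must be positive definite")
    full = (p22 * d1 * d1 - (p12 + p21) * d1 * d2 + p11 * d2 * d2) / det
    restricted = d1 * d1 / p11
    return full - restricted


# Illustrative second-moment matrix C = E[q~_+ q~_+'] and D = its first column.
c = ((Fraction(2), Fraction(1)), (Fraction(1), Fraction(3)))
d = (c[0][0], c[1][0])

# IM-type equality: Phi proportional to C (factor 4 chosen arbitrarily).
phi_im = tuple(tuple(4 * x for x in row) for row in c)
gain_im = noncentrality_gain(d, phi_im)

# A Phi that is not proportional to C (heterokurticity related to Q_2).
phi_other = ((Fraction(8), Fraction(1)), (Fraction(1), Fraction(12)))
gain_other = noncentrality_gain(d, phi_other)
```

In this example the gain is exactly zero when $\Phi$ is proportional to the second-moment matrix, and strictly positive otherwise. This matches the discussion in the text: under an IM-type equality the additional variables do not raise the noncentrality parameter, whereas without it they may.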

The first inequality in (3.3) and the analogy with generalized method of moments estimators then follow from Davidson and MacKinnon (1993, p. 604). Similarly, for the second inequality in (3.3) the difference $\lambda(W_Q) - \lambda(W_-)$ is also given by the right hand side of (A.1) but with $D = E[\tilde q_t q_t^T]$, $\Phi = \Omega_Q$ given by (3.1) with $\tilde z_t = \tilde q_t$, and $B$ an $s \times s_-$ matrix such that $q_{-t}^T = q_t^T B$ where $q_t^T = (q_{-t}^T, q_{1t}^T)$ is the $t$th row of $Q = [Q_-, Q_1]$.

Proof of Theorem 2. The equivalences will be proved by showing that (a) $\Leftrightarrow$ (b) and (a) $\Leftrightarrow$ (c). First note that, given Assumption 2(a), (3.1) can also be written as $\Omega = \lim(n^{-1}\sum E[(u_t^2 - \delta_0)^2\tilde z_t\tilde z_t^T])$. Using this and (4.3) with the law of iterated expectations and by noting that $\phi_{0n} = \phi_0 + o_p(1)$, $\phi_{gn} = \phi_g + o_p(1)$, $G^T A = 0$, and $g_t = \mathrm{vech}(\tilde z_t\tilde z_t^T)$, it can be shown that

$\mathrm{vech}(\Omega) = (\phi_0 - \delta_0^2)E[g_t] + E[g_t g_t^T]\phi_g$.    (A.2)

Then, as $\mathrm{vech}(E[\tilde z_t\tilde z_t^T]) = E[g_t]$, $\mathrm{vech}(\Omega - (\phi_0 - \delta_0^2)E[\tilde z_t\tilde z_t^T]) = E[g_t g_t^T]\phi_g$ so (a) $\Leftrightarrow$ (b) as $E[g_t g_t^T]$ is non-singular by Assumption 2(b) and as $(1 - a)\phi_0 = (1 + b)\delta_0^2$ holds in both cases. To see that (a) $\Leftrightarrow$ (c), first note that from White (1980a, Lemmas 2.3 and 2.6),

$n^{-1}\hat v^T\hat v = n^{-1}\sum\hat u_t^4 = n^{-1}\sum E[u_t^4] + o_p(1)$ and $\hat\sigma^2 = n^{-1}\sum\hat u_t^2 = n^{-1}\sum E[u_t^2] + o_p(1)$


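which reduce to

$n^{-1}\hat v^T\hat v = \phi_0 + E[g_t^T]\phi_g + o_p(1)$ and $\hat\sigma^2 = \delta_0 + o_p(1)$,

respectively, by using (2.2) and (4.3) with the law of iterated expectations and as $\phi_{0n} = \phi_0 + o_p(1)$, $\phi_{gn} = \phi_g + o_p(1)$, and $e^T A = 0$. Then, with $\gamma = \mathrm{plim}\,\hat\gamma$ where $\hat\gamma = a n^{-1}\hat v^T\hat v + b\hat\sigma^4$,

$\gamma = a\phi_0 + b\delta_0^2 + aE[g_t^T]\phi_g$.    (A.3)

Now let $D_n = \hat\gamma(n^{-1}Z^T M Z)^{-1}$, $D_\infty = \mathrm{plim}\,D_n = \gamma\{E[\tilde z_t\tilde z_t^T]\}^{-1}$, $\xi = \{E[\tilde z_t\tilde z_t^T]\}^{-1}E[\tilde z_t q_t^T]\delta$, and $\Psi = \{E[\tilde z_t\tilde z_t^T]\}^{-1}\Omega\{E[\tilde z_t\tilde z_t^T]\}^{-1}$. Then, since $\sqrt{n}\hat\delta \overset{d}{\to} N(\xi, \Psi)$ and as

$K_Z^{ab} = (\sqrt{n}\hat\delta)^T D_n^{-1}(\sqrt{n}\hat\delta) = (\sqrt{n}\hat\delta)^T D_\infty^{-1}(\sqrt{n}\hat\delta) + o_p(1)$,

$K_Z^{ab} \overset{a}{\sim} \chi^2(r, \lambda(K_Z^{ab}))$ iff $D_\infty^{-1}\Psi = \gamma^{-1}\Omega\{E[\tilde z_t\tilde z_t^T]\}^{-1}$ is idempotent and

$\lambda(K_Z^{ab}) = \xi^T D_\infty^{-1}\xi = \gamma^{-1}\delta^T E[q_t\tilde z_t^T]\{E[\tilde z_t\tilde z_t^T]\}^{-1}E[\tilde z_t q_t^T]\delta$.    (A.4)

Now, $\gamma^{-1}\Omega\{E[\tilde z_t\tilde z_t^T]\}^{-1}$ is idempotent iff $\mathrm{vech}(\Omega - \gamma E[\tilde z_t\tilde z_t^T]) = 0$ which can be written as

$\{(1 - a)\phi_0 - (1 + b)\delta_0^2\}E[g_t] + \{E[g_t g_t^T] - aE[g_t]E[g_t^T]\}\phi_g = 0$    (A.5)

by using (A.2), (A.3) and $\mathrm{vech}(E[\tilde z_t\tilde z_t^T]) = E[g_t]$. Hence, $K_Z^{ab} \overset{a}{\sim} \chi^2(r, \lambda(K_Z^{ab}))$ iff (A.5) holds; note that $(1 - a)\phi_0 = (1 + b)\delta_0^2$ holds in statement (c) so (A.5) then gives $\phi_g = 0$ which in turn, with (A.3), shows that (A.4) reduces to (4.4). Finally, (a) $\Leftrightarrow$ (c) follows as (A.5) and $(1 - a)\phi_0 = (1 + b)\delta_0^2$ are equivalent to the conditions in statement (a).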
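The step from (A.2) and (A.3) to (A.5) is a matter of collecting terms. It can be spelled out as follows, using only the symbols of the proof and the relation $\mathrm{vech}(E[\tilde z_t\tilde z_t^T]) = E[g_t]$; the last line records the value that $\gamma$ takes once the conditions in statement (a) are imposed.

```latex
\begin{align*}
\mathrm{vech}\bigl(\Omega - \gamma E[\tilde z_t \tilde z_t^T]\bigr)
  &= \mathrm{vech}(\Omega) - \gamma E[g_t] \\
  &= (\phi_0 - \delta_0^2) E[g_t] + E[g_t g_t^T]\phi_g
     - \bigl(a\phi_0 + b\delta_0^2 + a E[g_t^T]\phi_g\bigr) E[g_t] \\
  &= \bigl\{(1 - a)\phi_0 - (1 + b)\delta_0^2\bigr\} E[g_t]
     + \bigl\{E[g_t g_t^T] - a E[g_t] E[g_t^T]\bigr\}\phi_g , \\[4pt]
\text{and, if } \phi_g = 0 \text{ and } (1 - a)\phi_0 = (1 + b)\delta_0^2: \quad
\gamma &= a\phi_0 + b\delta_0^2 = \phi_0 - \delta_0^2 .
\end{align*}
```

The final equality shows why the scaling factor in (4.4) coincides with the factor $\phi_0 - \delta_0^2$ that appears in the IM-type equality of statement (b).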

References

Amemiya, T., 1977. A note on a heteroscedastic model. Journal of Econometrics 6, 365-370.
Amemiya, T., 1985. Advanced Econometrics. Harvard University Press, Cambridge, MA.
Bera, A.K., Yoon, M.J., 1993. Specification testing with locally misspecified alternatives. Econometric Theory 9, 649-658.
Breusch, T.S., Pagan, A.R., 1979. A simple test for heteroscedasticity and random coefficient variation. Econometrica 47, 1287-1294.
Das Gupta, S., Perlman, M.D., 1974. Power of the noncentral F-test: Effect of additional variates on Hotelling's T²-test. Journal of the American Statistical Association 69, 174-180.
Dastoor, N.K., 1990. The asymptotic behaviour of some tests for heteroskedasticity. Unpublished paper, Department of Economics, University of Alberta, Edmonton.
Dastoor, N.K., 1995. Testing for conditional heteroskedasticity with misspecified alternative hypotheses. Unpublished paper, Department of Economics, University of Alberta, Edmonton.
Davidson, R., MacKinnon, J.G., 1985a. The interpretation of test statistics. Canadian Journal of Economics 18, 38-57.
Davidson, R., MacKinnon, J.G., 1985b. Heteroskedasticity-robust tests in regression directions. Annales de l'INSEE 59/60, 183-218.


Davidson, R., MacKinnon, J.G., 1987. Implicit alternatives and the local power of test statistics. Econometrica 55, 1305-1329.
Davidson, R., MacKinnon, J.G., 1993. Estimation and Inference in Econometrics. Oxford University Press, New York.
Gallant, A.R., White, H., 1988. A Unified Theory of Estimation and Inference for Nonlinear Dynamic Models. Basil Blackwell, Oxford.
Godfrey, L.G., 1978. Testing for multiplicative heteroskedasticity. Journal of Econometrics 8, 227-236.
Godfrey, L.G., 1996. Some results on the Glejser and Koenker tests for heteroskedasticity. Journal of Econometrics 72, 275-299.
Godfrey, L.G., Wickens, M.R., 1982. Tests of misspecification using locally equivalent alternative models. In: Chow, G.C., Corsi, P. (Eds.), Evaluating the Reliability of Macro-economic Models. Wiley, New York, pp. 71-99.
Hsieh, D.A., 1983. A heteroscedasticity-consistent covariance matrix estimator for time series regressions. Journal of Econometrics 22, 281-290.
Judge, G.G., Griffiths, W.E., Hill, R.C., Lütkepohl, H., Lee, T.-C., 1985. The Theory and Practice of Econometrics, 2nd ed. Wiley, New York.
Koenker, R., 1981. A note on studentizing a test for heteroscedasticity. Journal of Econometrics 17, 107-112.
Newey, W.K., 1985. Maximum likelihood specification testing and conditional moment tests. Econometrica 53, 1047-1070.
Pagan, A.R., Hall, A.D., 1983. Diagnostic tests as residual analysis. Econometric Reviews 2, 159-218.
Pagan, A.R., Pak, Y., 1993. Testing for heteroskedasticity. In: Maddala, G.S., Rao, C.R., Vinod, H.D. (Eds.), Handbook of Statistics, vol. 11, Econometrics. North-Holland, Amsterdam, pp. 489-518.
Saikkonen, P., 1989. Asymptotic relative efficiency of the classical test statistics under misspecification. Journal of Econometrics 42, 351-369.
Tauchen, G., 1985. Diagnostic testing and evaluation of maximum likelihood models. Journal of Econometrics 30, 415-443.
Waldman, D.M., 1983. A note on algebraic equivalence of White's test and a variation of the Godfrey/Breusch-Pagan test for heteroscedasticity. Economics Letters 13, 197-200.
Whang, Y.-J., Andrews, D.W.K., 1993. Tests of specification for parametric and semiparametric models. Journal of Econometrics 57, 277-318.
White, H., 1980a. Nonlinear regression on cross-section data. Econometrica 48, 721-746.
White, H., 1980b. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48, 817-838.
White, H., 1984. Asymptotic Theory for Econometricians. Academic Press, Orlando.
White, H., 1987. Specification testing in dynamic models. In: Bewley, T.F. (Ed.), Advances in Econometrics - 5th World Congress, vol. 1. Cambridge University Press, Cambridge, pp. 1-58.
White, H., Domowitz, I., 1984. Nonlinear regression with dependent observations. Econometrica 52, 143-161.
Wooldridge, J.M., 1990. A unified approach to robust, regression-based specification tests. Econometric Theory 6, 17-43.
Wooldridge, J.M., 1991a. On the application of robust, regression-based diagnostics to models of conditional means and conditional variances. Journal of Econometrics 47, 5-46.
Wooldridge, J.M., 1991b. Specification testing and quasi-maximum-likelihood estimation. Journal of Econometrics 48, 29-55.