Economics Letters 18 (1985) 157-159 North-Holland
A RULE OF THUMB FOR MIXED HETEROSKEDASTICITY
Peter KENNEDY
Simon Fraser University, Burnaby, B.C., Canada V5A 1S6
Received 10 October 1984
Rather than testing for the existence of mixed heteroskedasticity, researchers should test for the existence of mixed heteroskedasticity of sufficient strength to render of benefit the incorporation of the intercept term in an appropriate estimated generalized least squares procedure. A rule of thumb for practitioners is developed from a Monte Carlo investigation of this principle.
Mixed heteroskedasticity arises whenever the variance of an error term in a regression is given by $\sigma_i^2 = \gamma + \delta Z_i$, where $Z$ is some exogenous variable and $\gamma$ and $\delta$ are both non-zero parameters. If $\gamma$ were zero, the appropriate generalized least squares (GLS) procedure would be to apply ordinary least squares (OLS) to the data transformed by dividing through by $\sqrt{Z_i}$. But if $\gamma$ is non-zero, no GLS procedure exists (unless $\gamma$ and $\delta$ are known): an 'estimated' GLS (EGLS) procedure must be employed, using a 'pretest' to determine whether or not $\gamma$ is significantly different from zero. Both Glejser (1969) and Goldfeld and Quandt (1972) felt that mixed heteroskedasticity is common in empirical work, but both concluded, on the basis of their studies, that it is difficult to detect and that this creates severe problems for the use of pretest estimators in this context. No subsequent studies have explicitly addressed this dilemma.

The purpose of this note is to suggest that the conclusions of Glejser (1969) and Goldfeld and Quandt (1972) were based on inadequate pretest estimators, and to offer a rule of thumb for practitioners to aid in choosing an appropriate estimating procedure.

The thrust of the argument is that the studies of Glejser (1969) and Goldfeld and Quandt (1972) did not examine a case of 'substantial' mixed heteroskedasticity and thus were not able to uncover the impact of mixed heteroskedasticity or suggest an appropriate remedy. In particular, their results are based on inappropriate pretest estimators and an incomplete examination of their potential. The pretest estimators they examined pretested for the existence of mixed heteroskedasticity rather than for the existence of mixed heteroskedasticity of sufficient strength to render of benefit the incorporation of the intercept term in the EGLS procedure.
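The effect of a non-zero intercept on the divide-through transformation can be written out explicitly. The following is only a sketch, assuming a simple regression $Y_i = \alpha + \beta X_i + \varepsilon_i$ with $\operatorname{Var}(\varepsilon_i) = \gamma + \delta Z_i$, where $\delta$ denotes the coefficient on $Z_i$ and $Z_i > 0$ (positivity is an assumption needed for the square root):

```latex
\begin{align*}
\frac{Y_i}{\sqrt{Z_i}} &= \alpha \frac{1}{\sqrt{Z_i}} + \beta \frac{X_i}{\sqrt{Z_i}} + \frac{\varepsilon_i}{\sqrt{Z_i}}, \\
\operatorname{Var}\!\left(\frac{\varepsilon_i}{\sqrt{Z_i}}\right) &= \frac{\gamma + \delta Z_i}{Z_i} = \delta + \frac{\gamma}{Z_i}.
\end{align*}
```

When $\gamma = 0$ the transformed error has constant variance $\delta$, so OLS on the transformed data is GLS. When $\gamma \neq 0$ the term $\gamma / Z_i$ remains, and how far the transformed errors are from homoskedastic depends on the size of $\gamma$ relative to $\delta Z_i$, not on $\gamma$ relative to $\delta$ alone.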
This may have happened because the authors assumed that the ratio of $\gamma$ to $\delta$ is a measure of the extent of the 'mixed' character of the heteroskedasticity; a more appropriate measure is the ratio of $\gamma$ to the average variance. That the magnitude of $Z$ plays a prominent role here may have been overlooked by these authors.

0165-1765/85/\$3.30 © 1985, Elsevier Science Publishers B.V. (North-Holland)

A Monte Carlo study was undertaken to examine this issue. Data were generated for sample size $N = 25$ according to the formula $Y_i = \alpha + \beta X_i + \varepsilon_i$, where $\alpha$ and $\beta$ are fixed constants, $X_i$ is taken from a uniform or log-normal distribution, and $\varepsilon_i$ is taken from a normal distribution with mean zero and variance $\sigma_i^2 = \gamma + \delta Z_i$, where $Z_i$ is taken from a uniform or log-normal distribution and may or may not be identical to $X_i$. Sample sizes of 50 and 100 were created by replication of the $X$'s and $Z$'s. A variety of cases were considered, a selection of which is reported in table 1, arranged in order of $R$, the ratio of $\gamma$ to the average variance. $V$ measures the severity of heteroskedasticity, calculated as the variance of the variances, as recommended by Kennedy (1984). The numbers reported in table 1 are the ratios of the variance of the OLS estimate of $\beta$ (estimated from 1000 repeated samples) to the (estimated) variance of the EGLS estimate of $\beta$ for four different EGLS estimators.

Several variants of EGLS could be used as competitors of OLS in this context. Goldfeld and Quandt (1972) conclude that a maximum likelihood procedure is best, an expected result given that the maximum likelihood technique in their examples exploited knowledge of both the error distribution and the functional form of the heteroskedasticity. In much applied work, however, researchers are not prepared to undertake the computational burden of maximum likelihood procedures and choose instead some simpler approach. Textbooks offer little guidance on this issue, most suggesting that the data should be transformed by dividing by $Z$ or by $\sqrt{Z}$. There are three prominent alternatives which have low computational cost and are therefore likely to be employed frequently by practitioners. One is the method of Park (1966), in which the $N$ variances are estimated through a regression of the logarithm of the OLS squared residuals on the logarithm of $Z$. Another is the method of Glejser (1969), in which the $N$ variances are estimated through a regression of the absolute value of the OLS residuals on $Z$. And the third is a method due to Goldfeld and Quandt (1972), which they call the modified Glejser method, in which the $N$ variances are estimated through a regression of the OLS squared residuals on $Z$.
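The Monte Carlo design and the low-cost variance estimators described above can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the author's code: the note does not give the values of $\alpha$ and $\beta$ (both set to 1 here), does not say whether $X$ and $Z$ are held fixed across repeated samples (they are here), and does not say how non-positive fitted variances are handled (they are floored here). Squaring the Glejser fitted values to obtain variances is also an interpretation, and `delta` names the coefficient on $Z$. All function names are hypothetical.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for design matrix X and response y."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def wls_slope(x, y, var):
    """EGLS slope: OLS on data divided through by the estimated std. deviation."""
    w = 1.0 / np.sqrt(var)
    X = np.column_stack([w, w * x])  # transformed intercept and regressor
    return ols(X, w * y)[1]

def estimated_variances(resid, z, method):
    """Estimated error variances from an auxiliary regression on Z."""
    if method == "divide_by_sqrt_z":      # variance taken as proportional to Z
        return z.copy()
    Zc = np.column_stack([np.ones_like(z), z])
    if method == "park":                  # log(e^2) on log(Z)
        Lc = np.column_stack([np.ones_like(z), np.log(z)])
        fit = np.exp(Lc @ ols(Lc, np.log(resid**2)))
    elif method == "glejser":             # |e| on Z; fitted value squared (assumption)
        fit = (Zc @ ols(Zc, np.abs(resid)))**2
    elif method == "modified_glejser":    # e^2 on Z
        fit = Zc @ ols(Zc, resid**2)
    else:
        raise ValueError(f"unknown method: {method}")
    # Assumption: floor fitted variances at 1% of the mean squared residual.
    return np.maximum(fit, 0.01 * np.mean(resid**2))

def variance_ratios(gamma, delta, n=25, reps=1000, seed=0):
    """Ratio of the OLS slope variance to each EGLS slope variance."""
    rng = np.random.default_rng(seed)
    z = rng.uniform(1, 31, n)             # Z ~ U(1, 31), fixed across samples
    x = rng.uniform(10, 40, n)            # X ~ U(10, 40), fixed across samples
    sd = np.sqrt(gamma + delta * z)       # error standard deviations
    X = np.column_stack([np.ones(n), x])
    methods = ["park", "glejser", "modified_glejser", "divide_by_sqrt_z"]
    b_ols = np.empty(reps)
    b_egls = {m: np.empty(reps) for m in methods}
    for r in range(reps):
        y = 1.0 + 1.0 * x + rng.normal(0.0, sd)   # alpha = beta = 1 (assumed)
        coef = ols(X, y)
        b_ols[r] = coef[1]
        resid = y - X @ coef
        for m in methods:
            b_egls[m][r] = wls_slope(x, y, estimated_variances(resid, z, m))
    return {m: b_ols.var() / b_egls[m].var() for m in methods}
```

Calling `variance_ratios(gamma=5.0, delta=3.0)` runs one cell of such an experiment; a ratio above 1 means the EGLS variant has the smaller estimated variance. Because the assumed constants and random draws differ from the original study, the output should not be expected to match the published figures.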
If the variances are determined as $\gamma + \delta Z$, a casual survey of practitioners revealed that given this knowledge they would employ the modified Glejser procedure since, in contrast to the other methods, it uses the correct functional form. This does not, of course, guarantee that it will be the best method in small samples. The maximum likelihood
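Table 1
Ratio of OLS variance to EGLS variance.

| Case | R | V | N | Park | Glejser | Modified Glejser | Divide by $\sqrt{Z}$ |
|---|---|---|---|---|---|---|---|
| $Z \sim U(1, 31)$, $X \sim U(10, 40)$, $\sigma^2 = 5 + 0.2Z$ | 0.59 | 0.05 | 25 | 0.95 | 0.99 | 0.91 | 0.74 |
| | | | 50 | 0.95 | 0.98 | 0.95 | 0.73 |
| | | | 100 | 1.00 | 1.03 | 1.02 | 0.72 |
| $\ln Z \sim N(3, 1)$, $X = Z$, $\sigma^2 = 11 + Z$ | 0.22 | 0.66 | 25 | 1.12 | 1.11 | 1.07 | 1.07 |
| | | | 50 | 1.15 | 1.17 | 1.12 | 1.07 |
| | | | 100 | 1.14 | 1.15 | 1.09 | 1.00 |
| $Z \sim U(1, 31)$, $X = Z$, $\sigma^2 = 10 + 3Z$ | 0.16 | 0.19 | 25 | 1.05 | 1.14 | 1.11 | 1.21 |
| | | | 50 | 1.17 | 1.19 | 1.13 | 1.21 |
| | | | 100 | 1.15 | 1.15 | 1.10 | 1.12 |
| $\ln Z \sim N(3, 1)$, $X \sim U(10, 40)$, $\sigma^2 = 19 + 3Z$ | 0.14 | 0.88 | 25 | 1.57 | 1.62 | 1.29 | 1.65 |
| | | | 50 | 1.67 | 1.72 | 1.40 | 1.72 |
| | | | 100 | 1.75 | 1.73 | 1.47 | 1.69 |
| $Z \sim U(1, 31)$, $X \sim U(10, 40)$, $\sigma^2 = 5 + 3Z$ | 0.09 | 0.23 | 25 | 1.20 | 1.25 | 1.17 | 1.38 |
| | | | 50 | 1.29 | 1.33 | 1.11 | 1.43 |
| | | | 100 | 1.37 | 1.39 | 1.14 | 1.42 |
| $Z \sim U(0, 1066)$, $X = Z$, $\sigma^2 = 20 + 0.5Z$ | 0.07 | 0.27 | 25 | 1.19 | 1.29 | 1.26 | 1.52 |
| | | | 50 | 1.37 | 1.35 | 1.36 | 1.49 |
| | | | 100 | 1.37 | 1.31 | 1.31 | 1.40 |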
procedure and more sophisticated variants of the modified Glejser procedure [see Amemiya (1977) or the survey of Judge et al. (1980)] were ignored in this study on the grounds that most practitioners would choose one of the less computationally burdensome methods noted above.

Two main conclusions can be drawn from the results reported in table 1.
(1) Whenever $\gamma$ is less than about 15% of the average variance, dividing by $\sqrt{Z}$ is the best of the computationally easy EGLS procedures in this context.
(2) Whenever $\gamma$ is greater than about 15% of the average variance, the Glejser and Park methods appear to be the best methods, both consistently (and surprisingly) outperforming the modified Glejser method.

These two results suggest the following rule of thumb for practitioners to employ when faced with the possibility of mixed heteroskedasticity: If an estimate of $\gamma$ falls short of 15% of the estimated average variance, use the divide-by-$\sqrt{Z}$ variant of EGLS; otherwise, use the Park or Glejser variant.
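The rule of thumb could be applied along the following lines. This is a hedged sketch: the note does not specify how $\gamma$ or the average variance should be estimated, so the code assumes that $\gamma$ is estimated by the intercept of a regression of the squared OLS residuals on $Z$ and that the average variance is estimated by the mean squared OLS residual. The function name `rule_of_thumb` and its `threshold` parameter are hypothetical.

```python
import numpy as np

def rule_of_thumb(x, y, z, threshold=0.15):
    """Suggest an EGLS variant from the estimated ratio of gamma to the average variance."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    e2 = (y - X @ coef) ** 2                      # squared OLS residuals
    Zc = np.column_stack([np.ones(n), z])
    aux, *_ = np.linalg.lstsq(Zc, e2, rcond=None)
    gamma_hat = aux[0]                            # assumed estimate of gamma
    avg_var = e2.mean()                           # assumed estimate of the average variance
    ratio = gamma_hat / avg_var
    choice = "divide by sqrt(Z)" if ratio < threshold else "Park or Glejser"
    return ratio, choice
```

In small samples the intercept estimate can be noisy or even negative; a negative estimate falls below the threshold and so points to the divide-by-$\sqrt{Z}$ variant under this sketch.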
References
Amemiya, T., 1977, A note on a heteroskedastic model, Journal of Econometrics 6, 365-370.
Glejser, H., 1969, A new test for heteroskedasticity, Journal of the American Statistical Association 64, 316-323.
Goldfeld, S.M. and R.E. Quandt, 1972, Non-linear methods in econometrics (North-Holland, Amsterdam).
Judge, G. et al., 1980, The theory and practice of econometrics (Wiley, New York).
Kennedy, P.E., 1984, On measuring heteroskedasticity, Discussion paper (Simon Fraser University, Burnaby).
Park, R.E., 1966, Estimation with heteroskedastic error terms, Econometrica 34, 888.