Journal of Econometrics
2 (1974) 55-65. © North-Holland Publishing Company
ON TESTING HYPOTHESES IN SIMULTANEOUS EQUATION MODELS
Alison MORGAN* and Walter VANDAELE**
Graduate School of Business, University of Chicago, Chicago, Ill. 60637, U.S.A.
Received July 1973, revised version received October 1973
* University of Chicago.
** University of Chicago and Katholieke Universiteit, Leuven. Fellowship support from the Ford Foundation, the Belgian-American Educational Foundation and the Belgian NSF (Nationaal Fonds voor Wetenschappelijk Onderzoek) is gratefully acknowledged. Research has also been supported by the NSF Grant GS-2347. The authors are greatly indebted to A. Zellner for his continuing advice. We would also like to thank P. Dhrymes and C. Sims for helpful comments on an earlier version of this paper. Of course all errors are the sole responsibility of the authors.

1. Introduction
In this paper we discuss tests of hypotheses about coefficients in simultaneous equation systems. We give a brief description of alternative tests. The first type is based on asymptotic normality, including a possible extension to a Lindley-type (1965) Bayesian test. The second type, due to Dhrymes (1969) for 2SLS and Vandaele and Morgan (1972) for 3SLS, leads to asymptotic t or F tests. Then we compare the asymptotic power of certain tests and discuss the meagre evidence on their performance in small samples. This analysis indicates that tests based on asymptotic normality are more powerful. Finally, we apply some of the tests to an actual econometric model and show the very different results given by different tests.

2. Asymptotic tests and tests within the Bayesian framework
Estimation methods usually proposed for simultaneous equation systems are: Two-Stage Least-Squares (2SLS), Limited Information Maximum Likelihood (LIML), Three-Stage Least-Squares (3SLS), and Full Information Maximum Likelihood (FIML). Under certain general conditions it has been shown that if $\delta$, the vector of all coefficients in the simultaneous model, is estimated by one of these methods, $\sqrt{T}(\hat{\delta} - \delta)$ is asymptotically multivariate normally distributed. See, e.g., Dhrymes (1970, ch. 4). Clearly $\sqrt{T}(\hat{\delta}_{ik} - \delta_{ik})/\hat{\sigma}_{ik}$, where $\hat{\delta}_{ik}$ is the $i$th coefficient estimate in the $k$th equation and $\hat{\sigma}_{ik}$ its standard error, is asymptotically distributed $N(0, 1)$, so that single and joint
asymptotic tests of coefficients in a simultaneous model can be based respectively on the normal or $\chi^2$ distribution.
Zellner (1971, pp. 273-275) has shown that, with diffuse prior information, the leading term in an asymptotic expansion of the posterior pdf of the coefficients in a simultaneous equation system is a multivariate normal pdf. Its mean and covariance matrix are the 3SLS estimates of the coefficients and the large-sample covariance matrix estimate, respectively. Therefore, we can interpret the tests above and in sect. 3 as asymptotic approximations to Lindley-type Bayesian tests [Lindley (1965, pp. SSff.), Zellner (1971, p. 298)], since the Lindley tests are computationally equivalent to the sampling theory tests in this case.
The Lindley test is based on the posterior pdf for the parameters, the $\delta_{ik}$'s, obtained by using a diffuse prior pdf. A point to stress is that it is mandatory to use a diffuse prior pdf, i.e. a prior pdf sufficiently smooth in the neighborhood of the null hypothesis. In large samples, however, the prior pdf would not matter much. To apply a Lindley test of size $\alpha$, say 5%, for the null hypothesis $\delta_{ik} = \delta_0$, we construct an interval such that, given the data $D$, the probability content is $1-\alpha$, i.e. $\Pr(a < \delta_{ik} < b \mid D) = 1-\alpha$. For a given probability content, the endpoints $a$ and $b$ are chosen such that the interval $[a, b]$ is the shortest. The necessary condition is $p(a) = p(b)$, where $p$ denotes the posterior pdf. See Zellner (1971, p. 27, footnote 22 with reference). This is often referred to as the Bayesian highest posterior density interval. The rationale of the Lindley test is that the posterior pdf reflects the degree of belief we have about possible values of $\delta_{ik}$, so that if $\delta_0$ is in the confidence interval, we should accept the null hypothesis.
The sampling theorist considers $\hat{\delta}$ random and $\delta$ fixed. For the null hypothesis $\delta = \delta_0$ he constructs an interval such that $\Pr(a < \hat{\delta}_{ik} < b) = 1-\alpha$. The end points of this interval will generally be the same as in the Bayesian confidence interval, but are based on criteria such as uniformly most powerful, unbiasedness and invariance. He then accepts the null hypothesis if $\delta_0$ is contained in the interval, since $100(1-\alpha)\%$ of the intervals so generated will contain $\delta$, the fixed value of the coefficient. Thus the probability $\alpha$ is not a degree of belief that $\delta = \delta_0$.¹
If the simultaneous equation system is such that the posterior pdf is tractable, inferences can be made directly from the posterior pdf. See Zellner (1971, pp. 206ff.). In sect. 5, we discuss some Monte Carlo results due to Zellner (1971, pp. 276ff.) comparing Bayesian and sampling theory confidence intervals.
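As a minimal sketch of the mechanics of a Lindley-type test, suppose the posterior pdf is approximated by a normal pdf, as in the leading term of the expansion cited above. For a normal pdf the shortest interval with a given probability content is symmetric about the mean, so the condition that the pdf takes equal values at the two endpoints holds automatically. The function name `lindley_test` and the numerical values below are illustrative assumptions and are not taken from the paper.

```python
from statistics import NormalDist


def lindley_test(post_mean, post_sd, delta0, alpha=0.05):
    """Lindley-type test under a normal approximation to the posterior pdf.

    Builds the shortest interval with posterior probability content
    1 - alpha and accepts the null value delta0 when it lies inside.
    Returns (lower, upper, accept).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided standard normal quantile
    lower = post_mean - z * post_sd
    upper = post_mean + z * post_sd
    return lower, upper, lower <= delta0 <= upper


# Hypothetical posterior summaries (assumed values, for illustration only).
print(lindley_test(0.5, 0.2, 0.0))  # null value falls outside the interval
print(lindley_test(0.3, 0.2, 0.0))  # null value falls inside the interval
```

With a diffuse prior and a large sample, the same interval would be reported by the sampling theorist; only its interpretation differs.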
¹ This point is made clear in DeGroot (1973). In this paper examples are given in which the tail area is equal to the posterior probability that $H_0$ is true and in which it is equal to the likelihood ratio comparing $H_0$ to a certain class $H_A$ of alternatives. That is, the traditional statistical practice of calculating tail areas is given a Bayesian interpretation.

3. Sampling theory tests for a simultaneous equation model
Dhrymes (1969, 1970) has shown that 2SLS may be regarded as a classical least squares estimation method on a transformed simultaneous equation. This transformation gives rise to test statistics that are asymptotically t or F distributed.² We (1972) derived parallel tests for coefficients estimated by 3SLS.
Some features of the above t and F tests, as contrasted with the normal and $\chi^2$ tests, should be emphasized. First, the transformation makes the degrees of freedom of the t and F tests dependent on the degree of over-identification and not on the number of observations. Therefore, as the number of observations is increased, the t and F tests become exact³ but do not converge to the normal and $\chi^2$ tests. Second, Revankar (1971) shows that information is lost when a dimension-reducing transformation is used as a basis for testing, in spite of the fact that the formulae for the estimators are algebraically the same. Third, under a given null hypothesis, the quadratic form in the numerator of these F tests is that required for a $\chi^2$ test.
4. The power of tests
A fundamental way of choosing between asymptotic tests based on an F (or t) distribution and those based on a $\chi^2$ (or normal) distribution is to compare their power functions. For a test with null hypothesis
    $H_0$: $\delta = \delta_0$,
versus the alternative
    $H_A$: $\delta \neq \delta_0$,
the power is defined by
    $\Pr(\text{rejecting } H_0 \mid H_A\colon \delta = \delta_A)$,
where $\delta_A$ is a particular value of $\delta \neq \delta_0$. For all possible values of $\delta_A$, the power function is defined as
    $\Pr(\text{rejecting } H_0 \mid H_A)$.
Obviously, for given type I error $\alpha$, the higher the power for all $\delta$'s of the alternative hypotheses, the higher the probability of making a correct inference.
Extensive tables for the power functions of the normal, t, F and $\chi^2$ distributions have been published elsewhere. See Neyman et al. (1935), Owen (1965), Pearson and Hartley (1951), Tang (1938) and Tiku (1967, 1972). The following general remarks can be made. As is well known, the t distribution approaches the normal as the degrees of freedom in the t distribution approach infinity.

² Revankar and Mallela (1972) have derived an exact finite sample F test in the context of a structural equation. Similar results were obtained in Anderson and Rubin (1949). As pointed out by Revankar and Mallela (1972), the basic weakness of their test is that the null hypothesis has to specify all the coefficients of the endogenous variables. Without minimizing the contribution of the paper, once all the coefficients of these endogenous variables are known, the test can be interpreted as a test on coefficients of a multivariate regression equation.
³ This is because a consistent estimator is used.
Therefore, since the standardized normal distribution has thinner tails than the standardized t distribution, it has a higher power than the t for all $\lambda$, where $\lambda$ is the difference between the value of the parameter under $H_0$ and a specific $H_A$. This difference in power is not constant for all values of $\lambda$ but first increases and then decreases for larger values of $\lambda$. Similarly, the $\chi^2$ is more powerful than the F test since, as $\nu_2 \to \infty$,
    $F_{\nu_1,\nu_2} \to \chi^2_{\nu_1}/\nu_1$,
where $\nu_1$ and $\nu_2$ are the degrees of freedom in the numerator and denominator, respectively. However, for the F tests mentioned, $\nu_2$ reflects not the number of observations but the degree of over-identification, so that increasing the sample size will not make these F tests converge to the asymptotic $\chi^2$ tests. In the F test proposed by Dhrymes (1969) and generalized by the authors (1972), a consistent estimate of the covariance matrix $\Sigma$ is involved. Therefore increasing the sample size will only provide an improved estimate of $\Sigma$. In fact the asymptotic pdf for Dhrymes' 2SLS test statistic and our 3SLS test statistic is the F distribution. Similarly the 2SLS and 3SLS asymptotic t tests do not converge to the asymptotic normal tests.
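The role of the fixed denominator degrees of freedom can be illustrated numerically. The sketch below covers only the special case $\nu_1 = 2$, for which both tail areas have closed forms: $\Pr(F_{2,\nu_2} > f) = (1 + 2f/\nu_2)^{-\nu_2/2}$ and $\Pr(\chi^2_2 > x) = e^{-x/2}$. The function names are illustrative assumptions; the value $\nu_2 = 22$ is the one appearing in the application of sect. 6.

```python
import math


def f_crit_2(alpha, nu2):
    """Upper-alpha critical value of F(2, nu2).

    Inverts P(F > f) = (1 + 2 f / nu2) ** (-nu2 / 2), which holds for nu1 = 2.
    """
    return (nu2 / 2.0) * (alpha ** (-2.0 / nu2) - 1.0)


def chi2_crit_2(alpha):
    """Upper-alpha critical value of chi-square with 2 degrees of freedom."""
    return -2.0 * math.log(alpha)


alpha = 0.05
# Compare nu1 * F critical values with the chi-square critical value.
for nu2 in (22, 100, 10_000):
    print(nu2, round(2 * f_crit_2(alpha, nu2), 3), round(chi2_crit_2(alpha), 3))
```

When $\nu_2$ stays at a fixed value such as 22 however large the sample becomes, the scaled F critical value stays above the $\chi^2_2$ critical value; the two agree only in the limit $\nu_2 \to \infty$.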
5. Small sample properties
The above analysis of the power favours the asymptotic normal and $\chi^2$ tests over the t or F tests as derived by Dhrymes for 2SLS and the authors for 3SLS. However, since these tests are all based on asymptotic distributions, they may perform differently in small samples.
The first studies of the small sample properties of simultaneous equation estimators were Monte Carlo experiments, but many of these did not examine the properties of tests. Exceptions were Cragg (1964, 1967a,b and 1968) and Zellner (1971, ch. 9). Cragg found that the normal test, his two times the standard error ($2\sigma$-test), did not perform well. The problem appeared to be the underestimation of the true standard error. As Cragg increased the number of observations from 35 to 50 to 70, the performance of the $2\sigma$-test did improve, but a greater fraction of the coefficient estimates than theoretically expected still fell outside the $2\sigma$ confidence limits. Cragg also examined some 't-ratios' of the form used in classical least-squares, but, as is known, these are not appropriate for drawing conclusions about hypotheses regarding the values of simultaneous equation parameters.
For a Monte Carlo study comparing Bayesian and sampling theory confidence intervals of a small simultaneous equation model we can refer to Zellner (1971, pp. 276ff.). He concluded that, in small samples, intervals computed from posterior pdf's performed better than approximate sampling-theory confidence intervals. In large samples Bayesian and sampling-theory procedures performed about equally well. We must be careful about placing too much weight on this evidence, since in every Monte Carlo study the analysis is heavily influenced by the model and parameters selected.
More recently, fruitful attempts have been made to derive the small sample distributions of simultaneous equation estimators under various restrictive assumptions. See Basmann (1960, 1961, 1963a,b), Basmann et al. (1971), Richardson (1968), Kabe (1963, 1964), Sawa (1969), Takeuchi (1970). Although the expressions derived even under these restrictive assumptions are not very tractable, exact small sample test distributions can be examined. [See Richardson and Rohr (1971).]
In the Bayesian approach, it is considered meaningful to introduce probabilities associated with hypotheses. This allows us to compare and choose between alternative models. [See Zellner (1971, pp. 206ff.) with references to work of Thornber and Geisel.]

6. An application of the tests
6.1. Framework
A simultaneous equation system may be written as
    $Y\Gamma = XB + U$,
where $Y$ is the $T \times M$ matrix of the $T$ observations on $M$ endogenous variables, $\Gamma$ is the $M \times M$ nonsingular matrix of parameters of the dependent variables, $X$ is the $T \times K$ matrix of $T$ observations on $K$ predetermined variables and $B$ the $K \times M$ matrix of their parameters, and $U$ is the $T \times M$ matrix of disturbances with $\Sigma$, an $M \times M$ positive definite matrix, as covariance matrix of the rows of $U$. If $\Gamma$ is an upper triangular matrix and $\Sigma$ is diagonal, the system is fully recursive and each equation is unrelated to the disturbances in the preceding equations. Then the Classical Least Squares estimator is the Maximum Likelihood estimator.
Andersen and Carlson (1970) have published a model, the St. Louis Model, which is set up as a fully recursive system and which they have estimated by Classical Least Squares. When the St. Louis Model is broadened so that $\Gamma$ is nontriangular, we can estimate the extended model by 3SLS. Then we test, using the tests discussed above, to see if the model reduces to triangular form.
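The recursiveness condition of sect. 6.1 can be checked mechanically once the coefficient and covariance matrices are written down. The following sketch is only an illustration: the helper names and the small 3 x 3 matrices are hypothetical and do not correspond to the St. Louis Model.

```python
def is_upper_triangular(m, tol=1e-12):
    """True if every entry below the main diagonal is (numerically) zero."""
    return all(abs(m[i][j]) <= tol for i in range(len(m)) for j in range(i))


def is_diagonal(m, tol=1e-12):
    """True if every off-diagonal entry is (numerically) zero."""
    n = len(m)
    return all(abs(m[i][j]) <= tol for i in range(n) for j in range(n) if i != j)


def is_fully_recursive(gamma, sigma):
    """Condition of sect. 6.1: Gamma upper triangular and Sigma diagonal."""
    return is_upper_triangular(gamma) and is_diagonal(sigma)


# Hypothetical matrices for a three-equation system (assumed values).
gamma_recursive = [[1.0, -0.4, 0.2],
                   [0.0, 1.0, -0.7],
                   [0.0, 0.0, 1.0]]
gamma_extended = [[1.0, -0.4, 0.2],
                  [0.3, 1.0, -0.7],   # nonzero entry below the diagonal
                  [0.0, 0.0, 1.0]]
sigma = [[2.0, 0.0, 0.0],
         [0.0, 1.5, 0.0],
         [0.0, 0.0, 0.8]]

print(is_fully_recursive(gamma_recursive, sigma))
print(is_fully_recursive(gamma_extended, sigma))
```

In practice the entries below the diagonal are estimated rather than known, which is why the question of whether the model reduces to triangular form is posed as a hypothesis test rather than as an exact check of this kind.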
The model, as we reestimated it, is presented in table 1. This is identical with what Andersen and Carlson (1970, app. C) call the interest rate version of their model, with the addition of the term $0.01(R_t X_{t-1})$ in eq. (2), the price equation. This term expresses the current value of the long term interest rate in flow terms by multiplying the long term interest rate by lagged output ($X_{t-1}$) and a scale factor.⁴
⁴ See Andersen and Carlson (1970, p. 13), the estimated price equation, for an explanation why the model requires interest in flow terms.
Table 1
The 'Federal Reserve Bank of St. Louis Model' in algebraic form.

(1) Total Spending Equation
    $\Delta Y_t = f_1(\Delta M_t, \ldots, \Delta M_{t-n}, \Delta E_t, \ldots, \Delta E_{t-n})$
(2) Price Equation
    $\Delta P_t = f_2[D_t, \ldots, D_{t-n}, 0.01(R_t X_{t-1}), 0.01(R_{t-1} X_{t-2})]$
(3) Demand Pressure Identity
    $D_t = \Delta Y_t - (X_t^F - X_{t-1})$
(4) Total Spending Identity
    $\Delta Y_t = \Delta P_t + \Delta X_t$
(5) Interest Rate Equation
    $R_t = f_3(\Delta M_t, \Delta X_t, \ldots, \Delta X_{t-n}, \Delta P_t, \Delta P_t^A)$
(6) Unemployment Rate Equation
    $U_t = f_4(G_t, G_{t-1})$
(7) GNP Gap Identity
    $G_t = (X_t^F - X_t)/X_t^F$

Endogenous variables
    $\Delta Y_t$   = change in total spending (nominal GNP)
    $\Delta P_t$   = change in price level (GNP price deflator)
    $D_t$          = demand pressure
    $\Delta X_t$   = change in output (real GNP)
    $R_t$          = market interest rate
    $\Delta P_t^A$ = anticipated change in price level
    $U_t$          = unemployment rate
    $G_t$          = GNP gap

Exogenous variables (a)
    $\Delta M_t$   = change in money stock
    $\Delta E_t$   = change in high-employment Federal expenditures
    $X_t^F$        = potential (full-employment) output

(a) Other than lagged variables.
Table 2
Tests of hypotheses about coefficients of the interest rate version of the price equation in the St. Louis model estimated by 3SLS.

Simple tests                                    H_0(1): β1 = 0    H_0(2): β2 = 0
Normal test statistic (= coeff./SE)                  1.168            -0.553
Asymptotic test size based on normal pdf             12%               29%
F(1,22) test statistic                               0.158             0.035
t(22) (= sqrt of F(1,22)) test statistic             0.397             0.188
Test size based on asymptotic t pdf                  35%               43%

Joint test, H_0(3): β1 = β2 = 0                     F(2,22)           χ²(2)
Computed value                                       4.03              69.27
Critical value, with test size α = 5%                3.44              5.99
Critical value, with test size α = 2.5%              4.38              7.38

β1 = coefficient of current interest flow variable.
β2 = coefficient of interest flow variable lagged one period.
Note: 12% represents, for the normal pdf, the probability of obtaining a value greater than 1.168, and 29% is the area under the normal curve for values less than -0.553.
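The one-sided test sizes reported in table 2 can be recomputed from the test statistics. The sketch below uses only the Python standard library: the normal tail areas come from `statistics.NormalDist`, and the $t_{22}$ tail areas are obtained by integrating the Student t density numerically with Simpson's rule. The helper names are illustrative assumptions, and the use of upper-tail areas for the t statistics is our reading of the table rather than something the table states.

```python
import math
from statistics import NormalDist


def t_pdf(x, nu):
    """Density of Student's t distribution with nu degrees of freedom."""
    log_c = math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
    c = math.exp(log_c) / math.sqrt(nu * math.pi)
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)


def t_upper_tail(x, nu, steps=2000):
    """P(T > x) for x >= 0: one half minus the Simpson integral over [0, x].

    steps must be even for Simpson's rule.
    """
    h = x / steps
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 - total * h / 3


std_normal = NormalDist()

# Test statistics as reported in table 2.
z1, z2 = 1.168, -0.553   # normal test statistics for H0(1) and H0(2)
t1, t2 = 0.397, 0.188    # t(22) test statistics

size_normal_1 = 1 - std_normal.cdf(z1)  # upper tail: alternative beta1 > 0
size_normal_2 = std_normal.cdf(z2)      # lower tail: alternative beta2 < 0
size_t_1 = t_upper_tail(t1, 22)
size_t_2 = t_upper_tail(t2, 22)

for label, p in [("normal, H0(1)", size_normal_1), ("normal, H0(2)", size_normal_2),
                 ("t(22), H0(1)", size_t_1), ("t(22), H0(2)", size_t_2)]:
    print(f"{label}: {100 * p:.1f}%")
```

The printed percentages should agree with the test sizes in table 2 up to rounding.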
In the price equation Andersen and Carlson use interest in flow terms as an index of expectations about future prices. The current interest rate should reflect current expectations. The interest rate lagged one period would only reflect expectations of the previous period. Since the model is no longer recursive, we used 3SLS to estimate the coefficients.⁵
⁵ Detailed results are available on request.

6.2. Tests and results
Given $\beta_1$, the coefficient of $0.01(R_t X_{t-1})$, and $\beta_2$, the coefficient of $0.01(R_{t-1} X_{t-2})$ in eq. (2), the price equation of table 1, we tested the following one-sided hypotheses:
    1) $H_0(1)$: $\beta_1 = 0$ versus $H_A(1)$: $\beta_1 > 0$,
    2) $H_0(2)$: $\beta_2 = 0$ versus $H_A(2)$: $\beta_2 < 0$,
using as a basis for inference the normal test and our 3SLS extension of Dhrymes' t-test. The $\chi^2$ test and our 3SLS extension of Dhrymes' F test were used to test the joint hypothesis
    3) $H_0(3)$: $\beta_1 = \beta_2 = 0$ versus $H_A(3)$: $\beta_1 \neq 0$, $\beta_2 \neq 0$.

Table 3
Examples of actions taken w.r.t. the null hypotheses H_0(1) and H_0(3) in table 2, for different test sizes.

Hypothesis H_0(1)
Test size α (significance level)    Normal test statistic    t test statistic
0.01                                accept H_0(1)            accept H_0(1)
0.05                                accept                   accept
0.10                                accept                   accept
0.15                                reject H_0(1)            accept
0.30                                reject                   accept
0.40                                reject                   reject H_0(1)

Hypothesis H_0(3)
Test size α (significance level)    χ² test statistic        F test statistic
0.01                                reject H_0(3)            accept H_0(3)
0.025                               reject                   accept
0.05                                reject                   reject H_0(3)
0.10                                reject                   reject
The results are presented in table 2. In the case of $H_0(1)$ the asymptotic normal test (coefficient divided by its standard error) rejects $H_0(1)$ approximately at a test size⁶ of 12%, whereas the t test rejects $H_0(1)$ only when the test size is approximately 35%. In the case of the joint test of $H_0(3)$, the values of both F and $\chi^2$ are very high (suggesting $\beta_1 \neq 0$, $\beta_2 \neq 0$), but the computed value of $\chi^2_2$ is more than 11 times its critical value at a test size of 5%, while the F test is less than 1.2 times the critical value at the same test size. Table 3 gives examples of actions taken w.r.t. the null hypotheses $H_0(1)$ and $H_0(3)$ for different test sizes.
⁶ Lindley (1965, p. 60), for example, talks about exact significance level.
We may also interpret these tests as asymptotic approximations to Lindley-type Bayesian tests. As indicated in sect. 2, the leading term in the asymptotic expansion of the posterior pdf of $\beta_1$ and $\beta_2$ is a normal distribution with mean the 3SLS estimates $\hat{\beta}_1$ and $\hat{\beta}_2$, respectively. Using a Lindley test, $\Pr(\beta_1 > 0) = 0.88$, that is, 88% of the area under the posterior pdf for $\beta_1$ is for values greater than zero.
These differences between the $\chi^2$ and normal on the one hand and the t and F on the other hand are only partly accounted for by the differences in power, as there are also differences due to the asymptotic approximation. To calculate the power we need $\lambda$, the noncentrality parameter, which depends on the unknown population covariance matrix in the case of $\chi^2$ and F, or on the standard deviation for the normal and t tests. However, we can use the sample estimates to give us an approximation to the power. In the case of the normal and t tests, $\lambda$ is the coefficient divided by its standard error (table 2, line 1). Under $H_A(1)$, and conditionally upon $\lambda = \hat{\lambda} \approx 1$, a 't' test with $\nu = 22$ has a power close to the normal. It may therefore seem remarkable that in table 2 the asymptotic test size of the normal is so different from that of the t test with $\nu = 22$. However, as proved in Vandaele-Morgan (1972), the 3SLS t test is based on the transformed model, which implies a quadratic form in the residuals different from the quadratic form in the normal test statistic. In the case of the F and $\chi^2$ tests, the estimated noncentrality parameter is the value of $\chi^2$ (see table 2, part 2). Both $\chi^2_2$ and $F_{2,22}$ will have a power greater than 0.999, beyond the range of the available tables. Again the large difference in our results is mainly due to the use of the transformed model in estimating the denominator of the F statistic.
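Two of the numbers discussed above can be reproduced with a few lines of standard-library Python. First, under the normal approximation to the posterior pdf, the probability that $\beta_1$ exceeds zero is the standard normal cdf evaluated at the ratio of the 3SLS estimate to its standard error (1.168 in table 2). Second, exact significance levels for the joint statistics follow from the closed-form tail areas available when the numerator has two degrees of freedom. Variable names are illustrative assumptions; the input values are those reported in table 2.

```python
import math
from statistics import NormalDist

# Lindley-type posterior probability under the normal approximation.
z1 = 1.168                                  # coefficient / standard error (table 2)
prob_beta1_positive = NormalDist().cdf(z1)  # area of the posterior pdf above zero

# Exact significance levels of the joint statistics for H0(3).
chi2_stat, f_stat, nu2 = 69.27, 4.03, 22
p_chi2 = math.exp(-chi2_stat / 2)            # P(chi2_2 > x) = exp(-x / 2)
p_f = (1 + 2 * f_stat / nu2) ** (-nu2 / 2)   # P(F_{2,nu2} > f), valid for nu1 = 2

print(f"Pr(beta1 > 0)     = {prob_beta1_positive:.2f}")
print(f"chi-square p-value = {p_chi2:.2e}")
print(f"F(2,22) p-value    = {p_f:.4f}")
```

The F significance level lies between 2.5% and 5%, in line with the switch from accept to reject between those two test sizes in table 3, whereas the $\chi^2$ significance level is negligible at any conventional test size.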
We must not put too much weight on one example. Indeed, an implicit assumption is that the model is correctly specified. There are probably misspecifications, and the different tests may reflect these differently. Also, using two interest flow variables could introduce collinearity, and this would affect the reliability of the results. However, taking into account the above remarks, the difference in results in this application with 218 degrees of freedom suggests that the asymptotic t and F tests used are not equivalent to the normal and $\chi^2$ tests.
7. Conclusion
In this paper we have given an analysis of different tests in the context of a simultaneous equation model. A major part of the analysis was based on the evaluation of the power of the different tests. The evidence on normal and $\chi^2$ asymptotic tests vis-à-vis those based on the asymptotic t and F distributions is not complete, since small sample properties are largely unknown. However, we have shown the greater asymptotic power of tests based on the normal and $\chi^2$ distributions. Our application of the tests shows a striking difference in the test sizes of the alternative tests and shows that the asymptotic normal and $\chi^2$ tests lead to conclusions, in a particular finite sample, different from those of the asymptotic t and F tests. The evidence on the power we have presented suggests that tests based on the asymptotic normal and $\chi^2$ distributions are to be preferred to those based on the asymptotic t and F distributions.

References
Andersen, L.C. and K.M. Carlson, 1970, A monetarist model for economic stabilization, Federal Reserve Bank of St. Louis Review 52, no. 4, 7-21.
Anderson, T.W. and H. Rubin, 1949, Estimation of the parameters of a single equation in a complete system of stochastic equations, Annals of Mathematical Statistics 20, no. 1, 46-63.
Basmann, R.L., 1960, On finite sample distributions of generalized classical linear identifiability test statistics, Journal of the American Statistical Association 55, no. 292, 650-659.
Basmann, R.L., 1961, A note on the exact finite sample frequency functions of generalized classical linear estimators in two leading over-identified cases, Journal of the American Statistical Association 56, no. 295, 619-636.
Basmann, R.L., 1963a, A note on the exact finite sample frequency functions of generalized classical linear estimators in a leading three equation case, Journal of the American Statistical Association 58, no. 301, 161-171.
Basmann, R.L., 1963b, Remarks concerning the application of exact finite sample distribution functions of GCL estimators in econometric statistical inference, Journal of the American Statistical Association 58, no. 304, 943-976.
Basmann, R.L., F.L. Brown, W.S. Dawes and G.K. Schoepfle, 1971, Exact finite sample density functions of GCL estimators of structural coefficients in a leading exactly identifiable case, Journal of the American Statistical Association 66, no. 333, 122-126.
Cragg, J.G., 1964, Small sample properties of various simultaneous-equation estimators: The results of some Monte-Carlo experiments, Princeton University, Econometric Research Program, Research Memorandum no. 68, 408 pp.
Cragg, J.G., 1967a, On the relative small-sample properties of several structural equation estimators, Econometrica 35, no. 1, 89-110.
Cragg, J.G., 1967b, Small-sample performances of various simultaneous equation estimators in estimating the reduced form, Metroeconomica 19, no. 2, 77-93.
Cragg, J.G., 1968, Some effects of incorrect specification on the small sample properties of several simultaneous equation estimators, International Economic Review 9, no. 1, 63-86.
DeGroot, M.H., 1973, Interpreting a tail area as a posterior probability or as a likelihood ratio, Journal of the American Statistical Association 68, no. 344, 966-969.
Dhrymes, P.J., 1969, Alternative asymptotic tests of significance and related aspects of 2SLS and 3SLS estimated parameters, The Review of Economic Studies 36 (2), no. 106, 213-226.
Dhrymes, P.J., 1970, Econometrics, statistical foundations and applications (Harper and Row, New York) 592 pp.
Kabe, D.G., 1963, A note on the exact distributions of the GCL estimators in two leading over identified cases, Journal of the American Statistical Association 58, no. 302, 535-537.
Kabe, D.G., 1964, On the exact distributions of the GCL estimators in a leading three equation case, Journal of the American Statistical Association 59, no. 306, 881-894.
Lindley, D.V., 1965, Introduction to probability and statistics from a Bayesian viewpoint, Part 2: Inference (Cambridge University Press, Cambridge) 292 pp.
Neyman, J., K. Iwaszkiewicz and S. Kolodziejczyk, 1935, Statistical problems in agricultural experimentation, Supplement to the Journal of the Royal Statistical Society 2, no. 2, 107-180.
Owen, D.B., 1965, The power of Student's t-test, Journal of the American Statistical Association 60, no. 309, 320-333.
Patnaik, P.B., 1949, The non-central χ²- and F-distributions and their applications, Biometrika 36, no. 1/2, 202-232.
Pearson, E.S. and H.O. Hartley, 1951, Charts of the power function for analysis of variance tests derived from the noncentral F-distribution, Biometrika 38, no. 1/2, 112-130.
Revankar, N.S., 1971, The two-stage least-squares estimators as Aitken estimators and the asymptotic tests of significance, State University of Buffalo, Economic Research Group, Discussion Paper no. 174, 16 pp.
Revankar, N. and P. Mallela, 1972, The power of an F-test in the context of a structural equation, Econometrica 40, no. 5, 913-915.
Richardson, D.H., 1968, The exact distribution of a structural coefficient estimator, Journal of the American Statistical Association 63, no. 324, 1214-1226.
Richardson, D.H. and R.J. Rohr, 1971, Distribution of a structural t-statistic for the case of two included endogenous variables, Journal of the American Statistical Association 66, no. 334, 375-382.
Sawa, T., 1969, The exact sampling distribution of ordinary least squares and two-stage least squares estimators, Journal of the American Statistical Association 64, no. 327, 923-937.
Takeuchi, K., 1970, Exact sampling moments of the ordinary least squares, instrumental variable and two-stage least squares estimators, International Economic Review 11, no. 1, 1-12.
Tang, P.C., 1938, The power function of the analysis of variance tests with tables and illustrations of their use, Statistical Research Memoirs, vol. II, 126-157.
Tiku, M.L., 1967, Tables of the power of the F-test, Journal of the American Statistical Association 62, no. 318, 525-539.
Tiku, M.L., 1972, More tables of the power of the F-test, Journal of the American Statistical Association 67, no. 339, 709-710.
Vandaele, W. and A. Morgan, 1972, Asymptotic test of significance of 3SLS estimated parameters (H.G.B. Alexander Research Foundation, University of Chicago, Chicago) 11 pp.
Zellner, A., 1971, An introduction to Bayesian inference in econometrics (John Wiley and Sons, New York) 431 pp.