THE EXISTENCE OF MOMENTS OF SOME SIMPLE BAYES ESTIMATORS OF COEFFICIENTS IN A SIMULTANEOUS EQUATION MODEL

Journal of Econometrics 7 (1978) 1-13. © North-Holland Publishing Company

Jatinder S. MEHTA
Department of Mathematics, Temple University, Philadelphia, PA 19122, USA

Paravastu A.V.B. SWAMY
Federal Reserve System, Washington, DC 20551, USA

Received November 1975, final version received March 1977

In this paper a simple modification of the usual k-class estimators is suggested so that for 0 ≤ k ≤ 1 the problem of the non-existence of moments disappears. These modified estimators can be interpreted either as Bayes estimators or as constrained estimators subject to the restriction that the squared length of the coefficient vector is less than or equal to a given number.

1. Introduction

The ith structural equation of a G-equation model may be written as

    yᵢ = Yᵢγᵢ + Xᵢβᵢ + uᵢ,                                        (1)

where yᵢ is the T × 1 vector of observations on a left-hand endogenous variable, Yᵢ is the T × (Gᵢ − 1) matrix of observations on the endogenous variables which appear in the equation, Xᵢ is the T × Kᵢ matrix of observations on the exogenous variables that appear in the equation, uᵢ is the T × 1 disturbance vector, and γᵢ and βᵢ are coefficient vectors of dimensions (Gᵢ − 1) × 1 and Kᵢ × 1, respectively. Here Gᵢ < G. In the full model there are K − Kᵢ other exogenous variables whose observations are given in the T × (K − Kᵢ) matrix X₂ᵢ. We assume that eq. (1) is identifiable. Mariano (1972) has shown that when certain conditions are satisfied, the even moments of the usual two-stage least-squares estimator of γᵢ exist only up to an order not exceeding the degree of over-identification of eq. (1). In particular, the two-stage least-squares estimator of γᵢ does not possess finite moments of any integer order if eq. (1) is exactly identified; see also Richardson (1968), Sawa (1969) and Hatanaka (1973). Considering the cases in which


Gᵢ = 2, T > K and K − Kᵢ > 3, Sawa (1972) has derived explicit formulas for the mean and variance of a (k) estimator with 0 ≤ k ≤ 1. [Our terminology follows that of Goldberger (1964, p. 336).] Sawa also proves that a (k) estimator with k > 1 does not even possess a finite first-order moment; see also Hatanaka (1973). We will show in this paper that for 0 ≤ k ≤ 1 the problem of the non-existence of moments of integer order disappears if we consider an approximate Bayes estimator or a restricted two-stage least-squares estimator which can be derived by making simple changes in the usual k-class estimators. In section 2 we derive some simple Bayes estimators of coefficients in a single structural equation and prove two lemmas which will be used in the sequel. In section 3 we consider the case of two included endogenous variables and prove the existence of the second-order moments of the Bayes estimators developed in section 2. Concluding remarks are given in section 4.
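The modification studied in this paper is easy to state computationally: the usual two-stage least-squares normal equations are replaced by ridge-adjusted ones, as in the estimator (4) of section 2. The sketch below (Python with NumPy; the data-generating process, the sample sizes and the value μ = 1 are ours, chosen only for illustration) computes both versions on simulated data:

```python
import numpy as np

# Simulated data for one structural equation: y = Y*gamma + X1*beta + u,
# with K exogenous variables in X = (X1, X2) and one included endogenous
# variable Y.  All sizes and parameter values are illustrative only.
rng = np.random.default_rng(0)
T, K, K1 = 50, 4, 2
X = rng.normal(size=(T, K))
X1 = X[:, :K1]                            # included exogenous variables
V = rng.normal(size=(T, 1))
Y = X @ rng.normal(size=(K, 1)) + V       # reduced form for Y
u = 0.8 * V[:, 0] + rng.normal(size=T)    # disturbance correlated with V
y = 0.5 * Y[:, 0] + X1 @ np.array([1.0, -1.0]) + u

Z = np.hstack([Y, X1])                    # Z_i = (Y_i, X_i)
P = X @ np.linalg.solve(X.T @ X, X.T)     # projection onto col(X)

def delta_hat(mu):
    """Estimator (4): [Z'X(X'X)^{-1}X'Z + mu*I]^{-1} Z'X(X'X)^{-1}X'y."""
    A = Z.T @ P @ Z + mu * np.eye(Z.shape[1])
    return np.linalg.solve(A, Z.T @ P @ y)

d_2sls = delta_hat(0.0)   # usual two-stage least squares (mu = 0)
d_mod = delta_hat(1.0)    # modified (approximate Bayes) estimator, mu > 0
print(d_2sls, d_mod)
```

For μ > 0 the solution is shrunk toward the zero prior mean; it is this shrinkage that rules out the heavy tails responsible for the non-existence of moments.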

2. Bayes estimators of coefficients

The portions of the reduced form of the full model which relate to yᵢ and Yᵢ in eq. (1) can be written as

    yᵢ = Xπᵢ + vᵢ   and   Yᵢ = XΠᵢ + Vᵢ.                          (2)

Assumptions:

(i) E(uᵢ | X) = 0 and E(uᵢuᵢ′ | X) = σᵢᵢI.
(ii) Given X, the rows of (vᵢ, Vᵢ) in (2) are independently and normally distributed with mean 0 and positive definite variance-covariance matrix Ω.
(iii) The T × K matrix X = (Xᵢ, X₂ᵢ) is 'fixed' and has full column rank.¹

¹The case in which T ≤ K is considered by Swamy and Holmes (1971) and Swamy and Mehta (1976).

Defining the matrix Zᵢ = (Yᵢ, Xᵢ) and the vector δᵢ = (γᵢ′, βᵢ′)′, eq. (1) can be written as

    yᵢ = Zᵢδᵢ + uᵢ.                                               (3)

Suppose that the prior distribution of δᵢ is such that, given τ² > 0, δᵢ is normally distributed with mean vector 0 and variance-covariance matrix τ²I. From the Bayesian normal theory developed by Zellner and Vandaele (1975, p. 649) it follows that the mean of the conditional posterior distribution of δᵢ, given Πᵢ = Π̂ᵢ = (X′X)⁻¹X′Yᵢ, σᵢᵢ, τ² and the data, is approximately equal to

    δ̂ᵢμ = [Zᵢ′X(X′X)⁻¹X′Zᵢ + μI]⁻¹Zᵢ′X(X′X)⁻¹X′yᵢ,               (4)

where μ = σᵢᵢ/τ². If the mean and the variance-covariance matrix of δᵢ are δ̄ ≠ 0 and τ²A⁻¹ ≠ τ²I, respectively, then the estimator (4) gets modified as in Zellner and Vandaele (1975, p. 649, (81)). To avoid some complications, we confine our attention to the case where δ̄ = 0 and A = I. The nature of the approximation underlying (4) is explained in Zellner (1971, p. 266). The estimator (4) arises quite naturally in a restricted least-squares context. To see this, rewrite eq. (3) as

    yᵢ = Z̄ᵢδᵢ + vᵢ,                                              (5)

where

    Z̄ᵢ = (XΠ̂ᵢ, Xᵢ)   and   vᵢ = (Yᵢ − XΠ̂ᵢ)γᵢ + uᵢ.

As lucidly exposited by Zellner (1972), the usual specifying assumptions for econometric models imply that the true value of δᵢ lies in or on an ellipsoid given by

    (1/τ²)(δᵢ − δ̄)′A(δᵢ − δ̄) = Kᵢ + Gᵢ + 1,                      (6)

for suitable δ̄ and τ²A⁻¹. This prior information on δᵢ is similar to that encountered in connection with a Bayesian analysis of eq. (1), because a uniform distribution over the interior of (6) has the same mean vector and variance-covariance matrix as the prior distribution of δᵢ with mean δ̄ and variance-covariance matrix τ²A⁻¹; see Cramér (1946, p. 300). In the practical situation where it is too costly to assess the prior parameters δ̄ and A⁻¹, we may assume that the true value of δᵢ lies in or on a hypersphere of radius r with center at 0, i.e.,²

    δᵢ′δᵢ ≤ r².                                                   (7)

²Other reasons for using a diffuse prior distribution are given by Zellner (1971, p. 20).

For sufficiently large r the ellipsoid (6) will be entirely within the sphere (7). When there is a high degree of collinearity among the exogenous variables, given the value of Πᵢ, the least-squares estimator of δᵢ in eq. (5) violates the constraint in (7). In that situation the value of δᵢ which minimizes the error sum of squares (yᵢ − Z̄ᵢδᵢ)′(yᵢ − Z̄ᵢδᵢ) subject to the constraint in (7) is

    δ̂ᵢ = (Z̄ᵢ′Z̄ᵢ + μI)⁻¹Z̄ᵢ′yᵢ,                                   (8)

where μ is chosen such that δ̂ᵢ′δ̂ᵢ = r²; see Meeter (1966). The estimator (8) can be approximated by (4). Guidelines for the choice of μ in some simple cases are given in Swamy, Mehta and Rappoport (1975) and Swamy and Mehta (1976). An argument parallel to that of Johnston (1963, p. 260) shows that the estimator of γᵢ implied by the estimator (4) is

    gᵢμ = (Yᵢ′M₁μYᵢ + μI)⁻¹Yᵢ′M₁μyᵢ,                              (9)

where M₁μ = X(X′X)⁻¹X′ − Xᵢ(Xᵢ′Xᵢ + μI)⁻¹Xᵢ′. Since the estimator of βᵢ is related to gᵢμ by

    β̂ᵢμ = (Xᵢ′Xᵢ + μI)⁻¹Xᵢ′(yᵢ − Yᵢgᵢμ),                          (10)

the moments of β̂ᵢμ can be obtained from the moments of (yᵢ − Yᵢgᵢμ). The moments of β̂ᵢμ cannot be finite if the corresponding moments of gᵢμ are not finite. So we should first find the moments of gᵢμ. To prove our results on the existence of moments of gᵢμ, we need the following lemmas:

Lemma 1. M₁μ is a positive semi-definite matrix of rank K.

Proof. We may write

    M₁μ = I − Xᵢ(Xᵢ′Xᵢ + μI)⁻¹Xᵢ′ − [I − X(X′X)⁻¹X′].            (11)

It follows from a result in Zellner (1971, p. 73, eq. (3.59)) that

    I − Xᵢ(Xᵢ′Xᵢ + μI)⁻¹Xᵢ′ = μ(XᵢXᵢ′ + μI)⁻¹ = μ(XDX′ + μI)⁻¹,  (12)

where

    D = ( I_{Kᵢ}  0′ )
        (  0      0  ),

and 0 is the (K − Kᵢ) × Kᵢ matrix of zeroes. The following result can be verified by direct matrix multiplication:

    μ(XDX′ + μI)⁻¹ = I − X(X′X)⁻¹X′ + μX(X′X)⁻¹[μ(X′X)⁻¹ + D]⁻¹(X′X)⁻¹X′.   (13)

Using (12) and (13), we have

    M₁μ = μX(X′X)⁻¹[μ(X′X)⁻¹ + D]⁻¹(X′X)⁻¹X′.                    (14)

From (14) it is seen that M₁μ is positive semi-definite and rank(M₁μ) = rank(X) = K.  Q.E.D.

Lemma 2. The K positive eigenvalues of M₁μ are either equal to 1 or less than 1. The number of eigenvalues which are equal to 1 is K − Kᵢ. The remaining Kᵢ positive eigenvalues are less than 1.
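Lemmas 1 and 2 are easy to check numerically. The sketch below (dimensions and the value of μ are ours, chosen only for illustration) forms M₁μ = X(X′X)⁻¹X′ − Xᵢ(Xᵢ′Xᵢ + μI)⁻¹Xᵢ′ directly and counts its eigenvalues:

```python
import numpy as np

# Numerical check of Lemmas 1 and 2: M should be positive semi-definite
# with rank K, with K - K_i eigenvalues equal to 1 and the remaining K_i
# positive eigenvalues strictly between 0 and 1.
rng = np.random.default_rng(1)
T, K, Ki, mu = 20, 5, 2, 0.7
X = rng.normal(size=(T, K))
Xi = X[:, :Ki]                                     # included exogenous vars

P_X = X @ np.linalg.solve(X.T @ X, X.T)            # projection onto col(X)
M = P_X - Xi @ np.linalg.solve(Xi.T @ Xi + mu * np.eye(Ki), Xi.T)

eig = np.sort(np.linalg.eigvalsh(M))
n_zero = np.sum(np.abs(eig) < 1e-8)                # T - K of them
n_one = np.sum(np.abs(eig - 1.0) < 1e-8)           # K - K_i of them
n_mid = np.sum((eig > 1e-8) & (eig < 1.0 - 1e-8))  # K_i of them
print(n_zero, n_one, n_mid)   # expect 15, 3, 2
```

The K − Kᵢ unit eigenvalues correspond to directions in col(X) orthogonal to col(Xᵢ), on which the second term in M vanishes and the projection acts as the identity.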

Proof. To find the eigenvalues of M₁μ we consider |M₁μ − λI_T| = 0 which, in view of (14), can be written as

    |μX(X′X)⁻¹[μ(X′X)⁻¹ + D]⁻¹(X′X)⁻¹X′ − λI_T| = 0.             (15)

If the matrix products AB and BA are defined, their non-zero eigenvalues are identical; see MacDuffee (1946, Theorem 16.2). Using this result we can write eq. (15) as

    |μ(X′X)⁻¹[μ(X′X)⁻¹ + D]⁻¹ − λI_K| = 0,                        (16)

or equivalently, as

    |[I_K + (1/μ)D(X′X)]⁻¹ − λI_K| = 0.                           (17)

Since the eigenvalues of the inverse are the reciprocals of the eigenvalues of the original matrix, we have

    |I_K + (1/μ)D(X′X) − (1/λ)I_K| = 0,                           (18)

which is expressible in the form

    (1 − 1/λ)^{K−Kᵢ} |I_{Kᵢ} + (1/μ)Xᵢ′Xᵢ − (1/λ)I_{Kᵢ}| = 0.     (19)

This shows that M₁μ has K − Kᵢ unit roots and further

    |Xᵢ′Xᵢ − f I_{Kᵢ}| = 0,                                       (20)

where f = μ(1 − λ)/λ. We have f > 0 and hence λₕ < 1 for h = 1, 2, …, Kᵢ, because of the positive definiteness of Xᵢ′Xᵢ.  Q.E.D.

3. The case of two included endogenous variables

We now specialize (9) to the example in which Gᵢ = 2 and write

    gᵢμ = Yᵢ′M₁μyᵢ/(Yᵢ′M₁μYᵢ + μ).                                (21)

Following Sawa (1972) we may rewrite eq. (1) in the form

    yᵢ* = Yᵢ*γᵢ* + Xᵢβᵢ* + uᵢ*,                                   (22)

where

    Ψ = (  1/ξ    0  )
        ( −ρ/ξ   1/ω ),

    ξ² = Ω₁₁ − Ω₁₂²/Ω₂₂,    ω = Ω₂₂^{1/2},    ρ = Ω₁₂/Ω₂₂,

    γᵢ* = ω(γᵢ − ρ)/ξ,    (yᵢ*, Yᵢ*) = (yᵢ, Yᵢ)Ψ,    βᵢ* = ξ⁻¹βᵢ,    uᵢ* = ξ⁻¹uᵢ.

Yᵢ* is independent of yᵢ*. In terms of these variables, the estimator (21) can be written as

    gᵢμ* = (ρYᵢ*′M₁μYᵢ* + ξ*Yᵢ*′M₁μyᵢ*)/(Yᵢ*′M₁μYᵢ* + μ*),       (23)

where ξ* = ξ/ω and μ* = μ/ω².³ From this relation it follows that the rth moment of gᵢμ is finite if the rth moment of gᵢμ* is finite. The problem now resolves into showing that the rth moment of gᵢμ* is finite. A useful result is provided in Lemma 3.

³The authors are indebted to a referee for pointing out an error in their previous derivation of the result in (23).

Lemma 3. Let (X₁, X₂)′ be a 2 × 1 vector of random variables such that Pr(X₁ ≤ 0) = 0, Pr(X₂ > 0) > 0 and Pr(X₂ ≤ 0) > 0. Suppose that the joint moment generating function of X₁ and X₂, denoted

    φ(t₁, t₂) = E[exp(t₁X₁ + t₂X₂)],                              (24)

exists and is uniformly continuous for −∞ < t₁ ≤ ε > 0 and |t₂| ≤ ε > 0. Then E(X₂/X₁)^r < ∞ for r ≤ 2n if the integral

    (1/Γ(2n)) ∫_{−∞}^{0} (−t₁)^{2n−1} [∂^{2n}φ(t₁, t₂)/∂t₂^{2n}]_{t₂=0} dt₁   (25)

converges.
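To see Lemma 3 at work in a case where everything is available in closed form, take X₁ ~ χ²_m and X₂ ~ N(0, 1) independent (this example, and the choice m = 10, are ours, not the paper's). Then φ(t₁, t₂) = (1 − 2t₁)^{−m/2} exp(t₂²/2), the even-order derivative in (25) with n = 1 is [∂²φ/∂t₂²]_{t₂=0} = (1 − 2t₁)^{−m/2}, and the integral (25) can be evaluated explicitly and compared with a Monte Carlo estimate of E(X₂/X₁)²:

```python
import numpy as np

# (25) with n = 1: (1/Gamma(2)) * int_{-inf}^0 (-t1)(1 - 2*t1)^(-m/2) dt1.
# Substituting s = 1 - 2*t1 gives 0.25*(1/(m/2 - 2) - 1/(m/2 - 1))
# = 1/((m - 2)(m - 4)), which equals E(X2/X1)^2 = E(X2^2)*E(X1^{-2})
# for a chi-square X1 with m > 4 degrees of freedom.
m = 10
p = m / 2.0
integral = 0.25 * (1.0 / (p - 2.0) - 1.0 / (p - 1.0))

rng = np.random.default_rng(1)
x1 = rng.chisquare(m, size=1_000_000)
x2 = rng.standard_normal(1_000_000)
mc = np.mean((x2 / x1) ** 2)   # Monte Carlo check of the same moment
print(integral, mc)            # both close to 1/48 ≈ 0.0208
```

Here the moment exists, so the bound (25) holds with equality; the lemma's content is that convergence of the integral is what guarantees finiteness.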

Proof. Suppose that φ(t₁, t₂) exists and is uniformly continuous for −∞ < t₁ ≤ ε > 0 and |t₂| ≤ ε > 0. Given that the function φ(t₁, t₂) has, for the particular value t₂ = 0, a finite partial derivative of even order 2n, this derivative is equal to the limit

    [∂^{2n}φ(t₁, t₂)/∂t₂^{2n}]_{t₂=0} = lim_{t₂→0} E[(sinh t₂X₂/t₂X₂)^{2n} X₂^{2n} e^{t₁X₁}].

Now note that under the assumed conditions {f_{t₂} = (sinh t₂X₂/t₂X₂)^{2n} X₂^{2n} e^{t₁X₁}} is a family of non-negative integrable functions with lim_{t₂→0} f_{t₂} = X₂^{2n} e^{t₁X₁}. Then, by Fatou's lemma [Rao (1973, p. 135)], lim inf_{t₂→0} f_{t₂} is integrable and

    E(X₂^{2n} e^{t₁X₁}) ≤ lim inf_{t₂→0} E(f_{t₂}) = [∂^{2n}φ(t₁, t₂)/∂t₂^{2n}]_{t₂=0}.   (26)

Multiplying both sides of the inequality in (26) by the non-negative quantity (−t₁)^{2n−1}/Γ(2n), −∞ < t₁ ≤ 0, and integrating from −∞ to 0 with respect to t₁, we obtain

    (25) ≥ (1/Γ(2n)) ∫_{−∞}^{0} (−t₁)^{2n−1} E(X₂^{2n} e^{t₁X₁}) dt₁.   (27)

Since the integrand on the right-hand side of the inequality sign in (27) is non-negative, Fubini's theorem [Rao (1973, p. 137)] shows that we may reverse the order of the expectation sign and integration with respect to t₁. Thus

    (25) ≥ E[X₂^{2n} (1/Γ(2n)) ∫_{−∞}^{0} (−t₁)^{2n−1} e^{t₁X₁} dt₁] = E(X₂/X₁)^{2n}.   (28)

From (28), we see that E(X₂/X₁)^{2n} < ∞ if the integral in (25) converges. The finiteness of E(X₂/X₁)^{2n} implies that of E(X₂/X₁)^r for r ≤ 2n in view of the inequality |(X₂/X₁)^r| ≤ 1 + |(X₂/X₁)|^{2n}.  Q.E.D.

We remark that our Lemma 3 is not true if a derivative ∂ⁿφ(t₁, t₂)/∂t₂ⁿ of odd order is considered in place of the even-order derivative considered in (25). [See Cramér (1946, p. 90) and Rao (1973, p. 100, (v)) for a similar result in the univariate case.] If we use an odd-order derivative, (25) becomes

    (1/Γ(n)) ∫_{−∞}^{0} (−t₁)^{n−1} lim_{t₂→0} E[(sinh t₂X₂/t₂X₂)ⁿ X₂ⁿ e^{t₁X₁}] dt₁,   (26′)

where n is an odd positive integer. Now the function f_{t₂} = (sinh t₂X₂/t₂X₂)ⁿ X₂ⁿ e^{t₁X₁} is not non-negative. In (26′) we cannot use Fatou's lemma to carry the limit inside the expectation sign. This shows that the proof of Lemma 1 in Sawa (1972, p. 658) is not correct. Also, the assertion of Sawa's (1972, p. 658) Lemma 1 is incorrect because Sawa's (1972) paper itself contains a counter-example to the claim that the integral in (26′) diverges whenever E(X₂/X₁)ⁿ = ∞. Sawa (1972, pp. 664-665) sets up an integral of the form (26′) to obtain the first moment of the estimator (21) with μ = 0. Lemma 3 in Sawa (1972, p. 659) shows that this integral converges if K − Kᵢ ≥ 1. But the first moment of the estimator (21) with μ = 0 does not exist if K − Kᵢ = 1, as Sawa's (1969) earlier result shows.⁴ While necessary for the existence of an odd-order moment of (X₂/X₁), the convergence of (26′) is by no means sufficient. [See Cramér (1946, p. 90) and Rao (1973, p. 100, (iv)) for a similar result in the univariate case.] If the nth moment of X₂/X₁ exists, then one can find it by evaluating the integral in (26′).

⁴A part of the proof of Theorem 1 of Sawa (1972, p. 665) is based on his Lemma 1. In view of the error in his Lemma 1, the proof of his Theorem 1 is incomplete.

We find from eqs. (2) and (22), and making use of Assumption (ii), that, given X, yᵢ* is normally distributed with mean Xπᵢ* and variance-covariance matrix I, and Yᵢ* is normally distributed with mean XΠᵢ* and variance-covariance matrix I, i.e.,

    yᵢ* ~ N(Xπᵢ*, I_T),    Yᵢ* ~ N(XΠᵢ*, I_T),                    (29)

where X(πᵢ*, Πᵢ*) = X(πᵢ, Πᵢ)Ψ. Further, yᵢ* is independent of Yᵢ*. Let P be an orthogonal matrix such that P′M₁μP = Λ, a diagonal matrix. Lemmas 1 and 2 show that the rows of P can be so arranged that Λ takes the form

    Λ = diag[0, I, Λᵢ],                                           (30)

where 0 is a (T − K) × (T − K) matrix of zeroes, I is an identity matrix of order K − Kᵢ, Λᵢ = diag(λ₁, λ₂, …, λ_{Kᵢ}) and 0 < λ₁ ≤ λ₂ ≤ … ≤ λ_{Kᵢ} < 1. Partition P conformably to (30) by P = [P₁, P₂, P₃]. Here P₁, P₂ and P₃ are matrices of order T × (T − K), T × (K − Kᵢ) and T × Kᵢ, respectively. Following Sawa (1972, p. 663) we can show that the joint moment generating function of (Yᵢ*′M₁μYᵢ* + μ*) and (ρYᵢ*′M₁μYᵢ* + ξ*Yᵢ*′M₁μyᵢ*) is of the form

    φ(t₁, t₂) = e^{μ*t₁}(1 − 2t₁ − 2ρt₂ − ξ*²t₂²)^{−(K−Kᵢ)/2} × ⋯ ,   (31)

for (1 − 2t₁ − 2ρt₂ − ξ*²t₂²) > 0. In (31),

    a₁ = P₂′Xπᵢ*,    a₂ = P₂′XΠᵢ*,    a₃ = P₃′Xπᵢ*,    a₄ = P₃′XΠᵢ*,

and a₃ₕ and a₄ₕ are the hth elements of a₃ and a₄, respectively. Simple but tedious calculations show that for t₁ < ½,

    [∂²φ(t₁, t₂)/∂t₂²]_{t₂=0} = f₁(f₂ + f₃),                      (32)

where

    f₁ = φ(t₁, t₂)|_{t₂=0} = exp[μ*t₁](1 − 2t₁)^{−(K−Kᵢ)/2} ∏_{h=1}^{Kᵢ} ⋯ ,

and

    f₃ = [(K − Kᵢ)ξ*² + a₁′a₁ξ*² + a₂′a₂ξ*² + 4ρa₁′a₂ξ* + 2ρ²(K − Kᵢ)]/(1 − 2t₁)²
         + 4ρ²a₂′a₂/(1 − 2t₁)³ + Σ_{h=1}^{Kᵢ} ⋯ .

We now prove the following lemma:

Lemma 4. The integral

    ∫₀^∞ f(x) dx,  with f(x) as in (34) below,                     (33)

converges for bᵢ > 0, hᵢ > 0, cᵢ ≥ 0 (i = 1, 2, …, m) if μ > 0.

Proof. Let

    f(x) = ( ∏_{i=1}^{m} (cᵢ + bᵢx)^{−hᵢ} ) x ⋯ exp{−μx + ⋯}.      (34)

It can be easily verified that, for some finite constant c,

    ∫₀^∞ |f(x)| dx ≤ c ∫₀^∞ x e^{−μx} dx < ∞.                      (35)

Hence the integral (33) is absolutely convergent if μ > 0.  Q.E.D.

Theorem. The estimator gᵢμ* defined in (23) possesses a finite second-order moment if μ > 0.

Proof. Consider the integral

    (1/Γ(2)) ∫_{−∞}^{0} (−t₁) [∂²φ(t₁, t₂)/∂t₂²]_{t₂=0} dt₁,       (36)

where [∂²φ(t₁, t₂)/∂t₂²]_{t₂=0} is as defined in (32). The integral (36) can be evaluated by multiplying each term in (32) by (−t₁) and then integrating term by term with respect to t₁ over (−∞, 0). Each of these terms is of the form (34). So it follows from Lemmas 3 and 4 that the second-order moment of gᵢμ* is finite.  Q.E.D.

Notice that in the above theorem no restrictions on K − Kᵢ are imposed. The second-order moment of gᵢμ* is finite even when K − Kᵢ = 1, i.e., even when the model is exactly identified. One can use our Lemma 3 to prove the existence of the higher-order moments of gᵢμ*.⁵ These ideas easily extend to the analysis of the k-class estimators when k is non-stochastic. The (k) estimator of γᵢ* in eq. (22) is

    gᵢμ*(k) = (ρYᵢ*′M_{kμ}Yᵢ* + ξ*Yᵢ*′M_{kμ}yᵢ*)/(Yᵢ*′M_{kμ}Yᵢ* + μ*),   (37)

where

    M_{kμ} = I − Xᵢ(Xᵢ′Xᵢ + μI)⁻¹Xᵢ′ − k[I − X(X′X)⁻¹X′]
           = (1 − k)[I − X(X′X)⁻¹X′] + μX(X′X)⁻¹[μ(X′X)⁻¹ + D]⁻¹(X′X)⁻¹X′.

It can be shown that for 0 ≤ k < 1 the rank of M_{kμ} is T. The eigenvalues of M_{kμ} are (1 − k) with multiplicity (T − K) and 1 with multiplicity K − Kᵢ. The remaining Kᵢ eigenvalues of M_{kμ} lie between 0 and 1. Application of Lemma 3 shows that for 0 ≤ k < 1 and μ > 0 the second-order moment of gᵢμ*(k) is finite regardless of the value of (T − Kᵢ − 1) > 0. We only require (K − Kᵢ − 1) ≥ 0 (the necessary order condition for identification of γᵢ and βᵢ).

In an excellent paper, Zellner (1975) develops a minimum expected loss (MELO) estimator for δᵢ in eq. (3). It has been pointed out by Zellner that the conditions under which the finite-sample moments of a MELO estimator exist are the same as those under which the finite-sample moments of a (k) estimator with 0 < k < 1 exist. One such condition is that T − Gᵢ − Kᵢ + 1 ≥ 2. In contrast, the estimator (37) can possess a finite second-order moment even when Gᵢ = 2 and 0 < T − Gᵢ − Kᵢ + 1 < 2. What Zellner does not assume, while we do, is that X does not contain lagged endogenous variables. Suppose that Wᵢ = [Xᵢ*, Xᵢ], where Xᵢ* is a T × (Gᵢ − 1) submatrix of X₂ᵢ. Let rank(Wᵢ) = Kᵢ + Gᵢ − 1. Then under Assumptions (i)-(iii) any instrumental variable estimator of the form (Wᵢ′Zᵢ)⁻¹Wᵢ′yᵢ will not even possess a finite mean. We can write

    (Wᵢ′Zᵢ)⁻¹Wᵢ′yᵢ = [Zᵢ′Wᵢ(Wᵢ′Wᵢ)⁻¹Wᵢ′Zᵢ]⁻¹Zᵢ′Wᵢ(Wᵢ′Wᵢ)⁻¹Wᵢ′yᵢ.

⁵These results together with the Cauchy-Schwarz inequality can be used to show the existence of moments of the product Yᵢ*gᵢμ*. It is obvious that an even moment of the estimator (10) exists if the corresponding moment of Yᵢ*gᵢμ* exists.
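A small Monte Carlo experiment makes the Theorem concrete. In the exactly identified design below (all numbers ours, chosen for illustration, with no included exogenous variables so that M₁μ reduces to the projection onto the instrument), the usual estimator with μ = 0 is a ratio of quadratic forms with no finite integer moments and occasionally produces wild draws, while the modified estimator with μ > 0 damps every draw because its denominator is bounded away from zero:

```python
import numpy as np

# Exactly identified equation (K - K_i = 1, K_i = 0): y = 0.5*Y + u,
# with one excluded instrument x2.  All design values are illustrative.
rng = np.random.default_rng(2)
T, R, mu = 25, 5000, 1.0
g0 = np.empty(R)   # usual estimator, mu = 0 (2SLS)
g1 = np.empty(R)   # modified estimator, mu > 0
for r in range(R):
    x2 = rng.normal(size=T)
    v = rng.normal(size=T)
    Y = 0.1 * x2 + v                  # weak reduced form, to expose the tails
    u = 0.9 * v + rng.normal(size=T)  # disturbance correlated with v
    y = 0.5 * Y + u
    X = x2[:, None]
    P = X @ np.linalg.solve(X.T @ X, X.T)
    num, den = Y @ P @ y, Y @ P @ Y   # den > 0 almost surely
    g0[r] = num / den
    g1[r] = num / (den + mu)          # estimator (21), M_1mu = P here
print(np.max(np.abs(g0)), np.max(np.abs(g1)))
```

Because |num|/(den + μ) ≤ |num|/den draw by draw, the modified estimates can never exceed the usual ones in magnitude; the Theorem says more, namely that their second moment is finite.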


The implication of our Theorem is that the modified instrumental variable estimator

    [Zᵢ′Wᵢ(Wᵢ′Wᵢ)⁻¹Wᵢ′Zᵢ + μI]⁻¹Zᵢ′Wᵢ(Wᵢ′Wᵢ)⁻¹Wᵢ′yᵢ

will possess finite second-order moments for every μ > 0.

4. Concluding remarks

In this paper finite-sample properties of some simple Bayes estimators of structural coefficients in a linear simultaneous equation model are considered. This work establishes that the Bayes estimators considered here have finite second-order moments, and thus finite risk relative to a quadratic loss function, regardless of the degree of over-identification of the equation being estimated. This is a decided improvement relative to the usual two-stage least-squares estimators, which do not, in general, possess finite second-order moments and have unbounded risk relative to a quadratic loss function if the degree of over-identification of the equation is less than or equal to unity. Thus it has been shown how simple modifications of the usual estimators, as in (4) and (37), lead to estimators which are, at least sometimes, superior to the usual estimators. Even though the prior information contained in the constraint in (7) is not as rich as that contained in the constraint in (6), the latter is not as easily assessable in practice as the former.

Acknowledgement

We thank the referees for their comments which led to improvements in the manuscript. We wish to thank Professor R. Srinivasan for some very helpful discussions with respect to the proof of Lemma 3.

References

Cramér, H., 1946, Mathematical methods of statistics (Princeton University Press, Princeton, NJ).
Goldberger, A.S., 1964, Econometric theory (Wiley, New York).
Hatanaka, M., 1973, On the existence and the approximation formulae for the moments of the k-class estimators, Economic Studies Quarterly 24, 1-15.
Johnston, J., 1963, Econometric methods (McGraw-Hill, New York).
MacDuffee, C., 1946, The theory of matrices (Chelsea, New York).
Mariano, R.S., 1972, The existence of moments of the ordinary least squares and two-stage least squares estimators, Econometrica 40, 643-652.
Meeter, D.A., 1966, On a theorem used in non-linear least squares, Journal of the Society for Industrial and Applied Mathematics 14, 1176-1179.
Rao, C.R., 1973, Linear statistical inference and its applications, 2nd ed. (Wiley, New York).
Richardson, D.H., 1968, The exact distribution of a structural coefficient estimator, Journal of the American Statistical Association 63, 1214-1226.
Sawa, T., 1969, The exact sampling distribution of ordinary least squares and two-stage least squares estimators, Journal of the American Statistical Association 64, 923-937.
Sawa, T., 1972, Finite sample properties of the k-class estimators, Econometrica 40, 653-680.
Swamy, P.A.V.B. and J. Holmes, 1971, The use of undersized samples in the estimation of simultaneous equation systems, Econometrica 39, 455-459.


Swamy, P.A.V.B. and J.S. Mehta, 1976, Minimum average risk estimators for coefficients in linear models, Communications in Statistics A5, 803-818.
Swamy, P.A.V.B., J.S. Mehta and P.N. Rappoport, 1975, Relative efficiencies of a competitor of Hoerl and Kennard's ridge regression estimator, Manuscript (Federal Reserve Board, Washington, DC).
Zellner, A., 1971, An introduction to Bayesian inference in econometrics (Wiley, New York).
Zellner, A., 1972, Constraints often overlooked in analyses of simultaneous equation models, Econometrica 40, 849-854.
Zellner, A., 1975, Estimation of functions of population means and regression coefficients including structural coefficients: A minimum expected loss (MELO) approach, Manuscript (H.G.B. Alexander Research Foundation, University of Chicago, Chicago, IL).
Zellner, A. and W. Vandaele, 1975, Bayes-Stein estimators for k-means, regression and simultaneous equation models, in: S. Fienberg and A. Zellner, eds., Studies in Bayesian econometrics and statistics (North-Holland, Amsterdam) 627-653.