Estimation with aggregated data

Journal of Econometrics 10 (1979) 43-55. © North-Holland Publishing Company



R. W. FAREBROTHER
University of Manchester, Manchester M13 9PL, UK

Received June 1975

This paper is concerned with the problem of estimating the parameters of the standard linear model from grouped data, or more generally from aggregated data. A number of alternative solutions are suggested and compared.

1. General theory

1.1. The problem

The problem considered in this paper is that of estimating the parameters of the model

\bar{y} = \bar{X}\beta + \bar\varepsilon; \quad E\bar\varepsilon = 0, \quad E\bar\varepsilon\bar\varepsilon' = \sigma^2 I_n,    (1)

when we only have data on the model obtained by premultiplying (1) by the m x n matrix A of rank m*,

y = X\beta + \varepsilon, \quad E\varepsilon = 0, \quad E\varepsilon\varepsilon' = \sigma^2 AA',    (2)

where \bar{y} is an n x 1 matrix of observations on the dependent variable, \bar{X} is an n x k matrix of observations on the independent variables, \beta is a k x 1 matrix of unknown parameters, \bar\varepsilon is an n x 1 matrix of disturbances and \sigma^2 is an unknown positive scalar; y = A\bar{y}, X = A\bar{X} and \varepsilon = A\bar\varepsilon; X is assumed to have full column rank.

1.2. Known error variance

Suppose that W^* = A^*A^{*\prime} is known, where A^* is an m^* x n submatrix of A of rank m^*. Then, without loss of generality, we can suppose that A^* lies in the first m^* rows of A and that

A = \begin{bmatrix} I \\ P \end{bmatrix} A^*,

where P is an (m - m^*) x m^* matrix. Premultiplying (2) by the non-singular matrix

\begin{bmatrix} (I + P'P)^{-1} & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} I & P' \\ -P & I \end{bmatrix},

we obtain

\begin{bmatrix} y^* \\ 0 \end{bmatrix} = \begin{bmatrix} X^* \\ 0 \end{bmatrix} \beta + \begin{bmatrix} \varepsilon^* \\ 0 \end{bmatrix},    (3)

where J’* =A*?, X* =A*2 and E* =A*& and where X* is of full column rank. Deleting the last m - m* rows of (3) which are trivial we obtain EE*=~,

y*=X*p+&*,

EE*E*=D~ W*.

(4)

The arbitrary choice of a full row rank submatrix of A is innocuous as any other nr* x n full row rank submatrix of A takes the form QA*, where Q is non-singular, and model (2) again reduces to model (4). W* is known so we can evaluate the formulas for the best linear unbiased estimator (BLUE) of b given y* and X*, (5) (6) and CT~is estimated

unbiasedly

62 = i*’ W*-

’ i*/(m*

by - k),

(7)

where E^*=y* -x*fi,. less efficient’ than fl=(rf’g)-‘x’j, the ordinary least B.4 is generally squares estimator of /I, since j? is the BLUE of fi in model (1) after it has been premultiplied by the non-singular matrix [“,‘I whilst /?, is obtained by deleting the last n-m* observations. This is an important result as (4) may be transformed to take the form of (1) so we can deduce that the application of a sequence of aggregation matrices gives rise to progressively less efficient estimators. 1.3. Unknown

error

variance
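For readers who wish to experiment with these formulas, the following NumPy sketch (not part of the original paper; all names such as gls_known_w are illustrative) evaluates (5)-(7) on simulated data aggregated by a randomly chosen full-row-rank A^*.

    import numpy as np

    def gls_known_w(y_star, X_star, W_star):
        """Evaluate eqs. (5)-(7): BLUE of beta, its variance matrix up to sigma^2,
        and the unbiased estimate of sigma^2, when W* is known."""
        W_inv = np.linalg.inv(W_star)                        # W*^{-1}
        XtWi = X_star.T @ W_inv
        var_core = np.linalg.inv(XtWi @ X_star)              # (X*' W*^{-1} X*)^{-1}
        beta_A = var_core @ (XtWi @ y_star)                  # eq. (5)
        resid = y_star - X_star @ beta_A                     # eps-hat*
        m_star, k = X_star.shape
        sigma2_hat = (resid @ W_inv @ resid) / (m_star - k)  # eq. (7)
        return beta_A, var_core, sigma2_hat                  # var beta_A = sigma^2 * var_core, eq. (6)

    # a small simulated illustration of model (1) aggregated by a full-row-rank A*
    rng = np.random.default_rng(0)
    n, k, m_star = 200, 3, 12
    X_bar = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
    y_bar = X_bar @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)
    groups = rng.integers(0, m_star, size=n)                 # assign observations to m* groups
    A_star = np.eye(m_star)[groups].T                        # m* x n, one unit element per column
    print(gls_known_w(A_star @ y_bar, A_star @ X_bar, A_star @ A_star.T))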

Let Z’X and 2’~ be known ‘varfl=var~,

matrices

iff

r7~rl=R’A*‘(A*A*‘)-‘~*~,

iff

_?[I,-A*‘(A*A*‘)~‘A*]8=0,

iff

,? =A*‘C

for some

Thus DA is as efficient as j?‘iff r? is linearly

and let Z’X be non-singular

m* x kc. dependent

on A*‘.

then the

R.W Farebrother,

instrumental

variable

Estimation

with aggregated

45

data

estimator

b=(Z’X)_‘Z’y

(8)

is an unbiased estimator of \beta. However, if W = AA' is not known, we cannot determine its variance

var b = \sigma^2 (Z'X)^{-1} Z'WZ (X'Z)^{-1},    (9)

except in certain special cases. For Z^{*\prime} W Z^* = \sum_{i,j} w_{ij} Z_{i.}^{*\prime} Z_{j.}^* does not involve w_{ij} iff Z_{i.}^* = 0 or Z_{j.}^* = 0, where Z^* = Z(X'Z)^{-1}, and Z_{i.}^* = 0 iff Z_{i.} = 0. Thus if A may be partitioned as

A = \begin{bmatrix} A_{1.} \\ A_{2.} \\ \vdots \\ A_{h^*.} \end{bmatrix},

where A_{i.} is an m_i x n matrix of full row rank, and W_{ii} is known for i = 1, 2, ..., h \le h^*, where W_{ii} = A_{i.}A_{i.}', then, assuming that X_{i.} has full column rank, we have a set of h irreconcilable unbiased estimators^2 of \beta,

b^{(i)} = (X_{i.}' W_{ii}^{-1} X_{i.})^{-1} X_{i.}' W_{ii}^{-1} y_i, \quad i = 1, 2, ..., h,    (10)

var b^{(i)} = \sigma^2 (X_{i.}' W_{ii}^{-1} X_{i.})^{-1}, \quad i = 1, 2, ..., h,    (11)

and h irreconcilable unbiased estimators of \sigma^2,

s_i^2 = e_i' W_{ii}^{-1} e_i / (m_i - k), \quad i = 1, 2, ..., h,    (12)

where e_i = y_i - X_{i.} b^{(i)}, y_i = A_{i.}\bar{y}, X_{i.} = A_{i.}\bar{X}, and where \varepsilon_i = A_{i.}\bar\varepsilon.

^2 b^{(i)} is, of course, the BLUE of \beta given y_i and X_{i.} in the model y_i = X_{i.}\beta + \varepsilon_i, E\varepsilon_i = 0, E\varepsilon_i\varepsilon_i' = \sigma^2 W_{ii}.

To resolve this situation when^3 h^* = h = k, Haitovsky has suggested the unbiased estimator

\hat\beta_H = (X_d' V^{-1} X)^{-1} (X_d' V^{-1} y),    (13)


where \bar{X}_d = diag{\bar{X}_{.1}, \bar{X}_{.2}, ..., \bar{X}_{.k}} is formed from the columns of \bar{X} and A_d = diag{A_{1.}, A_{2.}, ..., A_{k.}}, so that

A_d A_d' = diag{W_{11}, W_{22}, ..., W_{kk}} = V,

and

A_d \bar{X}_d = diag{X_{1.1}, X_{2.2}, ..., X_{k.k}} = X_d.

^3 The assumption h^* = h is merely for notational convenience.

generally

IjH is

type

a

(8) estimator

and

has

a variance

which

is

unknown, var~H=~2(X~I/-1X)-1X~I/-1WI/-1X,(X’I/-1X,)-’.

(14)

Haitovsky did not note this fact for reasons which An alternative solution which does not suffer obtained by shortcircuiting Haitovsky’s (1973, pp. c be a vector of k constants such that Xj. is j=c1,c2,. . .,ck, then

we will discuss below.4 from this defect may be 3&35) full argument. Let of full column rank for

b^* = (b_1^{(c_1)}, b_2^{(c_2)}, ..., b_k^{(c_k)})'    (15)

is an unbiased estimator of \beta with known variance elements

var b_i^* = \sigma^2 (X_{j.}' W_{jj}^{-1} X_{j.})^{ii}, \quad j = c_i,    (16)

where the superscript ii denotes the iith element of the inverse. These estimates are, however, only obtained after extensive computation.
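To make the assembly of (15) and (16) from the per-classification fits (10)-(11) concrete, here is a minimal Python sketch (not from the paper); it assumes each classification j supplies the triple (y_j, X_j., W_jj), and all names are illustrative.

    import numpy as np

    def grouping_gls(y_i, X_i, W_ii):
        """Per-classification estimator b^(i), eq. (10), and its variance matrix
        divided by sigma^2, eq. (11)."""
        W_inv = np.linalg.inv(W_ii)
        V_i = np.linalg.inv(X_i.T @ W_inv @ X_i)   # (X_i.' W_ii^{-1} X_i.)^{-1}
        b_i = V_i @ (X_i.T @ W_inv @ y_i)          # eq. (10)
        return b_i, V_i

    def composite_estimator(data, c):
        """Composite estimator (15): element i is taken from b^(c_i); the matching
        variance element (16) is sigma^2 times the ii-th entry of the inverse in (11)."""
        k = len(c)
        b_star = np.empty(k)
        var_elements = np.empty(k)                 # to be multiplied by sigma^2
        for i, j in enumerate(c):
            b_j, V_j = grouping_gls(*data[j])      # data[j] = (y_j, X_j, W_jj)
            b_star[i] = b_j[i]
            var_elements[i] = V_j[i, i]
        return b_star, var_elements

    # usage: with two classifications available, c = [0, 0, 1] would take the first two
    # coefficients from classification 0 and the third from classification 1.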

^4 Haitovsky does not state the formulas in this form. However,

X_d' V^{-1} X = \bar{X}_d' A_d' (A_d A_d')^{-1} A \bar{X} = \bar{X}_d' H \bar{X},

where H = col{H_1, H_2, ..., H_k} and H_i = A_{i.}'(A_{i.}A_{i.}')^{-1}A_{i.}. Similarly,

X_d' V^{-1} y = \bar{X}_d' H \bar{y}.

Further,

X_d' V^{-1} A A' V^{-1} X_d = \bar{X}_d' H H' \bar{X}_d = \bar{X}_d' H^* \bar{X}_d,

where H_{ij}^* = H_i H_j = A_{i.}' W_{ii}^{-1} W_{ij} W_{jj}^{-1} A_{j.}. Thus (13) is indeed the Haitovsky (p. 34) estimator, if our matrix A_{i.} represents his matrix G_i, but we disagree with his formula for the variance.


2. Grouped data

2.1. Definitions

Let F_{i.} be an m_i x n matrix each of whose columns contains one unit element and m_i - 1 zeros; then we shall refer to it as a simple aggregation matrix. Simple aggregation matrices have the following properties:

1_{m_i}' F_{i.} = 1_n', \quad F_{i.} 1_n = f_i, \quad F_{i.} F_{i.}' = diag(f_i), \quad F_{i.} F_{j.}' = N_{ij},

where 1_p is a p x 1 matrix of ones and where f_i(j), the jth element of f_i, records the number of unit elements in the jth row of F_{i.}, and the ghth element of N_{ij} records the number of unit elements common to the gth row of F_{i.} and the hth row of F_{j.}. Let F_{i.} be a simple aggregation matrix of full row rank; then G_{i.} = (F_{i.}F_{i.}')^{-1}F_{i.} is the corresponding simple grouping matrix.
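The following NumPy fragment (an illustrative sketch, not part of the paper; names such as aggregation_matrix are assumptions) builds a simple aggregation matrix F_{i.} from a vector of group labels and checks the properties listed above.

    import numpy as np

    def aggregation_matrix(labels, m_i):
        """Simple aggregation matrix F_i.: one unit element in each column."""
        F = np.zeros((m_i, len(labels)))
        F[labels, np.arange(len(labels))] = 1.0
        return F

    labels = np.array([0, 2, 1, 0, 2, 2, 1])        # n = 7 observations in m_i = 3 groups
    F_i = aggregation_matrix(labels, 3)
    f_i = F_i @ np.ones(len(labels))                # group frequencies f_i
    assert np.allclose(np.ones(3) @ F_i, 1.0)       # 1'_{m_i} F_i. = 1'_n
    assert np.allclose(F_i @ F_i.T, np.diag(f_i))   # F_i. F_i.' = diag(f_i)
    G_i = np.linalg.inv(F_i @ F_i.T) @ F_i          # simple grouping matrix G_i.
    assert np.allclose(G_i @ labels, [0.0, 1.0, 2.0])   # G_i. returns group means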

2.2. Preliminaries

Suppose that F may be represented as k stacked simple aggregation matrices,

F = \begin{bmatrix} F_{1.} \\ F_{2.} \\ \vdots \\ F_{k.} \end{bmatrix}.

Then

FF' = N = \begin{bmatrix} N_{11} & N_{12} & \cdots & N_{1k} \\ N_{21} & N_{22} & \cdots & N_{2k} \\ \vdots & & & \vdots \\ N_{k1} & N_{k2} & \cdots & N_{kk} \end{bmatrix},

where the m_i x m_j matrix N_{ij} = F_{i.}F_{j.}' records the joint frequencies of the ith and jth simple aggregation matrices.

Suppose that F_{i.} is of full row rank for i = 1, 2, ..., k; then the corresponding matrix of stacked simple grouping matrices is given by

G = \begin{bmatrix} G_{1.} \\ G_{2.} \\ \vdots \\ G_{k.} \end{bmatrix} = M^{-1}F,

where M = diag{N_{11}, N_{22}, ..., N_{kk}}. Clearly GG' = M^{-1}NM^{-1}.
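A short sketch of the stacked objects, under the same illustrative conventions as the previous fragment (again not part of the original paper):

    import numpy as np

    def aggregation_matrix(labels, m_i):
        F = np.zeros((m_i, len(labels)))
        F[labels, np.arange(len(labels))] = 1.0
        return F

    # two classifications of the same n = 6 observations
    F1 = aggregation_matrix(np.array([0, 0, 1, 1, 2, 2]), 3)   # e.g. three income groups
    F2 = aggregation_matrix(np.array([0, 1, 0, 1, 0, 1]), 2)   # e.g. two stock groups

    F = np.vstack([F1, F2])                  # stacked simple aggregation matrices
    N = F @ F.T                              # joint frequency matrix with blocks N_ij
    M = np.block([[F1 @ F1.T, np.zeros((3, 2))],
                  [np.zeros((2, 3)), F2 @ F2.T]])    # M = diag{N_11, N_22}
    G = np.linalg.inv(M) @ F                 # stacked simple grouping matrices
    assert np.allclose(G @ G.T, np.linalg.inv(M) @ N @ np.linalg.inv(M))   # GG' = M^{-1} N M^{-1}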

2.3. Haitovsky's method

Setting A = G in (13) we have

\hat\beta_H = (\tilde{X}_d' M \tilde{X})^{-1} \tilde{X}_d' M \tilde{y},    (17)

or

\hat\beta_H = (X_d^{+\prime} X^+)^{-1} X_d^{+\prime} y^+,    (18)

where \tilde{y} = G\bar{y}, \tilde{X} = G\bar{X}, \tilde{X}_d = diag{\tilde{X}_{1.1}, \tilde{X}_{2.2}, ..., \tilde{X}_{k.k}}, and where X^+ = F_d'\tilde{X}, y^+ = F_d'\tilde{y}, X_d^+ = F_d'\tilde{X}_d, with F_d = diag{F_{1.}, F_{2.}, ..., F_{k.}}.

Eq. (18) is the simple form used by Haitovsky (pp. 30-32). The ith block of X_d^{+\prime} y^+ is X_i^{+\prime} y_i^+, so the variance of \hat\beta_H involves terms of the form

E X_i^{+\prime} \varepsilon_i^+ \varepsilon_j^{+\prime} X_j^+,

where y_i^+ = F_{i.}'\tilde{y}_i, X_i^+ = F_{i.}'\tilde{X}_{i.}, \varepsilon_i^+ = F_{i.}'\tilde\varepsilon_i, and where \tilde{y}_i = G_{i.}\bar{y}, \tilde{X}_{i.} = G_{i.}\bar{X}, \tilde\varepsilon_i = G_{i.}\bar\varepsilon. Now

E X_i^{+\prime} \varepsilon_i^+ \varepsilon_j^{+\prime} X_j^+ = \sigma^2 X_i^{+\prime} X_j^+,

since X_i^+ = H_i\bar{X} and \varepsilon_i^+ = H_i\bar\varepsilon, where H_i = F_{i.}'(F_{i.}F_{i.}')^{-1}F_{i.} and where H_iH_i = H_i;


there is no need to adopt Haitovsky's assumption that E\bar\varepsilon_i\bar\varepsilon_j' = \sigma^2 I_n, as the conventional assumption E\bar\varepsilon\bar\varepsilon' = \sigma^2 I_n suffices. Haitovsky's major error was to treat X_i^+ as if it were known, whereas X_i^+ = F_{i.}'\tilde{X}_{i.} and only \tilde{X}_{i.} is known. This error probably derives from the valid assumption [Haitovsky (1973, p. 6)] that a single simple aggregation matrix may be written as

F_{i.} = diag{1_{f_i(1)}', 1_{f_i(2)}', ..., 1_{f_i(m_i)}'}

without loss of generality. If, however, this assumption is made for each F_{i.}, then a set of N_{ij}'s results which may be correctly reconstructed from their row and column totals by the 'north west corner rule',

\hat{N}_{ij} = diag{1_{f_i(1)}', 1_{f_i(2)}', ..., 1_{f_i(m_i)}'} diag{1_{f_j(1)}, 1_{f_j(2)}, ..., 1_{f_j(m_j)}}.
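A short Python sketch of the rule (an illustration, not the paper's code; the function name and interface are assumptions): fill the table cell by cell, starting in the north west corner, allocating at each step as much as the current row and column totals allow.

    import numpy as np

    def north_west_corner(row_totals, col_totals):
        """Fill a frequency table from its marginals by the north west corner rule."""
        r, c = np.array(row_totals, float), np.array(col_totals, float)
        table = np.zeros((len(r), len(c)))
        i = j = 0
        while i < len(r) and j < len(c):
            x = min(r[i], c[j])          # allocate as much as possible to cell (i, j)
            table[i, j] = x
            r[i] -= x
            c[j] -= x
            if r[i] == 0 and i < len(r) - 1:
                i += 1                   # row i exhausted, move down
            elif c[j] == 0:
                j += 1                   # column j exhausted, move right
            else:
                i += 1
        return table

    # the marginal totals used in tables 3a-3c; this reproduces the layout of table 3c
    rows = [86, 89, 277, 259, 273, 134, 100]
    cols = [195, 210, 227, 151, 118, 118, 119, 80]
    print(north_west_corner(rows, cols).astype(int))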

2.4. A further suggestion

Noting that N_{ij} 1_{m_j} = F_{i.}F_{j.}' 1_{m_j} = F_{i.} 1_n = f_i and, similarly, 1_{m_i}' N_{ij} = f_j', if the elements were randomly allocated to cells one would expect N_{ij} to be close to \tilde{N}_{ij} = (1/n) f_i f_j'. It therefore seems reasonable to use

\tilde{N} = \begin{bmatrix} N_{11} & \tilde{N}_{12} & \cdots & \tilde{N}_{1k} \\ \tilde{N}_{21} & N_{22} & \cdots & \tilde{N}_{2k} \\ \vdots & & & \vdots \\ \tilde{N}_{k1} & \tilde{N}_{k2} & \cdots & N_{kk} \end{bmatrix}

as a surrogate for N.

To obtain a full row rank submatrix of G we must delete at least one row from all but one of its submatrices; this generally suffices to make \tilde{N}^* non-singular^5 so that we can evaluate formulas (5), (6) and (7). However, the estimator is no longer independent of the choice of the full row rank submatrix of G, so that different investigators making different choices of G^* would obtain different estimates of \beta. This impasse may be resolved by noting that

A^{*\prime}(A^*A^{*\prime})^{-1}A^* = A'(AA')^+A,

where (AA')^+ denotes the Moore-Penrose generalised inverse of AA'. For then eqs. (5), (6) and (7) become

\hat\beta_G = [X'W^+X]^{-1} X'W^+ y,    (19)

var \hat\beta_G = \sigma^2 [X'W^+X]^{-1},    (20)

\hat\sigma^2 = \hat\varepsilon' W^+ \hat\varepsilon / (m^* - k),    (21)

where \hat\varepsilon = y - X\hat\beta_G. And different investigators can agree on a single approximation \tilde{W} to W.

^5 Since \tilde{N}_{ij} 1_{m_j} = (1/n) f_i f_j' 1_{m_j} = f_i for every j, there are k - 1 linearly independent linear restrictions on \tilde{N} of the form \tilde{N}_{.i} 1_{m_i} - \tilde{N}_{.j} 1_{m_j} = 0.
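The following NumPy sketch (an illustration under assumed names, not the paper's code) builds the surrogate \tilde{N} from the marginal frequencies and evaluates (19)-(21); np.linalg.pinv supplies the Moore-Penrose inverse W^+.

    import numpy as np

    def surrogate_N(N_diag_blocks, freqs, n):
        """Surrogate for N: the true N_ii on the diagonal, the expected
        (1/n) f_i f_j' in every off-diagonal block."""
        k = len(freqs)
        return np.block([[N_diag_blocks[i] if i == j else np.outer(freqs[i], freqs[j]) / n
                          for j in range(k)] for i in range(k)])

    def pinv_gls(y, X, W, m_star):
        """Eqs. (19)-(21): estimator, variance matrix up to sigma^2, and sigma^2 estimate,
        with W^+ the Moore-Penrose inverse of W."""
        W_plus = np.linalg.pinv(W)
        core = np.linalg.inv(X.T @ W_plus @ X)                       # [X' W^+ X]^{-1}
        beta_G = core @ (X.T @ W_plus @ y)                           # eq. (19)
        resid = y - X @ beta_G
        sigma2 = (resid @ W_plus @ resid) / (m_star - X.shape[1])    # eq. (21)
        return beta_G, core, sigma2                                  # var beta_G = sigma^2 * core, eq. (20)

    # with grouped data (section 2), W = GG' = M^{-1} N M^{-1}; replacing N by the
    # surrogate above gives the single approximation on which investigators can agree.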

2.5. Illustrative example^6

To illustrate the theory we consider Houthakker's model

P_i = \beta_0 + \beta_1 Y_i + \beta_2 S_i + \varepsilon_i, \quad i = 1, 2, ..., 1218,

where P_i is the net purchases of automobiles by the ith household, Y_i is its income and S_i is the value of its automobile stock at the beginning of the year. The \varepsilon_i's are disturbances which are assumed to be mutually uncorrelated and identically normally distributed with zero means. The original observations were crossclassified by income into seven groups and by stock into eight groups, the intercept classification being trivial. The 56 mean values for each variable and their corresponding frequencies are listed in tables A.1-A.4 of Haitovsky (pp. 77-80). Performing generalised least squares on the 56 observations we obtain the first row of table 1.

^6 I am indebted to J. Taylor for performing the calculations.

Suppose now that the complete cross-classification is not available, but only the marginal means and the joint frequencies. Then deleting the last row of the income grouping matrix and the intercept grouping matrix, we obtain the Houthakker estimates given in the second row of table 1. If, further, the joint frequencies are not known, we can use either the income classification or the stock classification and eq. (10) to estimate the regression. These results are given in the third and fourth rows of table 1.


Table 1
Summary of the simple regressions and Haitovsky's regression.^a

Model                            Intercept     Y-coefficient   S-coefficient   \hat\sigma^2
Complete cross-classification    17.98512      0.72916         -0.17236        4273.067
                                 (5.85771)     (0.12567)       (0.03370)
Houthakker                       18.07354      0.72637         -0.17186        4285.348
                                 (5.86896)     (0.12594)       (0.03384)
Y-table                          10.86600      0.55054          0.03815        9027.315
                                 (34.39248)    (0.84097)       (0.97711)
S-table                          73.74625      -0.65330        -0.09312        1348.399
                                 (30.80949)    (0.76224)       (0.04720)
Haitovsky                        18.03350      0.72713         -0.17178        4335.491
                                 [6.60652]     [0.10335]       [0.02820]

^a With the exception of the Haitovsky estimates, the numbers in parentheses are the standard errors of the estimated parameters above them. The numbers in brackets below the Haitovsky estimates are the north west corner rule approximations to the standard errors.

Before attempting other estimates in this situation it is interesting to examine the standard deviations of the estimates we have already obtained. As will be seen from table 2, the standard deviations of the Houthakker estimates are larger than those of the corresponding complete cross-classification estimates, and the standard deviations of the single-classification estimates are larger still. These are to be compared with the standard deviations implicit in table 3.2 of Haitovsky (p. 18), which do not follow this ordering. Indeed our table 2 shows that the standard errors of the estimates obtained directly from the original data are also stated incorrectly. This error has previously been noted by Johnston (1972, p. 236).

Table 2
The standard deviations of the simple regressions.^a

Model                            Intercept     Y-coefficient   S-coefficient
Original                         (0.10080)     (0.001929)      (0.000507)
Complete cross-classification    0.089610      0.0019225       0.00051554
Houthakker                       0.089654      0.0019238       0.00051692
Y-table                          0.361980      0.0088512       0.01028407
S-table                          0.839026      0.0207578       0.00128538

^a The factor \sigma is understood throughout the table. The first row has been computed from Haitovsky's table 3.2.


The remaining estimates are based on approximations to the unknown joint frequency table, which is given in the body of table 3a. Our suggested approximation to this table, given in table 3b, reconstructs the expected joint frequencies from the marginal frequencies on the assumption that elements are randomly allocated to cells. Haitovsky implicitly^7 uses the north west corner rule, which constructs table 3c. The Haitovsky estimates, obtained by applying eq. (13) to the data in deviations from means, and the NWCR approximations to their standard errors^8 are given in the last line of table 1. It is apparent from table 4 that the NWCR approximations to the standard deviations of the slope coefficients are gross underestimates.

Table 3a
Observed frequency table.

 21   18   10   11    6    6    4    4 |   86
 26   19   17    7    7    3    8    2 |   89
 50   55   58   36   23   21   22   12 |  277
 28   56   60   29   26   19   25   16 |  259
 38   37   46   33   34   31   26   22 |  273
 12   18   24   18   15   17   20   10 |  134
 14    7   12   17    7   15   14   14 |  100
----------------------------------------------
195  210  227  151  118  118  119   80 | 1218

Table 3b
Expected frequency table.

 13.8   14.8   16.0   10.7    8.3    8.3    8.4    5.7 |   86
 14.3   15.3   16.6   11.0    8.6    8.6    8.7    5.8 |   89
 44.4   47.8   51.6   34.3   26.8   26.8   27.1   18.2 |  277
 41.5   44.7   48.3   32.1   25.1   25.1   25.3   17.0 |  259
 43.7   47.1   50.9   33.8   26.5   26.5   26.7   17.9 |  273
 21.5   23.1   25.0   16.6   13.0   13.0   13.1    8.8 |  134
 16.0   17.2   18.6   12.4    9.7    9.7    9.8    6.6 |  100
--------------------------------------------------------------
  195    210    227    151    118    118    119     80 | 1218
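The body of table 3b is simply the outer product of the marginal totals divided by the sample size; the following short NumPy check (an illustration, not part of the paper) reproduces it up to rounding of the last digit.

    import numpy as np

    row_totals = np.array([86, 89, 277, 259, 273, 134, 100])        # income marginals
    col_totals = np.array([195, 210, 227, 151, 118, 118, 119, 80])  # stock marginals
    expected = np.outer(row_totals, col_totals) / 1218.0            # (1/n) f_i f_j'
    print(np.round(expected, 1))    # compare with the body of table 3b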

Table 3c
North west corner rule frequency table.

 86    0    0    0    0    0    0    0 |   86
 89    0    0    0    0    0    0    0 |   89
 20  210   47    0    0    0    0    0 |  277
  0    0  180   79    0    0    0    0 |  259
  0    0    0   72  118   83    0    0 |  273
  0    0    0    0    0   35   99    0 |  134
  0    0    0    0    0    0   20   80 |  100
----------------------------------------------
195  210  227  151  118  118  119   80 | 1218


If we delete the same observations as before, the first method of section 2.4 produces the estimates of tables 5 and 6. The generalised inverse method, on the other hand, produces the estimates of tables 7 and 8.

Table 4
The estimated standard deviations of Haitovsky's estimator.^a

Frequency matrix   Intercept    Y-coefficient   S-coefficient
True               0.089661     0.0019240       0.00051702
Expected           0.086284     0.0020160       0.00054025
NWCR               0.100335     0.0015696       0.00042830

^a The factor \sigma is understood throughout the table.

Table 5
The 'deletion' estimates and their estimated standard errors.^a

Frequency matrix   Intercept    Y-coefficient   S-coefficient   \hat\sigma^2
True               18.07354     0.72637         -0.17186        4285.348
                   (5.86896)    (0.12594)       (0.03384)
Expected           18.50071     0.71338         -0.16976        4330.861
                   [5.65916]    [0.13202]       [0.03548]
NWCR               12.50580     0.83512         -0.16158        9682.899
                   [8.42355]    [0.12632]       [0.03478]

^a The numbers in parentheses and brackets are the estimated standard errors of the estimated parameters above them.

^7 Our suggestion that Haitovsky uses the north west corner rule is confirmed by his table A.8 (pp. 86-87).

^8 The approximate standard error of the intercept may be obtained from Haitovsky's formulas (pp. 31-32) by using his data (p. 36) and \sum x'x = 971898. \sigma^2 is estimated biasedly by e_H' M^{-1} e_H / (\rho(G) - k), where e_H = \tilde{y} - \tilde{X}\hat\beta_H (p. 32).


Table 6
The true and estimated standard deviations of the 'deletion' estimates.^a

Frequency matrix   Intercept      Y-coefficient    S-coefficient
True               0.089654       0.0019238        0.00051692
Expected           0.089936       0.0019339        0.00051778
                   [0.085993]     [0.0020062]      [0.00053907]
NWCR               0.185066       0.0032171        0.00085452
                   [0.085604]     [0.0012837]      [0.00035344]

^a The factor \sigma is assumed throughout the table. The numbers in brackets are the estimates of the true values above them.

Table 7
The 'generalised inverse' estimates and their estimated standard errors.^a

Frequency matrix   Intercept    Y-coefficient   S-coefficient   \hat\sigma^2
True               18.07458     0.72634         -0.17186        4285.622
                   (5.86906)    (0.12594)       (0.03384)
Expected           18.50171     0.71335         -0.16976        4331.097
                   [5.65922]    [0.13202]       [0.03548]
NWCR               12.50545     0.83511         -0.16157        9684.573
                   [8.42433]    [0.12633]       [0.03478]

^a The numbers in parentheses and brackets are the estimated standard errors of the estimated parameters above them.

Table 8
The true and estimated standard deviations of the 'generalised inverse' estimates.^a

Frequency matrix   Intercept      Y-coefficient    S-coefficient
True               0.089652       0.0019238        0.00051692
Expected           0.089935       0.0019339        0.00051778
                   [0.085992]     [0.0020061]      [0.00053907]
NWCR               0.185062       0.0032170        0.00085451
                   [0.085604]     [0.0012837]      [0.00035345]

^a The factor \sigma is assumed throughout the table. The numbers in brackets are the estimates of the true values above them.


3. Conclusion

Despite the fact that the conventional \chi^2 test statistic is 103.724, it is clear that the 'expected' frequency table is a close approximation to the observed frequency table. We therefore suggest the use of eqs. (19), (20) and (21) with this approximation. The results then obtained are still only second best to those obtained from eqs. (5), (6) and (7), which would be available if the compilers of aggregate data series were to oblige practitioners with the joint frequencies of their tabulations.

References

Haitovsky, Y., 1973, Regression estimation from grouped observations, Griffin's Statistical Monographs and Courses no. 33 (Charles Griffin, London). The section of the monograph which concerns us in this paper is based on an earlier paper by Haitovsky published in 1966 in the Journal of the American Statistical Association 61, 720-728.

Johnston, J., 1972, Econometric methods, 2nd ed. (McGraw-Hill, New York).