Journal of Accounting and Economics 6 (1984) 205-217. North-Holland

TESTING FOR INCREMENTAL INFORMATION CONTENT IN THE PRESENCE OF COLLINEARITY*

Andrew A. CHRISTIE
University of Southern California, Los Angeles, CA 90089-1421, USA

Michael D. KENNELLEY, J. William KING and Thomas F. SCHAEFER
Florida State University, Tallahassee, FL 32306, USA

Received June 1984, final version received October 1984

A number of recent research papers use two-stage procedures in lieu of a single multiple regression, in some cases purportedly as a solution to collinearity among independent variables. We demonstrate that, since collinearity is inherently a data problem rather than a statistical problem, no partitions of dependent or independent variables, orthogonal or otherwise, can provide insights into the relative influence of collinear variables. For the class of linear unbiased estimators this follows directly from the Gauss-Markov Theorem, but we demonstrate some of the results in detail as an aid to interpreting particular papers.

1. Introduction

A number of recent research papers use two-stage regression procedures in lieu of a single multiple regression to address the incremental information content of accounting measures beyond historical cost income. Examples are Beaver, Griffin and Landsman (1982), Evans and Patton (1982), Schaefer (1984) and Beaver and Landsman (1983). Patell and Kaplan (1977) use a portfolio procedure to address the incremental content of cash flow data. Typically, the incremental measures examined are highly collinear with the historical cost income number. One consequence of collinearity is that the precision of estimation declines and it becomes extremely difficult to untangle the relative influences of the various independent variables. Also, coefficient estimates become highly sample-dependent and point estimates may vary greatly with the addition or deletion of a few observations. Some authors have viewed the two-stage procedure as a way to overcome the effects of collinearity. For example, Beaver and Landsman (1983) claim that: '(The two-stage regression results) are not subject to this collinearity problem.' (p. 66, parenthesis added)

*We are indebted to Bob Kaplan, Dennis Sheehan, Ross Watts and Jerry Zimmerman for helpful discussions and comments.


Lev and Ohlson (1982), in reviewing the literature, state: 'This (two-stage regression) procedure permits the use of two completely independent variables in explaining returns. On the surface at least, the multicollinearity problem is circumvented by this procedure.' (p. 265, parentheses added)

A related example is provided by Bell (1983) who states that: 'The actual industry index used was made orthogonal to the market index to avoid problems of multicollinearity.' (p. 5) Collins, Rozeff and Dhaliwal (1981) use a similar procedure to Bell.

The purpose of this paper is to set out the relation between the two-stage and single-stage procedures and, in particular, to demonstrate that, since collinearity is inherently a data problem rather than a statistical one, no partitions of independent or dependent variables, orthogonal or otherwise, can provide insights into the relative influence of collinear variables. This point follows directly from the Gauss-Markov Theorem, which states that, within the class of linear unbiased estimators, least squares estimators are minimum variance (best) [see Theil (1971, p. 119)].¹ It is still useful, however, to examine the correspondence between two-stage and multiple regression procedures as an aid in interpreting results in some of the papers. For expositional reasons most of the remaining discussion focuses on the Beaver, Griffin and Landsman (BGL) paper (section 2), although a brief discussion of other related approaches is also provided (section 3). Section 4 discusses some recently developed heuristics that aid in determining whether coefficients are insignificant because collinearity has led to large standard errors or because the associated variables are simply irrelevant.

¹ Theil (1971, p. 121) goes on to observe that: 'an obvious question is what happens when we drop either the linearity or the unbiasedness restriction... (T)he LS (least squares) coefficient vector retains its optimum property... when we impose only unbiasedness, not linearity, provided, however, that the random variation is normal. If only linearity is imposed, not unbiasedness, no useful result emerges... (T)he resulting "optimal estimator" depends on unknown parameters.'

2. The Beaver, Griffin, Landsman procedure

BGL address two inter-related questions: do replacement cost earnings provide incremental information about future cash flows and/or appropriate discount rates that is not contained in historical cost earnings, and, conditional on replacement earnings, does historical cost have any marginal explanatory power?


'... the various earnings variables are not mutually exclusive explanatory variables. Hence the level of explanatory power of more than one variable can be compared with the explanatory power provided by knowledge of only one of the variables... To examine this issue, we conduct a two-stage regression analysis.' (p. 27, emphasis in original)

Using their two-stage cross-sectional regression procedure, BGL conclude that, independent of their choice of the return (dependent) variable and independent of their choice of replacement cost variable, replacement cost earnings provide no incremental information over and above historical cost earnings. In contrast, historical cost earnings provide information incremental to replacement cost. This finding is consistent with replacement cost earnings being equal to historical cost earnings plus noise. The two-stage procedure, however, does not provide any additional insights to those provided by a single multiple regression, since these dual questions are exactly the questions that a multiple regression is designed to answer.

The single multiple regression is now compared with the two-stage regressions conducted by BGL. For compactness, all variables are written as deviations from sample means, which results in suppression of the intercepts.

First-stage:
$$x_R = \alpha_1 x_H + z_R, \tag{1}$$
$$x_H = \alpha_2 x_R + z_H.$$ $$\tag{2}$$

Second-stage:
$$y = \beta_1 x_H + \gamma_2 z_R + \varepsilon, \tag{3}$$
$$y = \gamma_1 z_H + \beta_2 x_R + \varepsilon, \tag{4}$$

where
$x_H$ = historical cost earnings variable,
$x_R$ = replacement cost earnings variable,
$y$ = rate of return on equity,
$\varepsilon$ = error term,
$z_R, z_H$ = residuals from the first-stage regressions.

Further consider the multiple and bivariate regressions.

Multiple:
$$y = \gamma_1 x_H + \gamma_2 x_R + \varepsilon. \tag{5}$$

Bivariate:
$$y = \beta_1 x_H + u, \tag{6}$$
$$y = \beta_2 x_R + v. \tag{7}$$


The purpose of the first-stage regressions (1) and (2) is to produce variables $z_R$ and $z_H$ which are orthogonal to $x_H$ and $x_R$, respectively. The second-stage regressions therefore contain two orthogonal independent variables: $x_H$ and $z_R$ in (3), and $x_R$ and $z_H$ in (4). It is shown in the appendix that the following correspondences hold among the coefficients, t-statistics and error terms of the equations in (1) through (7).²

(1) The coefficient $\beta_1$ on $x_H$ is identical in the second-stage regression (3) and the bivariate regression (6). Similarly, $\beta_2$ in (4) equals $\beta_2$ in (7).
(2) The coefficient $\gamma_1$ on the variable $z_H$ in the second-stage regression (4) is identical to the coefficient on $x_H$ in the multiple regression (5). Correspondingly, $\gamma_2$ in (3) and $\gamma_2$ in (5) are equal.
(3) The error terms ($\varepsilon$) in (3), (4) and (5) are identical.
(4) The t-statistics on identical coefficients in the multiple and second-stage regressions are identical.

That is, the coefficients and t-statistics from the multiple regression are equal to the coefficients and t-statistics on the respective orthogonal variables in the BGL second-stage regressions. Effectively, the multiple regression procedure orthogonalizes each explanatory variable against all others in the regression, so that the estimated coefficients represent only the incremental effects of each explanatory variable on the dependent variable. Any common explanatory power contained in the independent variables is therefore not reflected in any of the coefficients. The BGL two-stage approach assigns all of the common explanatory power to one of the regression coefficients. Evans and Patton (1982) and Chow (1982) follow similar procedures in the context of Probit and Logit models, respectively.

Since it is possible to elicit the same information from either the single multiple regression or the two-stage procedure, the advantages and disadvantages of each approach are described. The primary cost of the two-stage procedure is the potential for errors of inference by any reader without a knowledge of the relations between the single-step and two-step procedures. Largely this potential follows from the inclusion of the raw variables $x_H$ and $x_R$ in the second-stage regressions. To illustrate, it can be shown that
$$\beta_1 = \gamma_1 + \alpha_1\gamma_2, \tag{8}$$
$$\beta_2 = \gamma_2 + \alpha_2\gamma_1. \tag{9}$$

² Except where explicitly noted, it is assumed throughout this paper that the multiple regression equation (5) is well specified, that is, all variables are measured without error and there are no correlated omitted variables. This assumption obviates the need to make careful distinctions between coefficients and estimates of coefficients, since statements about one imply parallel statements about the other. Relaxing this assumption at most clutters the notation.
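As a concrete illustration, the following minimal Python sketch (ours, not part of the original paper; the data-generating process and every name in it are assumptions chosen for the example) verifies correspondences (1) through (4) on simulated data.

```python
# Illustrative check of correspondences (1)-(4): simulate collinear earnings
# variables, then compare the two-stage procedure with the single multiple
# regression.  All names and the data-generating process are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 500
x_h = rng.normal(size=n)                      # historical cost earnings
x_r = x_h + 0.4 * rng.normal(size=n)          # replacement cost = historical + noise
y = 1.0 * x_h + 0.2 * x_r + rng.normal(size=n)

# Deviations from sample means, so intercepts are suppressed as in the text.
x_h, x_r, y = x_h - x_h.mean(), x_r - x_r.mean(), y - y.mean()

def ols(y, X):
    """OLS without intercept: coefficients, t-statistics, residuals."""
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    s2 = e @ e / (len(y) - X.shape[1])
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    return b, b / se, e

g, t_g, e5 = ols(y, np.column_stack([x_h, x_r]))   # multiple regression (5)
a1, _, z_r = ols(x_r, x_h[:, None])                # first stage (1): z_r orthogonal to x_h
b, t_b, e3 = ols(y, np.column_stack([x_h, z_r]))   # second stage (3)
b1, _, _ = ols(y, x_h[:, None])                    # bivariate regression (6)

assert np.allclose(b[0], b1[0])      # result (1): raw-variable coefficient = bivariate slope
assert np.allclose(b[1], g[1])       # result (2): orthogonal-variable coefficient = gamma_2
assert np.allclose(e3, e5)           # result (3): identical error terms
assert np.allclose(t_b[1], t_g[1])   # result (4): identical t-statistics
```

Every number the two-stage output adds relative to the single regression is either redundant or a bivariate slope, which is the point of this section.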


In general it is difficult to say anything useful about $\beta_1$ and $\beta_2$ except that they are respectively equal to the coefficients in the regressions of $y$ on $x_H$ and $y$ on $x_R$, i.e., (6) and (7). However, if $\gamma_2 = 0$, as BGL construe their evidence, then $\beta_1 = \gamma_1$ [from (8)] and, from (9),
$$\beta_2 = \alpha_2\gamma_1 = \alpha_2\beta_1. \tag{10}$$

Therefore, $\beta_1$ is completely determined by the relation between $y$ and $x_H$, and $\beta_2$ is partially determined by the relation between $y$ and $x_H$, notwithstanding that $\beta_2$ is the coefficient on the replacement cost variable. Any statements about $\beta_2$ that do more than reflect this dependence are incorrect. Expanding on this idea, if replacement earnings equals historical cost earnings plus noise, then $\beta_2$, the coefficient on replacement cost earnings in the second-stage and bivariate regressions, equals $\beta_1\sigma_H^2/\sigma_R^2$ [since the noise is uncorrelated with $x_H$ and $y$, $\beta_2 = \mathrm{cov}(y, x_R)/\sigma_R^2 = \mathrm{cov}(y, x_H)/\sigma_R^2 = \beta_1\sigma_H^2/\sigma_R^2$]. Therefore the replacement cost coefficient is just a downward-biased version of the historical cost coefficient, where the bias depends on the variance of the noise term in the replacement earnings measure. In other words, if (6) is the 'true' model, then $x_R$ is $x_H$ measured with error and (7) represents the classic bivariate errors-in-variables situation.

A second potential cost is that of computational error. BGL only discuss the coefficients ($\gamma_1$ and $\gamma_2$) that equal those in the single multiple regression (5). Therefore, the BGL two-step procedure causes them to run four regressions when in fact a single multiple regression would suffice. In any empirical research, the more complex the procedures, the greater the probability of computational errors and the greater the incidence of numerical problems such as round-off errors.

Third, as we illustrate in the introduction, there is a potential for readers to misunderstand the properties of the two-stage procedure. The foregoing analysis clearly shows that the two-stage procedure does not mitigate or circumvent collinearity and so does not lead to insights beyond those provided by a single multiple regression. Presumably BGL understand the relation between the two-stage and single-stage approaches since they essentially ignore the redundant coefficients, but they increase the probability that others will misunderstand the relation by stating that their study 'extends the multiple signal studies... by the use of a two-step regression model to deal with the collinearity issue.' (p. 16, emphasis added)³

³ In their footnote eight, BGL state that 'the two-stage procedure is equivalent to an F-test on adding PRE (i.e., $x_R$) to the regression of return on historical cost earnings (HC) (i.e., $x_H$)'. This statement follows from the equivalence of the t-statistic on the orthogonal variable $z_R$ in the second-stage regression (3) and the t-statistic on the raw variable $x_R$ in the multiple regression (5). The square of the t-statistic is the F-test on the incremental explanatory power from the addition of that variable. It is more usual to obtain that F-statistic by comparing the regressions (5) and (6), although the outcome is the same.


3. Other related procedures

It is also instructive to briefly consider other ways of partitioning variables. Compare the regression
$$y = \gamma_1 x_H + \gamma_2 x_R + \varepsilon, \tag{5}$$
with
$$y = \beta_1 x_H + u, \tag{6}$$
$$u = \beta_2 x_R + \omega, \tag{11}$$
where the variables are as defined previously and the error term $u$ is the part of $y$ that is orthogonal to $x_H$. Also consider eq. (1), $x_R = \alpha_1 x_H + z_R$. Since the influence of $x_H$ has been removed from the dependent variable $u$ in (11), it might mistakenly be inferred that the two-step procedure (6) and (11) overcomes collinearity between $x_H$ and $x_R$. However, it can be shown that:

(a) $\beta_1 = \gamma_1 + \alpha_1\gamma_2$,
(b) $\beta_2 = \gamma_2(1 - \rho_{HR}^2)$,
(c) $|t(\hat\gamma_2)| \geq |t(\hat\beta_2)|$, with equality if and only if the correlation between $x_H$ and $x_R$ ($\rho_{HR}$) is zero.

The condition $\rho_{HR} = 0$ also guarantees $\beta_1 = \gamma_1$ and $\beta_2 = \gamma_2$. It follows that, except in this special case, the two-stage procedure in (6) and (11) does not mitigate collinearity and generates biased estimates of the coefficients of interest ($\gamma_1$ and $\gamma_2$) and their respective test statistics. Intuitively, the bias in $\beta_1$ follows from the assumption that the multiple regression (5) is the 'true' model. The bivariate regression (6) is consequently subject to a correlated omitted variable ($x_R$), which leads to a biased, inconsistent estimator of the 'true' parameter $\gamma_1$.

Kaplan and Wilson (1984) point out that other attempts to circumvent the effects of collinearity that do not involve orthogonalizing variables are also doomed to failure. They use examples where the 'true' model is
$$y = \gamma_1 x_1 + \gamma_2 x_2 + \varepsilon, \tag{12}$$
and where accounting theory allows us to write
$$x_1 = x_2 + z. \tag{13}$$

For example, if $x_1$ is historical cost income, $x_2$ could be replacement cost income and $z$ would equal realized holding gains. Alternatively, $x_2$ could be cash flow from operations; $z$ would then be accounting accruals. The components $x_2$ and $z$ may be substantially less correlated than $x_1$ and $x_2$. Using (13) to rewrite (12) gives
$$y = (\gamma_1 + \gamma_2)x_2 + \gamma_1 z + \varepsilon = \beta_2 x_2 + \beta_1 z + \varepsilon. \tag{14}$$

Kaplan and Wilson prove that testing whether $\beta_1 = \beta_2$ in (14) is numerically equivalent to testing whether $\gamma_2 = 0$ in (12). Intuitively, this follows from the identity of the error terms ($\varepsilon$) in (12) and (14) and recognition that $\beta_1 = \beta_2$ iff $\gamma_2 = 0$.
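The Kaplan and Wilson equivalence is easy to check numerically. The sketch below (ours; simulated data and assumed names, not the Kaplan-Wilson data) shows that the t-statistic for H0: $\beta_1 = \beta_2$ in (14) coincides, up to sign, with the t-statistic on $\gamma_2$ in (12).

```python
# Illustrative check that testing beta_1 = beta_2 in (14) is numerically
# equivalent to testing gamma_2 = 0 in (12).  Simulated data; assumed names.
import numpy as np

rng = np.random.default_rng(1)
n = 500
x2 = rng.normal(size=n)                  # e.g., replacement cost income
z = 0.3 * rng.normal(size=n)             # e.g., realized holding gains
x1 = x2 + z                              # the accounting identity (13)
y = 0.8 * x1 + 0.1 * x2 + rng.normal(size=n)
x1, x2, z, y = (v - v.mean() for v in (x1, x2, z, y))

def fit(y, X):
    """OLS without intercept: coefficients and coefficient covariance matrix."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    e = y - X @ b
    return b, (e @ e / (len(y) - X.shape[1])) * XtX_inv

# (12): t-statistic on gamma_2.
g, Vg = fit(y, np.column_stack([x1, x2]))
t_gamma2 = g[1] / np.sqrt(Vg[1, 1])

# (14): y = beta_2*x2 + beta_1*z + e; t-statistic for H0: beta_1 = beta_2.
b, Vb = fit(y, np.column_stack([x2, z]))          # b = (beta_2, beta_1)
t_diff = (b[1] - b[0]) / np.sqrt(Vb[0, 0] + Vb[1, 1] - 2 * Vb[0, 1])

assert np.isclose(abs(t_diff), abs(t_gamma2))     # numerically equivalent tests
```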

A recent innovation of Davidson and MacKinnon (1981), that closely parallels the BGL paper, could also be interpreted as a solution to collinearity. Davidson and MacKinnon consider the problem of testing the non-nested hypotheses
$$H_0: \quad y = \beta_1 x_H + u, \tag{6}$$
$$H_1: \quad y = \beta_2 x_R + v. \tag{7}$$
They propose conducting the regression
$$y = (1 - \alpha)\beta_1 x_H + \alpha(\hat\beta_2 x_R) + \varepsilon,$$
and testing whether $\hat\alpha$ is significantly different from zero; $\hat\beta_2$ is obtained from (7) by regressing $y$ on $x_R$. Compare this procedure with the multiple regression
$$y = \gamma_1 x_H + \gamma_2 x_R + \varepsilon. \tag{5}$$
In this linear case it can be shown that
$$t(\hat\alpha) = t(\hat\gamma_2). \tag{15}$$
Further, reversing the roles of $H_0$ and $H_1$ in (6) and (7), it can be shown that
$$t(\hat\alpha) = t(\hat\gamma_1). \tag{15'}$$

Therefore, the Davidson and MacKinnon procedure, which requires four regressions, is once again equivalent to a single multiple regression. Note that Davidson and MacKinnon do not claim that their procedure is a solution to collinearity and make no reference to the term in their paper. Miller and Upton (1984) make recent use of the Davidson and MacKinnon approach. They do not claim or imply that their use is related to collinearity, but the above discussion illustrates that the single multiple regression is a more direct approach than the one they use.
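To illustrate (15), the following brief sketch (ours; simulated data and assumed names) runs the Davidson and MacKinnon regression in this linear setting and compares it with the multiple regression (5).

```python
# Illustrative check of (15): in the linear case the t-statistic on alpha-hat
# in the Davidson-MacKinnon regression equals the t-statistic on gamma_2-hat
# in the multiple regression (5).  Simulated data; assumed names.
import numpy as np

rng = np.random.default_rng(2)
n = 500
x_h = rng.normal(size=n)
x_r = x_h + 0.4 * rng.normal(size=n)
y = 1.0 * x_h + 0.2 * x_r + rng.normal(size=n)
x_h, x_r, y = x_h - x_h.mean(), x_r - x_r.mean(), y - y.mean()

def tstats(y, X):
    """OLS without intercept: t-statistic on each coefficient."""
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    s2 = e @ e / (len(y) - X.shape[1])
    return b / np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))

beta2_hat = (x_r @ y) / (x_r @ x_r)       # H1 slope from (7)
yhat_h1 = beta2_hat * x_r                 # fitted values under H1

# J-test regression: y on x_h and the H1 fitted values; the second
# coefficient is alpha-hat.  Since yhat_h1 is a scalar multiple of x_r,
# its t-statistic reproduces the multiple-regression t-statistic exactly.
t_j = tstats(y, np.column_stack([x_h, yhat_h1]))
t_mult = tstats(y, np.column_stack([x_h, x_r]))   # multiple regression (5)

assert np.isclose(abs(t_j[1]), abs(t_mult[1]))    # eq. (15)
```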

4. Concluding remarks

This paper clarifies the relations between single multiple regressions and the two-stage procedures used in recent studies of the incremental information content of accounting measures beyond historical cost income. In some cases the two-stage procedures were (or were interpreted as) attempts to avoid the effects of collinearity. The main point of this paper is that no orthogonalization or other partitioning procedures can mitigate the consequences of collinearity. Within the class of linear unbiased estimators, least squares multiple regression is best (i.e., minimum variance). Collinearity is a small-sample problem to which the only solutions in the realm of linear unbiased estimators are to obtain more observations or to resort to the introduction of prior information.

Belsley, Kuh and Welsch (BKW) (1980) provide a comprehensive discussion of collinearity with a view to developing heuristics that can aid in distinguishing whether coefficients are insignificant because of the presence of collinearity or because the associated variables are irrelevant. Recognizing that collinearity is inherently a data problem rather than a statistical one, they focus on conditions under which the matrix $X'X$ is 'ill-conditioned'. Ill-conditioned is used in the numerical analysis sense of numerically unstable, one consequence of which is that the least squares coefficient estimates and their variance-covariance matrix are sensitive to the addition or deletion of observations. BKW provide a diagnostic called the condition number, which measures how ill-conditioned $X'X$ is. The condition number is a ratio of eigenvalues of $X'X$; see BKW for a detailed description of this concept. In tandem with other diagnostics and their simulations, the condition number aids in determining which explanatory variables are collinear and the magnitude of the problem. The condition number explicitly addresses the fact that pairwise correlations between explanatory variables may be low even though several variables are nearly linearly dependent. Since the Statistical Analysis System (SAS) has implemented the BKW diagnostics in its multiple regression routines (option COLLIN), the diagnostics are easy to use [see SAS Users Guide: Statistics (1982)].

It is useful to inquire whether the BGL coefficient estimates are degraded by collinearity. Without replicating the BGL data we cannot examine the BKW diagnostics. Although, as BKW point out, other diagnostic techniques such as those of Farrar and Glauber (1967) and Klein (1962) are deficient in some respects, the BGL case is one in which the Klein technique can provide a useful indication that collinearity is a problem. Klein's approximation is that collinearity may be degrading the estimates when a pairwise correlation exceeds the square root of the coefficient of determination. In the BGL case the pairwise correlation between $x_R$ and $x_H$ is 0.84 whereas the square root of the coefficient of determination is 0.37, which suggests that $X'X$ may be ill-conditioned in the BGL situation. In other words, BGL may be finding apparently insignificant coefficients on the replacement cost variable because their variables are collinear, rather than because the replacement cost variable is irrelevant.

If the BKW diagnostics indicate that collinearity is degrading parameter estimates, and it is not feasible to obtain more data, then the only solution is the introduction of prior information about the parameters via Bayes or mixed estimation. Ridge regression, another procedure that is often advanced as a solution to collinearity, is a special case of mixed estimation that leads to biased estimators. Both Bayes and mixed estimators are unbiased. Theil (1971), Johnston (1984) and, in particular, Belsley, Kuh and Welsch (1980) provide detailed discussions of these procedures.

A final word of caution is in order. Sometimes researchers drop or combine variables because of collinearity [e.g., Foster (1977, fn. 4)]. If the original model is the 'true' model, then dropping variables results in correlated omitted variables and hence biased estimators [Johnston (1984)].
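For readers without access to the SAS COLLIN option, the following rough sketch (ours; simulated data and assumed names) computes the two diagnostics discussed above: the BKW condition number of the column-scaled regressor matrix, and Klein's comparison of pairwise correlations with the square root of the coefficient of determination.

```python
# Rough sketch of the two collinearity diagnostics discussed above.
# Simulated data; assumed names; not a replication of the BGL data.
import numpy as np

rng = np.random.default_rng(3)
n = 500
x_h = rng.normal(size=n)
x_r = x_h + 0.4 * rng.normal(size=n)        # highly collinear regressors
y = x_h + 0.2 * x_r + rng.normal(size=n)
x_h, x_r, y = x_h - x_h.mean(), x_r - x_r.mean(), y - y.mean()
X = np.column_stack([x_h, x_r])

# BKW scale each column to unit length; the condition number is the ratio
# of the largest to the smallest singular value of the scaled matrix.
Xs = X / np.sqrt((X ** 2).sum(axis=0))
svals = np.linalg.svd(Xs, compute_uv=False)
condition_number = svals.max() / svals.min()

# Klein's heuristic: trouble is indicated when a pairwise correlation among
# regressors exceeds the square root of the regression R^2 (as in the BGL
# comparison of 0.84 with 0.37 reported above).
b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
r2 = 1 - (e @ e) / (y @ y)                  # variables are in deviation form
pair_corr = abs(np.corrcoef(x_h, x_r)[0, 1])

print(f"condition number: {condition_number:.1f}")
print(f"|corr(x_h, x_r)| = {pair_corr:.2f} vs sqrt(R^2) = {np.sqrt(r2):.2f}")
```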

Appendix: The relations between multiple regression and two-step regression procedures

Consider the regression
$$y_i = \gamma_0 + \gamma_1 x_{1,i} + \gamma_2 x_{2,i} + \varepsilon_i, \qquad i = 1, \ldots, N, \tag{A.1}$$
which can be rewritten in terms of deviations from sample means as
$$y_i = \gamma_1 x_{1,i} + \gamma_2 x_{2,i} + \varepsilon_i, \quad \text{or} \quad y = X\gamma + \varepsilon. \tag{A.2}$$

This regression is assumed to be well specified. That is, there are no measurement errors or correlated omitted variables. Except where needed for clarity, we make no distinction between coefficients and estimates, since parallel statements can be made about each. Using the fact that the least squares estimator of the $2 \times 1$ vector $\gamma$ is $(X'X)^{-1}X'y$, it can be shown that
$$\gamma_1 = \sigma_y(\rho_{1y} - \rho_{12}\rho_{2y})/\{\sigma_1(1 - \rho_{12}^2)\}, \tag{A.3}$$
$$\gamma_2 = \sigma_y(\rho_{2y} - \rho_{12}\rho_{1y})/\{\sigma_2(1 - \rho_{12}^2)\}, \tag{A.4}$$
where
$$\sigma_j^2 = \mathrm{var}(x_j), \quad j = 1, 2,$$
$$\rho_{12} = \mathrm{corr}(x_1, x_2),$$
$$\rho_{jy} = \mathrm{corr}(x_j, y), \quad j = 1, 2.$$

Also, since
$$\mathrm{var}(\hat\gamma) = \sigma_\varepsilon^2(X'X)^{-1},$$
then
$$\mathrm{var}(\hat\gamma_1) = \sigma_\varepsilon^2/\{N\sigma_1^2(1 - \rho_{12}^2)\}, \tag{A.5}$$
$$\mathrm{var}(\hat\gamma_2) = \sigma_\varepsilon^2/\{N\sigma_2^2(1 - \rho_{12}^2)\}, \tag{A.6}$$
$$\mathrm{cov}(\hat\gamma_1, \hat\gamma_2) = -\sigma_\varepsilon^2\rho_{12}/\{N\sigma_1\sigma_2(1 - \rho_{12}^2)\}. \tag{A.7}$$
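The closed forms (A.3) through (A.7) can be verified numerically. In the sketch below (ours, with illustrative data), sample moments are computed with divisor N so that the expressions hold exactly in sample.

```python
# Numerical verification of the closed forms (A.3)-(A.7) on simulated data.
# Sample moments use divisor N so the expressions hold exactly in sample.
import numpy as np

rng = np.random.default_rng(4)
N = 400
x1 = rng.normal(size=N)
x2 = 0.7 * x1 + rng.normal(size=N)
y = 0.5 * x1 + 0.3 * x2 + rng.normal(size=N)
x1, x2, y = x1 - x1.mean(), x2 - x2.mean(), y - y.mean()

sd1, sd2, sdy = x1.std(), x2.std(), y.std()        # np.std divides by N
r12 = np.corrcoef(x1, x2)[0, 1]
r1y = np.corrcoef(x1, y)[0, 1]
r2y = np.corrcoef(x2, y)[0, 1]

g1 = sdy * (r1y - r12 * r2y) / (sd1 * (1 - r12 ** 2))   # (A.3)
g2 = sdy * (r2y - r12 * r1y) / (sd2 * (1 - r12 ** 2))   # (A.4)

X = np.column_stack([x1, x2])
g = np.linalg.solve(X.T @ X, X.T @ y)              # least squares estimator
assert np.allclose([g1, g2], g)

e = y - X @ g
s2 = e @ e / (N - 2)                     # estimate of the error variance
V = s2 * np.linalg.inv(X.T @ X)          # var-cov matrix of the estimates
d = N * (1 - r12 ** 2)
assert np.isclose(V[0, 0], s2 / (d * sd1 ** 2))          # (A.5)
assert np.isclose(V[1, 1], s2 / (d * sd2 ** 2))          # (A.6)
assert np.isclose(V[0, 1], -s2 * r12 / (d * sd1 * sd2))  # (A.7)
```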

Now consider the following additional regressions.

First-stage:
$$x_2 = \alpha_1 x_1 + z, \tag{A.8}$$
$$x_1 = \alpha_2 x_2 + w. \tag{A.9}$$

Second-stage:
$$y = \beta_1 x_1 + \beta_2 z + \varepsilon, \tag{A.10}$$
$$y = \phi_1 w + \phi_2 x_2 + \varepsilon. \tag{A.11}$$

Bivariate:
$$y = \beta_1 x_1 + u, \tag{A.12}$$
$$y = \phi_2 x_2 + v. \tag{A.13}$$

The purpose of the first-stage regressions is to produce variables $z$ and $w$ which are respectively orthogonal to $x_1$ and $x_2$. The second-stage regressions therefore contain two orthogonal independent variables, $x_1$ ($x_2$) and $z$ ($w$). In some of the above regressions, the same coefficients and error terms appear in more than one equation. That these notational correspondences represent algebraic equivalence is now proved, along with other relations.


Proposition 1. The coefficient of $x_1$ [$x_2$] in (A.10) [(A.11)] is identical to the coefficient of $x_1$ [$x_2$] in (A.12) [(A.13)].

To see this, note that the definition of $\beta_1$ in (A.10) can be obtained from (A.3) with a suitable change in subscripts. That is,
$$\beta_1 = \sigma_y(\rho_{1y} - \rho_{1z}\rho_{zy})/\{\sigma_1(1 - \rho_{1z}^2)\}. \tag{A.14}$$
But by construction $\rho_{1z} = 0$, so that (A.14) reduces to
$$\beta_1 = \rho_{1y}\sigma_y/\sigma_1, \tag{A.15}$$
which is the definition of $\beta_1$ in (A.12). The proof for $\phi_2$ is similar.

Proposition 2. The coefficient of $x_2$ [$x_1$] in (A.2) is exactly the coefficient of $z$ [$w$] in (A.10) [(A.11)]. That is, $\gamma_2 = \beta_2$ and $\gamma_1 = \phi_1$. Furthermore, $t(\hat\gamma_2) = t(\hat\beta_2)$ and $t(\hat\gamma_1) = t(\hat\phi_1)$.

To demonstrate these results, observe that by suitable notational adaptation of (A.4) and using $\rho_{1z} = 0$,
$$\beta_2 = \mathrm{cov}(y, z)/\sigma_z^2. \tag{A.16}$$
But, using (A.8),
$$\sigma_z^2 = \sigma_2^2(1 - \rho_{12}^2), \tag{A.17}$$
$$\mathrm{cov}(y, z) = \mathrm{cov}(y, x_2 - \alpha_1 x_1). \tag{A.18}$$
Furthermore, by definition,
$$\alpha_1 = \rho_{12}\sigma_2/\sigma_1. \tag{A.19}$$
Therefore, using (A.16), (A.18) and (A.19),
$$\beta_2 = \sigma_2\sigma_y(\rho_{2y} - \rho_{12}\rho_{1y})/\sigma_z^2,$$
which, using (A.17), is
$$\beta_2 = \sigma_y(\rho_{2y} - \rho_{12}\rho_{1y})/\{\sigma_2(1 - \rho_{12}^2)\} = \gamma_2$$
in (A.4). Demonstration that $\phi_1 = \gamma_1$ is similar.

To demonstrate equivalence of the t-statistics, it is necessary to show that the error terms in (A.2), (A.10) and (A.11) are identical. To see this, start with the definition of the error term in (A.10),
$$\varepsilon = y - \beta_1 x_1 - \beta_2 z.$$
Using the definition of $\beta_1$ in (A.15), the definition of $z$ in (A.8) and the equivalence of $\beta_2$ and $\gamma_2$,
$$\varepsilon = y - \gamma_2 x_2 + \gamma_2\alpha_1 x_1 - \{\rho_{1y}\sigma_y/\sigma_1\} x_1.$$
On simplification this yields
$$\varepsilon = y - \gamma_2 x_2 - \gamma_1 x_1,$$
which is the error term in (A.2). Again, demonstration that the error terms in (A.11) and (A.2) are equivalent is similar. Equivalence of the t-statistics on $(\hat\gamma_2, \hat\beta_2)$ and $(\hat\gamma_1, \hat\phi_1)$ then follows from the identity of the errors in (A.2), (A.10) and (A.11), the fact that $\rho_{1z} = 0$, (A.17), and the definitions in (A.5) and (A.6).

References

Beaver, William H., Paul A. Griffin and Wayne R. Landsman, 1982, The incremental information content of replacement cost earnings, Journal of Accounting and Economics 4, no. 1, 15-39.
Beaver, William H. and Wayne R. Landsman, 1983, The incremental information content of Statement 33 disclosures, Research report (Financial Accounting Standards Board, Stamford, CT).

Bell, T.B., 1983, Market reaction to reserve recognition accounting, Journal of Accounting Research 21, no. 1, 1-17.
Belsley, David A., Edwin Kuh and Roy E. Welsch, 1980, Regression diagnostics: Identifying influential data and sources of collinearity (Wiley, New York).
Chow, Chee W., 1982, The demand for external auditing: Size, debt and ownership influences, The Accounting Review 57, no. 2, 272-291.
Collins, Daniel W., Michael S. Rozeff and Dan S. Dhaliwal, 1981, The economic determinants of the market reaction to proposed mandatory accounting changes in the oil and gas industry: A cross-sectional analysis, Journal of Accounting and Economics 3, no. 1, 37-71.
Davidson, R. and J.G. MacKinnon, 1981, Several tests for model specification in the presence of alternative hypotheses, Econometrica 49, no. 3, 781-793.
Evans, John H. III and James M. Patton, 1982, An economic analysis of participation in the Municipal Finance Officers Association certification of conformance program, Journal of Accounting and Economics 5, no. 2, 151-175.
Farrar, D.E. and R.R. Glauber, 1967, Multicollinearity in regression analysis: The problem revisited, Review of Economics and Statistics 49, 92-107.
Foster, G., 1977, Valuation parameters of property liability companies, Journal of Finance 32, no. 3, 823-835.
Johnston, J., 1984, Econometric methods (McGraw-Hill, New York).
Kaplan, Robert S. and G. Peter Wilson, 1984, A futile attempt to eliminate collinearity in replacement cost, cash flow information content studies, Private communication.
Klein, L., 1962, An introduction to econometrics (Prentice-Hall, Englewood Cliffs, NJ).
Lev, Baruch and James A. Ohlson, 1982, Market-based empirical research in accounting: A review, interpretation and extension, Journal of Accounting Research, suppl., 249-322.


Miller, Merton and Charles Upton, 1984, A test of the Hotelling valuation principle and the case of the oil and gas companies, CRSP working paper, May (University of Chicago, Chicago, IL).
Patell, James and Robert Kaplan, 1977, The information content of cash flow data relative to annual earnings, Working paper, Aug. (Graduate School of Business, Stanford University, Stanford, CA).
SAS Users Guide: Statistics, 1982 edition (SAS Institute, NC).
Schaefer, Thomas F., 1984, The information content of current cost income relative to dividends and historical cost income, Working paper (Florida State University, Tallahassee, FL).
Theil, Henri, 1971, Principles of econometrics (Wiley, New York).