Economics Letters 13 (1983) 393-396
North-Holland

ERRORS IN VARIABLES AND REVERSE REGRESSION IN THE MEASUREMENT OF WAGE DISCRIMINATION

Gary SOLON *
Princeton University, Princeton, NJ 08544, USA

Received 9 June 1983
Several recent articles have proposed ‘reverse regression’ as a solution to errors-in-variables problems in wage discrimination regressions. This note demonstrates that the reverse regression approach is biased against finding evidence of discrimination.
1. Introduction
Several recent articles have applied the conventional errors-in-variables model to argue that imperfect productivity measurement leads to exaggerated estimates of race and sex discrimination in wage regressions. Some of these articles have proposed ‘reverse regression’ as a solution to the errors-in-variables problem.¹ As Goldberger (1982) has suggested, the appropriateness of the standard errors-in-variables model in this context is open to question. Even taking the standard model for granted, though, this note demonstrates the inconsistency of the reverse regression estimator of discrimination. The reverse regression approach is shown to be systematically biased against finding evidence of discrimination.
* The author thanks Mark Stewart for his advice.
¹ See Roberts (1979), Kapsalis (1982), and Kamalich and Polachek (1982). Hashimoto and Kochin (1980) also discuss the errors-in-variables issue, but do not propose reverse regression.
0165-1765/83/$3.00 © 1983, Elsevier Science Publishers B.V. (North-Holland)
2. Inconsistency of the reverse regression estimator
A simple formulation of the wage regression equation is
$$y_i = \alpha_1 D_i + \alpha_2 P_i + \varepsilon_i, \tag{1}$$
where $y_i$ is the $i$th individual’s wage (or natural logarithm of the wage), $D_i$ is a dummy variable equal to 1 for women (or blacks), $P_i$ is the $i$th individual’s productivity, and $\varepsilon_i$ is an independently, identically distributed error term with zero mean and variance $\sigma_\varepsilon^2$. All variables in the model are expressed as deviations from their sample means. The coefficient $\alpha_2$ is positive, and $\alpha_1 < 0$ if discrimination is present and $\alpha_1 = 0$ if it is not. The source of the errors-in-variables problem is that $P_i$ is not directly observable to the data analyst. Instead, the analyst uses proxies such as years of schooling and work experience. To represent this situation, let $P_i^* = P_i + u_i$ be a productivity proxy that measures $P_i$ with error $u_i$, which is typically (though rather implausibly²) assumed to be independent of $P_i$, $D_i$, and $\varepsilon_i$, with zero mean and variance $\sigma_u^2$. Eq. (1) can then be rewritten as
$$y_i = \alpha_1 D_i + \alpha_2 P_i^* + \varepsilon_i - \alpha_2 u_i.$$
Ordinary least squares (OLS) estimation of this equation yields inconsistent estimators of $\alpha_1$ and $\alpha_2$ because of the correlation between $P_i^*$ and $u_i$. Consequently, several authors have suggested that the variable with measurement error, $P_i^*$, be shifted to the left-hand side of the equation and regressed against $y_i$ and $D_i$. The equation then becomes
$$P_i^* = -(\alpha_1/\alpha_2) D_i + (1/\alpha_2) y_i + u_i - \varepsilon_i/\alpha_2 = \beta_1 D_i + \beta_2 y_i + v_i, \tag{2}$$
where $\beta_1 = -\alpha_1/\alpha_2$, $\beta_2 = 1/\alpha_2$, and $v_i = u_i - \varepsilon_i/\alpha_2$.
² Goldberger (1982) discusses this assumption in detail. Essentially, the standard errors-in-variables formulation assumes that variables like schooling and experience are merely imperfect signals of productivity and not also determinants of productivity.
In this reverse version of the regression equation, $\beta_1 > 0$ reflects discrimination and $\beta_1 = 0$
reflects non-discrimination. As Kapsalis (1982) puts it, ‘the new method measures wage discrimination by comparing the characteristics of male and female employees with similar wages. In other words, wage discrimination exists if male and female employees earn the same wage rate, despite the fact that female employees are, for example, better educated.’ This reverse regression approach solves the errors-in-variables problem by shifting $P_i^*$ to the left-hand side of the equation, but it creates a new source of inconsistency by shifting the endogenous variable $y_i$, which is correlated with $\varepsilon_i$, to the right-hand side. As a result, the estimator of $\beta_1$ is biased downward, that is, against finding discrimination.³
Proceeding more formally, as the sample size $N$ goes to infinity, the probability limit of the OLS estimator of $\beta_1$ in eq. (2) is
$$\operatorname{plim} \hat\beta_1 = \operatorname{plim} A / \operatorname{plim} B,$$
where
$$A = \Bigl(\sum y_i^2 \sum D_i P_i^* - \sum D_i y_i \sum y_i P_i^*\Bigr)/N^2,$$
$$B = \Bigl[\sum D_i^2 \sum y_i^2 - \Bigl(\sum D_i y_i\Bigr)^2\Bigr]/N^2,$$
and the summations are over $i = 1, \ldots, N$. If, as $N \to \infty$, $\sum D_i^2/N$ converges to a finite constant $\sigma_D^2$, $\sum y_i^2/N$ converges in probability to a finite constant $\sigma_y^2$, and $\sum D_i y_i/N$ converges in probability to a negative constant $\rho_{Dy}\sigma_D\sigma_y$ (reflecting the negative total correlation between wage and the sex or race dummy), then
$$\operatorname{plim} B = \sigma_D^2\sigma_y^2 - \rho_{Dy}^2\sigma_D^2\sigma_y^2 = \sigma_D^2\sigma_y^2(1 - \rho_{Dy}^2) > 0.$$
Also, dropping $i$ subscripts,
$$\operatorname{plim} A = \sigma_y^2 \operatorname{plim}\Bigl[\sum D(\beta_1 D + \beta_2 y + v)/N\Bigr] - \rho_{Dy}\sigma_D\sigma_y \operatorname{plim}\Bigl[\sum y(\beta_1 D + \beta_2 y + v)/N\Bigr] = \beta_1 \operatorname{plim} B + \rho_{Dy}\sigma_D\sigma_y\sigma_\varepsilon^2/\alpha_2.$$
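The expression for plim A can be checked term by term. The following is a sketch of the intermediate algebra, not part of the original derivation. It assumes, as seems implicit in eq. (1), that $\varepsilon$ is uncorrelated with $D$ and $P$, and it uses the stated independence of $u$ from $P$, $D$, and $\varepsilon$:

```latex
\begin{aligned}
\operatorname{plim} \tfrac{1}{N}\sum D v
  &= \operatorname{Cov}(D,\, u - \varepsilon/\alpha_2) = 0, \\
\operatorname{plim} \tfrac{1}{N}\sum y v
  &= \operatorname{Cov}(\alpha_1 D + \alpha_2 P + \varepsilon,\; u - \varepsilon/\alpha_2)
   = -\sigma_\varepsilon^2/\alpha_2, \\
\operatorname{plim} A
  &= \sigma_y^2\bigl(\beta_1\sigma_D^2 + \beta_2\rho_{Dy}\sigma_D\sigma_y\bigr)
   - \rho_{Dy}\sigma_D\sigma_y\bigl(\beta_1\rho_{Dy}\sigma_D\sigma_y + \beta_2\sigma_y^2
   - \sigma_\varepsilon^2/\alpha_2\bigr) \\
  &= \beta_1\sigma_D^2\sigma_y^2\bigl(1-\rho_{Dy}^2\bigr)
   + \rho_{Dy}\sigma_D\sigma_y\sigma_\varepsilon^2/\alpha_2 .
\end{aligned}
```

The two $\beta_2$ terms cancel, so the only departure from $\beta_1 \operatorname{plim} B$ comes from the correlation between $y$ and $v$, which is the endogeneity of the wage on the right-hand side of eq. (2).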
³ The opposite directions of the inconsistency of the direct and reverse regression estimators of discrimination are reminiscent of the well-known result that, in the simpler context of a bivariate relationship, the probability limits of the direct and reverse regression estimators bracket the true slope parameter. See Stewart and Wallis (1981, pp. 136-139).
Then, dividing plim A by plim B gives
$$\operatorname{plim} \hat\beta_1 = \beta_1 + \frac{\rho_{Dy}\,\sigma_\varepsilon^2}{\alpha_2\,\sigma_D\,\sigma_y\,(1 - \rho_{Dy}^2)}.$$
Therefore, with $\alpha_2 > 0$ and $\rho_{Dy} < 0$, the probability limit of $\hat\beta_1$ is unambiguously less than the true $\beta_1$. Since discrimination is reflected in a positive $\beta_1$, the downward inconsistency of $\hat\beta_1$ systematically biases the reverse regression method against finding discrimination.
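The downward inconsistency can be illustrated numerically. The following is a minimal simulation sketch, not part of the original note. The function name, the parameter values, the normal distributions, and the equal-probability dummy are all hypothetical choices made for illustration. It generates data from the wage equation with a noisy productivity proxy, runs the reverse regression of the proxy on the dummy and the wage, and compares the estimate with the limit obtained by dividing plim A by plim B, namely $\beta_1 + \rho_{Dy}\sigma_\varepsilon^2/[\alpha_2\sigma_D\sigma_y(1-\rho_{Dy}^2)]$.

```python
import numpy as np


def simulate_reverse_regression(n=200_000, alpha1=-0.2, alpha2=1.0,
                                sigma_eps=1.0, sigma_u=0.5, seed=0):
    """Return (true beta1, reverse-regression estimate, limit from the formula).

    All parameter values and distributions are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    D = rng.integers(0, 2, size=n).astype(float)  # group dummy, P(D = 1) = 0.5
    P = rng.normal(0.0, 1.0, size=n)              # true productivity (unobserved)
    eps = rng.normal(0.0, sigma_eps, size=n)      # wage equation error
    u = rng.normal(0.0, sigma_u, size=n)          # measurement error in the proxy

    y = alpha1 * D + alpha2 * P + eps             # wage equation
    P_star = P + u                                # observed productivity proxy

    # Express all variables as deviations from their sample means.
    D = D - D.mean()
    y = y - y.mean()
    P_star = P_star - P_star.mean()

    # Reverse regression: OLS of P* on D and y; first coefficient estimates beta1.
    X = np.column_stack([D, y])
    b1_hat = np.linalg.lstsq(X, P_star, rcond=None)[0][0]

    # Limit formula, evaluated with sample moments in place of population ones.
    s_D = np.sqrt(np.mean(D ** 2))
    s_y = np.sqrt(np.mean(y ** 2))
    rho = np.mean(D * y) / (s_D * s_y)
    beta1 = -alpha1 / alpha2
    limit = beta1 + rho * sigma_eps ** 2 / (alpha2 * s_D * s_y * (1.0 - rho ** 2))
    return beta1, b1_hat, limit


beta1, b1_hat, limit = simulate_reverse_regression()
print(f"true beta1 = {beta1:.3f}, reverse estimate = {b1_hat:.3f}, limit = {limit:.3f}")
```

Under these hypothetical values the population limit works out to 0.1, half the true $\beta_1 = 0.2$, even though productivity is independent of the dummy by construction. Setting `alpha1=0.0` makes $\rho_{Dy}$ zero in the population, and the inconsistency vanishes.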
References
Goldberger, Arthur S., 1982, Reverse regression and salary discrimination, Aug. (University of Wisconsin, Madison, WI).
Hashimoto, Masanori and Levis Kochin, 1980, A bias in the statistical estimation of the effects of discrimination, Economic Inquiry 18, July, 478-486.
Kamalich, Richard F. and Solomon W. Polachek, 1982, Discrimination: Fact or fiction? An examination using an alternative approach, Southern Economic Journal 49, Oct., 450-461.
Kapsalis, Constantine, 1982, A new measure of wage discrimination, Economics Letters 9, 287-293.
Roberts, Harry V., 1979, Harris Trust and Savings Bank: An analysis of employee compensation, Report 7946 (Center for Mathematical Studies in Business and Economics, University of Chicago, Chicago, IL).
Stewart, Mark B. and Kenneth F. Wallis, 1981, Introductory econometrics (Halsted, New York).