ABNORMAL SELECTION BIAS

Arthur S. Goldberger

Department of Economics
University of Wisconsin - Madison
Madison, Wisconsin
I. INTRODUCTION

How effective are the widely-used selection-bias adjustment procedures (Heckman, 1976; Maddala & Lee, 1976) when the normality assumption is violated? In this paper, we build upon Crawford (1979) to provide some guidance for the simplest situation. For more elaborate situations, see Arabmazar & Schmidt (1982) and references therein.

The now familiar specification is

    y_1 = α_1'z + σu_1 ,    y_2* = α_2'z + u_2 ,
                                                                  (1)
    y_2 = 1 if y_2* ≤ 0 ,    y_2 = 0 if y_2* > 0 ,

where z is exogenous (with first element 1), u_1 and u_2 are bivariate standard normal with correlation ρ, and the outcome variable y_1 is observed if and only if the binary selection variable y_2 equals 1, that is, if and only if the latent selection variable y_2* is less than or equal to zero. A sample on y_1 is thus a selected one, and linear regression of y_1 on z would give inconsistent estimates of α_1.

The conditional expectation for observed y_1 is

    E[y_1 | y_2 = 1] = α_1'z + E[σu_1 | u_2 ≤ -α_2'z] = α_1'z + γ g*(-α_2'z) ,    (2)
STUDIES IN ECONOMETRICS, TIME SERIES, AND MULTIVARIATE STATISTICS
Copyright © 1983 by Academic Press, Inc. All rights of reproduction in any form reserved. ISBN 0-12-398750-4
where γ = σρ and

    g*(·) ≡ E[u_2 | u_2 ≤ ·] = -f*(·)/F*(·) ,

with f*(·) and F*(·) being the pdf and cdf of the univariate standard normal distribution. (Here and in the sequel we suppress the conditioning on z for notational convenience.) Also, the expectation of the binary variable y_2 is

    E[y_2] = Pr{y_2 = 1} = Pr{u_2 ≤ -α_2'z} = F*(-α_2'z) .    (3)
Several methods for removing selectivity bias in this model have been developed. For the "censored" case, where y_2 is observed, a two-step procedure is in use: First, estimate α_2 by maximum-likelihood probit analysis of y_2 on z across the full sample, and use that estimate to calculate g = g*(-α_2'z) for each observation. Second, regress y_1 linearly on z and g across the selected sample to estimate α_1 (and incidentally γ). For the "truncated" case, where y_2 is not observed, the probit step is unavailable. Instead one may estimate α_1 (and incidentally γ and α_2) by nonlinear regression of y_1 on the expectation function in (2) across the selected sample.

In each case, the statistical consistency of the estimator derives from the fact that E[σu_1 | u_2 ≤ -α_2'z] = γ g*(-α_2'z), which in turn derives from two aspects of bivariate normality: the normality of u_2, which makes E[u_2 | u_2 ≤ ·] = g*(·), and the linearity of E[u_1 | u_2], which makes E[σu_1 | u_2 ≤ ·] = γ E[u_2 | u_2 ≤ ·].
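To make the censored-case two-step procedure concrete, here is a minimal simulation sketch (assuming NumPy and SciPy are available; the parameter values a1, a2, sigma, and rho are hypothetical, chosen only for illustration):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Simulate the model (1): y1 = a1'z + sigma*u1, selection y2 = 1 when u2 <= -a2'z,
# with (u1, u2) bivariate standard normal with correlation rho.
rng = np.random.default_rng(0)
n = 20_000
a1, a2 = np.array([1.0, 0.5]), np.array([0.2, -1.0])
sigma, rho = 1.0, 0.6
z = np.column_stack([np.ones(n), rng.normal(size=n)])
u = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)
y1 = z @ a1 + sigma * u[:, 0]
y2 = (u[:, 1] <= -(z @ a2)).astype(float)      # selection indicator

def g_star(c):                                 # g*(c) = E[u2 | u2 <= c] = -f*(c)/F*(c)
    return -norm.pdf(c) / norm.cdf(c)

# Step 1: probit MLE of y2 on z; here Pr{y2 = 1} = F*(-a2'z), as in (3).
def negloglik(b):
    p = np.clip(norm.cdf(-(z @ b)), 1e-10, 1 - 1e-10)
    return -(y2 * np.log(p) + (1 - y2) * np.log(1 - p)).sum()

a2_hat = minimize(negloglik, np.zeros(2)).x

# Step 2: OLS of y1 on z and g = g*(-a2_hat'z) across the selected sample.
sel = y2 == 1
X = np.column_stack([z[sel], g_star(-(z[sel] @ a2_hat))])
coef, *_ = np.linalg.lstsq(X, y1[sel], rcond=None)
a1_hat, gamma_hat = coef[:2], coef[2]          # gamma estimates sigma * rho
```

The coefficient on the constructed regressor estimates γ = σρ; with a bivariate-normal disturbance the two-step estimates are consistent, which is the benchmark the paper goes on to perturb.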
Our ultimate objective is to determine the properties of these "normal-adjusted" estimators when the disturbance distribution is not bivariate normal. But we simplify the model drastically for the sake of tractability. We confine attention to the special situation where the latent selection variable coincides with the outcome variable, or more precisely, y_2* = (y_1 - c)/σ, where c is a known limit (i.e., truncation) point. Under normality this gives Tobin's (1958) model:

    y_1 = α_1'z + σu_1 ,    y_2 = 1 if y_1 ≤ c ,    y_2 = 0 if y_1 > c ,    (4)

where u_1 ~ N(0,1) and y_1 is observed if and only if y_2 = 1. This reduces the dimensionality of the problem by leaving only a single disturbance, and reduces the conditional expectation of observed y_1 to

    E[y_1 | y_2 = 1] = α_1'z + σ g*((c - α_1'z)/σ) .    (5)

Further, we confine attention to the truncated case, so that the normal-adjusted procedure estimates α_1 (and incidentally σ) by nonlinear regression of y_1 on the expectation function in (5) across the selected sample. Further, we take σ as known (without further loss of generality, σ = 1), and we take z to have only a single element, namely the constant (so that α_1'z = μ).
II. SPECIFICATION

Our specification is

    y = μ + u ,    E[u] = 0 ,    V[u] = 1 ,
                                                                  (6)
    y observed if and only if y ≤ c ,    c known .

The disturbance u is not necessarily normal. Observed data now comprise a random sample from a truncated population whose expectation is

    h(c;μ) ≡ E[y | y ≤ c] = μ + E[u | u ≤ c-μ] = μ + g(c-μ) ,    (7)

where

    g(·) ≡ E[u | u ≤ ·]    (8)

is the truncated-mean function of the (zero-mean, unit-variance) u.

Let ȳ denote the sample mean. Then μ̂, the "normal-adjusted" estimator of μ to be considered here, is obtained by solving

    ȳ = h*(c;μ̂) ,    (9)

where h*(c;μ) = μ + g*(c-μ) is the expectation of a N(μ,1) variable given that it is less than or equal to c. If u is normal, then μ̂ is consistent, being the maximum-likelihood, as well as a method-of-moments, estimator of μ. If u is not normal, then μ̂ is inconsistent: since plim ȳ = E[y | y ≤ c] = h(c;μ), it follows that μ* ≡ plim μ̂ solves the equation

    μ + g(c-μ) = μ* + g*(c-μ*) .    (10)

We seek the relation between μ* and μ as a function of c for various non-normal distributions. To obtain a canonical form, let

    m = μ* - μ ,    θ = c - μ ,    c - μ* = θ - m ,    (11)

and rewrite (10) as

    m = g(θ) - g*(θ-m) .    (12)

Solutions to this equation give m = m(θ), the asymptotic bias of the normal-adjusted estimator, as a function of θ, the truncation point expressed as a deviation from the true population mean.

We will tabulate the bias functions m(θ) for several symmetric distributions. To do so we first obtain their truncated mean functions g(·).
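To make (12) concrete, here is a minimal numerical sketch (assuming SciPy is available) that solves the bias equation for a logistic disturbance rescaled to unit variance; it reproduces entries of the logistic column of Table III below:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def g_star(x):               # normal tmf: g*(x) = E[u | u <= x] = -f*(x)/F*(x)
    return -norm.pdf(x) / norm.cdf(x)

def g_logistic(theta):       # logistic tmf, rescaled to unit variance via (16)
    s = np.pi / np.sqrt(3.0)              # natural-form standard deviation
    F = 1.0 / (1.0 + np.exp(-s * theta))
    return (s * theta + np.log(1.0 - F) / F) / s

def bias(theta):             # solve m = g(theta) - g*(theta - m), eq. (12)
    f = lambda m: m - g_logistic(theta) + g_star(theta - m)
    return brentq(f, -5.0, 5.0)

print(round(bias(0.0), 3))   # prints 0.095
```

The fixed point is found by bracketing; the same routine with a different g(·) handles the Student and Laplace cases.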
III. TRUNCATED MEAN FUNCTIONS

It is convenient to summarize some general properties of truncated mean functions (tmfs). Suppose that the random variable X has pdf f(x) and cdf F(x), where f(x) is continuous, differentiable, and positive for -∞ < x < ∞. Then its tmf is

    g(x) ≡ E[X | X ≤ x] = ∫_{-∞}^{x} v f(v) dv / ∫_{-∞}^{x} f(v) dv .    (13)

With Pr{X ≤ x} > 0, it is clear that x - g(x) > 0, i.e., the truncated mean is less than the truncation point. The slope of the tmf is

    g'(x) ≡ ∂g/∂x = x f(x)/F(x) - [f(x)/F(x)] g(x) = r(x)[x - g(x)] ,    (14)

where

    r(x) ≡ f(x)/F(x) = ∂ log F(x)/∂x .    (15)

Since r(x) > 0, it is clear that g'(x) > 0, i.e., the tmf is monotonically increasing. Furthermore, as shown in Appendix A, if f(x) is strictly log-concave for all X ≤ x, then g'(x) < 1.
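Both slope properties are easy to check numerically for the standard normal, whose tmf is g(x) = -f(x)/F(x) (a sketch assuming SciPy):

```python
import numpy as np
from scipy.stats import norm

# Check (14)-(15) for the standard normal: the slope r(x)[x - g(x)]
# should lie strictly between 0 and 1, and the tmf should be increasing.
xs = np.linspace(-3.0, 3.0, 61)
r = norm.pdf(xs) / norm.cdf(xs)      # r(x) = f(x)/F(x)
g = -r                               # normal tmf
slope = r * (xs - g)
assert np.all(slope > 0.0) and np.all(slope < 1.0)
assert np.all(np.diff(g) > 0.0)      # monotonically increasing
```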
IV. ANALYSIS

For our analysis of bias, we consider the Student, logistic, and Laplace (double-exponential) distributions, along with the normal. All are symmetric with zero mean, which makes them plausible disturbance distributions.

Table I displays the pdfs f(·) and tmfs g(·), adapted from Raiffa and Schlaifer (1961, pp. 229, 233) and Crawford (1979). To reduce clutter in Table I, we use a "natural form" for each distribution; as a consequence, the variances, given in the last column, are not necessarily unity. When we proceed to calculate the g(θ) and m(θ) functions, however, we use the "standard form" for each distribution, in which the variance is unity. The translation is straightforward: If X has the natural form with variance σ² and tmf ḡ(x) = E[X | X ≤ x], then W = X/σ has the standard form with variance 1 and tmf

    g(θ) ≡ E[W | W ≤ θ] = (1/σ) E[X | X ≤ σθ] = (1/σ) ḡ(σθ) .    (16)

Table II gives the numerical values of the truncated means g(θ) and tail probabilities F(θ) for truncation points -3 ≤ θ ≤ 3 by steps of 0.2, for the standard forms of our distributions. The Student family is represented by its members with degrees-of-freedom n = 5, 10, 20, 30. That the Student tail probabilities are unfamiliarly close to those for the normal is a consequence of the standardization. It seems that much of the difference between the Student and normal is accounted for by the difference in variances: a natural t(n) variable behaves rather like a N(0, n/(n-2)) variable.

Table III gives the values of the biases m(θ), obtained by numerical solution of the nonlinear Equation (12), with g*(·) being the normal tmf and g(·) being the tmf of the alternative true distribution.
TABLE I. TRUNCATED MEAN FUNCTIONS FOR SELECTED ZERO-MEAN DISTRIBUTIONS

Distribution   Density f(x)                    Truncated mean g(x) = E[X|X ≤ x]          Variance σ²

Normal         (2π)^{-1/2} e^{-x²/2}           -f(x)/F(x)                                1

Student*       φ(n)(1 + x²/n)^{-(n+1)/2}       -[n/(n-1)](1 + x²/n) f(x)/F(x)            n/(n-2)

Logistic       e^x/(1 + e^x)²                  x + [1/F(x)] log(1 - F(x))                π²/3

Laplace        (1/2) e^{-|x|}                  x - 1             for x ≤ 0               2
                                               (1+x)/(1 - 2e^x)  for x > 0

*n is the degrees-of-freedom parameter, and φ(n) = Γ((n+1)/2) / [√(nπ) Γ(n/2)].
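As a cross-check on the closed forms in Table I, the logistic and Laplace tmfs can be compared against direct numerical integration of (13) (a sketch assuming SciPy; the lower limit -40 stands in for -∞):

```python
import numpy as np
from scipy import integrate

# Compare each closed-form tmf with direct numerical integration of (13):
# E[X | X <= x] = (integral of v f(v)) / (integral of f(v)) over (-inf, x].
def tmf_numeric(f, x, lo=-40.0):
    num, _ = integrate.quad(lambda v: v * f(v), lo, x)
    den, _ = integrate.quad(f, lo, x)
    return num / den

f_logistic = lambda v: np.exp(v) / (1.0 + np.exp(v)) ** 2
F_logistic = lambda v: np.exp(v) / (1.0 + np.exp(v))
g_logistic = lambda x: x + np.log(1.0 - F_logistic(x)) / F_logistic(x)

f_laplace = lambda v: 0.5 * np.exp(-abs(v))
g_laplace = lambda x: x - 1.0 if x <= 0 else (1.0 + x) / (1.0 - 2.0 * np.exp(x))

for x in (-1.5, 0.5, 2.0):
    assert abs(tmf_numeric(f_logistic, x) - g_logistic(x)) < 1e-6
    assert abs(tmf_numeric(f_laplace, x) - g_laplace(x)) < 1e-6
```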
TABLE II. TAIL PROBABILITIES F(θ) AND TRUNCATED MEANS g(θ)

[Numerical entries for truncation points θ = -3.0(0.2)3.0 under the standard forms of the normal, Student (n = 5, 10, 20, 30), logistic, and Laplace distributions; the entries are not recoverable from this reproduction.]
TABLE III. BIASES m(θ) OF NORMAL-ADJUSTED ESTIMATOR

Trunc. pt. θ   Stu.(05)   Stu.(10)   Stu.(20)   Stu.(30)   Logistic   Laplace
   -3.00        -3.303     -2.153     -1.303      -.936     -2.132     -2.729
   -2.80        -3.010     -1.911     -1.129      -.804     -1.935     -2.529
   -2.60        -2.715     -1.674      -.965      -.679     -1.738     -2.329
   -2.40        -2.419     -1.445      -.812      -.566     -1.543     -2.129
   -2.20        -2.122     -1.225      -.671      -.463     -1.350     -1.929
   -2.00        -1.826     -1.017      -.543      -.370     -1.160     -1.729
   -1.80        -1.534      -.822      -.428      -.289      -.974     -1.529
   -1.60        -1.247      -.642      -.326      -.219      -.795     -1.329
   -1.40         -.971      -.480      -.239      -.159      -.623     -1.129
   -1.20         -.710      -.338      -.166      -.109      -.462      -.929
   -1.00         -.471      -.218      -.105      -.069      -.315      -.729
    -.80         -.260      -.119      -.057      -.037      -.186      -.529
    -.60         -.086      -.042      -.021      -.014      -.078      -.329
    -.40          .046       .013       .005       .003       .005      -.129
    -.20          .135       .049       .021       .014       .062       .071
     .00          .182       .069       .031       .020       .095       .271
     .20          .195       .075       .034       .022       .106       .342
     .40          .181       .072       .032       .021       .101       .290
     .60          .153       .062       .028       .018       .086       .213
     .80          .118       .049       .022       .014       .066       .141
    1.00          .082       .034       .016       .010       .044       .083
    1.20          .051       .021       .010       .006       .025       .040
    1.40          .026       .008       .004       .003       .010       .009
    1.60          .006       .001      -.000      -.000      -.002      -.012
    1.80         -.007      -.006      -.003      -.002      -.010      -.024
    2.00         -.016      -.010      -.005      -.004      -.014      -.031
    2.20         -.020      -.012      -.006      -.004      -.016      -.033
    2.40         -.022      -.012      -.006      -.004      -.016      -.032
    2.60         -.022      -.012      -.006      -.005      -.015      -.030
    2.80         -.021      -.010      -.005      -.003      -.013      -.026
    3.00         -.019      -.009      -.004      -.003      -.011      -.023
Evidently the bias of normal adjustment arises from the difference between the true tmf and the normal tmf. It is not surprising to find that the bias is negligible when θ is algebraically large. For there the truncation is mild, so that the truncated mean for each distribution is close to its untruncated mean, namely zero. Nor is it surprising to find that the bias is substantial when θ is algebraically small. For there the truncation is extreme (with θ < -1, less than 15% of the full population is retained in the selected populations), so that the lower-tail differences among the density functions make the tmfs diverge from each other as well as from zero. However, the course of the bias functions for intermediate values of θ, where truncation is moderate and the tmfs are quite close, is perhaps unanticipated.

To account for the situation, first observe that the mean-value theorem permits us to write (12) as

    m = g(θ) - [g*(θ) - g*'(τ)m] ,    (17)

where g*'(·) = ∂g*(·)/∂(·) and τ lies between θ and θ-m. Thus

    m = m(θ) = [g(θ) - g*(θ)] / [1 - g*'(τ)] .    (18)

Now, the normal distribution is strictly log-concave (see Appendix B). Hence 0 < g*'(·) < 1 everywhere, so that the denominator in (18) lies in the unit interval. Hence, at any θ, the bias m(θ) is an amplification of the difference between the true truncated mean g(θ) and the normal truncated mean g*(θ). While the bias vanishes at points θ where the tmfs intersect, everywhere else the bias exceeds in absolute value the difference between the tmfs. This conclusion, which rests on the properties of the normal tmf and hence holds regardless of the true distribution being considered, is our key analytical result.

The amplification can be observed by a comparison of Tables II and III. Even in the central range of θ, where the various g(·) functions are quite close to g*(·), the bias is not always negligible. For example, if our sample came from a standard logistic distribution truncated at θ = 0, the normal-adjusted estimator would overstate the population mean by m(0) = .095. Had we made no adjustment, our estimator would have been ȳ, whose asymptotic bias is g(0) = -.764, so the normal adjustment is better than no adjustment. We have not determined how generally this phenomenon holds; that is, we have not characterized the distributions (and θ-values) for which |m(θ)| ≤ |g(θ)|.
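The amplification result can be spot-checked numerically (a sketch assuming SciPy; logistic disturbance in standard form, with the bias m(θ) obtained by solving (12)):

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

# Check the amplification implied by (18): |m(theta)| should exceed
# |g(theta) - g*(theta)| away from points where the two tmfs cross.
g_star = lambda x: -norm.pdf(x) / norm.cdf(x)
s = np.pi / np.sqrt(3.0)                      # logistic natural-form std dev

def g(theta):                                 # logistic tmf, unit variance
    F = 1.0 / (1.0 + np.exp(-s * theta))
    return (s * theta + np.log(1.0 - F) / F) / s

for theta in (-2.0, -1.0, 0.0, 1.0):
    m = brentq(lambda v: v - g(theta) + g_star(theta - v), -5.0, 5.0)
    assert abs(m) >= abs(g(theta) - g_star(theta))
```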
V. EXTENSIONS

We now consider briefly how our approach might extend to the regression situation

    y = α + βz + u ,    (19)

where the disturbance u has E[u] = 0, V[u] = 1, and y is observed if and only if y ≤ c, the truncation point c being known.

A. Mean Difference

When z takes on only two distinct values, we can code them as 1 and 2. The regression is equivalent to a two-population model:

    y_1 = μ_1 + u_1 ,    y_2 = μ_2 + u_2 ,    (20)

where u_1 and u_2 are identically distributed, μ_j = α + βz_j (j = 1,2), and β = μ_2 - μ_1 is the parameter of interest. With random samples from each truncated population, we estimate μ_1 and μ_2 by applying the normal adjustment to each sample, and then estimate β by the difference between the two estimates. The probability limit of this estimator is

    β* = μ*_2 - μ*_1 = (μ_2 + m_2) - (μ_1 + m_1) = β + (m_2 - m_1) ,    (21)

where m_j = m(θ_j), θ_j = c - μ_j (j = 1,2). Thus the bias of the normal-adjusted estimator of β is

    d = β* - β = m(θ_2) - m(θ_1) ,    (22)

which may be calculated directly from Table III for selected values of θ_1 and θ_2 (= θ_1 - β). For example, suppose the true disturbance distribution is logistic, with μ_1 = 0, μ_2 = 1, and c = 0. Then θ_1 = 0, θ_2 = -1, and d = m(-1) - m(0) = -.315 - .095 = -.410. The normal adjustment estimates β* = .590 while in fact β = 1. Had no adjustment been made, our estimator would have been the difference between the sample means, whose probability limit is

    (μ_2 + g(θ_2)) - (μ_1 + g(θ_1)) = (μ_2 - μ_1) + (g(θ_2) - g(θ_1)) = 1 + g(-1) - g(0) = 1 - 1.594 + .764 = .170 .

Here again it appears that normal adjustment is better than nothing. But we have not determined how generally |m(θ_2) - m(θ_1)| ≤ |g(θ_2) - g(θ_1)| holds.
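The mean-difference example can be reproduced directly (a sketch assuming SciPy; the logistic tmf and the bias solver for (12) are as before):

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

# Mean-difference example: logistic truth, mu1 = 0, mu2 = 1, c = 0,
# so theta1 = 0 and theta2 = -1; the bias is d = m(-1) - m(0), eq. (22).
g_star = lambda x: -norm.pdf(x) / norm.cdf(x)
s = np.pi / np.sqrt(3.0)

def g(t):                                   # logistic tmf, unit variance
    F = 1.0 / (1.0 + np.exp(-s * t))
    return (s * t + np.log(1.0 - F) / F) / s

m = lambda t: brentq(lambda v: v - g(t) + g_star(t - v), -5.0, 5.0)

d = m(-1.0) - m(0.0)                        # bias of the normal adjustment
unadj = 1.0 + g(-1.0) - g(0.0)              # plim of the unadjusted difference
print(d, unadj)                             # approx -0.410 and 0.170
```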
B. Regression

When z in (19) takes on J > 2 distinct values, we may view the regression as a J-population model:

    y_j = μ_j + u_j    (j = 1,...,J) ,    (23)

where μ_j = α + βz_j, and the disturbances u_j are identically distributed. Doing so suggests the following normal-adjustment estimation procedure. With random samples from each of the J truncated populations (the common truncation point c being known), estimate each μ_j by applying the normal adjustment, and then regress those estimated means linearly on z to estimate α and β. On this approach the estimated means are (asymptotically)

    μ*_j = μ_j + m_j ,    (24)

with m_j = m(θ_j), θ_j = c - μ_j. Regressing the μ*_j linearly on z_j will give a slope β* = β + d, say, where now the bias d is the slope in the linear regression of m_j on z_j. This bias evidently depends on the sequence of θ_j's, that is on c and on the sequence of z_j's.

To illustrate the bias, a numerical example will suffice. We take α = 1.4, β = .4, c = 1, z_j = j - 3, J = 5, and the true distribution to be logistic. From Tables II and III we calculate the entries in Table IV, in which we also report g_j = g(θ_j) and h_j = h(c;μ_j) = μ_j + g_j, the latter being the (asymptotic) observed truncated means of y_j. Linear regression of μ*_j on z_j gives the slope

    β* = Σ_j (z_j - z̄) μ*_j / Σ_j (z_j - z̄)² = .259    (25)

and the intercept

    α* = μ̄* - β* z̄ = 1.311 ,    (26)

in contrast to the true parameters β = .4 and α = 1.4.

TABLE IV. REGRESSION EXAMPLE: β = .4, α = 1.4

j    z_j    μ_j    θ_j     m_j     μ*_j     g_j      h_j
1    -2      .6     .4      .101    .701    -.517     .083
2    -1     1.0     .0      .095   1.095    -.764     .236
3     0     1.4    -.4      .005   1.405   -1.067     .333
4     1     1.8    -.8     -.186   1.614   -1.411     .389
5     2     2.2   -1.2     -.462   1.738   -1.781     .419
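The slope and intercept in (25)-(26) can be reproduced from the Table IV entries (a minimal sketch assuming NumPy):

```python
import numpy as np

# OLS of the adjusted means mu*_j from Table IV on z_j, as in (25)-(26).
z = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
mu_star = np.array([0.701, 1.095, 1.405, 1.614, 1.738])
beta_star = ((z - z.mean()) * mu_star).sum() / ((z - z.mean()) ** 2).sum()
alpha_star = mu_star.mean() - beta_star * z.mean()
print(round(beta_star, 3), round(alpha_star, 3))   # prints 0.259 1.311
```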
An alternative version of normal adjustment is the nonlinear regression estimator sketched in Section I. Here the observed means h_j are regressed on the nonlinear conditional expectation function in (5). That is, with σ² = 1 and c = 1 known, we choose a and b to minimize the sum of squared residuals, i.e., in

    h_j = a + b z_j + g*(1 - a - b z_j) + e_j    (j = 1,...,J) .    (27)

Doing so for the data of Table IV gives b = .281 and a = 1.317.

In either version, normal adjustment leaves a substantial bias. Once again, normal adjustment is better than nothing: linear regression of h_j on z_j in Table IV would give

    β̂ = Σ_j (z_j - z̄) h_j / Σ_j (z_j - z̄)² = .082    and    α̂ = h̄ - β̂ z̄ = .292 .
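Both calculations for the Table IV data can be sketched as follows (assuming SciPy; the nonlinear fit in (27) is done by least squares from a hypothetical starting point at the true parameters):

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import least_squares

# Table IV data (logistic truth, c = 1): observed truncated means h_j.
z = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
h = np.array([0.083, 0.236, 0.333, 0.389, 0.419])

# Nonlinear normal adjustment, eq. (27): fit h = a + b*z + g*(1 - a - b*z).
g_star = lambda x: -norm.pdf(x) / norm.cdf(x)
resid = lambda p: h - (p[0] + p[1] * z + g_star(1.0 - p[0] - p[1] * z))
a_hat, b_hat = least_squares(resid, x0=[1.4, 0.4]).x

# Unadjusted linear regression of h_j on z_j.
b_ols = ((z - z.mean()) * h).sum() / ((z - z.mean()) ** 2).sum()
a_ols = h.mean() - b_ols * z.mean()
print(b_hat, a_hat, b_ols, a_ols)   # approx .281, 1.317, .082, .292
```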
VI. CONCLUSIONS

Our analysis, which utilized symmetric zero-mean distributions, suggests that the normal selection-bias adjustment procedure will be quite sensitive to modest departures from normality. For further documentation, see Arabmazar and Schmidt (1982). Consequently, a more general functional form of the tmf might be required in practice (see Heckman, 1980; Lee, 1982). Or one might examine the data for departures from normality: In the regression example above, a plot of the μ̂_j against z_j would reveal that they failed to track a straight line. But some skepticism about the efficacy of such devices is warranted when the linearity of the true regression is itself open to question.
VII. APPENDIX A. LOG-CONCAVITY AND THE TRUNCATED MEAN FUNCTION

We show that log-concavity of the density function implies that the slope of the truncated mean function is less than or equal to 1. The result is due to Gary Chamberlain (personal communication); our derivation is a slight adaptation of his.

Suppose that the random variable Y has pdf p = p(y;θ), which is continuous, differentiable, and positive over a domain which does not depend on the continuous parameter θ. Its expectation is s(θ) ≡ E[Y;θ]. Let

    z = ∂ log p/∂θ ,    (A1)

so that p' ≡ ∂p/∂θ = zp and p" ≡ ∂²p/∂θ² = (z² + z')p. By the usual Cramer-Rao analysis of ∫ p dy = 1, and using ∫ as shorthand for integration over the domain of y, we obtain in turn

    0 = ∫ p' dy = ∫ zp dy = E[Z] ,    (A2)

    0 = ∫ p" dy = ∫ (z² + z')p dy = E[Z² + Z'] .    (A3)

Similarly, the analysis of s(θ) = ∫ yp dy gives

    s'(θ) = ∂s/∂θ = ∫ yp' dy = ∫ yzp dy = E[ZY] = C(Z,Y) ,    (A4)

    s"(θ) = ∂²s/∂θ² = ∫ yp" dy = ∫ y(z² + z')p dy = E[(Z² + Z')Y] = C[(Z² + Z'),Y] ,    (A5)

using (A2)-(A3).

Now suppose that a random variable X has pdf f(x) which is continuous, differentiable, and positive over -∞ < x < ∞; let F(x) denote its cdf. Let θ be a continuous parameter and consider the truncated distributions defined by X ≤ θ. For X ≤ θ, the pdf of X is f*(x;θ) = f(x)/F(θ), and the expectation of X is

    g(θ) ≡ E[X | X ≤ θ] = ∫_{-∞}^{θ} x f*(x;θ) dx .    (A6)

Let Y = X - θ. Then the random variable Y has pdf

    p(y;θ) = f(y+θ)/F(θ)  for y ≤ 0 ,    p(y;θ) = 0  for y > 0 ,    (A7)

and its expectation is

    s(θ) ≡ E[Y;θ] = ∫_{-∞}^{0} y [f(y+θ)/F(θ)] dy .    (A8)

Observe that the distribution of Y meets the conditions of the previous paragraphs (its domain is -∞ < y ≤ 0 regardless of θ), and that

    s(θ) = g(θ) - θ ,    s'(θ) = g'(θ) - 1 ,    s"(θ) = g"(θ) .    (A9)

Let t = log f(x), t' = ∂t/∂x, t" = ∂²t/∂x², and let

    r(θ) = f(θ)/F(θ) ,    w = (t' - r(θ))² + t" .    (A10)

Using (A7) for y ≤ 0, we obtain:

    L = log p(y;θ) = log f(y+θ) - log F(θ) ,    (A11)

    z = ∂L/∂θ = t' - r(θ) ,    (A12)

    z' = ∂z/∂θ = t" - r'(θ) .    (A13)

Consequently, from (A4) and (A12),

    s'(θ) = C(Z,Y) = C(T',Y) = C(T',X | X ≤ θ) ,    (A14)

because T' and Y differ from Z and X only by constants. Similarly, from (A5), (A10), and (A13), and because z² + z' = w - r'(θ) differs from w only by a constant,

    s"(θ) = C[(Z² + Z'),Y] = C(W,Y) = C(W,X | X ≤ θ) .    (A15)
X
with (functions of) the derivatives
of the logged density function, namely: g'(0) = | | = 1 + C(T',X|X £ 0) ,
(A16)
g"(0) = ||p = C(W,X|X £ 0) .
(A17)
If, for
X £ 0,
the pdf of
X
is logconcave (t" £ 0 ) ,
82
A R T H U R S. GOLDBERGER
then
Τ'
is non-increasing in X,
correlated with
X.
and hence is non-positively
From (A16) this implies
Further, if the log-concavity is strict g'(6) < 1.
g'(6) £ 1.
(t" < 0 ) , then
These are Chamberlain's results on the slope of
the truncated mean function. Karlin (1982) has shown that if the pdf of concave, then the truncated variance creasing in
Θ.
V(X|x <_ Θ )
is logis in-
Other implications of log-concavity can be
found in Barlow and Proschan VIII.
X
APPENDIX B.
(1981, pp. 7 6 - 7 8 ) .
LOGCONCAVITY FOR SPECIFIC DISTRIBUTIONS
The main purpose of this appendix is to establish, by applying the theory in Appendix A, the key result used in the text, namely that the tmf of the normal distribution has slope less than unity. For completeness we also apply the theory to the other distributions under consideration.

A. Normal

From f(x) in Table I, we calculate

    t = -(1/2)(log 2π + x²) ,    t' = -x ,    t" = -1 .    (B1)

With t" < 0 everywhere, we conclude that g'(θ) < 1 for all θ. Indeed with t' = -x inserted in (A16) it follows that

    g'(θ) = 1 - V(X | X ≤ θ) ,    (B2)

which verifies g'(θ) < 1 since the conditional variance is strictly positive. Incidentally, from Karlin's result, it now follows that g"(θ) < 0: the normal tmf is itself concave.

B. Student

From f(x) in Table I, we calculate

    t = log φ(n) - (1/2)(n+1) log(1 + x²/n) ,    (B3)

    t' = -(n+1)x/(n + x²) ,    t" = -(n+1)(n - x²)/(n + x²)² .    (B4)

Observe that t" ≤ 0 only for x² ≤ n, so that the Student density is not log-concave in its tails. Indeed the density is log-convex for x ≤ -√n, which implies via (A16) that g'(θ) > 1 for all θ ≤ -√n. This calculation refers to the natural form. Translating into standard form via (16), we find g'(θ) > 1 for all θ ≤ -√(n-2), a phenomenon which is manifest in the low-n Student columns of Table II. The perverse behavior of the slope persists beyond -√(n-2), until the log-concavity overwhelms the log-convexity to make T' negatively correlated with X.

C. Logistic

From f(x) in Table I, and using f(x) = F(x)[1 - F(x)], we calculate

    t = x + 2 log[1 - F(x)] ,    (B5)

    t' = 1 - 2F(x) ,    t" = -2f(x) .    (B6)

With t" < 0 everywhere, we conclude that g'(θ) < 1 for all θ.

D. Laplace

From f(x) in Table I, we calculate

    t = log(1/2) - |x| ,    (B7)

    t' = 1 for x < 0 ,    t' = -1 for x > 0 ,    t" = 0 for x ≠ 0 .    (B8)

With t" ≤ 0 everywhere (except at the isolated point x = 0), we conclude that g'(θ) ≤ 1 for all θ. Indeed from g(x) in Table I, we calculate

    g'(θ) = 1 for θ < 0 ,    g'(θ) = (1 + 2θe^θ)/(1 - 2e^θ)² for θ > 0 .

Thus, as is manifest in Table II, the Laplace tmf has a unit slope for θ < 0, a kink at θ = 0, and then a less-than-unit slope for θ > 0.
ACKNOWLEDGMENTS

This research was supported in part by National Science Foundation grant SOC-7624428 and by the William F. Vilas Trust Foundation. I am grateful also to Insan Tunali for expert research assistance, and to Kenneth Burdett, Gary Chamberlain, John Geweke, Donald Hester, and Samuel Karlin for instruction and criticism.

REFERENCES

Arabmazar, A., and Schmidt, P. (1982). Econometrica 50, 1055.
Barlow, R. E., and Proschan, F. (1981). "Statistical Theory of Reliability and Life Testing." Silver Spring, MD.
Crawford, D. L. (1979). Department of Economics doctoral dissertation, University of Wisconsin.
Heckman, J. J. (1976). Annals of Economic and Social Measurement 5, 475.
Heckman, J. J. (1980). In "Evaluation Studies Review Annual," Vol. 5 (E. W. Stromsdorfer and G. Farkas, eds.), p. 69. Sage, Beverly Hills.
Karlin, S. (1982). In "Statistics and Probability: Essays in Honor of C. R. Rao" (G. Kallianpur, P. R. Krishnaiah, and J. K. Ghosh, eds.), p. 375. North-Holland, Amsterdam.
Lee, L.-F. (1982). Review of Economic Studies 49, 355.
Maddala, G. S., and Lee, L.-F. (1976). Annals of Economic and Social Measurement 5, 525.
Raiffa, H., and Schlaifer, R. (1961). "Applied Statistical Decision Theory." MIT Press, Cambridge.
Tobin, J. (1958). Econometrica 26, 24.