Journal of Econometrics 45 (1990) 351-366. North-Holland
PERSONAL CHARACTERISTICS, UNEMPLOYMENT INSURANCE, AND THE DURATION OF UNEMPLOYMENT*

Dean A. FOLLMANN
National Heart, Lung, and Blood Institute, Bethesda, MD 20892, USA

Matthew S. GOLDBERG
Institute for Defense Analyses, Alexandria, VA 22311, USA

Laurie MAY

Received July 1988, final version received September 1989
This paper develops a methodology to accommodate the 'spikes' in reemployment probabilities that are observed when unemployment insurance expires. Previous studies apply the Cox regression model, which has the advantage of a nonparametric baseline hazard function. However, even the Cox model constrains the explanatory variables to have the same effect throughout the entire spell of unemployment. By contrast, our new model allows the explanatory variables to have different effects at and away from the spike. After a simple data transformation, our model may be estimated using readily available software. An empirical example is provided and generalizations are discussed.
*This research was sponsored by the Center for Naval Analyses. However, the findings are those of the authors alone and do not reflect the views of the Center for Naval Analyses or the Department of Defense. Suggestions for improvement from Russ Beland, Jacob Klerman, Peter Kostiuk, Diane Lambert, Aline Quester, and Martha Shiells are gratefully acknowledged. We also thank Larry Katz for providing the data and the folks at the Epilepsy Branch of the National Institute of Neurological Disorders and Stroke for use of their VAX computer.

0304-4076/90/$3.50 © 1990, Elsevier Science Publishers B.V. (North-Holland)

1. Introduction

Economists have long been concerned with factors that determine the duration of unemployment. The effects of observable individual characteristics on unemployment duration generally have been modelled using a proportional hazard specification. Under this specification, the hazard function (the instantaneous conditional probability of reemployment or, more generally, 'failure') is multiplied by a factor exp(Z'β), where Z is a vector of observable characteristics and β is a coefficient vector to be estimated.

An important feature of unemployment duration data is that spikes are observed in the hazard function at 26 and 39 weeks of unemployment. These spikes correspond to the expiration of unemployment insurance (UI) benefits. A smooth parametric baseline hazard function (e.g., an exponential or Weibull function) does not adequately capture the spikes observed in the data.

Recently, Moffitt (1985) applied the Cox regression model, which allows an arbitrary baseline hazard function, to unemployment duration data [see Cox and Oakes (1984) or Kalbfleisch and Prentice (1980) on the Cox regression model]. In addition, Meyer (1986) and Han and Hausman (1986) have proposed modifications of the Cox regression model that also have nonparametric baseline hazard functions. In contrast to parametric models, the nonparametric hazard function allows an arbitrary jump in the probability of reemployment at 26 or 39 weeks. However, in all these models, the effect of observable characteristics again enters through the multiplicative factor exp(Z'β).

The difficulty with the nonparametric approach is that it constrains the effects of observable characteristics to be the same at these spikes as they are elsewhere throughout the time axis. For example, a common finding is that older individuals have shorter average periods of unemployment and, by implication, higher hazard rates. But because only a single coefficient vector β is estimated, older individuals are predicted to be more likely to find a job at any point in time, for example, at the 26th or 39th week of unemployment. However, the positive coefficient on age is driven by the behavior of older and younger individuals throughout the entire time axis, and may not apply specifically at 26 or 39 weeks of unemployment.

A more flexible model would allow the effect of age (or other observable characteristics) to be different at UI termination spikes as compared to the remainder of the time axis. This paper develops such a model. In particular,
one regression model is used to describe the probability of reemployment away from these spikes, and a coefficient vector β measures the effects of observable characteristics on this probability. A separate regression model with coefficient vector θ is used to describe the conditional probability of reemployment at each UI termination spike. An example is provided using Cox and extreme-value regression models.

The likelihood based on the proposed model is a function of the parameters of the two models. We provide a regularity condition under which the overall likelihood function factors into the product of two 'likelihoods'. In our example, the two factors are Cox and extreme-value likelihood functions. This factorization eases the computations considerably, because the likelihood factors may be maximized individually with no loss of efficiency.

We use our model to construct the profile of the individual most likely to find a job during the week in which his UI benefits expire. We show that this individual differs somewhat from the individual most likely to find a job in some global sense. In particular, older white individuals are most likely to find a job globally. When UI benefits expire, however, the individuals most likely
to find a job are college graduates who reside in counties with low unemployment rates. This suggests that the latter individuals are likely to accelerate their job search when the cost of remaining unemployed sharply increases. The acceleration of job search is consistent with the theoretical search models proposed by Mortensen (1977) and Burdett (1979).

There are other possible areas for application of the proposed methodology. In some states, striking workers receive unemployment benefits. When these benefits expire, there is a new incentive for the workers to settle the labor dispute [see Kennan (1985)]. Also, some health insurance plans cover hospital stays only up to a maximum number of days in a hospital. The duration of hospital stays may be affected by this provision. Attrition in the military increases sharply at the end of a service contract [see Follmann, Goldberg, and May (1987)]. Finally, pension vesting rules have a powerful effect on the timing of retirement [see Mitchell and Fields (1984)]. In any of these examples, if different individuals react differently to the spike event, then simple duration regression models may not provide a complete description of transition behavior.
2. Basic models

The key idea of this paper is that a hazard function for the entire time axis can be specified by the contributions of individual hazards defined over distinct intervals of the time axis. As suggested above, there are cases in which we expect the transition behavior to be quite different over distinct time intervals. In this section, a specific model will be developed that seems appropriate to describe reemployment behavior for UI recipients. Generalizations of this model will also be discussed.

Most survival models assume that the hazard function varies smoothly with duration. Even the Cox regression model, which allows an arbitrary baseline hazard function, requires the effect of explanatory variables to be the same throughout the time axis. In some cases, however, a particular interval (e.g., (t_a, t_b]) may be qualitatively different from the remainder of the time axis. Consider the following hazard function:
$$
h(t) =
\begin{cases}
\lambda(t), & t \le t_a, \\
\gamma(t - t_a), & t_a < t \le t_b, \\
\lambda(t - \Delta), & t_b < t,
\end{cases}
\tag{1}
$$

where Δ = t_b - t_a, and λ(·) and γ(·) represent (possibly) different hazard functions. Here the overall hazard, h(t), is specified by one function λ(t) until time t_a, then by a possibly different function, γ(t - t_a). Following the end of the interval at t_b, the hazard is given by the original function with a time-shifted argument, λ(t - Δ). Fig. 1 gives an example of one such hazard, where λ(t) is a Weibull hazard and γ(t) is a constant (i.e., the hazard of an exponential distribution). In fig. 1, the hazard is a smoothly increasing function given by the Weibull form until time t = 0.5. The hazard is elevated to 4 over the interval (0.5, 1.0], indicating a much higher conditional probability of failure over this interval. Finally, the hazard resumes its previous Weibull form, with a translated argument, after time t = 1.

[Fig. 1. Hazard function for model given by eq. (1), where λ(t) is Weibull, γ(t) is exponential, and Δ = 0.5. Horizontal axis: time.]

Table 1
Density and survivor function based on h(t) of eq. (1).a

Interval          Density, r(t)                     Survivor function, R(t)
t ≤ t_a           F(t)λ(t)                          F(t)
t_a < t ≤ t_b     F(t_a)G(t - t_a)γ(t - t_a)        F(t_a)G(t - t_a)
t_b < t           F(t - Δ)G(Δ)λ(t - Δ)              F(t - Δ)G(Δ)

a F(t) = exp[-∫_0^t λ(u)du], G(t) = exp[-∫_0^t γ(u)du], Δ = t_b - t_a.

We now demonstrate that, under our assumptions, the overall likelihood function factors into the distinct contributions of the disjoint time intervals. Let T be the time until failure. For a hazard function h(t), the density r(t) and survivor function R(t) are given by
$$
r(t) = h(t)\exp\left[-\int_0^t h(u)\,du\right], \qquad
R(t) = \exp\left[-\int_0^t h(u)\,du\right].
$$
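As a numerical illustration of eq. (1), the sketch below builds a hazard of the kind shown in fig. 1 and checks the three closed-form survivor expressions of table 1 against direct numerical integration of h(t). The Weibull shape and scale are assumptions chosen for illustration (the paper does not report the values behind fig. 1), and the function names are hypothetical.

```python
import math

# Hypothetical parameters chosen to resemble fig. 1: a Weibull hazard away
# from the anomalous interval and a constant hazard of 4 inside it.
SHAPE, SCALE = 2.0, 1.0      # assumed Weibull shape/scale (not from the paper)
GAMMA = 4.0                  # constant hazard on (t_a, t_b]
T_A, T_B = 0.5, 1.0
DELTA = T_B - T_A


def lam(t):
    """Weibull hazard lambda(t) = (k/s) * (t/s)**(k-1)."""
    return (SHAPE / SCALE) * (t / SCALE) ** (SHAPE - 1)


def cum_lam(t):
    """Integrated Weibull hazard, (t/s)**k."""
    return (t / SCALE) ** SHAPE


def hazard(t):
    """Overall hazard h(t) of eq. (1)."""
    if t <= T_A:
        return lam(t)
    if t <= T_B:
        return GAMMA
    return lam(t - DELTA)


def survivor(t):
    """R(t) from the three cases of table 1, with F and G written out."""
    if t <= T_A:
        return math.exp(-cum_lam(t))
    if t <= T_B:
        return math.exp(-cum_lam(T_A) - GAMMA * (t - T_A))
    return math.exp(-cum_lam(t - DELTA) - GAMMA * DELTA)


def survivor_numeric(t, steps=20_000):
    """Midpoint-rule integration of h(u) over [0, t], as a cross-check."""
    width = t / steps
    total = sum(hazard((k + 0.5) * width) for k in range(steps))
    return math.exp(-total * width)
```

Comparing `survivor(t)` with `survivor_numeric(t)` at a point beyond t_b exercises all three pieces of the hazard at once; the two should agree up to the integration error.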
When h(t) has the form given in eq. (1), the density and survivor function can be rewritten as in table 1. Suppose that n observations (t_i, d_i), i = 1, ..., n, are obtained, where t_i marks the ith time until failure or censoring, and d_i is a dummy variable equal to 1 for failures and 0 for censored observations. For generality, we do not require that the anomalous interval occur at the same time or even be of the same length for all individuals, though in most applications this generality is not needed. Let n_1 equal the number of observations that fail or are censored before their anomalous interval (n_1 = #{t_i | t_i ≤ t_ai}). Let n_2 equal the number of observations that fail or are censored during their anomalous interval (n_2 = #{t_i | t_ai < t_i ≤ t_bi}). Finally, let n_3 equal the number of observations that fail or are censored after their anomalous interval (n_3 = #{t_i | t_bi < t_i}).
Table 2
Data transformation that factors the likelihood.a

Interval              'λ' sample                          'γ' sample                           N
t_i ≤ t_ai            (w_i, δ_i) = (t_i, d_i)             (not in sample)                      n_1
t_ai < t_i ≤ t_bi     (w_i, δ_i) = (t_ai, 0)              (u_i, δ_i*) = (t_i - t_ai, d_i)      n_2
t_bi < t_i            (w_i, δ_i) = (t_i - Δ_i, d_i)       (u_i, δ_i*) = (Δ_i, 0)               n_3

a t_i is the time to failure or censoring for the ith individual, d_i is 1 for failures and 0 for censored observations, t_ai is the start of the anomalous interval for the ith individual, t_bi is the end of the anomalous interval for the ith individual, Δ_i = t_bi - t_ai.
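The transformation in table 2 is mechanical, so a short sketch may help. The function below is a hypothetical helper (not the authors' software) that turns records (t, d, t_a, t_b) into the two samples; it is then applied to the week-26 case that the paper tabulates in table 6, with t_a = 25 and t_b = 26.

```python
def split_samples(records):
    """Build the two samples of table 2.

    records: iterable of (t, d, t_a, t_b), where t is the observed duration,
    d is 1 for a failure and 0 for censoring, and (t_a, t_b] is the
    anomalous interval for that individual.
    Returns (lam_sample, gam_sample) as lists of (time, indicator) pairs.
    """
    lam_sample, gam_sample = [], []
    for t, d, t_a, t_b in records:
        delta = t_b - t_a
        if t <= t_a:                       # event before the interval
            lam_sample.append((t, d))
        elif t <= t_b:                     # event inside the interval
            lam_sample.append((t_a, 0))    # censored at t_a away from the spike
            gam_sample.append((t - t_a, d))
        else:                              # event after the interval
            lam_sample.append((t - delta, d))
            gam_sample.append((delta, 0))  # survived the whole spike interval
    return lam_sample, gam_sample


# Benefits expiring in week 26, i.e. t_a = 25 and t_b = 26.
weeks = [(t, d, 25, 26) for t in (24, 25, 26, 27, 28) for d in (0, 1)]
lam_sample, gam_sample = split_samples(weeks)
```

Each record contributes exactly one row to the 'λ' sample, and only records still at risk at t_a contribute to the 'γ' sample, which is why the two sample sizes are n_1 + n_2 + n_3 and n_2 + n_3.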
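The likelihood function is

$$
\begin{aligned}
L(\lambda,\gamma) &= \prod_{i=1}^{n} r(t_i)^{d_i} R(t_i)^{1-d_i} \\
&= \prod_{t_i \le t_{a_i}} r(t_i)^{d_i} R(t_i)^{1-d_i}
\times \prod_{t_{a_i} < t_i \le t_{b_i}} r(t_i)^{d_i} R(t_i)^{1-d_i}
\times \prod_{t_{b_i} < t_i} r(t_i)^{d_i} R(t_i)^{1-d_i}.
\end{aligned}
$$

Substituting for r(t) and R(t) from table 1 yields

$$
\begin{aligned}
L(\lambda,\gamma) &= \prod_{t_i \le t_{a_i}} F(t_i)\,\lambda(t_i)^{d_i}
\times \prod_{t_{a_i} < t_i \le t_{b_i}} F(t_{a_i})\,G(t_i - t_{a_i})\,\gamma(t_i - t_{a_i})^{d_i} \\
&\quad \times \prod_{t_{b_i} < t_i} F(t_i - \Delta_i)\,G(\Delta_i)\,\lambda(t_i - \Delta_i)^{d_i}.
\end{aligned}
$$

Rearranging and collecting terms gives

$$
L(\lambda,\gamma) = \left[\prod_{i=1}^{n} F(w_i)\,\lambda(w_i)^{\delta_i}\right]
\times \left[\prod_{i:\, t_i > t_{a_i}} G(u_i)\,\gamma(u_i)^{\delta_i^*}\right]
= L_1(\lambda)\,L_2(\gamma),
\tag{2}
$$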
where (w_i, δ_i) and (u_i, δ_i*) are given in table 2. In other words, the overall likelihood function factors into a function of λ and a function of γ. Therefore, the likelihood function L(λ, γ) may be maximized by maximizing L_1(λ) and L_2(γ) separately. The separate maximizations can be effected using the transformation of the data (t, d) → (w, δ), (u, δ*) given in table 2. That is, from the original n observations, create two 'samples'. One sample 'λ' contains n =
n_1 + n_2 + n_3 observations (w_i, δ_i) and is used to estimate λ, while the other sample 'γ' contains n_2 + n_3 observations (u_i, δ_i*) and is used to estimate γ.

Table 2 can be described in words. For the 'λ' sample, which applies to the entire time axis except at the spike, there are three kinds of observations. Individuals who fail or are censored before the start of their spike interval (t_ai) are not transformed. Individuals who fail or are censored during their spike interval (t_ai, t_bi] are treated as being censored at t_ai. Finally, individuals who fail or are censored after the end of the spike interval (t_bi) are treated as failing or being censored at time t_i - Δ_i. Their time to event is reduced by Δ_i.

For the 'γ' sample, there are two kinds of observations. Individuals who fail or are censored in (t_ai, t_bi] are treated as failing or being censored at time t_i - t_ai. Individuals who fail or are censored after t_bi are known to have survived past the spike, and are treated as censored at time Δ_i.

Eq. (1) specifies a continuous hazard function with a single anomalous interval. However, it may be more appropriate to represent the 'interval' as a discrete point in time. For example, unemployment duration data are usually collected by week, and there is a sharp increase in the hazard during the final week of UI eligibility. The analogue to eq. (1) for a discrete-time spike point is
$$
h(t) =
\begin{cases}
\lambda(t) & \text{for } t \text{ before the spike point}, \\
p & \text{at the spike point}, \\
\lambda(t - \Delta) & \text{for } t \text{ after the spike point},
\end{cases}
\tag{3}
$$
where λ is a continuous hazard function, p is a discrete conditional probability of failure, and Δ is the interval of time for the spike point. Here, h(t) is a combination of discrete and continuous contributions.¹ In general, the hazard before, during, or after the spike could be either continuous or discrete.

Covariates may be incorporated into the components of h(t). One popular choice for the continuous hazard functions λ(·) and γ(·) is given by the proportional hazard model [see, e.g., Kalbfleisch and Prentice (1980)]. Here, the hazard function (1) would be given by
$$
\lambda(t) = \lambda_0(t)\exp(Z'\beta), \tag{4}
$$

$$
\gamma(t) = \gamma_0(t)\exp(X'\theta), \tag{5}
$$
where λ_0(·) and γ_0(·) are baseline hazard functions, Z and X are covariate vectors, and β and θ are parameter vectors. Note that while Z = X is permissible, β = θ is not, since it precludes likelihood factorization.

¹See Kalbfleisch and Prentice (1980, p. 8) who discuss such models.
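The factorization can be checked numerically. The sketch below is an illustration under simplifying assumptions that are not in the paper: constant baseline hazards λ_0 and γ_0, a single scalar covariate z used in both eqs. (4) and (5), and a common anomalous interval (25, 26]. It evaluates the log-likelihood once directly from the hazard of eq. (1) and once as the sum of the two factor log-likelihoods on the transformed samples of table 2. All names and parameter values are hypothetical.

```python
import math
import random

T_A, T_B = 25.0, 26.0          # common anomalous interval (assumed)
DELTA = T_B - T_A


def rates(z, lam0, beta, gam0, theta):
    """Proportional-hazard rates of eqs. (4)-(5) with constant baselines."""
    return lam0 * math.exp(z * beta), gam0 * math.exp(z * theta)


def loglik_direct(obs, lam0, beta, gam0, theta):
    """Log-likelihood built directly from the hazard of eq. (1)."""
    total = 0.0
    for t, d, z in obs:
        lam, gam = rates(z, lam0, beta, gam0, theta)
        if t <= T_A:
            cum, haz = lam * t, lam
        elif t <= T_B:
            cum, haz = lam * T_A + gam * (t - T_A), gam
        else:
            cum, haz = lam * (t - DELTA) + gam * DELTA, lam
        total += d * math.log(haz) - cum
    return total


def loglik_factored(obs, lam0, beta, gam0, theta):
    """Sum of the two factor log-likelihoods on the transformed samples."""
    total = 0.0
    for t, d, z in obs:
        lam, gam = rates(z, lam0, beta, gam0, theta)
        if t <= T_A:
            w, dw, u, du = t, d, None, 0
        elif t <= T_B:
            w, dw, u, du = T_A, 0, t - T_A, d
        else:
            w, dw, u, du = t - DELTA, d, DELTA, 0
        total += dw * math.log(lam) - lam * w          # 'lambda' factor
        if u is not None:
            total += du * math.log(gam) - gam * u      # 'gamma' factor
    return total


# Arbitrary artificial data: the identity holds for any observations.
random.seed(0)
obs = [(random.uniform(1.0, 40.0), random.randint(0, 1), random.gauss(0, 1))
       for _ in range(200)]
args = (0.05, 0.3, 0.4, -0.2)   # lam0, beta, gam0, theta (made up)
gap = abs(loglik_direct(obs, *args) - loglik_factored(obs, *args))
```

The two evaluations agree up to floating-point rounding because β enters only the first factor and θ only the second. Imposing β = θ would tie the two factors together, so they could no longer be maximized separately.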
For the discrete probability p, several common functions, such as the logit and probit, immediately suggest themselves. However, if the spike event is the discretization of a continuous process over the interval (t_a, t_b], with hazard γ(t - t_a) given by (5), then p has a different form. As shown in Prentice and Gloeckler (1978) and Kalbfleisch and Prentice (1980, pp. 36-37, 98-99),

$$
p = 1 - \Pr(T > t_b \mid T > t_a) = 1 - \exp\left[-e^{\alpha + X'\theta}\right],
\tag{6}
$$
where T is time until failure and α = log ∫_0^Δ γ_0(s) ds. Eq. (6) is the extreme-value survival function evaluated at α + X'θ, and the inverse of (6) is known as the complementary log-log link function [see, e.g., McCullagh and Nelder (1983)]. Hence, if we believe that the underlying continuous hazard function has the proportional-hazard property, then the correct discrete regression function is the extreme-value survival function, not a logit or probit.

When covariates are used, the restriction to proportional hazards models at and away from the spike with X = Z is favored over other specifications because θ and β are perfectly commensurable. In addition, likelihood ratio tests of the equality of θ and β could be computed with specialized software. Comparing β with θ when λ(·) and p are, e.g., respectively, Cox and logistic regressions, is limited to sign and significance because parameter magnitudes are not comparable when the underlying models differ.
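To make eq. (6) concrete, the sketch below codes the extreme-value regression function and its inverse, the complementary log-log link. As a rough illustration it plugs in the extreme-value coefficients that the paper reports later in table 7 and the sample means of table 4. Using the full-sample means is an assumption made here: the means of the 91 individuals at risk at the spike are not reported, so the results need not match the paper's own predicted probabilities exactly.

```python
import math


def spike_probability(alpha, x, theta):
    """Eq. (6): p = 1 - exp(-exp(alpha + x'theta))."""
    eta = alpha + sum(xj * tj for xj, tj in zip(x, theta))
    return 1.0 - math.exp(-math.exp(eta))


def cloglog(p):
    """Complementary log-log link, the inverse of eq. (6)."""
    return math.log(-math.log(1.0 - p))


# Order: age, education, nonwhite, female, married, unemployment.
THETA = [0.0091, 0.3286, 0.3951, -0.9383, 0.4659, -0.2324]   # table 7
ALPHA = -4.7809                                              # table 7 intercept
MEANS = [33.76, 11.38, 0.414, 0.137, 0.715, 8.05]            # table 4 (assumed)


def at_unemployment(rate):
    """Spike probability with the county unemployment rate set to `rate`."""
    x = MEANS[:5] + [rate]
    return spike_probability(ALPHA, x, THETA)


p_low, p_high = at_unemployment(5.5), at_unemployment(10.5)
```

Under these assumptions `p_low` comes out near 0.18 and `p_high` near 0.06, in the neighbourhood of the 0.1713 and 0.0571 reported in section 4; the remaining gap is presumably due to the choice of covariate means.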
3. Generalizations of basic models
The basic models of (1) and (3) may be generalized in a variety of ways. For example, failure behavior may vary over more than a single anomalous interval. One possibility for modeling this situation is to have several intervals with separate hazards. For example, individuals considering retirement may be eligible for both social security and a company pension. Suppose that benefits for the former start at time t_a and that benefits for the latter start at time t_b. Retirement behavior may change substantially as an individual's total retirement benefits change. One way to model this situation is to replace λ(t - Δ) with a new hazard φ(t - Δ) in eq. (1).

Different hazards for several intervals may seem to require estimation of a large number of parameters. However, this need not be the case. Consider the special case where there are two special intervals and the overall hazard function is given by
$$
h(t) =
\begin{cases}
\phi(t)\exp(Z'\beta), & t \le t_1, \\
\phi(t - t_1)\exp(X'\beta), & t_1 < t.
\end{cases}
\tag{7}
$$
The analogue to the likelihood of eq. (2) is the likelihood of a 'single' sample of n_1 + 2n_2 observations, where n_1 is the number of observations with t ≤ t_1 and n_2 is the number of observations with t_1 < t. The covariate vectors X and Z could be identical or could include interaction terms, i.e., X' = (W', 0') and Z' = (0', W'), where 0 and W are k-dimensional vectors. One could also include a dummy covariate to identify time before and after t_1. Significance tests on the elements of β could proceed in the usual fashion and could include joint likelihood ratio tests. We note that in the special case when φ(t) is constant, eq. (7) is a proportional hazard model with time-varying covariates [see, e.g., Cox and Oakes (1984)].

For any of the above models, unobserved heterogeneity could also be incorporated. For example, in eq. (4) λ_0(·) could be specified as a Weibull regression model with a random intercept drawn from a Gamma mixing distribution. This model was first proposed (without covariates) by Dubey (1968) and was applied to unemployment duration data by Lancaster (1979) and Lancaster and Nickell (1980). This model was generalized to a panel data context by Follmann and Goldberg (1988). A nonparametric version of the model, where the random intercept is drawn from an unspecified mixing distribution, was introduced by Heckman and Singer (1984).

Similarly, the discrete probability p could be specified as an extreme-value regression model with a random intercept drawn from a Gamma mixing distribution [see, e.g., Dubey (1969) or Solon and Warner (1989)]. It is of some interest to note that if the Gamma reduces to an exponential mixing distribution, the resulting 'mixed' function is the logit. Or, as for the continuous model, a nonparametric version similar to that given by Follmann and Lambert (1989) for logistic regression could be specified.
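The remark that exponential mixing yields the logit can be verified in a few lines. The sketch below assumes a particular normalization that the text does not spell out: the random intercept enters eq. (6) as a multiplicative term v on e^{α + X'θ}, with v drawn from a Gamma distribution with shape k and unit mean. Here η stands for α + X'θ and \bar p for the mixed probability.

$$
\begin{aligned}
p(v) &= 1 - \exp\left(-v\,e^{\eta}\right), \\
E_v\left[\exp\left(-v\,e^{\eta}\right)\right]
  &= \left(1 + \frac{e^{\eta}}{k}\right)^{-k}
  \quad \text{for } v \sim \mathrm{Gamma}(k,\ \text{mean } 1), \\
\bar p &= 1 - \left(1 + \frac{e^{\eta}}{k}\right)^{-k}, \\
k = 1:\quad \bar p &= 1 - \frac{1}{1 + e^{\eta}} = \frac{e^{\eta}}{1 + e^{\eta}}.
\end{aligned}
$$

With k = 1 the Gamma distribution is the unit exponential, and the mixed probability is the logistic function of η.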
If the mixing distributions for the spike and away from the spike are assumed independent, then the overall likelihood function factors into the product of the two mixed distributions. However, the assumption of independent mixing distributions may be too strong. In principle, dependent mixing distributions could be specified, but the computational advantage of independent factors would be lost. On the other hand, Han and Hausman (1986) and Manton, Stallard, and Vaupel (1986) have argued that heterogeneity is a relatively minor concern if a flexible baseline hazard function (such as that given by Cox regression) is specified.

4. Example

The data for this study are identical to those used by Katz (1986) and by Han and Hausman (1986). The data derive from Waves 14 and 15 of the University of Michigan Panel Study of Income Dynamics (PSID). The unit of observation is an individual unemployment spell. Observations are available
on the last unemployment spell experienced by the head of household during the year preceding the survey (the years in question are 1980 and 1981).² The observations are restricted to heads of household between the ages of 20 and 64 for whom no important data elements are missing. These restrictions leave a basic sample of 1055 unemployment spells, exactly the sample used by Katz (1986) and by Han and Hausman (1986).

The rules for UI eligibility varied slightly across states during the 1980-1981 period. Workers who quit or were fired for cause generally were not eligible to receive UI benefits. In addition, workers must demonstrate sufficient attachment to the labor force by meeting minimum earnings or minimum weeks worked criteria during some base period. Most states offered a basic package of 26 weeks of benefits during the period, though there were some exceptions.³

Of the 1055 individuals, 671 received some UI benefits, but 384 were not eligible for any benefits. We analyze the former group only, because the two groups should behave differently, and we do not expect spikes in the hazard function of the latter group. Moreover, for 16 individuals either the state code or the starting date of unemployment was missing. Deletion of these 16 individuals left a final sample of 655.

Table 3 presents a breakdown of the 655 individuals in the final sample. The vast majority had either 26 or 39 weeks of potential benefits. Note that the percentage still unemployed is much smaller at 39 weeks than at 26 weeks, reflecting the longer search time available to the 39-week group.

Table 4 gives the means and standard deviations of all variables used in the analysis. The average spell duration is about 16 weeks, and about 16 percent of the spells were censored. Almost half the heads of household are nonwhite, reflecting the oversampling of low-income households.

To provide a point of comparison for our new methodology, we will initially estimate a Cox regression model.
In this analysis, a 'partial' likelihood is maximized to provide estimates of β while not explicitly estimating the baseline hazard. See, for example, Kalbfleisch and Prentice (1980) for a discussion of partial likelihood for the Cox regression model. Table 5 provides
²The PSID oversampled low-income households, and thus does not strictly constitute a random sample. However, Katz (1986) reports that hazard functions are quite similar in the 'random' subsample and the 'poverty' subsample. Therefore, we will include both in order to maximize sample size.

³Louisiana and West Virginia offered 28 weeks; Massachusetts, Pennsylvania, and Washington offered 30 weeks; Wisconsin and the District of Columbia offered 34 weeks; Utah offered 36 weeks; California, Connecticut, and Hawaii offered 39 weeks. In addition, the Federal Extended Benefit Program provided extended benefits up to 39 weeks in certain states and certain weeks. Extended benefits were triggered whenever the state-insured unemployment rate exceeded a specified threshold level. Twenty-eight of the fifty states had extended benefits during some portion of the 1980-1981 period. Finally, extended benefits were offered in all fifty states, based on the national insured unemployment rate, during the period 20 July 1980 to 24 January 1981.
Table 3
Breakdown of sample.

Weeks of potential    Sample    Unemployed at      Found job during
benefits              size      start of spike     week UI expired
26                    214       47                 9
28                      6        1                 0
30                     29        5                 0
34                      8        3                 0
39                    398       35                 3
Total                 655       91                 12
Table 4
Sample statistics.

Variable name    Description                                     Mean (Standard deviation)
Duration         Spell length in weeks                           15.57 (17.70)
Censored         0 if censored, 1 if complete spell observed     0.837
Age              Age in years                                    33.76 (10.75)
Education        Years of schooling completed                    11.38 (2.09)
Nonwhite         1 if nonwhite                                   0.414
Female           1 if female                                     0.137
Married          1 if married                                    0.715
Unemployment     County unemployment rate                        8.05 (2.60)

Table 5
Estimates of Cox regression model.

Variable          Estimate a
Age                0.0127 (0.0040) b
Education          0.0010 (0.0220)
Nonwhite          -0.3036 (0.0934) b
Female            -0.0528 (0.1576)
Married            0.0811 (0.1218)
Unemployment      -0.0020 (0.0164)
Sample size        655
Log-likelihood    -3138.4

a Asymptotic standard errors appear in parentheses.
b Statistically significant at the 1 percent level.
the estimates of β. The probability of reemployment increases with age and is higher for whites than for nonwhites. The hazard for nonwhites is about 75 percent that of whites, while the hazard of a 20-year-old is about 69 percent that of a 50-year-old.

Because the baseline hazard function is arbitrary, the Cox regression model allows the conditional probability of reemployment to be much higher when UI benefits expire. Fig. 2 gives the Kaplan-Meier empirical estimate of the baseline hazard function ignoring covariates. Note the large spikes in the data at 26 and 39 weeks, the most common expiration dates of UI benefits. Implicitly, the Cox regression model is accommodating these spikes. If, however, individuals with different characteristics react differently to the expiration of UI benefits, then a simple proportional hazard model will provide an incomplete description of reemployment behavior.

To apply our new methodology, an anomalous interval for each individual must be specified. We view the week in which unemployment benefits expire (cf. table 3) as special, and will use the model given by eq. (3). For λ, a Cox regression model will be used. For p, an extreme-value regression will be used. Note that this Cox-extreme value model does not treat the spikes in the empirical hazard given at 30 and 40 weeks in fig. 2 as special. In any duration data, some weeks will appear 'spikey' due to chance. Our view is not to accord special treatment to every observed spike, but only those weeks that have an independent plausible reason for being special.

Recall that the Cox portion of the Cox-extreme value model uses data from the entire sample, in this case 655 individuals. However, the extreme-value portion of the model uses the sample of individuals still at risk at the beginning of their final week of potential benefits. Hence only 91 observations are available for estimating the extreme-value model.
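The hazard ratios quoted for the simple Cox model follow from exponentiating the table 5 coefficients. A minimal sketch of that arithmetic, with variable names chosen here for illustration:

```python
import math

# Cox coefficients as reported in table 5.
BETA_AGE = 0.0127
BETA_NONWHITE = -0.3036

# Hazard of a nonwhite individual relative to a white individual.
ratio_nonwhite = math.exp(BETA_NONWHITE)

# Hazard of a 20-year-old relative to a 50-year-old (30 years younger).
ratio_age = math.exp(BETA_AGE * (20 - 50))
```

These ratios come out at roughly 0.74 and 0.68, close to the rounded figures of about 75 and 69 percent given in the text.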
Table 6 gives the required transformation for individuals whose UI benefits expired after the 26th week. Similar transformations were applied to individuals whose UI benefits ran out during other weeks.

Table 7 presents the estimates of the Cox-extreme value model. The Cox estimates are qualitatively similar to those presented earlier. In particular, both age and race remain statistically significant. Note, however, that the signs of the coefficients on education and unemployment switch from those presented in table 5, although neither set is statistically significant. It appears that when a simple Cox regression model is used (cf. table 5), an 'averaging' of the effects at and away from the spike is taking place. Therefore, even if reemployment behavior away from the spike were of sole interest, ignoring the spike could lead to misleading estimates.

In the extreme-value model, there are two marginally significant effects. The conditional probability of reemployment, at the time UI benefits expire, increases with education and decreases with the county unemployment rate. In the Cox model, neither of these effects is significant and they have opposite
Table 6
Transformation applied to individuals whose benefits expire at the 26th week.

Original data    'λ' data      'p' data
t_i, d_i         w_i, δ_i      u_i, δ_i*
24, 0            24, 0         (none)
24, 1            24, 1         (none)
25, 0            25, 0         (none)
25, 1            25, 1         (none)
26, 0            25, 0         1, 0
26, 1            25, 0         1, 1
27, 0            26, 0         1, 0
27, 1            26, 1         1, 0
28, 0            27, 0         1, 0
28, 1            27, 1         1, 0
Table 7
Estimates of Cox-extreme value model.

Variable          Cox model a             Extreme value a
Intercept                                 -4.7809 (2.5525)
Age                0.0124 (0.0041) b       0.0091 (0.0319)
Education         -0.0037 (0.0223)         0.3286 (0.1746) c
Nonwhite          -0.3216 (0.0946) b       0.3951 (0.6362)
Female            -0.0437 (0.1591)        -0.9383 (1.1878)
Married            0.0719 (0.1234)         0.4659 (0.7485)
Unemployment       0.0013 (0.0165)        -0.2324 (0.1295) c
Sample size        655                     91
Log-likelihood    -3079.6                 -31.5

a Asymptotic standard errors appear in parentheses.
b Statistically significant at the 1 percent level.
c Statistically significant at the 10 percent level.
signs. The disparity between the effects of the covariates at and away from the spike suggests that a simple proportional hazard model provides an incomplete description of reemployment behavior. The hazards of two individuals can be used to illustrate the differences in behavior at the spike. Consider two hypothetical individuals whose UI benefits expire after 26 weeks. For an individual in a county with an unemployment rate of 5.5 and the other covariates evaluated at their means, the conditional probability of reemployment at the spike is 0.1713 with a standard error of 0.0581. Changing the unemployment rate to 10.5 and performing the same calculation provides a value of 0.0571 with a standard error of 0.0327. To
illustrate the effect of education, we compare two individuals with averaged covariates, except one is a college graduate and the other only a high school graduate. Their conditional probabilities of reemployment (standard error) are, respectively, 0.3763 (0.2141) and 0.1191 (0.0384). It appears that while individuals with higher education levels in counties with low unemployment do not become reemployed more rapidly than other individuals before UI benefits run out, they are more likely to become reemployed when UI benefits expire.

5. Conclusions

A nonparametric baseline hazard function often is used to accommodate spikes in duration data, such as the large number of unemployment spells that terminate after 26 or 39 weeks. A proportional hazard specification often is used to model the effect of observable characteristics on the hazard function. However, this specification constrains the effects of observable characteristics to be the same at the spikes as they are elsewhere throughout the time axis. We have introduced a methodology that allows the observable characteristics to have different effects over different intervals of the time axis. In addition, we have shown that, in an analysis of unemployment spells, observable characteristics indeed have different effects at the spikes and elsewhere. The hazard functions for different individuals are not proportional.⁴

The methodology presented here describes such behavior, yet does not impose a computational burden. The likelihood function for the model factors into components that may be maximized easily. Moreover, the procedure outlined in this paper may be generalized in a variety of ways, and should be of use whenever the time axis consists of intervals over which transition behavior differs qualitatively.

⁴A violation of proportionality was also observed in British data by Narendranathan, Nickell, and Stern (1985) and by Gamerman and West (1987). By interacting covariates with time itself, they found that covariate effects drifted toward zero as time evolved, leading hazard functions for different individuals to cross. However, neither paper considered spikes in the baseline hazard function, which are apparently absent from British data.

References

Burdett, Kenneth, 1979, Unemployment insurance payments as a search subsidy: A theoretical analysis, Economic Inquiry 17, 333-343.
Cox, D.R. and David Oakes, 1984, Analysis of survival data (Chapman and Hall, London).
Dubey, S.D., 1968, A compound Weibull distribution, Naval Research Logistics Quarterly 15, 179-188.
Dubey, S.D., 1969, A new derivation of the logistic distribution, Naval Research Logistics Quarterly 16, 37-40.
Follmann, Dean, Matthew Goldberg, and Laurie May, 1987, Modeling spikes in hazard rates, Research contribution 572 (Center for Naval Analyses, Alexandria, VA).
Follmann, Dean and Matthew Goldberg, 1988, Distinguishing heterogeneity from decreasing hazard rates, Technometrics 30, 389-396.
Follmann, Dean and Diane Lambert, 1989, Generalizing logistic regression by nonparametric mixing, Journal of the American Statistical Association 84, 295-300.
Gamerman, Dani and Michael West, 1987, An application of dynamic survival models in unemployment studies, The Statistician 36, 269-274.
Han, Aaron and Jerry Hausman, 1986, Semiparametric estimation of duration and competing risk models, Mimeo. (Harvard University, Cambridge, MA).
Heckman, James and Burton Singer, 1984, A method for minimizing the impact of distributional assumptions in econometric models for duration data, Econometrica 52, 271-320.
Kalbfleisch, John and Ross Prentice, 1980, The statistical analysis of failure time data (Wiley, New York, NY).
Katz, Lawrence, 1986, Layoffs, recall and the duration of unemployment, Working paper 1825 (NBER, Cambridge, MA).
Kennan, John, 1985, The duration of contract strikes in U.S. manufacturing, Journal of Econometrics 28, 5-28.
Lancaster, Tony, 1979, Econometric methods for the duration of unemployment, Econometrica 47, 939-956.
Lancaster, Tony and Stephen Nickell, 1980, The analysis of reemployment probabilities for the unemployed, Journal of the Royal Statistical Society A 143, 141-152.
Manton, Kenneth, Eric Stallard, and James Vaupel, 1986, Alternative models for the heterogeneity of mortality risks among the aged, Journal of the American Statistical Association 81, 635-644.
McCullagh, P. and J.A. Nelder, 1983, Generalized linear models (Chapman and Hall, New York, NY).
Meyer, Bruce, 1986, Unemployment insurance and unemployment spells, Mimeo. (Northwestern University, Evanston, IL).
Mitchell, Olivia and Gary Fields, 1984, The economics of retirement behavior, Journal of Labor Economics 2, 84-105.
Moffitt, Robert, 1985, Unemployment insurance and the distribution of unemployment spells, Journal of Econometrics 28, 85-102.
Mortensen, Dale, 1977, Unemployment insurance and job search decisions, Industrial and Labor Relations Review 30, 505-517.
Narendranathan, W., S. Nickell, and J. Stern, 1985, Unemployment benefits revisited, Economic Journal 95, 307-329.
Prentice, Ross and L.A. Gloeckler, 1978, Regression analysis of grouped survival data with application to breast cancer data, Biometrics 34, 57-67.
Solon, Gary and John Warner, 1989, A duration model with non-parametric duration dependence, Paper presented at SRA-ARI conference on army manpower (Arlington, VA).