Journal of Statistical Planning and Inference 47 (1995) 71 78
ELSEVIER
journal of statistical planning and inference
Modelling ordinal recurrent events D a m o n M. Berridge (;entre.[br Applied Statistics, Lancaster University, Lancaster LA 1 4 YF, UK
Abstract
An ordinal outcome is one which comprises a range of categories that are inherently ordered in some sense. An example of such an outcome is degree of supervision required (from 'not at all' to "very closely') in the current job of an individual. In addition, an individual's outcome may be recorded repeatedly over time. For example, information may be available on the level of supervision required in each and every job in which an individual had been employed. These ordinal outcomes, measured repeatedly over time, are defined to be ordinal recurrent events. The early stages of a project seeking to develop and implement statistical models for these ordinal recurrent events will be described. Suggestions for future progress of the project, including possible solutions to problems already anticipated, will be discussed. Key words': Continuation ratio; Ordinal recurrent events; N o r m a l random effects; Residual heterogeneity
1. Introduction An ordinal outcome comprises a range of categories that are inherently ordered in some way. Examples in a social science context include the level of security (from 'very secure' to 'very insecure') and degree of supervision required (from 'not at all' to 'very closely') in the current job of a survey respondent. In addition, an ordinal outcome for any one individual may be recorded repeatedly over a period of time. For example, a respondent in a social survey may be asked for full details of his/her work history. In the above examples, a respondent may be asked to say how the level of security and degree of supervision required varied with each and every job in which he/she had been employed. These ordinal outcomes, measured repeatedly for each individual, are defined to be ordinal recurrent events. We would like to ascertain how a group of explanatory variables influences the pattern of response among the ordered categories. In the above examples, interest may focus on whether perhaps there are significant gender, educational, and other differences in work histories. In other words, on whether the pattern of response among the categories of level of job security or level of supervision through time differs 01378-3758/95/$09.50 © 1995--Elsevier Science B.V. All rights reserved SSDI 0 3 7 8 - 3 7 5 8 ( 9 4 ) 0 0 1 2 2 - 7
72
D.M. Berridge /Journal of Statistical Planning and Inference 47 (1995) 71-78
significantly between the male and female respondents, differs with different levels of educational achievements, and depends on time-varying explanatory variables such as the age at which a respondent starts each job. In order to do this, we have to take into account not only ordinality within the categories, but also the usually substantial variation between individuals due to unmeasured and possibly unmeasurable variables (known as residual heterogeneity). Ignoring residual heterogeneity can lead to seriously misleading conclusions about the dependence of current behaviour on past behaviour and temporal variations in exogenous variables (for example, Massy et al., 1970; Heckman, 1981; Davies and Pickles, 1985). Binary recurrent events are just a special case of nominal and ordinal recurrent events, when the number of categories is restricted to two. Methods relating binary recurrent events to a set of explanatory variables have already been developed. One such approach handles residual heterogeneity by incorporating individual-specific normal error structure into an event-specific binary logistic model (Allison, 1987). Error terms are subsequently eliminated by applying marginal likelihood techniques. This methodology is discussed in the following section. Recurrent events comprising more than two nominal categories may be analysed using mixed Markov renewal models (for example, Heckman and Singer, 1985). In spite of the importance of ordinal measures in the social sciences, there has been little work on developing methods for the analysis of recurrent events comprising more than two ordered categories. Empirical generalised least squares (Koch et al., 1977) and generalised estimating equations (Liang and Zeger, 1986; Zeger et al., 1988) have been suggested as non-likelihood based methods of analysing such data in the physical sciences. Likelihood based models include the multivariate probit (Lesaffre and Molenberghs, 1991) and the multivariate Dale (Molenberghs and Lesaffre, 1992), a multivariate extension of the method proposed by Dale (1986) and based on the Plackett distribution. Single ordinal responses may be analysed using the continuation ratio model (McCullagh, 1980), which is a direct generalisation of logistic regression for single binary responses. In this paper it is proposed to develop methodology for ordinal recurrent events by generalising an existing model for binary recurrent events, described in the next section, in a similar manner. The technical details of this work are discussed in Section 3, while an application of this new methodology is considered in Section 4. Section 5 covers limitations of this particular generalisation and discusses the scope for further generalisation.
2. Modelling binary recurrent events Denote the outcome of the tth event for the ith individual by Ylt, which equals 0 or 1 for every i and t. There are individuals for whom there will be zero (or very low) probability of change in outcome, from one event to another. Such individuals are called 'stayers'. It is assumed here that there are no stayers. How stayers can be
D.M. Berridge / Journal o[' Statistical Planning and ln[erence 47 (1995) 71 78
73
handled is described by Barry et al. (1990). A method which may be employed to model such binary recurrent events is based on the conventional logistic model, in which each observation is assumed to be independent. The tth event associated with the ith individual contributes the term [exp(p'xi,)] y'' 1 + exp(p'xit) to the overall likelihood, were xl, is a vector of explanatory variables and p is a vector of parameter estimates. In order to take unmeasured and unmeasurable variables into account, an individual-specific random error term is added to the above model. For the sequence of outcomes of the ith individual, this random effects model has the integrated (or marginal) likelihood
Li(fl) =
f [ r~1 [exp(fl' xi~ + e')]Y'~7 . . . . z-----~5~,-- u-~lJte~og, , t + e x p t p x~, + e~ I
where f(e,) is the probability density function (mixing distribution) of the error term and Ti is the length of the sequence. If f ( . ) assumes a normal parametric form, then the likelihood integral can be evaluated numerically using Gaussian quadrature. This results in the (marginal) likelihood for the sequence of
,= 1 1 + exp(p xi, + cozq)J pq' where zq and pq, q = 1 ..... Q, are the Q fixed quadrature locations and probabilities respectively, and the scale parameter co is the unknown standard deviation of the mixing distribution. The parameters p and co, along with their corresponding standard errors, may be estimated using the statistical software package Software for the Analysis of Binary Recurrent Events, SABRE (Barry et al., 1990). Binary recurrent events may be analysed using other statistical software packages, but SABRE is the only package that has been designed specifically for the modelling of binary recurrent events, and that can handle stayers. Its development spawned a number of substantive research papers on subjects such as the labour force participation of married women (Davies et al., 1992), inter-city migration (Davies and Flowerdew, 1992) and male unemployment (Francis et al., 1989). In SABRE, two approaches are employed to create the Hessian matrix of second derivatives of the likelihood with respect to the model parameters. One method, proposed by Berndt et al. (1974) and Meilijson (1989), approximates the Hessian matrix with the variance-covariance matrix of first derivatives. The true matrix of second derivatives is used in the other approach. For more details on the benefits and
D.M. Berridge / Journal o f Statistical Planning and Inference 47 (1995) 71-78
74
drawbacks of each approach, and the technicalities involved in implementing these methods in SABRE, see Barry et al. (1990).
3. O r d i n a l r e c u r r e n t e v e n t s
H o w ordinal recurrent events may be modelled is considered in this section. It is assumed that no stayers are present in the population of interest. The issue of handling stayers when modelling ordinal recurrent events is discussed in Section 5. Consider a recurrent outcome comprising (c + 1) categories, denoted by Co ..... Cc, and coded 0,..., c, respectively. Each event of each individual comprises one of the categories Co ..... Co Thus, we have a sequence of ordinal outcomes; for example, when c = 3, 0
2
1
1
3
0
3
2
2
0.
Each ordinal response can be considered as a series of conditionally independent binary partitions of the original response, each of which can be modelled via logistic regression. The probability of the tth event resulting in outcome C~, given the outcome is one of C~..... Cc is exp(0j + f i x , ) , , j = 0 ..... 1 + exp(0j + p xit)
c-1.
The parameter 0r is the intercept corresponding to the (j + 1)th binary partition (Cj: Cj+ 1..... C~) of the ordinal response, j = 0 ..... c - 1. Denote the outcome corresponding to the (j + 1)th partition of the tth event for the ith individual by Yijt, which equals 0 or 1 where appropriate. The above sequence of ordinal outcomes can be expressed in terms of ylit's as follows:
E v e n t no.
Y~ot Y.t Yi2t
t
l
2
3
4
5
6
7
8
9
l0
1
0 0 1
0 1
0 1
0 0 0
1
0 0 0
0 0 1
0 0 1
1
The variable Yijt equals 1 ifj equalsj(i, t), and 0 otherwise, and is undefined for values o f j greater than j(i, t), because an individual responding with outcome Cj,,,~ will not be present in partitions j(i,t) + 2 ..... c,j(i,t) = 0 ..... c - 2. Only the three binary variables Yiot, Yil, and Yizt are required to fully specify all possible outcomes along the four-category response. To handle residual heterogeneity, individual-specific random error can be incorporated into the above framework. As in the binary case, we can use
D.M. Berridge / Journal o[" Statistical Planning and Inference 47 (1995) 71 78
75
Gaussian quadrature for the numerical evaluation of the above likelihood integral, assuming that f ( . ) takes a normal parametric form, leading to the (marginal) likelihood
L~(fl) = qy='l
t=lk=O
I. 1 at- e x p ( 0 k
+ fltxit 4- (y.)Zq)
Pq'
(1)
where zq and pq, q = 1 . . . . . Q are the Q fixed quadrature locations and probabilities respectively, and the scale parameter o) is the unknown standard deviation of the mixing distribution. Under the assumption of normally distributed random effects, the continuation ratio structure carries over approximately to the marginal model with parameters scaled. To explain how model (1) may be fitted, first consider the simple case in which each individual has a single event, i.e. Ti = 1 for all i, so that the full likelihood corresponding to individual i with outcome Cj(~) is L,(fl) = E k:o
Li(fl) : k = o
{_exp(0k+_fiX 1 + exp(0k + f l
1 -k- e x p ( 0 k
Xit )
J'
"k- fl'Xi,) .J
j ( i ) = 0 . . . . . c -- 1,
(2) j(i)
=
c.
The original data can be transformed in such a way that model (2) can be fitted by performing logistic regression on the transformed data (Berridge, 1991). Details of a G L I M macro which performs the necessary transformations are provided by Berridge (1992). We can apply the same principles to model (1). Instead of transforming data on a single event for any one individual, information on all events relating to that individual must be transformed. For any one event t, a row is generated for each partition until a Y~kt equals one, or until the third is generated so that 22 rows of data would be generated from the original data on individual i that was presented earlier in this section. This transformation process can be applied to all other individuals in a similar manner. By running this new dataset through SABRE, as though modelling binary recurrent events, we are actually fitting model (1) to the ordinal recurrent events. An example of the application of model (1) is considered in the following section.
4. Example Part of the Social Change and Economic Life Initiative (SCELI) (ESRG 1991) invited 1000 individuals resident in Rochdale, Lancashire, to provide detailed information about their working life. Such information included the duration and type of each job in which a respondent had been employed. Here, an event is defined to be a respondent gaining employment in a new job.
76
D.M. Berridge/JournalofStat~t#al Planning and Inference 47 (1995) 71-78
Table 1 Results of fitting model (1) to SCELI data Parameter 01 0z 03 fll f12
Estimate 0.769 1.961 2.207 0.191 -- 0.033 1.213
s.e. 0.107 0.121 0.140 0.095 0.003 0.050
A subset of data comprising all work events for respondents was then extracted, excluding any work event with missing data or relating to a period of unemployment. This produced 5626 jobs on 968 cases. One of the objectives of SCELI was to assess the effects of various factors on job security. The respondents were asked to state whether each job had been very secure, fairly secure, fairly insecure or very insecure. To investigate whether the security of each job in a respondent's life depends on the gender and starting age of the respondent, we m a y employ model (1). Consider information available on the ith respondent. The vector of explanatory variables comprises a variable representing gender, denoted by x~it =
if the ith respondent is a {female '
and Xzit, which equals the age at which the ith respondent started the tth job. The parameter fll corresponds to the measure of difference in job security between males and females, which is assumed to be constant over all three partitions of the fourcategory ordinal outcome and constant over all jobs. The same assumption applies to the effect of starting age on job security, denoted by f12. Precise interpretation of these parameters is provided later in this section. The parameter Ok is the intercept associated with the (k + 1)th partition of the ordinal outcome, k = 0, 1,2, while e) is the scale parameter introduced in Section 2. The procedure outlined in the previous section is employed to transform the original data. Model (1) can then be fitted to the original data by applying SABRE to this newly generated set of data. The value of - 2x log-likelihood for the null model is 11 897.3 compared to a value of 1 l 778.0 for model (1), indicating that either or both explanatory variables have a significant effect on level of job security. Values of - 2x log-likelihood of 11 893.5 and 11 782.0 for models including only gender and only starting age respectively indicate that the effect of starting age is much more significant than the effect of gender, but that the effect of gender is almost significant at the 5% level. The estimates and corresponding standard errors of the parameters included in model (1) are presented in Table I. The estimates in Table 1 indicate that, on average, the female respondents felt more secure in their jobs than the males, and that the older a respondent is when he/she
D.M. Berridge / Journal of Statistical Planning and lnJk'rence 47 (1995) 71 78
77
starts a job, the less secure he/she feels in that job. The estimate of co is significantly different from zero, indicating that there is substantial residual heterogeneity, due to the omission of important explanatory variables (such as job characteristics in this case), which has been taken into account by incorporating random error into the modelling framework.
5. Discussion
The continuation ratio-type generalisation assumes that the mixing distribution of the error term is identical over all partitions of an ordinal outcome. This assumption is unrealistic since the number of respondents in any one partition decreases with successive partitions and the mean and variance in successive partition-specific normal distributions are likely to vary accordingly. A system of prior and posterior normal mixture distributions could be set up. The distribution corresponding to partition j would be the posterior produced from the distribution corresponding to partition (j - 1) in the light of responses in partition (j - 1) and would be the prior used to create the distribution corresponding to partition (j + 1), given the responses in partition, j, j = 2 ..... c - 2. It is implausible that a univariate error distribution in the logistic-normal model, as employed in Section 2, can represent the effects of residual heterogeneity in an effective manner, with different processes likely to govern moves out of different states. It is unlikely that a single parsimonious error distribution specification will be universally robust. Parsimonious error distributions which are robust over a range of datatypes could be identified through the use of mixed M a r k o v renewal models with explanatory variables. The tail behaviour of the normal distribution is inconsistent with 'stayers'. For example, approximately 20% of the respondents in the example believed that all the jobs in which they had been employed had the same level of security. Many empirical researchers in the social sciences have noted that parametric distributions tend to underestimate the number of stayers (for example, Spilerman, 1972). For partition j of a c-category response, two end-point parameters associated with the conditional probabilities of 'staying' in partition j or not, given that the individual is not a stayer in any of the partitions 1 ..... j - 1,j -- 1..... c - 1, could be defined. These end-point parameters could be incorporated into the likelihood framework. In general, with a single ordinal response, the jth continuation ratio partition of a c-category response is conditionally independent of the first ( j - 1) partitions, j := 1 ..... c - 1. This property of conditional independence allows the continuation ratio model to be fitted as a series of binary logistic models with independent intercepts/cut-points, each one corresponding to a different partition of the ordinal response. This property does not hold for the proportional odds model (McCullagh, 1980), and so the resulting variance -covariance matrix is more complicated. The same
78
D.M. Berridge/Journal of Statistical Planning and lnJerence 47 (1995) 71 78
problem applies to the proportional odds-type generalisation of the logistic-normal model.
References Allison, P.D. (1987). Introducing a disturbance into Logit and Probit regression models. Soc. Methods Res. 15, 355-374. Barry, J., B.J. Francis and R.B. Davies (1990). SABRE: Software for the Analysis of Binary Recurrent Events. A guide for users. CAS Publications, Centre for Applied Statistics, Lancaster University. Berndt, E.K., B.H. Hall, R.E. Hall and J.A. Hausman (1974). Estimation and Inference in Nonlinear Structural Models. Ann. Econom. Soc. Measurement 3/4, 665. Berridge, D.M. (1991). Analysis of failure time data with ordered categories of response. Ph.D. thesis, University of Reading. Berridge, D.M. (1992). Fitting the continuation ratio model using GLIM4. In: L. Fahrmeir et al., Eds., Advances in GLIM and Statistical Modelling, Lecture Notes in Statistics Vol. 78, Springer, Berlin, 27-33. Dale, JR. (1986). Global cross-ratio models for bivariate, discrete ordered responses. Biometrics 42, 909 917. Davies, R.B., P. Elias and R.D. Penn (1992). The relationship between a husband's unemployment and his wife's participation in the labour force. Oxford Bull. Econom. and Statistics 54, 145 171. Davies, R.B. and R. Flowerdew (1992). Modelling migration careers using data from a British survey. Geographical Analysis 24, 35-57. Davies, R.B. and A.R. Pickles (1985). A panel study of life cycle effects in residential mobility. Geographical Analysis 17, 199 216. E.S.R.C. (1991). The Social Change and Economic Life Initiative: An Evaluation, mimeo. Francis, B., T. Hogarth and D.J. Smith (1989). A logistic regression analysis of religion and male unemployment in Northern Ireland. Research Report, Centre for Applied Statistics, Lancaster University. Heckman, J.J. (1981). Statistical models for discrete panel data. In: C.F. Manski and D. McFadden, Eds., Structural Analysis of Discrete Data with Econometric Applications, MIT Press, 114 178. Heckman, J.J. and B. Singer (1985). Social science duration analysis. In: J.J. Heckman and B. Singer, Eds., Longitudinal Analysis of Labour Market Data, CUP, 39-110. Koch, G.G., J.J. Landis, J.L. Freeman, D.H. Freeman and R.G. Lehnen (1977). A general methodology for the analysis of experiments with repeated measurement of categorical data. Biometrics 33, 133-158. Lesaffre, E. and G. Molenberghs (1991). Multivariate probit analysis; a neglected procedure in medical statistics. Statistics in Medicine 10, 1391-1403. Liang, K.-Y. and S.L. Zeger (1986). Longitudinal data analysis using generalized linear models. Biometrika 73, 13-22. Massy, W.F., D.M. Montgomery and D.G. Morrison (1970). Stochastic Models for Buying Behaviour. MIT Press. McCullagh, P. (1980). Regression models for ordinal data (with discussion). J. Roy. Statist. Soc. B 42, 109-142. Meilijson, I. (1989). A first improvement to the EM algorithm on its own terms. J. Roy. Statist. Soc. B 51, 127 138. Molenberghs, G. and E. Lesaffre (1992). Marginal modelling of correlated ordinal data using an n-way Plackett distribution. In: L. Fahrmeir et al., Eds., Advances in GLIM and Statistical Modelling, Lecture Notes in Statistics 78, Springer, Berlin, 139 144. Spilerman, S. (1972). Extensions of the mover-stayer model. Ame. J. Sociology 78, 599 626. Zeger, S.L., K.-Y. Liang and P.S. Albert (1988). Models for longitudinal data: a generalized estimating equation approach. Biometrics 44, 1049-1060.