Transpn. Res.-A, Vol. 27A, No. 6, pp. 461-476, 1993. Printed in Great Britain.
0965-8564/93 $6.00 + .00 © 1993 Pergamon Press Ltd.
A PANEL DATA SWITCHING REGRESSION MODEL OF MOBILITY AND CAR OWNERSHIP
HENK MEURS
MuConsult, and Department of Econometrics, University of Groningen, P.O. Box 9846, 3506 GV Utrecht, The Netherlands
(Received 1 December 1991; in revised form 15 June 1992)
Abstract - The objective of this paper is to present a panel data model of car ownership and mobility. Unobserved heterogeneity is controlled for by including correlated random effects in the equations describing car ownership and mobility. A mass-points approach is adopted to control for unobserved heterogeneity. The results show that decisions concerning the first car in the household are difficult to affect; a large number of households are inclined to keep one car. Second car ownership may be more sensitive to changes in the observed contributing factors. This suggests that in The Netherlands policies aimed at changing second car ownership will be more successful than those aimed at influencing decisions concerning the first car in households. A major part of the correlation between the unobservables in the car ownership and the mobility equations is attributable to random effects. The time-variant errors of the mobility equations are not significantly correlated with car ownership decisions. This implies that mobility can only be influenced to a small extent by policy makers without measures aimed at reducing (second) car ownership.

INTRODUCTION

Most trip generation models assume that car ownership decisions are exogenous with respect to trip generation. It is assumed that actual mobility is a function of household characteristics and actual car ownership. In these models the effects of household characteristics are (controlling for the effects of car ownership) the same for households with and without cars. This assumption may not hold. Due to omission of relevant variables, for example, coefficients associated with the household characteristics may differ between car owners and households without a car. In that case the demand function depends upon the group or regime to which households belong. This type of model is known as a switching regression model. Households belong to a regime depending on the choices they make with respect to car ownership.
This choice is described by a car ownership model. Not all factors influencing car ownership and mobility decisions are observed. The unobserved characteristics, represented by error terms in the models, are the source of the complications in these models. The hypothesized presence of correlation among the unobservables implies that the estimated effects of household characteristics on mobility will be biased, if not accounted for explicitly. This is the case of a switching regression model with endogenous switching. If the unobserved variables in the car ownership model are independent of those in the trip generation model, then there is no problem in estimation of the parameters; the trip generation and the car ownership models can be estimated separately. A correctly specified switching regression model with endogenous switching allows the analyst to decompose the observed effects into a structural effect and an effect through the unobservables.

Cross-sectional estimation of these models, without random effects, is by now well known. The model is a member of a class of models that can be handled within Heckman's (1976) framework. With this approach, a car ownership model is estimated first. The estimates of this model are used to derive correction terms. These terms are introduced in the trip generation models associated with each regime. Estimation of each regime is done using OLS (Maddala, 1983).

This paper provides an extension of cross-sectional selection models to panel data. Panel data are very useful in understanding the interrelationships over time. Train and Lohrer (1982) and Mannering and Winston (1982) estimate a joint model of car ownership and car usage with a dummy variable representing previous choices. Hensher (1984) presents a model including lagged exogenous variables. Kitamura and Goulias (1988) use a selection model with lagged endogenous variables to describe car ownership, trip generation and mode choice. All these authors observe that current choices can be explained to a substantial extent by previous choices.
State dependence versus heterogeneity

High stability of car ownership and trip generation, after taking the exogenous variables into account, has two potential explanations. Unobserved heterogeneity can lead to high stability in behaviour; due to permanent unobserved effects some households are likely to remain mobile or to maintain their car ownership levels. Another explanation may be the presence of true state dependence. State dependence refers to lags in behaviour; due to costs of change, search time for alternatives, and so on, households are likely to maintain their travel patterns and their car ownership levels.

In a previous analysis by the author (Meurs, 1990) the issue of state dependence versus heterogeneity has been examined for trip generation. The general conclusion was that unobserved heterogeneity provides the main explanation for stability in trip generation after controlling for the observed exogenous variables. State dependence effects were small, implying that adjustment in trip generation to new circumstances is rather quick.

Kitamura and Bunch (1989) examined the issue of unobserved heterogeneity and state dependence in car ownership models. In their approach, significant effects of both heterogeneity and state dependence were found, if estimated in separate models. Including both effects in one model suggested that state dependence effects were significant, while unobserved heterogeneity was not. In examining the issue of state dependence versus unobserved heterogeneity, Kitamura and Bunch use an ordered response probit model in which unobserved heterogeneity is parameterized by a normal distribution. Researchers have pointed out that an erroneous choice of the distribution of unobserved heterogeneity may have considerable effects on the parameter estimates (Davies and Crouchley, 1984; Heckman and Singer, 1984). The question is whether short-term panel data are rich enough to test these two explanations.
A major problem in panel data discrete choice models is the issue of initial conditions, or the question how the first observations are generated. Heckman (1981b) proposed an ad-hoc procedure, applied by Kitamura and Bunch (1989). Some unreported tests by the author revealed that the outcomes are very sensitive to differences in the assumptions about these initial conditions. In addition, it is doubtful whether these effects can be separated, given the fact that car ownership levels hardly change. In case car ownership levels do not change, this issue cannot be examined (Heckman, 1981a). The present paper attributes stability in car ownership and trip generation to unobserved heterogeneity.

Objectives and outline

The aim of this paper is to estimate a joint model of car ownership and mobility using panel data with two extensions of the current literature:
1. The model is a joint model of car ownership and mobility controlling for unobserved heterogeneity.
2. Unobserved heterogeneity is parameterized using a nonparametric procedure (mass-points approach).

First, a model will be discussed describing the mutual relationships between car ownership and mobility. This model has a linear regression component describing trip generation of households and equations describing the propensity to own cars. The error term is decomposed into a time-invariant and a time-varying component. Second, estimation methods will be discussed. Like the cross-sectional models, correction terms can be used to take correlation between the error terms of the car ownership and trip generation equations into account. These terms are derived from discrete-choice models describing car ownership choices. In the remainder of the paper the empirical application will be presented, starting with estimation of the car ownership models, followed by a discussion of the trip generation models.

JOINT CAR OWNERSHIP AND MOBILITY MODELLING WITH PANEL DATA

The model

This section gives a general description of the model to be estimated. It is assumed that households own cars if $a_{1it} = 1$ and that they do not own cars if $a_{1it} = 0$. The index i represents households (i = 1, ..., N) and the index t denotes waves (t = 1, ..., T). The index 1 represents the decision whether or not to own at least one car. Households own cars if a continuous latent variable $a^*_{1it}$ crosses a threshold. Without loss of generality this threshold is assumed to be zero. Hence, the observed variable $a_{1it}$ is an indicator of a continuous characteristic that is unobserved. It represents the propensity to own at least one car. This propensity is assumed to be a linear function of K characteristics of household i in wave t. Households that own one car have a propensity to own two or more cars, represented by $a^*_{2it}$. Households make $y_{it}$ trips. Three regimes are identified in describing the demand for trip generation, represented by an index c:
1. Households without cars (c = 0).
2. Households with one car (c = 1).
3. Households with two or more cars (c = 2).

Car ownership choices affect trip generation because these decisions determine the regime which describes the demand for trips. Conversely, it is assumed that mobility decisions do not affect car ownership choices directly. Including direct effects of actual mobility on the propensity to own cars is not possible. This would imply that the propensity (or probability) to own cars is directly affected by actual car ownership choices. This violates the logical consistency conditions described in Maddala (1983).
Hence, the model considered in this paper is a sequential model in which households first decide on car ownership. The demand for mobility is observed after choices concerning car ownership. This introduces selectivity; the demand for mobility that a household would have as a car owner is not observed if it does not have a car.

Not all factors affecting the propensity to own cars and the demand for trip generation are observed. These unobserved effects are represented by disturbances in the models. These disturbances are decomposed into two components. One term captures the time-invariant unobserved effects, and the other term captures the time-varying unobserved effects. Hence, both the car ownership and trip generation models are random effects models. The models are extensions of the ones presented in Meurs (1989, 1990). In these papers, simple trip generation models with random effects are discussed and estimated.

The random effects switching regression model with endogenous selection is:

$$
\begin{aligned}
y^{(c)}_{it} &= X_{it}'\beta^{(c)} + \alpha^{(c)}_{i} + u^{(c)}_{it}, \qquad c = 0, 1, 2 \\
a^*_{1it} &= Z_{1it}'\gamma_1 + \delta_{1i} + \varepsilon_{1it} \\
a^*_{2it} &= Z_{2it}'\gamma_2 + \delta_{2i} + \varepsilon_{2it} \qquad \text{if } a^*_{1it} > 0,
\end{aligned}
$$

with

$$
\begin{aligned}
a_{1it} &= 1 \quad \text{if } a^*_{1it} \ge 0 \\
a_{1it} &= 0 \quad \text{if } a^*_{1it} < 0 \\
a_{2it} &= 1 \quad \text{if } a^*_{2it} \ge 0,\ a^*_{1it} > 0 \\
a_{2it} &= 0 \quad \text{if } a^*_{2it} < 0,\ a^*_{1it} > 0.
\end{aligned}
\tag{1}
$$
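The following assumptions are made about the distribution of the error terms:

$$
\begin{aligned}
&\varepsilon_{1it} \sim N(0,1), \qquad \varepsilon_{2it} \sim N(0,1) \\
&E(u^{(c)}_{it}) = 0, \qquad \mathrm{Var}(u^{(c)}_{it}) = \sigma^{2(c)}_{u} \\
&E(\alpha^{(c)}_{i}) = E(\delta_{1i}) = E(\delta_{2i}) = 0, \qquad \mathrm{Var}(\alpha^{(c)}_{i}) = \sigma^{2(c)}_{\alpha} \\
&E(\alpha^{(c)}_{i}\delta_{1i}) = \sigma^{(c)}_{\alpha\delta_1}, \qquad E(\alpha^{(c)}_{i}\delta_{2i}) = \sigma^{(c)}_{\alpha\delta_2} \\
&E(\varepsilon_{1it}u^{(c)}_{it}) = \sigma^{(c)}_{u\varepsilon_1}, \qquad E(\varepsilon_{2it}u^{(c)}_{it}) = \sigma^{(c)}_{u\varepsilon_2} \\
&E(\varepsilon_{1it}\alpha^{(c)}_{i}) = E(\varepsilon_{2it}\alpha^{(c)}_{i}) = E(u^{(c)}_{it}\delta_{1i}) = E(u^{(c)}_{it}\delta_{2i}) = 0 \\
&E(\varepsilon_{1it}\varepsilon_{2it}) = 0 \\
&E(\delta_{1i}\delta_{2i}) = 0.
\end{aligned}
\tag{2}
$$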
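To make the structure of eqns (1) and (2) concrete, the following is a minimal simulation sketch in Python. It is an illustration only, not the estimation procedure of this paper: all coefficient values, the normal draws for the time-invariant effects (which the paper deliberately leaves unspecified) and the function name `simulate_household` are assumptions introduced here.

```python
import random

def simulate_household(n_waves, rng):
    """Simulate one household under eqn (1); all numbers are hypothetical."""
    # Time-invariant effects. Normal draws are an illustrative assumption;
    # the paper makes no distributional assumption on delta_1 and delta_2.
    delta1 = rng.gauss(0.0, 1.0)
    delta2 = rng.gauss(0.0, 1.0)  # drawn independently of delta1, as in eqn (2)
    # alpha^(c) loads on the ownership effects, so the covariances between
    # alpha and delta are nonzero. Regime 0 only involves the first decision.
    loadings = {0: (3.0, 0.0), 1: (3.0, 2.0), 2: (3.0, 2.0)}
    alpha = {c: l1 * delta1 + l2 * delta2 + rng.gauss(0.0, 2.0)
             for c, (l1, l2) in loadings.items()}
    intercept = {0: 20.0, 1: 40.0, 2: 60.0}  # regime-specific beta^(c)
    slope = {0: 2.0, 1: 3.0, 2: 4.0}
    waves = []
    for _ in range(n_waves):
        x, z1, z2 = (rng.gauss(0.0, 1.0) for _ in range(3))
        eps1 = rng.gauss(0.0, 1.0)  # N(0,1) random shocks
        eps2 = rng.gauss(0.0, 1.0)
        a1_star = 0.5 + z1 + delta1 + eps1    # propensity to own >= 1 car
        a2_star = -1.0 + z2 + delta2 + eps2   # propensity to own >= 2 cars
        a1 = 1 if a1_star >= 0 else 0
        a2 = 1 if (a1 == 1 and a2_star >= 0) else 0  # only defined for owners
        c = a1 + a2                            # regime 0, 1 or 2
        # The mobility shock shares variation with the ownership shocks.
        u = 1.5 * eps1 + 1.0 * eps2 + rng.gauss(0.0, 4.0)
        trips = intercept[c] + slope[c] * x + alpha[c] + u
        waves.append({"regime": c, "a1": a1, "a2": a2, "trips": trips})
    return waves

rng = random.Random(0)
panel = [simulate_household(4, rng) for _ in range(500)]
```

Only the trips of the regime a household actually occupies are generated, which mirrors the selectivity discussed above: mobility under the other regimes is never observed.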
It is assumed that the random shocks and the random time-invariant components in the equations describing the decisions with respect to ownership of zero/one and one/two+ cars are independent. Assuming such independence approximates the findings of Meurs (1991); some slight correlation was found, but neglecting it did not change the coefficients substantially. The key assumption about the error terms concerns the potential correlation between the random components and the random shocks in the equations describing mobility and car ownership decisions. The correlation between these error terms implies that, apart from overlapping observed exogenous influences, car ownership and mobility decisions are correlated. No assumptions are made about the distributions of the random effects.

The car ownership model differs in two respects from conventional discrete choice models. First, a distinction is made between time-varying and time-invariant unobserved effects. This distinction is possible because more observations on the same households are available. Second, no distributional assumptions are made about the random effects. The absence of distributional assumptions deviates from conventional discrete-choice models. In the recent past, models with flexible functional forms have been proposed (Gaudry, 1990), with fewer restrictions than the standard discrete-choice models. The major difference with the current approach is that here not even mild restrictions are imposed on the distribution of the random effects. It is assumed that the time-varying errors in the car ownership models are normally distributed. In this respect, the model differs from the fully nonparametric estimation approaches proposed by Manski (1975) and Cosslett (1983). Here, only the time-invariant effect is estimated nonparametrically.

First, consider the correlation between the random effects.
If the random effects in the two models are correlated positively, households that are prone to have cars in all waves are also likely to be more mobile. Second, the correlation between the error terms may also be attributable to random shocks. Households that acquire a car in a certain period might also be more mobile in that period than expected on the basis of the included explanatory variables. The covariances between the random effects and the time-varying residuals can have opposite signs. For example, while car owners might on average be more mobile than expected on the basis of the included explanatory variables, the mobility in the period that these households acquire a car may be less than expected.

The implications of these correlations for policy makers and planners are different. First consider policies aimed at reducing mobility, such as increasing the costs of using cars. A considerable unobserved heterogeneity implies that mobility can only be influenced to a small extent. If the correlation among the random effects in the two equations is high, this suggests that a substantial part of stability in behaviour can be attributed to car ownership decisions. Policies affecting both mobility and car ownership will be more effective than expected on the basis of the included explanatory variables.

Next, consider policies aimed at making car ownership less attractive. Households that persist in keeping their cars will be prone to own cars due to other observed circumstances, e.g. high income households, or will have a high unobserved component. The marginal car owners may react to the policy and reduce their car ownership levels. If the random effects are positively correlated, this implies that more mobile households will keep their cars. The effects of policies will not be as large as expected on the basis of direct effects of car ownership on mobility.

For panel data models the same estimation procedures can be adopted as for cross-sectional models. However, the panel data models described in this section require more correction terms. For households without a car one has two correction terms: one term to control for the correlation between the random effects in the trip generation and the car ownership models and one to control for the correlation between the random shocks in these models. For households with one or two or more cars, four correction terms are required: two terms for the correlation between the random effects and two for the correlation between the random shocks. In the next section these procedures will be discussed. First, a random effects car ownership model will be presented. Next, estimation of the trip generation equations will be discussed.

ESTIMATION
Car ownership

For simplicity of notation, consider the situation in which households choose between zero cars ($a_{it} = 0$) and one car ($a_{it} = 1$). If one has panel data to estimate a car ownership model, a sequence $S_i$ can be observed for each household, where $S_i$ is a sequence of T values on the variable $a_{it}$. The probability that a sequence $S_i$ is observed can be written as (Heckman, 1981a):

$$
P(S_i) = \int \prod_{t=1}^{T} \left[\Phi(Z_{it}'\beta + \delta_i)\right]^{a_{it}} \left[1 - \Phi(Z_{it}'\beta + \delta_i)\right]^{1 - a_{it}} \, dF(\delta_i).
\tag{3}
$$

This yields a marginal likelihood function. The unknown variables $\delta_i$ are integrated out. The distribution $F(\delta)$ of $\delta_i$ is called a mixing distribution. The use of a marginal likelihood function is allowed if the $\delta_i$'s are independent of the exogenous variables. In that case it is sufficient to use the parameters describing the distribution function $F(\delta)$ in the likelihood; the individual values of $\delta_i$ are of no concern.

There are two main approaches to modelling the mixing distribution $F(\delta)$. We may a priori assume a parametric form. The approach often adopted in connection with probit models is one proposed by Bock and Lieberman (1970) and Butler and Moffitt (1982), and applied by Kitamura and Bunch (1989). They assume that $\delta_i \sim N(0, \sigma^2_\delta)$. The integral is evaluated using Gaussian quadrature. The disadvantage of using such parametric approaches is that the outcomes are sensitive to the specification of the mixing distribution (Heckman and Singer, 1984; Davies and Crouchley, 1984).

In the recent past, proposals that avoid the use of an a priori specified distribution for the nuisance parameters have been made and implemented. These proposals are based on the conditions of Kiefer and Wolfowitz (1956) for consistent estimation of the structural parameters without distribution assumptions. The $\delta_i$ are, unlike in the fixed effects case, not arbitrary constants, but independently and identically distributed random variables with a distribution function F, unknown to us. This distribution F can be estimated as an empirical distribution function, given the fact that for each individual more observations are available. It is referred to as a nonparametric approach. Laird (1978) and Lindsay (1983a and b) have shown that, under rather general conditions, a finite number of mass points (the support size) still allows consistent ML-estimation of $\beta$. The contribution of individual i to the likelihood can be written as:

$$
L_i(\beta, \pi, \xi) = \sum_{m=1}^{M} \pi_m \prod_{t=1}^{T} \left[\Phi(Z_{it}'\beta + \xi_m)\right]^{a_{it}} \left[1 - \Phi(Z_{it}'\beta + \xi_m)\right]^{1 - a_{it}},
\tag{4}
$$
where $\xi_m$ is the location of mass point m and $\pi_m$ is its mass, with

$$
\sum_{m=1}^{M} \pi_m = 1.
$$

So, the integration in eqn (3) is replaced by a finite summation. These mass points do not reflect finite divisions within the population; they are points that capture the characteristics of the distribution function $F(\delta)$. In contrast to the Gaussian-quadrature approach, we can estimate the locations and weights necessary to capture heterogeneity, while the use of the normal distribution implies certain optimal weights and locations. Experience with this approach shows that in general only a limited number of mass points is needed (Laird, 1978; Heckman and Singer, 1984; Davies and Crouchley, 1984).

Estimates of the parameters $\pi_m$, $\xi_m$ and $\beta$ are obtained by maximizing the likelihood function $\ln L = \sum_{i=1}^{N} \ln L_i$, under the restrictions that $\sum_{m=1}^{M} \pi_m = 1$ and that $\pi_m \ge 0$, m = 1, ..., M. These constraints can be incorporated by adding a Lagrangian term to the likelihood function. It is also possible to write the mass $\pi_m$ as a logit-function:

$$
\pi_m = \frac{\exp(A_m\gamma)}{\sum_{j=1}^{M} \exp(A_j\gamma)},
\tag{5}
$$
where $A_m$ are characteristics associated with mass point m. Substitution of eqn (5) into eqn (4) yields the likelihood function that will be maximized. The advantages of such a transformation are that constrained optimization can be avoided and that one can model the masses as a function of known time-invariant variables.

The main advantage of the nonparametric approach is that consistent estimates of the structural parameters are obtained without any assumptions about the distribution of the nuisance parameters. Hence, this model allows protection against the consequences of misspecification of discrete choice models resulting from incorrect assumptions about the mixing distribution. The estimate of the mixing distribution, characterized by the locations and probabilities of the mass points, is also consistent (Kiefer and Wolfowitz, 1956). This suggests that an interpretation could be given to these outcomes, rather than merely treating the estimate of the mixing distribution as uninteresting from a substantive point of view.

However, the mixing distribution can be overemphasized. Although the nonparametric ML-estimator of the mixing distribution is consistent, no asymptotic distribution theory has been derived (Lancaster, 1990). This implies that no standard errors for the estimates of the distribution function can be obtained. This may be an important limitation, since small changes in the locations can be offset by changes in the probabilities without substantially affecting the likelihood function. One way to proceed is to start out with estimating the parameters and the mass points adopting the approach outlined above. Then a parametric distribution can be used that appears to yield structural estimates that are not too far from the estimates obtained with this mass-points approach.

A question is: how many mass points are needed to characterize the mixing distribution?
This issue reflects the central difference between the use of a finite mixture and the mass-points approach adopted in this paper; the support size is an additional unknown parameter. Lindsay (1983a and b), Heckman and Singer (1984) and Lancaster (1990) suggest usage of the Gâteaux derivative, which is the derivative of the likelihood function at some mixing distribution that is claimed to be optimal in the direction of a mixing distribution with all mass in one point. This is not a straightforward procedure. A simple alternative is to start an analysis with one point and continue to add new points until a new point coincides exactly with a point already obtained. This procedure is adopted here.
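The mass-point likelihood of eqn (4), with the masses written in the logit form of eqn (5), can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Gauss code used for the results in this paper: the function names are hypothetical, and `theta` stands for the index $A_m\gamma$ of eqn (5) in the special case where each mass point has its own constant.

```python
import math

def std_normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def logit_masses(theta):
    """Eqn (5): masses that are nonnegative and sum to one by construction."""
    top = max(theta)  # subtract the maximum for numerical stability
    weights = [math.exp(t - top) for t in theta]
    total = sum(weights)
    return [w / total for w in weights]

def household_likelihood(beta, locations, masses, Z_i, a_i):
    """Eqn (4) for one household.

    Z_i is a list of T covariate rows, a_i a list of T outcomes (0 or 1).
    """
    value = 0.0
    for xi_m, pi_m in zip(locations, masses):
        sequence_prob = 1.0
        for z_t, a_t in zip(Z_i, a_i):
            index = sum(b * z for b, z in zip(beta, z_t)) + xi_m
            p = std_normal_cdf(index)
            sequence_prob *= p if a_t == 1 else 1.0 - p
        value += pi_m * sequence_prob
    return value

def log_likelihood(beta, locations, theta, Z, a):
    """ln L: sum of the log contributions over all households."""
    masses = logit_masses(theta)
    return sum(math.log(household_likelihood(beta, locations, masses, Z_i, a_i))
               for Z_i, a_i in zip(Z, a))

# Two households, two waves, one covariate (toy data).
Z = [[[1.0], [1.0]], [[1.0], [1.0]]]
a = [[1, 0], [1, 1]]
ll_one_point = log_likelihood([0.0], [0.0], [0.0], Z, a)
# A second point at the same location leaves the likelihood unchanged.
ll_duplicate = log_likelihood([0.0], [0.0, 0.0], [0.3, -0.2], Z, a)
```

The last two lines illustrate why coinciding locations serve as a stopping rule: a new mass point that lands exactly on an existing one only splits that point's mass and cannot raise the likelihood. In practice `log_likelihood` would be handed to a numerical optimizer over `beta`, `locations` and `theta`.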
Estimation of the trip generation model

In this section we turn to estimation of the trip generation model. As argued, it is a switching regression model with an endogenous selection rule. The main difficulty in the present application is that random effects are included in both the car ownership model and in the model describing trip generation. Hausman and Wise (1979) describe a selection model with an error component structure for the regression, but no separate random effect in the selection equation. Ridder (1990a) shows that their model does not allow for history-dependence, i.e. the probability of having a car in wave t does not depend upon car ownership levels in previous waves. This implausible assumption is relaxed by Ridder (1990a and b). He shows that a panel equivalent of selectivity-correction in cross-sectional models can be used. Instead of one correction term, panel data models with random effects require two terms: one to take the correlation between unobserved heterogeneity into account and one for the correlation between the random shocks. He derives the conditional distributions required to obtain correction factors. These distributions will be used in the sequel.

In order to simplify estimation, assume that the individual effects and random errors in the regression are linearly related to the effects in the equations describing decision making with respect to car ownership. For the random effects, assuming linear conditional expectations for simplicity, we have:

$$
\begin{aligned}
\text{for noncar owners:} \quad & E(\alpha^{(0)}_{i} \mid \delta_{1i}) = \lambda^{(0)}_{1}\delta_{1i} \\
\text{for households with one car:} \quad & E(\alpha^{(1)}_{i} \mid \delta_{1i}, \delta_{2i}) = \lambda^{(1)}_{1}\delta_{1i} + \lambda^{(1)}_{2}\delta_{2i} \\
\text{for households with 2+ cars:} \quad & E(\alpha^{(2)}_{i} \mid \delta_{1i}, \delta_{2i}) = \lambda^{(2)}_{1}\delta_{1i} + \lambda^{(2)}_{2}\delta_{2i}.
\end{aligned}
\tag{6}
$$

With respect to the random shocks, we have:

$$
\begin{aligned}
\text{for noncar owners:} \quad & E(u^{(0)}_{it} \mid \varepsilon_{1it}) = \eta^{(0)}_{1}\varepsilon_{1it} \\
\text{for households with one car:} \quad & E(u^{(1)}_{it} \mid \varepsilon_{1it}, \varepsilon_{2it}) = \eta^{(1)}_{1}\varepsilon_{1it} + \eta^{(1)}_{2}\varepsilon_{2it} \\
\text{for households with 2+ cars:} \quad & E(u^{(2)}_{it} \mid \varepsilon_{1it}, \varepsilon_{2it}) = \eta^{(2)}_{1}\varepsilon_{1it} + \eta^{(2)}_{2}\varepsilon_{2it}.
\end{aligned}
\tag{7}
$$

In the remainder of this section, models will be described for households with one car. For the households with no cars or with two cars, similar models can be derived using eqns (6) and (7). For households with one car, expected mobility is given by:

$$
\begin{aligned}
E(y_{it} \mid a^*_{1it} \ge 0, a^*_{2it} < 0) ={}& X_{it}'\beta^{(1)} + E(\alpha^{(1)}_{i} \mid a^*_{1it} \ge 0, a^*_{2it} < 0) + E(u^{(1)}_{it} \mid a^*_{1it} \ge 0, a^*_{2it} < 0) \\
={}& X_{it}'\beta^{(1)} + \lambda^{(1)}_{1} E(\delta_{1i} \mid a^*_{1it} \ge 0, a^*_{2it} < 0) + \lambda^{(1)}_{2} E(\delta_{2i} \mid a^*_{1it} \ge 0, a^*_{2it} < 0) \\
&+ \eta^{(1)}_{1} E(\varepsilon_{1it} \mid a^*_{1it} \ge 0, a^*_{2it} < 0) + \eta^{(1)}_{2} E(\varepsilon_{2it} \mid a^*_{1it} \ge 0, a^*_{2it} < 0).
\end{aligned}
\tag{8}
$$
For eqn (8), we must find the conditional expectations of the random effects $\delta_1$ and $\delta_2$ for households with only one car. The mass-points approach, used in the previous section, provides us with the locations and masses (probabilities) of the unconditional $\delta_1$ and $\delta_2$. First, consider the conditional distribution of $\delta_1$, to be used to derive the conditional mean of mobility of households with one car. The distribution, derived by Ridder (1990a), associated with decisions with respect to first car ownership is, after adapting his formula to discrete distributions:

$$
f(\xi_{1m} \mid a_{1i}) = \frac{\pi_{1m} \prod_{t=1}^{T} \Phi\left[\tilde{a}_{1it}(Z_{1it}'\gamma_1 + \xi_{1m})\right]}{\sum_{k=1}^{M} \pi_{1k} \prod_{t=1}^{T} \Phi\left[\tilde{a}_{1it}(Z_{1it}'\gamma_1 + \xi_{1k})\right]},
\tag{9}
$$

where:
$a_{1i}$ is a T x 1 vector with elements $a_{1i1}, \ldots, a_{1iT}$, where $a_{1it} = 1$ if household i has a car in wave t;
$\tilde{a}_{1it} = 2a_{1it} - 1$;
$\xi_{1m}$ is the location of mass point m (m = 1, ..., M) associated with $\delta_{1i}$;
$\pi_{1m}$ is the probability associated with mass point m.

Equation (9) can also be used for the distribution of $\delta_{2i}$. The entire observed history of car ownership levels must be used, because households that have one car in all waves will have a high $\delta_{1i}$ if $Z_{1it}$ is low. In other words, the entire history contains information about $\delta_{1i}$ which must be used. From eqn (9), describing a discrete distribution, the conditional expectation can be derived:

$$
E(\delta_{1i} \mid a^*_{1it} \ge 0, a^*_{2it} < 0) = \sum_{m=1}^{M} \xi_{1m} \, f(\xi_{1m} \mid a_{1i}).
\tag{10}
$$
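Equations (9) and (10) amount to a Bayes update of the mass-point probabilities given a household's ownership history, followed by a weighted mean of the locations. The Python sketch below illustrates this for the first car decision; the function names and the toy numbers are assumptions made for the illustration and are not estimates from this paper.

```python
import math

def std_normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def posterior_masses(gamma, locations, masses, Z_i, a_i):
    """Eqn (9): probability of each mass point given the ownership history.

    Z_i is a list of T covariate rows, a_i a list of T outcomes (0 or 1).
    """
    weights = []
    for xi_m, pi_m in zip(locations, masses):
        w = pi_m
        for z_t, a_t in zip(Z_i, a_i):
            sign = 2 * a_t - 1  # +1 for owners, -1 for non-owners
            index = sum(g * z for g, z in zip(gamma, z_t)) + xi_m
            w *= std_normal_cdf(sign * index)
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]

def conditional_mean_effect(locations, posterior):
    """Eqn (10): expected random effect given the ownership history."""
    return sum(xi * f for xi, f in zip(locations, posterior))

# Toy example: two mass points, four waves, one covariate with zero effect.
locations, masses, gamma = [-1.0, 1.0], [0.5, 0.5], [0.0]
Z_i = [[1.0]] * 4
post_owner = posterior_masses(gamma, locations, masses, Z_i, [1, 1, 1, 1])
post_never = posterior_masses(gamma, locations, masses, Z_i, [0, 0, 0, 0])
mean_owner = conditional_mean_effect(locations, post_owner)
mean_never = conditional_mean_effect(locations, post_never)
```

A household that owns a car in every wave is assigned most of its posterior mass at the high location, and a household that never owns one at the low location, which is the sense in which the entire history is informative about the random effect.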
Ridder (1990a) also derives the conditional distribution of $\varepsilon_{1it}$. From that distribution we can derive the conditional expectation as:

$$
E(\varepsilon_{1it} \mid a^*_{1it} \ge 0, a^*_{2it} < 0) = \sum_{m=1}^{M} \Phi(Z_{1it}'\gamma_1 + \xi_{1m}) \, f(\xi_{1m} \mid a_{1i}).
\tag{11}
$$

Substitution of eqns (10) and (11) into eqn (8) provides us with the correction terms. The resulting equation can be estimated with OLS. However, there are two problems with OLS estimation of the resulting model. First, the correction term depends upon the estimate $\hat{\gamma}$, introducing an additional source of random variation in the final regression. Second, the conditional variance depends upon $Z_{it}$, introducing heteroscedasticity. The resulting disturbance term has a complex distribution. A relatively simple alternative to OLS estimates of the standard errors associated with the estimates $\hat{\beta}$ is to use the matrix proposed by White (1984). This provides us with asymptotically correct standard errors. Define

$$
e_{it} = y_{it} - X_{it}'\hat{\beta} - \hat{\lambda} C_{1it} - \hat{\eta} C_{2it},
$$

where $C_{1it}$ and $C_{2it}$ are the correction terms used in the trip generation model. Then $\hat{V}$, the estimated variance-covariance matrix of $\hat{\beta}$, is given by:

$$
\hat{V} = \left(\sum_{t=1}^{T}\sum_{i=1}^{N} X_{it}X_{it}'\right)^{-1} \left(\sum_{t=1}^{T}\sum_{i=1}^{N} e_{it}^{2}\, X_{it}X_{it}'\right) \left(\sum_{t=1}^{T}\sum_{i=1}^{N} X_{it}X_{it}'\right)^{-1}.
$$
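The heteroscedasticity-consistent covariance matrix has the familiar sandwich form, which can be sketched with NumPy as follows. The function name `white_covariance` and the layout of the inputs (all household-wave observations stacked into one matrix, with the correction terms included among the columns) are assumptions of this illustration. The sketch addresses the heteroscedasticity problem only; it does not account for the sampling variation in the first-stage car ownership estimates.

```python
import numpy as np

def white_covariance(X, residuals):
    """Sandwich estimator (X'X)^-1 [sum e^2 x x'] (X'X)^-1.

    X         : (n_obs, k) stacked regressors, one row per household-wave
    residuals : (n_obs,) OLS residuals e_it from the corrected regression
    """
    X = np.asarray(X, dtype=float)
    e = np.asarray(residuals, dtype=float)
    bread = np.linalg.inv(X.T @ X)
    meat = X.T @ (X * (e ** 2)[:, None])  # sum of e_it^2 * x_it x_it'
    return bread @ meat @ bread

# Usage on toy data: OLS fit followed by robust standard errors.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=200) * (1.0 + np.abs(X[:, 1]))
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
V_hat = white_covariance(X, y - X @ beta_hat)
robust_se = np.sqrt(np.diag(V_hat))
```

When all squared residuals are equal to one, the middle term collapses to X'X and the expression reduces to the inverse of X'X, which is a convenient check on an implementation.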
DATA

Data are from the Dutch Mobility Panel. In the present application, only 4 waves are used, held in spring 1984-1987. Information is available about characteristics of households and of individuals of 12 years and older belonging to the selected households. In each wave, all household members of 12 years and older were asked to keep a seven-day travel and activity diary. Not all households or persons remained in the sample for the full period. This raises the issue of attrition. A high attrition rate is present in the data; 30% of the households drop out between wave 1 and 2. Attrition may bias the coefficients of regression models. Ridder (1990b) concluded in an analysis of attrition for the Dutch Mobility Panel that the effects of attrition do not appear to affect the slopes in trip generation models. This is a surprising result, since many discussions around selectivity deal with potential problems in estimating these regression coefficients. Although his analysis requires further study, this paper will not address the issue of attrition. Further information about the Dutch Mobility Panel and about empirical analyses that have been carried out can be found in van Wissen and Meurs (1989).

Household car ownership is a dependent variable in this paper. Table 1 presents the amount of change in car ownership. These changes are pooled over the waves in order to simplify the table. There is some change in the car ownership levels of individual households, but most households do not change their levels. This implies that it is unlikely that the data allow for a separation of the spurious and true state-dependence effects.

Table 1. Turnover in car ownership (N = 668)

                          Wave t
Wave t-1      0 cars     1 car    2+ cars     total
0 cars           372        42          2       416
1 car             40      1280         54      1374
2+ cars            3        50        161       214
total            415      1372        217      2004

Trip making by a household is defined as the total number of trips made by all household members of 12 years and older over a seven-day period. Car trips concern trips made as driver, and transit refers to train, bus, tram and metro. Table 2 presents the means and variances of the mobility variables. A sharp decrease in trip making from wave one to three may be observed. This may be due to attrition and measurement biases from respondents not maintaining an accurate description of their mobility (Golob and Meurs, 1986; Meurs et al., 1989).

With respect to the variables used in the analysis, a number of comments can be made. Household income (in 1000 guilders) is created as a continuous variable from a 10 point income scale in the questionnaires.
Means are taken at the individual level (from individual incomes) and added to get household income. The income estimates obtained in this way are checked against the questions about household income in the questionnaire. A car is defined to be a business car if households register it as being one in the car questionnaire. The number of workers in the household is defined as the number of adults working more than 20 hours per week. Car accessibility represents the difference in logsums of households with one car and households without cars obtained from models describing mode choice to work. This difference can be regarded as an indicator of car accessibility to work relative to public transport accessibility.

Table 2. Descriptive statistics of mobility indicators (N = 668)

               Wave 1              Wave 3              Wave 5              Wave 7
Mobility    Mean  Variance     Mean  Variance     Mean  Variance     Mean  Variance
Total      57.88   1019.31    54.21    879.65    53.97    881.25    52.33    870.51
Car        17.93    218.58    16.43    191.90    16.85    199.47    17.43    217.38
Transit     2.74     33.26     2.50     30.82     2.57     29.14     2.53     27.37

PANEL DATA CAR OWNERSHIP MODELS

Since the simplifying assumption was made that households make independent decisions with respect to 0/1+ cars and 1/2+ cars, we will discuss the results separately. Estimation is done using the MAXLIK-routine in Gauss. Analytical gradients were used in combination with the Berndt-Hall-Hall-Hausman (BHHH) approach to obtain estimates of the information matrix (Berndt et al., 1974).

Consider estimation of the model describing choices concerning owning 0 or 1+ cars (see Table 3). Models were estimated with 2, 3 and 4 mass points. The location of the fourth mass point is precisely equal to the location of one of the other mass points. Therefore, it can be concluded that only three mass points were required to characterize the distribution of unobserved heterogeneity; the data are not sufficiently rich to obtain more mass points. For comparison, we also included the model with two mass points.

Consider the variables representing age and age squared of the head of the household. These parameters suggest that age of the head of the household has an inverse U-shaped effect on the propensity to own a car. Households have, in the period 1984-1987, a maximum propensity to own cars when the head is 37 years old. Households in which the head has a low education or a high education (university degree) tend to have fewer cars than the middle groups. These effects were not significant in the model describing ownership of the second car.

Income has the expected positive sign. Households with higher incomes tend to have more cars. This effect is also significant in the equation describing decision making with respect to the second car, despite a variable capturing the effects of the number of workers. This indicates that not only a higher number of workers is important in explaining second car ownership, but also the household income.

Of course, license holding is important in explaining car ownership levels. It is assumed that license holding is exogenous with respect to car ownership levels. Especially for households consisting of older couples, this may be a simplification. If car accessibility is high in comparison with public transport accessibility, households are more prone to own cars.
This effect is especially important in decisions with respect to the ownership of second cars. However, accessibility may not be an exogenous variable; households might choose residential location and car ownership levels simultaneously. Households with more members are less prone to own cars. This result was found by Kitamura and Bunch (1989) as well. Two possible reasons may explain this effect. First, there may be a cohort effect associated with life cycle. Second, the increase in household size may imply greater essentials such as food, clothing and housing. This reduces the amount of financial resources for expenditures on cars (Lerman and Ben Akiva, 1975). Households size was not significant in decision making with respect to the second car. Finally, households with more workers tend to have more cars. The number of workers does not significantly affect decisions about the first car.
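The inverse U-shaped age effect implies a turning point at the age where the derivative of b_age * age + b_age_sq * age^2 is zero, that is at -b_age / (2 * b_age_sq). The following is a minimal sketch, assuming that the rounded 3 mass-points coefficients of Table 3 are the relevant ones; the function names are illustrative and do not come from the paper. Because the squared-age coefficient is printed with three decimals only, the rounded values cannot reproduce the reported peak of 37 years exactly, but the coefficient implied by that peak does round to the printed -.001.

```python
def turning_point(b_age, b_age_sq):
    """Age at which b_age*age + b_age_sq*age**2 reaches its maximum."""
    if b_age_sq >= 0:
        raise ValueError("an inverse U-shape needs a negative squared-age coefficient")
    return -b_age / (2.0 * b_age_sq)


def implied_sq_coefficient(b_age, peak_age):
    """Squared-age coefficient that would put the maximum exactly at peak_age."""
    return -b_age / (2.0 * peak_age)


# Rounded coefficients as printed in Table 3 (3 mass points); assumed to be
# the estimates the text refers to.
b_age = 0.085
b_age_sq_printed = -0.001

peak_from_rounded = turning_point(b_age, b_age_sq_printed)   # about 42.5 years
b_age_sq_implied = implied_sq_coefficient(b_age, 37.0)       # about -0.00115

print(peak_from_rounded, b_age_sq_implied)
```

The gap between 42.5 and 37 years is therefore compatible with rounding of the squared-age coefficient alone.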
Table 3. Result of 0/1+ choice of car ownership using mass points (N = 668)
                            2 Mass-Points             3 Mass-Points
Variable                    coefficient   t value     coefficient   t value
age                              .067       2.42           .085       2.57
age squared                     -.001      -2.31          -.001      -2.32
low education                   -.715      -4.77          -.719      -3.24
high education                  -.739      -4.32          -.722      -2.91
income                           .020       3.99           .024       3.60
number of licenses              2.503      31.01          2.868      27.96
car accessibility                .304       2.69           .585       3.48
household size                  -.123      -2.28          -1.39      -1.96
Mass-Point 1: mass                .20                       .24
              location          -5.88                     -6.68
Mass-Point 2: mass                .80                       .04
              location          -2.62                    -10.03
Mass-Point 3: mass                  -                       .72
              location              -                     -3.78
Log-likelihood                -521.31                   -478.64
In going from 2 to 3 mass points, the coefficients associated with the included explanatory variables increase; this can be explained by better control for heterogeneity. Using discrete distribution theory, we can calculate the moments of the mass-points distribution. The mean of the 3 mass-points model equals -4.73 and the estimated variance equals 2.69. The 3 mass-points model shows that one point, with about 72% of all mass, has a relatively large coefficient. This indicates that a substantial number of households are relatively prone to have a car after controlling for observed characteristics. This implies that it is likely that these households will continue to own cars, even if the observed characteristics change. A relatively small amount of mass (4%) is concentrated in a point with a large negative location. This represents households that tend to own no cars, even if their observed characteristics are the same as those of households that have cars. Therefore, a relatively small group tends to own no cars, even if their incomes, licence holding and other characteristics lead to a high probability of car ownership. About a quarter of the mass is in the mid-range. This characterizes households that require favorable circumstances in order to buy a car. This pattern of heterogeneity is consistent with the J-shaped form that would arise if a beta-distribution is used to characterize heterogeneity. It can be concluded that the use of a normal mixture is less appropriate for car ownership choice. For policy purposes, this implies that it will be relatively difficult to influence car ownership, since a relatively large group of households display an unexplained tendency towards car ownership.
Next consider the model describing the choice to own one car or more than one. An important characteristic in decision making with respect to ownership of 2+ cars is that the first car frequently is a business car. The ownership of a business car is introduced as a dummy variable in the equation describing decision making with respect to owning 2+ cars. It is assumed that the presence of such a car is exogenous, as part of the labor agreement. Another way of dealing with the ownership of business cars is to consider only decision making with respect to cars bought by the household. This approach (see for example De Jong, 1989) is not followed here since business cars are frequently used for private purposes as well. It is assumed that the business car, although exogenously presented to the household, may have a genuine effect on decision making with respect to the second car. The results associated with the model in which heterogeneity is parameterized using mass points are displayed in Table 4. Again, 3 mass points were sufficient to characterize
Table 4. Choice 1/2+ cars with mass points to characterize the mixing distribution
                            2 Mass-Points             3 Mass-Points
Variable                    coefficient   t value     coefficient   t value
age                              1.54       2.76           .166       2.89
age squared                     -.002      -2.94          -.002      -2.99
income                           .026       5.35           .027       5.13
number of licences               .689       7.83           .649       6.96
accessibility to work            .794       4.72           .820       4.68
business car                    1.089       7.08          1.112       6.48
number of employees              .249       2.56           .347       3.11
Mass-Point 1: mass                .29                       .11
              location          -7.11                    -16.14
Mass-Point 2: mass                .71                       .20
              location          -9.66                     -7.50
Mass-Point 3: mass                  -                       .69
              location              -                     -9.99
Log-likelihood                -475.43                   -474.09
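The gain from adding a third mass point can be summarized as twice the difference in log-likelihood between the 3 and 2 mass-points specifications. The sketch below uses the log-likelihood values printed in Tables 3 and 4; the function name is illustrative. The statistic is meant as a descriptive comparison only, since the usual chi-square reference distribution is not guaranteed to apply when the number of support points of a mixing distribution is compared.

```python
def lr_statistic(loglik_restricted, loglik_unrestricted):
    """Twice the log-likelihood gain of the less restricted model."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


# Log-likelihoods as printed in Table 3 (0/1+ cars) and Table 4 (1/2+ cars).
lr_first_car = lr_statistic(-521.31, -478.64)    # 2 vs 3 mass points, 0/1+ cars
lr_second_car = lr_statistic(-475.43, -474.09)   # 2 vs 3 mass points, 1/2+ cars

print(round(lr_first_car, 2), round(lr_second_car, 2))
```

On these figures the third mass point improves the fit of the first-car model far more than that of the second-car model.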
Table 5. Random-effects switching regression model of total trip generation
                                0 cars                  1 car                   2+ cars
                                coefficient   t value   coefficient   t value   coefficient   t value
constant                           -6.054     -1.69        3.474        .22      -22.431       -.79
household income                    -.014      -.20         .066       1.75        -.138      -1.35
no. of persons > 11                18.892     15.18       25.623      25.20       24.294       6.25
travel reimbursements                .030      .013       -2.722      -2.54       -4.339      -1.25
accessibility to work               -.862      -.59        3.035       2.97        3.094        .75
perc. of license holding            3.224       .85       16.589       6.53         .266        .01
no. of children                      .479       .34       -3.676      -3.30       -1.520       -.39
no. of workers                      2.789      1.72        4.438       4.92        -.944       -.35
business car                            -         -       -2.679      -1.16       -5.431      -1.25
κ1 (random effects, 0/1+)          -2.002     -2.15        2.831       2.95         .427        .11
κ2 (random effects, 1/2+)               -         -        1.922       1.14       -3.198      -1.21
time-varying term, 0/1+             -.672      -.36        4.925       1.84       -9.052        .84
time-varying term, 1/2+                 -         -       13.705       2.96        1.232        .29
L(C)                                -2563                  -8716                   -1448
L (full model)                      -2344                  -8015                   -1313
N                                     556                   1829                     287
the mixing distribution. All households having at least one car in a specific wave were included in the analysis. The analysis was not restricted to households with at least one car in all waves. Hence, an unbalanced estimation procedure was used. The implied expectation or intercept in the 3 mass-points case equals -10.17 with variance 5.37. Relative to the mean, a small group (11%) tends to be less prone to own a second car, controlling for the other observed characteristics. A larger amount of mass (20%) is captured by a mass point with a smaller negative location; these households are consistently more prone to have a second car. The largest amount of mass is concentrated in the middle. This is a crucial group from a policy point of view; changes in circumstances may lead to the decision to acquire a car. The pattern of heterogeneity is different in this case, compared with the heterogeneity in the decision with respect to a first car. This pattern is consistent with a normal distribution of heterogeneity; most of the mass is distributed symmetrically around the mean. Because the variance in the model with a normal mixture is smaller in this model than in the previous one, decisions with respect to the second car are based more upon observed characteristics than decisions with respect to the first car.
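The moments quoted for the two 3 mass-points models follow from the standard formulas for a discrete distribution: the mean is the mass-weighted sum of the locations and the variance is the mass-weighted sum of squared deviations from that mean. A minimal sketch, using the masses and locations printed in Tables 3 and 4 (the function name is illustrative):

```python
def discrete_moments(masses, locations):
    """Mean and variance of a discrete distribution given masses and support points."""
    if len(masses) != len(locations):
        raise ValueError("masses and locations must have the same length")
    if abs(sum(masses) - 1.0) > 1e-6:
        raise ValueError("masses must sum to one")
    mean = sum(p * x for p, x in zip(masses, locations))
    variance = sum(p * (x - mean) ** 2 for p, x in zip(masses, locations))
    return mean, variance


# 0/1+ cars, 3 mass points (Table 3)
mean_first, var_first = discrete_moments([0.24, 0.04, 0.72], [-6.68, -10.03, -3.78])

# 1/2+ cars, 3 mass points (Table 4)
mean_second, var_second = discrete_moments([0.11, 0.20, 0.69], [-16.14, -7.50, -9.99])

print(round(mean_first, 2), round(var_first, 2))
print(round(mean_second, 2), round(var_second, 2))
```

Rounded to two decimals, the results agree with the values reported in the text (-4.73 and 2.69 for the first-car decision, -10.17 and 5.37 for the second-car decision).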
PANEL DATA RANDOM EFFECTS SWITCHING REGRESSION MODELS
Total trip generation
The random effects specification of a switching regression model allows for the decomposition of the correlation between car ownership decisions and trip generation into two components. One component captures the correlation between the random effects and the other component captures the correlation between the random shocks. In this section, the estimates of the switching regression model for the total number of trips will be described. The two types of correlation between car ownership and mobility decisions are captured by two correction terms: one to control for correlations between the random effects in the car ownership and the trip generation model, and another to control for correlation between the time-varying disturbances. Table 5 presents the estimation results. First, we consider trip generation of households without cars. High-income households that are carless in all waves will, on average, have a lower random effect than households with lower incomes since σ < 0. These households are less prone to have a car. Since κ is negative, these households will be relatively mobile (recalling that α_i = κ·δ_i). Since they decided not to acquire a car, these households without cars probably satisfy their travel needs relatively well with other modes of transportation.
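The role of a correction term can be illustrated with the textbook cross-sectional selectivity correction (Heckman, 1976; Maddala, 1983). The sketch below is not the panel random-effects estimator used in this paper: it assumes a probit ownership equation with index z, and the names selection_terms and expected_trips as well as all numerical values are hypothetical. It only shows how a negative coefficient on a correction term raises the expected number of trips in the non-owning regime.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def selection_terms(z):
    """Selectivity terms for a probit ownership index z.

    Returns the term for households observed in the owning regime and
    the term for households observed in the non-owning regime.
    """
    owners = norm_pdf(z) / norm_cdf(z)
    non_owners = -norm_pdf(z) / (1.0 - norm_cdf(z))
    return owners, non_owners


def expected_trips(xb, kappa, term):
    """Regime-specific expectation: explanatory part plus correction."""
    return xb + kappa * term


# Hypothetical values: ownership index 0, explanatory part 10 trips, kappa = -2.
owners_term, non_owners_term = selection_terms(0.0)
trips_non_owners = expected_trips(10.0, -2.0, non_owners_term)

print(owners_term, non_owners_term, trips_non_owners)
```

With a negative coefficient and a negative term for the non-owning regime, the expectation exceeds the explanatory part, which mirrors the direction of the argument made in the text for households without cars.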
Table 6. Random-effects switching regression models for transit and car trip generation
Transit
                                0 cars                  1 car                   2+ cars
                                coefficient   t value   coefficient   t value   coefficient   t value
constant                            1.872      1.49        -.447       -.12        3.444        .78
income                               .096      3.80         .008        .93        -.025      -1.56
no. of persons > 11 yrs             2.722      6.24        1.105       4.59        1.460       2.40
travel reimbursements               3.812      4.94        1.307       5.14        1.341       2.47
accessibility                      -3.321     -6.54       -1.167      -4.82       -1.923      -2.96
percent. with lic.                  -.828      -.62       -1.411      -2.35       -1.788       -.59
no. of children                     -.039      -.08        1.097       4.19         .282        .47
no. of workers                      -.920     -1.62         .411       1.92        -.196       -.46
business car                            -         -       -1.014      -1.86         .216        .32
κ1 (random effects, 0/1+)           -.351     -1.08        -.277       -.99        -.323       -.52
κ2 (random effects, 1/2+)               -         -        -.079       -.19         .133        .32
time-varying term, 0/1+              .509       .79         .266        .36        1.149        .68
time-varying term, 1/2+                 -         -        -.532       -.48         .018        .02
L(C)                                -1848                  -5546                    -822
L (full model)                      -1760                  -5383                    -780
N                                     556                   1829                     287
Car
                                1 car                   2+ cars
                                coefficient   t value   coefficient   t value
constant                           18.442      2.19      -26.518      -1.44
income                               .048      2.44        -.009       -.13
no. of persons > 11 yrs             2.889      5.47        6.974       2.76
travel reimbursements               1.687      3.03       -1.835       -.81
accessibility                       3.091      5.83         .988        .36
percent. with lic.                  9.561      7.25       19.497       1.55
no. of children                     3.426      5.97        3.734       1.49
no. of workers                      2.267      4.84         .952        .54
business car                        2.153      1.80       -4.388      -1.55
κ1 (random effects, 0/1+)           1.279      2.57        2.493        .97
κ2 (random effects, 1/2+)           1.795      2.05       -4.660      -2.71
time-varying term, 0/1+            -1.061      -.76        3.194        .46
time-varying term, 1/2+             4.626      1.93       -6.541      -2.32
L(C)                                -6939                  -1233
L (full model)                      -6817                  -1189
N                                    1829                    287
Next, we consider the households with one car. Since κ is positive, this implies that low-income households with one car in all waves and a relatively high random effect in the trip generation equation will be relatively mobile. For high-income households the converse holds; they do not need a high random effect to decide to acquire a car. Therefore, the average mobility of such households may be lower. In other words, for households with low incomes, mobility considerations appear to be relatively important in deciding to acquire a car. The correction term associated with the correlation between the time-varying disturbances is positive as well, implying that the higher mobility of car owners has a dynamic component. The correction terms associated with the second-car ownership decision are significantly positive. Households with one car that are likely to have a second car are less mobile than expected on the basis of the explanatory variables. The correction terms in the regime describing the mobility of households with two cars do not differ significantly from zero.
Trip generation for car and transit
Next, consider the random effects switching regression models for car and transit trip generation. The estimation results are displayed in Table 6. None of the correction terms in the model describing the trip generation of transit is significant. This conclusion was also obtained in the cross-sectional model. This implies that the unobserved factors determining the demand for transit are not correlated with the unobserved factors determining car ownership decisions. The correction terms controlling for car ownership decisions are significant in the regime describing the demand for car trips for households with one car. Households that are likely to own a car due to income and other observed characteristics will have low random effects and, due to the positive effect κ, will also make fewer trips by car on average. Households with high random effects in the car ownership equation will make more trips by car as well. The effects related to second car ownership are also positive. This implies that households that are likely to have a second car, judging from the observed characteristics, will have low random effects. Due to a positive κ2, these households will make fewer trips. Therefore, this effect works opposite to the effect related to first car ownership decisions. Both the covariances between the random components and the time-varying disturbances are negative for households with two cars. This implies that households that are not likely to have a second car (judging from the observed characteristics), but, due to high random effects, do have a second car, will be less mobile than expected from the included explanatory variables. In order to examine the relative contributions of each of the terms, consider Table 7. Households that have no cars are about 30% more mobile than expected from the effects
Table 7. Contributions of all terms in trip generation
Zero cars (three terms)
  Total trips      9.04    10.17      .29
  Transit          4.21      .54      .37
  Car                 -        -        -
One car (five terms)
  Total trips     76.18    -1.21   -19.11      .31      .89
  Transit           .20      .89      .78      .01      .03
  Car             42.18    -4.99   -17.84     -.06      .30
2+ cars (five terms)
  Total trips     47.29    -1.64    27.73     -.21     1.10
  Transit          2.04     1.25    -1.16      .03      .16
  Car             10.56    -9.61    40.44      .08    -5.84
of the included explanatory variables. Households with one car tend to be less mobile than expected on the basis of the explanatory variables, and also make fewer trips by car. A major proportion of the mobility by car is attributable to indirect effects of car ownership decisions.
CONCLUSIONS
The aim of this paper was to examine whether decisions with respect to car ownership and trip generation are jointly made. Switching regression models were used that included correction terms to control for selectivity in the models estimated for each regime separately. These results highlight the importance of modelling interactions in decision-making. Such interactions are captured partly by common exogenous variables describing trip generation and car ownership decisions. But there are also correlations of errors present. The panel data models presented in this paper show that most of the correlation between the error terms can be attributed to unobserved heterogeneity rather than to random shocks. Since these effects are differenced out in the models with the covariance transformation, it can be concluded that part of the difference between the cross-sectional and fixed-effects models, found in a previous analysis by the author (Meurs, 1989), is due to measures of direct and indirect effects. From the estimates of the mass points, the question arises whether a normal distribution of unobserved heterogeneity is tenable. The decision whether or not to own a car seems to be affected by unobserved effects with a J-shaped distribution. Most mass is related to households with a high propensity to own a car. This could indicate that this decision can hardly be affected by changes in the observed characteristics. The decision whether to own one or two cars is affected by unobserved effects with a more symmetric distribution of heterogeneity. An important direction for future work is to examine the specification of these models in more detail.
The effects of household characteristics on mobility depend on the regime of the households, even after controlling for selectivity. This may indicate the presence of other specification errors, such as nonlinearities or incorrectly specified disturbances. It was assumed, for simplicity of the model and for computational convenience, that the potential correlation between the time-varying regressors and the unobserved heterogeneity is controlled for by including a number of time-variant regressors; this can be a subject of more detailed analysis. The model was also assumed to be linear in variables and parameters (apart from the age effect). A question is whether the differences between the coefficients in the first and second equation for car ownership can be explained by nonlinearities in the model, since, for example, average incomes are higher in households with two cars than in other households. Another issue that remains to be tested is the distributional characteristic of heterogeneity. Continued work with parametric distributions using the results from the nonparametric estimates may be useful. Finally, there is the important problem of the exogeneity of the regressors. It is assumed that all regressors, including the ownership of a business car and license holding, are exogenous. It is possible that individuals acquire licenses as a first step in the decision process to own a car. Alternatively, it can be assumed that individuals get their licenses like swimming certificates, potentially with an idea about a long-term advantage.
Acknowledgements-The author appreciates stimulating discussions with Geert Ridder, Tom Wansbeek and Ivo Molenaar of the University of Groningen in The Netherlands. A preliminary version of this paper was presented at the 6th International Conference on Travel Behaviour in Quebec, Canada, May 22-24, 1991. Comments received from Marc Gaudry, Tom Golob, Gusta Renes, Ryuichi Kitamura, Toon van der Hoorn and two referees are appreciated. Geertje van Hoeven skillfully typed the manuscript. Of course, the usual caveat applies.
REFERENCES
Berndt E. R., Hall B. H., Hall R. E., and Hausman J. A. (1974) Estimation and inference in non-linear structural models. Annals of Economic and Social Measurement, 3, 653-665.
Bock R. D. and Lieberman M. (1970) Fitting a response model for n dichotomously scored item. Psychometrika, 35, 179-197.
Bock R. D. and Aitkin M. (1981) Fitting a response model for n dichotomously scored item. Psychometrika, 46, 443-459.
Butler J. S. and Moffitt R. (1982) A computationally efficient quadrature for a one-factor multinomial probit model. Econometrica, 50, 761-764.
Cosslett S. R. (1983) Distribution-free maximum likelihood estimator of the binary choice model. Econometrica, 51, 765-782.
Davies R. B. and Crouchley R. (1984) Calibrating longitudinal models of residential mobility: An assessment of a nonparametric marginal likelihood approach. Regional Science and Urban Economics, 14, 231-247.
De Jong G. (1989) Some Joint Models of Car Ownership and Car Use. Unpublished PhD thesis. University of Amsterdam, The Netherlands.
Gaudry M. J. I. (1990) Three families of choice models applied to intercity travel demand with aggregate Canadian data. Queen's University. John Deutsch Institute for the Study of Economic Policy. Discussion Paper No. 13.
Golob T. F. and Meurs H. (1986) Biases in response over time in a seven-day travel diary. Transportation, 13, 163-181.
Hausman J. A. and Wise D. A. (1979) Attrition bias in experimental and panel data: The Gary income maintenance experiment. Econometrica, 47, 455-473.
Heckman J. J. (1976) The common structure of statistical models with truncation, sample selection and limited dependent variables and a simple estimation for such models. Annals of Economic and Social Measurement, 5, 475-492.
Heckman J. J. (1981a) Statistical models for discrete panel data. In C. F. Manski and D. McFadden (Eds.), Structural Analysis of Discrete Data with Econometric Applications, pp. 114-178. MIT Press, Cambridge, MA.
Heckman J. J. (1981b) The incidental parameters problem of initial conditions in estimating a time-discrete data stochastic process. In C. F. Manski and D. McFadden (Eds.), Structural Analysis of Discrete Data with Econometric Applications, pp. 179-195. MIT Press, Cambridge, MA.
Heckman J. J. and Singer B. (1984) A method for minimising the impact of distributional assumptions on econometric models of duration. Econometrica, 52, 271-320.
Hensher D. A. (1984) An overview of the theoretical, methodological, empirical and policy bases of the dimensions of automobile demand project. Dimensions of the Automobile Demand Project, Working Paper 12. Macquarie University, Australia.
Kiefer J. and Wolfowitz J. (1956) Consistency of the maximum likelihood estimator in the presence of infinitely many nuisance parameters. Annals of Mathematical Statistics, 27, 887-906.
Kitamura R. and Goulias K. G. (1988) MIDAS: A travel demand forecasting tool based on a dynamic model system of household car ownership and mobility. A report prepared for the Dutch Ministry of Transport, The Hague, Netherlands.
Kitamura R. and Bunch D. S. (1989) Heterogeneity and state dependence in household car ownership: A panel analysis using ordered response probit models with error components. Research Report UCD-TRG-RR-89-6. University of California at Davis.
Laird N. (1978) Non-parametric maximum likelihood estimation of a mixing distribution. Journal of the American Statistical Association, 73, 805-811.
Lancaster T. (1990) The Econometric Analysis of Transition Data. Cambridge University Press, Cambridge, MA.
Lerman S. and Ben Akiva M. (1975) Disaggregate behavioural model of automobile ownership. Transportation Research Record, 569, 43-51.
Lindsay B. G. (1983a) The geometry of mixture likelihoods, Part I. The Annals of Statistics, 11, 86-94.
Lindsay B. G. (1983b) The geometry of mixture likelihoods, Part II, the exponential family. The Annals of Statistics, 11, 783-792.
Maddala G. S. (1983) Limited Dependent Variables. MIT Press, Cambridge, MA.
Mannering F. L. and Winston C. (1985) Dynamic empirical analysis of household vehicle ownership and utilization. Rand Journal of Economics, 16, 215-236.
Manski C. F. (1975) The maximum-score estimation of the stochastic utility model of choice. Journal of Econometrics, 3, 205-228.
Meurs H. (1989) Trip generation models with permanent unobserved effects. Transpn. Res., 16B, 175-194.
Meurs H. (1990) Dynamic analysis of trip generation. Transpn. Res., 16, 175-194.
Meurs H. (1991) A Panel Data Analysis of Travel Demand. Unpublished PhD thesis. University of Groningen, The Netherlands.
Meurs H., van Wissen L., and Visser J. (1989) Measurement biases in panel data. Transportation, 16, 175-194.
Ridder G. (1990a) Attrition in multi-wave panel data. In J. Hartog, G. Ridder and J. Theeuwes (Eds.), Panel Data and Labor Market Studies, pp. 45-67. Amsterdam, The Netherlands.
Ridder G. (1990b) An empirical evaluation of some models for nonrandom attrition in panel data. Mimeo, University of Groningen, The Netherlands.
Train K. and Lohrer M. (1983) Vehicle ownership and usage: An integrated system of disaggregate demand models. Paper presented at the Transportation Research Board, Washington, D.C.
White H. (1984) Asymptotic Theory for Econometricians. Academic Press, London.
van Wissen L. J. G. and Meurs H. (1989) The Dutch mobility panel: Experiences and evaluation. Transportation, 16, 99-119.