TRANSPORTATION RESEARCH PART B
Transportation Research Part B 33 (1999) 63±79
A practical technique to estimate multinomial probit models in transportation Denis Bolduc * DeÂpartment d'eÂconomique, Universite Laval, Sainte-Foy, QueÂbec, Canada G1K 7P4 Received 14 September 1994; received in revised form 1 June 1998
Abstract The Multinomial Probit (MNP) formulation provides a very general framework to allow for interdependent alternatives in discrete choice analysis. Up until recently, its use was rather limited, mainly because of the computational diculties associated with the evaluation of the choice probabilities which are multidimensional normal integrals. In recent years, the econometric estimation of Multinomial Probit models has greatly been focused on. Alternative simulation based approaches have been suggested and compared. Most approaches exploit a conventional estimation technique where easy to compute simulators replace the choice probabilities. For situations such as in transportation demand modelling where samples and choice sets are large, the existing literature clearly suggests the use of a maximum simulated likelihood (MSL) framework combined with a Geweke±Hajivassiliou±Keane (GHK) choice probability simulator. The present paper gives the computational details regarding the implementation of this practical estimation approach where the scores are computed analytically. This represents a contribution of the paper, because usually, numerical derivatives are used. The approach is tested on a 9-mode transportation choice model estimated with disaggregate data from Santiago, Chile. ReÂsume La formulation probit polytomique (MNP) permet d'analyser et de deÂcrire de fac,on treÁs ¯exible, le choix d'un individu parmi un ensemble de modaliteÂs inter-deÂpendantes. Les nombreux progreÁs eectueÂs au cours des dernieÁres anneÂes concernant l'estimation eÂconomeÂtrique des modeÁles MNP, permet maintenant de contourner la probleÂmatique lieÂe aÁ l'eÂvaluation d'inteÂgrales normales multiples qui de®nissent les probabiliteÂs de seÂlection des modaliteÂs. Les diverses approches consideÂreÂes exploitent geÂneÂralement des simulateurs ecaces agissant comme substituts aux probabiliteÂs exactes de choix. Le simulateur ayant la faveur geÂneÂrale est le GHK, suggeÂre de fac,on indeÂpendante par Geweke, Hajivassiliou et Keane. Pour les situations telles que geÂneÂralement rencontreÂes dans le domaine des transports ouÁ les eÂchantillons ainsi que les ensembles de choix sont de grande taille, la litteÂrature suggeÁre treÁs clairement l'emploi d'une approche du * Tel: +1-418-656-5427; fax: +1-418-656-7412; e-mail:
[email protected] 0191-2615/98/$Ðsee front matter # 1998 Elsevier Science Ltd. All rights reserved. PII: S 0191-2615(9 8 ) 0 0 0 2 8 - 9
64
D. Bolduc/Transportation Research Part B 33 (1999) 63±79
maximum de vraisemblance utilisant le simulateur GHK pour approcher les probabiliteÂs de choix. Le preÂsent article fournit les deÂtails relatifs aÁ l'utilisation de cette meÂthodologie dans un cadre du maximum de vraisemblance avec deÂriveÂes analytiques. L'approche est ensuite testeÂe sur un ensemble de donneÂes deÂcrivant le choix entre neuf modes servant aÁ relier le centre-ville de Santiago aÁ des reÂgions en peÂripheÂrie. # 1998 Elsevier Science Ltd. All rights reserved. Keywords: Multinomial probit; Simulation based estimate; Discrete choice; Transportation demand modeling
1. Introduction Since the introduction of discrete choice techniques to analyze transportation related problems, hundreds of studies have focused on the behavioral related aspects associated with the decision process of individuals making a choice among a ®nite set of alternatives. The operational model mostly used has the Multinomial Logit (MNL) form. To have choice probabilities with a closed-form that can be calculated easily is its major advantage over more general strategies. However, the assumption made by this model that the alternatives are mutually independent is often limitative. An attractive solution to this problem is to use the MultiNomial Probit (MNP) framework. The inter-dependencies are then accounted for through the correlation structure of an error term assumed to be normally distributed. Any error correlation structure can be postulated. Up until recently, its use was rather limited, mainly because of the computational diculties associated with the evaluation of the choice probabilities which are multidimensional normal integrals. Recently, alternative simulation based approaches have been suggested. They are described and compared in Hajivassiliou et al. (1996). Most approaches exploit a conventional estimation technique where easy to compute simulators replace the choice probabilities. For situations such as in transportation demand modelling where samples and choice sets are large, the maximum simulated likelihood framework combined with a Geweke±Hajivassiliou±Keane (GHK) choice probability simulator approach should be favoured. This paper gives the computational details regarding the implementation of this particular estimation strategy where to speed computation, the score vector associated with the likelihood function is computed using analytic expressions. The approach is tested on a 9-mode transportation choice model using disaggregate data from Santiago, Chile.
2. The multinomial probit formulation A typical transportation mode choice model concerns the choice by individual n; n 1; . . . ; N of the alternative i in the set Cn f1; . . . ; Jn g which produces the highest Vin utility level, i.e. so that Vin 4Vjn , 8j 2 Cn . In this notation, the choice set is allowed to dier across individuals, to account for their own speci®c travel mode availabilities. In estimation, it is very important to take this choice set variability into account. To present the proposed estimation approach intelligibility, it is easier to ®rst assume that each individual faces a same choice set C and then, in a second stage, adapt the notation to the individual speci®c choice set case.
D. Bolduc/Transportation Research Part B 33 (1999) 63±79
65
2.1. MNP with universal choice set Assuming that each individual n faces the same J alternatives, an MNP model formulation based on linear-in-parameters utilities may be written as follows: Vin Zin in ; with
yin
1 0
if Vin 4Vin for j 1; . . . ; J; and otherwise
The variable yin designates the choice made by individual n, Vin is the unobservable utility of alternative i as perceived by individual n, Zin is a
1 K vector of explanatory variables characterizing both the alternative i and the individual n. is a
K 1 vector of ®xed parameters and ®nally in is a normally distributed random error term of mean zero assumed to be correlated with the errors associated with the other alternatives j; j 1; . . . ; J; j 6 i. In vector form, one can write this relationship as: Vn Zn n ; n N
0; ;
1
where Vn and n are
J 1 vectors and Zn is a
J K matrix. As is well known, the only identi®able parameters in the original model (1) are those that can be retrieved uniquely from the parameters of a scaled model dierenced with respect to the utility of an arbitrary alternative. Below, we use the last alternative as the base and the scaling is performed by ®xing to one, the variance of the ®rst error term in the dierenced model. This version which is referred here as the estimable model, can be written as: Un Xn n ; n N
0; ;
2
which is the model in Eq. (1) written in deviation with respect to the utility of the last alternative VJn . More speci®cally, let m J ÿ 1, then Un is a
m 1 vector with components Uin Vin ÿ VJn , i 1; . . . ; m. The matrix Xn and the vector n are de®ned similarly. The scaling is such that var
n var
1n ÿ Jn 1.1 To impose a positive de®nite error covariance matrix
, it is preferable to work with a formulation based on a Cholesky decomposition of . Such an equivalent model is written as: Un Xn Swn ; wn N
0; Im ;
3
1 To set var
1n 1 is equivalent to dividing each row of the dierenced model by the quantity model by the p quantity s var
1n ÿ Jn . In that case, one would get Uin
Vin ÿ VJn =s and the vector should be replaced with
=s. To assume that s 1 simpli®es the notation.
66
D. Bolduc/Transportation Research Part B 33 (1999) 63±79
where S is a lower triangular matrix such that SS0 , with s11 1 to be consistent with the scaling used. The parameters one is interested in are the K parameters k in the vector and the p m
m 1=2 ÿ 1 parameters s21 ; s31 ; . . . ; sm1 ; s22 ; . . . ; smm that we incorporate in a
p 1 vector denoted as s. To complete the notation, we call
0 ; s0 0 , the joint vector of parameters formed by the vertical concatenation of and s. 2.1.1. The utilities in deviation with respect to the chosen alternative To evaluate the log-likelihood function requires calculating for each individual in the sample the probability associated with the choice made. To compute the GHK choice probability simulator for a given individual n, it is convenient to map the utilities into dierences with respect to the utility of the choice made so that the probability Pn
i then becomes an integral over a nonpositive orthant. This particular representation can be obtained by premultiplying the estimable model in Eq. (2) by a
m m linear operator that we call Mi . For any i other than J, the following
m 1 vector is then obtained: 3 ÿ V V 1n in 2 3 2 3 .. 7 6 V1n ÿ VJn U1n 7 6 . .. 7 6 . 7 6 6 7 Un 4 .. 5 6 Viÿ1;n ÿ Vin 7 Mi 4 5 Mi Un : . 7 6 4 Vi1;n ÿ Vin 5 Umn VJÿ1;n ÿ VJn VJn ÿ Vin 2
Premultiplying Eq. (2) by matrix Mi gives: Un Mi Un Mi Xn Mi n ; Xn n ;
4
where n N
0; i , with i Mi M0i . Note that the Un vector thus obtained contains only negative components:
U1n < 0; U2n < 0; . . . ; Umn < 0, As seen below, the GHK simulator exploits this particular feature. The vector n is de®ned similarly to Un and the matrix Xn is of dimension
m K with rows
Z1n ÿ Zin ; . . . ;
ZJn ÿ Zin . Finally, note that because of Eq. (3), one may also write: V
n i Mi M0i Mi SS0 M0i :
5
Eqs. (4) and (5) are particularly useful because they are expressed in terms of the estimable parameters in and the vector s which, one can recall, is composed of the elements of the cholesky matrix S in Eq. (3). 2.2. MNP with individual speci®c choice set To account for individual speci®c choice sets is relatively straightforward. Let n denote an individual
n 1; . . . ; N which chooses that alternative i in choice set Cn f1; . . . ; Jn g for which Vin 4Vjn , 8j 2 Cn . The corresponding MNP model is obtained by excluding from Eq. (1) those utilities associated with the alternatives that are not available to individual n. In terms of the
D. Bolduc/Transportation Research Part B 33 (1999) 63±79
67
estimable model, is amounts to removing the appropriate rows from Eq. (2). Such a formulation can be obtained with the use of an
mn m mapping matrix En , where mn Jn ÿ 1. It is de®ned as an identity matrix with rows associated with the alternatives not available, deleted. In the case of an universal choice set, En would simply be an identity matrix of size m. With this in mind, one can therefore use a unique and general notation to refer to both cases of the MNP model formulation. 2.3. A general notation for the MNP model Consider the previously introduced
m m mapping matrix Mi that maps the m J ÿ 1 utility dierences Un of the estimable model in Eq. (2) into a deviation with respect to the alternative i chosen by individual n. Consider also the
mn m mapping matrix En which removes the utilities associated with the alternatives unavailable to n. Those two operations can be combined into the following operator: Min En Mi :
6
This
mn m mapping matrix, when premultiplying the utility vector Un gives: 3 2 3 3 2 V1n ÿ Vin U1n V1n ÿ VJn .. .. 7 7 6 . 7 6 6 Un 4 .. 5 4 5 Min 4 5 Min Un : . . Umn ;n VJn ;n ÿ Vin VJÿ1;n ÿ VJn 2
This is the
mn 1 vector of the utilities associated with the alternatives available to individual n expressed in deviation with respect to the chosen alternative. Note again that in the case of a universal choice set, Min Mi , 8n since En would be an identity matrix. Using the so called Min matrix and the estimable model in Eq. (2), the ®nal notation that we exploit to compute the choice probabilities is: Un Min Un Min Xn Min n Xn n ;
7
where n N
0; in , with in Min SS0 M0in . This notation should make clear that depending on the mode availability and the choice made, the error covariance matrix vary between observations. However, it should be noted that a common S Cholesky matrix appears in the covariance structure. The matrix Xn represents a
mn K matrix of the deviation of the explanatory variables of each available alternative j other than alternative i with respect to the explanatory variables of the chosen alternative i. Again, note that by de®nition, the Un is a vector with negative components, i.e.: U1n 40; . . . ; Umn n 40. To compute a given choice probability, we will also need to consider a Cholesky transformed version of this formulation. This is written as: Un Xn Sn wn ; wn N
0; Imn ; where Sn is a
mn mn lower triangular matrix such that:
8
68
D. Bolduc/Transportation Research Part B 33 (1999) 63±79
Sn S0n in Min SS0 M0in :
9
For estimation, we will exploit this relationship between the individual speci®c Cholesky matrix Sn and the unique Cholesky matrix S containing the parameters to estimate. To clarify the distinction between those two matrices, note that Sn refers to the formulation expressed in deviation with respect to the chosen alternative with only the available alternatives being considered while S refers to the estimable model expressed in deviation with respect to the last alternative with all the alternatives included, whether it is available or not. Obviously, a given observation n contributes to estimate only those elements of S referring to the available alternatives. 3. Model estimation 3.1. The choice probabilities Denote Pn
i as the choice probability associated with the alternative i; i 2 Cn chosen by individual n. Given the formulations in Eqs. (7) or (8), this is also the probability of drawing Un with each component U1n ; . . . ; Umn ;n being non positive. It can be computed as an mn - dimensional integral of the form: ÿX
mn ;n
ÿX
1n ÿX
2n
Pn
i
... ÿ1
ÿ1
n
n ; in dn ;
10
ÿ1
where n
n ; in is a multivariate normal density with mean zero and covariance matrix in . Unless mn is small, the choice probability in Eq. (10) cannot be computed using a numerical integration technique. One solution is to simulate Pn
i. Many simulators with good properties have been suggested. The most useful ones are described and compared in Hajivassiliou et al. (1996). The GHK simulator, that is used here, was clearly found to be the one with the best theoretical and empirical properties. Theoretical and analytical details on this simulator may be found in BoÈrsch-Supan and Hajivassiliou (1993), Hajivassoliou (1993) and Geweke et al. (1992). Still, we provide computational details regarding its implementation because we need speci®c expressions to derive analytical relationships to compute the scores. The GHK simulator exploits the recursive structure imposed by the Cholesky transformation present in Eq. (8). For a given observation n, using Eq. (8), one can write: 3 2X 3 2 S U1n 11;n 1n 7 6 s21;n X 6 U2n 7 6 2n 7 6 6 . 76 . 76 . 4 .. 5 6 4 .. 5 4 .. Umn ;n Xmn ;n smn ;1;n 2
which imply:
0
s22;n .. .
smn ;2;n
... ... .. . ...
32 3 w1n 76 w2n 7 76 40; 74 .. 7 5 . 5 wmn ;n smn ;mn ;n 0 0 .. .
D. Bolduc/Transportation Research Part B 33 (1999) 63±79
U1n 40 ! w1n 4 U2n 40 ! w2n 4 .. .
69
ÿX1n s11;n ÿs21;n w1n X2n s22;n .. .
Umn ;n 40 ! wmn ;n 4
ÿsmn ;1;n w1n smn ;2;n w2n . . . smn ;mn ÿ1;n wmn ÿ1;n Xmn ;n : smn ;mn ;n
To write it in a more compact way, we use the notation: w1n w2n .. .
4 4
wmn ;n
a1n a2n
w1n .. .
11
4 amn ;n
w1n ; . . . ; wmn ÿ1;n
where: aln
8 lÿ1 P slh;n > > <ÿ ; s whn Xln s
l>1
> > : ÿX
l 1:
h1
ll;n
ll;n
1n s11;n
12
Therefore, the choice probability Pn
i can be written as: Pn
i pr
Un 40 pr
w1n 4a1n ; w2n 4a2n
w1n ; . . . ; wmn ;n 4amn ;n
w1n ; . . . ; wmn ÿ1;n : By conditioning, one can also write: Pn
i pr
U1n 40pr
U2n 40jU1n 40. . .pr
Umn ;n 40jU1n 40; U2n 40; . . . ; Umn ÿ1;n 40 pr
w1n 4a1n pr
w2n 4a2n
w1n jw1n 4a1n . . . :
3.2. The GHK simulator Let r denote a draw. Call wnr a given realization r of the vector wn such that Eq. (11) is satis®ed. Based on R such draws, the following expression: fn
i
R 1X fnr
i R r1
13
70
D. Bolduc/Transportation Research Part B 33 (1999) 63±79
where fnr
i pr
w1n;r 4a1n;r pr
w2n;r 4a2n;r
w1n;r jw1n;r 4a1n;r . . . ; is a choice probability conditional on wnr which can be used to provide an unbiased simulator for Pn
i. The GHK simulator de®ned in Eq. (13) is smooth with respect to the parameters k , k 1; . . . ; K and the s11;n ; . . . ; smn ;mn ;n elements forming the lower part of Cholesky matrix Sn . It is also known to have good asymptotic properties. For proofs, refer to Hajivassiliou and McFadden (1990). For notational compactness and because of the normality assumption, we can write fnr
i as follows: fnr
i
a1n;r
a2n;r . . .
amn ;n;r 1nr 2nr . . .mn ;nr ;
14
where denotes a standard normal cumulative distribution function. Using Eq. (12), the aln;r are computed as: 8 lÿ1 > < ÿ P slh;n w X ; l > 1 ln sll;n sll;n hn;r aln;r
15 h1 > : l 1; ÿX1n s11;n where according to Eq. (11), the whn;r are drawn using ÿ1
u
ahn;r , with the term u being a random uniform number taken from the (0,1) interval. 3.3. The simulated likelihood function The estimation method considered is based on the maximisation of the natural logarithm of the simulated likelihood function. Denoting as the joint vector of parameters to estimate (i.e. it contains and s), the simulated log-likelihood function which is written as: ! mn N N R Y X X 1X ln fn
i ln lnr ;
16 L R r1 l1 n1 n1 is maximized with respect to . Technical details on the computation of the analytical ®rst- order derivatives are provided in Appendix A. To our knowledge, this is the ®rst implementation of the simulated likelihood framework based on the GHK probability simulator which uses analytical instead of numerical derivatives.2 The computation of @L=@ is quite straightforward when 2
One of the referees claimed that numerical derivatives should be more reliable and as fast as suggested approach with analytical expressions for the derivatives. The numerous Monte Carlo experiments based on four alternatives that we performed clearly indicate that this is not true. On the issue of computation time, the approach based on the analytical derivatives was found to be 50% faster than the alternative technique. One reason for this is the fact that numerical derivatives require recalculating, for each perturbation of the parameter vector, the error covariance matrix, the Cholesky matrix and the choice probabilities. With analytical expressions, all those quantities are ®xed within a given iteration step. Reliability should not be questioned because in general, approaches with analytical derivatives are more reliable than those based on numerical derivatives.
D. Bolduc/Transportation Research Part B 33 (1999) 63±79
71
compared to the computation of @L=@s. The latter is computed using a chain rule which exploits a jacobian transformation between s and sn . This transformation arises from the relationships in Eq. (9). Appendix A provides the details on the computation of @L=@sn . The derivative @L=@s of the log-likelihood function with respect to the vector s of interest is evaluated as @s0n =@s@L=@sn , where @s0n =@s is a jacobian matrix whose calculation is detailed in Appendix B. In the implementation of the estimation algorithm, we use a BHHH approach which avoids the need to compute the second-order derivatives which are in this case, rather involved. The computer program written in Fortran 77 has gone through several stages of testing. Dierent Monte Carlo based tests were made on a SUN workstations. We now use it on some real data describing a choice situation among nine inter-related travel mode alternatives. 4. An application To test the methodology, we use a data bank about the choice of modes for the morning peak journey to work to the central business district (CBD) of Santiago. This data bank has been described and employed several times in the past. Gaudry et al. (1989) focused on the estimation of the valuation of time saving by the transportation mode users. They found that the values obtained with linear Logit or linear nested Logit speci®cations were particularly high. By using nonlinear speci®cations of Box±Cox Logit type, they were able obtain more satisfactory results.3 In our application, we use the same data and the same model speci®cation in order to ®nd out whether to allow for a ¯exible correlation structure of the utilities can lead to reasonable VOT values. Details on the sample size, the dierent mode shares and their availability are displayed in Table 1. We will ®nd that to do so certainly helps to improve the value of time (VOT) estimates. Still, their seems to remain some room for improvement. To implement a Box±Cox technique within a MNP setting represents a too formidable task. We suspect that an MNP extension of the MNL formulation with lognormally distributed VOT implemented in Ben-Akiva et al. (1993) would be a reasonable approach to address this problem. Our exploratory analyses using the random VOT based MNL speci®cation clearly point into this direction. Such a more general MNP model framework with randomly distributed VOT coecients will be considered in a later research. The important variables entering the model speci®cation are travel time tin and cin =wn , the cost for individual n of travelling by mode i as a proportion of his/her net personal income per min of work [Chilean pesos/min ($)]. To use cost as a proportion of income allows the VOT to vary deterministically with income. The other VOT related variables used in the speci®cation are walking and in vehicle time for all modes and a waiting time variable for all modes other than car. Finally, in addition to eight alternative speci®c dummies, appear two other explanatory variables of socio-economic type. The ®rst one, speci®c to car driver alternatives 1 and 6 indicates the number of cars in the household as a proportion of number of driving permit holders. The last variable is a sex dummy included for modes 2, 3 and 7 listed in Table 1. The model estimates 3 In order to implement their Box-Cox methodology, they removed observations that contained zero values for some of the explanatory variables. Unless there is a good reason for removing those observations, we favor the use of an approach that retains those observations for model estimation.
72
D. Bolduc/Transportation Research Part B 33 (1999) 63±79
Table 1 Statistics on the sample used Alternative
Chosen
Percent
Availability
168 66 58 295 430 101 41 65 75
0.12933 0.05081 0.04465 0.22710 0.33102 0.07775 0.03156 0.05004 0.05774
681 730 833 407 1287 530 594 828 841
1299
1.0000
6731
1. Car-driver 2. Car-passenger 3. Shared-taxi 4. Metro 5. Bus 6. Car-driver±metro 7. Car-passenger±metro 8. Shared-taxi±metro 9. Bus-metro Totals
obtained using a basic MNP i.i.d. speci®cation are displayed in column 1 of Table 2. An MNP speci®cation with i.i.d. errors can easily be estimated using Gauss±Hermite quadrature to compute the choice probabilities entering the log-likelihood function. Only 12 quadrature points were used to perform the integrals. The solution was obtained in 14 iterations using 0 as the starting value for all the parameters. The implied VOT estimates as a percentage of net income are produced in the same column of Table 3. Those values are in the same range as the ones obtained using a linear logit speci®cation of Gaudry et al. (1989). The value of time sensitivity in vehicle, as a percentage of net personal income is computed as: @
cin =wn @Vin @Vin tm = : @tin @tin @
cin =wn
cin =wn As well known, in linear speci®cation, the VOT measures are evaluated as ratios of parameters. The two other VOT estimates were produced similarly. Standard errors were computed using the delta method. 4.1. The model formulation with correlated utilities To capture the similarities between the transportation modes and to keep a rather parsimonious parametric speci®cation of the error covariance structure, we use the ®rst-order Generalized Autoregressive [GAR(1)] process approach suggested in Bolduc (1992). In order to obtain such a formulation, the original model in Eq. (1) is replaced with: Vn Zn TPÿ1 n ; n N
0; IJ ;
17
where T is a J-diagonal matrix which contains standard deviation i in the ith position, and P is a matrix for capturing the covariance eects using functions based on few underlying unknown
D. Bolduc/Transportation Research Part B 33 (1999) 63±79
73
Table 2 Estimation resultsa Variables
MNP i.i.d.
Alternative speci®c dummies 1. Car-driver 3. Shared-taxi 4. Metro 5. Bus 6. Car-driver±metro 7. Car-passenger±metro 8. Shared-taxi±metro 9. Bus±metro
0.19 0.61 3.11 1.49 ÿ0.02 0.36 0.66 0.90
(0.86) (4.74) (16.56) (12.30) (ÿ0.01) (2.68) (3.65) (5.02)
0.12 0.82 3.10 1.65 0.18 0.58 0.90 1.12
(0.55) (5.25) (15.97) (11.11) (0.80) (3.60) (4.54) (5.78)
ÿ0.44 0.61 2.81 1.47 ÿ0.59 ÿ1.46 ÿ0.09 1.23
(ÿ0.94) (2.31) (5.78) (8.27) (ÿ1.15) (ÿ0.96) (ÿ0.16) (6.60)
ÿ0.37 0.51 2.73 1.45 ÿ0.49 ÿ1.94 ÿ0.14 1.20
(ÿ0.81) (1.66) (5.84) (8.38) (ÿ1.01) (ÿ1.07) (ÿ0.22) (6.74)
ÿ0.39 0.27 2.88 1.49 ÿ0.59 ÿ2.33 ÿ0.20 1.25
(ÿ0.85) (0.89) (5.88) (7.96) (ÿ1.20) (ÿ1.11) (ÿ0.62) (7.72)
ÿ0.02 ÿ0.08 ÿ0.05 ÿ0.25 1.29
(ÿ8.61) (ÿ9.37) (ÿ5.50) (ÿ5.28) (5.61)
ÿ0.02 ÿ0.08 ÿ0.05 ÿ0.18 1.24
(ÿ7.59) (ÿ9.24) (ÿ5.62) (ÿ6.15) (5.44)
ÿ0.02 ÿ0.07 ÿ0.04 ÿ0.13 1.54
(ÿ3.89) (ÿ4.04) (ÿ3.51) (ÿ3.56) (3.16)
ÿ0.02 ÿ0.06 ÿ0.04 ÿ0.13 1.45
(ÿ3.81) (ÿ3.98) (ÿ3.48) (ÿ3.54) (3.13)
ÿ0.02 ÿ0.07 ÿ0.05 ÿ0.13 1.52
(ÿ3.83) (ÿ4.01) (3.55) (ÿ3.64) (3.14)
Other variables Cost/income ($) Walk time (min) In vehicle time (min) Waiting time (min) No. cars/no. permit holders (alternatives 1 and 6) Sex dummies (male=1) (alternatives 2,3 and 7)
ÿ0.42 (ÿ3.64)
SML MNP R 50 homoscedastic
ÿ0.39 (ÿ3.69)
SML MNP R 50 unconstrained
ÿ0.41 (ÿ3.21)
Error correlation structure 1 2 3 4 5 6 7 8 9
1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1
1.52 1 0.95 1.22 0.31 1.32 2.26 1.51 0.46
0
0.46 (5.89)
0.29 (2.29)
(Simulated) log. Likelihood ÿ1482.39 at convergence. Number of iterations 14 Run time (min and s)/iteration 0.06 on SUN UltraSparc 1 a
(3.85) (3.46) (3.27) (1.31) (3.34) (2.04) (2.78) (2.84)
SML MNP R 250 unconstrained
ÿ0.43 (ÿ3.27)
1.43 1 1.01 1.15 0.27 1.22 2.57 1.50 0.46
(3.76) (3.33) (3.25) (1.08) (3.40) (2.02) (2.72) (3.00)
0.28 (2.19)
SML MNP R 250 constrained
ÿ0.49 (ÿ3.53)
1.35 1 1.27 1.27 0.40 1.35 2.87 1.27 0.40
(3.97) (3.90) (3.90) (3.66) (3.97) (2.00) (3.90) (3.66)
0.35 (3.15)
ÿ1473.60
ÿ1447.40
ÿ1442.83
ÿ1443.86
7 1.44
20 1.50
22 7.26
18 6.43
Asymptotic t-statistics are in parentheses.
parameters. It can be viewed as a restricted version of a saturated factor analytic formulation. This model can be obtained from an initial model Vn Zn Tn , with n Wn n , being assumed. This last autoregressive process is assumed to be based on a
J J Boolean (0±1) contiguity matrix W which, in this particular application, relates the ®rst two alternatives together and does the same with the last seven alternatives. The W matrix used is as follows:
74
D. Bolduc/Transportation Research Part B 33 (1999) 63±79
Table 3 Value of time as a percentage of net personal incomea Variable
MNP i.i.d.
SML MNP R 50 homoscedastic
SML MNP R 50 unconstrained
SML MNP R 250 unconstrained
SML MNP R 250 constrained
In vehicle time Walking time Waiting time
201.7 (4.72) 345.7 (6.46) 773.5 (4.74)
211.3 (4.64) 353.9 (6.05) 854.5 (4.83)
184.0 (4.43) 288.2 (5.16) 558.8 (4.03)
198.6 (4.42) 297.7 (5.04) 590.1 (3.99)
214.5 (4.50) 315.8 (5.01) 606.3 (4.02)
a
Asymptotic t-statistics are in parentheses.
0 1 0 0 W 0 0 0 0 0
1 0 0 0 0 0 0 0 0
0 0 0 1 1 1 1 1 1
0 0 1 0 1 1 1 1 1
0 0 1 1 0 1 1 1 1
0 0 1 1 1 0 1 1 1
0 0 1 1 1 1 0 1 1
0 0 1 1 1 1 1 0 1
0 0 1 1 1 1 1 1 0
To insure the invertibility of the P IJ ÿ W matrix, so that n could be replaced with Pÿ1 n for a value of de®ned on the (ÿ1,1) interval, the W matrix is normalized so that each row sum to one. In the postulated structure, the correlation coecient and a maximum of J ÿ 1 standard error terms can be estimated. [For more details on the use of GAR(1) processes to approximate the error covariance structure in discrete choice modelling and on parameter estimability issues, refer to Bolduc, (1992)]. In our application, the second standard deviation term 2 is set to 1. 4.2. Estimation results Estimation results of the dierent MNP versions considered are displayed in Table 2. Column 2 results refers to the SML MNP solution based on 50 draws, assuming cross-correlation between the alternative speci®c errors and homoscedasticity. In other words, all sigmas are ®xed to 1 and only is estimated. The results obtained clearly show that correlation is present between the utilities. The GAR(1) setting makes it possible to summarize the full correlation structure using a single correlation coecient. The ®t is better but the estimated parameters are not so dierent from the MNP i.i.d. solution. Columns 3 and 4 of Table 2 refer to the model speci®cation where seven standard deviation terms are estimated. Recall that for identi®cation, the second standard deviation term is ®xed to 1. Therefore, in this speci®cation referred to as unconstrained in the tables each utility has its own heteroscedasticity eect. According to the estimation results obtained, heteroscedasticity is signi®cantly present. The model ®t is much better than with the homoscedastic structure. Values of time estimates, especially when 250 draws are used, are much
D. Bolduc/Transportation Research Part B 33 (1999) 63±79
75
lower than in the MNP i.i.d. case. One known problem associated with maximum simulated likelihood is the bias introduced by simulating. This is because the GHK is a technique to simulate the choice probability, not its natural log. The bias can be proved to be present using Jensen's inequality. It is also known that it becomes unsigni®cant with large number of simulation draws. This justi®es our use of 250 draws for estimation. Note that with 50 draws, the results are very close to those obtained with large number of draws. The improvement in the ®t using larger number of draws can be attributed to the presence of the bias. All this indicates that the GHK simulator performs very well. The last column of Table 2 refers to a constrained version of the model where the heteroscedastic structure is postulated to be consistent with the de®nition of the alternatives provided in Table 1. In this version, we are postulating homoscedasticity among groups of alternatives. The groups formed using the restrictions: 1 6 ; 2 1; 3 4 8 ; 5 9 with 7 remaining free are car-driver, car passenger, taxi-metro, bus and car-passenger±metro. This is done to demonstrate the ¯exibility of the approach. In this case, the simulated log-likelihood value is ÿ1443.86 which is marginally higher in absolute value than the one corresponding to the unrestricted version of the model, which indicates that the restrictions make sense. This is con®rmed with a log-likelihood ratio test. All parameter estimates closely resemble the values obtained with the unconstrained model. The run time per iteration was approximately 6.4 min. The constrained estimation gains in terms of number of iterations to reach optimum but the computation time per iteration is almost the same, since only mappings are applied in computing the derivatives, so a comparable number of manipulations are required at a given iteration. The implied VOT estimates for in vehicle time, walking time and waiting time are not as good as those obtained with the more general error structure. The unconstrained version performs best in producing smaller VOT estimates. However, there appears to be some room for improvement. An MNP formulation with lognormally distributed VOT coecients is a potentially good alternative for solving the problem of high VOT estimates. This is left for a future research. 5. Conclusion Our application demonstrated the feasibility of MNP estimation when applied to choice situation based on large choice sets. In the most general case that we considered in the application, each utility was characterized by a speci®c standard error and the correlation among two natural blocks of utilities was modelled using a generalized autoregressive error structure. Based on the observed improvements in the log-likelihood values, to take into account these inter-relationships between the model errors was important. The technique was applied to the 9-mode transportation choice model considered in Gaudry et al. (1989). Estimating dierent models on this reliable disaggregate data set from Santiago, Chile, these authors found that the time valuation coecients obtained with linear speci®cation were particularly high. Our MNP estimations gave some more evidence to their ®nding. We suspect that an MNP extension of the MNL formulation with lognormally distributed VOT implemented in BenAkiva et al. (1993) would be a promising technique for solving this problem. Our exploratory analyses using the random VOT based MNL speci®cation clearly point into this direction. This approach will be considered in a later research.
76
D. Bolduc/Transportation Research Part B 33 (1999) 63±79
Acknowledgements The author would like to thank Professor Marc Gaudry for kindly providing the Santiago data bank used for this study. Appendix A. The ®rst-order derivatives A.1. The simulated likelihood function The estimation method considered here maximises the natural logarithm of the simulated likelihood function. Noting that denotes the joint vector of parameters to estimate (i.e. it contains and s), then the simulated log-likelihood function is written as: N X
N X
mn R Y 1X lnfn
i ln lnr L R r1 l1 n1 n1
where fnr
i
mn Q l1
!
A1
lnr is the conditional probability of choosing alternative i given a particular
draw r of the vector wn in Eq. (8) of the text. Note that we use directly the conventions established in Eqs. (11)±(15). We now provide the details regarding the computation of the ®rst-order derivatives. A.2. First-order derivatives Recall that the joint vector of parameters is denoted as . Using Eq. (A1), we get: mn N R X @L X 1 1X @ ln lnr fnr
i ; @ f
i R r1 @ n1 n l1
A2
where: @ ln lnr lnr @aln;r @aln;r llnr ; @ lnr @ @
A3
with llnr lnr =lnr and with lnr denoting a standard normal density function evaluated at aln;r . This particular value is computed from Eq. (15). Because aln;r is a function of all whn;r , h 1; . . . ; l ÿ 1 and since we have whn;r ÿ1
uhn;r
ahn;r for h 1; . . . ; l ÿ 1, then in the computation of @aln;r =@, the following recursion: @whnr
ahn;r @ahn;r uhn;r ; @
whn;r @
A4
D. Bolduc/Transportation Research Part B 33 (1999) 63±79
77
needs to be taken into account. Recall that uhn;r denotes a particular draw from the random uniform distribution de®ned over the unit interval. We now give the explicit relationships for and s. A.3. Derivatives with respect to b mn N R X @L X 1 1X @ ln lnr fnr
i ; @ n1 fn
i R r1 @ l1
A5
where: @ ln lnr @aln;r llnr ; @ @ and where using Eq. (15) and the recursion in Eq. (A4), one has: @aln;r @
ÿ
@aln;r @
lÿ1 P slh;n h1
sll;n
ahn;r @ahn;r uhn;r
w @ hn;r
1n ÿ sX11;n
Xln sll;n
l>1 l 1:
A.4. Derivatives with respect to s As previously mentioned, this derivative is computed using a chain rule linking sn to s. Below we provide @L=@sn . Then, @L=@s is computed as @s0n =@s@L=@sn where the jacobian matrix @s0n =@s is detailed in Appendix B. Now recall that the sn vector is formed by concatenating the elements sij;n , i5j in a column vector. Then the elements of the @L=@sn vector are: mn N R X X @L 1 1X @ ln lnr fnr
i ; i 1; . . . ; mn ; j 1; . . . ; i @sij;n n1 fn
i R r1 @sij;n l1
A6
where using Eq. (15), one can note that: @ ln lnr @sij;n @ ln lnr @sij;n
since aln;r
lnr @aln;r lnr @sij;n
j4i4l
0 i > l; lÿ1 P slh;n @aln;r ÿ sll;n whn;r Xln sll;n , implies that @sij;n 0, whenever i > l. h1
ahn;r hn;r Also, the recursion in Eq. (A4) implies that @w @sij;n uhn;r whn;r lÿ1 P
ahn;r @ahn;r uhn;r sslh;n ; ll;n
whn;r @sij;n
@aln;r @sij;n
ÿ
@aln;r @sij;n
jn;r ÿ sll;n
j
@aln;r @sij;n
ln;r ÿ asll;n
j i l:
h1 w
@ahn;r @sij;n ,
A7
and one therefore obtains that:
j4i < l
A8
78
D. Bolduc/Transportation Research Part B 33 (1999) 63±79
Appendix B. The jacobian matrix In this appendix, we detail the relationship between sn and s. From Eq. (9), one can write:
in Sn S0n Min SS0 M0in :
B1
Let L be a matrix such that vec
S L0 s and let K0 be the matrix that maps s to vec
S0 . Then Eq. (B1) may be written as: vec
in
Min S Min vec
S
Min S Min L0 s
Min Min Svec
S0
Min Min SK0 s: Then this implies that @vec
in
Min S Min L0
Min Min SK0 B0in : @s0
B2
Similarly calling L0n and K0n the matrices mapping sn to vec
Sn and vec
S0n , respectively. Then: vec
in vec
Sn S0n
Sn Imm L0n Sn
Imn Sn K0n sn ; and therefore: @vec
in
Sn Imn L0n
Imn Sn K0n A0n : @s0n
B3
Taking a pseudo-inverse of the matrix in Eq. (B3), then: @s0n 0 A n ; @vec
in which ®nally implies that: @s0n @vec
in 0 @s0n 0 Bin A n ; @s @s @vec
in
B4
0
where Bin and A n are detailed in Eqs. (B2) and (B3), respectively. References Ben-Akiva, M., Bolduc, D., Bradley, M. 1993. Estimation of travel choice models with randomly distributed values of time, Transportation Research Records, 1413, 88±97.
D. Bolduc/Transportation Research Part B 33 (1999) 63±79
79
Bolduc, D., 1992. Generalized autoregressive errors in the multinomial probit model. Transportation Research BÐ Methodological 26B(2), 155±170. BoÈrsch-Supan, A., Hajivassiliou, V., 1993. Smooth unbiased multivariate probability simulators for maximum likelihood estimation of limited dependent variable models. Journal of Econometrics 58, 347±368. Gaudry, M.J.I., Jara-Diaz, S.R., Ortuzar, J., 1989. Value of time sensitivity to model speci®cation. Transportation Research BÐMethodological 23B(2), 151±158. Geweke, J., Keane, M., Runkle, D., 1992. Alternative computational approaches to inference in the multinomial probit model, Research Department, Federal Reserve Bank of Minneapolis. Hajivassiliou, V.A., 1993. Simulation estimation methods for limited dependent variable models. In: (Eds.) Handbook of Statistics, Vol. 11. Maddala, G.S., Rao, C.R., Vinod, H.D., pp. 519±543, North Holland, Amsterdam. Hajivassiliou, V.A., McFadden, D., 1990. The method of simulated scores for the estimation of LVD models with an application to external debt crises, working paper, Cowles Foundation for Research in Economics, Yale University, Connecticut. Hajivassiliou, V.A., McFadden, D., Ruud, P., 1996. Simulation of multivariate normal rectangle probabilities: methods and programs. Journal of Econometrics, 72, 85±134.