Generalized logit model

Transpn. Res.-B, Vol. 25B, Nos. 2/3, pp. 75-88, 1991. Printed in Great Britain.
0191-2615/91 $3.00 + .00 © 1991 Pergamon Press plc

J. GERKEN†

Institute for Computer Organization, Operations Research and Information Systems, University of Bremen, Bremen 33, Germany

(Received 29 June 1987; in revised form 3 May 1989)

Abstract-The logit model is the simplest and best-known probabilistic choice model. Because of its limited flexibility, however, the multinomial logit model is problematic in application. In this paper a generalized logit model, which is substantially more flexible than the traditional multinomial logit model, is presented. Furthermore, the usual modal choice models are compared with respect to their flexibility. This is done by calculating the partial derivatives of the choice probability functions at a fixed point. The generalized logit model shows the same flexibility (in a precise sense) as the probit model, but is much more tractable. At the end of this paper an example is given to show the differences in the functional forms between the new model and the traditional models (logit and probit).

1. INTRODUCTION

To determine the modal split, probit and logit models are used nearly exclusively. While probit and logit models are virtually equivalent in the case of binary choice, there are significant differences in the multinomial case. The probit model is difficult to handle, because the integrals which occur have to be evaluated numerically. The multinomial logit model, on the other hand, has a mathematically much simpler form. This advantage is, among other things, a result of the restrictive assumption that the distributions of the random components of the stochastic utility functions are independent of each other. This assumption leads to the ‘independence of irrelevant alternatives’ axiom. An example illustrates this point (see e.g. Domencich and McFadden, 1975). Suppose a population faces the alternatives of one car mode and one bus mode, and chooses the car mode with a probability of 2/3. Now suppose a second bus mode is introduced which follows a different route, but has essentially the same attributes as the first bus mode. Intuitively, we believe that individuals will still choose the car mode with probability 2/3 and will choose either of the bus modes with one-half the probability 1/3 of choosing some bus mode, or 1/6. However, the independence of irrelevant alternatives condition requires that the relative odds of choosing the car mode over either of the bus modes be two to one, implying that the probability of choosing the car mode drops to 1/2 and the probability of choosing each bus mode is 1/4. This example shows that different similarities between the alternatives influence the modal choice. The two bus modes are nearly identical (great similarity), in contrast to the similarity between the car mode and the bus mode. An example of an experiment on a practical scale is the area licence scheme in Singapore (see e.g. Holland and Watson, 1978). The government had set a licence fee for cars entering the central area.
Under this scheme, cars with four or more occupants (‘car pools’) were exempt. This induced a large increase in car pools, in contrast to a small increase in public transport users. We think this is a result of different similarities among the alternatives. Ultimately, the different similarities among the alternatives have to be determined by measurement. But first we need a model which permits different similarities among the alternatives and which is easier to handle than the probit model. An attempt to overcome these difficulties is provided by the nested logit model, but this model also has functional limitations (see Section 2).

The rest of this paper is divided into four sections. In Section 2 we summarize the theoretical background of modal choice models. In Section 3 a new modal choice model is formulated, which has the form of a universal logit model (see e.g. Amemiya, 1986). This new model is a generalization of the logit model but has the same functional flexibility as the probit model. In Section 4 we provide a functional comparison of the new model with the probit model. Section 5 contains the conclusion.

†Present address: MBB Bremen, MT 731, Postfach, D-2800 Bremen 75.

2. FLEXIBILITY OF MODAL CHOICE MODELS

In this section we briefly establish the link between the theory of individual behaviour and the resulting models. Suppose an individual $i$ has $n$ alternative choices. We assume that the individual $i$ associates with each alternative $j$ a systematic utility. The systematic utility $u_{ij}$ is a function of observed attributes of alternative $j$ and of observed socioeconomic characteristics of individual $i$. Moreover, we assume that there will be unobserved characteristics, such as tastes and unmeasured attributes of alternatives, which vary over the population. These unobserved characteristics are captured in an additive random component $x_{ij}$. Therefore, the utility $U_{ij}$ of mode $j$ to an individual $i$ is the sum of the systematic utility $u_{ij}$ and the random component $x_{ij}$, i.e.

$$U_{ij} = u_{ij} + x_{ij}.$$

Variations of $x_{ij}$ may induce variations in observed choice among individuals facing the same measured attributes. A specification of a distribution of the random components $x_{ij}$ then generates a distribution of choices in the population. Furthermore, it is assumed that each person chooses the transportation mode for which the associated utility is highest. The decision of each individual is consequently not a random process. We assume that the utility function $u_{ij}$ has the following form:

$$u_{ij} = c + \alpha' z_{ij} + \beta' w_i,$$

where $z_{ij}$ is a vector of the mode characteristics and $w_i$ is a vector of the $i$th person's socioeconomic characteristics. The vectors $\alpha$ and $\beta$ are the corresponding parameters and $c$ is a constant. A detailed presentation is given by Amemiya (1986).

In the following we regard the modal choice models as instruments for analysing various traffic policies. That means the following question is emphasized: How does the modal split change in response to changes of the transport service attributes (e.g. travel times, transit walk times, transit fares)? For a more precise investigation of this question, we consider one term of the sum $\alpha' z_{ij}$ separately. Without loss of generality we separate the term $bT_{ij}$ (travel time multiplied by the corresponding parameter). With

$$c_{ij} = c + \beta' w_i + (\alpha' z_{ij} - bT_{ij})$$

we obtain

$$u_{ij} = bT_{ij} + c_{ij}.$$

To simplify the utility function $u_{ij}$ further, we make the following transformation

$$t_{ij} = T_{ij} + \frac{c_{ij}}{b},$$

which implies $u_{ij} = b\,t_{ij}$. In this expression we can regard $t_{ij}$ as a generalized travel time of transport mode $j$ to an individual $i$.

Now assume that the random components $x_{ij}$ are independently and identically extreme-value-distributed, i.e. the distribution function is

$$F(x) = \exp(-\exp(-x)).$$

With this assumption, the probability that a person will choose transport mode $j$ is

$$P_j = \frac{\exp(b t_j)}{\sum_k \exp(b t_k)}.$$

This is the multinomial logit model in standard form. (For simplicity we have suppressed the index $i$.) For an investigation of the different flexibilities of the models we look at the partial derivatives (collected in the Jacobian matrix) at points with the property $t_1 = t_2 = \dots = t_n$. For the logit model we have the following Jacobian matrix

$$J = \frac{1}{n^2}\begin{pmatrix} (n-1)b & -b & \cdots & -b \\ -b & (n-1)b & \cdots & -b \\ \vdots & & \ddots & \vdots \\ -b & -b & \cdots & (n-1)b \end{pmatrix}.$$

This Jacobian matrix shows that a change of $t_i$ ($i = 1,2,\dots,n$) has the same effect on all remaining alternatives. Thus, the deficient flexibility is reflected in the symmetric Jacobian matrix (concerning the symmetry see Section 3). It makes sense to define: A model has maximum flexibility if the entries above (respectively below) the diagonal are ‘adjustable’ by choice of the parameters. In this sense the probit model has maximum flexibility (see Section 4).

Next, we examine the nested logit model with respect to flexibility. For simplicity let $n = 3$ and assume the following hierarchy is given (see Fig. 1).

Fig. 1. Hierarchy of the modes (tree diagram with mode 1, mode 2 and mode 3; in the formulas that follow, modes 2 and 3 share a nest).

Then the nested logit model has the following form

$$P_1 = \frac{\exp(b_1 t_1)}{\exp(b_1 t_1) + \bigl(\exp(b_2 t_2) + \exp(b_2 t_3)\bigr)^{b_1/b_2}},$$

$$P_2 = (1 - P_1)\,\frac{\exp(b_2 t_2)}{\exp(b_2 t_2) + \exp(b_2 t_3)},$$

$$P_3 = 1 - P_1 - P_2.$$

With the abbreviations

$$a = \frac{b_1\, 2^{b_1/b_2}}{\bigl(1 + 2^{b_1/b_2}\bigr)^2}, \qquad b = \frac{b_2\, 2^{b_1/b_2}}{4\bigl(1 + 2^{b_1/b_2}\bigr)}$$

we obtain for the Jacobian matrix at the points with $t_1 = t_2 = t_3$

$$J = \begin{pmatrix} a & -a/2 & -a/2 \\ -a/2 & a/4 + b & a/4 - b \\ -a/2 & a/4 - b & a/4 + b \end{pmatrix}.$$

If $b_1 = b_2$ this model becomes the multinomial logit model. It is evident that two of the three entries above (respectively below) the diagonal are ‘adjustable’ independently of each other by choice of the parameters $b_1$ and $b_2$. It is our desire to create a model in which all entries in the Jacobian matrix are ‘adjustable’, that is, a model of maximum flexibility.
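The Jacobian of the three-mode nested logit model can be checked numerically. The following is a minimal sketch, assuming the nested form with modes 2 and 3 in a common nest and the closed-form abbreviations $a = b_1 2^{b_1/b_2}/(1 + 2^{b_1/b_2})^2$ and $b = b_2 2^{b_1/b_2}/(4(1 + 2^{b_1/b_2}))$, which follow from differentiating that form at equal travel times. The function names and the parameter values are illustrative assumptions, not taken from the paper.

```python
import math

def nested_logit(t, b1, b2):
    """Three-mode nested logit: mode 1 alone, modes 2 and 3 share a nest."""
    e1 = math.exp(b1 * t[0])
    e2 = math.exp(b2 * t[1])
    e3 = math.exp(b2 * t[2])
    p1 = e1 / (e1 + (e2 + e3) ** (b1 / b2))
    p2 = (1.0 - p1) * e2 / (e2 + e3)
    return [p1, p2, 1.0 - p1 - p2]

def jacobian(model, t, h=1e-6):
    """Central finite differences: J[i][j] approximates dP_i / dt_j."""
    n = len(t)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(t)
        up[j] += h
        down = list(t)
        down[j] -= h
        p_up, p_down = model(up), model(down)
        for i in range(n):
            J[i][j] = (p_up[i] - p_down[i]) / (2.0 * h)
    return J

b1, b2 = -0.10, -0.25            # illustrative parameter values only
t_equal = [20.0, 20.0, 20.0]     # equal generalized travel times

J = jacobian(lambda t: nested_logit(t, b1, b2), t_equal)

# Closed-form entries (assumed abbreviations a and b, see the lead-in).
r = 2.0 ** (b1 / b2)
a = b1 * r / (1.0 + r) ** 2
b = b2 * r / (4.0 * (1.0 + r))
J_closed = [[a, -a / 2, -a / 2],
            [-a / 2, a / 4 + b, a / 4 - b],
            [-a / 2, a / 4 - b, a / 4 + b]]

max_err = max(abs(J[i][j] - J_closed[i][j])
              for i in range(3) for j in range(3))
print("largest deviation from the closed form:", max_err)
```

Only two quantities, $a$ and $b$, appear in the matrix, so the entries $J_{12}$ and $J_{13}$ cannot be set independently of each other, which is the limitation the text describes.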

Remark. The definition of maximum flexibility seems to depend on the chosen point, but this is not true. The points with the property $t_1 = t_2 = \dots = t_n$ were chosen only because the partial derivatives are easier to calculate at these points than at any other. We think it is clear that if the model has maximum flexibility at the chosen point it also has maximum flexibility at any other point. For this reason, the criterion introduced to measure the flexibility of a model is a powerful tool to compare various models. In the next section we will give a graphical interpretation of the criterion.
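As a numerical companion to this section, the sketch below evaluates the multinomial logit model in standard form, reproduces the car/bus example from the Introduction (independence of irrelevant alternatives), and compares a finite-difference Jacobian at equal generalized travel times with the closed form $(n-1)b/n^2$ on the diagonal and $-b/n^2$ elsewhere. The function names, the value of $b$ and the travel times are assumptions chosen for illustration.

```python
import math

def mnl(t, b):
    """Multinomial logit in standard form: P_j = exp(b t_j) / sum_k exp(b t_k)."""
    top = max(b * x for x in t)               # shift for numerical stability
    w = [math.exp(b * x - top) for x in t]
    total = sum(w)
    return [x / total for x in w]

def jacobian(model, t, h=1e-6):
    """Central finite differences: J[i][j] approximates dP_i / dt_j."""
    n = len(t)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(t)
        up[j] += h
        down = list(t)
        down[j] -= h
        p_up, p_down = model(up), model(down)
        for i in range(n):
            J[i][j] = (p_up[i] - p_down[i]) / (2.0 * h)
    return J

b = -0.1                                      # illustrative parameter value

# Car versus one bus with P(car) = 2/3: the odds exp(b (t_car - t_bus)) must be 2.
t_car = 10.0
t_bus = t_car - math.log(2.0) / b
two_modes = mnl([t_car, t_bus], b)
three_modes = mnl([t_car, t_bus, t_bus], b)   # an identical second bus mode is added
print(two_modes, three_modes)

# Jacobian at equal travel times against the closed form.
n = 4
J = jacobian(lambda t: mnl(t, b), [15.0] * n)
J_closed = [[b * ((n if i == j else 0) - 1) / n ** 2 for j in range(n)]
            for i in range(n)]
max_err = max(abs(J[i][j] - J_closed[i][j])
              for i in range(n) for j in range(n))
print("largest deviation from the closed form:", max_err)
```

Adding the duplicate bus mode moves the car share from 2/3 to 1/2, because the odds between any two alternatives are fixed by their own travel times alone.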

3. THE GENERALIZED LOGIT MODEL

3.1 Construction

We will use the following approach to obtain a generalization of the logit model. First we approximate the logit model by a linear model. This linear model has a characteristic structure which is to some extent responsible for the limited functional flexibility. We then discard this special structure, so that from the new structure we can derive a generalized logit model which does not have limited functional flexibility. The following diagram (Fig. 2) illustrates this procedure.

Fig. 2. Generalization of the logit model (diagram linking the logit model, the linear model, the generalized linear model and the generalized logit model).

We assume that the generalized travel times $t_i$ ($i = 1,2,\dots,n$) (for the definition see Section 2) are defined such that equal generalized travel times ($t_1 = t_2 = \dots = t_n$) result in a uniform distribution over all alternatives, i.e.

$$P_1 = P_2 = \dots = P_n.$$

This is always possible. In the case where the condition is not satisfied, i.e.

$$P_i = \frac{\exp(b t_i' + c_i)}{\sum_j \exp(b t_j' + c_j)}$$

with $c_i \neq 0$ for at least one $i \in \{1,2,\dots,n\}$, the above expression can be brought into the standard form

$$P_i = \frac{\exp(b t_i)}{\sum_j \exp(b t_j)}$$

with the following transformation

$$t_i = t_i' + \frac{c_i}{b}.$$

This has the desired property; therefore we can assume that the logit model is in standard form. There exist special points, namely the points where the generalized travel times are equal. We linearize the logit model at one of these points, e.g. where $t_1 = t_2 = \dots = t_n = t_0$, and get

$$P = B\,(t - t_0\, e) + \frac{1}{n}\, e$$

with

$$B = \frac{1}{n^2}\begin{pmatrix} (n-1)b & -b & \cdots & -b \\ -b & (n-1)b & \cdots & -b \\ \vdots & & \ddots & \vdots \\ -b & -b & \cdots & (n-1)b \end{pmatrix}, \quad e = \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}, \quad t = \begin{pmatrix} t_1 \\ t_2 \\ \vdots \\ t_n \end{pmatrix}, \quad P = \begin{pmatrix} P_1 \\ P_2 \\ \vdots \\ P_n \end{pmatrix}.$$

But in contrast to the logit model this linear model depends on the reference time $t_0$. To free the model from this drawback we use the simple trick

$$P = B\,(t - t_{\min}\, e) + \frac{1}{n}\, e \quad \text{with} \quad t_{\min} = \min(t_1, t_2, \dots, t_n).$$

The above linear model has the following properties:
(i) The matrix $B$ is the Jacobian matrix at the reference point.
(ii) At the reference point $P_1 = P_2 = \dots = P_n = 1/n$ holds.
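The linearization can be illustrated with a few lines of code. The sketch below is an illustration under assumed values ($b = -0.1$, three modes, hypothetical function names): it evaluates $P = B\,(t - t_{\min} e) + e/n$ with the matrix $B$ of the multinomial logit model and compares it with the logit probabilities, once close to the reference point and once far away from it.

```python
import math

def mnl(t, b):
    """Multinomial logit in standard form."""
    top = max(b * x for x in t)               # shift for numerical stability
    w = [math.exp(b * x - top) for x in t]
    total = sum(w)
    return [x / total for x in w]

def linear_model(t, b):
    """P = B (t - t_min e) + e / n, with B the logit Jacobian at equal times."""
    n = len(t)
    t_min = min(t)
    d = [x - t_min for x in t]
    total = sum(d)
    # Row i of B d: ((n - 1) b d_i - b * sum_{j != i} d_j) / n^2
    #             = b (n d_i - total) / n^2
    return [1.0 / n + b * (n * d_i - total) / n ** 2 for d_i in d]

b = -0.1                                      # illustrative parameter value
near = [20.0, 20.5, 20.2]                     # close to the reference point
far = [20.0, 80.0, 20.0]                      # mode 2 is far slower

near_gap = max(abs(p - q) for p, q in zip(mnl(near, b), linear_model(near, b)))
print("largest gap near the reference point:", near_gap)
print("linear model far from the reference point:", linear_model(far, b))
```

Near the reference point the two models agree closely, while for the distant point the linear model returns a negative value for mode 2, so it is no longer a valid probability.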

Because of these two properties, the linear model is identical with the multinomial logit model in a small neighbourhood of the reference point. The difficulty is that the linear model is applicable only in a small range, because the restriction $0 \le P_i \le 1$ ($i = 1,2,\dots,n$) is fulfilled only in a neighbourhood of the reference point. In the case where $P_i \le 0$ for an $i \in \{1,2,\dots,n\}$ we have to create a new linear model with correspondingly fewer alternatives. The following diagram (Fig. 3) makes the situation clear.

Fig. 3. Linear model (diagram over $t_1$ with $t_2 = \dots = t_n$; region labels: model with $n$ alternatives; model with $(n-1)$ alternatives, independent of $t_1$).

It is possible to make the above linear model more flexible by modifying the parameter matrix $B$. The canonical generalization of the matrix $B$ is the following:

$$B = \frac{1}{n^2}\begin{pmatrix} \sum_k b_{1k} & -b_{12} & \cdots & -b_{1n} \\ -b_{21} & \sum_k b_{2k} & \cdots & -b_{2n} \\ \vdots & & \ddots & \vdots \\ -b_{n1} & -b_{n2} & \cdots & \sum_k b_{nk} \end{pmatrix}$$

with $b_{ij} = b_{ji} < 0$ ($i \neq j$) and $b_{ii} = 0$ ($i,j = 1,2,\dots,n$). The change of the model performance can be seen in Fig. 4. In the next section we present a generalized logit model which has the following fundamental characteristics:

1. In a small neighbourhood of the reference point the new model is identical with the above generalized linear model.

2. For $t_i \to \infty$, $i \in \{1,2,\dots,n\}$, it holds that (i) $P_i \to 0$ and (ii) the remaining choice probabilities are independent of $t_i$ (see Section 3.5).
3. The functions $P_i$ ($i = 1,2,\dots,n$) are smooth (i.e. they are infinitely often differentiable with respect to the generalized travel times).

Fig. 4. Generalized linear model (diagram with the same layout as Fig. 3; region labels: model with $n$ alternatives; model with $(n-1)$ alternatives, independent of $t_1$).

3.2 The new model

Suppose an individual has $n$ alternative choices. The probability that this individual chooses alternative $i$ is then given by

$$P_i = \frac{\exp\bigl(b_i (t_i - t_{\min}) + \sum_k g_{ik}\bigr)}{\sum_j \exp\bigl(b_j (t_j - t_{\min}) + \sum_k g_{jk}\bigr)}, \qquad i,j,k = 1,2,\dots,n,$$

with

$$g_{ik} = c_{ik}\left(1 - \exp\left(-\frac{a_{ik}}{c_{ik}}\,(t_k - t_{\min})\right)\right), \qquad c_{ik} = \ln\frac{b_{ik} + (n-1)\, b_k}{n\, b_k},$$

$$a_{ik} = \frac{b_k - b_{ki}}{n}, \qquad a_{ii} = 0, \qquad b_i = \frac{1}{n-1}\sum_k b_{ik}, \qquad t_{\min} = \min\{t_1, t_2, \dots, t_n\},$$

where

$t_i$: generalized travel time of transport mode $i$ (see Section 2);
$b_{ij}$: model parameter with $b_{ij} = b_{ji} < 0$ for $i \neq j$ and $b_{ii} = 0$;
$g_{ik}$: function to describe the asymptotic behaviour (see below), with $g_{ik} = 0$ where $a_{ik} = 0$.

To make the new model more comprehensible, we will first interpret the constants $b_{ij}$, $a_{ik}$, $c_{ik}$, and the perturbation functions $g_{ik}$. We assume that the generalized travel times $t_i$ ($i = 1,2,\dots,n$) for all modes, as well as the parameter matrix

$$B = \begin{pmatrix} 0 & b_{12} & \cdots & b_{1n} \\ b_{21} & 0 & \cdots & b_{2n} \\ \vdots & & \ddots & \vdots \\ b_{n1} & b_{n2} & \cdots & 0 \end{pmatrix}$$

with $b_{ij} = b_{ji} \le 0$, are known. The parameter $b_{ij}$ describes the similarity between mode $i$ and mode $j$. The two extreme cases have the following meaning:

1. $b_{ij} = -\infty$: The mode $i$ and the mode $j$ are identical.
2. $b_{ij} = 0$: The difference between the generalized travel times of mode $i$ and mode $j$ has no influence on the modal choice.

Further, the parameter matrix $B$ determines the Jacobian matrix at the origin (i.e. $t_1 = \dots = t_n$). To simplify the notation we have introduced the term

$$b_i = \frac{1}{n-1}\sum_k b_{ik}.$$

This constant is simply the mean of the parameters for mode $i$.
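The definitions of the new model can be collected in a short function. This is a minimal sketch, assuming the perturbation functions $g_{ik} = c_{ik}\bigl(1 - \exp(-a_{ik}(t_k - t_{\min})/c_{ik})\bigr)$ with $c_{ik} = \ln\bigl((b_{ik} + (n-1)b_k)/(n b_k)\bigr)$ and $a_{ik} = (b_k - b_{ki})/n$, and a symmetric parameter matrix with negative off-diagonal entries. The function name, the tolerance `tol` and the example matrix (one car mode, two similar bus modes) are hypothetical choices made for illustration.

```python
import math

def generalized_logit(t, B, tol=1e-12):
    """Generalized logit probabilities for travel times t and a symmetric
    parameter matrix B (zero diagonal, negative off-diagonal entries)."""
    n = len(t)
    t_min = min(t)
    b = [sum(row) / (n - 1) for row in B]      # b_i: mean parameter of mode i
    h = []
    for i in range(n):
        h_i = b[i] * (t[i] - t_min)
        for k in range(n):
            a_ik = 0.0 if k == i else (b[k] - B[k][i]) / n
            if abs(a_ik) < tol:
                continue                        # g_ik is identically zero
            # c_ik = ln((b_ik + (n - 1) b_k) / (n b_k)), written with log1p
            c_ik = math.log1p((B[i][k] - b[k]) / (n * b[k]))
            h_i += c_ik * (1.0 - math.exp(-a_ik / c_ik * (t[k] - t_min)))
        h.append(h_i)
    top = max(h)                                # shift for numerical stability
    w = [math.exp(x - top) for x in h]
    total = sum(w)
    return [x / total for x in w]

# Hypothetical example: mode 0 = car, modes 1 and 2 = two similar bus lines.
B = [[0.0, -0.05, -0.05],
     [-0.05, 0.0, -0.40],
     [-0.05, -0.40, 0.0]]

p_equal = generalized_logit([30.0, 30.0, 30.0], B)     # reference point
p_slow_bus = generalized_logit([30.0, 30.0, 40.0], B)  # bus line 2 gets slower
print(p_equal)
print(p_slow_bus)
```

In this example the share lost by the slower bus line goes mostly to the other bus line rather than to the car, which is the kind of similarity effect the parameters $b_{ij}$ are meant to express.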

The terms $g_{ik}$ are functions of the generalized travel times. Each depends only on one variable. The graph of $g_{ik}$ is illustrated in Fig. 5.

Fig. 5. Graph of the perturbation functions ($g_{ik}$ plotted against $t_k - t_{\min}$).

Notice that the $c_{ik}$ could also be negative, but nevertheless the signs of $a_{ik}$ and $c_{ik}$ are always the same. From the graph it is clear that on the one hand the $a_{ik}$ define the behaviour of the model at the origin (reference point) and on the other hand the $c_{ik}$ define the asymptotic behaviour of the model. The perturbation function $g_{ik}$ describes the influence which mode $k$ has on mode $i$. Collecting these functions in the matrix

$$\begin{pmatrix} 0 & g_{12} & \cdots & g_{1n} \\ g_{21} & 0 & \cdots & g_{2n} \\ \vdots & & \ddots & \vdots \\ g_{n1} & g_{n2} & \cdots & 0 \end{pmatrix}$$

we see that we have $(n-1) \cdot n$ such functions. With

$$h_i = b_i (t_i - t_{\min}) + \sum_k g_{ik}$$

the generalized logit model has the following form:

$$P_i = \frac{\exp(h_i)}{\sum_j \exp(h_j)}.$$

Models of this form are commonly called universal logit models (see e.g. Amemiya, 1986). Therefore, the generalized logit model belongs to the class of universal logit models.

3.3 Special cases

We first consider the case $n = 2$. Then the choice probabilities depend only on one parameter $b := b_{12} = b_{21}$. We have $b_1 = b_2 = b$, hence $a_{12} = a_{21} = 0$ and consequently $g_{12} = g_{21} = 0$, so that

$$P_1 = \frac{\exp(b(t_1 - t_{\min}))}{\exp(b(t_1 - t_{\min})) + \exp(b(t_2 - t_{\min}))} = \frac{\exp(b t_1)}{\exp(b t_1) + \exp(b t_2)}.$$

Therefore, in the case $n = 2$ the above model is identical to the binary logit model. Let us consider a further special case. Suppose $n \ge 3$ and $b_{ij} = b$ for all $i \neq j$. We have

$$b_i = \frac{(n-1)\,b}{n-1} = b, \qquad i = 1,2,\dots,n,$$

hence as above $g_{ik} = 0$, $i,k = 1,2,\dots,n$, and therefore

$$P_i = \frac{\exp(b t_i)}{\sum_j \exp(b t_j)}, \qquad i,j = 1,2,\dots,n.$$

This is the multinomial logit model. Hence we can regard the proposed model as a genuine generalization of the traditional logit model. Therefore, we call this model a generalized logit model.

The reader may ask at this point: Why $b_{ij} = b_{ji}$ for all $i,j = 1,2,\dots,n$? The problem is demonstrated by the simplest example (i.e. $n = 2$). Suppose $b_{12} \neq b_{21}$; then $b_1 = b_{12}$ and $b_2 = b_{21}$. We obtain $g_{12} = g_{21} = 0$ and this implies

$$P_1 = \frac{\exp(b_1(t_1 - t_{\min}))}{\exp(b_1(t_1 - t_{\min})) + \exp(b_2(t_2 - t_{\min}))}.$$

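Therefore, in the case $t_1 = t_{\min}$ we have the binary logit model with parameter $b_2$, and in the other case $t_2 = t_{\min}$ we have the binary logit model with parameter $b_1$. This is of course not desirable, because we obtain a nondifferentiable point ($t_1 = t_2$).

3.4 Flexibility

The goal was to develop a model with maximum flexibility (i.e. the entries in the Jacobian matrix above (respectively below) the diagonal should be ‘adjustable’ independently of each other). To verify this property for the generalized logit model we have to calculate the partial derivatives at the reference point for this model. To abbreviate the notation we denote the reference point by $t^*$ in the following. For small $\delta t_k := t_k - t_{\min}$ we can approximate the function

$$g_{ik} = \left(1 - \exp\left(-\frac{a_{ik}\,\delta t_k}{c_{ik}}\right)\right) c_{ik}$$

via $g_{ik} \approx a_{ik}\,\delta t_k$. In order to calculate the partial derivatives of the generalized logit model at the reference point $t^*$ it is sufficient to calculate the partial derivatives of the following model

$$P_i = \frac{\exp\bigl(b_i\,\delta t_i + \sum_k a_{ik}\,\delta t_k\bigr)}{\sum_j \exp\bigl(b_j\,\delta t_j + \sum_k a_{jk}\,\delta t_k\bigr)} = \frac{\exp\bigl(b_i t_i + \sum_k a_{ik} t_k - (b_i + \sum_k a_{ik})\, t_{\min}\bigr)}{\sum_j \exp\bigl(b_j t_j + \sum_k a_{jk} t_k - (b_j + \sum_k a_{jk})\, t_{\min}\bigr)}.$$

We have

$$b_i + \sum_k a_{ik} = b_i + \sum_{k \neq i} \frac{b_k - b_{ki}}{n} = b_i + \sum_{k \neq i} \frac{b_k}{n} - \frac{n-1}{n}\, b_i = \frac{b_i}{n} + \sum_{k \neq i} \frac{b_k}{n} = \frac{1}{n}\sum_k b_k.$$

Since this sum does not depend on $i$, the terms containing $t_{\min}$ cancel. This implies

$$P_i = \frac{\exp\bigl(b_i t_i + \sum_k a_{ik} t_k\bigr)}{\sum_j \exp\bigl(b_j t_j + \sum_k a_{jk} t_k\bigr)}.$$

Now we can calculate the derivatives easily:

$$\frac{\partial P_i}{\partial t_i}(t^*) = \frac{n b_i - b_i - \sum_k a_{ki}}{n^2} = \frac{(n-1)\, b_i}{n^2}, \qquad \frac{\partial P_i}{\partial t_j}(t^*) = \frac{\partial P_j}{\partial t_i}(t^*) = -\frac{b_{ij}}{n^2} \quad (i \neq j),$$

because $\sum_k a_{ki} = 0$. Hence, we obtain the following Jacobian matrix at the reference point for the generalized logit model:

$$J = \frac{1}{n^2}\begin{pmatrix} \sum_k b_{1k} & -b_{12} & \cdots & -b_{1n} \\ -b_{21} & \sum_k b_{2k} & \cdots & -b_{2n} \\ \vdots & & \ddots & \vdots \\ -b_{n1} & -b_{n2} & \cdots & \sum_k b_{nk} \end{pmatrix}.$$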
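The Jacobian at the reference point can be confirmed by finite differences. The sketch below assumes the model definitions $g_{ik} = c_{ik}\bigl(1 - \exp(-a_{ik}(t_k - t_{\min})/c_{ik})\bigr)$, $c_{ik} = \ln\bigl((b_{ik} + (n-1)b_k)/(n b_k)\bigr)$ and $a_{ik} = (b_k - b_{ki})/n$ with a symmetric parameter matrix. The function names and the parameter values are hypothetical and serve only as an illustration.

```python
import math

def generalized_logit(t, B, tol=1e-12):
    """Generalized logit probabilities for a symmetric parameter matrix B."""
    n = len(t)
    t_min = min(t)
    b = [sum(row) / (n - 1) for row in B]      # b_i: mean parameter of mode i
    h = []
    for i in range(n):
        h_i = b[i] * (t[i] - t_min)
        for k in range(n):
            a_ik = 0.0 if k == i else (b[k] - B[k][i]) / n
            if abs(a_ik) < tol:
                continue                        # g_ik is identically zero
            c_ik = math.log1p((B[i][k] - b[k]) / (n * b[k]))
            h_i += c_ik * (1.0 - math.exp(-a_ik / c_ik * (t[k] - t_min)))
        h.append(h_i)
    top = max(h)
    w = [math.exp(x - top) for x in h]
    total = sum(w)
    return [x / total for x in w]

def jacobian(model, t, step=1e-6):
    """Central finite differences: J[i][j] approximates dP_i / dt_j."""
    n = len(t)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(t)
        up[j] += step
        down = list(t)
        down[j] -= step
        p_up, p_down = model(up), model(down)
        for i in range(n):
            J[i][j] = (p_up[i] - p_down[i]) / (2.0 * step)
    return J

# Hypothetical symmetric parameter matrix for three modes.
B = [[0.0, -0.05, -0.05],
     [-0.05, 0.0, -0.40],
     [-0.05, -0.40, 0.0]]
n = len(B)

J = jacobian(lambda t: generalized_logit(t, B), [30.0] * n)
J_closed = [[(sum(B[i]) if i == j else -B[i][j]) / n ** 2 for j in range(n)]
            for i in range(n)]
max_err = max(abs(J[i][j] - J_closed[i][j])
              for i in range(n) for j in range(n))
print("largest deviation from the closed form:", max_err)
```

Each off-diagonal entry is controlled by its own parameter $b_{ij}$, so all $n(n-1)/2$ entries above the diagonal can be set separately.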

Notice that the choice probabilities $P_i$ are differentiable with respect to the generalized travel times if and only if the given parameter matrix is symmetric. In this case $P_i \in C^\infty$ holds.

3.5 Asymptotic behaviour

In this section, the first question to be answered is: How does the generalized logit model behave for $t_i \to \infty$, $i \in \{1,2,\dots,n\}$? One would expect that

$$\lim_{t_i \to \infty} P_i = 0.$$

This is satisfied because

$$\lim_{t_i \to \infty} b_i (t_i - t_{\min}) = -\infty.$$

Therefore, for $t_i \to \infty$ the share of the $i$th alternative is distributed among the remaining alternatives. This process of distribution is investigated now. For that we consider the model at the reference point. From the Jacobian matrix at this point we see that an increase of $t_i$ by $\Delta t_i$ (for small $\Delta t_i$) results in:

$$P_i = \frac{1}{n} + \frac{\sum_k b_{ik}}{n^2}\,\Delta t_i; \qquad P_j = \frac{1}{n} - \frac{b_{ij}}{n^2}\,\Delta t_i, \quad j \neq i.$$

With

$$\delta P_i = -\frac{\sum_k b_{ik}}{n^2}\,\Delta t_i$$

we obtain

$$P_i = \frac{1}{n} - \delta P_i, \quad \delta P_i > 0; \qquad P_j = \frac{1}{n} + \frac{b_{ij}}{\sum_k b_{ik}}\,\delta P_i, \quad j \neq i.$$

The generalized logit model has the property that the above described distribution of the share of the $i$th alternative among the remaining alternatives for small $\Delta t_i$ also holds for $t_i \to \infty$. This means that for $t_1 = \dots = t_{i-1} = t_{i+1} = \dots = t_n$ we have

$$\lim_{t_i \to \infty} P_j = \frac{1}{n} + \frac{1}{n}\,\frac{b_{ij}}{\sum_k b_{ik}}, \quad j \neq i.$$

The above expression holds because for $t_1 = \dots = t_{i-1} = t_{i+1} = \dots = t_n$ and $t_i \to \infty$ we have $g_{jk} = 0$, $k \neq i$, and

$$g_{ji} = \ln\left(\frac{b_{ji} + (n-1)\,b_i}{n\, b_i}\right), \quad j \neq i;$$

consequently

$$\lim_{t_i \to \infty} P_j = \frac{\exp\bigl(b_j(t_j - t_{\min}) + g_{ji}\bigr)}{\sum_{k \neq i} \exp\bigl(b_k(t_k - t_{\min}) + g_{ki}\bigr)} = \frac{\dfrac{b_{ji} + (n-1)\,b_i}{n\,b_i}}{\sum_{k \neq i} \dfrac{b_{ki} + (n-1)\,b_i}{n\,b_i}} = \frac{b_{ji} + (n-1)\,b_i}{(n-1)\,b_i + (n-1)^2\, b_i}$$

$$= \frac{1}{n}\,\frac{b_{ji} + (n-1)\,b_i}{(n-1)\,b_i} = \frac{1}{n} + \frac{1}{n}\,\frac{b_{ij}}{\sum_k b_{ik}}.$$

This is the desired result.
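The limit can be checked numerically by making one generalized travel time very large while keeping the others equal. The sketch below assumes the model definitions $g_{ik} = c_{ik}\bigl(1 - \exp(-a_{ik}(t_k - t_{\min})/c_{ik})\bigr)$, $c_{ik} = \ln\bigl((b_{ik} + (n-1)b_k)/(n b_k)\bigr)$ and $a_{ik} = (b_k - b_{ki})/n$; the function name, the parameter matrix and the travel times are hypothetical values chosen for illustration.

```python
import math

def generalized_logit(t, B, tol=1e-12):
    """Generalized logit probabilities for a symmetric parameter matrix B."""
    n = len(t)
    t_min = min(t)
    b = [sum(row) / (n - 1) for row in B]      # b_i: mean parameter of mode i
    h = []
    for i in range(n):
        h_i = b[i] * (t[i] - t_min)
        for k in range(n):
            a_ik = 0.0 if k == i else (b[k] - B[k][i]) / n
            if abs(a_ik) < tol:
                continue                        # g_ik is identically zero
            c_ik = math.log1p((B[i][k] - b[k]) / (n * b[k]))
            h_i += c_ik * (1.0 - math.exp(-a_ik / c_ik * (t[k] - t_min)))
        h.append(h_i)
    top = max(h)
    w = [math.exp(x - top) for x in h]
    total = sum(w)
    return [x / total for x in w]

# Hypothetical example: mode 0 = car, modes 1 and 2 = two similar bus lines.
B = [[0.0, -0.05, -0.05],
     [-0.05, 0.0, -0.40],
     [-0.05, -0.40, 0.0]]
n = len(B)
i = 2                                           # the mode whose time grows

t = [30.0] * n
t[i] = 30.0 + 1000.0                            # stands in for t_i -> infinity
p = generalized_logit(t, B)

# Predicted limit for j != i: 1/n + b_ij / (n * sum_k b_ik)
limit = [0.0 if j == i else 1.0 / n + B[i][j] / (n * sum(B[i]))
         for j in range(n)]
print(p)
print(limit)
```

With these values the car keeps a share of about 0.37 and the remaining bus line takes about 0.63, so the vanished bus line hands most of its share to the mode it resembles.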

3.6 Utility maximization

After describing the functional properties of the model, we will now regard the model from the point of view of maximization of the utility function. We have shown in Section 3.4 that the function

$$h_i = b_i (t_i - t_{\min}) + \sum_k g_{ik}$$

can be approximated by the function

$$u_i = b_i t_i + \sum_k a_{ik} t_k$$

if the generalized travel times $t_i$ ($i = 1,2,\dots,n$) are approximately equal. Let

$$U_{ij} = u_i + x_{ij}$$

be the utility function of mode $i$ to an individual $j$, where $x_{ij}$ is a random component which is extreme-value-distributed (see Section 2). Under the assumption that each person chooses the alternative with the greatest utility to himself, we get the traditional logit model:

$$P_i = \frac{\exp(u_i)}{\sum_k \exp(u_k)}, \qquad i,k = 1,2,\dots,n.$$

(The index $j$ of the individual is suppressed.) This means that the generalized logit model behaves like the traditional logit model with the above special utility function in the neighbourhood of the reference point. Thus, the utility of alternative $i$ depends on the competitive situation with the remaining alternatives. Alternatives which are losing their ability to compete lose their initial shares to the remaining alternatives as described in Section 3.5.

4. A FUNCTIONAL COMPARISON WITH THE PROBIT MODEL

We consider the probit model with the following utility function for mode $j$ to an individual $i$:

$$U_{ij} = b\,t_j + x_{ij}.$$

But in contrast to the previous section the random components are now multivariate-normal-distributed. Suppose that a covariance matrix $\Sigma$ of the random components is given, and that quantities $\omega_{ij}$ are derived from its entries [the matrix and the defining formulas are illegible in the source]. Then we obtain the probit model in the following form

[integral expression for $P_1$, illegible in the source]

The remaining probabilities $P_2, \dots, P_n$ have analogous expressions. A detailed presentation is given e.g. by Domencich and McFadden (1975). Since the probabilities depend only on the $\omega_{ij}$ ($i \neq j$), the number of free parameters, $n(n-1)/2$, is the same as for the generalized logit model. At the reference point the Jacobian matrix of the probit model has off-diagonal entries of the form $-w_{ij}$ and diagonal entries of the form $\sum_{k \neq j} w_{kj}$, where the $w_{ij}$ are determined by the $\omega_{ij}$ [the matrix and the definition of $w_{ij}$ are only partly legible in the source].

Therefore, the Jacobian matrix for the probit model has the same form as the Jacobian matrix for the generalized logit model. Thus, we obtain relations between the parameters of the two models [the formulas are missing in the source].

This shows that the generalized logit model is very similar to the probit model. But it has the advantage of easier handling and, further, that the influences of the model parameters on the choice probabilities are more transparent.

5. CONCLUSION

In this paper we have proposed a new model which is a generalization of the well-known logit model, but does not share its functional limitations. This new model, called the generalized logit model, has the great advantage that the dependence of the choice probabilities on the mode characteristics is transparent. The reason for this transparency is that the parameters used are reflected in the Jacobian matrix in a simple form. For comparable models (e.g. the probit model or the class of GEV models) the dependences are harder to recognize. We think this paper gives a new insight into the model's behaviour with regard to the inherent dependences. This new insight makes it possible to transfer research results directly into the model behaviour. For example, information about elasticities of the choice probabilities with respect to the mode characteristics, or results in the field of ‘the value of time’, are directly related to the parameters of the model. Furthermore, information about the different similarities between the alternatives is easy to incorporate into the model behaviour via the choice of the model parameters. Nevertheless, the new model should be tested extensively to obtain more information about the model performance and the parameter matrix. For that, it is necessary to have an algorithm to estimate the required parameters systematically. This work remains for the future.

REFERENCES

Amemiya T. (1986) Advanced econometrics. Harvard University Press, Cambridge, MA.
Domencich T. A. and McFadden D. (1975) Urban travel demand. American Elsevier Publishing Company, New York.
Holland E. P. and Watson P. L. (1978) Measuring the impacts of the Area Licence Scheme. Traffic Engr. & Control, 19, 14-22.