A Discrete Choice Model When Context Matters


Journal of Mathematical Psychology 43, 518-538 (1999). Article ID jmps.1998.1237, available online at http://www.idealibrary.com



Antoine Billot
Université Panthéon-Assas (Paris 2) & CERAS, École Nationale des Ponts et Chaussées

and

Jacques-François Thisse
CORE, Université Catholique de Louvain, & CERAS, École Nationale des Ponts et Chaussées

This paper contributes to the description of the choice behavior of an individual who, as a result of capacity limitations, is unable to deal with a large number of alternatives and who is also unsure about his most preferred alternative in any given set. Hence, there is a conflict in that the individual wants both to reduce the choice set and to keep as many alternatives as possible. A natural way to solve this conflict is to view the individual as choosing sequentially while giving a weight to each corresponding subset of alternatives. We assume that the weights are defined by Choquet capacities which are independent of the path followed in the choice process. These choice capacities are obtained from a utility defined on the power set of alternatives. This implies that the context, identified as the opportunity set, influences the choice made by the individual. However, the choice capacities are not observable. It is shown that they can be converted into probabilities which have intuitive and appealing properties. In particular, using these probabilities allows for a sensible solution to the blue bus/red bus paradox when the individual has a natural preference for flexibility. © 1999 Academic Press

1. INTRODUCTION

Individual choice theory has been very much dominated by the neoclassical model (see, e.g., Arrow, 1959): given any finite set of mutually exclusive alternatives, an individual chooses with certainty the alternative with the highest utility level. In other words, the utility of a choice set is always identical to the utility of its most preferred alternative, regardless of the other alternatives. Consequently, if the most preferred alternative still is a member of a smaller choice set, the well-being of the individual is supposed to be the same if only this smaller choice set is available (this is known as the Chernoff condition in choice theory).

We are grateful to the action editor A. A. J. Marley and to two referees for very detailed comments and suggestions. We also thank A. Chateauneuf, J. W. Friedman, J. Y. Jaffray, Ph. Mongin, D. Schmeidler, T. E. Smith, A. Tversky, and B. Walliser for useful discussions. The usual disclaimer applies. Address reprint requests to Antoine Billot, Department of Economics, Université de Paris II, 92, rue d'Assas, 75006 Paris, France. 0022-2496/99 $30.00. Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.

The assumption that the individual is a perfect optimizer has been under attack for many years because the implications of perfect rationality fail to be confirmed by laboratory experiments and because the conclusions of rational analysis sometimes seem to be unreasonable (Aumann, 1997). To be sure, the idea that decisions are made by imperfect optimizers is appealing, but there is no unified theory of imperfectly rational decision making.

An interesting line of research can be found in discrete choice theory where the individual's behavior is described by a probability distribution that roughly reflects his utility buried in his unconscious mind. In such models, if one alternative is superior to another according to this unconscious utility, it has a higher probability of being chosen. Consequently, the individual's behavior exhibits a tendency toward optimization. In this perspective, Luce's (1959) analysis has proven to be very useful in describing imperfectly rational decision makers (see, e.g., Billot, 1998; Chen, Friedman, & Thisse, 1997; Mirrlees, 1986).

In what follows, we propose a generalized Lucean framework that accounts for some of the main factors governing the individual's choice process in a world of bounded rationality. Specifically, this paper aims at contributing to the description of an imperfectly rational individual who (i) has limited ability to deal with a large number of alternatives and (ii) is unsure about the most preferred alternative in a given set.

First, it has been empirically observed that an individual has a limited ability to structure his preferences when the choice set he faces is large. For example, according to Pessemier (1978, p.
380), "the evoked set typically contains only a small fraction of the objects found in the associated market." In other words, the individual often focuses on a small set of alternatives before actively engaging in a choice decision process. In all likelihood, the main reason for such an attitude lies in the individual's limitations in processing information related to all the alternatives. Indeed, Miller (1956) noticed that an individual often deals with a small number of alternatives (seven plus or minus two). Presumably, this is because choice becomes too difficult to organize for a large number of alternatives. In the same vein, Simon forcefully argued in various publications that knowing all possible alternatives as well as comparing them in terms of their consequences turn out to be very hard tasks for the decision maker (Simon, 1957).

However, there is another angle to consider. As noted by Tversky (1972a, p. 281), "when faced with a choice among several alternatives, people often experience uncertainty and inconsistency, that is, people are often not sure which alternative they should select, nor do they always take the same choice under seemingly identical conditions." Therefore, the individual may prefer a large set of alternatives to a small one in order to retain as much flexibility as possible at the very moment of his choice (Kreps, 1979). This is also in accord with standard discrete choice theory where the expected maximum utility increases with the number of alternatives (McFadden, 1981).

Hence, we may safely conclude that there is a conflict between the need for focusing and the preference for flexibility: the individual simultaneously wants to reduce the choice set (given his limited ability in processing various objects) and to keep as many alternatives as possible (because he is unsure about his own tastes). For example, the individual restricts his shopping to a particular urban area because of the transportation costs he has to incur. On the other hand, since he does not know with certainty which store he will like best, the individual prefers to shop in a sufficiently large area.

This conflict is implicitly embedded in Luce's probabilistic choice model. More precisely, Luce's choice axiom can readily be interpreted as a "sequential choice process," the salience weights of which are given by probabilities independent of the path followed during the choice process. This model has considerable merits. First, a (scale) utility can be constructed from the choice probabilities which are given by a simple relationship of the form (us)/(us + them). Second, it incorporates a tendency toward utility maximization which can be expressed by means of a simple measure of departure from the neoclassical model. Finally, it allows the individual to choose any alternative belonging to the choice set, which gives him the opportunity to value flexibility in choice.

However, the Luce model suffers from a severe drawback pointed out by Debreu (1960): all choice probabilities are downgraded (upgraded) in the same proportional manner when an alternative is added to (dropped from) the choice set. This implies some form of symmetry among alternatives which appears to be inconsistent with the existence of different degrees of substitutability. Several probabilistic choice models have been proposed to circumvent this difficulty, such as the elimination by aspects model (Tversky, 1972b), the nested logit (McFadden, 1978), and the multivariate dependent probit (Hausman & Wise, 1978).
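The proportional downgrading criticized by Debreu can be illustrated with a small numerical sketch. The scale values below are hypothetical, and `luce_probabilities` is an illustrative helper name rather than anything defined in the paper; the code only applies the (us)/(us + them) ratio.

```python
from fractions import Fraction


def luce_probabilities(scale):
    """Luce ratio rule: each alternative's scale value divided by the total."""
    total = sum(scale.values())
    return {a: Fraction(v, total) for a, v in scale.items()}


# Hypothetical scale values: a car and a blue bus that are equally attractive.
two = luce_probabilities({"car": 1, "blue_bus": 1})

# Add a red bus that is identical to the blue one (same scale value).
three = luce_probabilities({"car": 1, "blue_bus": 1, "red_bus": 1})

# Ratio of new to old probability for the alternatives that were already there.
ratios = {a: three[a] / two[a] for a in two}

print(two)
print(three)
print(ratios)
```

Under these assumed values both the car and the blue bus are rescaled by the same factor, even though the red bus is a close substitute for the blue bus only. This is the symmetry among alternatives that the text describes as a drawback.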
We will see below that the Luce model is such that the utility of the choice set is equal to the sum of the utilities of its constitutive elements. In other words, adding a new alternative generates an increment in utility which is independent of the number and characteristics of the other alternatives in the choice set. Clearly, this is a very restrictive property as one would expect the impact of a new alternative to be sensitive to the basic features of the context in which the choice is made. Hence, we consider a more general preference for flexibility which accounts for structural interdependence among alternatives.

In fact, Tversky (1972a) has already shown how the general elimination model allows for various types of interdependence among alternatives to be introduced in probabilistic choice models. As will be seen below, using such models amounts to assuming that the individual positively values all possible contexts in which the choice is to be made. In this paper, we go one step further by allowing the individual to negatively value particular contexts. This is a more general way to say that context matters when the individual selects an alternative.

Formally, we reach this goal by relaxing the probabilistic assumption and by replacing it by the weaker concept of capacity (Choquet, 1953), also called nonadditive probabilities (Schmeidler, 1982, 1989; Tversky & Koehler, 1994). More accurately, our model retains the path independence property embedded in Luce's model but this property is now formulated in terms of capacities. This leads to a formulation of the choice process which includes Tversky's (1972b) elimination by aspects model and McFadden's (1978) generalized extreme value model as special cases.


The advantages of using capacities in discrete choice theory are many. First, by assuming the existence of an unconscious utility defined over the power set of alternatives, we can bring about choice capacities expressing the propensity to choose the various alternatives in a given set. These capacities are shown to extend the Luce choice probabilities.

Second, because the only constraint is monotonicity with respect to set inclusion, the capacity model endows the individual with more freedom in evaluating the weight associated with a subset of alternatives. In particular, the sum of the choice capacities over the whole set of alternatives may be larger or smaller than one and may vary with the size of the choice set (because of the monotonicity, this sum can never decrease when the number of alternatives increases). Of course, it is often useful to add further restrictions on the unconscious utility but much scope is left here for different options.

Third, though the choice capacities are not observable, we will show how they can be converted into probabilities (to be interpreted as long run frequencies of choice), thus allowing for the possibility of conducting experiments in the laboratory. These probabilities are obtained through a procedure in which the choice capacities are upgraded or downgraded according to the underlying utility. More precisely, in order to compute these probabilities, we proceed in two steps: the impact of the context is (i) singled out and then (ii) smoothed out in order to generate the converted probabilities.

Fourth, and last, using these converted probabilities allows one to get rid of the blue bus/red bus paradox uncovered by Debreu provided that the unconscious utility exhibits a natural preference for flexibility. Specifically, we show that, if the individual places a higher value on the choice set formed by the car and a bus (whatever its color) than on any singleton, the converted probability of selecting the car is higher than the probability of choosing either bus. The gap between the probabilities varies with the value placed on the context defined by a car and a bus.

The above discussion suggests a striking similarity with results obtained recently in decision theory. Indeed, it is well known that the Savage approach does not cover situations described by Ellsberg (1961). Gilboa (1987) shows that the existence of a von Neumann-Morgenstern utility plus nonadditive probabilities can describe situations à la Ellsberg. Here, we show that choice configurations à la Debreu are consistent with the Lucean path independence property when probabilities are nonadditive.

2. THE MODEL AND SOME PRELIMINARY RESULTS

In order to capture the essence of the conflict between the desire for flexibility and the difficulty to choose within an opportunity set, we consider a utility defined over all subsets of alternatives.¹ The idea of propensity to choose is then discussed and formally defined as a capacity. Two possible interpretations of a choice capacity, based respectively on intrinsic and extrinsic uncertainty about the individual's tastes, are then suggested. Finally, the standard neoclassical and the Luce probabilistic choice models are characterized within our general framework.

¹ The same idea has been put forward in voting theory. See Falmagne & Regenwetter (1996) for a detailed discussion of approval voting.


2.1. Utility and Choice Capacity

In this section, our research strategy is as follows. First, we define a utility function over all subsets of possible alternatives in order to account for the context in which the choice is made. Second, we introduce the concept of choice capacity and relate it to the utility by using a generalized version of Luce's choice axiom. Last, we are able to show that the capacity to choose an alternative in an opportunity set is a function of the utility of this alternative and of the other alternatives in the opportunity set. Though the individual does not always optimize, the so-obtained choice capacity encapsulates a tendency toward optimizing behavior.

Consider first a set $A$ containing $n$ mutually exclusive alternatives and call an opportunity set any subset $S$ (or $T$) of $A$ to which the individual's choice is actually restricted. An individual is not always aware of his utility but it is as if he has such a utility rather deeply buried inside his unconscious mind. Formally, the utility is a mapping $u(\cdot)\colon 2^A \to \mathbb{R}_+$ such that $u(S)$ expresses the individual's satisfaction for choosing in the opportunity set $S$.² This utility is assumed to satisfy the following axioms:

(U1): $u(\emptyset) = 0$.

(U2): $u(A) < \infty$.

(U3): If $A \supseteq T \supseteq S$, then $u(A) \ge u(T) \ge u(S)$.

The first two axioms are normalization ones. The third axiom expresses a preference for flexibility³ in that $S \cup T$ cannot be worse than both $S$ and $T$. A possible interpretation of such a preference is that the individual is unsure about his most preferred alternative (Kreps, 1979). Indeed, as suggested in the psychological literature on choice, the individual's state of mind may change during the choice process between two subsequent stages and affect his ranking of alternatives.
Another interpretation, implicitly contained in Machina (1985), is that the individual values the "potential surprise" associated with an opportunity set: he prefers a lottery on this opportunity set to any sure alternative belonging to it.

Our next step is to focus on the individual's propensity to choose. This propensity is a measure of the incentives to select a particular alternative (or subset of alternatives) within a given opportunity set. It reflects a rough recognition of the individual's genuine utility $u(\cdot)$ and finds its origin in the various sources of noise that disturb the knowledge the individual has of his own tastes (see also below for more explanations). Formally, consider any two opportunity sets $S$, $T$ such that $S \subseteq T \subseteq A$; the broader set $T$ is defined as the context of the possible choice of $S$. In this paper, the propensity to choose $S$ in $T$ is defined by a mapping $c_T(S)$ from $2^A$ into $[0, 1]$ such that the following axioms are satisfied: for each $T \subseteq A$, $T \neq \emptyset$,

(C1): $c_T(\emptyset) = 0$.

(C2): $c_T(T) = 1$.

(C3): If $T \supseteq S \supseteq R$, then $c_T(T) \ge c_T(S) \ge c_T(R)$.

² Since we consider utilities on the nonnegative reals, it is natural to consider the utility $u(\cdot)$ as a "ratio scale" in contrast to the more general von Neumann-Morgenstern "interval scale" utility (see Falmagne (1985, ch. 13) for an extensive discussion of the different types of scaling).

³ When $A$ is a set of differentiated products and when $u(\cdot)$ is strictly increasing, the preference for flexibility corresponds to a variety-seeking behavior in which the individual's most preferred alternative changes from occasion to occasion (see, e.g., McAlister & Pessemier, 1982; Anderson et al., 1992, ch. 3).

Then, $c_T(\cdot)$ is a capacity (Choquet, 1953) which we will call the choice capacity. It expresses the propensity that a nonoptimizing individual has to choose a particular alternative (or a subset of alternatives) when facing the context $T$. It is readily seen that the axioms (C1) through (C3) are the counterparts of (U1) through (U3) in terms of choice behavior.

Two possible interpretations of a choice capacity can be given. First, since the utility is defined on the power set, the individual is typically influenced by the context given by the opportunity set $T$ (see, e.g., Belk (1975)). The utility of $T$ may be larger or smaller than the sum of the utilities of its constitutive elements. Within our framework, we therefore say that the utility is context-independent when the contribution of an alternative to the utility of an opportunity set is independent of the utilities of the other alternatives in that opportunity set. In this case, the utility of the individual is context-free. However, the present approach allows us to deal with more general situations in which the utility is context-dependent. We will see in Section 2.3 that the choice capacity is a probability if and only if the utility is context-free. Hence, if the individual is influenced by the context $T$ when choosing the alternative $a$, the propensity of selecting $a$ in $T$ is indeed a capacity.

The second interpretation follows the lines of discrete choice theory. From the viewpoint of an observer, the individual is characterized by a single probability distribution from which the individual draws from moment to moment a utility defined on the alternatives, while the decision process consists of the simple rule of selecting the alternative with the largest momentary utility (see, e.g., Anderson et al. (1992, ch. 2), Edgell & Geisler (1980)).
However, an observer may not have enough information to infer the probability distribution underlying the individual's choice and may instead consider several probability distributions as being possible. Combining these probability distributions does not generally yield a probability distribution. For example, if the observer is "risk-averse," he might choose the lower envelope of the distributions which is not a probability but a Choquet capacity (Gilboa & Schmeidler, 1989).

Note that the choice capacity is consistent with these two interpretations. However, the uncertainty is intrinsic in the former and extrinsic in the latter.

Given the emphasis that the literature on discrete choices has put on the Luce model for the choice probabilities, we find it reasonable to introduce similar restrictions on the choice capacities by assuming the following generalized choice axiom:

(GCA): The choice capacities are such that for all $R \subseteq S \subseteq T$: $c_T(R) = c_T(S) \cdot c_S(R)$.

This is reminiscent of Luce's axiom (see (2.4) below), in that the capacity to choose in $R$ is independent of the path followed in the focusing process. However, it is different from Luce's since $c_T(\cdot)$ is not necessarily a probability. Stated differently, the individual eliminates alternatives while giving a weight to each resulting subset of alternatives which is here a capacity.

Proposition 1. Assume that $c_T(S) \neq 0, 1$ for all $S \subset T$. Then, (GCA) holds if and only if there exists a positive function $u(\cdot)$ satisfying (U1) through (U3) defined on $2^T$ such that

$$c_S(R) = \frac{u(R)}{u(S)} \tag{2.1}$$

for all $R \subseteq S \subseteq T$. Furthermore, this function is unique up to multiplication by a positive constant (i.e., it is a ratio scale).

Proof. (Necessary condition) For all $S \subseteq T$, set $u(S) = c_T(S)/\alpha_T$ where $\alpha_T$ is a positive constant. According to (GCA), $c_T(R) = c_T(S) \cdot c_S(R)$ for all $R \subseteq S \subseteq T$. Then,

$$c_S(R) = \frac{c_T(R)}{c_T(S)} = \frac{\alpha_T\, u(R)}{\alpha_T\, u(S)} = \frac{u(R)}{u(S)}.$$

It is readily verified that $u(\cdot)$ as given by (2.1) satisfies (U1) through (U3). Suppose now that another utility function $u'(\cdot)$ exists such that

$$\frac{u'(R)}{u'(S)} = c_S(R).$$

Then, we have

$$u(S) = \frac{1}{\alpha_T}\, c_T(S) = \frac{1}{\alpha_T} \cdot \frac{u'(S)}{u'(T)}.$$

Setting $\beta_T = \alpha_T\, u'(T)$, we obtain $u'(S) = \beta_T\, u(S)$.

(Sufficient condition) Obvious. Q.E.D.

The expression (2.1) embodies the conflicting wishes, i.e., flexibility and focusing, in that, for any alternative $a \in T$, $u(a)$ and $u(T)$ pull in opposite directions under the form of a ratio. This ratio has the advantage of simplicity and, for this reason, has attracted much attention among psychologists and other social scientists; see, e.g., Sen & Smith (1995, ch. 2) for a historical survey. This specification will allow us to obtain very simple formulae in Section 3, while it contains the Luce model as a special case.

It also implies two more things. First, the propensity of choosing any subset increases with its utility, thus expressing the tendency toward optimizing behavior that characterizes the individual. Second, the propensity for the individual to restrict himself to $S$ does not increase when the utility of $T$ rises. This is because the prominence of $S$ with respect to the context $T$ falls. Clearly, these two properties should characterize the behavior of an imperfectly optimizing individual.
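As a minimal sketch of how (2.1) turns a set utility into choice capacities, the following code builds $c_T(\cdot)$ from a utility on the power set of a three-element set and checks (C1) through (C3) and (GCA) by enumeration. The utility values and the helper names (`subsets`, `capacity`, `is_capacity`, `satisfies_gca`) are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction
from itertools import combinations


def subsets(items):
    """Return every subset of `items` as a frozenset."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def capacity(u, S, R):
    """Choice capacity c_S(R) = u(R) / u(S), as in (2.1); assumes R ⊆ S, u(S) > 0."""
    return Fraction(u[R], u[S])


def is_capacity(c, T):
    """Check (C1)-(C3) for a mapping c defined on all subsets of T."""
    monotone = all(c[R] <= c[S] for S in subsets(T) for R in subsets(S))
    return c[frozenset()] == 0 and c[T] == 1 and monotone


def satisfies_gca(u, T):
    """Check c_T(R) = c_T(S) * c_S(R) for all R ⊆ S ⊆ T with u(S) > 0."""
    return all(capacity(u, T, R) == capacity(u, T, S) * capacity(u, S, R)
               for S in subsets(T) if u[S] > 0
               for R in subsets(S))


# Hypothetical utility on the power set of T = {a, b, c}; it satisfies (U1)-(U3).
T = frozenset("abc")
values = {"": 0, "a": 2, "b": 1, "c": 1, "ab": 4, "ac": 3, "bc": 2, "abc": 6}
u = {frozenset(k): v for k, v in values.items()}

c_T = {S: capacity(u, T, S) for S in subsets(T)}
singleton_sum = sum(c_T[frozenset(x)] for x in T)

print(is_capacity(c_T, T), satisfies_gca(u, T), singleton_sum)
```

With these assumed values the capacities of the three singletons sum to less than one, so $c_T(\cdot)$ is a capacity without being a probability, while path independence in the sense of (GCA) still holds.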


2.2. The Neoclassical Model of Choice

In the standard theory of individual choice, $u(S)$ is given by

$$u(S) = \max_{a \in S}\, [u(a)]. \tag{2.2}$$

This utility is such that the individual gives to the opportunity set $S$ the utility level obtained by maximizing over all the alternatives in $S$. It is shown below that such a utility is characterized by the following axiom:

(N): If for each $S \in 2^A$, there exists $T \in 2^A$ such that $u(S) = u(S \cup T)$, then for all $T' \in 2^A$, $u(S \cup T') = u(S \cup T' \cup T)$.

If the individual does not gain by adding $T$ to $S$, he does not gain either by adding the same set to $S \cup T'$ whatever $T'$ (even when $u(S \cup T') > u(S)$).

Proposition 2. If $u(\cdot)$ is a utility satisfying (U1) through (U3), then (2.2) is equivalent to (N).

Proof. The necessity part of the proof is obvious. The sufficiency part consists in applying Theorem 1 by Kreps (1979) when the set of states contains a single state and when the state-dependent utility is defined on $2^A$. Q.E.D.

Hence, a utility is neoclassical when the individual does not value the opportunity of choosing in a larger set (since his most preferred alternatives remain the same). Assume for simplicity that the most preferred alternative is always unique. Then, the capacity for the most preferred alternative of $S$, denoted $a^*_S$, to be chosen in $T$ is equal to 1 as long as $a^*_S$ is identical to the most preferred alternative of $T$, denoted $a^*_T$. However, since the choice capacities are not additive, the alternatives in $S$ different from $a^*_S$ do not necessarily have a zero choice capacity. On the contrary, when $a \in T$ and $a \neq a^*_T$, the capacity of choosing $a$ is nonnegative and strictly increasing in its own utility. As will be seen in Section 3.2, when a capacity is converted into a probability, the most preferred alternative has a probability 1 of being chosen whereas the other alternatives have a zero probability. In this case, we fall back on the standard interpretation of the neoclassical model.

Specifically, the choice capacities associated with the neoclassical model are given by a possibility distribution (see, e.g., Zadeh, 1978) defined by

$$c_T(S \cup S') = \max\,[c_T(S),\, c_T(S')], \tag{2.3}$$

which is not a probability distribution since the possibility of $S$ plus the possibility of $T \setminus S$ is larger than 1.

2.3. The Luce Probabilistic Model of Choice

In order to characterize the Luce model, we introduce the following axiom:

(L): For all $S, S', T \in 2^A$ such that $S \cap S' = \emptyset$ and $u(T) > 0$: $c_T(S \cup S') = c_T(S) + c_T(S')$.


This axiom means that $c_T(\cdot)$ is a probability (in contrast to (2.3)). Note that (GCA) and (L) imply

$$\text{For all } R \subseteq S \subseteq T \in 2^A \text{ such that } u(S), u(T) > 0:\quad c_T(R) = c_T(S) \cdot c_S(R), \tag{2.4}$$

which is the axiom proposed by Luce (1959, p. 6) when $c_T(\cdot)$ is a probability. As shown below, in the Luce model, $u(S)$ is such that

$$u(S) = \sum_{a \in S} [u(a)]. \tag{2.5}$$

Thus, adding a new alternative generates an increment which is independent of the number and characteristics of the other alternatives in the opportunity set. Following Lovász (1983), this means that the utility behind the Luce model is modular (or linear):

$$\forall S, S' \subseteq A,\quad u(S \cup S') + u(S \cap S') = u(S) + u(S'). \tag{2.6}$$
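The contrast between the neoclassical utility (2.2) and the Luce utility (2.5) can be checked by brute force on a small example. The sketch below assumes three alternatives with hypothetical utilities and tests the modularity condition (2.6) and the additivity axiom (L) for both specifications; all names and numbers are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def subsets(items):
    """Return every subset of `items` as a frozenset."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


base = {"a": 3, "b": 2, "c": 1}   # hypothetical utilities of the alternatives
A = frozenset(base)

# Neoclassical utility (2.2) and Luce utility (2.5) on the power set of A.
u_max = {S: max((base[x] for x in S), default=0) for S in subsets(A)}
u_sum = {S: sum(base[x] for x in S) for S in subsets(A)}


def is_modular(u, A):
    """Condition (2.6): u(S ∪ S') + u(S ∩ S') = u(S) + u(S') for all S, S'."""
    return all(u[S | Sp] + u[S & Sp] == u[S] + u[Sp]
               for S in subsets(A) for Sp in subsets(A))


def capacity(u, T, S):
    """Choice capacity c_T(S) = u(S) / u(T), as in (2.1)."""
    return Fraction(u[S], u[T])


def is_additive(u, T):
    """Axiom (L): c_T(S ∪ S') = c_T(S) + c_T(S') for disjoint S, S' ⊆ T."""
    return all(capacity(u, T, S | Sp) == capacity(u, T, S) + capacity(u, T, Sp)
               for S in subsets(T) for Sp in subsets(T) if not (S & Sp))


print(is_modular(u_sum, A), is_additive(u_sum, A))   # Luce case
print(is_modular(u_max, A), is_additive(u_max, A))   # neoclassical case
```

In this example the sum utility passes both checks, whereas the max utility fails both: its capacities follow the possibility rule (2.3), so the capacity of `{"a"}` plus that of `{"b", "c"}` exceeds one.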

Hence, the choice capacity is a probability if and only if the utility is modular or context-free. This result shows that the utility implicitly assumed by Luce is very restrictive in that the incremental utility generated by a new alternative depends only upon the characteristics of this alternative. If we want to consider more general patterns of structural dependence among alternatives in which the context matters, $c_T(\cdot)$ is no longer a probability so that using choice capacities becomes natural.

3. THE CONVERSION THEOREM

Despite their intuitive appeal, the choice capacities are unobservable, which makes it difficult to use them empirically. However, it is possible to utilize these capacities to construct probabilities which can be observed and tested. As seen in Section 2.3, the choice capacities are not probabilities when the underlying utility is not modular. We are therefore interested in finding an operator which permits the conversion of choice capacities into probabilities. For that, we use the concept of contextual utility which is defined below. We then show how this concept allows us to convert capacities into probabilities.

3.1. The Concept of Contextual Utility

Consider an individual whose utility is not context-free. If $S = \{a, b\}$, $u(a, b)$ is therefore different from $[u(a) + u(b)]$. In this case, it seems natural to express the contextual utility of $S$, denoted $\mu(S)$, by the difference

$$\mu(S) = u(a, b) - [u(a) + u(b)]. \tag{3.1}$$

In other words, $\mu(S)$ measures the exact contribution of the variety inside $S$, once the utilities of its constitutive elements have been deducted. When $S = \{a, b, c\}$, one might think that $\mu(S)$ would be given by $u(S) - [u(a) + u(b) + u(c)]$. However, this expression includes the contribution of each pair $\{a, b\}$, $\{b, c\}$, and $\{a, c\}$ to the contextual utility of $S$. Since these utilities are given by (3.1), the contextual utility of $\{a, b, c\}$ is actually defined as follows:

$$\mu(S) = \big[u(S) - [u(a) + u(b) + u(c)]\big] - [\mu(a, b) + \mu(a, c) + \mu(b, c)]. \tag{3.2}$$

Assume now that $S = \{a, b, c, d\}$. Following the lines of the argument above would lead to the expression

$$\mu(S) = \big[u(S) - [u(a) + u(b) + u(c) + u(d)]\big] - \sum_{a \in S} \mu(S_a), \tag{3.3}$$

where $S_a = S \setminus \{a\}$ and $\mu(S_a)$ is defined as in (3.2). However, each of the four triples $S_a$ includes three different pairs. Since there are only six possible pairs of alternatives in $S$, twelve pairs (two times each) have been deleted in (3.3). Consequently, too many pairs have been removed and the six basic pairs must be reintroduced. This can be achieved by forming the expression

$$\mu(S) = \big[u(S) - [u(a) + u(b) + u(c) + u(d)]\big] - \sum_{a \in S} \mu(S_a) + \sum_{a, b \in S;\ a \neq b} \mu(S_{a,b}), \tag{3.4}$$

where $S_{a,b} = S \setminus \{a, b\}$ for $a \neq b$, which avoids any double-counting. More generally, in view of (3.1), (3.2), and (3.4), we define the contextual utility $\mu(\cdot)$ by the mapping

$$\mu(\cdot)\colon 2^A \to \mathbb{R}, \qquad S \mapsto \mu(S) = \sum_{R \subseteq S;\ \#R \ge 2} (-1)^{\#(S \setminus R)} \Big( u(R) - \sum_{a \in R} u(a) \Big), \tag{3.5}$$

where $\#(S \setminus R)$ denotes the cardinality of $S \setminus R$. In words, the contextual utility of $S$ can be interpreted as the (dis)satisfaction the individual derives from choosing in $S$ independently of the proper subsets of $S$ he might face during the focusing process. Observe that the contextual utility satisfies the following important property.

Proposition 3. The contextual utility $\mu(S) = 0$ for all $S \in 2^A$ such that $\#S \ge 2$ if and only if $u(\cdot)$ is modular.

Proof. (Necessity condition) If $S$ contains two alternatives, then $\mu(S) = 0$ and (3.1) implies that $u(\cdot)$ is modular on $S$. If $S$ contains three alternatives, (3.2) and the above result imply that $u(\cdot)$ is modular on $S$. By induction, the same argument applies to any subset containing $m \ge 4$ alternatives.

(Sufficient condition) Obvious. Q.E.D.

In other words, the contextual utility is zero if and only if the individual's utility is context-free. The contextual utility can therefore be interpreted as a measure of deviation from context-independence.
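The definition (3.5) can be evaluated directly by enumerating subsets. The sketch below uses a hypothetical utility on the power set of three alternatives, compares the result of (3.5) on the full set with the three-element formula (3.2), and confirms on a modular utility that every contextual utility vanishes, in line with Proposition 3. Function names and numbers are illustrative assumptions.

```python
from itertools import combinations


def subsets(items, min_size=0):
    """Return the subsets of `items` with at least `min_size` elements."""
    items = sorted(items)
    return [frozenset(c) for r in range(min_size, len(items) + 1)
            for c in combinations(items, r)]


def contextual_utility(u, S):
    """mu(S) as in (3.5): signed sum over the subsets R of S with #R >= 2."""
    return sum((-1) ** len(S - R) * (u[R] - sum(u[frozenset([a])] for a in R))
               for R in subsets(S, min_size=2))


# Hypothetical context-dependent utility on the power set of {a, b, c}.
values = {"": 0, "a": 2, "b": 1, "c": 1, "ab": 4, "ac": 3, "bc": 2, "abc": 6}
u = {frozenset(k): v for k, v in values.items()}

abc = frozenset("abc")
pairs = [frozenset(p) for p in ("ab", "ac", "bc")]

mu_pairs = {p: contextual_utility(u, p) for p in pairs}
mu_abc = contextual_utility(u, abc)

# Formula (3.2): remove the singleton utilities, then the pair contributions.
via_3_2 = (u[abc] - sum(u[frozenset(x)] for x in abc)) - sum(mu_pairs.values())

# A modular utility built from the same singleton values.
u_mod = {S: sum(values[x] for x in S) for S in subsets(abc)}
mu_mod = [contextual_utility(u_mod, S) for S in subsets(abc, min_size=2)]

print(mu_pairs, mu_abc, via_3_2, mu_mod)
```

Here only the pair `{"a", "b"}` and the full set carry a nonzero contextual utility, which is the sense in which $\mu(\cdot)$ isolates the contribution of the context.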


Interestingly, the contextual utility $\mu(\cdot)$ is linked to the utility $u(\cdot)$ in a very particular way. The Möbius inverse of the utility $u(\cdot)$, denoted $M[u(\cdot)]$, is defined as follows:⁴ for all $S \in 2^A$,

$$M[u(S)] = \sum_{R \subseteq S} (-1)^{\#(S \setminus R)}\, u(R). \tag{3.6}$$

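Proposition 4. For all $S \in 2^A$ such that $\#S \ge 2$, we have $\mu(S) = M[u(S)]$.

Proof. It is sufficient to show that

$$\sum_{R \subseteq S} (-1)^{\#(S \setminus R)} \sum_{a \in R} u(a) = 0.$$

Indeed,

$$\sum_{R \subseteq S} (-1)^{\#(S \setminus R)} \sum_{a \in R} u(a) = \sum_{a \in S} \Big[ \sum_{R_a \subset S} (-1)^{\#(S \setminus R_a)} + (-1)^{\#S} \Big]\, u(a),$$

where $R_a$ denotes any subset of $S$ containing the subset $\{a\}$ (more generally, $R_X$ denotes any subset of $S$ containing the subset $X$). Thus, the proposition holds if $\sum_{R_a \subset S} (-1)^{\#(S \setminus R_a)} + (-1)^{\#S} = 0$. This is so because

$$\sum_{R_a \subset S} (-1)^{\#(S \setminus R_a)} + (-1)^{\#S} = (-1)^0 + \sum_{b \in R_a} (-1)^{\#(S \setminus R_{a,b})} + \sum_{b, c;\ b \neq c} (-1)^{\#(S \setminus R_{a,b,c})} + \cdots + (-1)^{\#S} = (1 - 1)^{\#S} = 0,$$

which ends the proof. Q.E.D.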

The concept of contextual utility therefore provides an intuitive interpretation of the Möbius inverse of the utility $u(\cdot)$. Using Lemma 2.3 of Shafer (1976) then yields: for any $S \in 2^A$,

$$u(S) = \sum_{R \subseteq S;\ \#R \ge 2} \mu(R) + \sum_{a \in S} u(a). \tag{3.7}$$
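The pair (3.6) and (3.7) can be read as a transform and its inverse. The sketch below, under the same kind of hypothetical utility used for illustration, computes the Möbius inverse of $u(\cdot)$ for every subset and then rebuilds $u(\cdot)$ by summing the Möbius values over subsets. The names `mobius` and `rebuilt` are assumptions made for this example.

```python
from itertools import combinations


def subsets(items):
    """Return every subset of `items` as a frozenset."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def mobius(f, S):
    """M[f(S)] = sum over R ⊆ S of (-1)^{#(S \\ R)} f(R), as in (3.6)."""
    return sum((-1) ** len(S - R) * f[R] for R in subsets(S))


# Hypothetical utility on the power set of A = {a, b, c}.
A = frozenset("abc")
values = {"": 0, "a": 2, "b": 1, "c": 1, "ab": 4, "ac": 3, "bc": 2, "abc": 6}
u = {frozenset(k): v for k, v in values.items()}

m = {S: mobius(u, S) for S in subsets(A)}

# (3.7): on singletons the Möbius inverse equals u(a), on larger sets it is
# the contextual utility, so summing m over all R ⊆ S gives back u(S).
rebuilt = {S: sum(m[R] for R in subsets(S)) for S in subsets(A)}

print(m[frozenset("ab")], m[A], rebuilt == u)
```

The round trip succeeds for any real-valued set function with value zero on the empty set, so it does not depend on the particular numbers chosen here.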

In words, the utility of $S$ is equal to the sum of the contextual utilities over all subsets containing at least two alternatives plus the sum of the utilities of the alternatives. This confirms the fact that $\mu(\cdot)$ is a measure of the (positive or negative) contribution of the context only. For notational simplicity, we write $\mu(\cdot)$ in place of $M[u(\cdot)]$.

Note that the contextual utility is not a utility in the sense of Section 2.1. Indeed, $\mu(R)$ is negative when the individual dislikes the context in which he chooses. However, following Shafer (1976, ch. 2), it is readily verified that the contextual utility $\mu(\cdot)$ of any opportunity set is always nonnegative when the utility $u(\cdot)$ is $2^n$-monotone, that is, when (U3) holds and, for all the subsets $S_i$ of $A$, the following inequality also holds:

$$u\Big(\bigcup_{i=1}^{2^n} S_i\Big) \ge \sum_{i} u(S_i) - \sum_{i<j} u(S_i \cap S_j) + \cdots + (-1)^{2^n+1}\, u\Big(\bigcap_{i=1}^{2^n} S_i\Big). \tag{3.8}$$

⁴ Note that the Shapley value (Shapley, 1953) is identical to a Möbius inverse defined on the algebra of subsets (Gilboa & Schmeidler, 1992). A general approach to the Möbius inverse can be found in Rota (1964).

When $u(\cdot)$ is $2^n$-monotone, it is $k$-monotone for any $k<2^n$ in the sense that (3.8) holds for any family of $k$ subsets of alternatives. In particular, U3 corresponds to 1-monotonicity ($k=1$), while 2-monotonicity ($k=2$) means that $u(S \cup S') \ge u(S)+u(S')-u(S \cap S')$, which is identical to the property of supermodularity discussed below. The intuition behind $k$-monotonicity for $k \ge 3$ is not clear and raises questions of interpretation comparable to those for the differentiability of order larger than or equal to 3 in standard utility theory. Note that the Luce utility (2.5) is trivially $2^n$-monotone since the probability is itself $2^n$-monotone (see Jaffray, 1989, p. 102).

We now show that the Möbius inverse of the choice capacity takes a very simple form when the latter is given by (2.1).

Proposition 5. Assume that (GCA) holds. Then, for all $T \in 2^A$ such that $u(T)>0$, the Möbius inverse $M[c_T(\cdot)]$ of $c_T(\cdot)$ is: for all $S \in 2^T$,

$$M[c_T(S)] = \mu(S)/u(T). \tag{3.9}$$

Proof. The Möbius inverse of $c_T(\cdot)$ is given by (see (3.6))

$$M[c_T(S)] = \sum_{R \subseteq S} (-1)^{\#(S-R)}\, c_T(R).$$

Substituting $c_T(R)=u(R)/u(T)$ in this expression and again using (3.6) yields (3.9). Q.E.D

Hence, under (GCA), the Möbius inverse of the choice capacity $c_T(\cdot)$ retains the same form as the choice capacity when the utility of $S$ is replaced by its contextual utility. In particular, the Möbius inverse of the capacity of choosing $a \in T$ is obtained from the Möbius inverse of the utility used to define this capacity. Note that, unlike the choice capacity, its Möbius inverse can be negative. However, when $u(\cdot)$ is $2^n$-monotone, it is nonnegative and corresponds to a basic probability in the sense of Shafer (1976, p. 38), where a basic probability extends the concept of probability by defining the corresponding distribution over the power set of alternatives instead of over the set of alternatives.

3.2. The Conversion Theorem

Since our objective is to determine probabilities from the capacities (2.1) while retaining the influence of the context, we consider an operator $\Phi(\cdot)$ of the utility $u(\cdot)$ based on the contextual utility $\mu(\cdot)$.
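The Möbius inversion (3.6), the summation identity (3.7), and Proposition 5 can be checked on a small case. The following is a minimal sketch, not part of the original analysis: the three-alternative utility values are hypothetical, and the helper names `subsets` and `mobius` are introduced here for illustration only.

```python
from fractions import Fraction
from itertools import combinations


def subsets(items):
    """Yield every subset of `items`, the empty set included, as a frozenset."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def mobius(f, S):
    """Moebius inverse at S: sum over R subset of S of (-1)^{#(S-R)} f(R), as in (3.6)."""
    return sum((-1) ** (len(S) - len(R)) * f[R] for R in subsets(S))


# Hypothetical utility on the power set of A = {a, b, c} (illustrative numbers only).
A = frozenset("abc")
u = {frozenset(k): Fraction(v) for k, v in
     {"": 0, "a": 2, "b": 1, "c": 1, "ab": 4, "ac": 3, "bc": 2, "abc": 5}.items()}

# Contextual utility mu = M[u].
mu = {S: mobius(u, S) for S in subsets(A)}

# Summing mu over the subsets of S should give back u(S), as in (3.7).
rebuilt = {S: sum(mu[R] for R in subsets(S)) for S in subsets(A)}

# Choice capacity c_T(S) = u(S)/u(T) for T = A, and its Moebius inverse.
c_T = {S: u[S] / u[A] for S in subsets(A)}
m_c = {S: mobius(c_T, S) for S in subsets(A)}

print(rebuilt == u)                                       # identity (3.7)
print(all(m_c[S] == mu[S] / u[A] for S in subsets(A)))    # Proposition 5
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the equalities can be tested without rounding error.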


Let $\lambda$ be any probability measure defined on $2^A$ and $\Phi^{\lambda}(\cdot)$ be the operator proposed by Sundberg & Wagner (1992):

$$\Phi^{\lambda}(\cdot): 2^A \to \mathbb{R}, \qquad S \mapsto \Phi^{\lambda}[u(S)] = \sum_{R \subseteq A;\; R \cap S \neq \emptyset} \lambda(R \cap S \mid R)\, \mu(R). \tag{3.10}$$

In words, $\Phi^{\lambda}[u(S)]$ is a weighted combination of the contextual utilities of the opportunity sets $R$ whose intersection with $S$ is nonempty. The weight associated with $R$ corresponds to the conditional probability $\lambda(\cdot \mid R)$ of choosing in $S$ when facing $R$. It is readily verified that $\Phi^{\lambda}[u(S)]=u(S)$ for all $S \in 2^A$ if and only if $u(\cdot)$ is modular (see also Proposition 7). In the special case of singletons, when (GCA) holds, Proposition 5 implies that (3.10) can be rewritten as follows: for all $T \in 2^A$ such that $u(T)>0$ and all $\{a\} \in 2^T$,

$$p_T^{\lambda}(\cdot): 2^T \to [0,1], \qquad \{a\} \mapsto p_T^{\lambda}(a) = \Phi^{\lambda}[c_T(a)] = \sum_{a \in R \subseteq T} \lambda(a \mid R) \cdot M[c_T(R)]. \tag{3.11}$$
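The conversion (3.11) can be sketched in a few lines. The code below is illustrative only and rests on assumptions not in the text: the probability measure is represented by positive weights on the alternatives, so that the conditional probability of $a$ given $R$ is the weight of $a$ divided by the total weight of $R$; the utility values are hypothetical; and `converted_probability` is a name chosen here.

```python
from fractions import Fraction
from itertools import combinations


def subsets(items):
    """Yield every subset of `items`, the empty set included, as a frozenset."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def mobius(f, S):
    """Moebius inverse at S: sum over R subset of S of (-1)^{#(S-R)} f(R)."""
    return sum((-1) ** (len(S) - len(R)) * f[R] for R in subsets(S))


def converted_probability(u, T, weights, a):
    """(3.11): sum over R with a in R, R subset of T, of lambda(a|R) * mu(R)/u(T)."""
    total = Fraction(0)
    for R in subsets(T):
        if a in R:
            # Assumed form of the conditional probability lambda(a | R).
            lam = Fraction(weights[a]) / sum(Fraction(weights[x]) for x in R)
            total += lam * mobius(u, R) / u[T]
    return total


# Hypothetical utility on the power set of T = {a, b, c}.
T = frozenset("abc")
u = {frozenset(k): Fraction(v) for k, v in
     {"": 0, "a": 2, "b": 1, "c": 1, "ab": 4, "ac": 3, "bc": 2, "abc": 5}.items()}
weights = {"a": 1, "b": 1, "c": 1}  # uniform lambda, an arbitrary choice

p = {a: converted_probability(u, T, weights, a) for a in sorted(T)}
capacities = {a: u[frozenset([a])] / u[T] for a in sorted(T)}

print(p)            # converted probabilities; they sum to one
print(capacities)   # choice capacities c_T(a) = u(a)/u(T); they need not sum to one
```

In this example the capacities of the singletons sum to 4/5, whereas the converted values sum to one, which is the point of the conversion.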

The result below gives a necessary and sufficient condition on the utility $u(\cdot)$ for the mapping (3.11) to be a probability.

Theorem 1. Assume that (GCA) holds and consider a utility $u(\cdot)$ satisfying (U1) and (U2). Then, for all $T \in 2^A$ such that $u(T)>0$, $p_T^{\lambda}(\cdot)$ as defined in (3.11) is a probability if and only if $u(\cdot)$ satisfies (U3).

Proof. There are three steps.

(i) Clearly, $p_T^{\lambda}(\emptyset)=0$ since $\Phi^{\lambda}(0)=0$ by (3.10) and $p_T^{\lambda}(T)=1$ by (3.7), (3.9), and (3.10).

(ii) Using (3.10), it is readily verified that $p_T^{\lambda}(S \cup S') = p_T^{\lambda}(S)+p_T^{\lambda}(S')$ for all $S, S' \in 2^T$ such that $S \cap S' = \emptyset$.

(iii) It remains to show that $p_T^{\lambda}(a) \ge 0$ iff U3 holds.

(Necessity) This is shown by Theorem 1 of Sundberg & Wagner (1992).

(Sufficiency) We prove this by showing that if U3 does not hold, then $p_T^{\lambda}(a)$ can be negative. Since U3 does not hold, there exists a subset $T \in 2^A$, $\#T \ge 2$, and $a \in T$ such that $u(T)-u(T-a)<0$. From (3.7), it then follows that

$$\sum_{S \subseteq T} \mu(S) - \sum_{S \subseteq T-a} \mu(S) = \sum_{S \subseteq T;\; a \in S} \mu(S) < 0.$$

Hence, dividing by $u(T)$ and using (3.9) and (3.10) implies that $p_T^{\lambda}(a)$ is negative. It remains to show that such a $p_T^{\lambda}(\cdot)$ is consistent with the existence of a probability $\lambda$. Let $\lambda(a)=(1-\varepsilon)/[1+(\#A-2)\varepsilon]$, $\lambda(x)=\varepsilon/[1+(\#A-2)\varepsilon]$ for any $x \neq a$, and write $p_T^{\lambda}(\cdot)$ as $p_T^{\varepsilon}(\cdot)$. It is easy to check that

$$\lim_{\varepsilon \to 0} p_T^{\varepsilon}(a) = \sum_{S \subseteq T;\; a \in S} M[c_T(S)] = \frac{1}{u(T)} \cdot \sum_{S \subseteq T;\; a \in S} \mu(S) < 0.$$

Since the final term is negative, there exists an $\varepsilon>0$ such that $\lambda$ is positive (i.e., a probability) while $p_T^{\varepsilon}(a)$ is negative. Q.E.D

Accordingly, probabilities of the form (3.11) can be derived from capacities if and only if the individual has a preference for flexibility. This means that the converted probabilities are consistent with context-dependent utilities. In general, these probabilities do not satisfy the Luce axiom given by (2.4). Furthermore, since the probability distribution $\lambda$ entering the operator (3.10) is arbitrary, many converted probabilities can be constructed. When frequencies emerge from the repeated choices made by the individual, this indeterminacy may be resolved by their observation. In this way, the converted probabilities may become observable.

We now consider another property of the utility that leads to a neat comparison between capacities and converted probabilities. To this end, we follow Marley (1991, p. 206) who suggested that it might be valuable to explore the possible relevance of submodular and supermodular functions within the framework of discrete choice theory. In fact, this does allow us to gain more insight into the operator (3.10). A utility is said to be submodular (or concave) if, for all $S, S' \subseteq A$, $u(S \cup S')+u(S \cap S') \le u(S)+u(S')$. A utility is called supermodular (or convex) if $-u(\cdot)$ is submodular. It is readily verified that the neoclassical utility given by (2.2) is submodular.

Since $u(\cdot)$ is defined on $2^A$, it is possible to extend the concept of marginal utility to the case of a finite set of alternatives: the discrete marginal utility of an alternative is given by the increment in utility that results from adding this alternative to the opportunity set. Formally, the discrete marginal utility of the alternative $a$ when added to $S \subseteq A-\{a\}$, denoted $dmu(a, S)$, is $dmu(a, S)=u(S \cup \{a\})-u(S)$. When the individual has a preference for flexibility, the discrete marginal utility of an alternative is always nonnegative.
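The role of (U3) in Theorem 1 can be seen on a two-alternative case. The sketch below is an illustration under stated assumptions, not the authors' computation: the utility values are hypothetical, the probability measure is represented by weights on the alternatives built as in the proof with $\varepsilon = 1/10$, and the function names are introduced here.

```python
from fractions import Fraction
from itertools import combinations


def subsets(items):
    """Yield every subset of `items`, the empty set included, as a frozenset."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def mobius(f, S):
    """Moebius inverse at S: sum over R subset of S of (-1)^{#(S-R)} f(R)."""
    return sum((-1) ** (len(S) - len(R)) * f[R] for R in subsets(S))


def converted_probability(u, T, weights, a):
    """(3.11) with lambda(a|R) taken as weight(a) / total weight of R (an assumption)."""
    total = Fraction(0)
    for R in subsets(T):
        if a in R:
            lam = Fraction(weights[a]) / sum(Fraction(weights[x]) for x in R)
            total += lam * mobius(u, R) / u[T]
    return total


def proof_weights(A, a, eps):
    """Weights used in the proof: (1-eps)/[1+(#A-2)eps] on a, eps/[1+(#A-2)eps] elsewhere."""
    denom = 1 + (len(A) - 2) * eps
    return {x: ((1 - eps) if x == a else eps) / denom for x in A}


T = frozenset("ab")
lam = proof_weights(T, "a", Fraction(1, 10))

# Hypothetical utility violating (U3): u({a, b}) = 2 < u({b}) = 3.
u_bad = {frozenset(k): Fraction(v) for k, v in {"": 0, "a": 1, "b": 3, "ab": 2}.items()}
# Hypothetical monotone utility: u({a, b}) = 3 >= u({b}) = 3.
u_ok = {frozenset(k): Fraction(v) for k, v in {"": 0, "a": 1, "b": 3, "ab": 3}.items()}

p_bad_a = converted_probability(u_bad, T, lam, "a")
p_bad_b = converted_probability(u_bad, T, lam, "b")
p_ok_a = converted_probability(u_ok, T, lam, "a")

print(p_bad_a, p_bad_b)  # one value is negative although the two still sum to one
print(p_ok_a)            # nonnegative once the utility is monotone
```

The converted values sum to one in both cases; what fails without monotonicity is nonnegativity.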
The following result provides an appealing property of a submodular utility.

Proposition 6. Assume a utility $u(\cdot)$ satisfying (U1) through (U3). Then, $u(\cdot)$ is submodular (resp. supermodular) if and only if, for all $S \subseteq T \subseteq A-\{a\}$, we have $dmu(a, S) \ge dmu(a, T)$ (resp. $dmu(a, S) \le dmu(a, T)$).

Proof. (Necessity) Since $u(\cdot)$ is a submodular function, $u[(S \cup \{a\}) \cup T]+u[(S \cup \{a\}) \cap T] \le u(S \cup \{a\})+u(T)$, from which it follows that $u(T \cup \{a\})+u(S) \le u(S \cup \{a\})+u(T)$, i.e., $dmu(a, T) \le dmu(a, S)$.

(Sufficiency) Consider any two subsets $S$, $T$ of $A$ such that $\{a\} \equiv T-(S \cap T)$. Since $S \cap \{a\}=\emptyset$ and assuming $dmu(a, S) \le dmu(a, S \cap T)$, we have $u(S \cup \{a\})-u(S) \le u((S \cap T) \cup \{a\})-u(S \cap T)$, which is equivalent to $u(S \cup \{a\})+u(S \cap T) \le u(S)+u((S \cap T) \cup \{a\})=u(S)+u(T)$. Then, we obtain: $u(S \cup T)+u(S \cap T) \le u(S)+u(T)$. The same argument applies, mutatis mutandis, to a supermodular utility. Q.E.D
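Proposition 6 lends itself to a brute-force check on small sets. The sketch below is illustrative: it compares the two sides of the equivalence for a max-type utility, where the utility of a set is that of its best alternative (the neoclassical case mentioned above), and for a hypothetical supermodular utility equal to the squared number of alternatives. The function names and numbers are assumptions made here.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items`, the empty set included, as a frozenset."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def is_submodular(u, A):
    """Check u(S | R) + u(S & R) <= u(S) + u(R) for all pairs of subsets of A."""
    return all(u[S | R] + u[S & R] <= u[S] + u[R]
               for S in subsets(A) for R in subsets(A))


def dmu(u, a, S):
    """Discrete marginal utility of adding a to S (a is assumed not to belong to S)."""
    return u[S | {a}] - u[S]


def has_decreasing_dmu(u, A):
    """Check dmu(a, S) >= dmu(a, T) whenever S is a subset of T and T excludes a."""
    for a in A:
        for T in subsets(A - {a}):
            for S in subsets(T):
                if dmu(u, a, S) < dmu(u, a, T):
                    return False
    return True


A = frozenset("abc")

# Max-type utility with hypothetical values for the single alternatives.
values = {"a": 3, "b": 2, "c": 1}
u_max = {S: max((values[x] for x in S), default=0) for S in subsets(A)}

# Hypothetical supermodular utility: squared number of alternatives.
u_sq = {S: len(S) ** 2 for S in subsets(A)}

print(is_submodular(u_max, A), has_decreasing_dmu(u_max, A))
print(is_submodular(u_sq, A), has_decreasing_dmu(u_sq, A))
```

For each utility the two checks agree, as the proposition states they must.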


Thus, a submodular (resp. supermodular) utility has a decreasing (resp. increasing) discrete marginal utility. In other words, the discrete marginal utility of an alternative is higher (resp. lower) when it is added to a smaller set (in the sense of inclusion). Intuitively, a submodular (resp. supermodular) utility expresses a decreasing (resp. increasing) preference for flexibility.

Proposition 7. Assume that (GCA) holds and consider a utility $u(\cdot)$ satisfying (U1) through (U3). Then, for all $T \in 2^A$ such that $u(T)>0$, $p_T^{\lambda}(\cdot)$ as defined in (3.11) is a probability such that $p_T^{\lambda}(S) \le c_T(S)$ (resp. $p_T^{\lambda}(S) \ge c_T(S)$) for all $S \in 2^T$ if and only if $u(\cdot)$ is submodular (resp. supermodular).

Proof. There are three steps.

(i) As shown in Billot (1991), for any capacity $c_T(\cdot)$, there exists a dual capacity $c_T^d(\cdot)$ such that, for all $S \subseteq T$, $c_T(S)+c_T^d(T-S)=1$. Given (GCA) and Proposition 1, this implies that $u(S)+u^d(T-S)=u(T)$ where $u^d(\cdot)$ is also said to be dual. It is easy to check that if a utility $u(\cdot)$ is submodular (resp. supermodular), its dual $u^d(\cdot)$ is supermodular (resp. submodular). Because the dual of the dual utility is the utility itself, it is sufficient to establish the result in the case of a submodular utility.

(ii) (Necessity) This part of the proof corresponds to Theorem 2 of Sundberg & Wagner (1992).

(iii) (Sufficiency) Suppose that $u(\cdot)$ is not submodular so that $-u(\cdot)$ is not supermodular. Hence, Proposition 4 of Chateauneuf & Jaffray (1989) states that there exists a nonempty subset $R \subseteq T$ and $a, b \in R$ such that

$$\sum_{\{a,b\} \subseteq S \subseteq R} \mu(S) > 0.$$

When $a=b$, we have $u(R)-u(R-a)>0$. Thus, we can apply to $-u(\cdot)$ the same argument as that of Theorem 1 of Sundberg & Wagner (1992) to obtain the desired result. When $a \neq b$, we obtain the desired result by showing the existence of a probability $\lambda$ such that $p_T^{\lambda}(R-a)>c_T(R-a)$. Set $R_a=R-a$. By (3.6), (3.10), and (3.11), we have

$$\begin{aligned}
p_T^{\lambda}(R_a)-c_T(R_a) &= \Bigl\{ \sum_{S \subseteq T;\; S \cap R_a \neq \emptyset} \lambda(R_a \cap S \mid S)\, M[c_T(S)] \Bigr\} - c_T(R_a) \\
&= \sum_{S \subseteq T;\; S \notin 2^{R_a};\; S \cap R_a \neq \emptyset} \lambda(R_a \cap S \mid S)\, M[c_T(S)] \\
&= \sum_{S \subseteq R;\; a \in S} \lambda(R_a \cap S \mid S)\, M[c_T(S)] + \sum_{S \subseteq T;\; S \notin 2^{R}} \lambda(R_a \cap S \mid S)\, M[c_T(S)].
\end{aligned}$$

Let $\lambda(a)=\lambda(b)=(1-\varepsilon)/[2(1-\varepsilon)+(\#T-2)\varepsilon]$, $\lambda(x)=\varepsilon/[2(1-\varepsilon)+(\#T-2)\varepsilon]$ for each $x \in T-\{a, b\}$, and write $p_T^{\lambda}(\cdot)$ as $p_T^{\varepsilon}(\cdot)$. It is easy to check that

$$\lim_{\varepsilon \to 0} p_T^{\varepsilon}(R_a)-c_T(R_a) = \sum_{\{a,b\} \subseteq S \subseteq R} M[c_T(S)].$$

Since the final term is positive by (3.9), there exists $\varepsilon>0$ such that $\lambda$ is positive and $p_T^{\lambda}(R_a)>c_T(R_a)$. Q.E.D

This result provides a precise characterization of the probabilities with respect to the capacities from which they are derived and sheds some additional light on the meaning of the operator (3.10). Specifically, the converted probabilities are all larger (resp. smaller) than or equal to the corresponding capacities when the individual has an increasing (resp. decreasing) preference for flexibility. In other words, when the utility is supermodular, the conversion (3.11) raises the value of the choice capacities (2.1) so that the good contexts tend to dominate the bad ones from the individual's viewpoint. The converse is true when the utility is submodular. In the borderline case of a modular utility, we know that $c_T(\cdot)$ is a probability. By Proposition 7, this probability remains exactly the same under the conversion (3.11). Consequently, we have the following.

Corollary. If $u(\cdot)$ is modular, the converted probabilities (3.11) satisfy the Luce axiom.
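The modular case can be checked directly: when the utility of a set is the sum of the utilities of its alternatives, every contextual utility of a set with two or more alternatives vanishes, so (3.11) returns $u(a)/u(T)$ whatever weights are used. The sketch below assumes hypothetical utility values and arbitrary weights, with the conditional probability of $a$ given $R$ taken as the weight of $a$ over the total weight of $R$; the function names are introduced here.

```python
from fractions import Fraction
from itertools import combinations


def subsets(items):
    """Yield every subset of `items`, the empty set included, as a frozenset."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def mobius(f, S):
    """Moebius inverse at S: sum over R subset of S of (-1)^{#(S-R)} f(R)."""
    return sum((-1) ** (len(S) - len(R)) * f[R] for R in subsets(S))


def converted_probability(u, T, weights, a):
    """(3.11) with lambda(a|R) taken as weight(a) / total weight of R (an assumption)."""
    total = Fraction(0)
    for R in subsets(T):
        if a in R:
            lam = Fraction(weights[a]) / sum(Fraction(weights[x]) for x in R)
            total += lam * mobius(u, R) / u[T]
    return total


A = frozenset("abc")
values = {"a": 3, "b": 2, "c": 1}    # hypothetical utilities of the alternatives
weights = {"a": 5, "b": 1, "c": 2}   # arbitrary, deliberately unrelated to `values`

# Modular utility: the utility of a set is the sum over its alternatives.
u = {S: Fraction(sum(values[x] for x in S)) for S in subsets(A)}

p_A = {a: converted_probability(u, A, weights, a) for a in sorted(A)}
T2 = frozenset("ab")
p_ab = {a: converted_probability(u, T2, weights, a) for a in sorted(T2)}

print(p_A)    # equals u(a)/u(A) for each alternative
print(p_ab)   # equals u(a)/u({a, b}); the ratio p(a)/p(b) is the same in both sets
```

The ratio of the probabilities of `a` and `b` is 3/2 in both opportunity sets, which is the ratio-invariance usually associated with Luce-type probabilities.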

The converse property does not generally hold: when $T$ contains two alternatives, the submodular utility yields probabilities equal to Luce's.

We pause here to put our approach into perspective. The probabilities (3.11) can be rewritten as follows,

$$p_T^{\lambda}(a) = \sum_{a \in R \subseteq T} Q_T(R) \cdot \lambda(a \mid R), \tag{3.12}$$

where $Q_T(R) \equiv M[c_T(R)]$. Expression (3.12) generalizes the general elimination model (GEM) suggested by Tversky (1972a) and studied by Marley (1989) to the extent that in our setting the $Q_T(R)$ may be negative and also may not sum to 1 while they must be nonnegative and sum to 1 in GEM. Actually, in our context-dependent model, when the individual contemplates the possibility of selecting the alternative $a \in T$, he faces a priori all possible contexts defined by the subsets $R \subseteq T$ containing $a$. In each context $R$, called bad if $\mu(R)<0$ or good if $\mu(R)>0$, he has a probability $\lambda(a \mid R)$ of choosing $a$. However, the individual is aware that some contexts (the bad ones) are harmful to him while the others (the good ones) are beneficial, with a variable intensity measured by $\mu(R)/u(T)$. Hence, $p_T^{\lambda}(a)$ is downgraded (resp. upgraded) when the context is bad (resp. good). Moreover, the worse (resp. the better) a context, the more he downgrades (resp. upgrades) the choice capacities. The individual has therefore a symmetrical attitude toward good and bad contexts.

Finally, it is fair to say that our context-dependent model is more general than GEM in that it allows for some particular contexts to have a detrimental effect on the individual's satisfaction. A condition for our framework to imply GEM is that $u(\cdot)$ is $2^n$-monotone since the latter implies that all the $Q_T(R)$ are nonnegative and sum to 1 (see Shafer, 1976, Theorem 2.1). In fact, the utility is now such that the converted probabilities (3.11) are equivalent to the choice probabilities derived by Tversky (1972a, p. 347, Eq. 4) in the context of GEM. Furthermore, using Theorem 7 in Tversky (1972a), we have the following:


Proposition 8. Assume that $u(T)>0$ for all $T \in 2^A-\emptyset$ and $u(\cdot)$ is a $2^n$-monotone utility. Then, the probabilities (3.11) are consistent with a random utility model.$^5$

$^5$ See Anderson et al. (1992, ch. 2) for a survey of random utility theory.

In words, the property of $2^n$-monotonicity implies a general elimination procedure à la Tversky while the corresponding converted probabilities (3.11) can be rationalized by a random utility defined on the set of alternatives.

Another interesting property of the converted probabilities can also be obtained in the neoclassical model. We have seen that the capacities define a possibility measure (see (2.3)). As expected, when for each $T \in 2^A-\emptyset$ there is a single most preferred alternative, using Theorem 1 with appropriate $\lambda$, it is easy to check that the converted probability of choosing this alternative is one whereas the converted probability of choosing any other alternative is zero. Furthermore, in the special case where the utility of each alternative is the same, the converted probabilities of such possibilities are uniformly distributed when $\lambda$ is the Luce probability.

Proposition 9. Consider all $T \in 2^A$ such that $u(T)>0$ and assume $\lambda$ is given by the Luce probability. If there exists a given $u$ such that for each $a \in T$, $u(a)=u$, then $p_T^{\lambda}(a)=1/\#T$.

Proof. Let $T=\{a_1, a_2, \ldots, a_j, \ldots, a_n\}$. By (3.9) and (3.11),

$$\begin{aligned}
p_T^{\lambda}(a_j) &= c_T(a_j) \cdot \Biggl\{ \frac{\mu(a_j)}{\mu(a_j)} + \sum_{k=1;\; k \neq j}^{n} \frac{\mu(a_j, a_k)}{\mu(a_j)+\mu(a_k)} + \sum_{k,l=1;\; k,l \neq j}^{n} \frac{\mu(a_j, a_k, a_l)}{\mu(a_j)+\mu(a_k)+\mu(a_l)} + \cdots + \frac{\mu(T)}{\sum_{a_j \in T} \mu(a_j)} \Biggr\} \\
&= 1 + \sum_{k=1;\; k \neq j}^{n} \frac{\mu(a_j, a_k)}{2u} + \sum_{k,l=1;\; k,l \neq j}^{n} \frac{\mu(a_j, a_k, a_l)}{3u} + \cdots + \frac{\mu(T)}{nu},
\end{aligned}$$

where $c_T(a)=1$ since $u(a_j)=u$ for all $j$. Clearly, in the above expression, the RHS is independent of the alternative $a_j$. Q.E.D

This result corresponds to the extreme case in which the individual pays attention only to the number of alternatives contained in an opportunity set. Intuitively, it is therefore not surprising that the converted probabilities are equal across alternatives.

We conclude by showing that the operator (3.10) is an additivization in the sense of Gilboa (1989). The definition of such an operator must satisfy the following four conditions: linearity, monotonicity, projectivity, and consistency. The definition of linearity and monotonicity is standard. Projectivity means that $u(\cdot)$ is invariant under the operator $\Phi^{\lambda}(\cdot)$ when $u(\cdot)$ is modular. Finally, consistency requires the operator $\Phi^{\lambda}(\cdot)$ and the Bayesian updating rule to be commutative.

Theorem 2. Assume a utility $u(\cdot)$ satisfying (U1) through (U3). Then, the operator (3.10) is an additivization.


Proof. There are four steps. Let $u(\cdot)$ and $\nu(\cdot)$ be two utilities in the sense of Section 2.1.

(i) Linearity: $\Phi^{\lambda}[\alpha u(\cdot)+(1-\alpha)\nu(\cdot)]=\alpha\Phi^{\lambda}[u(\cdot)]+(1-\alpha)\Phi^{\lambda}[\nu(\cdot)]$. First, given (3.6), it is clear that the Möbius inverse of $\alpha u(S)$ for all $S \subseteq A$ is equal to $\alpha\mu(S)$. Second, summing the Möbius inverse of two different functions is equivalent to taking the Möbius inverse of the sum. By definition of the operator (3.10), it follows immediately that $\Phi^{\lambda}(\cdot)$ is linear.

(ii) Monotonicity: $u(\cdot) \ge \nu(\cdot) \Rightarrow \Phi^{\lambda}[u(\cdot)] \ge \Phi^{\lambda}[\nu(\cdot)]$. Then, using (3.6), (3.7), and (3.10) shows that monotonicity holds for $\Phi^{\lambda}(\cdot)$, regardless of the distributions $\lambda$.

(iii) Projectivity: $\Phi^{\lambda}[u(\cdot)]=u(\cdot)$ if $u(\cdot)$ is modular. This follows from Proposition 7.

(iv) Consistency: $\Phi^{\lambda}[u(\cdot)/u(T)]=\Phi^{\lambda}[u(\cdot)]/u(T)$. This follows from (3.10) and Proposition 4. Q.E.D
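The linearity step, together with the additivity that gives the operator (3.10) its name, can be verified numerically. This is a sketch under assumptions made here: the probability measure is represented by positive integer weights on the alternatives, the two utilities are hypothetical, and the function `additivize` is an illustrative name for the operator (3.10).

```python
from fractions import Fraction
from itertools import combinations


def subsets(items):
    """Yield every subset of `items`, the empty set included, as a frozenset."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def mobius(f, S):
    """Moebius inverse at S: sum over R subset of S of (-1)^{#(S-R)} f(R)."""
    return sum((-1) ** (len(S) - len(R)) * f[R] for R in subsets(S))


def additivize(u, A, weights, S):
    """Operator (3.10): sum over R meeting S of lambda(R & S | R) * mu(R)."""
    total = Fraction(0)
    for R in subsets(A):
        common = R & S
        if common:
            # Assumed conditional probability: weight of R & S over weight of R.
            lam = Fraction(sum(weights[x] for x in common), sum(weights[x] for x in R))
            total += lam * mobius(u, R)
    return total


A = frozenset("abc")
weights = {"a": 2, "b": 1, "c": 1}
u = {frozenset(k): Fraction(v) for k, v in
     {"": 0, "a": 2, "b": 1, "c": 1, "ab": 4, "ac": 3, "bc": 2, "abc": 5}.items()}
v = {frozenset(k): Fraction(w) for k, w in
     {"": 0, "a": 3, "b": 2, "c": 1, "ab": 3, "ac": 3, "bc": 2, "abc": 3}.items()}

alpha = Fraction(1, 3)
mix = {S: alpha * u[S] + (1 - alpha) * v[S] for S in subsets(A)}

# (i) Linearity of the operator in the utility.
linear = all(
    additivize(mix, A, weights, S)
    == alpha * additivize(u, A, weights, S) + (1 - alpha) * additivize(v, A, weights, S)
    for S in subsets(A))

# Additivity over disjoint sets, and preservation of the value of the whole set.
additive = all(
    additivize(u, A, weights, S | R)
    == additivize(u, A, weights, S) + additivize(u, A, weights, R)
    for S in subsets(A) for R in subsets(A) if not (S & R))
total_preserved = additivize(u, A, weights, A) == u[A]

print(linear, additive, total_preserved)
```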

4. RESOLUTION OF THE BLUE BUS/RED BUS PARADOX

Debreu (1960) showed that the path independence property assumed by Luce (1959) may lead to implausible results in the probabilistic context. Consider the classical example of the blue bus/red bus paradox, which is adapted from Debreu. Suppose that the individual must reach a given destination and has a probability $1/2$ of using his car (denoted by $c$) or of taking a bus. Suppose now that either one of two buses (denoted by $rb$ and $bb$) can be used, identical except for their color, red or blue. The opportunity set is given by $A=\{c, bb, rb\}$. Assume that the individual pays no attention to color so that the two buses have the same probability of being chosen. Intuitively, one would expect the respective choice probabilities to be given by $p_A(c)=1/2$ and $p_A(bb)=p_A(rb)=1/4$. However, it is well known that the Luce axiom (2.4) implies that each probability is equal to $1/3$ since the choice probabilities of the two existing alternatives are affected in the same way by introducing the second bus.

Assume a utility consistent with the choice situation above. First, we must have $u(c)=u(bb)=u(rb)=\alpha>0$. In our model, the utility is also defined on all opportunity sets. We therefore assume $u(bb \cup rb)=\alpha$ and $u(c \cup bb)=u(c \cup rb)=u(c \cup bb \cup rb)=\beta$. If the individual has a genuine preference for flexibility in that he values the opportunity of facing a car and a bus (whatever its color), but not the possibility of facing two buses instead of one, it must be that $\beta>\alpha$. Loosely speaking, the discrete marginal utility of adding the car or the first bus to the opportunity set is strictly positive, while the discrete marginal utility of adding the second bus is zero. By (2.1), it follows that $c_A(c)=c_A(bb)=c_A(rb)=c_A(bb \cup rb)=\alpha/\beta$. Clearly, these choice capacities are never additive.

Consider first the opportunity set $T=\{c, bb\}$. Then $c_T(c)=c_T(bb)=\alpha/\beta$. A simple but not unique way of revisiting the paradox is to assume that $\lambda$ is given by the Luce probabilities. Computing the converted probabilities (3.11) shows that $p_T^{\lambda}(c)=p_T^{\lambda}(bb)=1/2$, whatever $\alpha<\beta$. These correspond to the binary choice probabilities assumed by Luce.

Let us now come to the more general case where $T=\{c, bb, rb\}$. Again, the choice capacity of each alternative is given by $\alpha/\beta$. Applying (3.11) shows that $p_T^{\lambda}(c)=(2\beta-\alpha)/3\beta$ while $p_T^{\lambda}(bb)=p_T^{\lambda}(rb)=(\beta+\alpha)/6\beta$. Since $\beta>\alpha$, the probability of choosing the car is always strictly larger than the probability of choosing either bus. In other words, a genuine preference for flexibility, i.e., $u(c \cup bb)>u(c)=u(bb)$, is sufficient to destroy the equiprobability of choices in the blue bus/red bus paradox. In particular, when $\beta=2\alpha$, it is immediate that the probability of choosing the car is $1/2$ while the probability of choosing the blue bus (the red bus) is $1/4$. However, when $\alpha=\beta$ so that $u(c \cup bb)=u(c)=u(bb)$, the converted probabilities are all equal to $1/3$. Though we have kept the path independence property as in Luce, describing the individual's behavior by means of a choice capacity and assuming a genuine preference for flexibility leads to converted probabilities which agree with intuition.
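The closed forms above can be reproduced by applying (3.11) mechanically to the bus utility. The sketch below is illustrative: the Luce probabilities are taken to be proportional to the utilities of the single alternatives, the numerical values of the two utility levels are chosen here, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def subsets(items):
    """Yield every subset of `items`, the empty set included, as a frozenset."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def mobius(f, S):
    """Moebius inverse at S: sum over R subset of S of (-1)^{#(S-R)} f(R)."""
    return sum((-1) ** (len(S) - len(R)) * f[R] for R in subsets(S))


def bus_utility(alpha, beta):
    """alpha for the car alone or any set of buses, beta once the car and a bus are both available."""
    u = {}
    for S in subsets(["c", "bb", "rb"]):
        if not S:
            u[S] = Fraction(0)
        elif "c" in S and len(S) >= 2:
            u[S] = Fraction(beta)
        else:
            u[S] = Fraction(alpha)
    return u


def converted(u, T):
    """(3.11) on T with Luce-type lambda(a|R) = u(a) / sum of u(x) for x in R (an assumption)."""
    result = {}
    for a in sorted(T):
        total = Fraction(0)
        for R in subsets(T):
            if a in R:
                lam = u[frozenset([a])] / sum(u[frozenset([x])] for x in R)
                total += lam * mobius(u, R) / u[T]
        result[a] = total
    return result


full = frozenset(["c", "bb", "rb"])
pair = frozenset(["c", "bb"])

p_flex = converted(bus_utility(1, 2), full)    # beta = 2 * alpha
p_none = converted(bus_utility(1, 1), full)    # alpha = beta
p_other = converted(bus_utility(2, 3), full)   # compare with (2b - a)/3b and (b + a)/6b
p_pair = converted(bus_utility(2, 3), pair)    # binary choice between car and blue bus

print(p_flex)
print(p_none)
print(p_other)
print(p_pair)
```

With the utility level of a car-plus-bus set equal to twice that of a single alternative, the car receives 1/2 and each bus 1/4; with equal levels, all three alternatives receive 1/3.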

5. CONCLUDING REMARKS

We have proposed a descriptive model of choice for an individual whose unconscious utility is defined on the power set of alternatives and can be, therefore, context-dependent. This utility gives rise to choice capacities which incorporate a tendency toward utility maximization. Furthermore, converting these capacities yields choice frequencies which are intuitively appealing. Yet, we would certainly not argue that the individual should behave according to this model in order to maximize his satisfaction in a dynamic choice context (see Kreps (1988, ch. 13)). On the contrary, the individual described in this paper has an imperfectly rational behavior.

Much work remains to be done. First, the model above should be tested in experiments by resorting to different specifications of the utility $u(\cdot)$. It is likely that the converted probabilities are consistent with different utilities, thus making it difficult to discriminate between the underlying utilities. Second, econometric methods should be devised in order to estimate efficiently the converted probabilities as done in discrete choice theory (see, e.g., McFadden, 1984). Third, noting that Marley (1989) has been able to identify a family of random utility models that yields closed form choice probabilities, we suggest that it would be worthwhile to undertake a similar exercise for our converted probabilities. In particular, one should identify explicit distributions consistent with specific utility functions. Last, while probabilistic choice models have proven to be useful in describing consumer behavior (see, e.g., Anderson et al. (1992, chs. 6 and 7)), it is not clear yet how the converted probabilities can be used in market equilibrium analyses and what kind of results can be expected.

REFERENCES

Anderson, S. J., De Palma, A., & Thisse, J. F. (1992). Discrete choice theory of product differentiation. Cambridge, MA: MIT Press.
Arrow, J. (1959). Rational choice functions and orderings. Economica, 26, 121–127.
Aumann, R. J. (1997). Rationality and bounded rationality. Games and Economic Behavior, 21, 2–14.
Belk, R. W. (1975). Situational variables and consumer behavior. Journal of Consumer Research, 2, 157–164.
Billot, A. (1991). Cognitive rationality and alternative belief measures. Journal of Risk and Uncertainty, 4, 299–324.
Billot, A. (1998). Autobiased choice theory. Annals of Operations Research, 80, 85–103.
Chateauneuf, A., & Jaffray, J. Y. (1989). Some characterizations of lower probabilities and other monotone capacities through the use of Möbius inversion. Mathematical Social Sciences, 17, 263–283.
Chen, H.-C., Friedman, J. W., & Thisse, J.-F. (1997). Boundedly rational Nash equilibrium: a probabilistic choice approach. Games and Economic Behavior, 18, 32–54.
Choquet, G. (1953). Théorie des capacités. Annales de l'Institut Fourier, 5, 131–295.
Debreu, G. (1960). Review of R. D. Luce, Individual choice behavior: A theoretical analysis. American Economic Review, 50, 186–188.
Edgell, S. E., & Geisler, W. S. (1980). A set-theoretic random utility model of choice behavior. Journal of Mathematical Psychology, 21, 265–278.
Ellsberg, D. (1961). Risk, ambiguity and the Savage axiom. Quarterly Journal of Economics, 75, 643–669.
Falmagne, J. C. (1985). Elements of psychophysical theory. Oxford: Clarendon.
Falmagne, J. C., & Regenwetter, M. (1996). A random utility model of approval voting. Journal of Mathematical Psychology, 40, 152–159.
Gilboa, I. (1987). Expected utility with purely subjective nonadditive probabilities. Journal of Mathematical Economics, 16, 65–88.
Gilboa, I. (1989). Additivizations of nonadditive measures. Mathematics of Operations Research, 14, 1–17.
Gilboa, I., & Schmeidler, D. (1989). Maxmin expected utility with nonunique prior. Journal of Mathematical Economics, 18, 141–153.
Gilboa, I., & Schmeidler, D. (1992). Additive representations of nonadditive measures and the Choquet integral. Mimeo.
Hausman, J., & Wise, D. A. (1978). A conditional probit model for qualitative choice: discrete decisions recognizing interdependence and heterogeneous preferences. Econometrica, 46, 403–426.
Jaffray, J. Y. (1989). Coherent bets under partially resolving uncertainty and belief functions. Theory and Decision, 26, 99–105.
Kreps, D. M. (1979). A representation theorem for preference for flexibility. Econometrica, 47, 565–578.
Kreps, D. M. (1988). Notes on the Theory of Choice. Boulder: Westview Press.
Lovasz, L. (1983). Submodular functions and convexity. In A. Bachem, M. Grotschel, & B. Karte (Eds.), Mathematical programming: The state of the art, pp. 235–257. Heidelberg: Springer-Verlag.
Luce, R. D. (1959). Individual choice behavior: A theoretical analysis. New York: Wiley.
Machina, M. J. (1985). Stochastic choice functions generated for deterministic preferences over lotteries. Economic Journal, 59, 575–594.
Marley, A. A. J. (1989). A random utility family that includes many of the classical models and has closed form choice probabilities and choice times. British Journal of Mathematical and Statistical Psychology, 42, 13–36.
Marley, A. A. J. (1991). Context dependent probabilistic choice models based on measure of binary advantages. Mathematical Social Sciences, 21, 201–231.
McAlister, L., & Pessemier, E. (1982). Variety seeking behavior: an interdisciplinary approach. Journal of Consumer Research, 9, 311–322.
McFadden, D. (1978). Modelling the choice of residential location. In A. Karlvist, L. Lundqvist, F. Snickars, & J. Weibull (Eds.), Spatial interaction theory and planning models, pp. 75–96. Amsterdam: North-Holland.
McFadden, D. (1981). Econometric models of probabilistic choice. In C. F. Manski & D. McFadden (Eds.), Structural analysis of discrete data with econometric applications, pp. 198–272. Cambridge, MA: MIT Press.
McFadden, D. (1984). Econometric analysis of qualitative response models. In Z. Griliches & M. D. Intriligator (Eds.), Handbook of econometrics (Vol. 2), pp. 1395–1457. Amsterdam: North-Holland.
Miller, G. A. (1956). The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychological Review, 63, 81–97.
Mirrlees, J. A. (1986). Economic policy and nonrational behaviour. Mimeo, Nuffield College, Oxford.
Pessemier, E. (1978). Stochastic properties of changing preferences. Papers and Proceedings of the American Economic Associations, 68, 380–385.
Rota, G. C. (1964). Theory of Möbius functions. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 2, 340–368.
Schmeidler, D. (1982). Subjective probability without additivity. Working paper. Foerder Institute for Economic Research, Tel-Aviv University.
Schmeidler, D. (1989). Subjective probability and expected utility without additivity. Econometrica, 57, 571–587.
Sen, A., & Smith, T. E. (1995). Gravity models of spatial interaction behavior. Heidelberg: Springer-Verlag.
Shafer, G. (1976). A mathematical theory of evidence. Princeton: Princeton University Press.
Shapley, L. S. (1953). A value for n-person game. In H. Kuhn & A. W. Tucker (Eds.), Contribution to the theory of games II, pp. 307–317. Princeton: Princeton University Press.
Simon, H. (1957). Models of man. New York: Wiley.
Sundberg, C., & Wagner, C. (1992). Characterizations of monotone and 2-monotone capacities. Journal of Theoretical Probability, 5, 159–167.
Tversky, A. (1972a). Choice by elimination. Journal of Mathematical Psychology, 9, 341–367.
Tversky, A. (1972b). Elimination by aspects: a theory of choice. Psychological Review, 79, 281–299.
Tversky, A., & Koehler, D. J. (1994). Support theory: a nonextensional representation of subjective utility. Psychological Review, 101, 547–567.
Zadeh, L. A. (1978). Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets and Systems, 1, 3–27.

Received: May 12, 1997; revised: April 14, 1998