If you choose not to decide, you still have made a choice

Journal of Choice Modelling 22 (2017) 13–23 Contents lists available at ScienceDirect Journal of Choice Modelling journal homepage: www.elsevier.com...



Journal of Choice Modelling journal homepage: www.elsevier.com/locate/jocm

Francisco J. Bahamonde-Birke a,b,c,⁎, Isidora Navarro d, Juan de Dios Ortúzar e

a Institut für Verkehrsforschung, Deutsches Zentrum für Luft- und Raumfahrt (DLR), Germany
b Energy, Transportation and Environment Department, Deutsches Institut für Wirtschaftsforschung, Berlin, Germany
c Technische Universität Berlin, Germany
d Department of Transport Engineering and Logistics, Pontificia Universidad Católica de Chile, Chile
e Department of Transport Engineering and Logistics, Centre for Sustainable Urban Development (CEDEUS), Pontificia Universidad Católica de Chile, Chile

ARTICLE INFO

Keywords: Discrete choice models; Indifference; Stated-choice experiments

ABSTRACT

When designing stated-choice experiments, modellers may consider offering respondents an "indifference" alternative to avoid stochastic choices when utility differences between alternatives are perceived as too small. By doing this, the modeller avoids adding white noise to the data and may gain additional information. This paper proposes a framework to model discrete choices in the presence of indifference alternatives. The approach allows depicting the likelihood function independently of the number of alternatives in the choice-set and in the subset of indifference alternatives, offering a new approach to existing methods that are only defined for binary choice situations. The method is tested with the help of simulated and real data, observing that the proposed framework allows recovering the parameters used in the generation of the synthetic datasets without major difficulties in most cases. Alternative approaches, such as considering the indifference option as an opt-out alternative or ignoring the indifference choices, are clearly outperformed by the proposed framework and appear not capable of recovering parameters in the simulated set.

⁎ Corresponding author at: Institut für Verkehrsforschung, Deutsches Zentrum für Luft- und Raumfahrt (DLR), Germany.
E-mail addresses: [email protected], [email protected] (F.J. Bahamonde-Birke), [email protected] (I. Navarro), [email protected] (J.d.D. Ortúzar).
http://dx.doi.org/10.1016/j.jocm.2016.11.002
Received 23 February 2016; Received in revised form 23 November 2016; Accepted 26 November 2016
1755-5345/ © 2016 Published by Elsevier Ltd.

1. Introduction

Discrete choice models rely on the assumption that individuals are rational decision makers who maximize their utility when facing a given choice situation. Thus, individuals will opt for a given alternative if and only if it promises them the maximum expected utility among all alternatives in their choice-sets (Thurstone, 1927; McFadden, 1974). Nevertheless, establishing which alternatives should be included in an individual's choice-set is not an easy task. In real situations the modeller only observes the chosen alternatives and needs to construct the choice-sets of the individuals on the basis of their characteristics and their choices. This involves major difficulties, as it is well established that people tend to narrow their decisions to only a subset of the potentially available options (Roberts and Lattin, 1991; Swait and Erdem, 2007).

By contrast, when dealing with stated preference (SP) data the choice-set must be established a priori. In this case, it is important that it be carefully defined, preserving the realism of the choice situations. Thus, in many cases it might be necessary to consider an opt-out (non-purchase) alternative (Carson et al., 1994; Olsen and Swait, 1998); whether this alternative should be included directly in the choice-set (Louviere et al., 2000) or indirectly via dual response procedures (Dhar and Simonson, 2003) remains a debatable point (see Schlereth and Skiera (2016) for a good discussion), but the necessity of alternatives accounting for a reservation utility level higher than the expected utility of all alternatives in the choice-set is well established (Kontoleon and Yabe, 2003).

A similar but far less analysed problem is the inclusion of indifference alternatives in the choice-set. If the modeller does not allow respondents to state their indifference among two or more alternatives, they will be forced to opt for one of them in a rather stochastic manner, adding white noise to the experiment. Additionally, doing so would provide less information about the individuals' preferences, leading to a loss of efficiency. Indeed, Cantillo et al. (2010) used a synthetic dataset to show that randomly assigning the preferences associated with an indifference alternative significantly diminished the model's capability to recover the input parameters. Furthermore, Cantillo et al. (2010) also considered real databanks, observing that offering the possibility of stating indifference may indeed affect the outcome of the experiment (the estimated parameter values). Along these lines, empirical evidence (Dhar, 1997; Fenichel et al., 2009) shows that in experiments including non-purchase options, indifference situations may artificially increase the probability of selecting the opt-out¹ alternative, as a kind of cognitive bias. Therefore, it appears advisable to include indifference alternatives when also offering non-purchase options, as a way to reduce cognitive biases. Nevertheless, including indifference alternatives should be carefully considered, as it might generate other kinds of complications, especially if individuals are overwhelmed by the complexity of the choice experiment.
Both situations (opt-out and indifference alternatives) exhibit, however, substantial differences; while the former suggests the existence of a reservation utility that is higher than the utility provided by the alternatives in the choice-set, the latter indicates that individuals ascribe the same utility to two or more alternatives in it (this utility being higher than the reservation utility). Therefore, in the first case an extra alternative accounting for this reservation utility should be considered. Nevertheless, considering an extra alternative to reflect indifference choices does not seem to be appropriate, as it does not reflect the causes leading to the statement of indifference; in fact, by treating indifference as a new (opt-out) alternative, the modeller implicitly assumes that the utility ascribed to this new option would be greater than that of the competing alternatives, which is clearly not the case.

Although according to classical theory indifference situations will only arise if the expected utility of two or more alternatives is the same (indifference curves), the behavioural theory underlying the indifference phenomenon suggests the existence of perception thresholds, below which individuals are not able to perceive differences between two stimuli (Quandt, 1956; Coombs et al., 1970; Cantillo and Ortúzar, 2006). Krishnan (1977) developed an operational discrete choice model accounting for the existence of indifference thresholds. This approach (Minimum Perceivable Differences model, MPD) takes into account the fact that observations falling into the indifference interval would be assigned stochastically to one alternative, in the context of a binary choice situation. Cantillo et al. (2010) expanded the MPD-approach to allow individuals to state their indifference in stated-choice (SC) experiments.
In that approach, the indifference alternative is selected if the difference between the utilities of both alternatives is smaller than a threshold, to be estimated. The main limitation of the MPD-approach is that it only allows considering binary choice situations. Thus, the method can neither consider situations where two alternatives exhibit a similar utility (which is superior to that of all other alternatives in the choice-set), nor cases when three or more choices report an apparently identical utility (which may be of particular interest when considering alternatives to first-choice SP experiments, such as rankings). The same limitation arises when considering approaches such as an ordered logit framework (with indifference being an intermediate choice between two binary alternatives).

A method that accommodates more than two alternatives consists in assuming that, instead of behaving as utility maximizers, individuals minimize their regret (RRM framework; Chorus, 2012a). Under this assumption, the regret associated with a certain alternative is given by the direct comparison of its attributes with those of all remaining alternatives in the choice-set (whereby only a negative performance would generate regret). Thus, an extra null-alternative (without attributes) in the model can no longer be associated with a higher reservation utility, but would rather stand for a level of regret above which none of the alternatives in the choice-set is favoured (i.e. none of the alternatives significantly minimizes the regret in comparison with other options in the choice-set; Chorus, 2012b). Hence, under this assumption an extra alternative would allow capturing indifference (Hess et al., 2014). Nevertheless, this approach does not appear appropriate to deal with non-binary choice situations, as the extra alternative would be indicative of indifference among the whole choice-set, and it does not allow considering indifference among a sub-set of alternatives (e.g. the respondent is indifferent among two alternatives, but both are preferred over the remaining alternatives in the choice-set). Furthermore, it does not seem appropriate for experiments allowing for both stating indifference and opting out, because: (i) it would necessarily be a non-binary situation, and (ii) it would imply considering two different null-alternatives, which should account for two completely different phenomena. Finally, the approach would necessarily require the analyst to assume a regret minimization strategy (ideally, the modeller should aim at an approach that allows considering indifference under different assumptions, e.g. regret minimization or utility maximization, and discern between them on an empirical basis).

This paper discusses the implications of indifference choices concerning the utility ascribed to the different alternatives in the choice-set. Along these lines, the paper presents a new approach that allows dealing with indifference choices in multinomial choice situations. This framework allows addressing not only first-choice SP experiments but also rankings, where indifference choices may be expected to appear more often. The approach is tested with the help of simulated and real datasets, observing that it clearly outperforms other methods, allowing for a proper estimation of parameters.

¹ For the purposes of this paper, it is assumed that opting-out implies that none of the alternatives satisfies the individual's requirements (i.e. a non-purchase option). This is highly recommended in non-pivoted SP experiments, as with totally new options it could well be that none is acceptable and not presenting an opt-out may bias results (Olsen and Swait, 1998).

2. Modelling with indifference alternatives

According to random utility theory (Thurstone, 1927; McFadden, 1974), in a given choice situation an individual q would opt for alternative i belonging to choice-set A if and only if $U_{iq} > U_{jq}\ \forall j \neq i \in A$. Then, the probability with which the individual would select this alternative is given by:

$$P_{iq} = P(U_{iq} > U_{jq}) \quad \forall j \neq i \in A \tag{2.1}$$

The expected utility U can be expressed in terms of a representative component V, characterized through concrete and measurable properties of the alternatives and the individuals, and an error term ε representing all unknown (for the analyst) elements of the decision. Assuming an additive specification (other structures are also possible, but for didactic purposes we assume the more common specification), (2.1) can be rewritten as:

$$P_{iq} = P(V_{iq} + \varepsilon_{iq} > V_{jq} + \varepsilon_{jq}) = P(V_{iq} - V_{jq} > \varepsilon_{jq} - \varepsilon_{iq}) \quad \forall j \neq i \in A \tag{2.2}$$

Finally, the likelihood of observing a given set of choices is given by:

$$L = \prod_{q} \prod_{j \in A} P_{jq}^{\,y_{jq}}, \tag{2.3}$$

where $y_{jq}$ takes the value of one if alternative j is selected by individual q and zero otherwise.

2.1. Stating indifference

If the modeller allows individual q to state their indifference between n alternatives belonging to an indifference-set B, which is a subset of the complete choice-set A (consisting of m alternatives, with m ≥ n), the choice probabilities of the alternatives belonging to B may be written as follows:

$$P_i = P_j = P_k = \dots \quad \forall i, j, k, \dots \in B \tag{2.4}$$

As under random utility theory all alternatives belonging to B must maximize the individual's expected utility U, Eq. (2.4) can be rewritten in these terms:

$$U_i \approx U_j \approx U_k \approx \dots \quad \forall i, j, k, \dots \in B, \tag{2.5}$$

where the approximate equality accounts for the existence of utility differences (smaller than a given threshold) that are not perceived by the respondents. If we introduce error terms ϕ, then Eq. (2.5) can be expressed in terms of equalities (where ϕ is an error term merely guaranteeing that the utility differences within a given indifference-set are equal to zero):

$$U_i + \phi_i = U_j + \phi_j = U_k + \phi_k = \dots \quad \forall i, j, k, \dots \in B \tag{2.6}$$

Then, expressing the expected utility in terms of a representative utility V and the aforementioned error terms ε (accounting for all unknown elements of the decision except the indifference thresholds leading to the statement of indifference) leads to the following expression:

$$V_i + \varepsilon_i + \phi_i = V_j + \varepsilon_j + \phi_j = V_k + \varepsilon_k + \phi_k = \dots \quad \forall i, j, k, \dots \in B \tag{2.7}$$

Finally, the probability associated with an alternative in the indifference-set would be given by:

$$\mathrm{Pr}_i = P(V_i - V_j > \varepsilon_j + \phi_j - \varepsilon_i - \phi_i) \quad \forall j \neq i \in A \;\wedge\; \phi_j = 0\ \ \forall j \notin B \tag{2.8}$$

where $\mathrm{Pr}_i \neq P_i$, as it is associated with a larger error component (due to ϕ). When modelling with indifference alternatives, it cannot be assumed that all alternatives in the indifference-set are selected at the same time, but rather that each is selected with a frequency 1/n. Hence, when considering indifference options, the likelihood of observing a given choice can be expressed as:

$$L = \prod_{j \in A} \mathrm{Pr}_j^{\,y_j}, \tag{2.9}$$

where $y_j$ takes the value of 1/n if alternative j ∈ B and zero otherwise. If we consider a group of individuals M selecting indifference-sets and another group of individuals L selecting unique preferred alternatives (where M and L are mutually exclusive), the general likelihood function for the population takes the following form, with the single-choice probabilities P applying to group L and the indifference probabilities Pr to group M:

$$L = \prod_{q \in L} \prod_{j \in A} P_{jq}^{\,y_j} \cdot \prod_{q \in M} \prod_{j \in A} \mathrm{Pr}_{jq}^{\,y_j} \tag{2.10}$$
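As a minimal sketch of how the weights $y_j$ enter Eq. (2.10), the Python function below accumulates the log-likelihood over single choices and indifference-sets. It assumes that the choice probabilities (P for single choices, Pr for indifference-sets) have already been computed by some model; the names `Observation` and `log_likelihood` are illustrative and are not taken from the paper or from any estimation package.

```python
import math
from typing import Iterable, Sequence, Set, Tuple

# Hypothetical container: (choice probabilities over the choice-set A,
# indices of the chosen alternative or of the stated indifference-set B).
Observation = Tuple[Sequence[float], Set[int]]


def log_likelihood(observations: Iterable[Observation]) -> float:
    """Log of Eq. (2.10): each chosen alternative is weighted by y_j = 1/n."""
    total = 0.0
    for probs, chosen in observations:
        if not chosen:
            raise ValueError("each observation needs at least one chosen alternative")
        n = len(chosen)      # n = 1 for a single choice, n > 1 for an indifference-set
        weight = 1.0 / n     # y_j = 1/n for j in B, zero for all other alternatives
        total += sum(weight * math.log(probs[j]) for j in chosen)
    return total


# A single choice of alternative 0, and a binary indifference statement.
example = [([0.7, 0.3], {0}), ([0.5, 0.5], {0, 1})]
print(log_likelihood(example))
```

With a weight of 1/n, an indifference observation counts as one observation in total; for a binary indifference-set its contribution, 0.5·ln(p) + 0.5·ln(1 − p), is largest at p = 0.5.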

It is important to note that using a value of $y_j$ equal to one for all alternatives in an indifference-set would result in overweighting all observations by respondents selecting indifference options. Also, note that the likelihood function would be maximized when all alternatives in an indifference-set are equally likely (AM-GM inequality).

Even though the above framework was derived for utility maximization, it is straightforward to extend it to consider regret minimization strategies (Chorus, 2012a). Furthermore, as the meaning of null-alternatives is different when pursuing regret minimization and utility maximization strategies, this approach would allow for a fair comparison between both sets of assumptions in the presence of indifference. This would allow the modeller to discern which set of assumptions offers a better representation of the underlying decision-making process on the basis of the empirical evidence. Similarly, it is easy to extend it to ranking preference elicitation procedures when two or more alternatives at different depths of the ranking order are considered equivalent (i.e. the respondent cannot choose among them). In this case, the modeller would rely on the choices arising from exploiting the ranking² (Chapman and Staelin, 1982; Bradley and Daly, 1994) and the framework would apply to those cases where there is indifference between two or more alternatives (i.e. allowing to deal with ties at any depth of the ranking order).

2.2. Identification of the error term ϕ

Identification and the distribution of the error terms ϕ are complicated topics. First, the errors only appear in the utility function if the alternatives are selected as part of an indifference-set (as part of $\mathrm{Pr}_i$), which may significantly decrease the number of observations available for estimation. Furthermore, this additional error term will only be associated with the alternatives that are indeed part of a given indifference-set; thus, the utility functions of the remaining alternatives (not selected as a part of the indifference-set) will not be affected by this term. For this reason, non-symmetrical assumptions regarding its distribution are unsuitable (the estimator would diverge, as it would only be associated with selected alternatives). Moreover, establishing an adequate functional form for the error term ϕ is not an easy task, as it does not comply with the usual assumptions regarding error terms.
In fact, ϕ is an error component that merely guarantees that the difference among two or more different utilities is equal to zero; that is, the error is added to the alternatives associated with the lesser utilities or subtracted from the alternatives associated with higher utilities, so that the utilities of all alternatives in the indifference-set add up to the same amount. Hence, these errors would not be properly represented by the usual assumptions concerning error terms. That, in conjunction with the aforementioned unsuitability of non-symmetrical distributions, creates major difficulties.

If the modeller assumes that the error term ϕ follows a distribution equal to the difference between two Logistic distributions with different scale parameters,³ the sum $\varepsilon_i - \varepsilon_j + \phi$ would also be represented by a Logistic distribution, with a smaller scale parameter (i.e. with a larger standard deviation). As a consequence, the utility functions of alternatives selected as a part of an indifference-set would be equivalent to the utility functions of the single-choice framework ($P_i$), but would be associated with a smaller scale parameter, reflecting the increased uncertainty.⁴ A limitation of this approach is that it does not allow quantifying the thresholds leading to the statement of indifference; notwithstanding, the model still offers an adequate depiction of the underlying utility functions associated with the different alternatives as well as a functional form for forecasting. Additionally, the approach allows considering different scale parameters depending on the number of alternatives in the indifference-sets, or on which alternatives are being considered as part of the indifference-sets, as well as on the socio-economic characteristics of the individuals.
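The effect of a smaller scale parameter can be illustrated with a short sketch. It assumes a multinomial logit form for the choice probabilities, which is one convenient reading of the Logistic assumption above rather than a specification given in the text; the function name `logit_probs`, the utilities and the scale value 0.6 are hypothetical.

```python
import math
from typing import List, Sequence


def logit_probs(v: Sequence[float], scale: float = 1.0) -> List[float]:
    """Logit probabilities for representative utilities v and a given scale."""
    scaled = [scale * x for x in v]
    m = max(scaled)                       # subtract the maximum for numerical stability
    expv = [math.exp(s - m) for s in scaled]
    total = sum(expv)
    return [e / total for e in expv]


v = [1.0, 0.4, -0.5]                      # illustrative representative utilities
p_single = logit_probs(v, scale=1.0)      # P_i: single-choice observations
p_indiff = logit_probs(v, scale=0.6)      # Pr_i: smaller scale (hypothetical value)
print(p_single, p_indiff)
```

A smaller scale flattens the probabilities towards equal shares, which is how the increased uncertainty associated with indifference-sets would show up in such a specification.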
Because of identification issues, the modeller is forced to normalize one of the aforementioned scale parameters (either that related to the equations for single choices, when the respondent selects a unique alternative, or that associated with alternatives selected as part of an indifference-set). Both normalizations would lead, however, to different estimates; if the modeller normalizes the error associated with the single choices, s/he would be scaling the estimates upwards (when comparing them with the outcomes of an alternative model, where the scale parameter was fixed at one for the whole sample), as single choices are related to a lesser degree of uncertainty (i.e. the differences of representative utilities are larger, on average, when indifference situations are left out or, alternatively, indifference situations would arise more often when utility differences are smaller). Thus, when fixing the scale parameter for single choices the estimates are no longer comparable with alternative specifications.⁵ In this case, the analyst does not attempt to estimate separate models for individuals choosing single alternatives or indifference-sets, but rather to estimate a parsimonious model that is consistent with both kinds of choices (and the estimates of which are comparable with models estimated without introducing the error term ϕ). A possible way to deal with this problem is to fix the scale parameter associated with the weighted average of the sample (normally at one, but other values may be chosen). In this form, the analyst would estimate k−1 parameters (with k representing the number of different scale parameters), and express the last one in terms of the others to be estimated as in:

$$\lambda \cdot N = \lambda_1 n_1 + \lambda_2 n_2 + \lambda_3 n_3 + \dots, \tag{2.11}$$

where N represents the total number of observations in the sample, λ the weighted average of the scale parameters, $\lambda_j$ the scale parameters to be estimated, and $n_j$ the number of observations associated with a given scale parameter.

A final characteristic of the approach is that when considering only two alternatives, different scale parameters cannot be estimated. This relates to the fact that when all alternatives are equally probable, all estimates must necessarily be equal to zero and therefore any scale parameter multiplying a group of indifference-sets containing all alternatives would be unidentified (and therefore can be excluded without loss of generality). As in binary choice situations every indifference-set must necessarily include both alternatives, considering different scale parameters in this case would be redundant. As a corollary, considering different scale parameters would only impact the model when those can be associated with at least one indifference-set (selected by at least a part of the population) not containing all alternatives in the choice-set.

² By exploiting the ranking the analyst assumes that the first alternative (in the ranking) would be selected in a choice-set considering all alternatives. The second ranked alternative would be the choice in a set including all alternatives but the first ranked, and so on.
³ While this assumption does not appear to be wholly appropriate to describe the aforementioned response, it proves fairly convenient for modelling purposes.
⁴ This framework resembles the structure used to deal with RP and SP data simultaneously.
⁵ Naturally, as the scale parameters are fixed without loss of generality, the model would still be appropriate, but a correction would be required in order to make different specifications comparable.

3. Study cases

To test our framework as well as the impact of introducing indifference alternatives into the model, we conducted four different empirical tests. First, we considered a simulation exercise consisting of only two alternatives (binary choice situation). Then, we analysed a real case considering a similar binary structure. Next, a second simulation exercise introduced an opt-out alternative (additional to the possibility of stating indifference), and finally, we tested the framework using real data consisting of a binary choice-set allowing for indifference as well as for opting out.

3.1. Simulated binary choices

Our first simulated scenario considered binary choice situations where the utility functions of the alternatives were given by:

$$\begin{aligned} U_1 &= \beta_1 X_1 + \varepsilon_1 \\ U_2 &= ASC_2 + \beta_2 X_2 + \varepsilon_2 \end{aligned} \tag{3.1}$$
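As a rough illustration of the data-generating process for this scenario, the sketch below draws pseudo-individuals using the utilities of Eq. (3.1) and the covariate distributions, coefficients and thresholds reported in this section. Two points are assumptions rather than details confirmed by the text: indifference is stated when |U1 − U2| falls below the threshold, and the EV1 errors are centred at zero by subtracting the Euler-Mascheroni constant. The function names are hypothetical and the counts obtained will depend on the random seed.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # mean of a standard Gumbel (EV1) variate


def ev1_draw(rng: random.Random) -> float:
    """EV1 draw with scale 1, shifted so that its mean is zero (assumption)."""
    u = rng.random()
    while u == 0.0:                # avoid log(0); random() never returns 1.0
        u = rng.random()
    return -math.log(-math.log(u)) - EULER_GAMMA


def simulate(n_individuals: int, threshold: float, seed: int = 0):
    """Return a list of (x1, x2, choice) tuples for the binary scenario."""
    rng = random.Random(seed)
    beta1 = beta2 = asc2 = 1.0     # values reported in Section 3.1
    data = []
    for _ in range(n_individuals):
        x1 = rng.gauss(0.0, 1.0)   # X1 ~ Normal(mean 0, sd 1)
        x2 = rng.gauss(-1.0, 1.5)  # X2 ~ Normal(mean -1, sd 1.5)
        u1 = beta1 * x1 + ev1_draw(rng)
        u2 = asc2 + beta2 * x2 + ev1_draw(rng)
        if abs(u1 - u2) < threshold:   # assumed indifference rule
            choice = "indifferent"
        elif u1 > u2:
            choice = "alt1"
        else:
            choice = "alt2"
        data.append((x1, x2, choice))
    return data


for t in (0.15, 0.3, 0.45):
    sample = simulate(5000, t, seed=1)
    print(t, sum(1 for _, _, c in sample if c == "indifferent"))
```

Because the same seed yields the same utilities for every threshold, the number of indifference statements can only grow as the threshold widens.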

In this scenario, individuals are supposed to choose the alternative with the highest expected utility or to state their indifference if the utility difference between both alternatives is smaller than a given threshold. We took X1 and X2 as random draws from Normal distributions with means 0 and −1, and standard deviations 1 and 1.5, respectively. The error terms were assumed to be independently EV1 distributed with mean 0 and scale parameter 1; both β parameters as well as the ASC were fixed to 1, while three different values were tested for the threshold: 0.15, 0.3 and 0.45, representing small, medium and large indifference intervals, respectively. In all three cases 5000 pseudo-individuals were generated, observing that 254, 495 and 739 stated their indifference between both alternatives, respectively.

With this dataset we estimated four models, considering different approaches to address the indifference alternatives. Model 1 considered all indifference options as an extra alternative; thus, in our utility maximization framework this extra alternative represents opting-out rather than indifference.⁶ Model 2 presents the results for a case where the individuals stating their indifference are ignored. Model 3 does not account for the fact that alternatives in a selected indifference-set are not observed with the same frequency as alternatives selected directly (if a given alternative is selected as a part of a binary indifference-set it would, in reality, only be selected with a frequency of 0.5); hence, Model 3 assigns a weight ($y_j$) of 1 to all observations. Finally, Model 4 considered the proposed framework (Eq. (2.10)). All models were estimated using PythonBiogeme (Bierlaire, 2003), and the results are presented in Table 3.1 (the standard deviation of the estimated parameters is shown in parentheses). As can be observed, considering the indifference options as opt-out alternatives yields a significantly worse goodness-of-fit than the other approaches.
Nevertheless, in this particular case the log-likelihood does not offer meaningful insights, as Model 1 considers one more alternative (affecting the choice probabilities), while Models 2 and 3 consider fewer/more observations (ignoring/overweighting the observations associated with smaller utility differences would artificially diminish/increase the error and goodness-of-fit). An analysis of the parameters provides more useful information. Models 1–3 are not capable of recovering the values used in the generation of the dataset (the t-tests of equality are rejected at a significance level of 5%, for at least part of the estimators in every case). This may be related to the fact that the artificial increase/decrease of the error affects the scale parameter, biasing the results. Model 4 (the proposed framework) recovers the parameters used in the generation of the dataset without major difficulties. Regarding the indifference intervals, it can be observed that the differences between the approaches increase (as do the difficulties associated with the parameter recovery for Models 1–3) as the indifference threshold becomes larger, leading to more individuals stating their indifference. Model 4 performs adequately in every case.

Table 3.1 Estimated models. Case study 1.

| Threshold | Variable | Model 1 | Model 2 | Model 3 | Model 4 |
|---|---|---|---|---|---|
| 0.15 | X1 | 0.970 (0.0392) | 1.10 (0.0445) | 0.973 (0.0397) | 1.03 (0.0419) |
| | X2 | 0.964 (0.0307) | 1.07 (0.0345) | 0.950 (0.0304) | 1.01 (0.0323) |
| | ASC2 | 0.905 (0.0454) | 1.04 (0.0503) | 0.909 (0.045) | 0.968 (0.0474) |
| | ASCopt-out | −2.30 (0.0679) | – | – | – |
| | Final log-likelihood | −3248.472 | −2190.648 | −2626.479 | −2410.844 |
| | No. of observations | 5000 | 4746 | 5254 | 5000 |
| 0.3 | X1 | 0.897 (0.0379) | 1.14 (0.0430) | 0.904 (0.0381) | 1.00 (0.0401) |
| | X2 | 0.969 (0.0304) | 1.16 (0.0344) | 0.933 (0.0294) | 1.03 (0.0314) |
| | ASC2 | 0.933 (0.0451) | 1.19 (0.0489) | 0.934 (0.0438) | 1.04 (0.0459) |
| | ASCopt-out | −1.55 (0.0516) | – | – | – |
| | Final log-likelihood | −3693.051 | −1978.882 | −2812.637 | −2403.098 |
| | No. of observations | 5000 | 4505 | 5495 | 5000 |
| 0.45 | X1 | 0.846 (0.0366) | 1.15 (0.0484) | 0.819 (0.0353) | 0.945 (0.0402) |
| | X2 | 0.906 (0.0294) | 1.15 (0.0383) | 0.811 (0.0267) | 0.941 (0.0309) |
| | ASC2 | 0.872 (0.0441) | 1.18 (0.0561) | 0.834 (0.0413) | 0.965 (0.0468) |
| | ASCopt-out | −1.05 (0.0445) | – | – | – |
| | Final log-likelihood | −4082.823 | −1868.977 | −3109.345 | −2505.769 |
| | No. of observations | 5000 | 4261 | 5739 | 5000 |

⁶ As we are considering a binary choice situation in this specific case, it would have been possible to test the approach put forward by Hess et al. (2014) by assuming regret minimization. Nevertheless, such a comparison would be spurious, as we assumed utility maximization when constructing the simulated dataset; the same applies (though to a lesser extent) to alternative methods, such as the MPD-approach (Cantillo et al., 2010).

3.2. Real dataset: binary choices

The second study case consists of three different datasets that were collected as a part of the same survey on passenger rail transportation in Southern Chile (Appendix A presents an example of the way alternatives were presented to the individuals). The different datasets are associated with three different kinds of trips: short and long interurban trips and urban trips. The individuals were asked to choose between two unlabelled hypothetical alternatives, also having the choice of stating their indifference among both alternatives. All experiments considered the same attributes: fare (P), travel time (TT), access time (AT) and transport mode (TM, indicating whether the trip was made by train or by bus). We considered the following utility functions:

$$\begin{aligned} U_1 &= ASC_1 + \beta_P P_1 + \beta_{TT} TT_1 + \beta_{AT} AT_1 + \beta_{TM} TM_1 + \varepsilon_1 \\ U_2 &= ASC_2 + \beta_P P_2 + \beta_{TT} TT_2 + \beta_{AT} AT_2 + \beta_{TM} TM_2 + \varepsilon_2 \end{aligned} \tag{3.2}$$

The ﬁrst dataset (short interurban trips) consists of 2781 observations, of which 289 (10.4%) stated indiﬀerence among both options. The second dataset (long interurban trips) includes 2745 observations, with 202 (7.4%) indicating indiﬀerence. Finally, the data on urban trips consists of 5841 observations, 553 (9.5%) of which were associated with indiﬀerence. For consistency purposes we considered basically the same models as in the previous study case. This way, Model 1 considers indiﬀerence as an additional opt-out alternative, Model 2 ignores all indiﬀerence observations, Model 3 relays on the proposed approach without accounting for frequency considerations (in fact, duplicating the weight of the indiﬀerence observations) and Model 4 follows the proposed approach, as depicted in Eq. (2.10). For illustrative purposes, this time we also included a model considering a null-alternative in the context of a regret minimization framework (Model 5). In this case, the additional alternative, can be framed as indiﬀerence (Hess et al., 2014).7 The results are presented in Table 3.2. In line with the previous study case, the results conﬁrm that treating indiﬀerence as an opt-out alternative (in the utility maximization framework) lead to results that widely diverge from alternative approaches (however, it is not possible to establish positively that the results are deﬁnitely biased, as it is not possible to observe the underlying model). Along these lines, as in the previous case, the estimates of Model 2 and Model 3 are consistently deviated up and down, respectively, from the results associated with Model 4. All three considered cases (interurban short and long trips, as well as urban trips) exhibit exactly the same patterns. 
Finally, and according with the expectations, the estimates associated with Model 5 are similar, but slightly larger, than those obtained for Model 4 (with the exception of the ASCs, as they are indicative for utility in RUM models and regret in RRM models). However, a direct comparison between both models cannot be performed, as they consider a diﬀerent number of alternatives (although, surprisingly, Model 5 exhibits a worse goodness-of-ﬁt than Model 1 for urban trips). These results conﬁrm our previous ﬁndings, implying that treating indiﬀerence as an opt-out alternative, as well as ignoring indiﬀerence observations or applying the proposed approach without considering frequency issues may lead to biased results. This evidence supports the necessity of an adequate modelling approach when dealing with indiﬀerence observations and real data. 3.3. Simulated binary choices including an opt-out alternative A second simulated scenario was considered. The dataset was generated similarly to the previous one, but in this case we introduced an opt-out alternative,8 as shown in the following equations set. 7 It could have also been possible to consider Models 2–4 on the basis of regret minimization, but given that they are binary cases, regret minimization and utility maximization would lead to the same outcome. 8 Even though we have framed the third alternative as opting-out, it is easy to see that, for the purposes of the modelling, it would be equivalent to considering a choice-set with three alternatives.


Table 3.2. Estimated models. Case study 2.

| Survey | Variable | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 |
|---|---|---|---|---|---|---|
| Short Trips | P | −0.83 (0.204) | −1.99 (0.343) | −1.63 (0.301) | −1.79 (0.32) | −1.97 (0.337) |
| | TT | −0.172 (0.0371) | −0.375 (0.0662) | −0.299 (0.0596) | −0.332 (0.0626) | −0.351 (0.0635) |
| | AT | −0.067 (0.0546) | −0.0778 (0.0673) | −0.0429 (0.0592) | −0.0571 (0.0628) | −0.0667 (0.0684) |
| | TM | 0.148 (0.0788) | 0.272 (0.0878) | 0.202 (0.0781) | 0.232 (0.0825) | 0.258 (0.0869) |
| | ASC2 | −0.493 (0.0779) | −0.507 (0.128) | −0.402 (0.114) | −0.447 (0.12) | 0.482 (0.125) |
| | ASCopt-out | −4.06 (0.45) | – | – | – | 4.5 (0.0879) |
| | log-likelihood | 2781.00 | −1620.24 | −2041.62 | −1832.10 | −2549.21 |
| | No. of obs. | 2781 | 2492 | 3070 | 2781 | 2781 |
| Long Trips | P | −0.855 (0.161) | −1.06 (0.202) | −0.876 (0.185) | −0.957 (0.192) | −0.988 (0.194) |
| | TT | −0.112 (0.0342) | −0.344 (0.0508) | −0.278 (0.0465) | −0.308 (0.0485) | −0.352 (0.0513) |
| | AT | −0.112 (0.0534) | −0.123 (0.066) | −0.102 (0.0601) | −0.111 (0.0628) | −0.126 (0.0668) |
| | TM | 0.232 (0.0792) | 0.268 (0.0876) | 0.264 (0.0812) | 0.267 (0.0841) | 0.28 (0.0866) |
| | ASC2 | 0.475 (0.0964) | 0.0866 (0.137) | 0.0769 (0.126) | 0.0807 (0.131) | −0.0428 (0.137) |
| | ASCopt-out | −6.17 (0.81) | – | – | – | 4.64 (0.108) |
| | log-likelihood | −2403.90 | −1666.02 | −1960.91 | −1814.13 | −2387.72 |
| | No. of obs. | 2745 | 2543 | 2947 | 2745 | 2745 |
| Urban Trips | P | −1.1 (0.263) | −1.25 (0.273) | −1.01 (0.246) | −1.12 (0.258) | −1.2 (0.268) |
| | TT | −0.452 (0.0432) | −0.427 (0.0455) | −0.353 (0.0406) | −0.386 (0.0428) | −0.417 (0.0443) |
| | AT | −0.169 (0.0659) | −0.178 (0.0689) | −0.146 (0.0616) | −0.16 (0.0649) | −0.172 (0.0673) |
| | TM | 0.3 (0.052) | 0.277 (0.059) | 0.236 (0.0533) | 0.255 (0.0559) | 0.284 (0.059) |
| | ASC2 | −0.318 (0.0733) | −0.257 (0.0763) | −0.217 (0.068) | −0.235 (0.0717) | 0.261 (0.0745) |
| | ASCopt-out | −3.38 (0.187) | – | – | – | 4.49 (0.0582) |
| | log-likelihood | −5408.35 | −3589.52 | −4368.88 | −3979.80 | −5419.95 |
| | No. of obs. | 5841 | 5288 | 6394 | 5841 | 5841 |

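\[
\begin{aligned}
U_1 &= \beta_1 \cdot X_1 + \varepsilon_1 \\
U_2 &= ASC_2 + \beta_2 \cdot X_2 + \varepsilon_2 \\
U_{\text{opt-out}} &= ASC_{\text{opt-out}} + \varepsilon_3
\end{aligned}
\tag{3.3}
\]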
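To make Eq. (3.3) concrete, the sketch below simulates pseudo-individuals who face the two alternatives and the opt-out option and may report an indifference set. It is an illustration only: the standard Gumbel errors, the uniformly distributed attributes, the unit values for β1, β2 and ASC2, the zero opt-out constant, and the rule that a pseudo-individual reports every alternative whose utility lies within a threshold `delta` of the best one are all assumptions made here, not details of the data generation process actually used.

```python
import math
import random
from collections import Counter


def gumbel(rng):
    """Draw a standard Gumbel (EV1) variate by inverse transform."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def simulate(n, delta, beta1=1.0, beta2=1.0, asc2=1.0, asc_opt=0.0, seed=0):
    """Return one reported set of alternatives per pseudo-individual.

    All default parameter values and the threshold rule are assumptions
    made for illustration.
    """
    rng = random.Random(seed)
    reported = []
    for _ in range(n):
        # Hypothetical attribute levels for the two existing alternatives.
        x1, x2 = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        utilities = {
            "alt1": beta1 * x1 + gumbel(rng),
            "alt2": asc2 + beta2 * x2 + gumbel(rng),
            "opt-out": asc_opt + gumbel(rng),
        }
        best = max(utilities.values())
        # Assumed rule: report every alternative within `delta` of the best.
        reported.append(
            frozenset(a for a, v in utilities.items() if best - v <= delta)
        )
    return reported


# Tabulate how many pseudo-individuals report sets of size 1, 2 and 3.
sizes = Counter(len(s) for s in simulate(5000, delta=0.15))
print(sizes)
```

Sets of size one correspond to ordinary choices, sets of size two to indifference between a pair of options, and sets of size three to indifference among both alternatives and opting out. Because the random draws do not depend on `delta`, raising the threshold with the same seed can only enlarge the reported sets.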

The data generation process was analogous to the previous section, but this time the pseudo-individuals were allowed to state their indifference among: (i) both alternatives, (ii) one existing alternative and opting out, and (iii) both existing alternatives and opting out. The ASC associated with the opt-out alternative (accounting for the reservation utility) was fixed at zero. Again, we considered the same three indifference thresholds. In the first case (low threshold), we observed 400 observations indicating indifference (124 between both existing alternatives, 131 between alternative 1 and opting out, 135 between alternative 2 and opting out, and 10 between both alternatives and opting out). In the second case (intermediate indifference threshold), 745 pseudo-individuals stated their indifference (200 between both existing alternatives, 260 between alternative 1 and opting out, 230 between alternative 2 and opting out, and 55 between both alternatives and opting out), while in the third case (large indifference threshold) we observed 1093 indifference observations (297 between both existing alternatives, 345 between alternative 1 and opting out, 319 between alternative 2 and opting out, and 132 between both alternatives and opting out).

In this case, considering extra alternatives does not make sense, as option 3 already accounts for the opt-out alternative. Thus, we estimated four different models. Model 1 presents the results of ignoring the indifference choices. Model 2 considers the proposed framework but ignores the error term ϕ and does not consider a different scale parameter for the observations associated with indifference sets. Model 3 considers different scale parameters but assigns a weight (yj) of 1 to all observations. Finally, Model 4 represents the proposed framework. The scale parameters were fixed in accordance with Eqs. (2.10) and (2.11). The results are presented in Table 3.3.
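The number of observations entering each of these models follows from simple bookkeeping. The sketch below assumes that an indifference statement over k alternatives is expanded into k pseudo-observations, each carrying a weight of 1/k in the weighted models and a weight of 1 in Model 3; the total of 5000 pseudo-individuals is taken from the observation counts in Table 3.3, and the function and key names are illustrative.

```python
from fractions import Fraction

N_TOTAL = 5000  # total pseudo-individuals, as implied by Table 3.3

# Indifference statements per threshold, keyed by indifference-set size
# (pairwise counts and three-way counts as reported in the text).
CASES = {
    "low": {2: 124 + 131 + 135, 3: 10},
    "intermediate": {2: 200 + 260 + 230, 3: 55},
    "large": {2: 297 + 345 + 319, 3: 132},
}


def observation_counts(indiff_by_size, n_total=N_TOTAL):
    """Effective number of observations under three treatments (sketch)."""
    n_indiff = sum(indiff_by_size.values())
    n_single = n_total - n_indiff  # ordinary single choices
    # Each statement over k alternatives becomes k pseudo-observations.
    expanded = sum(k * n for k, n in indiff_by_size.items())
    # Weighting each pseudo-observation by 1/k restores one unit per statement.
    weighted = sum(k * n * Fraction(1, k) for k, n in indiff_by_size.items())
    return {
        "ignored": n_single,                # indifference dropped (Model 1)
        "unweighted": n_single + expanded,  # weight of 1 throughout (Model 3)
        "weighted": int(n_single + weighted),  # weights of 1/k (Models 2 and 4)
    }


for label, case in CASES.items():
    print(label, observation_counts(case))
```

Under these assumptions the three treatments give 4600, 5410 and 5000 observations for the low threshold, 4255, 5800 and 5000 for the intermediate threshold, and 3907, 6225 and 5000 for the large threshold, which matches the counts reported in Table 3.3.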
This time, all specifications except Model 1 (which ignores the indifference observations) allow for an adequate estimation of the parameters used to generate the dataset. Model 2 performs surprisingly well, but still offers a significantly worse goodness-of-fit than Model 4. Nevertheless, the fact that Model 2 performs acceptably while ignoring the different variability suggests that different scale parameters may be omitted without major complications if the modeller does not have enough indifference-set observations (to allow for a correct estimation of the λ parameter). Model 3 performs significantly better than its unweighted counterpart in the previous simulation exercise. The reason is that, when different scale parameters are considered, the weighting only affects observations multiplied by the same scale parameter. Hence, in this case, the weights associated with the single choices are not relevant, while we observed a small number of observations with a weight of 1/3 (individuals stating their indifference among all choices) and a large majority associated with a weight of 1/2, which clearly diminishes the bias associated with ignoring the weighting. In fact, if we ignore the scale parameter (analogously to Model 2 but ignoring the weighting), the model is no longer capable of recovering the parameters.

Regarding the magnitude of the threshold, we observe that all specifications (besides Model 1) perform adequately as long as the indifference threshold is small or intermediate. When the indifference threshold is large (so that an important part of our pseudo-population is associated with indifference alternatives), all models exhibit some difficulties recovering the parameters. This may be explained by the fact that a larger indifference threshold may indeed be associated with larger uncertainty, and that our assumptions regarding the distribution of the error term ϕ are not exactly accurate. Even in this case, Model 4 exhibits the best performance: it is capable of recovering three out of four parameters and clearly outperforms Model 1 and Model 2.

Table 3.3. Estimated models. Case study 3.

| Threshold | Variable | Model 1 | Model 2 | Model 3 | Model 4 |
|---|---|---|---|---|---|
| 0.15 | X1 | 1.09 (0.0430) | 1.03 (0.0404) | 1.04 (0.0401) | 1.04 (0.0406) |
| | X2 | 1.05 (0.0334) | 0.991 (0.0310) | 0.999 (0.0310) | 0.999 (0.0314) |
| | ASC2 | 1.11 (0.0493) | 1.05 (0.0464) | 1.05 (0.0461) | 1.05 (0.0466) |
| | ASCopt-out | 0.0242 (0.0419) | 0.0272 (0.0396) | 0.0283 (0.0391) | 0.0257 (0.0395) |
| | λ indifference-set | – | – | 0.436 (0.0511) | 0.439 (0.0717) |
| | Final log-likelihood | 3848.233 | −4294.822 | −4699.123 | 4268.096 |
| | No. of observations | 4600 | 5000 | 5410 | 5000 |
| 0.3 | X1 | 1.08 (0.0441) | 0.976 (0.0391) | 0.979 (0.0383) | 0.98 (0.0393) |
| | X2 | 1.08 (0.036) | 0.963 (0.0311) | 0.968 (0.0309) | 0.974 (0.0318) |
| | ASC2 | 1.08 (0.0514) | 0.996 (0.0461) | 0.989 (0.0449) | 0.981 (0.046) |
| | ASCopt-out | 0.0394 (0.0431) | 0.0304 (0.0388) | 0.0302 (0.0377) | 0.0355 (0.0383) |
| | λ indifference-set | – | – | 0.438 (0.0385) | 0.447 (0.0535) |
| | Final log-likelihood | 3572.903 | −4404.245 | −5202.828 | −4356.509 |
| | No. of observations | 4255 | 5000 | 5800 | 5000 |
| 0.45 | X1 | 1.17 (0.0468) | 0.979 (0.0386) | 0.937 (0.0383) | 0.933 (0.0390) |
| | X2 | 1.13 (0.0383) | 0.776 (0.0272) | 0.858 (0.0291) | 0.872 (0.0298) |
| | ASC2 | 1.19 (0.0559) | 1.00 (0.0452) | 0.955 (0.0448) | 0.941 (0.0456) |
| | ASCopt-out | 0.0242 (0.0419) | −0.0785 (0.0398) | 0.0291 (0.0366) | 0.0267 (0.0362) |
| | λ indifference-set | – | – | 0.153 (0.0355) | 0.142 (0.0479) |
| | Final log-likelihood | −3163.748 | −4515.938 | −5701.262 | −4360.298 |
| | No. of observations | 3907 | 5000 | 6225 | 5000 |

3.4. Real dataset including opt-out and indifference options

For the fourth case study we used another real dataset. The information was collected as part of a project aiming to identify and assess the value of urban attributes (Appendix B presents an example of the way alternatives were presented to the individuals). For this purpose, an SP experiment in the context of residential location choice was designed and conducted in one neighbourhood of Santiago de Chile. The area has some special characteristics, such as high income, urban amenities and enhanced possibilities (in terms of transport facilities). The attributes considered in the experiment were the existence of Green Areas on the sidewalk (GA), Bus Corridors (BC), Green Verges segregating the bus corridors (GV), Bike Lanes (BL), and changes in the housing rent (ΔP). The experiment offered respondents four different options: two alternatives depicted through visual representations (with utilities U1 and U2), the possibility of stating indifference between them, and an opt-out alternative (with utility Uopt-out), as in the following equations:

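\[
\begin{aligned}
U_1 &= ASC_1 + \beta_{ga} \cdot GA_1 + \beta_{gv} \cdot GV_1 + \beta_{bike} \cdot BL_1 + \beta_{bus} \cdot BC_1 + \beta_{price} \cdot \Delta P_1 + \varepsilon_1 \\
U_2 &= ASC_2 + \beta_{ga} \cdot GA_2 + \beta_{gv} \cdot GV_2 + \beta_{bike} \cdot BL_2 + \beta_{bus} \cdot BC_2 + \beta_{price} \cdot \Delta P_2 + \varepsilon_2 \\
U_{\text{opt-out}} &= ASC_{\text{opt-out}} + \varepsilon_3
\end{aligned}
\tag{3.4}
\]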
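As an illustration of how Eq. (3.4) translates into choice probabilities, the sketch below evaluates the systematic utilities for one hypothetical choice situation and applies the multinomial logit formula. Several elements are assumptions made here for illustration: the i.i.d. Gumbel errors implied by the logit form, the normalisation of the opt-out constant to zero, the coding of the urban attributes as 0/1 indicators, and the attribute levels of the example scenario. The coefficients are the Model 4 point estimates reported in Table 3.4.

```python
import math

# Model 4 point estimates transcribed from Table 3.4.
BETA = {"GA": 0.196, "GV": 0.391, "BL": 0.957, "BC": -0.129, "dP": -1.93}
# Opt-out constant normalised to zero (assumption; not reported in the table).
ASC = {"alt1": 1.64, "alt2": 1.58, "opt-out": 0.0}


def systematic_utility(alt, attrs):
    """Deterministic part of Eq. (3.4); missing attributes count as zero."""
    return ASC[alt] + sum(BETA[k] * attrs.get(k, 0.0) for k in BETA)


def logit_probabilities(attrs_by_alt):
    """Multinomial logit probabilities (assumes i.i.d. Gumbel errors)."""
    v = {alt: systematic_utility(alt, a) for alt, a in attrs_by_alt.items()}
    v_max = max(v.values())  # subtract the maximum for numerical stability
    expv = {alt: math.exp(val - v_max) for alt, val in v.items()}
    total = sum(expv.values())
    return {alt: e / total for alt, e in expv.items()}


# Hypothetical scenario: both alternatives have green areas and no rent
# change; only alternative 1 has a bike lane.
scenario = {
    "alt1": {"GA": 1, "BL": 1, "dP": 0.0},
    "alt2": {"GA": 1, "BL": 0, "dP": 0.0},
    "opt-out": {},
}
probs = logit_probabilities(scenario)
print(probs)
```

In this scenario the only difference between the two alternatives, apart from their constants, is the bike lane, so the positive BL coefficient gives alternative 1 the larger probability.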

The dataset has 588 observations, twelve of which (2.04%) are associated with indifference statements, and 26 with the opt-out alternative. We estimated four different models (Table 3.4), similar to those described in the third case study. Model 1 ignores the individuals stating their indifference. Model 2 does not consider different scale parameters for the observations associated with the indifference sets, while Model 3 adds different scale parameters but considers the same weight for all observations; finally, Model 4 uses the proposed framework (different scale parameters and different weights).

Table 3.4. Estimated models. Case study 4.

| Variable | Model 1 | Model 2 | Model 3 | Model 4 |
|---|---|---|---|---|
| GA | 0.204 (0.112) | 0.196 (0.110) | 0.189 (0.109) | 0.196 (0.110) |
| GV | 0.392 (0.138) | 0.392 (0.136) | 0.391 (0.134) | 0.391 (0.136) |
| BL | 0.977 (0.120) | 0.956 (0.118) | 0.937 (0.118) | 0.957 (0.119) |
| BC | −0.133 (0.130) | −0.129 (0.128) | −0.126 (0.127) | −0.129 (0.128) |
| ΔP | −1.97 (0.407) | −1.93 (0.400) | −1.89 (0.394) | −1.93 (0.400) |
| ASC1 | 1.60 (0.235) | 1.64 (0.233) | 1.68 (0.232) | 1.64 (0.234) |
| ASC2 | 1.54 (0.236) | 1.58 (0.235) | 1.62 (0.234) | 1.58 (0.235) |
| λ indifference-set | – | – | 0.977 (0.344) | 0.967 (0.474) |
| Final log-likelihood | −438.742 | −448.481 | −458.165 | −448.478 |
| No. of observations | 576 | 588 | 600 | 588 |

As can be observed, in this case all estimated models perform similarly; in fact, it is not possible to identify statistically significant differences between the parameters estimated by each model. This may be related to the fact that only twelve individuals (2.04%) stated their indifference among alternatives. It implies that considering involved specifications is not crucial when the number of indifference observations is relatively low. In line with the previous case study, it is not possible to identify differences associated with the inclusion of different scale parameters for the indifference sets, which supports the hypothesis that the more involved approach does not perform substantially better than assuming equal utility disturbances (Pri=Pi) in Eq. (2.10).

4. Conclusions

Omitting indifference alternatives in an SC experiment can lead to a loss of efficiency, as forcing individuals to opt for an alternative in an indifference situation may add white noise to the data, as has been previously established in the literature. Along the same lines, allowing for indifference alternatives would offer a better depiction of individuals' preferences and increase the richness of the dataset. Nevertheless, modelling with indifference alternatives is not trivial, and the approaches reported in the literature only allow for binary choice situations. We propose an alternative method that can be used in any choice situation. This approach is highly flexible and allows specifying the likelihood function independently of the number of alternatives in the choice-set and in the subset of indifference alternatives.

We test our method with the help of four case studies. In the first two experiments (based on simulated and real data), we considered binary choices, observing that the proposed framework allows recovering the parameters used in the generation of the dataset without major difficulties, outperforming other alternatives. In case studies 3 and 4 (also based on simulated and real data) we considered further alternatives (specifically, opting out). We observe that in most cases the proposed approach allows recovering the real underlying parameters (some problems may arise when considering large indifference thresholds and many indifference observations, but even in this case the proposed approach outperforms competing alternatives).
Nevertheless, ignoring the extra variability associated with selecting an alternative as part of an indifference set does not appear to have a significant effect on the estimated parameters, as is clearly shown in the third and fourth experiments. Thus, it may be advisable for modellers to omit this feature if the data contain few indifference observations. Further research should provide more evidence regarding this hypothesis.

Alternative approaches, such as treating indifference as an opt-out alternative or simply ignoring the indifference observations, are clearly outperformed by the proposed framework and are not capable of recovering the real parameters. Nevertheless, datasets including a very small number of indifference observations do not appear to be severely affected by a poor treatment of indifference. In fact, completely ignoring these observations in case study four does not lead to significantly different results. Hence, the impact of indifference observations depends on their number.

Finally, it can be concluded that individuals not stating a clear preference are indeed providing relevant information, which should be considered by modellers: if you choose not to decide, you still have made a choice.

Acknowledgments

We wish to thank Ricardo Hurtubia for his useful suggestions and for pointing out the convenience of changing the name of the paper, suggesting we should choose a path that's clear, we should choose freewill. We are also grateful to Juan Pablo Sepúlveda for having provided us with the real data used in our first experiment. Finally, we are indebted to the Institute on Complex Engineering Systems (ICM: P-05-004-F; CONICYT: FB0816), the Centre for Sustainable Urban Development, CEDEUS (Conicyt/Fondap/15110020) and the Bus Rapid Transit Centre of Excellence funded by VREF (www.brt.cl), for their support. This article benefited greatly from the helpful comments of two anonymous referees.


Appendix A

See Fig. A1. In this case, respondents were instructed to choose “Any of them” if no alternative was favoured over the other.

Fig. A1. Presentation of choice situations in Case Study 2.

Appendix B

See Fig. B1.

Fig. B1. Presentation of choice situations in Case Study 4.


References

Bierlaire, M., 2003. BIOGEME: a free package for the estimation of discrete choice models. In: Proceedings of the 3rd Swiss Transportation Research Conference, Ascona, Switzerland.
Bradley, M.A., Daly, A.J., 1994. Use of the logit scaling approach to test for rank-order and fatigue effects in stated preference data. Transportation 21, 167–184.
Cantillo, V., Ortúzar, J. de D., 2006. Implications of thresholds in discrete choice modelling. Transp. Rev. 26, 667–691.
Cantillo, V., Amaya, J., Ortúzar, J. de D., 2010. Thresholds and indifference in stated choice surveys. Transp. Res. Part B: Methodol. 44, 753–763.
Carson, R.T., Louviere, J.J., Anderson, D.A., Arabie, P., Bunch, D.S., Hensher, D.A., Johnson, R.M., Kuhfeld, W.F., Steinberg, D., Swait, J., Timmermans, H., Wiley, J.B., 1994. Experimental analysis of choice. Mark. Lett. 5, 351–367.
Chapman, R.G., Staelin, R., 1982. Exploiting rank ordered choice set data within the stochastic utility model. J. Mark. Res. 19, 288–301.
Chorus, C., 2012a. Random regret minimization: an overview of model properties and empirical evidence. Transp. Rev. 32, 75–92.
Chorus, C.G., 2012b. Logsums for utility-maximizers and regret-minimizers, and their relation with desirability and satisfaction. Transp. Res. Part A: Policy Pract. 46, 1003–1012.
Coombs, C.H., Dawes, R.M., Tversky, A., 1970. Mathematical Psychology: An Elementary Introduction. Prentice-Hall, Englewood Cliffs, New Jersey.
Dhar, R., 1997. Consumer preference for a no-choice option. J. Consum. Res. 24, 215–231.
Dhar, R., Simonson, I., 2003. The effect of forced choice on choice. J. Mark. Res. 40, 146–160.
Fenichel, E.P., Lupi, F., Hoehn, J.P., Kaplowitz, M.D., 2009. Split-sample tests of “no opinion” responses in an attribute-based choice model. Land Econ. 85, 348–362.
Hess, S., Beck, M.J., Chorus, C.G., 2014. Contrasts between utility maximisation and regret minimisation in the presence of opt out alternatives. Transp. Res. Part A: Policy Pract. 66, 1–12.
Kontoleon, A., Yabe, M., 2003. Assessing the impacts of alternative ‘opt-out’ formats in choice experiment studies: consumer preferences for genetically modified content and production information in food. J. Agric. Policy Resour. 5, 1–43.
Krishnan, K.S., 1977. Incorporating thresholds of indifference in probabilistic choice models. Manag. Sci. 23, 1224–1233.
Louviere, J.J., Hensher, D.A., Swait, J.D., 2000. Stated Choice Methods: Analysis and Applications. Cambridge University Press, Cambridge.
McFadden, D., 1974. Conditional logit analysis of qualitative choice behaviour. In: Zarembka, P. (Ed.), Frontiers in Econometrics. Academic Press, New York, 105–142.
Olsen, G.D., Swait, J.D., 1998. Nothing is Important. Working Paper, University of Calgary.
Quandt, R.E., 1956. A probabilistic theory of consumer behaviour. Q. J. Econ. 70, 507–536.
Roberts, J.H., Lattin, J.M., 1991. Development and testing of a model of consideration set composition. J. Mark. Res. 28, 429–440.
Schlereth, C., Skiera, B., 2016. Two new features in discrete choice experiments to improve willingness-to-pay estimation that result in SDR and SADR: separated (adaptive) dual response. Manag. Sci. http://dx.doi.org/10.1287/mnsc.2015.2367.
Swait, J.D., Erdem, T., 2007. Brand effects on choice and choice set formation under uncertainty. Mark. Sci. 26, 679–697.
Thurstone, L.L., 1927. A law of comparative judgment. Psychol. Rev. 34, 273–286.
