Transportation Research Part A 46 (2012) 1641–1653
Transportation Research Part A journal homepage: www.elsevier.com/locate/tra
An analysis of destination choice for opaque airline products using multidimensional binary logit models
Misuk Lee a, Alexandre Khelifa a,b, Laurie A. Garrow a,⇑, Michel Bierlaire b, David Post c
a Georgia Institute of Technology, School of Civil and Environmental Engineering, 790 Atlantic Drive, Atlanta, GA 30332-0355, USA
b Transport and Mobility Laboratory, School of Architecture, Civil and Environmental Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland
c SigmaZen, Germany
Article info
Article history: Received 17 November 2010 Received in revised form 8 August 2012 Accepted 14 August 2012
Keywords: Opaque products Discrete choice models Airline passenger behavior
Abstract
We investigate how customers respond to an opaque airline product offered by a European carrier. In this opaque product design, customers are randomly assigned to travel to one of approximately ten destinations; however, for a fee they may exclude one or more destinations from the choice set (or a particular package design) prior to learning which destination they will travel to. We use a multidimensional binary logit model to predict the probability that one or more alternatives will be chosen by a customer. Results show that customers are more likely to pay to exclude destinations located close to the origin airport and destinations where the language spoken is the same as at the origin airport. Length of stay, cost of living at the destination, and measures of destination attractiveness are also found to be significant predictors for some package designs. Based on these findings, we offer general recommendations for how to design opaque packages for airline customers. © 2012 Elsevier Ltd. All rights reserved.
1. Introduction and motivation
Over the past 15 years, the competitive structure of the airline industry has changed dramatically due to the emergence of online travel agencies (such as Expedia, Orbitz and Travelocity) that facilitated the comparison of prices across airline competitors. This emergence coincided with an increased market penetration of low cost carriers (LCCs). LCCs use different pricing models than those used by legacy carriers. Specifically, the majority of LCCs use one-way pricing, which results in separate price quotes for the departing and returning portions of a trip. One-way pricing effectively eliminates the ability to segment business and leisure travelers based on a Saturday night stay requirement (i.e., business travelers are less likely to have a trip that involves a Saturday night stay). Combine the use of one-way pricing with the fact that the internet has increased the transparency of prices for consumers, and the result is that today almost half of all air leisure travelers state that they purchase the lowest price they find when using online channels (Harteveldt et al., 2004).
In this environment, several airlines are beginning to explore the viability of using opaque products to stimulate demand from leisure travelers who exhibit a high degree of travel flexibility, without cannibalizing revenue from business travelers. As defined by Post (2010), “an opaque product is defined as a product in which one or more of the attributes that make up the product are hidden from the purchaser (that is, they’re not fully specified by the supplier) until after payment is made (e.g., see Gallego and Phillips, 2004; Fay, 2008).” From a historical perspective, it is important to note that opaque airline products were first introduced not by airlines, but by new companies such as Priceline and Hotwire.
⇑ Corresponding author (L.A. Garrow). Tel.: +1 404 385 6634. E-mail address: michel.bierlaire@epfl.ch (M. Bierlaire).
0965-8564/$ - see front matter © 2012 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.tra.2012.08.009
Many of the first articles in this area focused on: (1) Priceline, the first airline reverse auction site, which entered the market in 1998 (e.g., see Kannan and Kopalle, 2001; Fay, 2004; Spann et al., 2004); and (2) Travelocity which, along with Expedia, entered the market in 1996 as an online travel agency and later started competing with Priceline and Hotwire by providing opaque products (e.g., see Smith et al., 2007; Zhouaoui and Rao, 2009). To date, there have been several articles examining opaque products in the travel industry in the context of revenue management (e.g., see Gallego and Phillips, 2004). Several authors have observed a potential for profit from the opaque sale of distressed inventory (e.g., see Jiang, 2007). From a theoretical perspective, other authors have examined opaque products to determine the conditions under which offering them may be worthwhile; the majority of these studies describe several favorable sales environments (e.g., see Fay, 2008; Jiang, 2007; Granados et al., 2008; Jerath et al., 2009). However, despite the potential to increase revenues, few airlines have investigated the viability of offering opaque products directly. Conceptually, the ability of airlines to offer an opaque product directly to customers has several benefits, most notably the ability to tailor products to potential customers and the ability to increase brand awareness. Two applications of opaque products offered directly by airlines are reported in the literature and reviewed by Post (2010): one by Freedom Air, a former subsidiary of Air New Zealand, and one by Germanwings, a wholly-owned subsidiary of Lufthansa. In the context of Germanwings, Post (2010) notes that “the customer can select a group of possible destinations, any one of which she is prepared to fly onto a particular departure and return date. A penalty fee is charged for each destination that she deletes from the group, thereby making the group smaller and reducing the uncertainty.
Only after payment is made is the customer informed of her flight itinerary.” Post (2010) continues, noting that this type of opaque product has increased load factors at Germanwings by 1.5%. Further, a detailed examination revealed that these new passengers are almost completely incremental, i.e., they represent new customer demand (Mang et al., 2009). Post (2010) also analyzed a second opaque product variant used by Freedom Air in which “the destination and the number of nights at the destination were known, but the outbound and return flights were hidden within a customer-specified time window until some time before the actual departure date. In addition, the consumer could vary this advance warning as an additional parameter to influence the offer price.” The increase in airline profits from this opaque product was approximately 6% (Mang et al., 2009).
The objective of this paper is to understand customer behavior as it relates to product selection for an opaque product (such as that offered by Germanwings). The paper does not focus on the revenue management implications of offering such a product, but rather on understanding which features of the opaque product are attractive to consumers. This objective is consistent with prior studies published in Transportation Research Part A that have examined one or more aspects of air travel behavior (e.g., see Brey and Walker, 2011; Chen, 2008; Lu and Peeta, 2009; Peeta et al., 2008; Tsamboulas and Nikoleris, 2008).
The remainder of this paper is organized as follows. First, we review the opaque destination choice product. Next, we introduce the multidimensional binary logit model, which was used to investigate which destinations customers are more likely to pay to exclude from packages. This is followed by an explanation of the data used for estimation, results, and model validation. The paper concludes with recommendations on how to design profitable opaque destination products.
2. Flexible destination choice product
Data for this study come from an opaque destination product offered by a European carrier. The opaque destination product enables customers to receive known prices (at steep discounts) in exchange for their willingness to accept uncertainty in their travel destination (but not their travel dates). Customers traveling from a particular airport are told that for a round trip fare of 39.98 euros, they will be randomly assigned to one of approximately ten destinations. If one or more of these destinations is unappealing, the customer may elect to exclude them for a fee of 5 euros per excluded city. A minimum of three destinations must remain in the choice set, as the airline must maintain some opaqueness so as not to dilute revenue from its traditional products. All destinations included in a package are served via non-stop flights. An example of the flexible destination choice product is shown in Fig. 1. A total of four packages were examined for each of two origin airports. Fig. 1 shows these four packages (defined as the Western Europe, Eastern Europe, Culture, and Party packages) from Origin 1; similar packages apply for Origin 2.

[Fig. 1. Examples of different packages. Panels list destination codes: Western Europe from Origin 1 (1, 2, 3, 4, 5, 6, 7A, ..., K1); Eastern Europe from Origin 1 (11, 12, 13, 14, ..., K2); Culture from Origin 1 (1, 2, 4, 7A, 8, 11, 12, ..., K3); Party from Origin 1 (1, 2, 3, 4, 5, 11, 13, ..., K4).]

The destinations served by the Western Europe and the Eastern Europe packages are mutually exclusive; however, a particular destination may appear in more than one package (e.g., a city may be in both the Western Europe and the Culture packages). Destination 7A is denoted with a letter to indicate that this destination is served at a less-than-daily frequency. That is, for the Western Europe and Culture packages, Origin 1 serves destination “7A” only on certain days of the week.
From a modeling perspective, there are several key points of interest. First, the presence of less-than-daily frequencies may influence the construction of the underlying choice set considered by customers. That is, if customers are aware of an airline’s schedule, they would not pay to exclude a city that had no flights for their desired departure and return dates. We investigate this question in detail in the results section. We model the choice of destination exclusion conditional on a particular package selection using a multidimensional binary logit model. This model assumes that destination exclusion probabilities (conditional on a particular package selection) are independent of other destinations in the choice set, i.e., each destination is independently or “sequentially” examined by the customer, who decides whether or not to choose that destination. This model is presented in the next section.

3. Multidimensional binary logit model
We use a multidimensional binary logit model to study which destinations customers are more likely to pay to exclude from packages. In the multidimensional binary logit model, a decision maker faces a choice among $J$ alternatives. The utility that decision maker $n$ obtains from alternative $i \in \{1, \dots, J\}$ is $U_{ni}$, which is decomposed as $V_{ni} + \varepsilon_{ni}$, where $V_{ni}$ is defined as the representative utility and $\varepsilon_{ni}$ as an error term.
Suppose that the decision maker chooses (possibly multiple) alternatives whose utility exceeds her expectation or threshold, $\bar{U}_n = \bar{V}_n + \bar{\varepsilon}_n$. The decision to choose an alternative is a binary decision. Further, we may assume that $\bar{V}_n = 0$ since consumer utilities are nominal. That is, for identification purposes it is necessary to set the utility of an alternative to a constant; here, we choose zero for convenience. Then, the probability of choosing alternative $i$ can be represented as a binary logit model where the difference in the $\varepsilon$'s is logistic:

$$P_{ni} = \Pr(V_{ni} + \varepsilon_{ni} > \bar{\varepsilon}_n) = \frac{e^{V_{ni}}}{1 + e^{V_{ni}}}.$$

Let $S_n$ denote the set of alternatives that decision maker $n$ selected, and let $y_{ni} = 1$ if individual $n$ chose alternative $i$ and zero otherwise. In other words, $S_n = \{i \mid y_{ni} = 1\}$. Then the joint probability of choice is found as:

$$\Pr(S_n) = \prod_{i \in S_n} \frac{e^{V_{ni}}}{1 + e^{V_{ni}}} \prod_{j \notin S_n} \frac{1}{1 + e^{V_{nj}}} = \prod_{i=1}^{J} \frac{e^{y_{ni} V_{ni}}}{1 + e^{V_{ni}}}.$$
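As an illustration of the choice probability and the joint probability of a chosen set, the following minimal Python sketch evaluates both for a single decision maker. The destination labels and utility values are hypothetical and are not taken from the study; the function names are chosen here for illustration only.

```python
import math
from itertools import combinations

def choice_probability(v):
    """Binary logit probability e^v / (1 + e^v) that an alternative is chosen."""
    return 1.0 / (1.0 + math.exp(-v))

def joint_probability(utilities, chosen):
    """Probability of observing exactly the chosen set S_n, assuming each
    alternative is chosen independently of the others.

    utilities: dict mapping alternative label -> representative utility V_ni
    chosen:    set of alternative labels in S_n
    """
    prob = 1.0
    for alt, v in utilities.items():
        p = choice_probability(v)
        prob *= p if alt in chosen else (1.0 - p)
    return prob

# Hypothetical representative utilities for three destinations.
utilities = {"A": 0.0, "B": 1.0, "C": -1.0}
p_set = joint_probability(utilities, {"A", "B"})

# The probabilities of all possible chosen sets (including the empty set) sum to one.
labels = list(utilities)
total = sum(
    joint_probability(utilities, set(subset))
    for size in range(len(labels) + 1)
    for subset in combinations(labels, size)
)
```

Because each alternative contributes its own binary factor, the number of chosen alternatives is not fixed in advance, which is what distinguishes this setup from a single-choice model.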
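A sample of $N$ decision makers is obtained for the purpose of estimation. Further, let the representative utility be a linear-in-parameters function of $\beta$: $V_{ni} = \beta^T x_{ni}$, where $x_{ni}$ is a vector of observed variables for alternative $i$. The log-likelihood function is:

$$LL(\beta) = \sum_{n=1}^{N} \ln(\Pr(S_n)) = \sum_{n=1}^{N} \left[ \sum_{i \in S_n} \ln\left( \frac{e^{\beta^T x_{ni}}}{1 + e^{\beta^T x_{ni}}} \right) + \sum_{j \notin S_n} \ln\left( \frac{1}{1 + e^{\beta^T x_{nj}}} \right) \right] = \sum_{n=1}^{N} \sum_{i=1}^{J} \left[ y_{ni} \beta^T x_{ni} - \ln\left( 1 + e^{\beta^T x_{ni}} \right) \right].$$

The maximum likelihood estimates are derived by solving the first-order condition:

$$\frac{dLL(\beta)}{d\beta} = \sum_{n=1}^{N} \sum_{i=1}^{J} \left( y_{ni} x_{ni} - P_{ni} x_{ni} \right) = \sum_{n=1}^{N} \sum_{i=1}^{J} (y_{ni} - P_{ni}) x_{ni} = 0.$$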
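The log-likelihood and its first-order condition can be sketched numerically as follows. This is a minimal illustration on simulated data, not the estimation procedure used in the study: the solver (plain gradient ascent on the mean log-likelihood), the sample sizes, the two attributes, and the "true" coefficients are all assumptions made for the example.

```python
import math
import random

def sigmoid(z):
    """Logistic function e^z / (1 + e^z)."""
    return 1.0 / (1.0 + math.exp(-z))

def utility(beta, x):
    """Linear-in-parameters representative utility beta' x."""
    return sum(b * xk for b, xk in zip(beta, x))

def log_likelihood(beta, rows):
    """Sum of y * beta'x - ln(1 + e^{beta'x}) over all (decision maker, alternative) rows."""
    return sum(
        y * utility(beta, x) - math.log1p(math.exp(utility(beta, x)))
        for x, y in rows
    )

def score(beta, rows):
    """Gradient of the log-likelihood: sum of (y - P) * x over all rows."""
    grad = [0.0] * len(beta)
    for x, y in rows:
        resid = y - sigmoid(utility(beta, x))
        for k, xk in enumerate(x):
            grad[k] += resid * xk
    return grad

def fit(rows, n_params, step=1.0, n_iter=300):
    """Gradient ascent on the mean log-likelihood (an assumed solver for illustration)."""
    beta = [0.0] * n_params
    for _ in range(n_iter):
        grad = score(beta, rows)
        beta = [b + step * g / len(rows) for b, g in zip(beta, grad)]
    return beta

# Simulated data: 600 hypothetical decision makers, 5 alternatives each, 2 attributes.
random.seed(0)
true_beta = [0.5, -1.0]
rows = []
for n in range(600):
    for i in range(5):
        x = [random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)]
        y = 1 if random.random() < sigmoid(utility(true_beta, x)) else 0
        rows.append((x, y))

beta_hat = fit(rows, n_params=2)
```

At the returned estimate the score is numerically zero, which is the first-order condition above. Stacking every (decision maker, alternative) pair as one row is possible precisely because the alternatives are treated as independent binary choices.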
Note that the first-order condition for the multidimensional binary logit model has exactly the same form as the first-order condition for a binary logit or multinomial logit (MNL) model (McFadden, 1974). The key distinction is that in the MNL model an individual chooses one alternative, whereas in the multidimensional binary logit model individuals may choose multiple alternatives. The multidimensional binary logit model is also distinct from bundled consumption problems, in which utility is derived from consuming all chosen items. That is, here we assume that each alternative in the choice set is chosen independently of the other alternatives (which results in expanding the package design into a set of binary choice models). This is the key assumption in our model (see footnote 1): the decision to choose or not choose an alternative is independent across alternatives. In reality, there may be a component of the error term that is related to the entire set $S_n$.
The multidimensional binary logit model is appropriate in situations in which customers select (or exclude) one or more alternatives or product attributes, and where the product ultimately consumed is assigned to the customer (rather than chosen by the customer directly). This type of situation is common in industries that sell opaque products. For example, Hotwire allows customers to select hotels that meet specific criteria, e.g., free internet, free parking, indoor pool, fitness center, airport shuttle, etc. The specific property is not known to the customer in advance of purchase, only characteristics of the hotel. Hotwire could potentially use our model to determine which hotel characteristics are most important to show to customers.

Footnote 1: We examined extensions to the multidimensional binary logit model that relaxed this assumption; however, the interpretation of the $\beta$ coefficients was similar for the more complex models, so we report results only for the multidimensional binary logit model.

4. Data
Choice models based on the multidimensional binary logit model were estimated for the opaque destination airline product. Multiple data sources were used to investigate how the variables shown in Table 1 influence destination exclusion probabilities. The first eight variables were obtained from the airline’s booking and scheduling databases for bookings that occurred across a 1-year period, i.e., from August 8, 2008 to August 7, 2009. These databases provide information on the choice set shown to the consumer, which destinations in the choice set were actually available for the customer’s specified departure and return dates (e.g., some destinations may have had less-than-daily service and/or have been sold out), which destination(s) were excluded by the customer, and which destination was ultimately assigned to the customer.
We employed several methods to identify leisure travelers in our database. Individuals who are travelling with others and/or have children present are typically more likely to be leisure travelers. From these databases, we observed the number of passengers travelling together on the same reservation (which can be represented as the sum of the number of adults or children 13 or older, the number of children ages 2–12, and the number of children younger than 2). The airline industry also frequently considers weekend departures, Sunday returns, and longer stays as trip segmentation variables. That is, individuals departing later in the week, returning on a Sunday and/or staying more than three nights at the destination may be more likely to be leisure travelers. Given that departure day of the week, return day of the week, and length of stay are typically highly correlated in airline booking data, it is common to include the combination of variables that fits the data best.
An additional seasonality variable controls for destination preferences that may vary by season, e.g., a ski destination may be more attractive at certain times of the year.
The next four variables describe characteristics of the potential destinations. The first variable, language, is a dummy variable equal to one if the language spoken at the destination is the same as that spoken at the origin city. Distance represents how far the origin and destination cities are from each other. We classified distance into four categories: (1) short distances, defined as trips less than 500 km that are served by flights of approximately 45 min; (2) medium-short distances, defined as trips between 501 and 1000 km that are served by flights of approximately 1 h; (3) medium-long distances, defined as trips between 1001 and 1500 km that are served by flights of approximately 1–2 h; and (4) long distances, defined as trips greater than 1500 km that are served by flights of more than 2 h.
Our cost of living variable was also classified into four categories based on an assessment of the Employment Conditions Abroad (ECA) International report and the “Big Mac Index” published by The Economist. The ECA International report calculates a cost of living based on a basket of 128 consumer goods and services commonly purchased by expatriates in over 300 worldwide locations. The “Big Mac Index” defines the consumer goods basket as a McDonald’s Big Mac (Employment Conditions Abroad Limited, 2009; The Economist, 2010). Based on these cost of living measures, the relative cost of living at the destination was classified as low, medium, high and very high. The cost of living at the two origin airports examined in this study is high.
The final variable, attractiveness, provides a measure of how attractive a destination city is relative to other destination cities in the package.
Attractiveness was defined differently for each package based on which characteristic(s) were used to market the package. For example, in assigning a score of 1–4 (where 1 is not attractive and 4 is attractive) for the party package, three main websites were used: The World’s Top Party Cities from Forbes, The Best Party Cities in the World by Club Planet, and The 10 Best Party Cities in Europe from RatesToGo (Murphy, 2009; Fazio, 2009; Cho, 2008). In defining attractiveness for the culture package, we used the Travel Inspiration pages of TripAdvisor (2010). These pages rank cities by a “popularity” index; this index reflects the opinions of individuals from multiple countries and parts of the world. Attractiveness measures for the Western and Eastern European packages were based upon a study issued by the Globalization and World Cities Study Group and Network at Loughborough University in the UK (Beaverstock et al., 1999). The Loughborough study provides a rating of cities based on their provision of advanced producer services such as accounting, advertising, finance and law.
Finally, note that we could not include exclusion price in the model as it did not vary across the choice sets: the cost to exclude each destination was always 5 euros. However, exclusion prices could easily be added to our models for future opaque product designs, as long as these prices varied across consumers.

Table 1
Variables examined in study.

Variable                Description
Number of passengers    Number of individuals travelling together on the same reservation
Number of adults        Number of individuals 13 years or older travelling together on the same reservation
Number of children      Number of individuals between the ages of 2 and 12 travelling together on the same reservation
Number of infants       Number of individuals younger than 2 years old travelling together on the same reservation
Weekend departure       Dummy variable equal to one if the trip begins on a Thursday, Friday or Saturday
Sunday return           Dummy variable equal to one if the trip ends on a Sunday
Length of stay          Number of nights spent away from home
Season                  Set of four dummy variables that indicate whether the trip took place in spring, summer, fall, or winter
Language                Dummy variable equal to one if the language spoken in the destination city is the same as the language spoken at the origin city
Distance                Set of four dummy variables that classify how far a destination is located from the origin (1 = short distance, less than 500 km or a 45 min flight; 2 = medium-short distance between 501 and 1000 km or a 1 h flight; 3 = medium-long distance between 1001 and 1500 km or a 1–2 h flight; 4 = long distance greater than 1500 km or a 2+ hour flight)
Cost of living          Set of four dummy variables that classify the cost of living at the destination into one of four ranked categories (1 = low, 2 = medium, 3 = high, 4 = very high). The cost of living at the two origin cities examined in this study is very high
Attractiveness          Set of four dummy variables that classify the attractiveness of a destination as marketed by a specific package into one of four ranked categories (1 = not attractive, 2 = somewhat unattractive, 3 = somewhat attractive, 4 = attractive)
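Several of the variable definitions in Table 1 are simple coding rules, which the following Python sketch illustrates. The function names and example dates are hypothetical, and the treatment of a trip of exactly 500 km (assigned here to the short category) is an assumption, since the text defines the bands as "less than 500 km" and "between 501 and 1000 km".

```python
from datetime import date

def distance_category(km):
    """Distance band: 1 = short, 2 = medium-short, 3 = medium-long, 4 = long.
    Boundary values (e.g. exactly 500 km) are assigned to the lower band by assumption."""
    if km <= 500:
        return 1
    if km <= 1000:
        return 2
    if km <= 1500:
        return 3
    return 4

def trip_features(departure, return_date):
    """Trip-level variables derived from the departure and return dates."""
    # date.weekday(): Monday = 0, ..., Thursday = 3, Friday = 4, Saturday = 5, Sunday = 6
    return {
        "weekend_departure": 1 if departure.weekday() in (3, 4, 5) else 0,
        "sunday_return": 1 if return_date.weekday() == 6 else 0,
        "length_of_stay": (return_date - departure).days,  # nights away from home
    }

# Hypothetical trip: depart Thursday 6 August 2009, return Sunday 9 August 2009.
features = trip_features(date(2009, 8, 6), date(2009, 8, 9))
band = distance_category(750)
```

In an estimation data set, a four-level category such as the distance band would typically enter the utility function as a set of dummy variables with one level omitted as the reference.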
5. Results
This section discusses estimation results and is organized in two sections. The first section investigates whether customers recognized that there were flights with less-than-daily service in the choice set. The second section summarizes findings from the multidimensional binary logit models.
5.1. Choice sets
One of the interesting aspects of the flexible destination product was that most, but not all, of the destinations had at least one daily flight from the origin airport. As a consequence, there can be destination cities in the package that have no flights for the particular departure and return dates selected by the customer. This leads to an interesting behavioral question: Are customers aware that some destinations may not be available for their specified departure and return dates? If so, customers would realize that there is no need to pay 5 euros to exclude these cities, since the airline cannot accommodate their particular nonstop departure and return date request. To investigate this question, we conducted extensive Chi-square tests for a 6 month period in which we had both complete schedule and booking data (April 5, 2008–October 7, 2008). The results of our analysis, based on an examination of more than 11 cities, clearly showed that customers were just as likely to exclude a destination irrespective of whether a nonstop flight was actually scheduled. A total of 18 Chi-square tests were conducted (which is greater than 11 as the same city could appear in multiple packages). In only one of the 18 tests was the null hypothesis rejected at the 0.05 level. From a modeling perspective, this enabled us to define the choice set (for destination exclusions) as the set of all destinations included in the package. These results raise the suspicion that opaque purchasers may not be “savvy,” that is, they may not be fully aware of the operational mechanism behind their purchase. From a product design perspective, this result is particularly important, as it suggests that the airline can (cautiously) design packages that include new markets to which it has recently started providing service (note that new markets are often served at less-than-daily frequencies).
As a benefit to both consumers and the airline, including these new destinations in the package can help build customer awareness that the airline serves these new markets. Further, in those situations in which these new destinations have flights that match the customers’ selected departure and return dates, these destinations can be assigned to customers. In this sense, the opaque product can potentially help new markets that may not yet be profitable become profitable. By placing even a modest number of highly flexible customers willing to pay at least 39.98 euros per round-trip in these new markets, the airline is able to “stimulate” demand for these new markets without lowering its price or cannibalizing revenues from its traditional airline products. Further, assuming these extra, highly flexible customers create incremental revenue for the airline, they may help make an unprofitable flight become a profitable one.
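The kind of test described in this section can be sketched as a Pearson Chi-square test of independence on a 2x2 table that crosses whether a nonstop flight was scheduled with whether the customer excluded the destination. The counts below are hypothetical, and the exact test specification used in the study (for example, whether a continuity correction was applied) is not stated, so this is only one plausible form.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for r in range(2):
        for c in range(2):
            expected = row_totals[r] * col_totals[c] / total
            stat += (table[r][c] - expected) ** 2 / expected
    return stat

# Chi-square critical value for 1 degree of freedom at the 0.05 level.
CRITICAL_VALUE_05_DF1 = 3.841

# Hypothetical counts for one destination in one package:
# rows    = nonstop flight scheduled on the requested dates (yes, no)
# columns = customer excluded the destination (yes, no)
counts = [[120, 880], [33, 217]]
statistic = chi_square_2x2(counts)
reject_independence = statistic > CRITICAL_VALUE_05_DF1
```

With these illustrative counts the exclusion rates are 12% and about 13%, and the statistic falls well below the critical value, so independence between scheduling and exclusion is not rejected.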
5.2. Multidimensional binary logit model results
The multidimensional binary logit model results presented in this section were determined using an iterative modeling approach. As part of this modeling process, more than a dozen utility functions were investigated. Of the variables shown in Table 1, several are not included in the final model specification because they were not significant and/or resulted in parameter estimates with incorrect signs. For example, we were not able to refine the total number of travelers to capture the presence and/or number of children and youths in different age groups due to small sample sizes. In the discussion that follows we focus on all remaining variables with the exception of seasonality. We expect that an individual’s preference for particular destinations will vary by season; however, the particular pattern of preference depends on the destination cities included in a particular package (which cannot be revealed due to non-disclosure agreements).
5.2.1. Western European packages
Estimation results for the Western European packages are summarized in Table 2 for two estimation samples: one that contains 80% of the observations (which we used for our validation analysis), and a second that contains 100% of the observations (which we used to verify the stability of parameter estimates). The reference choice used for estimation was destination inclusion; thus, positive coefficients correspond to higher likelihoods of destination inclusion or, alternatively, lower likelihoods of destination exclusion.
The Western European package is the package that has the most bookings and, among all of the packages, is the one that provided intuitive interpretations across all variables examined in the study. In terms of passenger characteristics, as the number of passengers increases, the probability that a customer excludes a destination decreases. Intuitively, this makes sense as we would expect larger families and larger parties travelling together to be more sensitive to cost (or perhaps less likely to reach a consensus on where not to go). Similarly, in terms of trip characteristics, the weekend indicator shows that individuals departing later in the week are less likely to exclude destinations. Both of these results are consistent with behaviors we expect to observe from leisure travelers, in the sense that leisure travelers tend to be more price-sensitive and flexible. However, consider the length of stay variable. According to this variable, the longer the trip duration (which is usually indicative of more price-sensitive leisure travelers), the more likely individuals are to exclude a destination. Stated another way, the longer the individual will be away from home, the more she cares about where she goes and the more selective she becomes about the cities she is willing to travel to.
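The sign convention described above can be illustrated with a short Python sketch. The baseline utility and the length-of-stay coefficient below are hypothetical values chosen only to show the direction of the effect; they are not estimates from Table 2.

```python
import math

def inclusion_probability(utility):
    """Binary logit probability that a destination is kept in the package."""
    return 1.0 / (1.0 + math.exp(-utility))

def exclusion_probability(utility):
    """Probability that the customer pays to exclude the destination."""
    return 1.0 - inclusion_probability(utility)

# Hypothetical values: a positive baseline utility of inclusion and a negative
# length-of-stay coefficient (longer stays lower the utility of inclusion).
BASE_UTILITY = 2.0
BETA_LENGTH_OF_STAY = -0.15

exclusion_by_nights = {
    nights: exclusion_probability(BASE_UTILITY + BETA_LENGTH_OF_STAY * nights)
    for nights in (1, 3, 7)
}
```

Under this convention a negative coefficient raises the exclusion probability as the variable increases, which matches the interpretation given for length of stay.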
In terms of the other variables used to describe the destinations, cities in which the cost of living is low are more likely to be included in the flexible destination product. Cities that are more attractive (as defined by the Loughborough study) are also more likely to be included. Cities located close to the origin city are more likely to be excluded. That is, all else being equal, passengers would rather travel to destinations located further from the origin city. Intuitively, this is because the air fare is the same for all cities, implying that the “value” on a per-kilometer basis offered to the customer is higher for destinations located further from the origin cities. Cities located further from the origin are also, in general, less convenient to reach by non-aviation modes.
In addition, destinations in which the language spoken is the same as at the origin airport are more likely to be excluded. Note that this effect is in addition to the distance effect, i.e., passengers are even more likely to exclude cities that are close to the origin airport if the same language is spoken there. This result is consistent with other studies of airline passenger behavior in Europe. That is, several researchers have observed a geographic “boundary effect” in which customers evaluate the attractiveness of an origin or destination airport based not only on distance, but also on whether the airport is located within a particular area (e.g., a specific country). One of the most striking examples of these boundary effects was reported by Jordan Karatzas, Chairman of the Supervisory Board of SkyEurope Airlines, who explains that one of the reasons why, in 2009, SkyEurope had two hubs located in Vienna and Bratislava was that their initial hub in Bratislava was unable to draw passengers from Austria as they had expected, despite the fact that the two cities are located approximately 65 km apart.
Karatzas attributed this to a “border effect” they had not anticipated when initially selecting Bratislava, i.e., passengers’ resistance to crossing from Austria into Slovakia (Karatzas, 2009). Similarly, in our data, the language variable appears to be picking up a border effect, or an interaction with distance. Given two destinations located the same distance from the origin, the one that is located outside the individual’s home country is preferred, i.e., individuals prefer to vacation at destinations outside of their country. The results for the Western European packages from the two origin cities are similar in their interpretation, as are the models based on 80% of the data and all of the data.

5.2.2. Eastern European packages
The results for the Eastern European package are fairly similar to those obtained for the Western European package and are shown in Table 3. Specifically, results related to the number of passengers, weekend departure indicator, and distance offer similar behavioral interpretations. It was not possible to compare the impact of language spoken because there was at most one (and often no) Eastern European city in which the language spoken was the same as at the origin airports. One of the key differences between the Western European and Eastern European packages is that length of stay is not as strong a predictor for the Eastern European package (the variability in the length of stay was not as high for the Eastern European packages). A second key difference is that for the Eastern European packages, the only variable describing the destination city that provided an intuitive interpretation was distance. In contrast with the Western Europe model, the attractiveness of a city and its cost of living did not provide a strong model fit with monotonically increasing parameter estimates across the different categories.
The cost of living, which is typically very low across the majority of Eastern European cities, was not an important factor in determining Eastern European destination exclusion probabilities. Similarly, the attractiveness measure derived from the Loughborough study did not help describe destination exclusion probabilities for the Eastern European destinations. 5.2.3. Party and cultural packages The results from the party and cultural packages are shown in Tables 4 and 5, respectively. In general, the results are consistent with those observed previously, i.e., exclusion probabilities tend to decrease with the number of travelers (although this parameter estimate exhibits the incorrect sign for the culture package from Origin 2). Exclusion probabilities also decrease for departures that begin on a Thursday, Friday, or Saturday. Similarly, for the party package, customers are more likely to exclude destination cities that share a common language with the origin airport; the language indicator was dropped from the culture package due to high correlation with the distance variable. Both packages are more likely to ex-
Table 2
Results for Western European packages.

                                Origin 1: W. Europe           Origin 2: W. Europe
                                80% sample     100% sample    80% sample     100% sample
ASC                             3.09 (47.6)    3.10 (53.3)    3.30 (46.2)    3.35 (52.2)
# Passengers                    0.022 (1.6)    0.036 (2.9)    0.104 (6.8)    0.090 (6.5)
Weekend                         0.227 (8.7)    0.229 (9.8)    0.094 (3.4)    0.110 (4.5)
Length of stay                  0.145 (22.7)   0.151 (26.0)   0.172 (20.5)   0.157 (21.3)
Season
  Spring                        0.254 (6.9)    0.266 (8.1)    0.154 (3.8)    0.185 (5.2)
  Summer                        0.124 (3.9)    0.133 (4.6)    0.132 (3.9)    0.150 (4.9)
  Fall                          0.097 (2.4)    0.127 (3.6)    0.051 (1.1)    0.022 (0.5)
Language
  Same as departure             0.156 (2.0)    0.121 (1.7)    0.763 (20.0)   0.777 (22.8)
Cost of living
  Low                           1.56 (38.6)    1.55 (42.9)    0.250 (6.5)    0.244 (7.1)
  Medium                        0              0              0              0
  High                          0              0              0              0
  Very high                     0              0              0              0
Attractiveness
  Not attractive                1.37 (39.0)    1.37 (43.5)    0.344 (8.4)    0.359 (9.7)
  Somewhat unattractive         1.09 (23.3)    1.12 (26.7)    0              0
  Somewhat attractive           0              0              0              0
  Very attractive               0              0              0              0
Distance
  Short                         1.44 (30.8)    1.46 (34.8)    1.87 (44.3)    1.90 (49.8)
  Medium-short                  0              0              0              0
  Medium-long                   0              0              0              0
  Long                          0              0              0              0
Measures of model fit
  Log-likelihood                21,451.26      26,852.56      17,413.97      21,638.49
  Log-likelihood at constants   25,708.84      32,163.71      20,394.99      25,401.96
  Rho-square at constants       0.1651         0.1656         0.1482         0.1462
  # Customer purchases          4,770          5,963          4,665          5,832
  # MNL comparisons             66,780         83,482         41,985         52,488

Key: Parameter estimate (t-statistic). A parameter of 0 represents the reference level. ASC = alternative-specific constant.
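As a reading aid for entries reported as a parameter estimate with its t-statistic, the sketch below converts a t-statistic into an approximate two-sided p-value (using the standard normal distribution, which is reasonable at these sample sizes) and converts the magnitude of a logit coefficient into an odds multiplier relative to the reference level. The helper names are illustrative and are not taken from the paper. The sketch works with absolute values only, so it makes no assumption about the sign of an estimate.

```python
import math


def two_sided_p_value(t_stat: float) -> float:
    """Approximate two-sided p-value for a t-statistic under a standard normal."""
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


def odds_multiplier(estimate: float) -> float:
    """Magnitude of the change in odds implied by a logit coefficient."""
    return math.exp(abs(estimate))


# (estimate, t-statistic) pairs from the Origin 1, 80% sample column of Table 2
entries = {
    "# Passengers": (0.022, 1.6),
    "Language: same as departure": (0.156, 2.0),
    "Distance: short": (1.44, 30.8),
}

for name, (estimate, t_stat) in entries.items():
    p = two_sided_p_value(t_stat)
    flag = "significant" if p < 0.05 else "not significant"
    print(f"{name}: odds x{odds_multiplier(estimate):.2f}, p = {p:.4f} ({flag} at 5%)")
```

Read this way, the passenger-count estimate in that column is not distinguishable from zero at the 5% level, whereas the distance effect is estimated very precisely.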
Table 3
Results for Eastern European packages.

                                Origin 1: E. Europe           Origin 2: E. Europe
                                80% sample     100% sample    80% sample     100% sample
ASC                             2.50 (24.7)    2.38 (26.6)    2.14 (22.6)    2.16 (25.6)
# Passengers                    0.119 (3.8)    0.126 (4.5)    0.117 (4.6)    0.101 (4.5)
Weekend                         0.033 (0.6)    0.079 (1.5)    0.060 (1.0)    0.070 (1.2)
Length of stay                  0.041 (7.6)    0.038 (8.0)    0.008 (1.3)    0.010 (1.7)
Season
  Spring                        0.038 (0.4)    0.133 (1.7)    0.119 (1.3)    0.092 (1.1)
  Summer                        0.044 (0.6)    0.026 (0.4)    0.065 (0.8)    0.076 (1.1)
  Fall                          0.145 (1.5)    0.192 (2.3)    0.093 (0.9)    0.051 (0.6)
Language
  Same as departure             N/A            N/A            N/A            N/A
Distance
  Short                         0.849 (13.9)   0.862 (15.9)   2.31 (36.9)    2.31 (41.4)
  Medium-short                  0              0              0              0
  Medium-long                   0              0              0              0
  Long                          0              0              0              0
Measures of model fit
  Log-likelihood                4,208.37       5,300.14       3,524.82       4,430.62
  Log-likelihood at constants   4,352.04       5,486.68       4,229.36       5,313.89
  Rho-square at constants       0.0340         0.0330         0.1662         0.1666
  # Customer purchases          1,040          1,300          872            1,090
  # MNL comparisons             12,480         15,600         9,592          11,990

Key: Parameter estimate (t-statistic). A parameter of 0 represents the reference level. ASC = alternative-specific constant.
Table 4
Results for party packages.

                                Origin 1: Party               Origin 2: Party
                                80% sample     100% sample    80% sample     100% sample
ASC                             2.84 (42.6)    2.81 (46.9)    2.64 (23.5)    2.68 (26.8)
# Passengers                    0.080 (4.6)    0.103 (6.6)    0.034 (1.3)    0.018 (0.8)
Weekend                         0.185 (5.2)    0.193 (6.0)    0.170 (2.6)    0.120 (2.1)
Length of stay                  0.038 (9.9)    0.038 (11.2)   0.074 (6.3)    0.068 (6.5)
Season
  Spring                        0.129 (2.3)    0.183 (3.6)    0.246 (2.7)    0.186 (2.2)
  Summer                        0.008 (0.2)    0.039 (0.9)    0.242 (3.2)    0.272 (4.0)
  Fall                          0.045 (0.7)    0.008 (0.1)    0.519 (3.6)    0.399 (2.9)
Language
  Same as departure place       1.56 (23.1)    1.58 (26.0)    0.893 (9.1)    0.852 (9.7)
Distance
  Short                         1.45 (22.6)    1.42 (24.6)    1.24 (15.0)    1.24 (16.7)
  Medium-short                  0.992 (22.5)   0.964 (24.6)   1.11 (13.1)    1.08 (14.2)
  Medium-long                   0              0              0              0
  Long                          0              0              0              0
Measures of model fit
  Log-likelihood                11,247.00      14,072.26      3,255.61       4,067.07
  Log-likelihood at constants   12,927.43      16,170.35      3,595.36       4,468.56
  Rho-square at constants       0.1297         0.1300         0.0898         0.0945
  # Customer purchases          2,376          2,970          723            904
  # MNL comparisons             35,640         44,550         7,953          9,944

Key: Parameter estimate (t-statistic). A parameter of 0 represents the reference level. ASC = alternative-specific constant.
Table 5
Results for culture packages.

                                Origin 1: Culture             Origin 2: Culture
                                80% sample     100% sample    80% sample     100% sample
ASC                             2.95 (44.5)    2.88 (48.6)    3.09 (32.9)    3.06 (36.8)
# Passengers                    0.016 (0.9)    0.035 (2.1)    0.012 (0.5)    0.016 (0.8)
Weekend                         0.351 (10.4)   0.368 (12.2)   0.404 (7.7)    0.395 (8.5)
Length of stay                  0.051 (9.5)    0.053 (10.8)   0.054 (6.9)    0.049 (6.9)
Season
  Spring                        0.246 (4.3)    0.223 (4.4)    0.307 (4.3)    0.261 (4.1)
  Summer                        0.511 (11.3)   0.461 (11.4)   0.237 (3.9)    0.250 (4.6)
  Fall                          0.338 (6.4)    0.310 (6.6)    0.483 (3.6)    0.373 (3.0)
Language
  Same as departure place       Dropped        Dropped        Dropped        Dropped
Distance
  Short                         3.02 (66.6)    3.02 (74.3)    3.42 (47.2)    3.39 (52.6)
  Medium-short                  0.868 (20.9)   0.874 (23.5)   1.33 (19.2)    1.33 (21.5)
  Medium-long                   0              0              0              0
  Long                          0              0              0              0
Measures of model fit
  Log-likelihood                12,360.15      15,439.77      4,962.60       6,243.67
  Log-likelihood at constants   15,295.38      19,089.26      6,577.79       8,226.66
  Rho-square at constants       0.1912         0.1919         0.2411         0.2456
  # Customer purchases          2,370          2,963          1,195          1,494
  # MNL comparisons             33,180         41,482         13,145         16,434

Key: Parameter estimate (t-statistic). A parameter of 0 represents the reference level. ASC = alternative-specific constant.
clude destination cities that are located nearer to the origin. The attractiveness and cost of living variables were not significant and/or did not provide an intuitive interpretation across all levels and were excluded from the final specification. 5.2.4. Summary of main findings from multidimensional binary logit models Table 6 summarizes the main findings across all packages and origin cities based on the multidimensional binary logit models. As shown in the table, destination cities in which the language is the same as the origin cities and destination cities that are located closest to the origin city are more likely to be excluded from the opaque destination choice product. This result is consistent across all marketed packages. Further, for the Western European package (which is chosen most often)
Table 6
Summary of destination characteristics impacting destination choice exclusions.

Package          Same language   Increased distance   Increased attractiveness   Increased cost of living
Western Europe   ↑               ↓                    ↓                          ↓
Eastern Europe   N/A             ↓                    –                          –
Party            ↑               ↓                    –                          –
Culture          –               ↓                    –                          –

Key: An upward (downward) arrow indicates an increase (decrease) in destination exclusion probabilities.
attractiveness and cost of living variables are found to be important in describing destination exclusion probabilities, with those cities that have the lowest cost of living and highest level of attractiveness being the least likely to be excluded. From a package design perspective, these results are important as they provide insights into the mix of destinations that should be included in a package to encourage customers to exclude one or more destinations. Airlines can earn additional revenues by including in the package cities that speak the same language as the origin city and cities located within 500 km, as these are the cities that are least attractive. However, there is clearly a balance that the airline needs to strike between including unattractive destinations in the package (which drives additional revenues through customers paying penalties to exclude these destinations) and keeping the package attractive enough to maintain a similar sales volume. The validation of model results, discussed in the next section, provides additional insights into those cities that are more likely to be excluded from packages.

6. Validation

The multidimensional binary logit model estimation results reported in Tables 2–5 that were based on the 80% random sample were used to predict inclusion probabilities for the remaining 20%. Table 7 summarizes the average in-sample (based on estimation data) and out-of-sample (based on validation data) inclusion probabilities for each package. For the in-sample data, the percentage of destinations included in a package ranges from 80.0% to 88.9%. These percentages are similar to those for the out-of-sample data (79.9–88.2%). These package-level inclusion probabilities are useful for quickly assessing the overall revenue generated by a package, i.e., in this case the highest overall exclusion probabilities (and also highest volume packages) are those associated with the Western European and culture packages.
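To illustrate how a package-level inclusion probability of the kind summarized in Table 7 could be obtained from destination-level predictions, the following sketch applies a plain binary logit to each destination in a package and averages the results. It is a sketch under stated assumptions rather than the authors' implementation: the utility values are made up for illustration, and the function names and the convention that the utility describes exclusion are not taken from the paper.

```python
import math
from typing import Sequence


def exclusion_probability(utility: float) -> float:
    """Binary logit probability of excluding a destination (assumed convention)."""
    return 1.0 / (1.0 + math.exp(-utility))


def package_inclusion_probability(utilities: Sequence[float]) -> float:
    """Average predicted inclusion probability over the destinations in a package."""
    inclusion = [1.0 - exclusion_probability(v) for v in utilities]
    return sum(inclusion) / len(inclusion)


# Hypothetical exclusion utilities for a five-destination package
utilities = [-3.0, -2.5, -2.5, -1.0, 0.2]
print(f"{package_inclusion_probability(utilities):.3f}")
```

In a holdout setting, the utilities would be computed from parameters estimated on the 80% sample and evaluated on the attributes of the remaining 20% of purchases.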
A comparison of in-sample and out-of-sample inclusion probabilities at the destination level shows that prediction accuracy is associated with the number of customer purchases. Those packages that have a larger number of customer purchases have higher in-sample and out-of-sample prediction accuracy. Here, prediction accuracy is measured as the weighted absolute percent error (WAPE) at the destination level, which is loosely based on a measure commonly used in airline studies (e.g., see Garrow and Koppelman, 2004):

$$\mathrm{WAPE} = \frac{\sum_{d=1}^{D} n_d \,\lvert \mathrm{Actual}_d - \mathrm{Predicted}_d \rvert}{\sum_{d=1}^{D} n_d}$$
where $n_d$ is the number of times destination $d$ was included in a purchased package and $D$ is the total number of destinations. Actual refers to the actual inclusion probability for the in-sample estimation data and Predicted refers to the predicted inclusion probability for the out-of-sample validation data. As seen in Table 7, WAPE ranges from 3.08% to 4.15% for Origin 1 in-sample data and from 3.19% to 5.04% for Origin 1 out-of-sample data; the out-of-sample error is higher for the Western and Eastern European packages and slightly lower for the culture and party packages. Origin 2 packages did not have as many purchased observations and, in general, WAPE measures for Origin 2 are higher than those for Origin 1.
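The WAPE calculation can be sketched in a few lines. The function name, the input values, and the scaling by 100 (to express the result as a percentage, as in Table 7) are assumptions made for illustration; the three-destination example is hypothetical and is not drawn from the paper's data.

```python
from typing import Sequence


def wape(n_included: Sequence[int],
         actual: Sequence[float],
         predicted: Sequence[float]) -> float:
    """Weighted absolute percent error across destinations, in percent."""
    if not (len(n_included) == len(actual) == len(predicted)):
        raise ValueError("all inputs must have one entry per destination")
    # Weight each destination's absolute error by how often it was included
    weighted_error = sum(
        n * abs(a - p) for n, a, p in zip(n_included, actual, predicted)
    )
    return 100.0 * weighted_error / sum(n_included)


# Hypothetical three-destination package (illustrative values only)
n_d = [400, 1000, 600]
actual = [0.95, 0.45, 0.88]
predicted = [0.93, 0.42, 0.86]
print(f"WAPE = {wape(n_d, actual, predicted):.2f}%")
```

Because each destination is weighted by its number of inclusions, a large error on a rarely offered destination moves the measure less than a small error on a frequently offered one.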
Table 7
Aggregate validation results.

                                                Inclusion probabilities      Weighted absolute % error
                                                at package level             at destination level
Package               # Customer   Rho          In sample    Out of sample   In sample    Out of sample
                      purchases    squared      (%)          (%)             (%)          (%)
Origin 1: W. Europe   5963         0.1656       87.07        87.00           3.076        3.189
Origin 1: E. Europe   1300         0.0330       88.89        88.17           4.055        5.035
Origin 1: Culture     2963         0.1919       82.68        82.93           4.148        3.874
Origin 1: Party       2970         0.1300       88.21        88.15           3.994        3.881
Origin 2: W. Europe   5832         0.1462       81.03        81.65           5.151        5.243
Origin 2: E. Europe   1090         0.1666       83.24        84.08           3.342        4.514
Origin 2: Culture     1494         0.2456       80.00        79.93           4.885        4.811
Origin 2: Party       904          0.0945       83.92        83.24           6.400        5.512
Table 8 Disaggregate validation results for Western European and culture packages. City - # obs
1–323 2–419 3–1415 4–213 5–377 6–212 7–378 8–397 9–432 10–1447 11–654 12–1104 13–312 14–350 15–1436 16–751 17–458 18–400 19–433 20–207 21–334 22–1074 23–1032 24–1448 25–196 26–450
Origin 1: W. Europe
Origin 1: Culture
Out of sample
In sample
Out of sample
In sample
0.911–0.960
0.896–0.955
0.957–0.960
0.960–0.955
0.971–0.933 0.936–0.933 0.883–0.856 0.871–0.856 0.962–0.933 0.451–0.419
0.968–0.935 0.946–0.935 0.880–0.856 0.862–0.856 0.976–0.935 0.469–0.431
0.926–0.933
0.926–0.935
0.949–0.856 0.388–0.419 0.959–0.933
0.954–0.856 0.393–0.431 0.949–0.935
0.813–0.856 0.846–0.933
0.808–0.856 0.847–0.935
0.950–0.838 0.953–0.948 0.956–0.960
0.838–0.833 0.948–0.949 0.960–0.955
0.954–0.948
0.948–0.949
0.969–0.960
0.960–0.955
0.984–0.948 0.441–0.488 0.891–0.938
0.948–0.949 0.488–0.498 0.938–0.944
0.830–0.785 0.833–0.833 0.773–0.838
0.785–0.789 0.833–0.826 0.838–0.833
0.788–0.785
0.785–0.789
0.726–0.856 0.893–0.856
0.744–0.856 0.889–0.856
Origin 2: W. Europe
Origin 2: Culture
Out of sample
In sample
Out of sample
In sample
0.940–0.947
0.952–0.955
0.977–0.946 0.926–0.946
0.963–0.944 0.926–0.944
0.808–0.826
0.793–0.819
0.408–0.382
0.441–0.396
0.916–0.826
0.896–0.819
0.914–0.826 0.356–0.382 0.974–0.946 0.926–0.946
0.920–0.819 0.351–0.396 0.973–0.944 0.950–0.944
0.930–0.946
0.910–0.944
0.667–0.826
0.669–0.819
0.908–0.738 0.568–0.738 0.950–0.947
0.921–0.741 0.560–0.741 0.956–0.955
0.976–0.947 0.936–0.962
0.979–0.955 0.949–0.970
0.781–0.755 0.571–0.571 0.662–0.688
0.798–0.777 0.557–0.557 0.676–0.697
Key: The two numbers shown in each cell are the actual inclusion probability – the estimated inclusion probability. Bolded entries represent cities with inclusion probabilities less than 0.8.
Table 8 provides an examination of which particular destinations drive revenues for the two most profitable packages. Those destinations that are highlighted represent destinations that have average exclusion probabilities greater than or equal to 20% for the in-sample and/or out-of-sample predictions. The majority of the highlighted destinations, namely destinations 5, 7, 11, 14, 18, and 23, have very low inclusion probabilities (or very high exclusion probabilities). These are cities located close to the origin airport that speak the same language as the origin airport. Cities 22, 24 and 26 are particularly interesting as they represent large international cities located outside the country of the origin airports (and to which most individuals have probably previously traveled). Cities 24 and 26 speak the same language as the origin airport. The multidimensional binary logit model does a fairly good job predicting higher exclusion probabilities for these cities, with the exception of City 24, which has a higher exclusion probability than the model predicts. Although the exact reasons why individuals exclude this city more frequently are unclear, it is interesting to note that the airline quickly recognized this fact and included this destination in the Western European, culture, and party packages. The validation results also provide insights into potential new package designs. That is, it is interesting to note that for each of the packages shown in Table 8, on average three or four destinations have exclusion probabilities greater than 20%. Also, note that the destinations with the highest exclusion probabilities are not common across an origin. For example, it may be possible to further increase revenues by including destinations 7 and 14 in the Western European packages, destination 18 in the culture package for origin 1, or destinations 22 and 23 in the culture package for origin 2.
It may also be possible to further increase revenues by charging a premium to exclude particular destinations (e.g., those within a 500 km radius of the origin airport). Due to the limited variability in package designs available in the estimation data, further experimentation will be required to determine the optimal number of “undesirable” destinations to include in a package and how to price them; if too many destinations with high exclusion probabilities are included in the package, we would expect conversion rates to decrease. Discussions with the airline are currently underway to pursue these and other pricing ideas.
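The trade-off between exclusion revenue and conversion can be made concrete with a small calculation. The sketch below is illustrative only: the inclusion probabilities, the flat per-destination fee, the booking counts, and the assumed 10% drop in conversions are all hypothetical, and the function names are not from the paper. It also ignores ticket revenue from lost bookings, which is the other side of the trade-off.

```python
from typing import Sequence


def expected_exclusions(inclusion_probs: Sequence[float]) -> float:
    """Expected number of destinations a customer pays to exclude."""
    return sum(1.0 - p for p in inclusion_probs)


def expected_fee_revenue(inclusion_probs: Sequence[float],
                         fee: float,
                         bookings: float) -> float:
    """Expected exclusion-fee revenue for a package design."""
    return bookings * fee * expected_exclusions(inclusion_probs)


# Hypothetical designs: B replaces two popular destinations with two
# frequently excluded ones (inclusion probability 0.45 each)
design_a = [0.95, 0.95, 0.93, 0.93, 0.86, 0.86, 0.42]
design_b = [0.95, 0.95, 0.93, 0.45, 0.45, 0.86, 0.42]

fee = 5.0            # hypothetical flat fee per excluded destination
bookings_a = 1000.0  # hypothetical sales volume for design A
bookings_b = 900.0   # assumed 10% lower conversion for the less attractive design

rev_a = expected_fee_revenue(design_a, fee, bookings_a)
rev_b = expected_fee_revenue(design_b, fee, bookings_b)
print(f"Design A: {rev_a:.0f}, Design B: {rev_b:.0f}")
```

Under these made-up numbers the second design earns more in exclusion fees despite fewer bookings; whether that holds in practice depends on how strongly conversion responds, which is what further experimentation would need to measure.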
7. Sensitivity of inclusion probabilities to parameter estimates

In addition to examining the stability of parameter estimates for a holdout sample, we examined the sensitivity of inclusion probabilities at the package level (such as those reported in Table 7) to the parameter estimates themselves. We used the parameter variance–covariance matrix to randomly generate parameters and calculate the average inclusion probability for the estimation dataset. The process was repeated 100 times and used to produce confidence intervals for the
Table 9
Sensitivity analysis of inclusion probabilities to changes in parameter values.

                      Mean    Std. dev.   90% CI        95% CI
Origin 1: W. Europe   0.870   0.001       0.869–0.872   0.868–0.873
Origin 1: E. Europe   0.887   0.235       0.882–0.891   0.882–0.892
Origin 1: Culture     0.828   0.167       0.824–0.830   0.824–0.831
Origin 1: Party       0.882   0.137       0.879–0.884   0.879–0.885
Origin 2: W. Europe   0.812   0.162       0.809–0.814   0.809–0.816
Origin 2: E. Europe   0.838   0.295       0.833–0.842   0.833–0.845
Origin 2: Culture     0.799   0.280       0.795–0.804   0.794–0.804
Origin 2: Party       0.834   0.351       0.828–0.839   0.827–0.841

Note: Average inclusion probability for the estimation dataset based on 100 runs for each package. Parameters were generated using the parameter variance–covariance matrix.
inclusion probabilities. The 90% and 95% confidence intervals associated with inclusion probabilities are reported in Table 9 for each package. As seen in the table, inclusion probabilities at the package level are not sensitive to the input parameter estimates.

8. Conclusions

Several airlines are starting to investigate the potential of directly offering opaque products to consumers. This has several advantages, most notably the ability to interact directly with consumers versus using a third-party intermediary and the ability to stimulate demand in new markets (including those with less-than-daily flight frequencies) without cannibalizing revenue from customers purchasing traditional products. In this paper, we applied a multidimensional binary choice model to predict the probability that an individual would exclude one or more destinations from a package. The model performed quite well and provided new insights into why particular destinations were being excluded by customers. Destinations located close to the origin airport and destinations that spoke the same language as the origin airport were more likely to be excluded. In addition, customers who were traveling for longer periods of time were more likely to exclude destinations. Based on these findings, the airline is currently experimenting with new package designs and pricing, i.e., they are investigating the potential to further increase revenues by charging more to exclude particular destinations and/or by placing multiple “undesirable” destinations into a package.

Acknowledgement

This research was supported in part by NSF CAREER Grant SES-0846758.

References

Beaverstock, J.V., Smith, R.G., Taylor, P.J., 1999. A roster of world cities. Cities 16 (6), 445–458 (accessed 21.05.10).
Brey, R., Walker, J., 2011. Latent temporal preferences: an application to airline travel. Transportation Research Part A 45 (9), 880–895.
Chen, C.-F., 2008. Investigating structural relationships between service quality, perceived value, satisfaction, and behavioral intentions for air passengers: evidence from Taiwan. Transportation Research Part A 42 (4), 709–719.
Cho, E., 2008. The 10 Best Party Cities in Europe. RatesToGo: Travelblog for Travelers OnTheGo. Published February 21, 2008 (accessed 21.05.10).
The Economist, 2010. The Big Mac Index (accessed 21.05.10).
Fay, S., 2004. Partial-repeat-bidding in the name-your-own price channel. Marketing Science 23 (3), 407–418.
Fay, S., 2008. Selling an opaque product through an intermediary: the case of disguising one’s product. Journal of Retailing 84 (1), 59–75.
Fazio, S., 2009. Best Party Cities in the World. Clubplanet. Published March 23, 2009 (accessed 21.05.10).
Gallego, G., Phillips, R., 2004. Revenue management of flexible products. Manufacturing and Service Operations Management 6 (4), 321–337.
Garrow, L.A., Koppelman, F.S., 2004. Multinomial and nested logit models of airline passengers’ no-show and standby behavior. Journal of Revenue and Pricing Management 3 (3), 237–253.
Granados, N., Gupta, A., Kauffman, R.J., 2008. Designing online selling mechanism: transparency levels and prices. Decision Support Systems 45 (4), 729–745.
Harteveldt, H.H., Wilson, C.P., Johnson, C., 2004. Why leisure travelers book at their favorite sites. In: Forrester Research: Trends (accessed 09.05.09).
Jerath, K., Netessine, S., Veeraraghavan, S.K., 2009. Selling to Strategic Customers: Opaque Selling Strategies. Working Paper. Wharton School of Business.
Jiang, Y., 2007. Price discrimination with opaque products. Journal of Revenue and Pricing Management 6 (2), 118–134.
Kannan, P.K., Kopalle, P.K., 2001. Dynamic pricing on the internet: importance and implications for consumer behavior. International Journal of Electronic Commerce 5 (3), 63–83.
Karatzas, J., 2009. SkyEurope. Presentation at the AGIFORS Strategic Planning and Scheduling Meeting, Athens, Greece.
Lu, J.-L., Peeta, S., 2009. Analysis of the factors that influence the relationship between business air travel and videoconferencing. Transportation Research Part A 43 (8), 709–721.
Mang, S., Spann, M., Post, D., 2009. Implementierung eines Interaktive price response systems bei einer low cost airline. Business Services: Konzepte, Technologien, Anwendungen, Band 247, 193–202.
McFadden, D., 1974. Conditional logit analysis of qualitative choice behavior. In: Zarembka, P. (Ed.), Frontiers in Econometrics. Academic Press, New York, pp. 105–142.
Murphy, J., 2009. World’s Top Party City. Forbes (Published December 21, 2009) (accessed 21.05.10).
Peeta, S., Paz, A., DeLaurentis, D., 2008. Stated preference analysis of a new very light jet based on-demand air service. Transportation Research Part A 42 (4), 629–645.
Post, D., 2010. Variable opaque products in the airline industry: a tool to fill the gaps and increase revenues. Journal of Revenue and Pricing Management 9 (4), 292–299.
Smith, B.C., Darrow, R., Elieson, J., Guenther, D., Rao, B.V., Zouaoui, F., 2007. Travelocity becomes a retailer. Interfaces 37 (1), 68–81.
Spann, M., Skiera, B., Schaefers, B., 2004. Measuring individual frictional costs and willingness-to-pay via name-your-own-price mechanisms. Journal of Interactive Marketing 18 (4), 22–36.
TripAdvisor, 2010. Best History & Culture Vacations – Europe (accessed 21.05.10).
Tsamboulas, D.A., Nikoleris, A., 2008. Passengers’ willingness to pay for airport ground access time savings. Transportation Research Part A 42 (10), 1274–1282.
Zouaoui, F., Rao, B.V., 2009. Dynamic pricing of opaque airline tickets. Journal of Revenue and Pricing Management 8 (2/3), 148–154.