Journal of Retailing and Consumer Services 6 (1999) 99–105

Cross-category sales promotion effects

Harald Hruschka *, Martin Lukanowicz, Christian Buchta

Department of Marketing, University of Regensburg, Universitätsstrasse 31, D-93053 Regensburg, Germany
Telekom Control, Mariahilferstrasse 77, A-1060 Vienna, Austria
Department of Tourism, University of Economics, Augasse 2, A-1090 Vienna, Austria

Received for publication 14 July 1998
Abstract

We introduce a multivariate binomial logit model measuring cross-category dependence and sales promotion effects of a retail assortment. This model requires as data both the market baskets of individual shoppers and the categories currently promoted in a retail outlet. A special section describes the stepwise procedure used to estimate parameters of this model. Its application is demonstrated by analyzing 6147 purchases that were acquired in a medium-sized supermarket. We finally discuss the managerial relevance of this model for sales promotion decisions of retail firms. © 1999 Elsevier Science Ltd. All rights reserved.

Keywords: Assortment; Market basket analysis; Sales promotion
1. Introduction

The main focus of our study is the measurement of dependencies and sales promotion effects across the categories of a retail assortment. Both dependencies and sales promotion effects are conceived with regard to shoppers’ purchase probabilities. We use market basket data acquired by scanner technology together with appropriate software, where a market basket is the set of items (categories) that a buyer acquires in the same purchase. The other part of our database consists of information on the categories currently featured in a retail outlet.

One finds a few contributions dealing with the measurement of cross-category dependence in the literature. Böcker (1975, 1978) uses pairwise association measures to identify relationships between pairs of categories. Similar approaches may be found in Julander (1992), Dickinson et al. (1992) and Bultez et al. (1996). Chintagunta and Haldar (1995) use sophisticated bivariate hazard and probit models to measure cross-category dependence with regard to purchase times or choices, but they still only consider pairs of categories (pasta and
* Corresponding author. Tel.: 0049 941 943 2277; fax: 0049 941 943 2278.
pasta sauce, strawberry and raspberry yoghurt, soup and yoghurt). In contrast to these approaches, the following contributions can capture the influence of several different categories on purchases of a category. Hruschka (1991) develops a probabilistic model consisting of logit equations, which measures cross-category dependence by interaction parameters. He studies market baskets for an assortment of 72 categories. Another probabilistic approach uses data mining algorithms to discover association rules that indicate how frequently pairs of subsets of assortment categories are purchased together (Agrawal and Srikant, 1994).

There is a dearth of studies investigating the impact of promotions on non-promoted products, especially with regard to complementary effects. Walters (1991) and Mulhern and Leone (1991) find evidence of asymmetric promotion effects between pairs of categories. Schmalen and Pechtl (1995) study cross-category promotion effects of coffee on other categories, measured by the growth of monetary sales of the other category. Sales growth is somewhat higher in only one of the other categories (cut cheese). Chintagunta and Haldar (1995) extend their models mentioned above by including price and promotion variables. For the pairs of product categories considered they obtain higher interdependence measures compared to
0969-6989/99/$ - see front matter © 1999 Elsevier Science Ltd. All rights reserved. PII: S0969-6989(98)00026-5
H. Hruschka et al. / Journal of Retailing and Consumer Services 6 (1999) 99—105
model specifications without independent variables. Manchanda et al. (1997) analyse multi-category purchases in four categories (laundry detergents, fabric softeners, cake mix and cake frosting) using a multivariate probit model. They obtain significant complementary price effects between laundry detergents and fabric softeners and between cake mix and cake frosting.

In the next section we lay out the multivariate logit model. This is followed by a section giving information on the estimation method. Then results of an empirical study are presented. We conclude with a discussion of the managerial relevance of this model for sales promotion decisions of retail firms.
2. Multivariate logit model

We extend the model of Hruschka (1991) by including cross-category sales promotion effects influencing purchase probabilities. Purchases $Y_i$ ($i = 1, \ldots, I$) and sales promotions $X_i$ ($i = 1, \ldots, I$) in $I$ product categories are binary variables. We assume that promotion of category $i$ may influence purchases of category $i$ via its main effect as well as joint purchases with other categories $j \neq i$ via interaction parameters. We start from the loglinear model for joint purchase probabilities $P(Y_1, \ldots, Y_I)$:

\[
\ln P(Y_1, \ldots, Y_I) = \alpha + \sum_{i=1}^{I} (\alpha_i + \beta_i X_i)\, Y_i + \sum_{i=1}^{I-1} \sum_{j>i}^{I} (\alpha_{ij} + \beta_{iji} X_i + \beta_{ijj} X_j)\, Y_i Y_j , \tag{1}
\]

where $\alpha_i$ is the main effect of category $i$ (the change of the log expected joint probabilities by a purchase of category $i$), and $\alpha_{ij}$ the first-order interaction between the two categories $i$ and $j$. Interactions measure the deviation of the log observed joint probabilities from the log expected joint probabilities if only main effects are considered.

The model includes interactions between pairs of categories (first-order interactions) and neglects higher-order interactions (e.g. between triples of categories). This may be justified by the high number of variables (categories) of retail applications and the better interpretability of such a simplified model. A similar approach is taken in conjoint-analysis models (e.g. Green et al., 1989).

Omission of any first-order interaction $\alpha_{ij}$ gives a model where the purchase of category $i$ is conditionally independent from the purchase of category $j$ given purchases of other categories. By leaving out any $\alpha_{ik}$ with $k \neq j$ as well, one obtains a model according to which the purchase of both categories $i$ and $k$ is independent from the purchase of category $j$. If all interactions $\alpha_{ij}$ with $j \neq i$ are excluded, we arrive at total independence of purchases of category $i$ from purchases of the rest of the assortment.

$\beta_i$ denotes the effect of a sales promotion of category $i$ on the main effect of the same category, $\beta_{iji}$ the effect on the interaction of categories $i$ and $j$ by a sales promotion of category $i$. Conditional probabilities of purchases of category $i$ given purchases of other related categories (whose $\alpha_{ij} \neq 0$) collected in the index set $Z_i$ and sales promotions $X = \{X_1, \ldots, X_I\}$ are derived from the loglinear model as

\[
P_{i \mid Z_i, X} = \frac{1}{1 + \exp\!\left(-\left(\alpha^i + \beta^i X_i + \sum_{j \in Z_i} (\alpha^i_j + \beta^i_{ji} X_i + \beta^i_{jj} X_j)\, Y_j\right)\right)}
\]

with
\[
\alpha^i_j = \alpha_{ij}, \quad \beta^i_{ji} = \beta_{iji}, \quad \beta^i_{jj} = \beta_{ijj}. \tag{2}
\]

This model consists of one binomial logit equation for each of the categories considered. It is a multivariate binomial logit model using the terminology of Nerlove and Press (1973). The relationship of the multivariate logit formulation to the loglinear model implies cross-equation parameter restrictions of the following type (Maddala, 1987):

\[
\alpha^i_j = \alpha^j_i = \alpha_{ij}. \tag{3}
\]

These restrictions amount to equality conditions for first-order interactions. The coefficient of interaction of category $j$ in the equation of category $i$, $\alpha^i_j$, equals the coefficient of category $i$ in the equation of category $j$, $\alpha^j_i$.

We call two categories purchase complements (substitutes) if joint purchases are more (less) frequent compared to the case of stochastic independence (Mulhern and Leone, 1991). Our definition is based on product interdependencies in terms of customers’ purchases (Betancourt and Gautschi, 1990).

A parameter $\alpha^i_j$ greater (less) than zero indicates that both categories are complements (substitutes). A parameter equal to zero shows that they are independent. To be more specific, there is conditional independence with those categories $j \neq i$ which have no interaction parameter different from zero in the logit equation of category $i$. If the logit equation of category $i$ has no interaction parameters at all, its purchases are totally independent from purchases of the rest of the assortment.

We distinguish the following types of effects of a sales promotion of category $i$:
• more purchases of the same category ($\beta^i > 0$);
• fewer joint purchases of categories $i$ and $j$ ($\beta^i_{ji} < 0$);
• more joint purchases of categories $i$ and $j$ ($\beta^i_{ji} > 0$).

In the extreme case, a promotion may turn two categories whose joint purchases are stochastically independent without promotion into purchase complements.
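To illustrate how a logit equation of this kind relates to the loglinear model (1), the following minimal Python sketch evaluates the conditional purchase probability of one category and checks it against the ratio implied by the joint model. The function names, the dictionary layout of the parameters and all numerical values are assumptions made for this illustration only; they are not estimates from the study.

```python
import math

def pair_weight(i, j, x, alpha_pair, beta_pair):
    """Interaction weight of categories i and j for promotion vector x.

    alpha_pair[(lo, hi)] holds the first-order interaction, and
    beta_pair[(lo, hi)] holds a tuple (effect of promoting lo,
    effect of promoting hi) on that interaction. Missing pairs count as zero.
    """
    lo, hi = min(i, j), max(i, j)
    a = alpha_pair.get((lo, hi), 0.0)
    b_lo, b_hi = beta_pair.get((lo, hi), (0.0, 0.0))
    return a + b_lo * x[lo] + b_hi * x[hi]

def log_joint_unnormalized(y, x, alpha, beta, alpha_pair, beta_pair):
    """Right-hand side of the loglinear model without its constant term."""
    n = len(y)
    total = sum((alpha[i] + beta[i] * x[i]) * y[i] for i in range(n))
    for i in range(n - 1):
        for j in range(i + 1, n):
            total += pair_weight(i, j, x, alpha_pair, beta_pair) * y[i] * y[j]
    return total

def conditional_purchase_prob(i, y, x, alpha, beta, alpha_pair, beta_pair):
    """Binomial logit for category i given the other purchases and promotions."""
    eta = alpha[i] + beta[i] * x[i]
    for j in range(len(y)):
        if j != i:
            eta += pair_weight(i, j, x, alpha_pair, beta_pair) * y[j]
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical parameters for three categories (illustration only).
alpha = [-0.7, -0.8, -1.0]
beta = [0.2, 0.0, 0.1]
alpha_pair = {(0, 1): 0.3, (1, 2): 0.4}
beta_pair = {(0, 1): (-0.2, 0.0)}
x = [1, 0, 0]          # only category 0 is promoted
y = [0, 1, 1]          # categories 1 and 2 are in the basket

p_logit = conditional_purchase_prob(0, y, x, alpha, beta, alpha_pair, beta_pair)

# Same probability obtained from the joint model: compare Y_0 = 1 with Y_0 = 0.
u1 = log_joint_unnormalized([1, 1, 1], x, alpha, beta, alpha_pair, beta_pair)
u0 = log_joint_unnormalized([0, 1, 1], x, alpha, beta, alpha_pair, beta_pair)
p_joint = math.exp(u1) / (math.exp(u1) + math.exp(u0))
print(p_logit, p_joint)
```

The constant term of the loglinear model cancels in the ratio, which is why the sketch can omit it. Storing each interaction once per pair is one simple way to respect the cross-equation restrictions (3) by construction.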
3. Estimation of the multivariate logit model

Estimation of the multivariate logit model proceeds in the following way:

1. Basic multivariate logit model:
   • determination of significant cross effects for all pairs of categories;
   • specification of the multivariate logit model by combining interaction parameters corresponding to all significant cross effects and all main effect parameters;
   • single equation estimation of the multivariate logit model;
   • stepwise elimination of interaction parameters;
   • estimation of the multivariate model for all categories taking cross-equation equality restrictions into account.
2. Introduction of the additional parameters of the extended model.
3. Stepwise elimination of additional parameters.

Single logit equations for each category are estimated by generalized least squares. To include parameter restrictions the whole system of binomial logit equations is formulated as one multivariate nonlinear regression model. Parameter estimates (or variance weighted averages) obtained in the single equation step serve as initial values for the multivariate nonlinear least squares estimation (Gallant, 1987).

The vast number of possible model specifications that results from the high number of categories, even when restricting the model to constant terms and first-order interactions, forces us to use a coarse model search heuristic. In each of several steps, the parameter with maximal insignificance is selected as the candidate for elimination from the model. It is actually eliminated if the normed fit index of the unrestricted model (with this parameter included) compared to the restricted model (without this parameter) is less than 0.02. Stepwise elimination stops when all parameters are significant at $\alpha = 0.05$. The normed fit index gives the relative improvement of the sum of weighted squared errors of an unrestricted model ($SSE_U$) compared to a restricted model ($SSE_R$):

\[
\frac{SSE_R - SSE_U}{SSE_R}. \tag{4}
\]

In the case of perfect fit the normed fit index assumes the value one.
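The elimination heuristic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `fit` is a hypothetical callback that returns the weighted sum of squared errors and the p-values for a given parameter set, the toy numbers are invented, and the handling of a candidate that fails the 0.02 criterion (it is kept and the next candidate is tried) is an assumption, since the text does not spell out that case.

```python
def normed_fit_index(sse_restricted, sse_unrestricted):
    """Relative improvement of the unrestricted over the restricted model."""
    return (sse_restricted - sse_unrestricted) / sse_restricted

def backward_eliminate(params, fit, alpha_level=0.05, nfi_threshold=0.02):
    """Stepwise elimination of insignificant parameters.

    fit(active) is a hypothetical callback returning (sse, p_values), where
    p_values maps each active parameter name to its p-value.
    """
    current = set(params)
    protected = set()  # assumption: candidates that fail the NFI test are kept
    while True:
        sse_u, p_values = fit(current)
        candidates = [p for p in current
                      if p not in protected and p_values[p] > alpha_level]
        if not candidates:
            return current
        # Candidate for elimination: the parameter with maximal insignificance.
        worst = max(candidates, key=lambda p: p_values[p])
        sse_r, _ = fit(current - {worst})
        if normed_fit_index(sse_r, sse_u) < nfi_threshold:
            current.remove(worst)
        else:
            protected.add(worst)

def toy_fit(active):
    """Invented SSE values and p-values, used only to exercise the loop."""
    sse_by_model = {
        frozenset("abc"): 100.0,
        frozenset("ab"): 101.0,
        frozenset("a"): 110.0,
    }
    p_by_param = {"a": 0.01, "b": 0.30, "c": 0.60}
    return sse_by_model[frozenset(active)], {p: p_by_param[p] for p in active}

kept = backward_eliminate({"a", "b", "c"}, toy_fit)
print(sorted(kept))  # prints ['a', 'b']
```

In the toy run, dropping `c` raises the error sum only from 100 to 101 (index below 0.02), so `c` is eliminated, whereas dropping `b` would raise it from 101 to 110 (index above 0.02), so `b` stays in the model.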
4. Empirical study

The empirical study is based on a data set consisting of 6147 purchases acquired in a medium-sized supermarket of a retail chain. Usual scanner data were read in, transformed to and stored as market basket data by
special software installed in the data-processing center of the retail chain. The purchases of the data set occurred on four successive Saturdays. As frequencies for individual items as a rule become very low, we analyze data at the category level. In agreement with the classification scheme of the retail chain, 150 categories are distinguished.

Results demonstrate that dependence exists for 73 categories. Only 4.9% of the pairs formed by these 73 categories have significant interaction parameters. The purchases of certain categories are totally independent of purchases of the remaining categories. Table 1 shows those isolated categories and their main effects, which attain at least 100 purchases in the data set.

Using interaction parameters of the multivariate binomial model one can identify clusters consisting of more than one category that are independent from other parts of the assortment (Fig. 1 contains an MDS map computed on the basis of interaction parameters). These clusters are:
• detergents and related products;
• household cleansers, other cleansers;
• tobacco, cigars, cigarette paper;
• tropical fruit, frozen fruit;
• baby related products (food, care, hygienic);
• red wine, white wine;
• beer, water.
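One way to find such clusters, sketched here under the assumption that the significant interactions are available as a list of category pairs, is to treat categories as nodes and nonzero interactions as edges and to compute connected components. Under a model with only main effects and pairwise interactions, groups of categories that share no interaction have joint probabilities that factorize, so they are independent of each other. The function name and the pair list below are illustrative; the pairs only mirror clusters named in the text and carry no parameter values.

```python
def independent_clusters(categories, interacting_pairs):
    """Group categories into connected components of the interaction graph."""
    neighbours = {c: set() for c in categories}
    for a, b in interacting_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    clusters = []
    for start in categories:
        if start in seen:
            continue
        # Depth-first search collecting everything reachable from `start`.
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        clusters.append(component)
    return clusters

# Illustrative input: category names taken from the text, pairs assumed.
categories = ["tobacco", "cigars", "cigarette paper", "red wine",
              "white wine", "beer", "water", "soft drinks"]
interacting_pairs = [("cigars", "tobacco"), ("cigars", "cigarette paper"),
                     ("red wine", "white wine"), ("beer", "water")]

clusters = independent_clusters(categories, interacting_pairs)
for cluster in clusters:
    print(sorted(cluster))
```

A category without any interaction, such as soft drinks in this toy input, ends up in a component of its own, which corresponds to the isolated categories of Table 1.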
The categories most frequently bought are: bread (1078 purchases); fruit (1050 purchases); vegetables (846 purchases); yoghurt (782 purchases); journals (713 purchases); milk (705 purchases). Table 2 contains main effects and interaction parameters of the logit equations of these categories. Bread has the strongest interactions with cut cheese and fruit or vegetables, fruit with vegetables and yoghurt, vegetables with fruit and milk, yoghurt with milk and fruit, milk with yoghurt and vegetables. Journals only interact with bread.

Table 1
Independent categories

| Category                         | Purchases | $\alpha^i$ |
|----------------------------------|-----------|------------|
| Champagne                        | 116       | -1.398     |
| Cigarettes                       | 443       | -0.796     |
| Frozen potato and flour products | 108       | -1.398     |
| Frozen poultry                   | 157       | -1.301     |
| Gifts                            | 193       | -1.222     |
| Office articles                  | 100       | -1.523     |
| Rolls                            | 265       | -1.046     |
| Snacks                           | 251       | -1.046     |
| Soft drinks                      | 555       | -0.770     |
Fig. 1. MDS map of interdependent categories.
On the whole, results confirm expectations that most categories of a retail assortment are complements as they allow customers to do one-stop shopping (Betancourt and Gautschi, 1990). Almost all of the interactions discovered comprise categories that are complements. The only substitutes found are cigars and tobacco as well as cigars and cigarette paper. This may be explained by the fact that these categories are restricted to basically the same consumption activity.

The logit model is computationally more efficient than the data mining approach of Agrawal and Srikant (1994) mentioned above. The latter leads to computing times for small simulation problems with 20 categories which are much higher than those necessary for the multivariate binomial logit model when applied to a real-world data set.

At most 47 categories are promoted per week. After eliminating 26 categories that are promoted in every week and categories with very low purchase frequencies, 28 promoted categories remain. For the following categories promotion does not change their own main effect ($\beta^i = 0$), i.e. the purchase frequency does not significantly increase (or decrease) if a category is promoted:
• deli;
• spread;
• sweets;
• vermouth and dessert wine;
• dog food;
• toilet tissue;
• red wine;
• dishwashing detergents;
• electric appliances.
It may be possible that promotion in these categories only leads to switching of customers within a category. Of course, to study this hypothesis one has to use brand-specific data. For salt & garlic as well as cat food the multivariate logit model indicates more purchases if these categories are promoted (i.e. higher main effects because of $\beta^i$ equal
Table 2 Main and interaction effects
[Table 2 body. Columns: logit equations for Bread, Fruit, Journals, Milk, Vegetables and Yoghurt. Rows: Main effect, Baking products, Bread, Butter, Canned milk, Canned vegetables, Cheese, Chocolate, Cut cheese, Deli, Dental care, Durable milk, Eggs, Fat & oil, Fruit, Fruit juice, Hygienic tissue, Journals, Milk, Pasta, Rice & legume, Salt & garlic, Soups & sauces, Sour canned food, Spices & mustard, Toilet tissue, Vegetables, Whole-meat bread, Yoghurt. The assignment of parameter values to cells is not recoverable from the extracted layout.]
to 0.176 and 0.230, respectively), but no effects on interactions (i.e. the interaction parameters are the same as without feature, $\beta^i_{ji} = 0$). In particular, salt & garlic is a category only weakly related to consumption activities. Promoting it therefore does not change consumption patterns of other categories.

For some of the other categories effects of promotion on interactions can be confirmed. Promotion increases complementary relationships in some categories (parameters $\beta^i_{ji}$ are shown in parentheses):
• canned vegetables → pasta (0.236);
• canned vegetables → soups & sauces (0.251);
• dried fruit → baking products (0.338);
• hair care → hygienic products (0.021);
• soups & sauces → canned vegetables (0.206).

Most of these category pairs are complements with regard to consumption activities. Note that promotion of canned vegetables increases the complementary relationship with soups & sauces, while promotion of soups & sauces in turn increases the complementary relationship with canned vegetables.

Promotion in some categories leads to complementary relationships with certain other categories, i.e. categories $i$ and $j$ are only related if $i$ is promoted:
• rice & legume → milk (0.318);
• soups & sauces → baking products (0.223).

The multivariate logit model demonstrates that for some categories promotion decreases the joint purchase probability compared to the situation without promotion:
• flour → baking products (-0.276);
• flour → fat & oil (-0.276);
• bread → fruit (-0.375);
• bread → milk (-0.108);
• deli → canned fish (-0.046);
• exotic fruit → frozen fruit (-0.495);
• fruit → bread (-0.215);
• hygienic tissue → dental care (-0.260).
Accelerated purchases (of e.g. flour, tissue), reduction of consumption activities including the categories on the right-hand side (e.g. fruit or bread) or substitution in consumption activities (e.g. of frozen by exotic fruit) caused by promotion may be responsible for these results.
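To give a feeling for the size of these effects, the reported parameters can be turned into odds multipliers. The sketch below rests on an interpretation that follows from the logit form rather than on a statement in the study: because the log odds ratio between two categories equals their interaction term, a promotion effect of size b on that interaction multiplies the conditional odds ratio by exp(b). The parameter values are those quoted in the text; the variable and function names are illustrative.

```python
import math

# Promotion effects on interactions as quoted in the text
# (promoted category, related category) -> parameter value.
promotion_effects = {
    ("canned vegetables", "pasta"): 0.236,
    ("dried fruit", "baking products"): 0.338,
    ("bread", "fruit"): -0.375,
    ("exotic fruit", "frozen fruit"): -0.495,
}

def odds_multiplier(effect):
    """Factor by which a shift `effect` in the log odds changes the odds."""
    return math.exp(effect)

multipliers = {pair: odds_multiplier(b) for pair, b in promotion_effects.items()}
for (promoted, other), m in multipliers.items():
    print(f"{promoted} -> {other}: odds ratio multiplied by {m:.2f}")
```

Multipliers above one correspond to a strengthened complementary relationship under promotion, and multipliers below one to fewer joint purchases than without promotion.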
5. Conclusions

Market basket data offer new possibilities to provide information that is relevant from the point of view of retail managers. Hitherto the most popular approaches showing interdependencies in retail assortments have relied on association measures for pairs of categories. We presented a multivariate binomial logit model for market basket data analysis which overcomes limitations of these bivariate methods. It separates simple effects of frequency from interactions between categories. It makes it possible to identify single categories as well as clusters of categories that are independent of the rest of the assortment with regard to purchase probabilities. Moreover, the multivariate logit model also measures cross-category effects of sales promotions on purchase probabilities.

Nevertheless, this new approach has certain limitations. It only considers whether or not items are purchased together, but neglects how much of each item is purchased. In the form presented here, individual shoppers are not identified, and interdependencies across purchases of an individual or household are neglected due to data restrictions. Moreover, homogeneity across purchases is assumed. Of course, most of these limitations draw attention to possible extensions of the model.

The capability to give a more valid picture of purchases of categories and their interdependencies is of managerial relevance. The multivariate logit model calibrated on basket data is suited for assisting retail managers in making sales promotion and assortment decisions.

With regard to sales promotion the relevant literature focuses on the effects of features, displays, coupons and price cuts on purchases of the same category or brand (Blattberg and Neslin, 1990). Despite the obvious importance of increasing purchase probabilities of the same category or item (i.e. strong positive influences of promotions on main effects), in many situations promotions are effective only if they stimulate sales of non-promoted categories or items, e.g. because of higher sales of other categories or items to current customers (Mulhern and Leone, 1991).

To take an example from our empirical study, soft drinks have a high purchase frequency, but do not interact with the rest of the assortment. Promoting soft drinks will raise their own sales without influencing other categories. Promotion will only be worthwhile if soft drinks have high volume and margins. On the other hand, if two categories are strong complements, the retailer may achieve higher profits by promoting both. A retail manager interested in cross-selling should look for categories with high main effects that are also strong complements of other frequently bought, high-contribution categories.

If customers buy more of category A when it is promoted and are then also more likely to buy category B, the retailer’s profits may be higher if only A is promoted. According to the empirical results this may be valid for canned vegetables and pasta, dried fruit and baking products as well as hair care and hygienic products. Such categories are also appropriate for loss-leader pricing: they are offered at a lower price, which besides own-category sales also increases purchase probabilities of related high-margin categories. This way the retailer earns higher profits. On the other hand, if effects of promoting category A are positive with regard to B and vice versa, it may be advantageous to promote both. The results for soups & sauces and canned vegetables provide an example.

Promotion of categories may reduce purchases in certain other categories. This makes promotion or loss-leader pricing less attractive. Relevant examples in our study are bread and fruit or milk, deli and canned fish, and exotic fruit and frozen fruit.
Acknowledgements

We thank Helmut Schmalen and his co-workers at the Department of Marketing of the University of Passau for their help in providing the data and discussing results. Partial support by the Austrian Research Foundation (Project P 7527) is acknowledged.
References

Agrawal, R., Srikant, R., 1994. Fast algorithms for mining association rules. Proceedings of the VLDB Conference. Santiago, Chile.
Betancourt, R., Gautschi, D., 1990. Demand complementarities, household production, and retail assortments. Marketing Science 9, 146–161.
Bishop, Y.M.M., Fienberg, S.E., Holland, P.W., 1975. Discrete Multivariate Analysis: Theory and Practice. MIT Press, Cambridge, MA.
Blattberg, R.C., Neslin, S.A., 1990. Sales promotion. Prentice-Hall, Englewood Cliffs, NJ.
Böcker, F., 1975. Die Analyse des Kaufverbunds – Ein Ansatz zur bedarfsorientierten Warentypologie. Zeitschrift für Betriebswirtschaftliche Forschung 27, 290–306.
Böcker, F., 1978. Die Bestimmung der Kaufverbundenheit von Produkten. Duncker & Humblot, Berlin.
Bultez, A., Julander, C.-R., Nisol, P., 1996. Structuring retail assortments according to in-store shopping. Proceedings of the Annual Conference of the European Marketing Academy (EMAC), Budapest, pp. 1521–1531.
Chintagunta, P.K., Haldar, S., 1994. Measuring cross-category dependence in the purchase timing behavior of households: the case of complementary goods. Unpublished manuscript.
Dickinson, R., Harris, F., Sircar, S., 1992. Merchandise compatibility: an exploratory study of its measurement and effect on department store performance. International Review of Retail, Distribution and Consumer Research, 351–379.
Gallant, A.R., 1987. Nonlinear Statistical Models. Wiley, New York.
Green, P.E., Krieger, A.M., Zelnio, R.N., 1989. A componential segmentation model with optimal product design features. Decision Sciences 20, 220–238.
Hruschka, H., 1991. Bestimmung der Kaufverbundenheit mit Hilfe eines probabilistischen Meßmodells. Zeitschrift für betriebswirtschaftliche Forschung 43, 418–434.
Julander, C.-R., 1992. Basket analysis. A new way of analysing scanner data. International Journal of Retail & Distribution Management 20, 10–18.
Maddala, G.S., 1987. Limited-Dependent and Qualitative Variables in Econometrics. Cambridge University Press, Cambridge.
Manchanda, P., Ansari, A., Gupta, S., 1997. The shopping basket: a model for multi-category purchase incidence decisions. Working Paper, Columbia University, New York.
Mulhern, F.J., Leone, R.P., 1991. Implicit price bundling of retail products: a multiproduct approach to maximizing store profitability. Journal of Marketing 55, 63–76.
Nerlove, M., Press, J., 1973. Univariate and Multivariate Log Linear and Logistic Models. RAND Report.
Schmalen, H., Pechtl, H., 1995. Die Absatzwirkung von Sonderangebotsaktionen im Lebensmitteleinzelhandel. Zeitschrift für Betriebswirtschaft 65, 587–607.
Walters, R.G., 1991. Assessing the impact of retail promotions on product substitution, complementary purchase, and inter-store sales displacement. Journal of Marketing 55, 17–28.