Applying behavioral economics in predictive analytics for B2B churn: Findings from service quality data


Arash Barfar, Balaji Padmanabhan, Alan Hevner

PII: S0167-9236(17)30118-5
DOI: 10.1016/j.dss.2017.06.006
Reference: DECSUP 12857

To appear in: Decision Support Systems

Received date: 15 November 2016
Revised date: 17 June 2017
Accepted date: 26 June 2017

Please cite this article as: Arash Barfar, Balaji Padmanabhan, Alan Hevner , Applying behavioral economics in predictive analytics for B2B churn: Findings from service quality data, Decision Support Systems (2017), doi: 10.1016/j.dss.2017.06.006

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Arash Barfar
Information Systems Department, College of Business, University of Nevada at Reno

Balaji Padmanabhan & Alan Hevner
Information Systems & Decision Sciences Department, Muma College of Business, University of South Florida, 4202 E. Fowler Avenue, Tampa, FL 33620

Abstract: Motivated by the long-standing debate on rationality in behavioral economics and the potential of theory-driven predictive analytics, this paper examines the link between service quality and B2B churn. Using longitudinal B2B transactional data with service quality indicators provided by a large company, we present evidence that both rationality and bounded-rationality assumptions play significant roles in predicting organizational decisions on churn. Specifically, variables that relate to the assumed rationality of organizations appear to provide accurate predictions while, at the same time, variables that capture boundedly rational decision rules appear to play a role through “somatic states” that make organizations more sensitive to the rational variables. In addition to presenting a novel approach for predicting organizational decisions on churn, this paper offers theoretical and managerial insights as well as opportunities for future research at the intersection of behavioral economics and predictive analytics for decision-making.

Keywords: organizational decision analytics, B2B service operations, churn, service quality, decision-making, rationality, bounded rationality, heuristics, adaptive toolbox, somatic states


1. INTRODUCTION

Service organizations are highly invested in maintaining strong relationships with their customer base, both individual customers (B2C) and business entities (B2B). Loyalty in B2B service operations is particularly important since B2B interactions are perhaps fewer but provide greater numbers of transactions and more revenue per transaction (Rauyruen & Miller 2007). To understand the role that service quality plays in B2B loyalty, we analyze a rich longitudinal service quality database from a large company. The database covers two years of weekly service transactions for nearly 100,000 Small and Medium Enterprises (SMEs) as service customers. These SMEs may be viewed as active processors of service quality, which constantly analyze their service records and decide whether to stop or continue their business with the service company (Koufteros et al. 2014). Using this fine-grained and longitudinal B2B service quality data to examine the role that service quality plays in B2B churn provides the applied context of this research.

The methodological context of this research is informed by two stimulating research streams. The field of behavioral economics has the potential to fuel new streams of interdisciplinary Information Systems (IS) research (Goes 2013). At the same time, the recent explosion of interest in predictive analytics has highlighted the potential of leveraging massive fine-grained behavioral data that are becoming available (Martens et al. 2016). This paper is one of the first to bring these two areas together in an important decision support context (i.e. managing customer churn).

Our approach involves designing rational and boundedly rational measures of service pain assessment, which relate to the neoclassical and behavioral perspectives on the rationality assumption in organizational decision making. Do levels of service pain affect the SME’s decision on churn? Can different perspectives on the rationality assumption help us design different service pain evaluation variables that capture such effects and, subsequently, predict organizational decisions on churn? This paper takes up these broad questions and examines both explanation and prediction (Shmueli 2010). That is, we investigate (i) if the service pain evaluation variables are associated with SMEs’ decisions to churn and (ii) whether such variables help generate accurate predictions.

Using a fine-grained service database to construct theoretical features also presents important technical challenges in the design of an ETL (Extract-Transform-Load) system. ETL processing can absorb up to 70 percent of data warehousing resources (Kimball & Caserta 2004). Hence, ETL design is vitally important in feature engineering with large databases. Our ETL focus allows the effective transformation of thousands of an SME’s service records in multiple tables into one predictive record. While this paper does not discuss the technical details of the ETL system employed in this research, we believe such feature engineering and ETL design deserve greater consideration as a separate step in theory-driven predictive modeling (Shmueli & Koppius 2011).

This paper offers the following contributions. First, we demonstrate how behavioral economics ideas can be integrated into predictive analytics to develop a decision support system for B2B churn prediction. Second, we examine the effect of service quality on B2B churn in a non-contractual setting. The study’s third contribution concerns the long-standing debate on rationality in economics. Our results corroborate Friedman’s (1953) perspective on the role of simplifying assumptions (e.g. omniscient rationality) in predicting firms’ behavior. Specifically, the rational measures of service quality assessment help yield accurate predictions about organizational decisions on churn. On the other hand, there are boundedly rational decision rules that appear to exacerbate the impact of the rational service quality measures on SMEs’ decisions to churn, a finding that is in line with the somatic marker hypothesis in neuroeconomics.

2. BACKGROUND

2.1. Behavioral Economics and Organizational Decision Making

Current thinking on the drivers of organizational decisions reflects two economic viewpoints. In the more traditional view, neoclassical economics assumes that organizations are rational, omniscient, and with no limitations on computational capacities and time (Rieskamp et al. 2006). In contrast, behavioral economics views organizational decision making through the lens of bounded rationality. That is, satisficing administrators might employ heuristics in information processing, which could further make their decisions susceptible to different biases and short on substantive rationality (Simon 1997). In fact, previous decision support systems research (George et al. 2000) has shown that such biases are indeed strong.

Despite their fundamentally different perspectives on rationality, both schools agree that the ultimate test of a theory is its prediction accuracy against observed behavior (Friedman 1953; Simon 1987). Neoclassical economists insist on the need for simplifying assumptions (e.g. rationality) in achieving accurate predictions (Friedman 1953), whereas behavioral economists ponder if more realistic assumptions (e.g. bounded rationality) can also contribute to predictive power (Simon 1987; Camerer & Loewenstein 2004). Regarding organizational decision making, behavioral economists ask if “there are important, empirically verified, aggregate predictions that follow from the theory of perfect rationality but that do not follow from behavioral theories of rationality” (Simon 1979, p. 496). Yet, predictive accuracy of utility and heuristic models is rarely investigated in behavioral operations (Katsikopoulos & Gigerenzer 2013).

Our analyses are based on the premise that administrative decision-making involves a combination of both intuitive and analytical skills (Simon 1997), hence our decision models ideally include both rational and boundedly rational predictors. This is also in accordance with Kuhn’s (1961) perspective on scientific practice; i.e. we involve the two competing theories in building predictive models to see if the theoretical service pain evaluation variables help predict B2B churn.

We close this brief review on the behavioral-neoclassical tension with a note on the rationality assumption. Neoclassical economics does not insist that an individual firm behaves rationally (Friedman 1978). Rather, it views the (not so realistic) rationality assumption as a means of achieving accurate predictions about a group of firms in uncontrolled settings (Friedman 1953). This perspective has recently received attention in business research; i.e. research should be judged based on objective criteria such as predictive accuracy rather than relatively subjective criteria such as realism of assumptions (Shugan 2009).

2.2. Predictive Analytics and Churn

Churn is a major problem in predictive analytics and has attracted considerable attention in several fields, including telecommunications (e.g. Tsai & Lu 2009; Verbeke et al. 2012), energy (Moeyersoms & Martens 2015), financial services (e.g. Van den Poel & Lariviere 2004; Glady et al. 2009; Nie et al. 2011), electronic commerce (e.g. Yu et al. 2011), retail markets (e.g. Buckinx & Van den Poel 2005), subscription services (Burez & Van den Poel 2007), donations (Fader et al. 2010), and human resources (i.e. employees) (Saradhi & Palshikar 2011). Almana et al. (2014) and Vafeiadis et al. (2015) provide recent surveys and comparisons of machine learning techniques used in churn prediction.

This paper contributes to the churn literature from three perspectives. First, the majority of churn studies are conducted in contractual settings; i.e. the timing of defection is clear. Yet, a significant segment of the service industry operates in non-contractual settings, where customers can silently respond to competitors’ loud overtures. Previous churn studies in non-contractual settings mostly predict customer behavior in a predetermined prediction period (e.g. Fader et al. 2010; Rust et al. 2011; Jahromi et al. 2014). Nonetheless, the timing of defection is of the utmost importance, particularly if the service company plans to exercise a retention program following specific series of service failures that precede churn.

Second, the majority of churn studies incorporate factors based on RFM (Recency, Frequency, and Monetary Value), demographics, and surveys as the main predictors (Buckinx & Van den Poel 2005; for limitations of RFM-based churn models, see Lee et al. 2011). The few studies that use service quality attributes as predictors (e.g. Padmanabhan et al. 2011) have taken a purely inductive approach. In contrast, this paper employs theoretical service quality variables as the main predictors for churn.

Third, the majority of churn studies concern B2C settings. The few studies that focus on B2B churn are not similar to the present study from the above two perspectives. Bolton et al. (2006) study B2B churn in a contractual setting with the data of 143 firms, where “average engineer work-minutes per contract” is the quality metric. In this paper, however, we investigate two years of several weekly service quality indices for nearly 100,000 SMEs. Jahromi et al. (2014) and Chen et al. (2014) study B2B churn in non-contractual settings. In contrast to these studies, in this paper we apply theory-driven predictive analytics and demonstrate that the service quality variables are relevant and can predict SMEs’ decisions on churn.

3. THE SERVICE DATABASE: SERVICE PAIN AND EPISODES

The B2B service database in this study is comprised of nine large data tables. The service transactions table includes the number of service units that each SME has received in each week within a two-year period; i.e. . In the two-year service period, several hundred million units of service were delivered to the customers. With nearly 100,000 SMEs, each with 105 weeks of provided service, the service transactions table has nearly ten million records.

The service company has defined a set of eight Service Quality Indexes (SQIs). Each SQI corresponds to a specific type of service failure (e.g. a one-day delay in fulfilment) that an SME might experience with one unit of service. Table 1 shows the total relative frequency of service failures related to each SQI. Note that except for SQI1, less than 1% of the service units delivered in the two-year window were subject to different types of service failures. Nevertheless, we investigate if such infrequent service failures can explain/predict churn.

Table 1: SQIs Service Failure Percentage

| | SQI1 | SQI2 | SQI3 | SQI4 | SQI5 | SQI6 | SQI7 | SQI8 |
|---|---|---|---|---|---|---|---|---|
| SQI Incident per Service Unit | 1.21% | 0.33% | 0.08% | 0.02% | 0.03% | 0.01% | 0.00% | 0.06% |

For every SME in every week we observe how many units of provided service were subject to a specific type of failure. In addition to the eight SQIs, a domain expert in the company provided a holistic SQI, a weighted linear combination of the individual SQIs, which we also use in the study.

The SQIs conceptually correspond to the absence of different service “hygiene” attributes (Naumann & Jackson 1999), which are expected as inherent parts of service (e.g. being on-time). Thus, the SQIs pertain to different measures of momentary pain stimulus (denoted by $p_t$) in behavioral economics (e.g. Ariely 1998). Besides $p_t$, since an SME receives instant utility from the fulfilled service units, we normalize the weekly $p_t$ related to each SQI with the number of provided service units in that week; i.e. proportional momentary pain, denoted by $\bar{p}_t$. Specifically, $\bar{p}_t$ is the number of service units that suffered from a specific SQI failure, divided by the total number of provided service units in a week.

As SMEs make transactions with the company their service pain/utility profiles become continually updated, hypothetically by $p_t$ or $\bar{p}_t$ corresponding to different SQIs. We hypothesize that SMEs are actively evaluating their service pain/utility profiles, and based on those evaluations they decide to churn or stay loyal to the service company (Bentham 1789; Jevons 1888). To predict their decisions on churn, should we assume that SMEs practice a rational model of service evaluation, where they make decisions based on the actual experienced service pain/utility (e.g. temporal average), or should we realistically assume that they abide by boundedly rational administrators who rely on judgment heuristics?

To answer this question, we first need to define a ‘service episode’; i.e. a bounded time interval defined by its instant service utilities and pains (Kahneman 2000). Unless the SME’s loyalty age is less than two years (i.e. the database time span), we assume that its service episode starts with the beginning of our service database. In the case of churners, the end of the SME’s service episode naturally coincides with the timing of its defection. Due to the non-contractual setting of this study, ‘defection’ corresponds to significant inactivity that lasts until the end of the database two-year window. Considering the large number of SMEs (i.e. nearly 100,000) and since non-contractual data are not labeled, we employed the following time-intensive two-step process to identify churners and their churn dates. This process took over two months to complete, but resulted in customer churn identifications accepted by the company as accurate.

1. We first form a pool of potential churners including thousands of SMEs whose service unit time-series satisfy the following two conditions: (i) There is a point in time where the moving average of the number of service units drops by at least 80%, and (ii) The slope of the first order regression line on the service unit time-series is less than -0.05. The two cutoffs (i.e. 80% and -0.05) were selected after a sensitivity analysis; the combination carries a low rate of false negatives after validating random samples of candidates with an expert analyst in the service company.

2. For the several thousands of potential churners in the pool we individually plot their service unit time-series; i.e. number of weekly service units against time (e.g. Figure 1). We then manually examine the time-series plots to select the churners and register their churn dates. Further, every one of the plots identified as churn, and a sample of those that were not, were verified by the service company expert.

In the above process, a few thousand SMEs were selected as churners and the timings of their defections were registered. Of the churners, we focus on those with at least six months of service transactions to ensure enough data for the analyses. Each identified churner has its specific service episode with respect to the episode’s timings and content. The episode’s content in this study concerns (i) weekly service units as instant service utilities and (ii) eight different types of weekly service pains. This paper is one of the first churn studies in non-contractual settings where each churner has a specific churn date, compared to a predefined prediction period (Fader et al. 2010; Rust et al. 2011; Jahromi et al. 2014). This is important when service companies plan to exercise a retention program following specific series of failures that precede churn.

Figure 1: An identified churner
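The first step of the churner identification process (the automated screen that precedes manual inspection) can be sketched in code. The sketch below is only illustrative: the function names, the 4-week moving-average window, the choice to measure the drop against the highest moving-average value seen so far, and the use of raw weekly units for the regression are all assumptions, since the paper reports only the two cutoffs (80% and -0.05).

```python
def moving_average(units, window):
    """Trailing moving average of weekly service units (one value per full window)."""
    return [sum(units[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(units))]


def ols_slope(units):
    """Slope of the least-squares line of weekly units against the week index."""
    n = len(units)
    mean_x = (n - 1) / 2
    mean_y = sum(units) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(units))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx


def is_potential_churner(units, window=4, drop=0.80, max_slope=-0.05):
    """Step-1 screen: (i) moving average falls by >= drop, (ii) slope < max_slope.

    Assumptions (not from the paper): a 4-week window, and the drop is measured
    against the highest moving-average value observed so far.
    """
    if len(units) < max(window, 2):
        return False
    running_peak = 0.0
    dropped = False
    for value in moving_average(units, window):
        # Condition (i): the fall from the running peak is at least `drop` of it.
        if running_peak > 0 and running_peak - value >= drop * running_peak:
            dropped = True
            break
        running_peak = max(running_peak, value)
    # Condition (ii): overall downward trend of the whole series.
    return dropped and ols_slope(units) < max_slope


# Hypothetical series: 20 active weeks followed by 10 inactive weeks.
candidate = [10] * 20 + [0] * 10
print(is_potential_churner(candidate))
```

A screen of this kind only builds the candidate pool; in the study the final churn labels and dates come from manual inspection of each plot and verification by the company expert.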

4. BOUNDEDLY RATIONAL CHURN DECISION RULES

In accordance with the Principle of Utility (Bentham 1789), we suspect that the level of service pain an SME experiences can be used to predict its future decision on churn. In this section we draw on behavioral economics to suggest decision rules that an SME might apply to decide whether to churn. We then consider these as features which can be examined using descriptive and predictive analytics for churn.

Rational models of information processing suggest that a service episode is evaluated based on its actual experienced service pain and utility, which is the temporal average of the episode’s instant pains. Behavioral economics, however, posits that individuals are guided by their memories of pain, and not the actual experienced pain (Kahneman et al. 1997). Such remembered pain is liable to biases, and hence is a fallible estimate of the actual experienced pain; i.e. a memory-experience gap (Miron-Shatz et al. 2009).

The snapshot model in behavioral economics (Fredrickson & Kahneman 1993) explains the retrospective evaluation of temporally extended experiences (e.g. service episodes). Individuals assess their experience episodes by constructing a snapshot of the episode’s representative moments and evaluating that snapshot’s utility (Kahneman et al. 2003). In a similar vein, we posit that SMEs continuously update their service snapshots based on their streaming experience with the company (Fredrickson & Kahneman 1993).

We propose boundedly rational churn decision rules based on the representativeness and availability heuristics, which are the focus of the heuristics and biases research program (Tversky & Kahneman 1974; Kahneman & Frederick 2002). Every churn decision rule can be exercised with each of the eight SQIs. For exposition, we illustrate a real example from the service database that is consistent with each decision rule. That is, the data behave as if an SME employed the decision rule and churned. While not suggesting causality, we seek such examples in the database for two reasons. First, such examples help the reader see the potential impacts of such decision rules on B2B churn. Second, any decision rule should have concrete examples in the database to be considered in the study. Not seeing any example may suggest that the decision rule is never employed and should not be part of the study.

4.1. Representativeness Decision Rules for Churn

A prominent heuristic for judging pain episodes is the peak-end rule (Fredrickson & Kahneman 1993); i.e. the past episode is evaluated based on its maximum instant pain along with the instant pain close to the end, as the two representatives for all the episode’s instant pains. For example, among a group of patients that have undergone a painful operation, those with less total pain and more end pain evaluated the whole procedure as more painful than those with more total pain and less end pain (Redelmeier & Kahneman 1996).

The original peak-end rule is merely an average of the peak pain and the end pain, hence the timing of the peak pain is not considered in the subsequent evaluation. This may not be important in less-than-one-hour episodes (e.g. Redelmeier & Kahneman 1996); however, we suspect the timing of the peak pain might play a significant role in evaluating long service episodes. In a similar vein, some studies highlight the role of trends in instant pains (Ariely & Carmon 2003), where a sequence of increasing pains is retrospectively judged worse than a sequence of decreasing ones, although both sequences carry the same total pain. That is, pushing the peak pain to the end can change the slope significantly, whereas the peak-end average stays as before. Accordingly, we propose separate decision rules based on the peak pain and the end pain.

The first and simplest decision rule (DR1) is solely based on the existence of service pain. That is, the SME will churn if the recent pain is greater than zero, regardless of its magnitude. Figure 2 depicts a churn in the database that can be attributed to the application of this rule; i.e. the SME decides to churn following the first incident of a specific SQI failure. The time-series show that on 5/31/2010, one of the eight units of provided service was subject to a specific type of failure, and the SME subsequently churned.

Figure 2: Application of DR1; (a) a specific SQI (b) service units
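To make the contrast between the rational evaluation (temporal average) and the heuristic evaluations discussed above (peak-end average and trend) concrete, the following sketch compares them on two made-up pain sequences with the same total pain. The function names and the sequences are hypothetical and are not taken from the service database.

```python
def temporal_average(pains):
    """Rational evaluation: mean of the episode's instant pains."""
    return sum(pains) / len(pains)


def peak_end_average(pains):
    """Original peak-end rule: average of the maximum pain and the final pain."""
    return (max(pains) + pains[-1]) / 2


def trend_slope(pains):
    """Least-squares slope of instant pain against time (positive = worsening)."""
    n = len(pains)
    mean_x = (n - 1) / 2
    mean_y = sum(pains) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(pains))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx


# Two hypothetical episodes with identical total pain (10 units each).
rising = [1, 2, 3, 4]
falling = [4, 3, 2, 1]

for name, episode in (("rising", rising), ("falling", falling)):
    print(name, temporal_average(episode), peak_end_average(episode), trend_slope(episode))
```

Both episodes have a temporal average of 2.5, so a purely rational evaluator would treat them alike, while the trend (and here also the end pain) separates them. This is the motivation for testing peak-based and end-based rules separately.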

To act upon a churn decision rule, an SME needs some time to complete a switch to another service provider, hence we consider an arbitrary action window (e.g. six weeks) to formulate the decision rules. In addition to the six-week action window, we examine the application of these decision rules with four-week and eight-week action windows. With a six-week action window DR1 can be formulated as:

DR1: churn in week $T$ if $\exists t \in [T-5, T]: p_t \succ 0$
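A minimal sketch of DR1 as a binary feature follows. It assumes that the relation in the rule is an ordinary strict "greater than", that weekly pains are stored in a list whose first element is week 1, and that the action window is a parameter; the function name `dr1_fires` is hypothetical.

```python
def dr1_fires(pain, T, window=6):
    """DR1: True if any week in [T - window + 1, T] has pain greater than zero.

    pain   -- weekly pain values for one SQI; pain[0] is week 1 (assumption)
    T      -- 1-based week in which the churn decision is evaluated
    window -- action window in weeks (six by default; four and eight are also examined)
    """
    first_week = max(1, T - window + 1)
    return any(pain[t - 1] > 0 for t in range(first_week, T + 1))


# Hypothetical SME: a single failure in week 11 of a 20-week record.
pain = [0] * 10 + [1] + [0] * 9
print(dr1_fires(pain, T=16))            # weeks 11..16 include the failure
print(dr1_fires(pain, T=17))            # weeks 12..17 do not
print(dr1_fires(pain, T=16, window=4))  # weeks 13..16 do not
```

Because the rule only checks whether pain is positive, normalizing by service volume would not change its outcome.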

DR1 also has a more rational manifestation, where the SME’s tolerance for service failure grows with the number of service units (i.e. extensional target evaluation). Behavioral economics posits that the logical rule of judgment is extensional (Kahneman & Frederick 2002); however, no such strict statement can be made in B2B services. It is not clear whether an SME’s potential insensitivity to service volume is an unconscious effect or a deliberate strategy. One reason is that SMEs, compared to individuals, are more likely to have logged information about the service volume, hence the extensional target attribute is not low in accessibility. Moreover, here both sensitivity and insensitivity to service scope are backed by apt explanations: An SME may not take its broad service scope into account, expecting no service pain at all since it is paying for each unit of service. Some may even push this further, expecting that broader service scopes deserve special care from the service company and subsequently less service pain. We refer to this hypothetical phenomenon as righteous neglect of scope since it can be endorsed by analytic reasoning. At the other extreme, an SME may appreciate the probability and admit that as the service scope expands the probability of service failures grows, leading to sensitivity to service scope. For the same reasons, in addition to the SQIs’ instant proportional pain ($\bar{p}_t$) we will test all of the suggested decision rules with instant pain ($p_t$); i.e. without considering the instant service utility that the SME receives.

The decision rule for the extensional target evaluation of the end service pain (DR2) is proposed in a way that satisfies monotonicity (Ariely & Loewenstein 2000). Monotonicity holds if each service unit adds to the failure tolerance threshold an amount that depends on the previous service utility and failure that the SME has already experienced. Thus, DR2 states that the SME will churn if the updated average of service pains (after the recent failure) is greater than the average prior to the recent service failure. This condition holds iff the average of the recent instant pains is greater than the average prior to the recent failures.

To illustrate, suppose that throughout the past course of service transactions where 1000 units of service were provided, the SME has experienced ten units of failure. In the present month, the number of service units is 101 and the SME has experienced one unit of service failure. In the case of extensional service pain evaluation, one unit of pain is commensurate with 101 service units, compared to the past proportional pain (i.e. 10/1000). The SME may even appreciate this as a sign of improvement in service quality.

DR2: churn in week $T$ if

$$\frac{\sum_{t=T-5}^{T} p_t}{\sum_{t=T-5}^{T} u_t} \succ \frac{\sum_{t=1}^{T-6} p_t}{\sum_{t=1}^{T-6} u_t};$$

where $u_t$ is the service volume in week $t$.

Figure 3: Application of DR2; (a) a specific SQI (b) service units

In Figure 3, for example, the service pain in the red area could push an SME with a sense of probability to churn, although it is not worse than the pain the SME experienced before. The reason is that the average pain in the red area is worse than the previous average pain (gray area); i.e. a regression in service quality.
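DR2 can be sketched as a comparison of two pain-per-unit ratios, and the worked example above (10 failures in 1000 past units versus 1 failure in 101 recent units) serves as a check. The sketch assumes a strict "greater than" comparison, lists indexed from week 1, and no churn signal when either period has no service volume; the function name and the weekly breakdown of the example are hypothetical.

```python
def dr2_fires(pain, units, T, window=6):
    """DR2: recent pain per service unit exceeds the prior pain per service unit.

    pain, units -- weekly pain and service volume; index 0 is week 1 (assumption)
    T           -- 1-based week in which the churn decision is evaluated
    """
    split = T - window              # number of weeks in the prior period (1..T-window)
    if split < 1:
        return False                # no prior period to compare against
    recent_p, recent_u = sum(pain[split:T]), sum(units[split:T])
    prior_p, prior_u = sum(pain[:split]), sum(units[:split])
    if recent_u == 0 or prior_u == 0:
        return False                # assumption: undefined ratios never trigger churn
    # Cross-multiplied form of recent_p / recent_u > prior_p / prior_u.
    return recent_p * prior_u > prior_p * recent_u


# Worked example: 10 failures in 1000 prior units, then 1 failure in 101 recent units.
prior_units, prior_pain = [100] * 10, [1] * 10
recent_units, recent_pain = [17, 17, 17, 17, 17, 16], [0, 0, 0, 0, 0, 1]
print(dr2_fires(prior_pain + recent_pain, prior_units + recent_units, T=16))
```

With 101 recent units the rule does not fire, since 1/101 is below 10/1000; with only 99 recent units and the same single failure it would fire.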

The last decision rule in this section concerns the peak aspect of the peak-end rule, where an SME might take the maximum instant (proportional) service pain as a representative for the whole service episode. DR3 addresses an SME that will churn if the recent pain is greater than any pain it has experienced before. Figure 4 depicts a churn in the database that can be attributed to the application of DR3 with $p_t$.

DR3: churn in week $T$ if

$$\max_{T-5 \le t \le T} p_t \succ \max_{1 \le t \le T-6} p_t$$
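DR3 reduces to comparing two maxima, as the sketch below shows. As before, it assumes a strict "greater than" comparison and lists indexed from week 1; the function name is hypothetical, and the same function can be applied to either instant pain or proportional pain series.

```python
def dr3_fires(pain, T, window=6):
    """DR3: the peak pain in the action window exceeds every earlier pain.

    pain -- weekly (instant or proportional) pain; index 0 is week 1 (assumption)
    T    -- 1-based week in which the churn decision is evaluated
    """
    split = T - window              # weeks 1..split form the prior period
    if split < 1:
        return False                # no earlier experience to compare against
    return max(pain[split:T]) > max(pain[:split])


# Hypothetical SME: earlier peak of 2 failures, then 3 failures in week 9.
pain = [0, 2, 1, 0, 0, 0, 0, 0, 3, 0, 0, 0]
print(dr3_fires(pain, T=12))
```

A recent peak that merely ties the earlier peak does not trigger the rule under the strict comparison assumed here.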

Figure 4: Application of DR3; (a) a specific SQI (b) service units

4.2. Availability Decision Rules for Churn

In accordance with attribution theory, we suggest that the SME’s judgment about the frequency of service failures plays an important role in its decision on churn. Frequent service failures eventually turn into a stable attribution of the service company, which pertains to the application of a judgment heuristic known as the availability heuristic. An incident is estimated as frequent if it is available; i.e. it can be easily brought to mind (Tversky & Kahneman 1973). In this vein, clinical research has shown that the recalled pain frequency is often overestimated if the pain is recent (Van Den Brink et al. 2001; Shiffman et al. 2008).

In service operations, the broad decision rule that stems from the availability heuristic is equivalent to DR1. If the most recent service pain is greater than zero, the service failure that caused pain will be also conceived as frequent; an impression that can lead the SME to churn in accordance with attribution theory. As noted earlier, DR1 covers both prototypical and extensional target evaluation. Here we present a version of this decision rule for the extensional target evaluation of failures frequency with two different measures.

The first measure ($f$) for service failures frequency is temporal; i.e. the number of weeks that include at least one incident of related service failure divided by the number of weeks that include at least one service unit:

$$f = \frac{|\{t : p_t \succ 0\}|}{|\{t : u_t \succ 0\}|}$$

Following the same logic presented for DR2, the decision rule for extensional target evaluation of temporal frequency states that an SME will churn if the recent temporal frequency of service failures is greater than what it was before. Figure 5 depicts a churn after two consecutive failures.

DR4: churn in week $T$ if

$$f\,\Big|_{t=T-5}^{T} \succ f\,\Big|_{t=1}^{T-6}$$

(with $f$ computed over the indicated weeks).
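The temporal frequency measure and DR4 can be sketched as follows. The sketch assumes a strict "greater than" comparison, lists indexed from week 1, and a frequency of zero for a period with no active weeks; the function names and the sample data are hypothetical.

```python
def temporal_frequency(pain, units):
    """f: share of active weeks (units > 0) that contain at least one failure."""
    active_weeks = sum(1 for u in units if u > 0)
    if active_weeks == 0:
        return 0.0                  # assumption: no service, no failure frequency
    failing_weeks = sum(1 for p in pain if p > 0)
    return failing_weeks / active_weeks


def dr4_fires(pain, units, T, window=6):
    """DR4: temporal failure frequency in the action window exceeds the prior one."""
    split = T - window              # weeks 1..split form the prior period
    if split < 1:
        return False
    recent = temporal_frequency(pain[split:T], units[split:T])
    prior = temporal_frequency(pain[:split], units[:split])
    return recent > prior


# Hypothetical SME: one failing week out of ten, then two consecutive failing weeks.
units = [5] * 16
pain = [1] + [0] * 9 + [0, 0, 0, 0, 1, 1]
print(dr4_fires(pain, units, T=16))
```

Here the prior frequency is 1/10 and the recent frequency is 2/6, so the rule fires.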

Figure 5: Application of DR4; (a) a specific SQI (b) service units

The second measure ($F$) for service failures frequency is incidental, which is the number of related service failures divided by the number of weeks that include at least one service unit:

$$F = \frac{\sum p_t}{|\{t : u_t \succ 0\}|}$$

DR5: churn in week $T$ if

$$F\,\Big|_{t=T-5}^{T} \succ F\,\Big|_{t=1}^{T-6}$$

(with $F$ computed over the indicated weeks).

Figure 6: Application of DR5; (a) specific SQI (b) service units

Figure 6 depicts a churn incident that can be attributed to the application of the availability heuristic with the incidental measure.
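The incidental measure differs from the temporal one only in its numerator: it counts failures rather than failing weeks. A minimal sketch follows, under the same assumptions as before (strict "greater than", lists indexed from week 1, zero frequency for a period with no active weeks, hypothetical names and data).

```python
def incidental_frequency(pain, units):
    """F: number of failures divided by the number of active weeks (units > 0)."""
    active_weeks = sum(1 for u in units if u > 0)
    if active_weeks == 0:
        return 0.0                  # assumption: no service, no failure frequency
    return sum(pain) / active_weeks


def dr5_fires(pain, units, T, window=6):
    """DR5: incidental failure frequency in the action window exceeds the prior one."""
    split = T - window              # weeks 1..split form the prior period
    if split < 1:
        return False
    recent = incidental_frequency(pain[split:T], units[split:T])
    prior = incidental_frequency(pain[:split], units[:split])
    return recent > prior


# Hypothetical SME: 2 failures over ten prior weeks, then 3 failures in the last week.
units = [5] * 16
pain = [2] + [0] * 9 + [0, 0, 0, 0, 0, 3]
print(dr5_fires(pain, units, T=16))
```

The two frequency measures can disagree: four failures concentrated in a single early week followed by two recent single-failure weeks give a higher prior incidental frequency (4/10 versus 2/6), even though failing weeks have become more frequent.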

Table 2 summarizes the five decision rules and their possible applications with different SQIs.

Table 2: Summary of Decision Rules and their Applications

| Rule | Description | Can be applied with…* |
|---|---|---|
| DR1 | Churn if the recent pain is greater than zero, regardless of its magnitude. | Eight SQIs’ 𝑝𝑡 and the holistic SQI’s 𝑝𝑡 |
| DR2 | Churn if the updated average of service pains (updated after the most recent failure) is greater than the average prior to the recent service failure. | Eight SQIs’ 𝑝𝑡 and 𝑝̅𝑡, along with the holistic SQI’s 𝑝𝑡, 𝑝̅𝑡, and 𝑝𝑖𝑡 |
| DR3 | Churn if the recent pain is greater than any pain that has been experienced before. | Eight SQIs’ 𝑝𝑡 and 𝑝̅𝑡, along with the holistic SQI’s 𝑝𝑡, 𝑝̅𝑡, and 𝑝𝑖𝑡 |
| DR4 | Churn if the recent temporal frequency of service failures is greater than what it was before. | Eight SQIs’ 𝑝𝑡 and the holistic SQI’s 𝑝𝑡 |
| DR5 | Churn if the recent incidental frequency of service failures is greater than what it was before. | Eight SQIs’ 𝑝𝑡 and the holistic SQI’s 𝑝𝑡 |

*𝑝𝑡: instant pain; 𝑝̅𝑡: instant proportional pain (𝑝𝑡/𝑢𝑡); 𝑝𝑖𝑡: overall number of service failures in a specific week regardless of the SQI types.

4.3. Somatic Markers and Boundedly Rational Decision Rules

Organizational decision making involves a combination of both intuitive and analytical skills, where administrative emotions steer analytical actions to particular goals in the organization (Simon 1997). This is in line with the dual-system of cognitive processes in psychology (Kahneman & Frederick 2002). That is, System 2 (reasoning) concurrently monitors the quality of the quick proposals made by System 1 (intuition) and subsequently endorses, corrects, or overrides them. Accordingly, the suggested boundedly rational churn decision rules may also draw the SME’s attention to the rational measures of service quality.

The somatic marker hypothesis in neuroeconomics (Damasio 2005; Bechara & Damasio 2005) can partly explain the hypothesized synergy between rational and boundedly rational assessments of service quality. A heuristic (e.g. peak service pain) can cause a somatic state that “functions as an alarm bell” and “operates not only as a marker for the value of what it represented, but also as a booster for continued working memory and attention” (Damasio 2005, p. 174). Likewise, the decision rule’s biasing nature might cause a somatic state that draws the SME’s attention to service quality, and subsequently calls for judgment, which might be carried out using the rational measures of quality assessment.

5. ANALYSES AND RESULTS

We examine the large service database to see if rational and boundedly rational decision rules help the service company explain and predict an SME’s churn. We use one-third of the churners to build the descriptive dataset and the remaining two-thirds for the explanatory/predictive analyses.

Each identified churner has a specific service episode; i.e. a churner’s service episode ends with a specific churn date. For the non-churners, however, we select the service episodes in both descriptive and predictive datasets based on the service episodes of the dataset’s churners. For every churner in the descriptive dataset (base rate: 1/9), for example, we randomly select nine non-churners (that have not been selected by the ETL yet) whose initial service episodes are longer than the churner’s. For these nine non-churners, the ETL selects the service episodes’ ending and beginning in a way that (i) the ending coincides with the corresponding churner’s and (ii) the length is equal to the corresponding churner’s, based on which the ETL adjusts the beginning. Such matched sampling is used to control in part for common events in the environment that might impact all SMEs.

5.1. Descriptive Analysis

The aims of the study’s descriptive analysis are twofold. First, it intrinsically covers the two essential steps in building predictive models (Shmueli & Koppius 2011); i.e. exploratory analysis and choice of variables. We have defined five churn decision rules in Section 4, each of which can be calculated with 𝑝𝑡 or 𝑝̅𝑡 of eight SQIs. The descriptive analysis helps decide which application measures, of which decision rules, with regard to which SQIs can be potential predictors for organizational churn.

Second, the study’s descriptive analysis aims to address the view in behavioral economics that rejects positivism and the elements of statistical significance such as p-value and lack of fit (Hosseini 2003; Ziliak & McCloskey 2008). Thus, in the descriptive analysis we refrain from rushing into the tests of statistical significance; we first investigate any descriptive evidence of the application of churn decision rules. The descriptive ETL is designed to extract three measures from the descriptive dataset. These measures in part address the application of the suggested decision rules in churn:

Measure 1- Percentage of churners that immediately follow the firing of a decision rule compared to non-churners (for whom the condition of the relevant decision rule holds in the last six weeks of their matched service episodes),

Measure 2- Percentage of churners for whom the decision rule fired at least once compared to non-churners (for whom the condition of the relevant decision rule holds at least once in their matched episodes),

Measure 3- Alarm frequency of a decision rule for churners compared to non-churners. Alarm frequency is the number of times that the necessary condition of a decision rule holds in a service episode. Specifically, it concerns the number of times that a specific decision rule has fired, or set off a service quality alarm in a service episode, on which an SME could have acted and churned.
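As a rough illustration of how the three measures relate to each other, the sketch below computes them as churner-to-non-churner ratios for a single decision rule and SQI. The data layout (one list of booleans per SME, one entry per sliding-window position, with the last entry standing for the final six weeks) and all names are assumptions made for this example, not the authors' ETL design.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

# (is_churner, whether the rule's condition held at each window position)
Episode = Tuple[bool, Sequence[bool]]


def _share(group: List[Sequence[bool]], predicate: Callable) -> float:
    """Fraction of episodes in the group that satisfy the predicate."""
    return mean(1.0 if predicate(fired) else 0.0 for fired in group)


def relative_measures(episodes: Sequence[Episode]) -> Tuple[float, float, float]:
    """Return (Measure 1, Measure 2, Measure 3) as churner / non-churner ratios.

    Raises ZeroDivisionError if the rule never holds for any non-churner.
    """
    churners = [fired for is_churner, fired in episodes if is_churner]
    others = [fired for is_churner, fired in episodes if not is_churner]
    # Measure 1: condition holds in the last window of the (matched) episode.
    m1 = _share(churners, lambda f: f[-1]) / _share(others, lambda f: f[-1])
    # Measure 2: condition holds at least once in the episode.
    m2 = _share(churners, any) / _share(others, any)
    # Measure 3: average alarm frequency (number of firings) per episode.
    m3 = mean(sum(f) for f in churners) / mean(sum(f) for f in others)
    return m1, m2, m3


toy = [
    (True, [False, True]), (True, [True, False]),
    (False, [False, False]), (False, [True, False]),
    (False, [False, True]), (False, [True, True]),
]
print(relative_measures(toy))
```

A ratio above 1 means the rule's condition is observed relatively more often for churners than for the matched non-churners.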

These measures help examine the importance of the representativeness and availability decision rules in B2B churn as defined in Section 4, through benchmarking the evidence of their application by churners against non-churners. Measure 1 is computed based on the last six weeks of the service episodes; hence the churn dates play a significant role in the first measure. To address the concern about any inherent inaccuracy of the churn dates, the ETL computes Measures 2 and 3, which capture the relative importance of the decision rules with respect to the entire service episode, and not just its end.

We end the descriptive analysis with a comparison between the actual service pain that the churners and non-churners experienced in their service episodes; i.e. a rational measure of service pain assessment.

5.1.1. Boundedly Rational Churn Decision Rules

To investigate any evidence of the application of the churn decision rules, the ETL analyzes the information extracted based on the temporal locus of a six-week sliding window. For each SME, starting from the 25th week of its specific service episode, the ETL first extracts the necessary information for all decision rules with respect to all SQIs, and then examines if the conditions for a specific decision rule hold. Having registered the analysis results for the current temporal locus of the sliding window, the ETL moves the sliding window ahead by one week, updates the relevant information, and repeats the process until it reaches the end of the SME’s specific episode. The first 24 weeks of each service episode are left as the initial benchmark for the extensional decision rules (e.g. DR2). In addition to the six-week sliding window, the ETL in this section has been executed with four-week and eight-week sliding windows. The results are not significantly different from the ones reported with the six-week sliding window.

For each SQI, in addition to its relevant instant pains (𝑝𝑡), we conduct the same analysis with instant proportional pains (𝑝̅𝑡) to examine their application with extensional target evaluation. Furthermore, the same analysis is conducted for 𝑝𝑖𝑡, which is inherent to the holistic SQI, and addresses the overall number of service failures in a specific week regardless of the SQI types. To illustrate, suppose that in a specific week, there is one service failure of 𝑆𝑄𝐼1, one failure of 𝑆𝑄𝐼2, and no failures of the rest of SQIs. Here, 𝑝𝑖𝑡 is equal to 2, whereas 𝑝𝑡 corresponding to the holistic SQI is the sum of the weights of 𝑆𝑄𝐼1 and 𝑆𝑄𝐼2 that have been suggested by a domain expert.

The pseudo-code in Figure 7 illustrates the significant ETL programming involved in this process. Note that there are four loops in the pseudo-code: 100,000 SMEs, each with nearly 100 weeks, five different decision rules, and nine different SQIs (including the holistic one).

For each SME,
    Fetch the SME’s service episode; i.e. [𝑤𝑠𝑡𝑎𝑟𝑡 , 𝑤𝑒𝑛𝑑 ],
    Set [𝑤𝑠𝑡𝑎𝑟𝑡 , 𝑤𝑠𝑡𝑎𝑟𝑡+23 ] as the base,
    For each 𝑤𝑖 in [𝑤𝑠𝑡𝑎𝑟𝑡+24 , 𝑤𝑒𝑛𝑑−5 ],
        For each decision rule 𝐷𝑅𝑗 ,
            For each 𝑆𝑄𝐼𝑘 ,
                Check to see if the conditions of 𝐷𝑅𝑗 * with 𝑆𝑄𝐼𝑘 (𝑝𝑡 ) hold within [𝑤𝑖 , 𝑤𝑖+5 ],
                Check to see if the conditions of 𝐷𝑅𝑗 with 𝑆𝑄𝐼𝑘 (𝑝̅𝑡 ) hold within [𝑤𝑖 , 𝑤𝑖+5 ],
            EndFor,
        EndFor,
    EndFor,
EndFor,

*For conditions of different decision rules see section 4.1

Figure 7: Pseudo-code for extracting pain assessment variables
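The loop structure of Figure 7 can be sketched for a single SME as follows. This is a hedged illustration rather than the actual ETL: `scan_episode`, the `Rule` signature, and the two example rules are hypothetical, and in particular the way a rule condition is evaluated over a six-week window (any positive pain for the DR1-like rule, window maximum against historical maximum for the DR3-like rule) is an assumption made here. The loops are ordered by SQI and rule first, which yields the same checks as the week-first order in the figure.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A rule receives (history before the window, values inside the window).
Rule = Callable[[Sequence[float], Sequence[float]], bool]


def any_pain_rule(history: Sequence[float], window: Sequence[float]) -> bool:
    """DR1-like (assumed form): some pain in the window is greater than zero."""
    return any(p > 0 for p in window)


def peak_pain_rule(history: Sequence[float], window: Sequence[float]) -> bool:
    """DR3-like (assumed form): window pain exceeds any pain experienced before."""
    return max(window) > max(history)


def scan_episode(
    series: Dict[str, Sequence[float]],
    rules: Dict[str, Rule],
    base_weeks: int = 24,
    window: int = 6,
) -> Dict[Tuple[str, str], List[bool]]:
    """Slide a window over one SME's episode and record where each rule fires.

    `series` maps an SQI name to its weekly values (p_t or proportional p_t).
    The first `base_weeks` weeks serve only as the initial benchmark.
    """
    results: Dict[Tuple[str, str], List[bool]] = {}
    for sqi, values in series.items():
        for name, rule in rules.items():
            fired = []
            # Window starts run from week 25 to the last full six-week window.
            for start in range(base_weeks, len(values) - window + 1):
                fired.append(rule(values[:start], values[start:start + window]))
            results[(name, sqi)] = fired
    return results


episode = {"SQI1": [1] * 24 + [0] * 6 + [3]}
print(scan_episode(episode, {"DR1": any_pain_rule, "DR3": peak_pain_rule}))
```

Each returned list has one entry per window position, so its last entry corresponds to the final six weeks of the episode.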

Table 3 presents the data for Measure 1, showing a comparison between the percentage of churners who immediately followed the decision rule and churned, and the same percentage for non-churners. To compute this measure, the ETL extracts the percentage of non-churners for whom the condition of the relevant decision rule holds in the last six weeks of their matched service episodes. Note that DR2 and DR3 can be calculated with 𝑝𝑡 and 𝑝̅𝑡, hence each of their cells has two entries.

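Table 3: Relative importance of decision rules for churners compared to non-churners (Measure 1)

| Rule | 𝑆𝑄𝐼1 | 𝑆𝑄𝐼2 | 𝑆𝑄𝐼3 | 𝑆𝑄𝐼4 | 𝑆𝑄𝐼5 | 𝑆𝑄𝐼6 | 𝑆𝑄𝐼7 | 𝑆𝑄𝐼8 | Holistic SQI |
|---|---|---|---|---|---|---|---|---|---|
| DR1 (𝑝𝑡) | 0.93 | 0.97 | 0.82 | 0.83 | 1.00 | 1.16 | 1.00 | 1.08 | 0.98 |
| DR2 (𝑝𝑡) | 1.09 | 1.08 | 0.85 | 0.86 | 1.06 | 1.17 | 1.00 | 1.13 | 1.06 |
| DR2 (𝑝̅𝑡) | 1.15 | 1.14 | 0.9 | 0.86 | 1.08 | 1.17 | 1.00 | 1.13 | 1.19 |
| DR2 (𝑝𝑖𝑡) | - | - | - | - | - | - | - | - | 1.14 |
| DR3 (𝑝𝑡) | 0.93 | 1.12 | 0.92 | 0.68 | 1.05 | 1.08 | 1.00 | 1.18 | 0.96 |
| DR3 (𝑝̅𝑡) | **1.32** | **1.37** | 1.05 | 0.73 | **1.22** | 1.16 | 1.00 | 1.13 | 1.16 |
| DR3 (𝑝𝑖𝑡) | - | - | - | - | - | - | - | - | 1.00 |
| DR4 (𝑝𝑡) | 0.84 | 0.95 | 0.82 | 0.79 | 1.02 | 1.17 | 1.00 | 1.11 | 0.91 |
| DR5 (𝑝𝑡) | 0.87 | 0.96 | 0.80 | 0.79 | 1.04 | 1.17 | 1.00 | 1.07 | 0.88 |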

The three bold statistics in Table 3 indicate the existence of proportional peak pain in the last six weeks prior to churn. These statistics address a relatively rational version of the peak pain decision rules since they are practiced with the SQIs’ 𝑝̅𝑡, which also considers the received utility. For example, while the condition for [DR3, 𝑆𝑄𝐼1 (𝑝̅𝑡)] holds 32% more in the last six weeks of the churners’ episodes than for the non-churners’, the same decision rule (practiced with 𝑆𝑄𝐼1 (𝑝𝑡)) has an opposite trajectory (i.e. 0.93). Nonetheless, the bold statistics in Table 3 do not necessarily imply a causal relationship between the decision rules and churn. Consider for example the bold statistic in [DR3, 𝑆𝑄𝐼2 (𝑝̅𝑡)]. In this case, 10.4% of churners experienced the peak proportional pain in their last six weeks prior to churn, while this number is 7.6% for the non-churners (i.e. 10.4% / 7.6% = 1.37). However, nearly half of the churners in the numerator did not follow the same decision rule more than four times within their service episodes. That is, the same decision rule had fired but they did not churn subsequently.

Table 4: Relative importance of decision rules for churners compared to non-churners (Measure 2)

| Rule | 𝑆𝑄𝐼1 | 𝑆𝑄𝐼2 | 𝑆𝑄𝐼3 | 𝑆𝑄𝐼4 | 𝑆𝑄𝐼5 | 𝑆𝑄𝐼6 | 𝑆𝑄𝐼7 | 𝑆𝑄𝐼8 | Holistic SQI |
|---|---|---|---|---|---|---|---|---|---|
| DR1 (𝑝𝑡) | 1.00 | 1.02 | 0.96 | 0.98 | 1.07 | 1.20 | 1.22 | 1.06 | 1.01 |
| DR2 (𝑝𝑡) | 1.01 | 1.20 | 0.96 | 0.98 | 1.08 | 1.21 | 1.23 | 1.06 | 1.03 |
| DR2 (𝑝̅𝑡) | 1.02 | 1.29 | 0.96 | 0.98 | 1.07 | 1.21 | 1.24 | 1.04 | 1.04 |
| DR2 (𝑝𝑖𝑡) | - | - | - | - | - | - | - | - | 1.03 |
| DR3 (𝑝𝑡) | 0.93 | 1.04 | 0.96 | 0.97 | 1.04 | 1.23 | 1.12 | 1.48 | 0.97 |
| DR3 (𝑝̅𝑡) | 1.00 | 1.09 | 1.01 | 1.00 | 1.08 | 1.25 | 1.22 | 1.30 | 1.02 |
| DR3 (𝑝𝑖𝑡) | - | - | - | - | - | - | - | - | 0.94 |
| DR4 (𝑝𝑡) | 0.98 | 1.02 | 0.96 | 0.98 | 1.06 | 1.21 | 1.23 | 1.05 | 0.99 |
| DR5 (𝑝𝑡) | 0.99 | 1.02 | 0.96 | 0.99 | 1.07 | 1.21 | 1.23 | 1.06 | 1.00 |

Table 4 indicates a comparison between the percentage of churners for whom the condition of the decision rule holds at least once in their entire service episodes, and the same percentage for non-churners (i.e. Measure 2). Measure 2 does not depend on the churn dates; it counts the cases where an SME followed a decision rule with some delay. Table 4 shows that the percentage of churners and non-churners that could follow a decision rule is not practically different. Only for [DR3, 𝑆𝑄𝐼8 (𝑝𝑡)] does the difference of 48% stand out.

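Table 5: Relative importance of decision rules for churners compared to non-churners (Measure 3)

| Rule | 𝑆𝑄𝐼1 | 𝑆𝑄𝐼2 | 𝑆𝑄𝐼3 | 𝑆𝑄𝐼4 | 𝑆𝑄𝐼5 | 𝑆𝑄𝐼6 | 𝑆𝑄𝐼7 | 𝑆𝑄𝐼8 | Holistic SQI |
|---|---|---|---|---|---|---|---|---|---|
| DR1 (𝑝𝑡) | 0.97 | 1.02 | 0.96 | 1.03 | 1.10 | 1.20 | 1.30 | 1.05 | 0.99 |
| DR2 (𝑝𝑡) | 1.00 | 1.00 | 0.95 | 0.99 | 1.10 | 1.20 | 1.30 | 1.00 | 1.00 |
| DR2 (𝑝̅𝑡) | 1.04 | 1.00 | 0.95 | 1.00 | 1.10 | 1.20 | 1.30 | 1.00 | 1.00 |
| DR2 (𝑝𝑖𝑡) | - | - | - | - | - | - | - | - | 1.00 |
| DR3 (𝑝𝑡) | 0.94 | 1.10 | 0.95 | 0.92 | 1.00 | 1.10 | 1.00 | 1.00 | 0.95 |
| DR3 (𝑝̅𝑡) | 1.06 | 1.10 | 0.99 | 0.93 | 1.10 | 1.10 | 1.20 | 0.97 | 1.00 |
| DR3 (𝑝𝑖𝑡) | - | - | - | - | - | - | - | - | 0.95 |
| DR4 (𝑝𝑡) | 0.94 | 1.00 | 0.92 | 0.99 | 1.10 | 1.19 | 1.31 | 1.03 | 0.96 |
| DR5 (𝑝𝑡) | 0.94 | 1.00 | 0.94 | 1.00 | 1.11 | 1.19 | 1.31 | 1.03 | 0.96 |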

Measure 3 indicates if a specific alarm was set off more for churners than for non-churners. Specifically, it provides a comparison between the average number of times that a decision rule’s condition holds in the entire service episode of a typical churner and that average for a typical non-churner.

5.1.2. Rational Assessment of Service Pain

Given that the service episodes in the analysis are of different lengths, we pick the temporal average of proportional service pain (𝜎) as a normative measure of the actual experienced service pain/utility; i.e. for each SME with a T-week service episode,

$$\sigma = \frac{\sum_{t=1}^{T} \bar{p}_t}{T}$$
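A minimal sketch of this normative measure is given below. The function names are illustrative, and treating a week with no service units as contributing zero proportional pain is an assumption made here, since the handling of such weeks is not spelled out in this section.

```python
from typing import Sequence


def proportional_pain(pain: float, units: float) -> float:
    """Instant proportional pain: failures divided by service units in the week."""
    # Assumption: a week with no service units contributes zero proportional pain.
    return pain / units if units > 0 else 0.0


def temporal_average_pain(pains: Sequence[float], units: Sequence[float]) -> float:
    """sigma: average of weekly proportional pain over a T-week episode."""
    weeks = len(pains)
    if weeks == 0:
        raise ValueError("episode must contain at least one week")
    return sum(proportional_pain(p, u) for p, u in zip(pains, units)) / weeks


# Hypothetical four-week episode: 1% and 2% failure weeks, two clean weeks.
print(temporal_average_pain([1, 0, 2, 0], [100, 100, 100, 0]))
```

Because 𝜎 divides by the episode length, episodes of different lengths become comparable, which is the reason given above for choosing it.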

Table 6 shows that for all SQIs the mean of 𝜎 for the churners is greater than for the non-churners, suggesting that the churners in the descriptive dataset have experienced more total service pain than the non-churners. This indicates lower actual service quality for the churners throughout their service episodes. Consider the 𝑆𝑄𝐼1 column: for a typical non-churner in the descriptive dataset, in every week on average, 1.49% of the service units suffered from the 𝑆𝑄𝐼1 failure, whereas this ratio is 1.65% for the churners. Note that the holistic percentages are large due to the weights the service company expert assigned to each SQI.

Table 6: Temporal average of proportional service pain

| | 𝑆𝑄𝐼1 | 𝑆𝑄𝐼2 | 𝑆𝑄𝐼3 | 𝑆𝑄𝐼4 | 𝑆𝑄𝐼5 | 𝑆𝑄𝐼6 | 𝑆𝑄𝐼7 | 𝑆𝑄𝐼8 | Holistic SQI |
|---|---|---|---|---|---|---|---|---|---|
| Non-churners | 1.49% | 0.50% | 0.16% | 0.83% | 0.06% | 0.50% | 0.013% | 0.71% | 83.45% |
| Churners | 1.65% | 0.61% | 0.19% | 0.84% | 0.07% | 0.59% | 0.017% | 0.77% | 94.48% |

Table 6 pertains to the rational models of information processing in neoclassical economics; hence we can examine its statistical significance. To investigate whether the mean of 𝜎 is significantly greater for the churners than for the non-churners, we conduct Analyses of Variance on 𝜎 with regard to each SQI. We first conduct an omnibus MANOVA. It is determined that significant differences exist between the two groups at any conventional level of α (p-value < $10^{-4}$). Although the number of service units and service episodes’ lengths are both embedded in 𝜎, we still include them as covariates in our analyses. Regarding the relevant assumptions, ANCOVA is robust with respect to the normality assumption for large samples. We conduct Levene’s test to verify the homogeneity of variances of 𝜎 for the churners and the non-churners. Except for the first two SQIs, Levene’s null hypothesis is not rejected at α = 0.01, satisfying the corresponding assumption for seven SQIs. Among these SQIs, only the service failures related to 𝑆𝑄𝐼4 and 𝑆𝑄𝐼7 do not cause significantly more pain for the churners; i.e. the actual service pain differences are highly significant for the rest of SQIs.
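For readers unfamiliar with the homogeneity check, the sketch below computes the Levene test statistic for two or more groups of 𝜎 values. It uses the mean-centered variant, which is an assumption here because the text does not say which variant was applied, and it stops at the statistic itself: the p-value would come from comparing W against an F distribution with (k − 1, N − k) degrees of freedom using a statistics package. The function name and the toy data are illustrative.

```python
from statistics import mean
from typing import Sequence


def levene_statistic(*groups: Sequence[float]) -> float:
    """Levene's W using absolute deviations from each group's mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    # Absolute deviation of every observation from its own group mean.
    deviations = [[abs(x - mean(g)) for x in g] for g in groups]
    group_means = [mean(z) for z in deviations]
    grand_mean = sum(sum(z) for z in deviations) / n_total
    between = sum(len(z) * (m - grand_mean) ** 2 for z, m in zip(deviations, group_means))
    within = sum((x - m) ** 2 for z, m in zip(deviations, group_means) for x in z)
    return (n_total - k) / (k - 1) * between / within


# Toy sigma values for two hypothetical groups with different spreads.
print(levene_statistic([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```

A large W relative to the F critical value indicates unequal variances, which is the case reported above for the first two SQIs.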

5.2. Explaining and Predicting B2B Churn with Service Quality

This section describes both explanatory and predictive modeling (for differences see Shmueli 2010). We investigate (i) if the service pain evaluation variables can explain SMEs’ decisions to churn, and (ii) if such theoretical predictors help predict churn. It is important to note that we approach the above questions (i.e. explanation and prediction) separately. We apply logistic regression models, which are widely used in churn (Neslin et al. 2006; Lemmens & Gupta 2013). We supplement our predictive modeling with classification trees as the second most common technique for churn detection (Neslin et al. 2006). To ensure robustness, we examine both explanation and prediction through ten iterations of random stratified subsampling.

5.2.1. Construction of Variables

The ETL system uses the remaining two-thirds of the SMEs and their specific episodes to generate the dataset for explanatory and predictive modeling (Table 7). SME_ID is the primary key in the dataset. The target variable is a churn label, addressing the SME’s observed behavior on defection. The service pain assessment variables are computed based on each SME’s specific episode. The controls include Age (age of SME’s relationship with the service company), and a set of firmographics.

Table 7: An observation in the dataset for explanatory and predictive modeling

| Column group | Variables | Description |
|---|---|---|
| Key | SME_ID | |
| Target | Churn (0/1) | |
| Control Variables | Age, 𝐹1, 𝐹2, 𝐹3 | Relationship age, 11 years on average. Firm demographics. |
| Service Pain Assessment Variables: Rational Service Assessment Variables | 𝑅1 … 𝑅9 | Based on the SME’s whole episode. 𝑅𝑖: temporal average (𝜎) of service pain/utility for 𝑆𝑄𝐼𝑖. |
| Service Pain Assessment Variables: Heuristic Service Assessment Variables | 𝐻1 … 𝐻26 | Based on the SME’s whole service episode. See Appendix A. |
| Service Pain Assessment Variables: Heuristic Service Assessment Variables | 𝐻27, 𝐻28, 𝐻29 | Based on the last six weeks of the SME’s service episode. See Appendix A. |

We draw on the descriptive analysis in Section 5.1 to select the theoretical variables. The churners in the descriptive dataset have experienced more actual service pain with regard to all SQIs (Table 6), hence the ETL computes nine R(ational) service pain assessment variables (𝑅𝑖:1→9). In Table 7, 𝑅𝑖 is the temporal average of the proportional pain ($\frac{\sum_{t=1}^{T} \bar{p}_t}{T}$) related to 𝑆𝑄𝐼𝑖, which the SME has experienced in its specific episode. 𝑅9 concerns the holistic SQI. The ETL also computes 29 H(euristic) variables (𝐻𝑗:1→29) which concern the application of different decision rules with different SQIs. Appendix A explains these variables in detail.

5.2.2. Explaining Organizational Churn with Service Pain Assessment Variables

We have drawn on the theories in behavioral economics to design variables that capture different forms of B2B service quality evaluation. The ultimate goal of these variables is to help generate accurate predictions about organizational churn. Nonetheless, since these variables are deeply rooted in theory, we first test the related hypotheses through explanatory modeling (Shmueli 2010). Specifically, explanatory modeling allows us to test if the related hypotheses for the effects of service pain or somatic states on churn hold.

In this section, accordingly, we investigate if the suggested service pain assessment variables can explain SMEs’ decisions to churn. To ensure consistency and robustness, we examine the effects of theoretical variables on churn through ten iterations of random stratified subsampling. In each iteration, we examine the statistical significance of the model using the randomly selected two-thirds of the dataset. To assess the predictive power of the explanatory models (Shmueli 2010), in each round we score the remaining one-third of the dataset (i.e. test stratum) with the explanatory logit models.

To account for multicollinearity, which could interfere with inference (Shmueli 2010), in every iteration we consider a ‘0.1’ cutoff as the tolerance threshold (Hair et al. 2010). All R(ational) variables (𝑅𝑖:1→8) are highly tolerant in all iterations; their tolerance values are mostly between 0.7 and 0.9. 𝑅9 is naturally intolerant since it concerns a linear combination of the eight SQIs. Among the H(euristic) variables, 𝐻27, 𝐻28 and 𝐻29, which are computed based on the last six weeks of the SME’s episode, are highly tolerant throughout all iterations; their tolerance values are greater than 0.9. Among the first 26 H(euristic) variables, which are computed based on the SME’s whole episode, only 𝐻13 and 𝐻14 are moderately tolerant.

The rational essence of the tolerant H(euristic) variables deserves attention. 𝐻13 and 𝐻14 address the extensional target evaluation of the last service pain related to SQI2, which is a logical rule (Kahneman & Frederick 2002). Similarly, 𝐻27, 𝐻28, and 𝐻29 concern the existence of proportional peak pain in the last six weeks of the episode, thereby carrying a sense of utility and probability in the peak pain.

Besides the variables that pass the tolerance filter, we include three interaction terms between 𝐻27, 𝐻28, 𝐻29 and their corresponding R(ational) pain assessment variables. This addresses the hypothesized role of heuristics as attentional mechanisms (i.e. the somatic marker hypothesis). To illustrate, since 𝐻27 concerns the existence of the proportional peak 𝑆𝑄𝐼1 pain in the last six weeks of the SME’s episode, we include ‘𝐻27 ∗ 𝑅1’ in the model. We hypothesize that the peak pain (𝐻27 = 1) could realize/exacerbate the effect of the rational assessment of the relevant pain (𝑅1) on the odds of churn.

Table 8 summarizes the statistically significant variables, their significance level, and their effects on the odds of churn throughout the ten iterations of random stratified subsampling. The results show that the SME’s age of business relationship with the service company stays highly significant, with a negative effect on the odds of churn. Holding other variables fixed, the odds of churn decreases by 3.7% for every one-year increase in the business relationship age. In a similar vein, an Analysis of Variance shows that the average relationship age is significantly less for churners. This highlights the importance of a well-established inter-organizational relationship in B2B operations.

Table 8: Explanatory models

| Iteration | Control Variables | Rational Assessment Variables | Heuristic Variables | Somatic Marker Hypothesis | AUC on Test Stratum |
|---|---|---|---|---|---|
| 1 | −𝐴𝑔𝑒∗∗∗, 𝐹1∗∗∗, 𝐹3∗∗∗ | +𝑅1∗∗∗, +𝑅2∗∗∗, +𝑅6∗, +𝑅8∗∗∗ | +𝐻13∗∗∗ | +𝐻27 ∗ 𝑅1∗∗, +𝐻29 ∗ 𝑅5∗∗ | 0.6563 |
| 2 | −𝐴𝑔𝑒∗∗∗, 𝐹1∗∗∗, 𝐹3∗∗∗ | +𝑅1∗∗∗, +𝑅2∗∗∗, +𝑅6∗∗, +𝑅8∗∗∗ | +𝐻13∗∗∗ | +𝐻27 ∗ 𝑅1∗∗∗, +𝐻28 ∗ 𝑅2∗∗, +𝐻29 ∗ 𝑅5∗∗ | 0.6542 |
| 3 | −𝐴𝑔𝑒∗∗∗, 𝐹1∗∗∗, 𝐹3∗∗∗ | +𝑅1∗∗∗, +𝑅2∗∗∗, +𝑅6∗, +𝑅8∗∗∗ | +𝐻13∗∗∗, +𝐻28∗∗ | +𝐻27 ∗ 𝑅1∗∗∗ | 0.6524 |
| 4 | −𝐴𝑔𝑒∗∗∗, 𝐹1∗∗∗, 𝐹3∗∗∗ | +𝑅1∗∗, +𝑅2∗∗∗, +𝑅6∗∗, +𝑅8∗∗ | +𝐻13∗∗∗ | +𝐻27 ∗ 𝑅1∗∗∗, +𝐻28 ∗ 𝑅2∗∗, +𝐻29 ∗ 𝑅5∗ | 0.6504 |
| 5 | −𝐴𝑔𝑒∗∗∗, 𝐹1∗∗∗, 𝐹3∗∗∗ | +𝑅1∗∗∗, +𝑅2∗, +𝑅6∗, +𝑅8∗∗∗ | +𝐻13∗∗∗ | +𝐻27 ∗ 𝑅1∗∗∗, +𝐻28 ∗ 𝑅2∗∗∗ | 0.6431 |
| 6 | −𝐴𝑔𝑒∗∗∗, 𝐹1∗∗∗, 𝐹2∗, 𝐹3∗∗∗ | +𝑅1∗∗∗, +𝑅2∗∗∗, +𝑅6∗∗, +𝑅8∗∗ | +𝐻13∗∗∗ | +𝐻27 ∗ 𝑅1∗∗∗, +𝐻28 ∗ 𝑅2∗ | 0.6574 |
| 7 | −𝐴𝑔𝑒∗∗∗, 𝐹1∗∗∗, 𝐹3∗∗∗ | +𝑅1∗∗∗, +𝑅2∗∗∗, +𝑅6∗∗, +𝑅8∗∗∗ | +𝐻13∗∗∗ | +𝐻27 ∗ 𝑅1∗∗∗ | 0.6568 |
| 8 | −𝐴𝑔𝑒∗∗∗, 𝐹1∗∗∗, 𝐹3∗∗∗ | +𝑅1∗∗∗, +𝑅2∗∗, +𝑅8∗∗ | +𝐻13∗∗∗ | +𝐻27 ∗ 𝑅1∗∗∗, +𝐻28 ∗ 𝑅2∗, +𝐻29 ∗ 𝑅5∗ | 0.6639 |
| 9 | −𝐴𝑔𝑒∗∗∗, 𝐹1∗∗∗, 𝐹3∗∗∗ | +𝑅1∗∗∗, +𝑅2∗∗∗, +𝑅8∗∗∗ | +𝐻13∗∗∗ | +𝐻27 ∗ 𝑅1∗∗∗, +𝐻28 ∗ 𝑅2∗∗∗, +𝐻29 ∗ 𝑅5∗∗∗ | 0.6586 |
| 10 | −𝐴𝑔𝑒∗∗∗, 𝐹1∗∗∗, 𝐹3∗∗∗ | +𝑅1∗∗∗, +𝑅2∗, +𝑅6∗, +𝑅8∗∗∗ | +𝐻13∗∗∗ | +𝐻27 ∗ 𝑅1∗∗, +𝐻28 ∗ 𝑅2∗∗ | 0.6591 |

∗∗∗Significant at α=0.001, ∗∗Significant at α=0.01, ∗Significant at α=0.05. Significance of the coefficients is based on χ2 tests.

The theoretical variables in the explanatory models are found to increase the odds of churn, corroborating the hypothesized effects of the perceived service pain on SMEs’ decisions to churn. To illustrate, holding other variables fixed, a one percent increase in the temporal average of 𝑆𝑄𝐼1 proportional pain (i.e. 𝑅1) will increase the odds of churn by 7.25%. Furthermore, proportional peak pain is found to either realize or exacerbate the effect of rational pain assessment on churn (e.g. ‘𝐻27 ∗ 𝑅1’), providing evidence for the hypothesized somatic states in organizational decision making. Proportional peak pain (𝐻27) could cause a somatic state in SMEs and call for a decision on loyalty, which can be made using the related R(ational) measures of service quality (𝑅1).
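The percentage effects quoted for the logit models follow from the usual relationship between a logistic regression coefficient and the odds ratio, exp(β). The short sketch below converts in both directions; the function names are illustrative, and the coefficients it prints are implied by the reported percentages rather than taken from the fitted models, which are not listed here.

```python
import math


def pct_change_in_odds(beta: float, delta: float = 1.0) -> float:
    """Percent change in the odds for a `delta`-unit increase in the predictor."""
    return (math.exp(beta * delta) - 1.0) * 100.0


def coefficient_from_pct(pct: float) -> float:
    """Logit coefficient implied by a percent change in odds per unit increase."""
    return math.log(1.0 + pct / 100.0)


beta_age = coefficient_from_pct(-3.7)   # relationship age, per year
beta_r1 = coefficient_from_pct(7.25)    # R1, per one percent of proportional pain
print(round(beta_age, 4), round(beta_r1, 4))
# Effects compound multiplicatively: five more years of relationship age.
print(round(pct_change_in_odds(beta_age, delta=5), 1))
```

Because the effect is multiplicative on the odds, five additional years of relationship age reduce the odds by about 17%, not by five times 3.7%.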

The use of expert opinion as part of the churn detection process calls for verification of the results’ robustness with respect to the labeled churners. We conduct a sensitivity analysis by successively removing a random 5% of churners from a training stratum and investigating any change in the variables’ significance after each removal. Table 9 shows that while 𝑅1 loses its significance after removing 35% of the labeled churners, all other variables carry their significance until the removal of 50% of the labeled churners, showing the robustness of the explanatory models against an important human element in the study.

Table 9: Sensitivity analysis

M

AN

US

CR

Control Rational Service Pain Boundedly Rational Service Somatic Marker Hypothesis Variables Assessment Variables Pain Assessment Variables ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ -5% −𝐴𝑔𝑒 , 𝐹1 , 𝐹3 +𝐻27 ∗ 𝑅1∗∗∗ , +𝐻28 ∗ 𝑅2∗∗ , +𝐻29 ∗ 𝑅5∗∗ +𝑅1 , +𝑅2 , +𝑅6 , +𝑅8 +𝐻13 ∗∗∗ -10% −𝐴𝑔𝑒 ∗∗∗ , 𝐹1∗∗∗ , 𝐹3∗∗∗ +𝐻27 ∗ 𝑅1∗∗∗ , +𝐻28 ∗ 𝑅2∗∗ , +𝐻29 ∗ 𝑅5∗∗∗ +𝑅1∗∗ , +𝑅2∗∗∗ , +𝑅6∗∗∗ , +𝑅8∗∗∗ +𝐻13 ∗∗∗ -15% −𝐴𝑔𝑒 ∗∗∗ , 𝐹1∗∗∗ , 𝐹3∗∗∗ +𝐻27 ∗ 𝑅1∗∗∗ , +𝐻28 ∗ 𝑅2∗∗ , +𝐻29 ∗ 𝑅5∗∗∗ +𝑅1∗∗ , +𝑅2∗∗∗ , +𝑅6∗∗∗ , +𝑅8∗∗∗ +𝐻13 ∗∗∗ -20% −𝐴𝑔𝑒 ∗∗∗ , 𝐹1∗∗∗ , 𝐹3∗∗∗ +𝐻27 ∗ 𝑅1∗∗∗ , +𝐻28 ∗ 𝑅2∗∗ , +𝐻29 ∗ 𝑅5∗∗∗ +𝑅1∗∗ , +𝑅2∗∗∗ , +𝑅6∗∗∗ , +𝑅8∗∗∗ +𝐻13 ∗∗∗ -25% −𝐴𝑔𝑒 ∗∗∗ , 𝐹1∗∗∗ , 𝐹3∗∗∗ +𝐻27 ∗ 𝑅1∗∗∗ , +𝐻28 ∗ 𝑅2∗ , +𝐻29 ∗ 𝑅5∗∗∗ +𝑅1∗ , +𝑅2∗∗∗ , +𝑅6∗∗∗ , +𝑅8∗∗∗ +𝐻13 ∗∗∗ ∗∗∗ ∗∗∗ ∗ ∗∗∗ ∗∗∗ ∗∗ ∗∗∗ -30% −𝐴𝑔𝑒 , 𝐹1 , 𝐹3 +𝐻27 ∗ 𝑅1∗∗∗ , +𝐻28 ∗ 𝑅2∗∗ , +𝐻29 ∗ 𝑅5∗∗∗ +𝑅1 , +𝑅2 , +𝑅6 , +𝑅8 +𝐻13 ∗∗∗ -35% −𝐴𝑔𝑒 ∗∗∗ , 𝐹1∗∗∗ , 𝐹3∗∗∗ +𝐻27 ∗ 𝑅1∗∗∗ , +𝐻28 ∗ 𝑅2∗∗ , +𝐻29 ∗ 𝑅5∗∗ 𝑅1 , +𝑅2∗∗∗ , +𝑅6∗∗∗ , +𝑅8∗∗ +𝐻13 ∗∗∗ -40% −𝐴𝑔𝑒 ∗∗∗ , 𝐹1∗∗∗ , 𝐹3∗∗∗ +𝐻27 ∗ 𝑅1∗∗∗ , +𝐻28 ∗ 𝑅2∗∗ , +𝐻29 ∗ 𝑅5∗∗ 𝑅1 , +𝑅2∗∗∗ , +𝑅6∗∗∗ , +𝑅8∗∗ +𝐻13 ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗ ∗∗∗ -45% −𝐴𝑔𝑒 , 𝐹1 , 𝐹3 +𝐻27 ∗ 𝑅1∗∗∗ , +𝐻28 ∗ 𝑅2∗∗ , +𝐻29 ∗ 𝑅5∗∗ 𝑅1 , +𝑅2 , +𝑅6 , +𝑅8 +𝐻13 ∗∗∗ -50% −𝐴𝑔𝑒 ∗∗∗ , 𝐹1∗∗∗ , 𝐹3∗∗∗ +𝐻27 ∗ 𝑅1∗∗∗ , +𝐻28 ∗ 𝑅2∗∗ , +𝐻29 ∗ 𝑅5∗ 𝑅1 , +𝑅2∗∗∗ , +𝑅6∗∗∗ , +𝑅8∗∗ +𝐻13 ***Significant at α=0.001, **Significant at α=0. 01, *Significant at α=0.05. Significance of the coefficients is based on Wald χ2 tests.

5.2.3. Predicting Organizational Churn with Service Pain Assessment Variables

To optimally predict B2B churn, we develop a stepwise selection method based on the variables’ contribution to the predictive accuracy. Note that we apply the stepwise selection to the whole pool of control, rational, and boundedly rational variables; checking for multicollinearity is not necessary for predictive purposes (Shmueli 2010). The sequence of the predictors entering the logit model is determined by their contributions to the Area Under the ROC Curve (AUC), which is computed against a validation set. In each iteration, specifically, we randomly select two-thirds of the dataset for training and validation, and the remaining one-third for testing. Table 10 shows the sequence of the predictors entering the model in each iteration and the AUC of the final model on the test stratum. ROC curves address the tradeoff between the model’s true positive and false positive rates at different thresholds (Provost & Fawcett 2013), and the area under the ROC curve has been shown to stand out from the rest of accuracy measures (Culver et al. 2006), especially in cases of unbalanced datasets. We supplement our predictive modeling with classification trees as the second most common technique for churn detection (Neslin et al. 2006).
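The selection procedure described above can be sketched as a greedy forward search driven by validation AUC. The code below is an illustration under assumptions, not the authors' implementation: `validation_auc` is a hypothetical callback that would fit a logit model on the training data with the given predictors and return its AUC on the validation set, the stopping rule (stop when no candidate improves the AUC) is assumed because the text does not state one, and the toy scoring function exists only to make the example runnable.

```python
from typing import Callable, List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random churner scores above a random non-churner.

    Pairwise (rank-based) definition of the area under the ROC curve;
    tied scores count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def forward_select(
    candidates: Sequence[str],
    validation_auc: Callable[[List[str]], float],
    min_gain: float = 0.0,
) -> Tuple[List[str], float]:
    """Add, at each step, the predictor with the largest validation AUC."""
    selected: List[str] = []
    remaining = list(candidates)
    best = 0.5  # assumption: start from the AUC of a random classifier
    while remaining:
        scored = [(validation_auc(selected + [c]), c) for c in remaining]
        top_auc, top = max(scored, key=lambda pair: pair[0])
        if top_auc - best <= min_gain:
            break  # assumed stopping rule: no candidate improves the AUC
        selected.append(top)
        remaining.remove(top)
        best = top_auc
    return selected, best


def toy_auc(subset: List[str]) -> float:
    """Stand-in for 'fit logit, score validation set'; values are made up."""
    return 0.5 + 0.1 * ("Age" in subset) + 0.03 * ("F3" in subset)


print(auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))
print(forward_select(["F3", "Age", "noise"], toy_auc))
```

The final model would then be scored once on the held-out test stratum, which keeps the test AUC independent of the selection path.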

Table 10: Stepwise selection of predictors

Iteration: 1 2 3 4 5 6 7 8 9 10
1st: Age Age Age Age Age Age Age Age Age Age
2nd: 𝐹3 𝐹3 𝐹3 𝐹3 𝐹3 𝐹3 𝐹3 𝐹3 𝐹3 𝐹3
Predictors Added in Each Step, 3rd 4th 5th 6th 7th: 𝐻13 𝐹1 𝑅2 𝑅6 𝐹1 𝐻13 𝑅1 𝑅2 𝐻25 𝐹1 𝑅1 𝐻13 𝑅8 𝑅2 𝐹1 𝐻13 𝑅2 𝑅8 𝑅6 𝐻13 𝑅1 𝑅8 𝐹1 𝑅2 𝐹1 𝐻13 𝑅2 𝐻25 𝑅1 𝐹1 𝐻13 𝑅1 𝑅8 𝑅6 𝐹1 𝐻13 𝑅6 𝐹1 𝑅2 𝑅8 𝐹1 𝐻13 𝑅2 𝑅8
8th: 𝐻25 𝐻27
AUC on Test Stratum, Logit Tree: 0.630 0.6513 0.639 0.6484 0.630 0.6479 0.625 0.6425 0.623 0.6429 0.633 0.6500 0.635 0.6530 0.633 0.6505 0.637 0.6494 0.622 0.6504

Table 10 shows that the first two variables entering the logit models throughout the iterations are controls. Similarly, the contributions to the AUCs indicate that Age is the principal variable in terms of accuracy, highlighting the importance of having a long-term relationship with the service provider in B2B operations. The marginal contribution of the service pain variables beyond Age might be explained by a data limitation; i.e. the pain assessment variables are extracted from a two-year service database. Note that the business relationship age of an average SME with the service company is nearly 11 years. Perhaps other variables could help predict the churn of SMEs with a long-term relationship with the service company.

To examine further the predictive value of the service pain assessment variables, we repeat the above stepwise selection process with solely new customers; i.e. the SMEs with the relationship age of two years or less (for which our database has complete data). These new customers are the ones with which firms do not yet have deep relationships, making them particularly important for analytics-driven operational methods. Table 11 shows that for new customers, the principal predictors in all iterations are the theoretical service pain assessment variables. Modest AUC scores (greater than 0.6) have worthwhile implications in the context of B2C churn (Provost & Fawcett 2013). Modest AUC scores, especially those delivered by parsimonious service quality models, deserve even more attention in the B2B operations (fewer customers, more transactions). Figure 8 depicts a tree trained with the new customers. The numbers in the leaves represent percentage of churners.

Table 11: Churn models for new customers

Iteration: 1 2 3 4 5 6 7 8 9 10
Predictors Added in Each Step, 1st 2nd 3rd 4th: 𝐻13 𝑅6 𝑅8 𝐻13 𝑅8 𝑅6 𝐻13 𝑅8 𝑅1 𝐻13 𝑅8 𝐻26 𝐻25 𝑅2 𝑅6 𝐻13 𝐻25 𝑅1 𝐻13 𝐻29 𝑅1 𝐹1 𝐻26 𝑅1 𝑅8 𝐻13 𝑅2 𝑅8 𝑅9 𝑅1 𝐹1 𝐻13
AUC on Test Stratum: 0.6221 0.6004 0.5900 0.6074 0.5853 0.6390 0.6243 0.5949 0.6197 0.6795

Figure 8: Churn tree for new customers (splits on 𝑅1, 𝑅2, and 𝑅8; leaf values are percentages of churners)

6. CONCLUSIONS AND FUTURE WORK

In this paper we have shown how behavioral economics can inspire feature engineering and help build parsimonious models for an important business problem using theory-driven predictive analytics. Our analyses corroborate the role of the rationality assumption in explaining/predicting the future behavior of organizations (Friedman 1953; Simon 1979). Besides shedding new light on the rationality debate, the results have important implications for B2B service operations. Our findings suggest that the service company should keep the total service pain low. This is an important finding in B2B operations where the prior conception is that firms do not follow a rational model of service quality assessment in making decisions about service renewals (Bolton et al. 2006). Concomitantly, the service provider should monitor every service failure and be alert to the extensional service evaluation heuristic, which can cause a somatic state for the SME, making it more sensitive to the past total service pain.

6.1. Implications for Behavioral Organizational Economics

Behavioral economics views administrative decision-making as a combination of both intuitive and analytical skills (Simon 1997). Accordingly, our feature engineering designed both rational and boundedly rational measures of service pain assessment. The rational measure of pain assessment is simply the temporal average of service pain. The boundedly rational service pain assessment measures are different variants of the two most comprehensive heuristics, i.e. the representativeness and availability heuristics (Tversky & Kahneman 1974). These measures also include more rational variants of the heuristic decision rules, which appreciate utility and probability (i.e. extensional target evaluation).

Regarding the analytical service pain evaluation techniques, both descriptive and predictive analyses suggest that the average behavior of SMEs as a group is as if they are rational (Friedman 1953). In particular, the rational measures of service pain evaluation help explain and predict the SMEs’ decisions on churn. Individual SMEs might practice different boundedly rational decision rules; however, it is their assumed rationality that helps explain and predict their future behavior (Friedman 1953, 1978). This finding is consistent with the position of Friedman (1953) and Shugan (2009) that a theory’s fruitfulness is evaluated by its predictive accuracy, not by the realism of its assumptions. Nonetheless, our findings should not be taken to characterize the SMEs as wholly rational. What we demonstrate with our analyses is that, overall, the SMEs’ assumed rationality appears to contribute to our models’ statistical and practical significance. This is also consistent with the findings in experimental economics that concur with neoclassical predictions (e.g. List 2004).
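As an illustration of the distinction drawn above, the sketch below computes three summary measures from a single SME's sequence of service pain values: the temporal average (the rational measure) and simplified peak and end values of the kind that peak pain and end pain decision rules attend to. It is a sketch under stated assumptions only: the function `pain_summaries`, the input list, and the simplified peak and end definitions are hypothetical and do not reproduce the proportional or extensional measures defined earlier in the paper.

```python
from statistics import mean


def pain_summaries(pain):
    """Summarize one SME's service pain series (oldest observation first).

    Returns the temporal average (rational measure) together with
    simplified peak and end values (heuristic-style measures).
    These simplified definitions are assumptions made for illustration.
    """
    if not pain:
        raise ValueError("the service episode must contain at least one observation")
    return {
        "temporal_average": mean(pain),  # weighs every period equally
        "peak": max(pain),               # worst single period
        "end": pain[-1],                 # most recent period
    }


# Hypothetical weekly pain values for one SME.
print(pain_summaries([1, 4, 1]))
```

Two episodes with the same temporal average can differ sharply in their peak or end values, which is why the rational and heuristic measures can carry different predictive information.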

Regarding the heuristic service pain evaluation techniques, our findings are consistent with the somatic marker hypothesis in neuroeconomics (Bechara & Damasio 2005). The biasing nature of intuitive decision rules might cause a somatic state which draws the SME’s attention to the service quality issues and calls for judgment, which might be carried out in a relatively rational way. In this vein, while the proportional peak pain draws the SME’s attention to the corresponding service issue, the SME’s information systems may play the role of a working memory that is necessary for coherent analytical reasoning after the somatic marker operates. Being in a somatic state caused by the peak pain, the organization carries out the subsequent evaluation more sensitively. It should be noted, however, that all the intuitive service evaluation measures in the models pertain to extensional target evaluation. That is, they appreciate the service utility and probability, and hence are viewed as logical rules of judgment (Kahneman & Frederick 2002).

The highlighted role of the rationality assumption in our findings can be explained from the behavioral organizational economics standpoint. First, it can be argued that neoclassical organizational decision-making is a special case of bounded rationality (Sontheimer 2006); individuals’ decisions are satisficing rather than optimizing due to limitations on information, analytical processing capacity, and time. Thus, individuals may make optimizing decisions in a situation void of these limitations. For example, the service quality information might be logged by the SME’s information systems. Accordingly, the necessary information is not low in accessibility, which waives the need to employ any intuitive heuristic. Moreover, the analytical processing capacity of a potential group of decision makers in an SME outperforms an individual’s (Oliva & Watson 2009). Lastly, it can be assumed that the SME spends enough time to contemplate ending a B2B relationship with a major service company in an oligopoly; a decision that could ultimately affect the SME’s end customers. Each of these speculations deserves subtler investigation that could be undertaken by qualitative studies, which is beyond the scope of this paper.

6.2. Implications for Predictive Analytics and Churn

The hybrid deductive-inductive approach to building predictive models for churn makes this research one of the first B2B churn studies where service quality is shown to be an important determinant of customer attrition, especially for those customers with whom the service provider has not yet established a deep business relationship. This is an important finding for B2B operations, where the number of potential service providers is smaller than in B2C, making the ‘quality-churn’ connection more difficult to build on. Interestingly, the few B2B churn studies that consider service quality features (e.g. Chen et al. 2014) do not find them among the top ranked predictors. However, the high quality and longitudinal nature of the data used in this study present a compelling case for the importance of service quality in B2B operations.

Furthermore, this paper takes a unique approach to discovering churners and pinpointing their churn dates in a non-contractual environment. This makes the present paper one of the first non-contractual churn studies where subjects have different prediction periods rather than a predetermined prediction period.

6.3. Future work

Future research directions for this work concern both behavioral and technical research. A behavioral research direction is on boundedly rational decision rules and their implementation. Behavioral researchers can suggest testing new heuristic decision rules in organizational decision-making. This is especially important considering the highlighted role of somatic states caused by boundedly rational decision rules.

As consumer behavior data are becoming available, technical researchers can employ new algorithms to investigate and apply different ideas in behavioral economics. A future research direction for the present study is whether we can strengthen the predictive power by applying the notion of heuristics orchestration from the Fast and Frugal heuristics research (Gigerenzer & Selten 2002). Specifically, is it possible that an SME follows one (or several) boundedly rational decision rules orchestrated on a combination of SQIs? Here, state-of-the-art Pareto optimal points query processing algorithms can help us gain insights into orchestration mechanisms in behavioral economics, and introduce their application in theory-driven predictive analytics.

In conclusion, we have identified the potential in bringing ideas from behavioral economics into predictive analytics in problems where extensive behavioral data is available, such as in IS and Marketing studies that exploit big data and algorithms to derive insights into user behavior. Our work here, using theory-driven predictive analytics and feature engineering, is one approach that we demonstrate to be potentially promising.

REFERENCES

Almana, A. M., Aksoy, M. S., & Alzahrani, R. (2014). A survey on data mining techniques in customer churn analysis for telecom industry. Journal of Engineering Research and Applications, 4(5), 165-171.

Ariely, D. (1998). Combining experiences over time: The effects of duration, intensity changes and online measurements on retrospective pain evaluations. Journal of Behavioral Decision Making, 11(1), 19-45.

Ariely, D., & Carmon, Z. (2003). Summary assessment of experiences: The whole is different from the sum of its parts. Russell Sage Foundation.

Ariely, D., & Loewenstein, G. (2000). When does duration matter in judgment and decision making? Journal of Experimental Psychology: General, 129(4), 508.

Bechara, A., & Damasio, A. R. (2005). The somatic marker hypothesis: A neural theory of economic decision. Games and Economic Behavior, 52(2), 336-372.

Bentham, J. (1789). The principles of morals and legislation. Dover Publications.

Bolton, R. N., Lemon, K. N., & Bramlett, M. D. (2006). The effect of service experiences over time on a supplier's retention of business customers. Management Science, 52(12), 1811-1823.

Buckinx, W., & Van den Poel, D. (2005). Customer base analysis: partial defection of behaviorally loyal clients in a noncontractual FMCG retail setting. European Journal of Operational Research, 164(1), 252-268.

Burez, J., & Van den Poel, D. (2007). CRM at a pay-TV company: Using analytical models to reduce customer attrition by targeted marketing for subscription services. Expert Systems with Applications, 32(2), 277-288.

Camerer, C. F., & Loewenstein, G. (2004). Behavioral economics: Past, present, future. Advances in behavioral economics, 3.

Chen, K., Hu, Y. H., & Hsieh, Y. C. (2014). Predicting customer churn from valuable B2B customers in the logistics industry: a case study. Information Systems and e-Business Management, 1-20.

Culver, M., Kun, D., & Scott, S. (2006, December). Active learning to maximize area under the ROC curve. In Data Mining, 2006. ICDM'06. Sixth International Conference on (pp. 149-158). IEEE.

Damasio, A. (1994, 2005). Descartes' error: Emotion, reason, and the human brain. Putnam Publishing, Penguin Books.

Fader, P., Hardie, B., & Shang, J. (2010). Customer-base analysis in a discrete-time non-contractual setting. Marketing Science, 29(6), 1086-1108.

Fredrickson, B. L., & Kahneman, D. (1993). Duration neglect in retrospective evaluations of affective episodes. Journal of Personality and Social Psychology, 65(1), 45.

Friedman, M. (1953). The methodology of positive economics. Essays in positive economics. University of Chicago Press, 3-43.

Friedman, M. (1978). Whitman’s Interview with Milton Friedman. Economically speaking. PBS stations.

George, J. F., Duffy, K., & Ahuja, M. (2000). Countering the anchoring and adjustment bias with decision support systems. Decision Support Systems, 29(2), 195-206.

Gigerenzer, G., & Selten, R. (2002). Rethinking rationality. Bounded rationality: The adaptive toolbox, 1-12.

Glady, N., Baesens, B., & Croux, C. (2009). Modeling churn using customer lifetime value. European Journal of Operational Research, 197(1), 402-411.

Goes, P. B. (2013). Editor's comments: information systems research and behavioral economics. MIS Quarterly, 37(3), iii-viii.

Hair Jr, J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis. 7th Edition.

Hosseini, H. (2003). The arrival of behavioral economics: from Michigan, or the Carnegie School in the 1950s and the early 1960s? Journal of Socio-Economics, 32(4), 391-409.

Jahromi, A. T., Stakhovych, S., & Ewing, M. (2014). Managing B2B customer churn, retention and profitability. Industrial Marketing Management, 43(7), 1258-1268.

Jevons, W. S. (1871). The theory of political economy. Reprinted by Palgrave Macmillan 2013.

Kahneman, D. (2000). Evaluation by moments: Past and future. Choices, values, and frames, 693-708.

Kahneman, D., Diener, E., & Schwarz, N. (2003). Well-being: The foundations of hedonic psychology. Russell Sage Foundation Publications.

Kahneman, D., & Frederick, S. (2002). Representativeness revisited: Attribute substitution in intuitive judgment. Heuristics and biases: The psychology of intuitive judgment, 49-81.

Kahneman, D., Fredrickson, B. L., Schreiber, C. A., & Redelmeier, D. A. (1993). When more pain is preferred to less: Adding a better end. Psychological Science, 4(6), 401-405.

Kahneman, D., & Tversky, A. (Eds.). (2000). Choices, values, and frames. Cambridge University Press.

Kahneman, D., Wakker, P. P., & Sarin, R. (1997). Back to Bentham? Explorations of experienced utility. The Quarterly Journal of Economics, 112(2), 375-406.

Katsikopoulos, K., & Gigerenzer, G. (2013). Behavioral operations management: A blind spot and a research program. Journal of Supply Chain Management, 49(1), 3-7.

Kimball, R., & Caserta, J. (2004). The data warehouse ETL toolkit. John Wiley & Sons.

Koufteros, X., Droge, C., Heim, G., Massad, N., & Vickery, S. K. (2014). Encounter Satisfaction in E-tailing: Are the Relationships of Order Fulfillment Service Quality with its Antecedents and Consequences Moderated by Historical Satisfaction? Decision Sciences, 45(1), 5-48.

Kuhn, T. S. (1961). The function of measurement in modern physical science. Isis, 161-193.

Lee, H., Lee, Y., Cho, H., Im, K., & Kim, Y. S. (2011). Mining churning behaviors and developing retention strategies based on a partial least squares model. Decision Support Systems, 52(1), 207-216.

Lemmens, A., & Gupta, S. (2013). Managing churn to maximize profits. Harvard Business School.

List, J. A. (2004). Neoclassical theory versus prospect theory: Evidence from the marketplace. Econometrica, 72(2), 615-625.

Martens, D., Provost, F., Clark, J., & Junqué de Fortuny, E. (2016). Mining massive fine-grained behavior data to improve predictive analytics. MIS Quarterly, 40(4), 1-20.

Martens, D., Vanthienen, J., Verbeke, W., & Baesens, B. (2011). Performance of classification models from a user perspective. Decision Support Systems, 51(4), 782-793.

Miron-Shatz, T., Stone, A., & Kahneman, D. (2009). Memories of yesterday’s emotions: Does the valence of experience affect the memory-experience gap? Emotion, 9(6), 885.

Moeyersoms, J., & Martens, D. (2015). Including high-cardinality attributes in predictive models: A case study in churn prediction in the energy sector. Decision Support Systems, 72, 72-81.

Naumann, E., & Jackson Jr, D. W. (1999). One more time: how do you satisfy customers? Business Horizons, 42(3), 71-76.

Neslin, S. A., Gupta, S., Kamakura, W., Lu, J., & Mason, C. H. (2006). Defection detection: Measuring and understanding the predictive accuracy of customer churn models. Journal of Marketing Research, 43(2).

Nie, G., Rowe, W., Zhang, L., Tian, Y., & Shi, Y. (2011). Credit card churn forecasting by logistic regression and decision tree. Expert Systems with Applications, 38(12), 15273-15285.

Oliva, R., & Watson, N. (2009). Managing functional biases in organizational forecasts: A case study of consensus forecasting in supply chain planning. Production and Operations Management, 18(2), 138-151.

Padmanabhan, B., Hevner, A., Cuenco, M., & Shi, C. (2011). From information to operations: Service quality and customer retention. ACM Transactions on Management Information Systems, 2(4).

Provost, F., & Fawcett, T. (2013). Data Science for Business. O’Reilly.

Rauyruen, P., & Miller, K. E. (2007). Relationship quality as a predictor of B2B customer loyalty. Journal of Business Research, 60(1), 21-31.

Redelmeier, D. A., & Kahneman, D. (1996). Patients' memories of painful medical treatments: real-time and retrospective evaluations of two minimally invasive procedures. Pain, 66(1), 3-8.

Rieskamp, J., Hertwig, R., & Todd, P. M. (2006). Bounded rationality. Handbook of Contemporary Behavioral Economics, Armonk, NY: ME Sharpe, 218-236.

Rust, R., Kumar, V., & Venkatesan, R. (2011). Will the frog change into a prince? Predicting future customer profitability. International Journal of Research in Marketing, 28(4), 281-294.

Saradhi, V. V., & Palshikar, G. K. (2011). Employee churn prediction. Expert Systems with Applications, 38(3), 1999-2006.

Shmueli, G. (2010). To explain or to predict? Statistical Science, 25(3), 289-310.

Shmueli, G., & Koppius, O. R. (2011). Predictive analytics in information systems research. MIS Quarterly, 35(3), 553-572.

Shiffman, S., Stone, A. A., & Hufford, M. R. (2008). Ecological momentary assessment. Annu. Rev. Clin. Psychol., 4, 1-32.

Shugan, S. M. (2009). Commentary-Relevancy Is Robust Prediction, Not Alleged Realism. Marketing Science, 28(5), 991-998.

Simon, H. A. (1979). Rational decision making in business organizations. The American Economic Review, 69(4), 493-513.

Simon, H. A. (1987). Behavioural economics. The new Palgrave: A dictionary of economics, 1, 221-24.

Simon, H. (1997a). Administrative Behavior. 4th Edition, Free Press.

Sontheimer, K. (2006). Behavioral versus neoclassical economics. Handbook of Contemporary Behavioral Economics: Foundations and Developments, 237.

Tsai, C. F., & Lu, Y. H. (2009). Customer churn prediction by hybrid neural networks. Expert Systems with Applications, 36(10), 12547-12553.

Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5(2), 207-232.

Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124-1131.

Vafeiadis, T., Diamantaras, K. I., Sarigiannidis, G., & Chatzisavvas, K. C. (2015). A comparison of machine learning techniques for customer churn prediction. Simulation Modelling Practice & Theory, 55, 1-9.

Van den Brink, M., Bandell-Hoekstra, E. N. G., & Abu-Saad, H. H. (2001). The occurrence of recall bias in pediatric headache: a comparison of questionnaire and diary data. Headache: The Journal of Head and Face Pain, 41(1), 11-20.

Van den Poel, D., & Lariviere, B. (2004). Customer attrition analysis for financial services using proportional hazard models. European Journal of Operational Research, 157(1), 196-217.

Verbeke, W., Dejaeger, K., Martens, D., Hur, J., & Baesens, B. (2012). New insights into churn prediction in the telecommunication sector: A profit driven data mining approach. European Journal of Operational Research, 218(1), 211-229.

Wiersema, F. (2013). The B2B Agenda: The current state of B2B marketing and a look ahead. Industrial Marketing Management, 42, 470-488.

Yu, X., Guo, S., Guo, J., & Huang, X. (2011). An extended support vector machine forecasting framework for customer churn in e-commerce. Expert Systems with Applications, 38(3), 1425-1430.

Ziliak, S. T., & McCloskey, D. N. (2008). The cult of statistical significance: How the standard error costs us jobs, justice and lives. University of Michigan Press.

APPENDIX A

The first 26 H(euristic) variables are constructed based on Table 4, since Measure 2 has resulted in more potentially significant statistics than Measure 3. To illustrate, the decision rules whose Measure 2 is equal to or greater than 1.2 also cover all such important heuristics per Measure 3. If a specific [DR, SQI] has two statistics (one for 𝑝𝑡 and one for 𝑝̅𝑡) both exceeding 1.2, we pick the greater as the representative; the exception is [DR3, SQI8], for which we include two variables as they are the only measures greater than 1.3.

The predictive ETL computes two predictors for each of these cells in Table 4; one for Measure 2 and one for Measure 3 (Section 4.1). The first predictor represents the number of times that the conditions for the relevant decision rule held in the SME’s service episode (denoted by an odd index, e.g. 𝐻1), and the second predictor is a binary flag showing whether that condition held at least once in the SME’s service episode (denoted by an even index, e.g. 𝐻2). The last three decision rule variables (i.e. 𝐻27, 𝐻28, and 𝐻29) are binary flags that address the highlighted cells in Table 3 (Measure 1) whose statistics are greater than 1.2. To illustrate, 𝐻27 is equal to one if the last proportional pain related to 𝑆𝑄𝐼1 has not been experienced by the SME before; i.e. the 𝑆𝑄𝐼1 peak pain decision rule holds at the end of the SME’s service episode.
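The count and flag predictors described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the predictive ETL used in the study: the function names, the boolean input `held` (one entry per period, true where a decision rule's condition held), and the simplified end-of-episode peak test are all hypothetical. In particular, the sketch treats the peak pain rule as holding at the end when the last pain value strictly exceeds every earlier value, and it assumes that a single-observation episode yields a flag of zero.

```python
def rule_count_and_flag(held):
    """Return (count, flag) for one decision rule over one service episode.

    count: number of periods in which the rule's condition held
           (the odd-indexed variables, e.g. H1).
    flag:  1 if the condition held at least once, else 0
           (the even-indexed variables, e.g. H2).
    """
    count = sum(1 for h in held if h)
    return count, int(count > 0)


def peak_at_end_flag(pain):
    """Return 1 if the last pain value was never reached before, else 0.

    Assumption: with fewer than two observations there is no earlier
    experience to compare against, so the flag is set to 0.
    """
    if len(pain) < 2:
        return 0
    return int(pain[-1] > max(pain[:-1]))


# Hypothetical episode: the rule's condition held in two of four periods.
print(rule_count_and_flag([False, True, True, False]))
# Hypothetical pain series whose last value is a new maximum.
print(peak_at_end_flag([0.1, 0.3, 0.5]))
```

Keeping the count and the flag as separate predictors lets a model distinguish between how often a heuristic condition was triggered and whether it was triggered at all.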

Table A1. Heuristic Decision Rule Variables

For each SME, the variables are extracted based on that specific SME’s whole episode or last 6 weeks.

Variable: Heuristic Decision Rule

Availability Heuristics

𝐻1: Number of times that the availability heuristic (incidental frequency) holds with respect to SQI6.
𝐻2: Has the condition for the availability heuristic (incidental frequency) held with respect to SQI6 at least once?
𝐻3: Number of times that the availability heuristic (temporal frequency) holds with respect to SQI6.
𝐻4: Has the condition for the availability heuristic (temporal frequency) held with respect to SQI6 at least once?
𝐻5: Number of times that the availability heuristic (incidental frequency) holds with respect to SQI7.
𝐻6: Has the condition for the availability heuristic (incidental frequency) held with respect to SQI7 at least once?
𝐻7: Number of times that the availability heuristic (temporal frequency) holds with respect to SQI7.
𝐻8: Has the condition for the availability heuristic (temporal frequency) held with respect to SQI7 at least once?

Representativeness Heuristics

𝐻9: Number of times that the end pain heuristic holds with respect to SQI6.
𝐻10: Has the condition for the end pain heuristic held with respect to SQI6 at least once?
𝐻11: Number of times that the end pain heuristic holds with respect to SQI7.
𝐻12: Has the condition for the end pain heuristic held with respect to SQI7 at least once?
𝐻13: Number of times that the extensional end pain heuristic (with 𝑝̅𝑡) holds with respect to SQI2.
𝐻14: Has the condition for the extensional end pain heuristic (with 𝑝̅𝑡) held with respect to SQI2 at least once?
𝐻15: Number of times that the extensional end pain heuristic (with 𝑝̅𝑡) holds with respect to SQI6.
𝐻16: Has the condition for the extensional end pain heuristic (with 𝑝̅𝑡) held with respect to SQI6 at least once?
𝐻17: Number of times that the extensional end pain heuristic (with 𝑝̅𝑡) holds with respect to SQI7.
𝐻18: Has the condition for the extensional end pain heuristic (with 𝑝̅𝑡) held with respect to SQI7 at least once?
𝐻19: Number of times that the peak pain heuristic (with 𝑝̅𝑡) holds with respect to SQI6.
𝐻20: Has the condition for the peak pain heuristic (with 𝑝̅𝑡) held with respect to SQI6 at least once?
𝐻21: Number of times that the peak pain heuristic (with 𝑝̅𝑡) holds with respect to SQI7.
𝐻22: Has the condition for the peak pain heuristic (with 𝑝̅𝑡) held with respect to SQI7 at least once?
𝐻23: Number of times that the peak pain heuristic (with 𝑝𝑡) holds with respect to SQI8.
𝐻24: Has the condition for the peak pain heuristic (with 𝑝𝑡) held with respect to SQI8 at least once?
𝐻25: Number of times that the peak pain heuristic (with 𝑝̅𝑡) holds with respect to SQI8.
𝐻26: Has the condition for the peak pain heuristic (with 𝑝̅𝑡) held with respect to SQI8 at least once?
𝐻27: Has the condition for the peak pain heuristic (with 𝑝̅𝑡) held with respect to SQI1 at the end of the episode?
𝐻28: Has the condition for the peak pain heuristic (with 𝑝̅𝑡) held with respect to SQI2 at the end of the episode?
𝐻29: Has the condition for the peak pain heuristic (with 𝑝̅𝑡) held with respect to SQI5 at the end of the episode?

Highlights

• The paper presents an approach that integrates behavioral economics and predictive analytics in a B2B churn modeling context.

• We present evidence that both rationality and bounded-rationality assumptions play significant roles in predicting organizational decisions on churn.

• Unlike many studies at the individual level we do not find strong evidence for human decision making biases or heuristics at play here at the organization level.