INCREMENTAL VALUE MODELING

Behram Hansotia and Brad Rukstales

BEHRAM HANSOTIA is cofounder, president, and CEO of InfoWorks (a member of the Rapp Collins Marketing Services Network). He has a Ph.D. in Management Science from the University of Illinois at Urbana-Champaign.

BRAD RUKSTALES is president of the Customer Asset Consulting Group, Inc. and has an MBA from the University of Michigan. This article was co-authored while he was Sr. VP and COO of InfoWorks.

INTRODUCTION

In many one-to-one marketing situations the marketer is interested in identifying prospects/customers (hereinafter referred to simply as customers) she can profitably influence through a direct marketing contact. Since direct marketing is often just one of several ongoing marketing/advertising programs, many customers make purchases from the company whether they receive direct communications or not. The problem at hand, then, is which individuals the company should contact so that the return on the direct marketing investment exceeds the company's hurdle rate for such investments. Rather than maintain a separate hurdle rate for justifying marketing investments, many companies use the hurdle rate applied to all capital investment decisions, typically the firm's cost of capital. The situation described above is endemic to many CRM (Customer Relationship Management) modeling applications:

● A bank is interested in selling home equity lines of credit (HELOC) to its mortgage customers. It advertises this product in the local newspapers as well as on radio, and also markets it directly, via mail, to its mortgage customers. How should it select customers for the direct marketing contact?

● A major retailer is planning a back-to-school mailing. It will also be running newspaper inserts and TV and radio advertisements. It can identify 8 million customer households with kids under 18 on its database. Which households should it contact?
● A retailer believes it can proactively reduce the churn rate of its high-value customers through targeted offers. Using survival analysis, it has modeled the median time to next purchase. For its most profitable customers, it plans to send out attractive offers to visit the store again if the customer's time since last purchase exceeds the predicted median time to next purchase by two weeks (i.e., the customer is predicted to be late). Which late customers should it select for the offer?

In these three examples, customers of their own volition, or as a result of general media advertising, would have a certain probability of making a purchase. The objective of the direct communication is to enhance the contacted party's likelihood of purchase, or the amount of purchase, or to get them to buy products with higher margins, so that the net incremental expected profit (or ROI) after marketing costs is greater than zero (or greater than the hurdle rate). This article describes tree- and regression-based approaches to developing decision rules that can be applied, customer by customer, in making marketing investment decisions. We provide formulas for profit-based break-even, incremental break-even, and ROI- and LTV-based incremental break-even rules. We show how these rules may be used to optimally match marketing treatments to customer profiles, and then discuss how the parameters of the decision rules, namely the response likelihoods, may be estimated via logistic regression and tree-based approaches such as CHAID and CART. A numeric example using real data illustrates the use of logistic regression and a modified CHAID to estimate incremental response rates.

BREAK-EVEN DECISION RULES

The break-even decision rule is a well-known economic rule for selecting investments when there are no resource constraints. It is particularly meaningful under conditions of uncertainty such as those encountered in direct marketing. The break-even decision rule states: Select all investments whose net expected profits are greater than zero. In the context of a direct marketing program, this rule becomes: Select all customers whose net expected profit contribution from the marketing program is positive. Assume:

p_i is the likelihood of customer i responding;
c is the cost per contact;
v_i is the resulting profit if customer i responds.

Then customer i should be selected if

p_i v_i − c > 0, or p_i > c/v_i

When p_i = p_i* = c/v_i, we refer to p_i* as the break-even response likelihood.

INCREMENTAL BREAK-EVEN DECISION RULE

As mentioned above, many customers would make purchases on their own, with or without a marketing stimulus. The investment problem then is identifying customers for whom the direct marketing treatment has a sufficiently large impact, either on the response likelihood or on the resulting profit, or both, for the marketing investment to be justified. Assume:

p_i^T = response likelihood of customer i under the marketing treatment;
p_i^C = response likelihood of customer i under control (no marketing treatment);
v_i^T = profit from customer i, under treatment, given the customer responds;
v_i^C = profit from customer i, under control, given the customer responds; and
c = cost per contact for the marketing treatment.
Then, under the incremental break-even decision rule, select customer i if

(p_i^T v_i^T − c) − p_i^C v_i^C > 0, or p_i^T v_i^T − p_i^C v_i^C − c > 0

That is, select customer i if his net expected incremental profit is greater than zero. If response results in the same profit whether a customer is treated or not, then v_i^T = v_i^C = v_i and the decision rule becomes

(p_i^T − p_i^C) = Δp_i > c/v_i

Here, Δp_i is the incremental response likelihood, and the incremental break-even decision rule becomes identical to the basic break-even rule.

ROI-BASED INCREMENTAL BREAK-EVEN DECISION RULE

The above break-even decision rule can easily be extended to select investments based on a minimum hurdle ROI. Let R* be the minimum hurdle rate; then the customer selection rule becomes: Select customer i if

(p_i^T v_i^T − c − p_i^C v_i^C)/c ≥ R*

If the marketing investment is tied up for n months before the profit is realized and R* is expressed as a percent per year, then the selection rule becomes:

(p_i^T v_i^T − p_i^C v_i^C)/c ≥ 1 + R* (n/12)

LTV-BASED INCREMENTAL BREAK-EVEN DECISION RULE

If the response results in a contractual agreement, such as an insurance policy, a credit card, or a phone service, the profit metric needs to incorporate a longer-term perspective. The recommended profit metric in this instance is the net present value (NPV) of the free cash flows attributed to the responding customer. This is often referred to as the customer's lifetime or long-term value (LTV). Let:

L_i^T be the ith customer's LTV under treatment;
L_i^C be the ith customer's LTV under control.

Then the LTV-based incremental break-even rule states: Select customer i if

p_i^T L_i^T − p_i^C L_i^C − c > 0

That is, the expected LTV under treatment should exceed the expected LTV under control by more than the cost of contact. If the marketing program is targeted to existing customers, the LTV actually refers to the remaining value of the customer from the current point in time. This is particularly important since attrition likelihoods, which are a key input to the LTV calculations, generally vary with customer tenure. Note that at break-even, when the incremental net expected LTV is zero, the last customer selected for marketing is expected to generate an internal rate of return (IRR) equal to the discount rate used in the LTV calculations. (The IRR is the discount rate at which the NPV equals zero.) All customers selected ahead of this customer are expected to generate internal rates of return in excess of the discount rate. If the firm's cost of capital is used as the discount rate, then this rule implies that the rate of return on the marketing investment is expected to exceed the firm's cost of capital. Since a firm's value tends to increase when its investments generate rates of return that exceed its cost of capital, this decision rule ensures that marketing investments do their part in enhancing firm value.
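To make these rules concrete, here is a minimal sketch in Python (ours, not the authors'; all inputs are illustrative per-customer estimates, and the function names are our own):

def select_incremental(p_t, p_c, v_t, v_c, c):
    """Incremental break-even rule: treat if net expected incremental profit > 0."""
    return p_t * v_t - p_c * v_c - c > 0

def select_roi(p_t, p_c, v_t, v_c, c, r_star, n_months):
    """ROI rule: the n-month gross return must clear the annual hurdle rate r_star."""
    return (p_t * v_t - p_c * v_c) / c >= 1.0 + r_star * n_months / 12.0

def select_ltv(p_t, p_c, ltv_t, ltv_c, c):
    """LTV rule: expected LTV under treatment must exceed control by more than cost."""
    return p_t * ltv_t - p_c * ltv_c - c > 0

# Example: a 5-point lift in response likelihood on a $100 profit, $2 contact cost.
print(select_incremental(p_t=0.30, p_c=0.25, v_t=100.0, v_c=100.0, c=2.0))  # True: 5 > 2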
OPTIMAL TREATMENT MATCHING

The marketing treatment problem can easily be extended so that, instead of simply selecting customers for a single treatment, the problem includes identifying the optimal treatment (if any) for each customer. Assume there are N competing treatments. Let:

p_ij be the likelihood of customer i responding to treatment j, with j = 1, . . . , N;
p_i0 be the likelihood of customer i responding under no treatment, i.e., under the control treatment;
v_ij be the value of customer i, given she responds to treatment j;
v_i0 be the value of customer i, given she responds under no treatment, i.e., under the control treatment;
c_j be the cost of marketing treatment j.

Then select customer i for treatment k if

p_ik v_ik − p_i0 v_i0 − c_k > 0

and

p_ik v_ik − p_i0 v_i0 − c_k = Max_j [p_ij v_ij − p_i0 v_i0 − c_j]

If there is no treatment for which the incremental net expected value is positive, then customer i is not selected for any treatment.
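A sketch of the matching rule (ours; the lists hold per-treatment estimates for a single customer, and all values are illustrative):

def best_treatment(p, v, p0, v0, cost):
    """Pick the treatment k with the largest positive incremental net expected
    value p[k]*v[k] - p0*v0 - cost[k]; return None if no treatment clears zero."""
    base = p0 * v0  # expected profit under the control (no treatment)
    gains = [p[j] * v[j] - base - cost[j] for j in range(len(p))]
    k = max(range(len(gains)), key=gains.__getitem__)
    return k if gains[k] > 0 else None

# Customer with two candidate treatments: treatment 0 wins with a gain of 3.0.
print(best_treatment(p=[0.30, 0.26], v=[100.0, 110.0], p0=0.25, v0=100.0,
                     cost=[2.0, 1.5]))  # -> 0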
APPLYING THE DECISION RULES

In applying the decision rules, we need customer-level estimates of the response likelihoods and value/profit under the marketing treatment and under the control. These can be obtained through modeling: either regression analysis or recursive partitioning algorithms like CART/CHAID may be used to estimate these parameters. Let:

x_i^T be the vector of predictor variables for customer i in the treatment model;
x_i^C be the vector of predictor variables for customer i in the control model.

Then p_i^T and p_i^C may be estimated by the following logistic regression models:

ln[p_i^T/(1 − p_i^T)] = B_T x_i^T, or p_i^T = e^{B_T x_i^T}/(1 + e^{B_T x_i^T}),

where B_T is the vector of partial logistic regression coefficients for the treatment model, and

ln[p_i^C/(1 − p_i^C)] = B_C x_i^C, or p_i^C = e^{B_C x_i^C}/(1 + e^{B_C x_i^C}),

where B_C is the vector of partial logistic regression coefficients for the control model.

If exploratory data analysis indicates that the variables in x^T and x^C are the same, the two equations may be estimated jointly, through a single equation, by combining the treatment and control samples. Let x_i be the vector of independent variables for customer i, and let

z_i = 1 if observation i is in the treatment sample;
z_i = 0 if observation i is in the control sample.

Then

ln[p_i/(1 − p_i)] = B_T z_i x_i + B_C (1 − z_i) x_i

In the above equations, the intercept term B_0 can easily be incorporated by writing x as the column vector (1, x_1, x_2, . . . , x_n) and B as the row vector (B_0, B_1, B_2, . . . , B_n). Similarly, value may be estimated by a conditional ordinary least squares (OLS) regression model built on the responders. If we define value as cash flow, then other customer-level operating costs would have to be subtracted from the predicted revenue to obtain the customer-level value estimate. Let:

v̂_i^T be the estimated (expected) value of customer i under treatment;
v̂_i^C be the estimated (expected) value of customer i under control;
x_i^T and x_i^C be the sets of predictor variables for the treatment and control regression equations, respectively;
z_i ∈ {0, 1} be defined as earlier.
Then

v̂_i^T = B_T x_i^T
v̂_i^C = B_C x_i^C

and when x^T = x^C = x,

v̂_i = B_T z_i x_i + B_C (1 − z_i) x_i
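As an illustration of the two-model approach (our sketch; scikit-learn postdates the article, and the synthetic arrays merely stand in for real treatment and control samples):

import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins: X holds predictors (e.g., RFM variables), y is 1/0 response.
X_t = rng.normal(size=(1000, 4)); y_t = rng.integers(0, 2, 1000)  # treatment sample
X_c = rng.normal(size=(1000, 4)); y_c = rng.integers(0, 2, 1000)  # control sample

# Separate logistic response models for treatment and control.
model_t = LogisticRegression().fit(X_t, y_t)
model_c = LogisticRegression().fit(X_c, y_c)

# Score new customer profiles under both models; the difference is delta-p.
X_new = rng.normal(size=(5, 4))
delta_p = model_t.predict_proba(X_new)[:, 1] - model_c.predict_proba(X_new)[:, 1]

# Conditional value: OLS built on the treatment-group responders only.
sales_t = rng.gamma(2.0, 50.0, size=int(y_t.sum()))  # stand-in purchase amounts
value_model_t = LinearRegression().fit(X_t[y_t == 1], sales_t)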
CORRECTING FOR MODEL BIASES THROUGH SMOOTHING

The estimated (expected) value of a model's criterion, or dependent, variable (for a given set of values of the predictors) is often referred to as the customer score. With direct marketing models, for a group of customers there is often a difference between the mean score and the mean observed value. The relationship between scores and observed values is captured in a Gains Table, which documents model performance on a holdout sample. Here customers are grouped into deciles, or demi-deciles, based on their scores, and for each group observed behaviors such as response rate, order size, etc., are noted. Typically, the higher the score, the higher the observed behavior. This relationship between scores and observed values is used to estimate the expected behavior, rather than using the score itself. By fitting a polynomial model, the relationship between the model scores and observed response rates can be smoothed, and better estimates of the response rates can be obtained (see Figure 1).
FIGURE 1. Smoothed Relationship between Scores and Observed Response Rates
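The smoothing fit itself is straightforward; a minimal sketch (ours, using NumPy; the article's squared and cubed re-expressions of the score correspond to a degree-3 polynomial):

import numpy as np

def fit_score_smoother(scores, responses, n_groups=50, degree=3):
    """Group customers by model score, then regress observed group response
    rates on mean group scores so that smoothed scores track observed
    response more closely than the raw scores do."""
    order = np.argsort(scores)[::-1]                    # rank customers by score
    groups = np.array_split(order, n_groups)            # equal-size score groups
    mean_score = np.array([scores[g].mean() for g in groups])
    obs_rate = np.array([responses[g].mean() for g in groups])
    coeffs = np.polyfit(mean_score, obs_rate, degree)   # cubic smoothing model
    return np.poly1d(coeffs)

# Usage: smoother = fit_score_smoother(scores, responses); p_hat = smoother(scores)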
PARAMETER ESTIMATES VIA CART/CHAID

Separate trees can be developed off the treatment and control samples to estimate the response likelihoods and values under treatment and control. Four trees in total need to be constructed: two response trees from the treatment and control samples, and two value (revenue) trees from the respondents in the two samples. A customer, based on his or her profile, will then end up in one of the mutually exclusive terminal nodes of each tree. Since each node has a mean value for the target variable, each customer's response likelihood and conditional value can be estimated under the marketing treatment and under the control.

Tree-based estimates, however, are sensitive to sampling variance. Since the tree algorithm uses recursive partitioning, the sample sizes get progressively smaller as one moves to lower partitions of the tree. One approach that has proved valuable in these situations is K-fold validation. The K-fold validation technique is related to the bootstrap approach (Efron & Tibshirani, 1993) for accurately estimating prediction error. Here the entire sample is divided into K equal-sized samples and K trees are built, using K − 1 samples for each tree. Every observation is then placed in a terminal node of each of the K trees, and an average expected value (response likelihood/value), or score, is computed over the K estimates. These scores can then be used to build a Gains Table over all observations. K-fold validation helps ensure that the data is maximally leveraged in estimating target variables. Since the predicted response (score) of each observation is estimated as the average response from K separate trees, this estimate takes into account all the optimal partitions that can be developed from K subsamples. This approach is therefore not unlike the way forecasters develop a forecast as an average of the predictions from different models. Since CART and CHAID require large samples to accurately predict variables of interest, K-fold validation is particularly useful for moderate-sized samples. If one is an ardent fan of tree technology, our recommendation is to use trees to estimate response likelihoods and regression analysis to estimate value, since the sample for estimating value will be considerably smaller, as it consists only of responders.

The CHAID algorithm can be modified to model incremental response rates directly. Instead of determining the split that maximizes the difference in response rates, at each node we identify the split that maximizes the difference in the incremental response rate. This is achieved by including both the treatment and the control samples in the analysis. At each candidate split, an incremental response rate (Δp) is computed for each branch, and the split that results in the largest difference in Δp, denoted Δ(Δp), is selected. This is shown in Figure 2: since split A results in a greater difference in the incremental response rate, it would be preferred over split B. The tree algorithm in this case is slightly modified, so that instead of selecting the split with the greatest difference (or most significant chi-square) in response rates, the split with the greatest difference in incremental response rates is selected.

FIGURE 2. Tree Methodology Extended to Incremental Response Rates
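A minimal sketch of the modified split criterion (our rendering of the idea; a production CHAID would add significance testing, multi-way splits, and minimum node sizes):

import numpy as np

def delta_p(treated, responded, mask):
    """Incremental response rate in a node: treatment rate minus control rate."""
    t = mask & (treated == 1)
    c = mask & (treated == 0)
    if t.sum() == 0 or c.sum() == 0:
        return None  # node lacks treated or control observations
    return responded[t].mean() - responded[c].mean()

def best_incremental_split(x, treated, responded):
    """Scan binary splits on one predictor and keep the threshold with the
    largest gap in incremental response rates between branches, delta(delta-p)."""
    best_thr, best_gap = None, -1.0
    for thr in np.unique(x)[:-1]:
        left, right = x <= thr, x > thr
        dp_l = delta_p(treated, responded, left)
        dp_r = delta_p(treated, responded, right)
        if dp_l is None or dp_r is None:
            continue
        gap = abs(dp_l - dp_r)
        if gap > best_gap:
            best_thr, best_gap = thr, gap
    return best_thr, best_gap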
EMPIRICAL STUDY

An empirical examination of the tree- and logistic regression-based approaches was conducted. The focus of this study was on incremental response only, without taking expected revenue into account. The data come from a holiday promotion run by a major national retailer. In addition to the roll-out, a random cross-section of the customer base was mailed the promotion (treatment group). An additional random sample was selected and not mailed (control group). These random samples were used to conduct the analysis. In many cases, companies may not agree to mail random samples just to collect representative data for building models. In such cases it may still be possible to develop models by stratifying customers into good and bad. We would then oversample the good customers (to keep the client happy) and undersample the bad ones, and then weight the two groups inversely during model development. In this instance, the client did not object to random samples, since she felt that the best customers would be more likely to shop on their own, and it was the average, or slightly above average, customer who had to be further prodded to shop at the store.

Prior purchase behavior information was also extracted as of the time of the mail file selection. This included typical RFM variables, such as number of trips (over different time periods), date of last purchase, and total dollars spent (over different time periods). Also included were variables relating to a customer's holiday purchasing in prior holiday periods. The promotion was $10 off a purchase of $100 or more. There was no customization, other than name and address on the mail piece. At the end of the promotional period, transactions were posted to the database. For customers in the treatment and control groups, total sales during the promotional period were calculated and added to the sample data. Customers with a purchase were coded with a 1, and those with no purchase were coded with a 0 for this analysis. Table 1 shows some basic response information for the treatment and control groups.

TABLE 1. Basic Sample and Response Information

                   Number of    Number of    Response    Average    Total Sales
                   Customers    Responses    Rate        Sales      (000's)
Control Group      141,138      37,250       26.4%       $108.72    $4,050
Treatment Group    141,139      42,470       30.1%       $103.18    $4,382
Difference         1            5,220        3.7%        $(5.54)    $332
SAMPLING METHODOLOGY

To conduct this analysis, the entire sample was divided into a model-building and a validation dataset. To reduce sample bias, the file was sorted by past-12-months revenue, and alternate observations were then randomly sent to the model-building or validation datasets, as recommended in Malthouse (2001). In addition, outliers were removed.
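A sketch of such a split (ours, assuming a pandas DataFrame; the revenue column name is illustrative):

import numpy as np
import pandas as pd

def build_validation_split(df, revenue_col="rev_12m", seed=17):
    """Sort by past-12-months revenue, then within each adjacent pair send one
    observation at random to the model-building set and the other to the
    validation set, so both sets share the same revenue profile."""
    rng = np.random.default_rng(seed)
    idx = df.sort_values(revenue_col).index.to_numpy()
    pairs = idx[: len(idx) // 2 * 2].reshape(-1, 2)  # drop any odd leftover row
    flip = rng.integers(0, 2, size=len(pairs))       # coin flip per adjacent pair
    build = [p[f] for p, f in zip(pairs, flip)]
    valid = [p[1 - f] for p, f in zip(pairs, flip)]
    return df.loc[build], df.loc[valid]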
MODEL RESULTS

The incremental response modeling took place in three steps. Step 1 was to build a logistic regression response model for the control group. Step 2 was to build a similar model for the treatment group. It should be noted that the resulting model scores need to be accurate forecasts of future behavior, since they are not being used merely to rank-order customers for selection: one score will be subtracted from the other to calculate the final ranking score. As such, Steps 1 and 2 also include a process of smoothing the resulting expected probabilities (scores) to more accurately represent the observed probabilities in the model-building sample. Step 3 is the calculation of incremental response.

Once exploratory data analysis was performed on the model-building sample, a logistic regression model was developed for the control group. All variables are significant at the 90% level or above. The resulting model was scored on the same sample. The sample was then rank-ordered by the predicted score and divided into 50 equal-size groups.
For each group, the mean expected score and the observed response probability were calculated. This data set of 50 observations was then used to build a second regression model. The dependent variable was the observed response rate; the independent variables were re-expressions of the score, i.e., the predicted probability and its squared and cubed terms. Once this smoothing model was built, the model-building sample was scored to smooth the estimates and more accurately predict observed response. The results of this exercise are displayed in Table 2. As can be seen, the smoothing model significantly reduces the difference between the predicted and observed response rates. The same process was conducted for the treatment group. The resulting deciles for this model, including smoothed and unsmoothed estimates, are included in Table 3. Again, the difference between predicted and observed response rates is reduced through the use of the smoothing model.
TABLE 2. Model-Building Sample, Control Model Diagnostics

          Observed    Unsmoothed    Absolute     Smoothed    Absolute
          Response    Response      Value of     Response    Value of
Decile    Rate        Estimate*     Difference   Estimate*   Difference
1         68.9%       72.8%         3.9%         68.8%       0.1%
2         51.3%       50.7%         0.6%         51.2%       0.1%
3         38.9%       37.2%         1.7%         39.3%       0.4%
4         29.9%       27.8%         2.1%         29.8%       0.1%
5         23.2%       21.3%         1.8%         22.5%       0.7%
6         16.9%       16.6%         0.3%         17.0%       0.1%
7         13.0%       12.9%         0.0%         12.8%       0.2%
8          9.1%       10.3%         1.2%          9.6%       0.6%
9          6.5%        8.2%         1.7%          7.3%       0.8%
10         4.9%        5.4%         0.5%          4.2%       0.7%
Average   26.3%       26.3%         1.4%         26.3%       0.4%

* Average decile scores.
TABLE 3. Model-Building Sample, Treatment Model Diagnostics

          Observed    Unsmoothed    Absolute     Smoothed    Absolute
          Response    Response      Value of     Response    Value of
Decile    Rate        Estimate*     Difference   Estimate*   Difference
1         73.3%       77.1%         3.8%         72.80%      0.5%
2         54.3%       54.5%         0.2%         55.00%      0.7%
3         43.1%       41.5%         1.6%         43.50%      0.4%
4         35.8%       32.5%         3.3%         34.30%      1.5%
5         26.9%       25.8%         1.1%         27.00%      0.1%
6         21.0%       20.8%         0.2%         21.20%      0.2%
7         16.3%       16.7%         0.4%         16.50%      0.2%
8         12.8%       13.6%         0.8%         12.90%      0.1%
9          9.3%       11.0%         1.7%          9.99%      0.7%
10         6.9%        7.6%         0.7%          6.25%      0.6%
Average   30.0%       30.1%         1.4%         29.9%       0.5%

* Average decile scores.
Step 3, calculating expected incremental response, consisted of a few steps. The first was to combine the control group and treatment group samples from the model-building sample into one sample. The entire model-building sample was then scored with both models. For each customer, the expected control model score was subtracted from the expected treatment model score to calculate an incremental response score. The sample was then sorted in descending order by the incremental response score and divided into deciles. For each decile, the observed response rates for the treatment group and the control group were calculated, and the observed incremental response rate was calculated as well. Table 4 shows the results of this exercise. As can be seen, the approach does rank-order customers from high to low in terms of observed incremental response. The only exceptions are deciles 4 and 8, which actually increase from the prior decile.
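The scoring and decile construction are mechanical; a sketch (ours, assuming pandas and arrays of smoothed treatment/control scores; function and column names are our own):

import pandas as pd

def incremental_gains(p_treat, p_ctrl, treated, responded, n_bins=10):
    """Rank customers by incremental score (treatment score minus control
    score), cut into deciles, and report the observed incremental response
    rate (treatment rate minus control rate) within each decile."""
    df = pd.DataFrame({"inc": p_treat - p_ctrl, "z": treated, "y": responded})
    ranks = df["inc"].rank(method="first", ascending=False)
    df["decile"] = pd.qcut(ranks, n_bins, labels=range(1, n_bins + 1))
    return df.groupby("decile", observed=True).apply(
        lambda g: g.loc[g["z"] == 1, "y"].mean() - g.loc[g["z"] == 0, "y"].mean()
    )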
TABLE 4. Model-Building Sample, Final Incremental Diagnostics

                     Predicted                         Observed                  Cumulative
Decile   Treatment   Control   Incremental   Treatment   Control   Incremental  Incremental
1        44.9%       37.7%     7.2%          46.7%       38.8%     7.9%         7.9%
2        38.1%       32.5%     5.6%          38.1%       31.6%     6.4%         7.2%
3        34.7%       29.8%     4.9%          33.4%       29.7%     3.6%         6.0%
4        32.1%       27.8%     4.3%          31.4%       27.1%     4.3%         5.6%
5        29.4%       25.5%     3.8%          28.6%       25.4%     3.2%         5.1%
6        26.3%       23.0%     3.4%          26.3%       23.3%     3.0%         4.8%
7        22.6%       19.7%     2.9%          23.0%       20.2%     2.8%         4.5%
8        19.3%       16.8%     2.6%          19.6%       16.1%     3.6%         4.4%
9        18.0%       15.9%     2.1%          18.6%       17.5%     1.2%         4.0%
10       34.2%       33.6%     0.6%          33.8%       32.9%     0.9%         3.7%
The validation sample was then scored with the models and rank-ordered in the same way as the model-building sample. These results are shown in Table 5. As can be seen, the model generally performs well, with the top deciles having higher incremental response rates and the bottom deciles having lower incremental response rates. However, there are some deciles with higher incremental response rates than the previous decile: 4, 5, and 9. In addition, the top decile's performance misses the estimate by 1.1%, from 7.3% estimated to 6.2% actual. The bottom decile, too, misses the estimate by 1.8%, from 0.6% estimated to 2.4% observed. The net effect is that the curve is "flattened" from the model-building sample to the validation sample, a not uncommon occurrence.

TABLE 5. Validation Sample, Final Incremental Diagnostics

                     Predicted                         Observed                  Cumulative
Decile   Treatment   Control   Incremental   Treatment   Control   Incremental  Incremental
1        45.0%       37.7%     7.3%          45.5%       39.3%     6.2%         6.2%
2        38.3%       32.7%     5.6%          38.9%       32.8%     6.1%         6.1%
3        34.7%       29.8%     4.9%          33.8%       30.4%     3.4%         5.2%
4        31.8%       27.5%     4.3%          31.3%       27.8%     3.6%         4.8%
5        29.4%       25.5%     3.9%          28.8%       25.0%     3.8%         4.6%
6        26.3%       22.9%     3.4%          26.4%       22.9%     3.5%         4.4%
7        22.6%       19.6%     3.0%          22.7%       20.2%     2.6%         4.2%
8        19.5%       17.0%     2.5%          19.1%       16.8%     2.4%         3.9%
9        18.2%       16.2%     2.0%          18.7%       15.8%     2.9%         3.8%
10       34.1%       33.5%     0.6%          34.6%       32.3%     2.3%         3.7%

TREE RESULTS

The model-building sample was used to build a modified CHAID model. The process included the building of three CHAID trees. Each tree was built on 90% of the data in the model sample; a different, but not mutually exclusive, random sample was selected to build each of the trees. The predicted value corresponding to each terminal node is calculated as the difference of the average response rates for the treatment and holdout groups within the node (using the model sample). The predicted value for the terminal node is assigned to all observations falling in the node. This is repeated for each of the trees, yielding three predicted values for each observation. The final predicted value, or score, is the average of the three predicted values. Figure 3 displays the results of one of the trees, with seven terminal nodes; these are numbered from 1 to 7 in decreasing order of incremental response rate. Table 6 displays a gains table for the combined score of all three trees. With the exception of a slight increase in the bottom score range, this approach does produce declining observed incremental response rates across segments.
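The three-tree averaging can be sketched as follows (our rendering; fit_tree is a hypothetical stand-in for the modified-CHAID builder described earlier, assumed to return an object with a predict method):

import numpy as np

def averaged_tree_scores(fit_tree, X, treated, responded, k=3, frac=0.9, seed=7):
    """Build k trees, each on a different random 90% sample, score every
    observation with each tree, and average the k predicted incremental
    response rates into a single final score."""
    rng = np.random.default_rng(seed)
    n, preds = len(X), []
    for _ in range(k):
        sub = rng.choice(n, size=int(frac * n), replace=False)
        tree = fit_tree(X[sub], treated[sub], responded[sub])  # hypothetical builder
        preds.append(tree.predict(X))  # every observation gets a node estimate
    return np.mean(preds, axis=0)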
FIGURE 3. Tree Based on Incremental Response Rate Splits
TABLE 6. TREE Validation Sample, Final Incremental Diagnostics

                        Predicted                Observed                  Cumulative
Segment   Pct of File   Incremental   Treatment   Control   Incremental   Incremental
1         9.9%          7.7%          48.7%       41.1%     7.5%          7.5%
2         8.8%          5.5%          33.7%       28.3%     5.3%          6.5%
3         11.4%         4.3%          30.5%       26.1%     4.4%          5.7%
4         8.9%          3.7%          19.1%       15.2%     3.9%          5.3%
5         12.4%         3.4%          49.6%       46.1%     3.5%          4.8%
6         11.9%         3.2%          24.7%       21.4%     3.3%          4.6%
7         26.2%         2.7%           9.2%        7.3%     1.9%          3.8%
8         10.5%         1.8%          53.0%       50.9%     2.1%          3.6%
COMPARISON OF RESULTS

Both methodologies discriminate between customers with a larger predicted incremental response and those with a much lower incremental response. In the validation of the two approaches, the modified CHAID methodology provides a more favorable rank-ordering of customers, based on observed incremental response. Figure 4 shows a comparison based on cumulative depth of file. In developing Figure 4, we interpolated between the segments to arrive at incremental response rates at the decile level, so we could directly compare the tree approach to the logistic regression approach. The most extreme difference is at the lower depths of selection. If a firm is looking to identify the very top end of its file (say, the top 10%), the modified CHAID approach clearly seems superior. At greater depths, however, the two approaches are very comparable, with the tree approach slightly out-performing the logistic modeling approach. We believe the tree approach provides somewhat superior performance because the tree is constructed based directly on incremental response rates, the variable of interest, whereas in the logit approach the incremental response rates are derived after separately modeling treatment and control response rates.

FIGURE 4. Comparison of Methodologies
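The depth-of-file interpolation behind Figure 4 can be reproduced from Table 6 (our sketch using NumPy; the authors' exact interpolation method is not specified, so linear interpolation is assumed):

import numpy as np

# Tree segments: cumulative share of file and cumulative incremental response,
# taken from Table 6.
depth = np.cumsum([0.099, 0.088, 0.114, 0.089, 0.124, 0.119, 0.262, 0.105])
cum_inc = np.array([0.075, 0.065, 0.057, 0.053, 0.048, 0.046, 0.038, 0.036])

# Interpolate to decile depths (10%, 20%, ..., 100%) for a like-for-like
# comparison with the logistic model's decile gains table.
deciles = np.arange(0.1, 1.01, 0.1)
print(np.interp(deciles, depth, cum_inc))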
DISCUSSION

This study examines two methodologies for predicting incremental response. However, other issues quickly rise to the fore in identifying the incremental impact of a direct marketing communication. From a marketing standpoint, a promotional marketing device can have two outcomes. The first is an incremental visit from a customer who would not be as likely to visit in the absence of a promotion. The second is incremental spending from a customer who would have visited the store with or without the promotional communication. As can be seen in the gains tables for both the logistic and CHAID approaches, the last decile/segment has an extremely high visit/response rate. In fact, in the CHAID methodology, the last segment has the highest visit rate of any segment (50.9%) in the control sample. This indicates that there are some customers with regular visit habits who may not be persuaded to increase their visits by a mailing. However, extending this methodology to include revenue would allow for understanding to what extent purchase amounts are impacted by a promotion. It could be that these customers are likely to spend more, even when they were likely to visit anyway.

Given the two different inferred behaviors that lead to positive impact (an incremental visit, or incremental spending on the same visit), it may be appropriate to consider different models for each of these behaviors. This could be as simple as a model for high-value customers (incremental spending) versus the mass market (incremental visit), or utilizing more advanced methods such as latent class regression methodologies that tease apart different customer groups and derive separate models for each. Marketers should be solely focused on incremental performance for promotional communications, both from a program analysis perspective and for program enhancement activities. Either methodology described here will enhance performance on one key driver of incremental value, namely incremental response.
REFERENCES

Efron, B., & Tibshirani, R.J. (1993). An Introduction to the Bootstrap (Monographs on Statistics and Applied Probability, No. 57). London: Chapman and Hall.

Malthouse, E.C. (2001). Assessing the Performance of Direct Marketing Scoring Models. Journal of Interactive Marketing, 15(1), 49.