Screening for Interactions Between Design Factors and Demographics in Choice-Based Conjoint

Goutam Chakraborty, George Woodworth, and Gary J. Gaeth, University of Iowa
Richard Ettenson, University of Maryland

Address correspondence to Goutam Chakraborty, currently at Oklahoma State University, College of Business Administration, Department of Marketing, Stillwater, OK 74074.

Journal of Business Research 24, 115-133 (1992). © 1992 Elsevier Science Publishing Co., Inc., 655 Avenue of the Americas, New York, NY 10010. 0148-2963/92/$5.00
We propose an iterative stepwise forward selection procedure to screen a subset of important variables from a large pool of candidate variables in choice-based conjoint analysis. The method involves computing weighted correlations between candidate variables and residuals from the prior best fitting multinomial logit (MNL) model. Candidate variables that pass the screening step are then introduced and the MNL model is refitted using standard MNL software. A series of diagnostic tests is carried out before each iteration cycle. The procedure can be implemented easily using commonly available statistical software. We illustrate the application of the proposed method on a large data set.

Introduction

Choice modeling, whether based on field or experimental data, has been an important area of study for researchers in economics, marketing, as well as other fields (Sen, 1986; McFadden, 1986). Recently, discrete choice experiments have gained considerable popularity among academicians and practitioners who wish to understand consumer behavior (Louviere and Woodworth, 1983; Louviere, 1988). In a discrete choice experiment, a respondent is asked to evaluate a series of choice sets. Each choice set typically contains several alternatives described by specific combinations of attribute levels. The respondent's task is to choose the most preferred alternative from each choice set. Analogous to rating-based conjoint analysis (Green and Srinivasan, 1978; Louviere, 1988), the data generated from choice
experiments can be modeled to estimate the relative importance of the attributes, as well as the part-worths associated with different levels of the attributes. This information provides considerable insight into consumers' decision making processes and is often used to predict marketplace behavior through choice simulators. Although different types of choice modeling procedures have been reported in the literature (Currim, 1982; Buckley, 1988), the more popular ones appear to be variants of the multinomial logit model (MNL).

Estimating MNL models from choice experiment data becomes increasingly complex as the number of attributes describing the alternatives increases. For example, in a choice experiment that uses 10 attributes to describe alternatives, the second order model includes 10 main effects, 10 quadratic effects (i.e., the linear-by-linear component of main effects), and 45 2-way interaction effects. Thus, even without higher order interactions, there are 65 effects that might be included in the model. The situation is further complicated by the fact that researchers in marketing are often interested in finding differences in decision processes across managerially relevant market segments. These differences typically manifest themselves in the form of varying attribute coefficients across demographic variables. Using the example above, if one considers only 5 such demographic variables (e.g., age, gender, income, education, and marital status), an additional 50 interactions between main effects and demographics need to be modeled. Additionally, if the researcher wants to include interactions between demographics and the quadratic effects or the 2-way attribute-by-attribute interactions, the number of total effects to be included in the model grows to 325.

An MNL model of this magnitude is unwieldy, if not intractable, for 2 reasons. First, the associated X-matrix is often ill-conditioned (collinear), making simultaneous estimation of all parameters difficult, and second, many software packages cannot estimate such a large number of parameters. To deal with these problems, we outline an iterative stepwise forward selection procedure that can be implemented with existing statistical software packages.

The remainder of this article is organized as follows. First, we briefly describe the estimation of parameters in a MNL model. Next, we propose a screening method to identify significant effects from a large number of potential effects to include in the choice model. We then apply our screening method to an existing data set collected as part of a large study of consumer decision making for health insurance. Finally, we conclude the article by discussing some limitations of our screening method and suggesting directions for future research.

Maximum Likelihood Estimation of Parameters in MNL Models
The MNL form of Luce's model (1959, 1977) can be derived from random utility theory (Amemiya, 1981; Ben-Akiva and Lerman, 1985; McFadden, 1981), which assumes that the utility of an alternative has 2 components, one that is fixed and the other random. The fixed or systematic component of an alternative's utility is typically modeled as a function of the attribute levels and individual consumer characteristics. The random component reflects unobserved attributes and taste variation, measurement errors, and imperfect information (Ben-Akiva and Lerman, 1985). If the random components are independently and identically Gumbel distributed, the probability that a randomly selected individual in population segment g chooses alternative a from choice set A is given by the MNL model:
p_g(a|A) = exp(V_ag) / Σ_{j∈A} exp(V_jg),

where V_jg is the systematic component of the utility of alternative j to individuals in segment g. This systematic component can be further specified as a function of the vector of the observed levels of each attribute (x) of the alternatives as

V_jg = Σ_k β_kg x_kj,
where β_kg is the weight of attribute k in the valuation of alternative j for respondents in segment g. This specification can be easily extended to accommodate quadratic or interaction effects. There are several ways to model the variation of attribute weights across segments, but the simplest is to regard them as functions of demographic (or similar) variables:

β_kg = β_k + Σ_e α_ke w_eg,

where α_ke represents the interaction between attribute k and demographic variable e, and w_eg is the value of demographic variable e in segment g. Thus, the utility of alternative j for individuals in segment g is:
‘1, =
2 k
Pk
.
xk,
+
c
2 k
Sk,
For the main effects model, hence, dropping the segment V,, = V, = C
Pk
’ xkj
’ xkj
.
W,g
(1)
e
attribute-by-demographic interactions (g) subscript, we are left with:
are all zero;
(2)
k
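As a small numerical illustration of this specification, consider the following sketch (in Python with NumPy, the language we use for all examples in this cleaned-up version; the coefficient values are invented for illustration and are not estimates from any study). It computes the segment-specific attribute weights β_kg from the display preceding equation (1):

```python
import numpy as np

# Segment-specific attribute weights: beta_kg = beta_k + sum_e alpha_ke * w_eg
beta = np.array([-0.33, -0.42])    # base weights beta_k for 2 attributes (invented)
alpha = np.array([[0.05, -0.02],   # alpha_ke: rows are attributes k, columns demographics e
                  [0.00,  0.08]])
w_g = np.array([1, -1])            # demographic codes w_eg for one segment g

beta_g = beta + alpha @ w_g        # weights that segment g applies to the attributes
print(beta_g)                      # -> [-0.26 -0.5 ]
```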
Choice Experiments

Data from choice experiments are often analyzed using the MNL model (e.g., Louviere, 1988). Suppose that the design in a choice experiment involves presenting I choice sets, A_1, A_2, ..., A_I, to each respondent. The probability that an individual from segment g chooses alternative a from choice set A_i, according to the MNL model, is:

p_g(a|A_i) = exp(V_ag) / Σ_{j∈A_i} exp(V_jg),    (3)

where V_jg is the systematic component of alternative j for respondents in segment g, given by equation (1). In equation (3), the denominator is constant for alternatives within the same choice set A_i. Therefore, one can rewrite this equation as:

p_g(a|A_i) = exp(V_ag) / k_ig,

where k_ig is an unknown constant corresponding to choice set i and segment g (see Technical Appendix for details). For the purpose of parameter estimation, we treat the choices made by a single respondent as conditionally independent, given the demographic characteristics of that respondent. Failure of this assumption does not bias parameter estimates but inflates estimates of their standard errors.
Louviere and Woodworth (1983) discuss this and propose a conservative correction of the standard errors. The parameters of the utility model in equation (1) are estimated by maximizing the formal likelihood

L = Π_{g=1}^{G} Π_{i=1}^{I} Π_{a∈A_i} [p_g(a|A_i)]^{f_aig},

where f_aig is the number of individuals in segment g choosing alternative a from choice set A_i, and G and I are the total number of segments and choice sets, respectively.
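To make the estimation target concrete, here is a minimal sketch (Python/NumPy; the function name and data layout are our own illustration, not the software used in the article) that evaluates this log-likelihood for data laid out with one row per alternative:

```python
import numpy as np

def mnl_loglik(beta, X, f, set_ids):
    """Evaluate sum_g sum_i sum_a f_aig * log p_g(a|A_i) for an MNL model.

    X       : (rows, k) attribute matrix, one row per alternative
    f       : (rows,) observed choice frequencies f_aig
    set_ids : (rows,) identifier of the (segment, choice set) a row belongs to
    """
    v = X @ beta
    loglik = 0.0
    for s in np.unique(set_ids):
        m = set_ids == s
        vs = v[m] - v[m].max()                # stabilize the exponentials
        logp = vs - np.log(np.exp(vs).sum())  # log p(a|A_i) within the choice set
        loglik += float(f[m] @ logp)
    return loglik
```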
Practical Problems in Screening from a Large Number of Interactions

The practical barrier to fitting such models is the need to estimate coefficients for an extremely large number of potential interaction terms, both attribute-by-attribute and attribute-by-demographic. In what follows, we describe an iterative stepwise forward selection method to rapidly screen a large number of variables for improving the model fit (see Cooper and Nakanishi, 1988, for a single-step screening procedure). Our method involves computing weighted correlations between candidate variables and residuals from the currently best fitting MNL model. Candidate variables that pass the screening step are then introduced and the MNL model is refitted using standard MNL software.

Our procedure is analogous to the well-known forward stepwise regression method. As such, it shares 3 generic shortcomings associated with any stepwise procedure. First, the decision as to which variables to include and when to stop is ad hoc and should depend on the researcher's knowledge and goals. Second, stepwise procedures are specifically designed to maximize model fit and are therefore generally unsuitable for inference and theory testing. Finally, all stepwise procedures suffer from the lack of a suitable comparison or base model, because the correct model is unknown and the full model with all independent variables is often saturated and of little value, especially in the presence of collinearity. To provide some safeguards against these generic weaknesses, we have incorporated a series of diagnostic checks in various steps of our proposed method.

Ideally, a screening procedure for choosing candidate variables should maximize pre-specified research criteria, such as improvement in model fit or improvement in the predictive ability of the model. In the context of maximum likelihood estimation, the usual criterion is maximizing the improvement in log-likelihood. However, existing MNL programs do not allow stepwise addition of variables by this criterion. We are not aware of any existing software for stepwise introduction of variables in a multinomial logistic model, although such procedures are available for binary logit models (in SAS and BMDP). Our solution to this problem is to screen a small set of candidate variables, fit these variables with MNL software, and then test for improvement in log-likelihood by backwards elimination of the comparatively small set of candidate variables passing the screening step. The residuals from the refitted model are then used to screen for additional candidate variables. The procedure is repeated until no useful improvement in the model is made. Iteration helps to avoid missing an important variable in the screening stage.
[Figure 1. Proposed Stepwise Screening Procedure. Flowchart: Step One, fit the main effects MNL model and compute residuals, weights, and working logits. Step Two, compute weighted correlations of the residuals with candidate variables and identify a subset of candidate variables. Step Three, center all variables and run a weighted stepwise regression to find significant variables. Step Four, refit the MNL model, adding the candidate variables that survived the previous step. Step Five, check the likelihood ratio test for joint significance of all new variables, the t test for individual significance of each new variable, whether the variables from Step One are still significant, and the improvement in the adjusted likelihood ratio index.]
Our screening procedure is outlined in Figure 1. We start with the assumption that the attributes in a controlled choice experiment have been carefully selected by the researcher. Although there are certainly applications for which this is not appropriate, we decided to use the main effects as a core model; that is, all main effects are automatically included in all subsequent models. We then select important higher-order variables from a list of potential effects including quadratics, attribute-by-attribute and attribute-by-demographic interactions. The goal is to find effects that contribute to improvement in log-likelihood, and ultimately to provide a better model fit.
As shown in Figure 1, we fit the main effects model using standard MNL software and obtain estimates of coefficients in step one. From these estimated coefficients, we compute expected frequencies, residuals, and weights:

Expected frequency: f̂_aig = n_ig · p̂_g(a|A_i)
Residual: e_aig = (f_aig - f̂_aig) / f̂_aig
Weight: w_aig = f̂_aig
where n_ig is the number of respondents in segment g evaluating choice set A_i. Typically, data are completely disaggregated, so that n_ig equals 1 and f_aig is a 0/1 variable corresponding to each individual and alternative. In this case, the expected frequency equals the predicted probability p̂_g(a|A_i).

In step 2, we identify a set of candidate variables for further consideration by selecting a subset of variables from all quadratic and interaction effects. This is accomplished by including those effects that have large weighted simple correlations with the residuals from the fitted model in step 1. This can be carried out easily using standard statistical software packages, which allow calculation of weighted correlations. This is similar in principle to the identification of additional variables in large regression problems (Draper and Smith, 1981), except that we use the ordinary correlations between the residuals and candidate variables instead of the partial correlations controlling for variables already in the model. The purpose of this second step is to reduce the number of variables for step 3, where we work with the partial correlations. Alternative approaches can be taken to this step. In particular, a reviewer suggested that a split-half or a jackknife analysis could replace our use of the single set of correlations. Such an approach helps to protect against instability in the simple correlations. We did not include these approaches in this example, but have found this latter suggestion to be quite helpful when high leverage data points are present, particularly in interval scaled data.

In step 3, we approximate a multinomial logistic regression by stepwise weighted linear regression (Jennrich and Moore, 1975), using the working logit as the dependent variable and the set of variables currently in the model, plus the candidate variables selected in the previous step, as independent variables. The working logit, the dependent variable in this step, is:
y*_aig = log(f̂_aig) + e_aig.
In this step, all the variables from step 1 are forced into the model to ensure that the stepwise regression selects those candidate variables having large partial correlations with the dependent variable, after adjusting for the variables already in the model. We also center all independent and dependent variables to zero means at the subject level to reduce collinearity (Draper and Smith, 1981). The weighted least squares stepwise procedure approximates a stepwise MNL solution but computes inflated "F to enter" values (the F-ratio for each new variable), due to an overstatement of the residual degrees of freedom. We therefore recommend that researchers set a stringent "F to enter" threshold to ensure that only the most important variables are retained.

In step 4, we add the surviving variables to the MNL model. In step 5, the new model is tested against the original for improvement in log-likelihood using the nested model likelihood ratio test (Ben-Akiva and Lerman, 1985). This determines whether all the new variables' coefficients are jointly equal to zero. It is possible
to reject this hypothesis and simultaneously have some nonsignificant t statistics for the new variables. In this case, we recommend additional nested model tests, deleting those new variables that individually are nonsignificant in the model. The ideal approach is to delete new variables one at a time, starting with the least significant one, and to test the significance of each deletion. However, this may be impractical due to the computer time required to fit large models.

Another important issue at this stage is whether the addition of new variables renders some of the original variables nonsignificant. This may arise due to high collinearity between the original variables and new variables. Although it is possible to maintain uncorrelated linear and linear-by-linear effects using a particular experimental design, it is not possible to achieve the same for attribute-by-demographic interactions, as these depend on the distribution of the demographic variables in the sample. As mentioned previously, for controlled choice experiments we are guided by the principle that main effects are at least as important as interactions, and we therefore recommend that the original variables be retained rather than the new variables that create this condition. We recognize that in some instances an interaction between a demographic variable and a design variable may make more theoretical sense than a prespecified main effect for an attribute. In that case, the interaction can be retained and the main effect dropped. We leave this decision to the judgment of the researcher. An additional diagnostic we recommend at this stage is improvement in the adjusted likelihood ratio index (rho-squared bar, ρ̄²), which is related to the Akaike information criterion (Ben-Akiva and Lerman, 1985). This is a measure of improvement in goodness-of-fit, adjusted for the degrees of freedom lost due to the addition of new variables.

Once each criterion is satisfied, the result is a model with all the original main effects and a small number of additional effects. We then return to step 1 and repeat the procedure to select additional interactions from those not yet considered. We continue to iterate until the likelihood cannot be improved further by a model that also satisfies the diagnostic criteria of step 5. When the iteration stops, we recommend RESET (regression specification error test) to check the adequacy of the functional form of the selected model (Ramsey, 1969; Thursby, 1989). This test is illustrated in the application section. It is possible that the selected model may fail one of these proposed tests. In such a case, the user may go back to step 1 and reiterate, starting with a smaller number of variables from the output of the first stepwise run. Another problem that may occur is that the iteration process may identify non-nested models. In such a case, we propose that the researcher conduct the modified likelihood ratio test for non-nested alternatives (Horowitz, 1982) to select one of the alternative models.

Our method of selecting candidate variables, based on the stepwise weighted regression of the current working logits on centered independent variables, shares the same theoretical foundations as the "score test" procedure used in econometrics to carry out the Lagrange multiplier test for omitted variables (McFadden, 1987; Engle, 1984; Crowder, 1976).
In the score test, normalized residuals from the current model are regressed on centered and weighted independent variables to make inferences about whether any new variable should be added in the next step. Thus, in both the score test and our procedure, the inferences are guided by the partial correlations of normalized residuals or working logits (a function of
the residuals), respectively, with the omitted independent variables. However, the score test is an omnibus test for the presence of significant omitted variables, while our method seeks to maximize model fit. Inferences based on the omnibus score test are likely to be overly conservative when there are only a few significant variables among a large number of omitted variables. This is similar to the problem encountered when conducting the Scheffe omnibus test for multiple comparisons in the regression context.
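To fix ideas, the following sketch (Python/NumPy; names and structure are ours, intended only as an illustration of the method, not the software used in the article) implements the computations behind steps 1-3: expected frequencies, residuals, and weights from the current fit; weighted simple correlations for the step-2 screen; and the working logits that feed the step-3 weighted regression:

```python
import numpy as np

def screen_candidates(X, Z, beta, f, set_ids, n_keep=50):
    """Rank candidate columns of Z by weighted correlation with MNL residuals."""
    v = X @ beta
    fhat = np.empty_like(v)
    for s in np.unique(set_ids):              # expected frequencies n_i * p_ai
        m = set_ids == s
        ev = np.exp(v[m] - v[m].max())
        fhat[m] = f[m].sum() * ev / ev.sum()
    e = (f - fhat) / fhat                     # residuals
    w = fhat                                  # weights

    def wcorr(z):                             # weighted simple correlation with e
        zc = z - np.average(z, weights=w)
        ec = e - np.average(e, weights=w)
        return (np.average(zc * ec, weights=w) /
                np.sqrt(np.average(zc**2, weights=w) *
                        np.average(ec**2, weights=w)))

    r = np.array([wcorr(Z[:, j]) for j in range(Z.shape[1])])
    keep = np.argsort(-np.abs(r))[:n_keep]    # candidates with the largest |r|
    ystar = np.log(fhat) + e                  # working logits for step 3
    return keep, ystar, w
```

Step 3 would then regress ystar on the centered columns of X (forced in) plus Z[:, keep], weighting by w and applying a stringent "F to enter" threshold; the survivors go to the MNL software in step 4.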
Application of Proposed Screening Method
Method

In this section, we illustrate the use of our screening method by applying the procedure to data collected as part of a large scale study of consumers' decision processes when evaluating and selecting health insurance in a multi-plan environment. Through extensive preliminary work (including 6 focus group interviews with consumers and input based on expert opinion), 24 attributes were selected to describe health insurance plans. For each attribute except brand name, 3 levels were developed after additional pretesting and expert consultation; brand name was varied at 6 levels. The population of interest was Maryland state employees. The sampling frame was based on a set of workplace telephone directories of state agencies. A systematic sample was created by using every 26th name from the directories. After obtaining consent through telephone solicitation, choice conjoint questionnaires were mailed to 662 subjects. Each subject was paid $15 after returning a completed questionnaire. The total number of returned surveys was 562, yielding a response rate of 85%.
Design

The conjoint portion of the questionnaire was composed of 11 choice sets with 4 alternatives per set. Of these, the first 2 choice sets were for practice and were not used in the analysis. Each alternative in a choice set was described by a brand name and 23 other attributes. The attribute levels of the first 3 alternatives in each choice set were determined by a 6 × 3^23 fractional factorial design. The attribute levels of the fourth alternative in each choice set were held constant at the middle level. This constant alternative serves as a base for estimating the contrasts of the utilities of the other 3 alternatives. Following the conjoint section, the respondents completed, among other things, a section on demographics.

The attribute brand name was varied at 6 levels and coded by 5 0/1 dummy variables. The 3 levels of the other attributes were coded such that the linear effects had values of -1, 0, and 1, and the corresponding quadratic effects had values of 1, -2, and 1. The design permitted estimation of all linear and quadratic effects free from other higher-level (attribute-by-attribute) interactions.
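The linear and quadratic codings are orthogonal contrasts; a short, purely illustrative check (Python/NumPy):

```python
import numpy as np

# 3-level attribute: linear contrast (-1, 0, 1), quadratic contrast (1, -2, 1)
levels = np.array([0, 1, 2, 2, 0, 1])      # example column of attribute levels
lin = np.array([-1, 0, 1])[levels]         # linear coding of each level
quad = np.array([1, -2, 1])[levels]        # quadratic coding of each level
print(np.dot([-1, 0, 1], [1, -2, 1]))      # 0: the two contrasts are orthogonal
```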
Previous research suggested that the evaluation and selection of health insurance might be influenced by a respondent's age, gender, job-class, education, current health insurance plan, and whether or not the respondent had dependents. These 6 variables were therefore used in this application. The 5 demographic variables were dichotomized and coded as -1 and 1. "Current health insurance plan" was also dichotomized, based on whether the respondent was enrolled in a Blue Cross/Blue Shield plan (Blue Cross/Blue Shield held approximately 80% of market share). The codes and the sample distribution for the levels of these 6 demographic variables are shown in Table 1.

Table 1. Coding and Sample Distribution of Demographic Variables

Variable Name                   Levels                               Coding Used   % of Respondents
Age                             40 yrs. or above                     -1            50.3
                                Less than 40 yrs.                     1            49.7
Gender                          Male                                 -1            48.3
                                Female                                1            51.7
Job-class                       White collar                         -1            58.5
                                Blue collar                           1            41.5
Current health insurance plan   Enrolled in Blue Cross/Blue Shield   -1            62.3
                                Enrolled in other insurance           1            37.7
Education                       More than one year of college        -1            62.9
                                High school or comm. college          1            37.1
Dependents                      Yes                                  -1            51.5
                                No                                    1            48.5

Thus, the model building task involved a total of 189 potential effects: 5 dummy variables for brand name, 23 linear effects, 23 quadratic effects, and 138 (6 × 23) attribute-by-demographic interactions, contained in a data set of over 20,000 choices.
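The bookkeeping behind the 189 effects, and the construction of the attribute-by-demographic interaction columns, can be sketched as follows (Python/NumPy; the function is our illustration, assuming a linear-effects matrix X_lin and a demographics matrix W that are not part of the original article):

```python
import numpy as np
from itertools import product

n_brand_dummies, n_attrs, n_demos = 5, 23, 6
total = n_brand_dummies + n_attrs + n_attrs + n_attrs * n_demos
print(total)                               # -> 189 potential effects

def interaction_columns(X_lin, W):
    """Attribute-by-demographic interactions as elementwise products
    of a linear-effect column and a demographic column (138 columns here)."""
    cols = [X_lin[:, k] * W[:, e]
            for k, e in product(range(X_lin.shape[1]), range(W.shape[1]))]
    return np.column_stack(cols)
```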
Result of First Iteration

As the first step, we estimated the parameters of the main effects model by using the SAS MLOGIT procedure. In the main effects model, most effects are significant and the adjusted likelihood ratio index is 13.9%. Although this statistic is similar to the adjusted R² in regression, values of ρ̄² are typically lower than regression R², and there are no general guidelines for what would be considered an acceptable value for this statistic (Ben-Akiva and Lerman, 1985). All variables, except the brand name dummies and office hours, are coded such that -1 represents the "best" level and +1 represents the "worst" level of the attribute. Therefore, a negative sign is expected for all coefficients except those for office hours and the brand dummies; the values under the main effects model in Table 2 confirm this. The order of the 5 most important attributes (hospitalization, choice of doctors, premium, dental coverage, and choice of hospitals) generally confirms managerial expectations. Thus, the overall performance of the main effects model appears to be satisfactory.

As an initial screening, we next identified 50 candidate variables from the pool of 161 variables left after fitting the main effects (28 variables) model. We retained all the main effects (even those few that are not significant) for the reasons discussed previously. Fifty new variables were chosen on the basis of the highest weighted correlations with the residuals of the main effects model. In this case, the number 50 was the result of an ad hoc decision to include only correlations significant at p < 0.10. Although it is theoretically possible to use all 161 variables in the stepwise procedure, the computer time required is prohibitively expensive.
Table 2. Details of Iterations of Screening Process

[The full grid of parameter estimates is not reliably recoverable from the scanned original. Its rows are: dummies for brands 1-5; main effects for waiting time in office, office hours(a), premium, emergency care, choice of doctors, prescription coverage, convenience/paperwork, office visit coverage, out-of-town emergency coverage, dental coverage, quality of hospitals, choice of hospitals, travel time to physician, travel time to hospital, time to make appointment, substance abuse coverage, psychologist/psychiatrist coverage, wellness/education program, vision & hearing care, communication with participants, preventive care, hospitalization, and 24-hour consultation by phone; the quadratic (_q) effects added over the iterations (vision & hearing care_q, out-of-town emergency coverage_q, travel time to physician_q, emergency care_q, time to make appointment_q, convenience/paperwork_q, hospitalization_q, preventive care_q, consultation by phone_q, prescription_q); and the interactions current plan-by-premium and gender-by-premium. Its columns give estimates for the main effects model and after each of the 5 iterations. The recoverable model summary rows are:]

Model            Log-likelihood   Change in log-likelihood(b)   Adjusted likelihood ratio index
Main effects     -5995                                          13.9%
1st iteration    -5968            27                            14.2%
2nd iteration    -5953            16                            14.4%
3rd iteration    -5940            13                            14.6%
4th iteration    -5928            12                            14.7%
5th iteration    -5900            28                            15.1%

(a) Except for office hours, all other variables are coded such that -1 corresponds to the best level and +1 corresponds to the worst level.
(b) Significant at α = 0.01 or better by the likelihood ratio test for nested models.
* Not significant at α = 0.05 for a one-tailed test (marker used on individual estimates in the original table).
Moreover, since we iterated through each stage (including initial screening and stepwise), the chance of missing an important variable due to this part of the screening procedure was very slim.

After centering each variable, we ran a weighted stepwise regression with working logits calculated from the current (main effects) model as the dependent variable and all main effects plus the 50 new candidate variables as independent variables. We first forced the 28 variables (all main effects including brand dummies) into the model and then looked for remaining variables that had the highest partial correlations with the dependent variable. The statistic "F to enter" in the stepwise regression is a measure of these partial correlations. In this illustration, we decided to consider 4 variables, "vision and hearing care_q," "out-of-town emergency coverage_q," "travel time to physician_q," and "current health care plan-by-choice of doctors," which had the highest "F to enter" in the stepwise run. In our example, any variable ending with "_q" denotes the quadratic effect of that variable.

Next, we re-estimated the MNL model, including these 4 new variables along with the 28 variables of the main effects model. Using the nested model test, we retained only the variables "vision and hearing care_q," "out-of-town emergency coverage_q," and "travel time to physician_q." The coefficient for "current health care plan-by-choice of doctors" was not significant, and there was no improvement in ρ̄² when this variable was added to the model containing all other variables. Thus, the model after the first iteration contained the core (main effects) model and 3 additional quadratic effects, yielding a total of 31 variables.
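The nested model test used at this step can be reproduced from the log-likelihood summary in Table 2; a short sketch (Python/SciPy; the function name is ours):

```python
from scipy.stats import chi2

def nested_lr_test(loglik_small, loglik_big, df_added):
    """Likelihood ratio test for nested MNL models (Ben-Akiva and Lerman, 1985)."""
    stat = -2.0 * (loglik_small - loglik_big)
    return stat, chi2.sf(stat, df_added)

# Main effects model vs. model after the first iteration (3 quadratics retained):
stat, p = nested_lr_test(-5995.0, -5968.0, 3)
print(stat, p < 0.01)    # 54.0 True -- the added effects are jointly significant
```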
Result of Second Iteration

Starting with the final model from the previous iteration, we recomputed the weighted correlations between the residuals of this model and the 158 variables remaining from the original pool of 189 variables. Again, based on the highest weighted correlations, we chose 50 new candidate variables. From the results of the stepwise regression of the working logits (calculated from the final model after the first iteration) on the 31 variables plus the 50 new variables, we selected, as before, 4 variables based on the highest "F to enter." These 4 variables were "emergency care_q," "current health care plan-by-choice of doctors," "time to make appointment_q," and "convenience/paperwork_q." As in the first iteration, we dropped the interaction "current health care plan-by-choice of doctors" based on significance tests and lack of improvement in model fit. Therefore, the model after the second iteration contained all main effects and 6 quadratic effects, for a total of 34 variables.
Result of Third Iteration

Proceeding as in the previous iterations, we chose the first 4 variables from the stepwise regression for final consideration. These were "current health care plan-by-choice of doctors," "hospitalization_q," "current health care plan-by-premium," and "preventive care_q." Again, based on the tests used in the previous iterations, we dropped the variable "current health care plan-by-choice of doctors" and retained the remaining variables in the model. Thus, at this stage the model contained all 28 main effects, 8 quadratic effects, and 1 attribute-by-demographic interaction effect. We also dropped the variable "current health care
plan-by-choice of doctors” from the list of potential candidate variables for all subsequent iterations. We could have dropped this variable from the list of potential variables after the first iteration. However, as a check we decided to carry it through 3 iterations to show that it had no effect on subsequent models. In general, once a variable has been tested in the MNL model and shown to be nonsignificant, it may be dropped from the list of candidate variables.
Result of Fourth Iteration

In this step, the first 4 additional variables from the stepwise regression were "24-hour consultation by phone_q," "prescription coverage_q," "current health care plan-by-preventive care," and "current health care plan-by-choice of hospitals," respectively. In the re-estimated MNL model, the coefficients for the 2 demographic-by-attribute interaction variables were nonsignificant. A nested model test also failed to reject the hypothesis that these 2 coefficients were jointly equal to zero. Similarly, the goodness-of-fit measure ρ̄² suggested that adding these 2 variables did not improve the model fit. Therefore, we retained "24-hour consultation by phone_q" and "prescription coverage_q," but dropped both interaction variables from further consideration. The model at this stage had all 28 main effects, 10 quadratic effects, and 1 interaction effect.
Result of Fifth and Final Iteration

The first 4 variables from the stepwise regression at this stage were "job-class-by-office visit coverage," "gender-by-premium," "premium_q," and "current health care plan-by-quality of hospitals." When added to the model, these new variables improved the log-likelihood and ρ̄². Significance tests indicated that the coefficients of the 4 new variables were different from zero, both individually and jointly. However, adding these 4 new variables resulted in nonsignificant t statistics for a number of variables already in the model. As argued earlier, we believe that in a controlled discrete choice experiment, main effects are at least as important (if not more so) than higher order effects. Hence, further investigation is needed before any new variable is added to the model. By deleting new variables one at a time and jointly, we concluded that only the "gender-by-premium" variable could be added to the model at this stage without degrading main effects. Thus, the model after the fifth iteration contained all main effects, 10 quadratic effects, and 2 attribute-by-demographic interactions.

After the fifth iteration, no variable could be added to the model that satisfied the criteria outlined in step 5 of Figure 1. Thus, the final model was obtained in 5 iterations. The adjusted likelihood ratio index, ρ̄², was 15.1%, a modest but significant improvement over the main effects only model (ρ̄² = 13.9%). In this model, most effects (except 2 brand specific dummies and 2 main effects) were significant, and all main effect coefficients were in the expected directions. The iteration results indicate that the coefficient estimates remained fairly stable (with no sign reversals) through each iteration. The final model also passed the RESET test successfully. This test checked the adequacy of the model's functional form by adding higher powers (square, cube) of the predicted values based on variables already in the model. Had these additional terms been significant, it would have
indicated an incorrect functional form for the variables in the model. In sum, all diagnostic checks indicated a satisfactory terminal model.

Although the primary purpose of this article is not to describe the substance of the original health care project, the final model resulting from the application of our technique contains several quadratic and attribute-by-demographic interaction effects that provide insights into consumers' health insurance decisions. For instance, the sign and significance of the "current health care plan-by-premium" interaction indicate that respondents enrolled in Blue Cross and Blue Shield plans are less price sensitive than respondents enrolled in other plans. This may reflect a brand loyalty effect, since a majority of Blue Cross and Blue Shield enrollees have been with this plan for many years. Similarly, the "gender-by-premium" interaction indicates that male respondents were more sensitive to price than female respondents. This suggests that females may be a more lucrative target market for health insurance providers.
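The RESET check just described amounts to augmenting the design with powers of the fitted utilities and re-testing; a minimal sketch (Python/NumPy; the function name is ours), to be combined with a nested likelihood ratio test such as the one shown earlier:

```python
import numpy as np

def reset_augmented_design(X, beta_hat):
    """RESET-style augmentation (Ramsey, 1969; Thursby, 1989): append the
    square and cube of the fitted utilities. Refit the MNL with these 2
    extra columns; if they are significant, the functional form is suspect."""
    v_hat = X @ beta_hat
    return np.column_stack([X, v_hat**2, v_hat**3])
```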
Conclusion

We proposed a way to help researchers identify a subset of important variables from a large pool of candidate variables in a discrete choice experiment. Our iterative procedure involves an initial screening based on the correlations of the residuals from a current model with the candidate variables. This is followed by a weighted linear stepwise regression that approximates stepwise multinomial logistic regression. Our weighted regression screening procedure based on partial correlations shares the same theoretical underpinnings as the Lagrange multiplier test for omitted variables reported in the econometric literature (McFadden, 1987). Cooper and Nakanishi (1988) also recommend a similar screening procedure to identify potential cross-effects in modeling multinomial panel data. However, this latter procedure uses only a 1-step analysis of the residual correlations. Thus, their suggestion is identical to our step 2 in Figure 1.

The advantages of our procedure are as follows. First, we use ordinary correlations to quickly screen a large number of candidate variables, followed by partial correlations for a more precise approximation to stepwise multinomial logistic regression. Second, our procedure is iterative and has diagnostic checks built into each iteration stage, so that variables missed at one stage can enter in subsequent stages. Thus, our procedure is economical and more flexible than currently available screening methods for modeling choice. When applied to a large database from a choice experiment, our method proved to be effective both in identifying appropriate variables to include in the model and in extending our understanding of consumer decision behavior for health insurance.

Our procedure is not without limitations, because it shares several generic limitations of any stepwise variable selection technique. Most notable is the arbitrary nature of selecting one model out of a large number of models of roughly equal explanatory power. For instance, the stepwise procedures commonly used for large regression problems suffer from this limitation. This issue is philosophical to the extent that model building requires both analytic skills and a theoretical foundation on the part of the researcher. The diagnostic checks proposed in our method will help researchers avoid statistical mistakes, but there will undoubtedly be situations in which a researcher may need to rely on theory in judging among candidate models. A skilled researcher might propose a good base model on a priori grounds and use our screening technique "on the margin" to fine-tune that model by identifying useful omitted variables.

Finally, one inherent limitation unique to our method is that we approximate the exact stepwise solution of the MNL model by using a weighted stepwise regression (step 3) in order to retain the flexibility of including demographic-by-attribute interactions in the pool of candidate variables (see Technical Appendix). However, this issue is less relevant because the final model estimation (step 4) is based on maximum likelihood without any approximation. Another weakness is the use of simple rather than partial correlations in the initial screening phase (step 2). This may cause a useful omitted variable to be dropped in the screening phase if that variable is somewhat collinear with the variables already in the model. The iterative nature of our procedure allows important omitted variables to be picked up in subsequent iteration cycles, but this should be investigated in future research with a Monte Carlo simulation varying the degree of intercorrelation among the set of candidate variables.
We thank Joel Horowitz for his comments on this manuscript, and specifically for the suggestion of conducting the RESET test on MNL models. Thanks are also extended to two anonymous reviewers and especially to Jordan Louviere for their helpful comments. Thanks to Don Anderson for developing the experimental design used in the application.
Technical Appendix

Fitting Multinomial Logistic Models via Iteratively Reweighted Least Squares (IRLS)

Two well-known ideas in mathematical statistics, combined with a variable centering technique, produce an IRLS algorithm for computing maximum likelihood estimators of the parameters of multinomial logistic choice models. This algorithm is easy to implement, and variants of it are used in software for fitting MNL models. Our explanation of the method involves embedding the multinomial likelihood function in a more general, but simpler, Poisson likelihood function. A complex maximization involving relatively few parameters is thereby replaced by a simpler one involving a large number of "slack" parameters. Having simplified the likelihood function, we then use the IRLS algorithm to find its maximum (Jennrich and Moore, 1975).

For example, suppose that subjects are presented with 4 choice tasks, each requiring them to choose 1 of 3 alternatives: a_0, a_1, and a_2. To fit a main effects model, we use completely aggregated data. We use the symbols (f_0i, f_1i, f_2i) to denote the choice frequencies for the ith choice set. These are absolute, not relative, frequencies and may be disaggregated completely or into segments using sociodemographic variables. For the purpose of parameter estimation, the vectors of choice frequencies are modeled as independent; however, this is unlikely to be true for completely aggregated data. Under this assumption, (f_0i, f_1i, f_2i) is multinomial with probabilities P_0i, P_1i, and P_2i, where P_ai = exp(V_ai) / Σ_j exp(V_ji), V_ai = x_ai'β, x_ai is a vector of attribute variables associated with alternative a in choice set i, and β is the vector of attribute weights (see equation (1)).
Let us say that there are n subjects presented with the task of choosing product A, product B, or neither in 4 choice sets. The price of each product is either high or low, and each product either has or does not have a particular feature. The data layout might look like this:

                 Attributes                          Choice Frequencies
Choice   Price   Feature   Price   Feature
Set      A       A         B       B                A     B     None
1        Lo      No        Hi      Yes              f11   f21   f01
2        Lo      Yes       Hi      No               f12   f22   f02
3        Hi      No        Lo      Yes              f13   f23   f03
4        Hi      Yes       Lo      No               f14   f24   f04

We want to model choice probabilities with brand-specific intercepts, brand-specific feature effects, and a generic price effect. That is, if we code price as -1/1 (Lo/Hi) and feature as 1/0 (Present/Absent), the model (dropping the choice set subscript) is:

p(A) = exp(V_A) / [1 + exp(V_A) + exp(V_B)];  V_A = α_A + π·PRICE_A + φ_A·FEATURE_A
p(B) = exp(V_B) / [1 + exp(V_A) + exp(V_B)];  V_B = α_B + π·PRICE_B + φ_B·FEATURE_B

where α_A and α_B are brand-specific intercepts, π is the generic price effect, and φ_A and φ_B are brand-specific feature effects. To fit this model by iteratively reweighted least squares, we create a data matrix of independent variables.
                     Independent Variables           Choice Set Dummies (within segment)
Choice
Set    ALT   FRQ     x1   x2   x3   x4   x5          s1   s2   s3   s4
1      0     f01      0    0    0    0    0           1    0    0    0
1      1     f11      1    0   -1    0    0           1    0    0    0
1      2     f21      0    1    1    0    1           1    0    0    0
2      0     f02      0    0    0    0    0           0    1    0    0
2      1     f12      1    0   -1    1    0           0    1    0    0
2      2     f22      0    1    1    0    0           0    1    0    0
3      0     f03      0    0    0    0    0           0    0    1    0
3      1     f13      1    0    1    0    0           0    0    1    0
3      2     f23      0    1   -1    0    1           0    0    1    0
4      0     f04      0    0    0    0    0           0    0    0    1
4      1     f14      1    0    1    1    0           0    0    0    1
4      2     f24      0    1   -1    0    0           0    0    0    1

Parameters: x1 and x2 are 0/1 brand intercept dummies for products A and B, x3 carries the (generic) price of the alternative in that row, x4 and x5 are the brand-specific feature variables, and s1-s4 are choice set dummies (within segment).
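For use below, the same design matrix written as arrays (Python/NumPy; our transcription of the table above, with the frequency vector f left to the data):

```python
import numpy as np

# One row per alternative: columns x1, x2 (brand intercepts), x3 (price),
# x4, x5 (brand-specific features); the "None" alternative has all zeros.
X = np.array([
    [0, 0,  0, 0, 0], [1, 0, -1, 0, 0], [0, 1,  1, 0, 1],   # choice set 1
    [0, 0,  0, 0, 0], [1, 0, -1, 1, 0], [0, 1,  1, 0, 0],   # choice set 2
    [0, 0,  0, 0, 0], [1, 0,  1, 0, 0], [0, 1, -1, 0, 1],   # choice set 3
    [0, 0,  0, 0, 0], [1, 0,  1, 1, 0], [0, 1, -1, 0, 0],   # choice set 4
])
set_ids = np.repeat([1, 2, 3, 4], 3)
# f = np.array([...])  # the 12 observed frequencies f_ai, supplied by the data
```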
The estimation algorithm then consists of repeatedly running weighted linear regressions of a constructed dependent variable, y* (the working logit), on the independent variables. Let us collect the parameters to be estimated in the vector β' = (β1, β2, β3, β4, β5) = (α_A, α_B, π, φ_A, φ_B). The β vector is initially set equal to (0, 0, 0, 0, 0) and is repeatedly updated as follows.
Reweight

Using the current value of β̂, compute:

I. The probabilities p̂_ai(β̂):
P̂_0i = exp(V̂_0i) / [exp(V̂_0i) + exp(V̂_1i) + exp(V̂_2i)], where V̂_0i = 0;
P̂_1i = exp(V̂_1i) / [exp(V̂_0i) + exp(V̂_1i) + exp(V̂_2i)], where V̂_1i = β̂1·x_1i + β̂3·x_3i + β̂4·x_4i;
P̂_2i = exp(V̂_2i) / [exp(V̂_0i) + exp(V̂_1i) + exp(V̂_2i)], where V̂_2i = β̂2·x_2i + β̂3·x_3i + β̂5·x_5i
(the x's being taken from the corresponding alternative's row in choice set i).
II. The expected frequencies: f̂_ai = n_i · p̂_ai.
III. The residuals: e_ai = (f_ai - f̂_ai) / f̂_ai.
IV. The working logits: y*_ai = log(f̂_ai) + e_ai.

Estimate β

Center all the variables x1, ..., x5 and y* at their f̂_ai-weighted averages within choice sets in each segment, and regress centered y* on centered x1, ..., x5. The resulting regression coefficients are the updated values of β̂.
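A minimal end-to-end implementation of this Reweight/Estimate cycle (Python/NumPy; our sketch, assuming every choice set has at least one observed choice so that the fitted frequencies stay positive):

```python
import numpy as np

def irls_mnl(X, f, set_ids, n_iter=25):
    """IRLS for the MNL via the Poisson embedding: weighted regression of the
    working logit on variables centered at their fitted-frequency-weighted
    means within each choice set (equation (A1))."""
    beta = np.zeros(X.shape[1])
    sets = np.unique(set_ids)
    for _ in range(n_iter):
        v = X @ beta
        fhat = np.empty_like(v)
        for s in sets:                        # expected frequencies n_i * p_ai
            m = set_ids == s
            ev = np.exp(v[m] - v[m].max())
            fhat[m] = f[m].sum() * ev / ev.sum()
        e = (f - fhat) / fhat                 # residuals
        ystar = np.log(fhat) + e              # working logits
        Xc = X.astype(float)
        yc = ystar.copy()
        for s in sets:                        # fhat-weighted centering per set
            m = set_ids == s
            w = fhat[m] / fhat[m].sum()
            yc[m] = ystar[m] - w @ ystar[m]
            Xc[m] = Xc[m] - w @ Xc[m]
        beta = np.linalg.solve(Xc.T @ (fhat[:, None] * Xc),
                               Xc.T @ (fhat * yc))   # update via equation (A1)
    return beta
```

Applied to the example arrays above (with an observed frequency vector f), irls_mnl(X, f, set_ids) would return maximum likelihood estimates of (α_A, α_B, π, φ_A, φ_B).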
Why Does It Work?

The multinomial log likelihood function is:

L(β) = Σ_i Σ_a f_ai log(P_ai(β)).

Setting to zero the partial derivative with respect to β, we get the maximum likelihood estimating equation:

Σ_i Σ_a x_ai [f_ai - n_i P_ai(β)] = 0,

where n_i = f_0i + f_1i + f_2i. Now let us treat the frequencies f_ai as independent Poisson variates with means λ_ai, where log(λ_ai) = x_ai'β + γ_i and γ_i is treated as a free parameter (at the maximum, exp(γ_i) = n_i / Σ_a exp(x_ai'β)). Note that the conditional distribution of f_0i, f_1i, f_2i, given the sampling constraint f_0i + f_1i + f_2i = n_i, is multinomial with probabilities P_ai given above. The role of the "slack" parameters γ_i is to ensure that the sampling constraint is satisfied by the fitted model. The Poisson log likelihood is:

L(β, γ) = Σ_i Σ_a [f_ai log(λ_ai) - λ_ai],

so that the joint estimating equations for β and γ are:

n_i - exp(γ_i) Σ_a exp(x_ai'β) = 0, for each i, and:
Σ_i Σ_a x_ai [f_ai - exp(γ_i + x_ai'β)] = 0.

Solving the first equation for γ and substituting the result into the second equation yields an equation identical to the multinomial estimating equation for β; consequently, multinomial and Poisson likelihoods yield the same estimate of β. It can also be shown that the β submatrix of the inverse of the Poisson information matrix for β and γ equals the inverse of the multinomial information matrix for β, so that multinomial and Poisson models produce the same asymptotic standard errors for β. Jennrich and Moore (1975, section 3.3) failed to get this result because they did not introduce a slack parameter to handle the fixed N sampling constraint. The use of iteratively reweighted least squares to compute maximum likelihood estimates in the log-linear Poisson model is explained in Jennrich and Moore (1975). Briefly, the method is as follows. Write log(λ_ai) = z_ai'θ, where z_ai is x_ai augmented with choice set dummy variables and θ is β augmented with the slack parameters γ. The estimating equation for θ is:
Σ_i Σ_a z_ai [f_ai - λ_ai(θ)] = 0.

Expanding this around an arbitrary point θ_0, and letting

y*_ai = z_ai'θ_0 + (f_ai - λ_ai(θ_0)) / λ_ai(θ_0),

we have:

θ̂ ≅ [Σ_i Σ_a z_ai λ_ai(θ_0) z_ai']^(-1) [Σ_i Σ_a z_ai λ_ai(θ_0) y*_ai] = [Z'Λ_0 Z]^(-1) [Z'Λ_0 y*].
That is, θ̂ is approximately the vector of regression coefficients in the Λ-weighted regression of y* on the z variables. A practical impediment to the use of this method is the large number of dummy variables corresponding to the slack parameters, one per choice set per segment. If the data are disaggregated into segments, there must be one dummy variable for each choice set within each segment. This is likely to create an intractability problem. To overcome this problem, we center y* and the x-variables at their Λ-weighted means within each choice set in each segment; then these variables are orthogonal to the dummy variables corresponding to the γ parameters. Consequently, the updating equation for β̂ becomes:

β̂ = [X_c' Λ_0 X_c]^(-1) [X_c' Λ_0 Y_c*],    (A1)
where X_c and Y_c* have been centered; however, the residual degrees of freedom will be incorrect. This is the inspiration for our screening procedure. Suppose that Z is a vector of candidate variables (e.g., a brand-by-gender interaction) not included in the model. We would disaggregate the data matrix by gender, so that the first 3 rows of the data matrix become:
Gender          Choice                Indep. Vars.              Choice Set Dummies   Brand-by-Gender
(1 = M, -1 = F) Set    ALT   FRQ      x1   x2   x3   x4   x5    s11   s12   ...      Z1   Z2
1               1      0     f011      0    0    0    0    0     1     0              0    0
1               1      1     f111      1    0   -1    0    0     1     0              1    0
1               1      2     f211      0    1    1    0    1     1     0              0    1

Here Z1 = x1 × gender and Z2 = x2 × gender are the brand-by-gender interaction columns, and s11, s12, ... are choice set dummies within the gender segment (one dummy per choice set per segment); for female respondents (gender = -1) the Z columns change sign.
The screening step consists of fitting the main effects model excluding variables Z1 and Z2; that is, we iterate equation (A1) to convergence. It is immaterial whether aggregated or gender-disaggregated data are used; the estimate will be the same. To add the gender-by-brand interaction to the equation, we must disaggregate the data and append Z1 and Z2, appropriately centered, to the X_c matrix, then reiterate equation (A1) to convergence. Our screening method, in effect, refits equation (A1) but, in the interest of speed, does not reiterate. The significance levels obtained when weighted linear regression software is used to fit (A1) are incorrect, since the residual degrees of freedom include the degrees of freedom that ought to be deducted for the omitted choice set dummy variables.

The proposed screening method described in this article is an approximation to what would be an exact stepwise method for choosing variables in a MNL model. It is approximate for 2 reasons. First, the exact procedure requires iteratively reweighting the least squares fit produced by the weighted linear regression underlying each step in the stepwise regression. Second, the exact procedure requires the introduction of dummy variables (whether explicitly estimated or "absorbed") for each choice set within each segment. While it is feasible to introduce these dummy variables when working with highly aggregated data (say, 8 segments and 16 choice sets), it is not feasible when data are highly disaggregated by the inclusion of demographic interactions in the model. Thus, we trade off accuracy in our ability to identify potential effects for the flexibility of including all types of effects in the list of candidate variables.

References
Response
Models:
A Survey. Journal of Economic Literature
Ben-Akiva, Moshe, and Lerman, Steven R. Discrete Choice Analysis: Theory and Application to Travel Demand, The MIT Press, Cambridge, MA. 1985. Buckley, Patrick G. Nested Multinomial Logit Analysis of Scanner Data for a Hierarchical Choice Model. Journal of Business Research 17 (September 1988): 133-154. Cooper, G. Lee., and Masao, Nakanishi, Market-Share Analysis: Evaluating Competitive Marketing Effectiveness, Kluwer Academic Publishers, MA. 1988. Crowder, M. Maximum Likelihood Estimation Royal Statistical Society B 38 (1976). Currim,
Imran
S. Predictive
Testing
for Dependent
of Consumer
Choice
Observations. Models
Journal of the
Not Subject
to Inde-
Screening
for Interactions pendence 208-222.
J BUSN RES 1992:24:115-133
in Conjoint of Irrelevant
Alternatives.
Draper, Norman R., and Smith, Inc., New York. 1981.
Harry.
Engle, Robert. Wald, Likelihood Handbook of Econometrics. Amsterdam. 1984, Vol. 2.
133
Journal of Marketing Research 19 (May
1982):
Applied Regression Analysis, John Wiley & Sons,
Ratio and Lagrange Multiplier Test in Econometrics, in Z. Griliches and M. Intriligator, eds., North-Holland,
Green, Paul E., and Srinivasan, V. Conjoint Analysis in Consumer Research: Outlook. Journal of Consumer Research 5 (September 1978): 103-123. Horowitz, Joel. Statistical Comparison of Non-Nested Transportation Science 17 (1982): 319-350.
Probabilistic
Discrete
Jennrich, R. I., and Moore, R. H. Maximum Likelihood Estimation Least Squares, ASA Proceedings of the Statistical Computing
Issues
and
Choice Models.
by Means of Nonlinear Section, 1975: 57-65.
Louviere, Jordon J., and Woodworth, George. Design and Analysis of Simulated Consumer Choice or Allocation Experiments: An Approach Based on Aggregate Data. Journal of Marketing Research 20 (November 1983): 350-367. Louviere, Jordan J., Analyzing Decision Making-Metric Conjoint Analysis, Sage Publications, Inc., CA, 1988. Lute, Lute,
R. Duncan.
Individual Choice Behavior, Wiley, New York.
R. Duncan. The Choice chology 3 (1977): 215-33.
Axiom
After
Twenty
Years.
1959.
Journal of Mathematical Psy-
McFadden, Daniel. Econometric Models of Probabilistic Choice, in Structural Discrete Data with Econometric Application. C. F. Manski and D. McFadden, Eds. MIT Press, Cambridge, MA. 1981. Pp. 198-272. McFadden, Daniel The Choice 5 (Fall 1986): 275-297.
Theory
Approach
McFadden, Daniel. Regression-Based Specification Journal Of Econometrics 34 (1987): 63-82.
to Market
Research.
Marketing Science
Tests For The Multinomial
Logit Model.
Ramsey, J. B. Tests For Specification Errors in Classical Linear Least-Squares Regression Analysis. Journal of the Royal Statistical Society Series B, 31 (1969): 350-371 Sen, Subrata K. Introduction Science 5 (Fall 1986).
to the Special Issue on Consumer
Choice
Models.
Marketing
Thursby, Jerry G. A Comparison of Several Specification Error Tests For A General ternative. International Economic Review 30 (February 1989): 217-230.
Al-