Journal of Environmental Economics and Management 40, 21-36 (2000) doi:10.1006/jeem.1999.1104, available online at http://www.idealibrary.com
Random Coefficient Models for Stated Preference Surveys¹

David F. Layton
Department of Environmental Science and Policy, University of California, One Shields Avenue, Davis, California 95616-8576
E-mail: [email protected]

Received July 11, 1997; revised July 15, 1999

This paper combines the use of random coefficient discrete choice models that allow for correlated errors in the form of unobserved preference heterogeneity with an investigation of the reliability and usefulness of rankings data. For the data we examine, we find significant unobserved preference heterogeneity in the rankings, which violates the underlying assumptions of the rank-ordered logit model. Further, directly modeling unobserved preference heterogeneity results in significant improvement in the precision of the parameter estimates and in estimated Willingness to Pay (WTP). The reliability and usefulness of rankings data should be re-examined with econometric models not subject to the Independence of Irrelevant Alternatives (IIA). © 2000 Academic Press
I. INTRODUCTION
This paper considers the specification and testing of econometric models for stated preference surveys designed to value goods and their underlying attributes. The focus is on surveys which ask respondents to rank-order a set of alternatives. An important aspect of eliciting rankings is that, in addition to their usefulness in valuing attributes, they hold the potential for substantially reducing sampling costs, as much more information is recovered from each respondent than from a single most preferred choice (or a referendum). Previous contingent ranking/stated preference surveys include Beggs et al. [4] (electric cars), Rae [26] (visibility), Smith and Desvousges [30] (water recreation), Lareau and Rae [21] (diesel odor fumes), Hartman et al. [16] (electricity reliability), Brown et al. [9] (northern spotted owl), and Schulze et al. [29] (Superfund site cleanup). A repeated choice game approach was used by Bunch et al. [11] (alternative fueled vehicles), Adamowicz et al. [1] (combined revealed and stated preference data), and Adamowicz et al. [2] (moose hunting).²

¹ This paper is in part based on the author’s Ph.D. dissertation at the University of Washington (1995). The author thanks the members of the supervisory committee, especially Robert Halvorsen and Gardner Brown, and also Michael Ward, Todd Lee, participants of the 1996 AERE workshop, and three anonymous reviewers for helpful comments. William Schulze kindly provided the data used in this study and many helpful suggestions.

² The analysis of complete and partial rankings and the econometric issues examined in this paper are not limited to survey data. Rankings have been analyzed with a rank-ordered logit model in Bloom and Cavanagh [7] (partial rankings of arbitrators by unions and employers as part of the arbitrator selection process), Odeck [23] (partial rankings of road investment projects by agencies in Norway), Porter and Zona [25] (ranking of highway construction bids to detect collusion), Thomas [32] (new product introductions), and Thomas [33] (Consumer Product Safety Commission partial rankings for executing projects).

0095-0696/00 $35.00 Copyright © 2000 by Academic Press. All rights of reproduction in any form reserved.
Except for Bunch et al. [11], who used a nested logit model, all of the above choice and ranking studies employ conditional logit or rank-ordered logit models. It is well known that the conditional logit model exhibits the undesirable Independence of Irrelevant Alternatives (IIA) property, a property that is extremely likely to be violated in survey data. We use a random coefficient discrete choice model that allows for correlated error terms and is not subject to the Independence of Irrelevant Alternatives. Until recently, discrete choice models with correlated error terms were not computationally tractable for data that contained more than a few alternatives, leaving researchers with little option but to use logit models. Recent innovations in Monte Carlo integration techniques now make estimation of such models feasible. This approach allows us to relax the assumption that all respondents have the same preferences for the attributes being valued. The ability to model unobservable heterogeneous preferences in the population is a clear advantage of this technique.

An important empirical question is to what extent rankings (complete or partial) are consistent with eliciting a single most preferred choice. It has been previously observed that rankings data do not appear to satisfy the assumptions of the rank-ordered logit model, the only model that has been used in applied work. Ben-Akiva et al. [6] used a survey of Boston commuters to study the consistency of using rankings versus a most preferred choice based on conditional logit models. They found that even with the various corrections that have been proposed to allow pooling of the choices from different ranks, the rankings were not consistent with a single most preferred choice.

There are a number of different stated preference elicitation methods, and researchers are continually evaluating their performance. As all assessments of elicitation methods are conditional upon the underlying econometric models, it is important to reconsider the performance of these methods when significantly improved econometric approaches are developed. This paper combines the use of random coefficient discrete choice models that allow for correlated errors in the form of unobserved preference heterogeneity with an investigation of whether rankings data can be usefully employed within this richer econometric framework. For the data we examine, we find significant unobserved preference heterogeneity in the rankings, which violates the underlying assumptions of the rank-ordered logit model. Further, directly modeling unobserved preference heterogeneity results in a significant improvement in the precision of the parameter estimates and in estimated Willingness to Pay (WTP). This suggests that the reliability and usefulness of rankings data should be reexamined with econometric models not subject to IIA.

The next section reviews the standard econometric models used to analyze stated preference data. Section III describes the random coefficients discrete choice model and its estimation through simulation. In Section IV we analyze in detail a stated preference data set from Schulze et al. [29], with a focus on violations of the rank-ordered logit model, the presence of unobserved preference heterogeneity, and the additional explanatory power of additional ranks. Section V concludes the paper.
II. STANDARD ECONOMETRIC MODELS FOR STATED PREFERENCE SURVEYS
Assume that person i faces m alternatives that are exhaustive and mutually exclusive. We can represent the utility that person i receives from alternative j as

$$U_{ij} = V_{ij} + \varepsilon_{ij}, \tag{1}$$

where $V_{ij}$ is the observable part of total utility and $\varepsilon_{ij}$ is the unobservable part. The $V_{ij}$ may depend on the characteristics of the alternative and of the decision maker and is typically specified as linear in parameters with $V_{ij} = X_{ij}\beta$. The $\varepsilon_{ij}$ may represent errors in the perception or optimization of the consumer, or it could be the result of the researcher not observing all of the attributes of the alternative or of the utility function. McFadden [22] considered the discrete choice problem where a most preferred choice is observed. He showed that the conditional logit model results if and only if the error terms in Eq. (1) are distributed as independent and identical type I extreme value random variables. The probability of choice is

$$P_{ij} = \frac{\exp(w X_{ij}\beta)}{\sum_{k=1}^{m} \exp(w X_{ik}\beta)}, \tag{2}$$

where w is the scale parameter, which is inversely proportional to the standard deviation of the type I extreme value distribution (Ben-Akiva and Lerman [5]). The scale parameter is generally normalized to 1, but we have included it in (2) because it will be helpful in formulating an extension of the rank-ordered logit model.

Suppose person i is instead asked to provide the first p ranks out of m alternatives, where (p < m). That is, he or she is asked to rank the alternatives from most to least preferred. As shown in Beggs et al. [4] and Chapman and Staelin [12], under the assumption that the errors in (1) are iid type I extreme value random variables, the conditional logit model for a most preferred choice can easily be extended to a complete or partial ranking. Under these assumptions, the ranking problem is essentially a sequence of p independent most preferred choice problems. That is, person i chooses his or her preferred alternative out of the set of all m alternatives; then i picks the next preferred alternative out of the m - 1 remaining alternatives, and so on. The ranking probability for this rank-ordered logit model is

$$P_i^{\mathrm{rank}} = \prod_{j=1}^{p} \frac{\exp(w X_{ij}\beta)}{\sum_{k=j}^{m} \exp(w X_{ik}\beta)}, \tag{3}$$

where j indexes the position in the ranking, r. When inserted into the log likelihood function, the rank-ordered logit for p ranks is seen to be the sum of p separate conditional logit functions, and hence is easily estimated.

While a distinct advantage of the conditional logit model is its ease of estimation due to its closed form solution for the $P_{ij}$, it is also subject to the undesirable Independence of Irrelevant Alternatives property (IIA). As is well known, this property implies that the joint probability of choosing close substitutes will be
overestimated by conditional logit models. A related limitation of the IIA property is revealed by considering the fact that the $\varepsilon_{ij}$ are assumed to be independently and identically distributed. As Hausman and Wise [18] noted, it appears likely that whatever contributed to the error observed for person i’s evaluation of option j is likely to have affected the evaluation of option k as well. As we discuss in Section III, unobserved preference heterogeneity is a likely source of correlation which will violate the assumptions of the conditional logit model.

Because the rank-ordered logit is an extended version of the conditional logit model, it embodies the same potential problems. One might even expect it to be less robust to misspecification than the conditional logit model because it exploits the IIA assumption to a much greater degree. Modeling a p dimensional ranking as a sequence of p sequential choices makes much stronger assumptions than does simply modeling the first choice. It has been found with rankings data that the magnitude of the parameter estimates often falls as ranks are added (Hausman and Ruud [17]). A common sense explanation for this pattern is that people rank lower-valued alternatives with less care than they rank higher-valued alternatives, or that they are simply more “sure” of their first few choices than they are about their last few choices. Hausman and Ruud [17] developed the rank-ordered heteroscedastic logit model, which allows each rank to have a different scale parameter to reflect the potential for increasing noise in the choice data:

$$P_i^{\mathrm{rank}} = \prod_{j=1}^{p} \frac{\exp(w_j X_{ij}\beta)}{\sum_{k=j}^{m} \exp(w_j X_{ik}\beta)}. \tag{4}$$

Up to p - 1 scale parameters are identifiable in this model. If we normalize $w_j$ of the lowest rank to equal 1, then we may expect the other scale parameters to be greater than 1 and to increase as we move up through the ranking. The heteroscedasticity correction in (4) may allow the inclusion of more ranks if the rank-ordered logit model is essentially correct but people simply exert less effort in choosing lower-valued alternatives. For our data, we will test and reject this hypothesis.

A more attractive approach is to remove the IIA assumption altogether. Next we discuss computationally tractable random coefficient discrete choice models that achieve this. Before doing so, it is important to emphasize how the model in (4) differs from the random coefficient models developed in the next section. The rank-ordered heteroscedastic logit model in (4) allows for scale parameters that are common to all respondents but that vary across ranks. The random coefficient model developed next will assume a common scale parameter across the ranks but model preferences varying across respondents.

III. RANDOM COEFFICIENT DISCRETE CHOICE MODELS
Hausman and Wise [18] motivated a discrete choice model with correlated errors due to the presence of unobserved preference heterogeneity. This model, shown in (5), begins by decomposing the error term in (1) into an unobserved preference heterogeneity component and an alternative specific component.

$$U_{ij} = X_{ij}\beta_i + \varepsilon_{ij} = X_{ij}\bar{\beta} + X_{ij} b_i + \varepsilon_{ij} \tag{5}$$
Each person now has his or her own vector of parameters, $\beta_i$, which deviates from the population mean, $\bar{\beta}$, by the vector $b_i$. At this point, a parametric analysis requires the specification of a distribution for the alternative specific error terms, $\varepsilon_{ij}$, and the parameters of interest, $\beta$. If we assume that they are both normally distributed, the Multinomial Probit model (MNP) of Hausman and Wise [18] results. Other distributional assumptions can be made about $\beta_i$ and $\varepsilon_{ij}$. In any case, the probabilities associated with the model in (5) will generally not have a closed-form solution. Evaluating them generally requires multivariate integration of a complicated integrand, which is not practical for more than a few alternatives. Simulation estimators for the probabilities have been developed that allow high-dimensional choice probabilities to be computed accurately and quickly enough for use in iterative maximum likelihood estimation. Chen and Cosslett [13] employed the random coefficient MNP model and estimated it by the method of simulated maximum likelihood using the GHK simulator (Börsch-Supan and Hajivassiliou [8], Geweke [15], and Keane [19]). Recently, Train has developed a simulator explicitly for the random coefficient specification and applied it as a “random parameters logit” model in Brownstone and Train [10], Revelt and Train [27], and Train [35].³

The choice of a simulator depends upon how one wishes to introduce and model correlation, and on the structure of the particular problem at hand. For instance, the computational effort required when using the GHK simulator does not depend upon the number of random coefficients but upon the number of alternatives. By contrast, in the simulator developed by Train, the additional computational effort depends mostly upon the number of random coefficients. Train’s simulator also allows for non-normally distributed random coefficients, which is attractive for valuation as the price parameter can have a distribution that does not allow any portion of the population to like higher prices.

A brief discussion of the random coefficient (or parameters) logit model and how it can be estimated via simulation naturally draws from the papers by Train. First assume all correlation is due to the random coefficients, $\beta_i$. Denote the density of $\beta_i$ as $f(\beta)$ and assume that it is independent of the $\varepsilon$, which are iid random variates with distribution function $A$. The probability that person i will choose choice j from a set of alternatives can be written as

$$P_{ij} = \Pr\left(X_{ij}\beta_i + \varepsilon_{ij} > X_{ik}\beta_i + \varepsilon_{ik},\ \forall k \neq j\right). \tag{6}$$
Next we rewrite (6) in integral form, conditional first on the alternative specific error of the chosen alternative, $\varepsilon_{ij}$, and then on the random coefficients, $\beta_i$. This yields

$$P_{ij} = \int \left[ \int \prod_{k \neq j} A\left(X_{ij}\beta_i - X_{ik}\beta_i + \varepsilon_{ij}\right) dA(\varepsilon_{ij}) \right] f(\beta_i)\, d\beta_i. \tag{7}$$

The portion in the square brackets is the probability conditional on the $\beta_i$. Both sets of integrals are taken over the range of the respective random variables. Note that in general $f(\beta)$ is multivariate and the dimension of the outer integral will be equal to the number of random parameters. The $\varepsilon$ could have any distribution, say normal, but if we assume that the $\varepsilon$ are distributed as type I extreme value, then conditional on the $\beta_i$ the inner integral in square brackets yields the standard multinomial logit probability term for m alternatives from (2).

$$P_{ij} = \int \left[ \frac{\exp(X_{ij}\beta_i)}{\sum_{k=1}^{m} \exp(X_{ik}\beta_i)} \right] f(\beta_i)\, d\beta_i \tag{8}$$

³ In the literature, the terms “random parameters,” “random coefficients,” and “varying parameters” are all used and mean the same thing.
For ease of reading we have normalized the scale parameter, w, to one in (8).⁴ Note that the beta vector is still indexed for person i. The analytic integration of (8) is not possible, but the assumption of type I extreme value random variables for the $\varepsilon$ is convenient because the inner portion at least has a closed-form solution, which makes integrating it by simulation straightforward. To implement the estimation procedure, simply draw a set of $\beta_i$ from $f(\beta)$ and calculate the interior portion (which is simply (2)) repeatedly, and then average over the number of replications. As discussed in Brownstone and Train [10], the simulator is smooth, strictly positive, and unbiased with only one draw for $\beta_i$. Accuracy is improved as more replications are used. For R replications (R draws of $\beta_i$ from $f(\beta)$) the probability of choice is simulated as

$$\hat{P}_{ij} = \frac{1}{R} \sum_{r=1}^{R} \frac{\exp(X_{ij}\beta_{ir})}{\sum_{k=1}^{m} \exp(X_{ik}\beta_{ir})}. \tag{9}$$

The index ir on the beta indicates that the probability is calculated for each respondent using R different sets of beta vectors. This model is easily extended for rank-ordered data after noting that, conditional on the random coefficients, the inner portion of (9) becomes the rank-ordered logit model in (3). So the rank-ordered counterpart of (9) is

$$\hat{P}_i^{\mathrm{rank}} = \frac{1}{R} \sum_{r=1}^{R} \prod_{j=1}^{p} \frac{\exp(X_{ij}\beta_{ir})}{\sum_{k=j}^{m} \exp(X_{ik}\beta_{ir})}, \tag{10}$$
where again w, the scale parameter, is normalized to one. Considering (10), we can see one consideration that would affect the choice of a simulator. In (10) it is necessary to calculate the conditional logit probability p times (once for each rank), and this must be done for each of the R replications, for each respondent. By its construction, the computational difficulty of the GHK simulator is of the same dimension for both rank-ordered and first-rank (or most preferred) data. Therefore, for large p, researchers may find the GHK simulator preferable. The empirical application in Section IV utilizes data with two ranks, so the extra computational effort is more than offset by the other positive features of Train's simulator.

Whether one chooses the GHK or Train's simulator, decisions have to be made about how to model correlation. We will discuss our particular specification in the next section. For now we point out that if a random coefficient is assumed for an explanatory variable that is present in all alternatives for a given respondent, this will make the errors for all alternatives correlated, thus removing the restrictive IIA property.

⁴ The scale parameter is only used in the model with rank-ordered heteroscedasticity.

IV. EMPIRICAL ANALYSIS
Survey Data

The data used in the empirical analysis come from a stated preference survey of public preferences for Superfund hazardous waste site cleanup. This survey by Schulze et al. [28, 29] was administered in the Denver, Colorado, area to 180 people. The printed, self-administered survey extensively described a hypothetical hazardous waste site with ground-water and surface-water contamination, the risks associated with the site, and five options for cleaning up the site. The five options were based on typical cleanup actions performed at Superfund sites. The options, their titles, and the risk of death as described in Schulze et al. [29] are listed below. The actual description of each program was approximately half a page in length.

Option A, “no action”: no efforts are taken at the site. Odor problems remain at the site. Risk of death equals 2.1 deaths per million people per year.

Option B, “institutional controls”: action is taken to remove immediate health and environmental risks by fencing the site and finding alternative sources to replace contaminated wells. Odor problems remain at the site. Risk of death equals 0.1 deaths per million people per year.

Option C, “landfill cap and ground-water filtration”: action is taken to eliminate off-site exposure risks from surface water runoff, and ground water is treated for use. Odor problems are eliminated at the site. The risk of death is eliminated.

Option D, “landfill cap and ground-water barrier”: action is taken to eliminate off-site exposures as in option C and to prevent further ground-water contamination. Odor problems are eliminated at the site. The risk of death is eliminated.

Option E1, “complete cleanup”: actions are taken to physically remove all contaminants to licensed off-site disposal facilities, and the contaminated ground water is pumped, treated, and re-injected into the ground. Odor problems are eliminated at the site. The risk of death is eliminated.

Option E2: same as E1, except that there is now a risk of 0.1 deaths per million people due to the removal and transportation of contaminated substances (e.g., wind-blown soil).

Approximately one-half of the respondents received a survey with the fifth option being E1, and the other half received a survey with the fifth option being E2. Each option had an associated price which varied over the survey respondents. Each respondent was asked to indicate his or her most preferred and second most preferred options. The respondents’ first and second choices over the five options presented to them, along with costs of each option, make up the basic data set. Of the 180 people who took the survey, 171 indicated a unique most and second most preferred option, which constitutes the sample for the analysis. For a more complete description of the survey design and administration, see Schulze et al. [28, 29].

The goal of the application will be to estimate the Willingness to Pay (WTP) for each of the five different cleanup options. The specification of the indirect utility function reflects this by using the Price of each option and alternative specific constants for options B through E2 (denoted B, C, D, E1, E2).⁵ This specification is very favorable to the assumptions of the conditional logit and rank-ordered logit models. Alternative specific constants are one way to allow for heteroscedastic errors and mitigate to a considerable degree the restrictive IIA property (Train [34]). In this application we have only one non-alternative specific variable. Generally, researchers using stated preference methods will wish to estimate valuations for a number of non-alternative specific variables, whereas in our case the valuation of the alternative specific variables is our goal. This is essentially a “best-case” application for the conditional logit model. Further, since there are only two ranks, it is the “best case” for the rank-ordered logit model. With only two ranks and one non-alternative specific variable, one would anticipate the fewest problems for the rank-ordered logit model.

The analysis begins by examining whether the rank-ordered logit model is appropriate for these data. Briefly, we find that it is not. The heteroscedastic extension of Hausman and Ruud [17] does not salvage it. Further, with either one or both ranks, we cannot estimate a statistically significant WTP. We then estimate a random coefficient extension of the rank-ordered logit model which demonstrates a source of mis-specification in the standard rank-ordered logit model and yields statistically significant estimates of WTP.
Logit Models

Table I shows the results of the models estimated using Price and the alternative specific constants for B, C, D, E1, and E2 as explanatory variables. The first two models are the unranked and ranked conditional logit models (one and two ranks, respectively), shown as C-Logit. The third model allows for rank-ordered heteroscedasticity and is shown as H-Logit. These three models were estimated by the method of maximum likelihood in Gauss.

Beginning with the first two models (C-Logit with 1 and 2 ranks), the parameters have the expected signs, with Price being negative and the five alternative specific variables being positive. Each alternative as we move from B through E1 represents a more inclusive good, which provides more environmental and/or health benefits, and the increasing estimates confirm this. Option E2 is objectively worse than E1 as it is the same option but with a slightly higher health risk. The estimate of E2 is slightly lower than E1. Unfortunately, the price coefficient is insignificant. All three models exhibit reasonably good fit on the basis of the Pseudo R². The Pseudo R² is defined to be $(1 - LL/LL_0)$, where $LL$ is the log likelihood of the estimated model and $LL_0$ is the log likelihood of the model when all of the coefficients are restricted to be zero (in the tables, $LL_0$ is shown as the log likelihood at 0).⁶ The standard likelihood ratio test rejects the restriction that all of the coefficients equal zero at less than a 1% significance level for all of the models.⁷

⁵ For identification, the coefficient for the first alternative constant, A, is normalized to zero.

⁶ The Pseudo R² reported is based on all of the coefficients being equal to zero, including the alternative specific constants, since they represent all of the non-price attributes of the alternatives and are critical to estimation of the welfare measures. The likelihood at all betas equal zero is computed as the probability of a random choice, which equals 1/5 for the choice models and 1/20 for the ranking models, so the respective log likelihoods are 171 * ln(0.2) = -275.214 and 171 * ln(0.05) = -512.270.

TABLE I
Model Estimates

| Estimate (t-statistic) | C-Logit, 1 Rank | C-Logit, 2 Ranks | H-Logit, 2 Ranks | RC-Logit, 1 Rank | RC-Logit, 2 Ranks |
|---|---|---|---|---|---|
| B | 2.158 (2.905) | 2.304 (4.683) | 2.454 (4.112) | 13.297 (1.550) | 5.857 (3.183) |
| C | 2.898 (4.004) | 3.240 (6.114) | 3.455 (4.869) | 17.081 (1.654) | 8.121 (3.792) |
| D | 3.557 (4.944) | 3.834 (6.990) | 4.092 (5.411) | 19.135 (1.732) | 9.697 (4.174) |
| E1 | 4.549 (5.345) | 3.977 (5.214) | 4.157 (5.048) | 22.366 (1.841) | 12.925 (4.584) |
| E2 | 4.356 (5.016) | 3.819 (4.881) | 3.991 (4.692) | 21.802 (1.785) | 12.558 (4.442) |
| Price (fixed coefficient) | -0.026 (-1.707) | -0.015 (-1.047) | -0.015 (-0.959) | - | - |
| Price-μ | - | - | - | -1.904 (-2.208) | -1.973 (-6.743) |
| Price-σ | - | - | - | 1.600 (3.888) | 1.179 (6.146) |
| Scale | - | - | 0.870 (-0.560) | - | - |
| LL at 0 | -275.214 | -512.270 | -512.270 | -275.214 | -512.270 |
| LL | -222.924 | -406.005 | -405.839 | -219.632 | -373.479 |
| Pseudo-R² | 0.190 | 0.207 | 0.208 | 0.202 | 0.271 |

Note. All models were run in Gauss. The RC-Logit models were estimated by Simulated Maximum Likelihood using 1000 replications of the simulator. The pseudo-R² is computed as 1 - LL/(LL at 0). In all models except for the H-Logit, the scale factor is normalized to one. The t-statistic for the scale factor is computed relative to one and not zero. In the RC-Logit models, the price variable is entered as the negative of Price so that the distribution of the Price coefficient is everywhere positive.
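The fit statistics quoted for the conditional logit models can be checked with a few lines of arithmetic. The sketch below is only a check, not the original Gauss code: it assumes the 171 usable respondents and five options stated in the text, takes the maximized log likelihoods from Table I, and uses helper names (`loglik_at_zero`, `pseudo_r2`, `lr_statistic`) that are introduced here for illustration.

```python
import math

N_RESPONDENTS = 171   # usable sample reported in the text
N_ALTERNATIVES = 5    # options presented to each respondent


def loglik_at_zero(n_obs, n_outcomes):
    """Log likelihood when every outcome is equally likely (all betas zero)."""
    return n_obs * math.log(1.0 / n_outcomes)


def pseudo_r2(ll_model, ll_zero):
    """Pseudo R-squared defined as 1 - LL / LL_0."""
    return 1.0 - ll_model / ll_zero


def lr_statistic(ll_model, ll_zero):
    """Likelihood ratio statistic for restricting all coefficients to zero."""
    return 2.0 * (ll_model - ll_zero)


# One rank: 5 possible first choices. Two ranks: 5 * 4 = 20 ordered pairs.
ll0_choice = loglik_at_zero(N_RESPONDENTS, N_ALTERNATIVES)
ll0_ranking = loglik_at_zero(N_RESPONDENTS, N_ALTERNATIVES * (N_ALTERNATIVES - 1))

# Maximized log likelihoods copied from Table I (C-Logit columns).
ll_clogit_1 = -222.924
ll_clogit_2 = -406.005

print("LL at 0:", round(ll0_choice, 3), round(ll0_ranking, 3))
print("Pseudo R2:", round(pseudo_r2(ll_clogit_1, ll0_choice), 3),
      round(pseudo_r2(ll_clogit_2, ll0_ranking), 3))
print("LR statistics:", round(lr_statistic(ll_clogit_1, ll0_choice), 2),
      round(lr_statistic(ll_clogit_2, ll0_ranking), 2))
```

With six restrictions in each model (five constants plus Price), both likelihood ratio statistics are far above the 1% critical value of the chi-squared distribution with 6 degrees of freedom (16.81, the value quoted in the text for the pooling test).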
Consistency of Ranking

As noted earlier, a common observation in rankings data is that the parameter estimates decrease in magnitude as additional ranks are added (see Chapman and Staelin [12], Hausman and Ruud [17], Ben-Akiva et al. [6]). This is commonly interpreted as being due to less precision in the choices within the lower ranks. The survey data considered here consist of two ranks, and it would be informative to see how robust the model is with respect to the addition of the second rank.⁸ In Table I, it is clear that the parameter estimates from the first rank model (C-Logit 1 Rank) are not equivalent to the estimates from the two-rank model (C-Logit 2 Ranks), but the coefficients of three of the six variables increased while those of the other three decreased. To test the hypothesis of the equality of the parameter estimates between the two models, a Likelihood Ratio test can be formed as in Hausman and Ruud [17] and Ben-Akiva et al. [6]. The test statistic is computed as

$$LR = 2\left[L_1(\beta_1) + L_2(\beta_2) - L(\beta_{12})\right], \tag{11}$$

where $L(\beta_{12})$ is the log likelihood of the ranked model, $L_1(\beta_1)$ is the log likelihood from the model estimated for the first rank only, and $L_2(\beta_2)$ is the log likelihood from the model estimated for the second rank only. The test statistic is distributed chi-squared with degrees of freedom equal to the difference in the number of parameters in the ranked model and the two unranked models, under the null hypothesis of correct specification. The test statistic is 33.20, and the 1% critical value for the $\chi^2(6)$ distribution is 16.81. The null hypothesis that the first two ranks can be consistently pooled is strongly rejected.

Given the rejection of the addition of the second rank in the standard rank-ordered logit model, we consider whether allowing for the possibility of rank-ordered heteroscedasticity will allow the consistent pooling of the two ranks. In the model of Eq. (4), under the hypothesis of rank-ordered heteroscedasticity, we expect that the choice of the first rank should be more precise than the choice of the second, indicating that the scale parameter should be greater than 1, after normalizing on the lowest rank. The results of the model with two ranks with a correction for rank-ordered heteroscedasticity are shown in Table I as the H-Logit model. Although the scale factor is less than 1 rather than greater than 1, it is not significantly different from 1, indicating that the scale parameter is essentially the same in both ranks. So we reject the hypothesis that increasingly noisy rankings are the cause of our inability to pool the two ranks. Next we consider whether there is evidence of unobserved preference heterogeneity which would violate the assumption of iid type I extreme value errors and cause the observed problems in pooling the ranks.

⁷ The Likelihood Ratio statistic for the conditional logit model with one rank is 104.58; for the rank-ordered logit model with two ranks it is 212.53.

⁸ In fact, the reason the survey asked for the first two ranks instead of all four is that there is some evidence that while the lower ranks are less reliable, the negative effects of adding ranks may not set in until after the first few ranks (William Schulze, personal communication).
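The pooling test described above can be sketched as follows. This is an illustration under stated assumptions rather than the original procedure: the statistic is formed as twice the gap between the sum of the two separately estimated log likelihoods and the pooled log likelihood, the function names are introduced here, and the tail probability uses the closed form that holds for even degrees of freedom so that only the standard library is needed. The separate second-rank log likelihood is not reported in the text, so only the reported statistic of 33.20 is evaluated.

```python
import math


def pooling_lr(ll_pooled, ll_rank1, ll_rank2):
    """LR statistic comparing the pooled ranked model with separate rank models."""
    return 2.0 * (ll_rank1 + ll_rank2 - ll_pooled)


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variate; closed form for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k      # builds (x/2)^k / k!
        total += term
    return math.exp(-half) * total


n_params = 6                  # five alternative specific constants plus Price
df = 2 * n_params - n_params  # two separate models versus one pooled model
lr_reported = 33.20           # statistic reported in the text

p_value = chi2_sf_even_df(lr_reported, df)
tail_at_critical = chi2_sf_even_df(16.81, df)

print("degrees of freedom:", df)
print("p-value of the reported statistic:", p_value)
print("tail probability at the quoted critical value:", tail_at_critical)
```

The tail probability at 16.81 should come out very close to 0.01, which agrees with the 1% critical value quoted in the text, and the p-value of the reported statistic is several orders of magnitude smaller.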
Modeling Heterogeneous Preferences Here we use a random coefficient Logit (RC-Logit) model to estimate the parameters of the indirect utility function while allowing for unobserved preference heterogeneity. The RC-Logit does not exhibit IIA, and we expect that modeling unobserved preference heterogeneity will allow us to consistently pool the two ranks. Discrete choice models with correlated errors are extremely flexible, and we must make some assumptions about how to model correlation. With only one non-alternative specific variable, Price, the easiest and most parsimonious way to introduce correlation in the error structure across all of the alternatives is to assume that the price coefficient is a random coefficient. With only one random
31
RANDOM COEFFICIENT MODELS
coefficient, the Train simulator discussed in Section I11 is by far the easiest and fastest simulator to use because the computational effort of this simulator depends upon the number of random coefficients. Once the simulator decision has been made, then a distribution must be specified. Brownstone and Train [lo] and Train [3S] used a lognormal distribution for the coefficient of price. This distribution has nice properties in that it is easy to draw from, it restricts the price coefficient for all respondents to having the same sign, and the mean and median of its inverse have a closed form which is convenient in welfare analysis. So we assume that the price coefficient has a lognormal distribution. The draws for the lognormally distributed price coefficient are made as exp(Z), where Z is drawn as an N( p, (T 2 ) . The price coefficient will be distributed lognormally with mean equal to exp( p ( (T 2/2)), median equal to exp( p), and standard deviation equal to exp( p ( (T 2 / 2 ) )* [exp(a2) - l]".s (Brownstone and Train [lo], Train [35], Revelt and Train [27]). The model is estimated, and results are reported in terms of p and (T.We use 1000 draws for the Price coefficient. Because the price of the option appears in all alternatives, this induces correlation in the error structure for all of the alternatives with the addition of only one parameter. The last two columns of Table I contain the results for the RC-Logit model with one and two ranks. These models were estimated by the method of Simulated Maximum Likelihood in Gauss, using 1000 replications of the simulator described in (9) and (10). The first rank RC-Logit model with a lognormally distributed price coefficient yields a number of interesting results. First, it maintains the result that each option is worth more as more cleanup is undertaken, but these estimates are now insignificant. Most interestingly, (T is significantly greater than 0. This is essentially a test of the IIA assumption. 
The fact that $\sigma$ is significantly greater than 0 indicates that the errors in the most preferred choice model, even with just a price variable and alternative specific constants, still had residual correlation and heteroscedasticity. This violates the assumptions of the conditional logit model and would lead to the results discussed earlier, that the two ranks could not be pooled via the rank-ordered logit model. Under the assumption of a lognormally distributed price coefficient, the two ranks are pooled using (10). Now we find that all of the parameter estimates are significant at the 1% level. The fit of the model as measured by the Pseudo $R^2$ is now greatly improved compared to either the C-Logit model with two ranks or the RC-Logit with one rank, from about 0.2 to 0.271. The two-rank RC-Logit model maintains the qualitative results that we saw in the other models, but the inclusion of the lognormally distributed price coefficient combined with two ranks allows us to precisely estimate all of the parameters of the model.⁹ The reason the RC-Logit models have coefficients on B through E2 that are greater in magnitude than those in the C-Logit models is that the models are scaled relative to the alternative specific error term. The error decomposition in (5) necessarily attributes less to the alternative specific error term, which results in a rise in the coefficient estimates.
⁹ Given that the alternative specific variables are one way to try to deal with IIA, we would not anticipate much in the way of remaining residual correlation after incorporating the random price coefficient. To check this, a model with the lognormally distributed price coefficient and all alternative specific constants normally distributed was estimated and did not result in a statistically significant increase in the log likelihood.
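The lognormal price coefficient described above can be illustrated with a short sketch. It draws the coefficient as exp(Z) with Z normal and compares the simulated mean, median, and standard deviation with the closed forms quoted in the text. The parameter values, function names, and number of draws are assumptions chosen only for illustration; they are not the paper's estimates, and the paper's own estimation was carried out in Gauss.

```python
import math
import random
import statistics


def lognormal_moments(mu, sigma):
    """Closed-form mean, median, and standard deviation of exp(Z), Z ~ N(mu, sigma^2)."""
    mean = math.exp(mu + sigma ** 2 / 2)
    median = math.exp(mu)
    sd = mean * math.sqrt(math.exp(sigma ** 2) - 1)
    return mean, median, sd


def draw_price_coefficients(mu, sigma, n_draws, seed=0):
    """Draw the price coefficient as exp(Z); every draw is strictly positive."""
    rng = random.Random(seed)
    return [math.exp(rng.gauss(mu, sigma)) for _ in range(n_draws)]


# Hypothetical parameter values (not taken from Table I).
mu, sigma = -3.0, 0.8
draws = draw_price_coefficients(mu, sigma, n_draws=200_000)

mean, median, sd = lognormal_moments(mu, sigma)
sim_mean = statistics.fmean(draws)
sim_median = statistics.median(draws)
sim_sd = statistics.pstdev(draws)

print(f"mean:   closed form {mean:.5f}, simulated {sim_mean:.5f}")
print(f"median: closed form {median:.5f}, simulated {sim_median:.5f}")
print(f"sd:     closed form {sd:.5f}, simulated {sim_sd:.5f}")
```

Because every draw is the exponential of a normal variate, the sign restriction on the price coefficient holds by construction, which is the property the text relies on for welfare analysis.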
The RC-Logit model with two ranks is clearly the superior model for these data. In other cases, the costs associated with using the more restrictive but easier to estimate models may not be so stark. A number of questions may emerge about how to design an effective modeling strategy when there are many ranks. First, violations of IIA should always be a concern, including in just the first rank. One should test for violations of IIA. If they are detected, then a more flexible model should be estimated. The situation becomes more complex if, for instance, additional ranks cannot be pooled. As discussed earlier, the heteroscedastic rank-ordered logit model in (4) can estimate up to p - 1 different scale parameters for p ranks. If evidence is found supporting the idea of rank-ordered heteroscedasticity, a mixed model that allows for both rank-ordered heteroscedasticity and random coefficients can be estimated. It may be difficult in practice to estimate models with a large number of parameters designed to capture both rank-ordered heteroscedasticity and unobserved preference heterogeneity. In this situation, nonnested model selection techniques such as those developed in Pollak and Wales [24] may be applied to discriminate between the two separate kinds of models. The focus of the research for non-market valuation is the welfare measures derived from the estimated model, but the researcher may still be interested in the predictive performance of the model. It is also possible to base model selection on predictive performance, as opposed to the more common likelihood-based approach. A rigorous predictive-based approach is cross-validation, to which we turn before we discuss the estimation of welfare measures from the rank-ordered RC-Logit model.
Cross-Validation

The number of correctly predicted choices can be generated from the discrete choice models. This within-sample prediction may be informative, but to the extent that the researcher is interested in prediction (as opposed to welfare measures) the more useful measure is the number of correct out-of-sample predictions. This can be accomplished by cross-validation (Stone [31], or see Efron and Tibshirani [14]). The cross-validation technique we implement is known as “leave-one-out” cross-validation. In this approach, one observation is removed from the data set of size n and the model is estimated on the n - 1 remaining observations. Next, the model results are used to predict the dependent variable for the one removed observation. This process is repeated for all n observations.¹⁰ In the case of discrete choice models, we use the estimated models to predict the most preferred alternative. This requires simulation for the random coefficient models. One thousand replications which are independent of those used in estimation are employed. Table II shows the results of the cross-validation for the five models estimated. First note that with five alternatives, random prediction would yield a 20% correct prediction rate. The differences between the cross-validation results for the five models are instructive and mirror what we see in the model results previously discussed. First, there is a noticeable degradation in the number of correct

¹⁰ A K-fold cross-validation can be used as well, where K observations are omitted. Given the size of the data set and the fact that only the RC-Logit model with two ranks has all parameters significant, it is best to use the common “leave-one-out” approach and set K = 1. For further discussion of these points, see Efron and Tibshirani [14].
TABLE II
Cross-Validation Results

Model:                     C-Logit      C-Logit      H-Logit      RC-Logit     RC-Logit
                           1 Rank       2 Ranks      2 Ranks      1 Rank       2 Ranks

No. correct predictions
out of 171 (% correct)     59 (34.5%)   54 (31.6%)   54 (31.6%)   74 (43.3%)   74 (43.3%)
predictions when moving from the first choice conditional logit to the rank-ordered logit model, which is consistent with the fact that we could not pool the two ranks via the standard rank-ordered logit model. Second, the heteroscedastic rank-ordered logit model yields the same number of correct predictions as the non-heteroscedastic model, which is consistent with our findings that the rank-ordered heteroscedasticity correction did not represent an improvement. Third, the RC-Logit model has a much improved prediction rate over the standard conditional logit model. And finally, the rank-ordered RC-Logit model has the same prediction rate as the one-rank RC-Logit model, which is consistent with the observation that the RC model consistently pools the two ranks and that the main gain from the addition of the second rank is in more precise estimates, which result in tighter confidence intervals for the parameter estimates. The results of the cross-validation are consistent with the results of the specification testing, so we now derive welfare estimates from the RC-Logit model with two ranks.
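The leave-one-out procedure described above can be sketched generically as follows. This is a minimal illustration and not the code used for Table II: the `fit` and `predict` callables are hypothetical stand-ins for estimating a choice model and predicting the most preferred alternative, and the toy "model" here simply predicts the alternative chosen most often in the training data.

```python
from collections import Counter


def leave_one_out_correct(observations, fit, predict):
    """Count correct out-of-sample predictions under leave-one-out cross-validation.

    Each observation is dropped in turn, the model is refit on the remaining
    n - 1 observations, and the held-out choice is predicted.
    """
    correct = 0
    for i, held_out in enumerate(observations):
        training = observations[:i] + observations[i + 1:]
        model = fit(training)
        if predict(model, held_out) == held_out["choice"]:
            correct += 1
    return correct


def fit_choice_counts(training):
    """Toy stand-in for estimation: tabulate how often each alternative is chosen."""
    return Counter(obs["choice"] for obs in training)


def predict_modal_choice(model, observation):
    """Toy stand-in for prediction: the most frequently chosen alternative."""
    return model.most_common(1)[0][0]


# Hypothetical data: ten respondents choosing among alternatives A, B, and C.
observations = [{"choice": c} for c in ["A"] * 5 + ["B"] * 3 + ["C"] * 2]

n_correct = leave_one_out_correct(observations, fit_choice_counts, predict_modal_choice)
rate = n_correct / len(observations)
print(f"{n_correct} correct out of {len(observations)} ({rate:.1%})")
```

For a random coefficient model, `predict` would additionally require simulating the choice probabilities with draws independent of those used in estimation, as the text notes.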
Welfare Estimates

Calculating the WTP for each option requires the inverse of the price coefficient, which in the RC-Logit model has a lognormal distribution. The inverse of a lognormally distributed random variable is lognormally distributed with mean and median as shown in (12) (Aitchison and Brown [3]).
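For reference, the standard lognormal property being invoked can be written out as a short derivation. This is a sketch in assumed notation: $\beta_p$ denotes the price coefficient and $\mu$, $\sigma$ the parameters of its lognormal distribution. Since the logarithm of the inverse is the negative of a normal variate, the inverse is again lognormal with the sign of the location parameter reversed.

```latex
\ln \beta_p \sim N(\mu, \sigma^2)
\;\Longrightarrow\;
\ln \beta_p^{-1} = -\ln \beta_p \sim N(-\mu, \sigma^2),
\qquad
E\!\left[\beta_p^{-1}\right] = e^{-\mu + \sigma^2/2},
\qquad
\operatorname{Median}\!\left(\beta_p^{-1}\right) = e^{-\mu}.
```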
Before incorporating the fact that the welfare measures are based on econometric estimates which are asymptotically normally distributed, mean WTP from the RC-Logit model is

$$\text{Mean WTP} = E(\beta_p^{-1} \cdot \text{Alt. Cons.}) = e^{-\mu + (1/2)\sigma^2} \cdot \text{Alt. Cons.}, \qquad (13)$$

where $\beta_p^{-1}$ is the inverse of the price coefficient, $\mu$ and $\sigma$ refer to the parameters of the lognormal distribution for the price coefficient, and Alt. Cons. refers to the estimated parameter for any of the alternative specific constants B through E2. Equation (13) uses the fact that the price coefficient (and its inverse) is distributed independently of the coefficient for the alternative specific constants by assumption. Because the lognormal distribution is skewed, the median will differ from the mean and will be less than it. The median WTP can be computed similarly to (13).

$$\text{Median WTP} = e^{-\mu} \cdot \text{Alt. Cons.} \qquad (14)$$
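The two welfare formulas in (13) and (14) can be evaluated directly once the parameters are in hand. The sketch below uses hypothetical values for the lognormal parameters and for one alternative specific constant; the function names and numbers are illustrative assumptions and are not the estimates behind Table III.

```python
import math


def mean_wtp(mu, sigma, alt_const):
    """Mean WTP as in (13): mean of the inverse price coefficient times the constant."""
    return math.exp(-mu + 0.5 * sigma ** 2) * alt_const


def median_wtp(mu, alt_const):
    """Median WTP as in (14): median of the inverse price coefficient times the constant."""
    return math.exp(-mu) * alt_const


# Hypothetical parameter values, for illustration only.
mu, sigma, alt_const = -3.0, 1.2, 2.0

wtp_mean = mean_wtp(mu, sigma, alt_const)
wtp_median = median_wtp(mu, alt_const)

print(f"mean WTP:   {wtp_mean:.2f}")
print(f"median WTP: {wtp_median:.2f}")
print(f"mean / median ratio: {wtp_mean / wtp_median:.3f}")
```

At fixed parameter values the ratio of the two measures equals exp(sigma^2 / 2), which is why the mean exceeds the median whenever sigma is positive.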
TABLE III
Estimated Willingness to Pay from the Random Coefficient Rank-Ordered Model

Program    Mean WTP                         Median WTP
B          $93.37 ($26.59 to $228.65)       $42.13 ($17.59 to $74.40)
C          $128.76 ($44.39 to $303.43)      $58.53 ($30.67 to $96.46)
D          $153.06 ($57.83 to $348.86)      $69.89 ($39.86 to $110.98)
E1         $200.07 ($88.30 to $422.73)      $92.49 ($60.78 to $135.17)
E2         $194.25 ($85.54 to $410.75)      $89.75 ($58.71 to $130.91)
E1-E2      $5.83 (-$16.69 to $35.28)        $2.74 (-$7.77 to $14.50)
Note. The 95% confidence intervals are obtained by the Krinsky-Robb method [20] and are based on 10,000 random draws.
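The simulation behind intervals of this kind can be sketched as follows: draw parameter vectors from a multivariate normal centered on the estimates with the estimated covariance matrix, evaluate the WTP function at each draw, and read off the empirical percentiles. Everything numeric below is an assumption for illustration (the point estimates, the covariance matrix, and the ordering of the parameter vector as mu, sigma, constant); none of it is taken from the paper, and the percentile indexing is one simple convention among several.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular factor L with L L' = matrix (matrix must be positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def krinsky_robb_interval(estimates, covariance, function,
                          n_draws=10_000, level=0.95, seed=0):
    """Percentile interval for function(theta), theta ~ N(estimates, covariance)."""
    rng = random.Random(seed)
    lower = cholesky(covariance)
    size = len(estimates)
    values = []
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(size)]
        # Correlated draw: estimates + L z.
        draw = [estimates[i] + sum(lower[i][k] * z[k] for k in range(i + 1))
                for i in range(size)]
        values.append(function(draw))
    values.sort()
    tail = (1.0 - level) / 2.0
    low_index = math.floor(tail * (n_draws - 1))
    high_index = math.ceil((1.0 - tail) * (n_draws - 1))
    return values[low_index], values[high_index]


def mean_wtp(theta):
    mu, sigma, alt_const = theta
    return math.exp(-mu + 0.5 * sigma ** 2) * alt_const


def median_wtp(theta):
    mu, _sigma, alt_const = theta
    return math.exp(-mu) * alt_const


# Hypothetical estimates (mu, sigma, alternative specific constant) and covariance.
estimates = [-3.0, 1.2, 2.0]
covariance = [[0.040, 0.005, 0.000],
              [0.005, 0.020, 0.000],
              [0.000, 0.000, 0.090]]

mean_interval = krinsky_robb_interval(estimates, covariance, mean_wtp)
median_interval = krinsky_robb_interval(estimates, covariance, median_wtp)
print("mean WTP interval:  ", mean_interval)
print("median WTP interval:", median_interval)
```

Because the percentiles are taken from the simulated distribution of the nonlinear function itself, the resulting intervals need not be symmetric around the point estimate.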
Table III shows the mean and median WTP for each of the options for the rank-ordered RC-Logit model. Also included is the WTP for the difference between E1 and E2. The sole difference between these two options is a 0.1 in one million risk of death, which, not surprisingly, is insignificant. As the estimated WTP functions are non-linear in the parameter estimates, we employ the Krinsky-Robb [20] procedure to calculate 95% confidence intervals for (13) and (14). The mean and median WTPs are calculated with respect to $/month for 10 years.¹¹ In Table III, one point worth noting is that the confidence intervals for median WTP are much tighter than the confidence intervals for mean WTP. This is because median WTP for any of the programs as calculated in (14) is based on only two estimated parameters while the mean WTP in (13) is based on three estimated parameters, which means that the measure in (13) will be inherently more uncertain than the measure in (14).
¹¹ All of the other models result in insignificant WTP estimates.

V. CONCLUSION

This research has considered econometric methods for stated preference surveys which elicit a ranking from a set of hypothetical alternatives. For the data analyzed here, we found that even with a model that was very favorable to the assumptions of the conditional and rank-ordered logit models, the two ranks could not be pooled. The rank-ordered heteroscedasticity model of Hausman and Ruud [17] did not salvage the rank-ordered model. This suggests that increasingly noisy ranks were not the underlying problem. Explicit modeling of heterogeneous preferences through the use of a lognormally distributed price coefficient indicated that IIA was violated in the first rank data. The combination of heterogeneous preferences and both ranks allowed the estimation of a model with statistically significant parameter estimates and WTP. This more flexible random coefficient discrete choice model required only one additional parameter. The results of the cross-validation were consistent with and reinforced the results of the specification testing. These results suggest that the reliability of rankings data should be reexamined with econometric models not subject to IIA, since all previous evidence suggesting that rankings are inconsistent is based on standard logit models. For the data examined here, once preference heterogeneity was included in the model, the addition of the second rank resulted in what researchers have always desired from rankings but have had difficulty in obtaining with restrictive rank-ordered logit models: more precise parameter estimates.

REFERENCES

1. W. Adamowicz, J. Louviere, and M. Williams, Combining revealed and stated preference methods for valuing environmental amenities, J. Environ. Econom. Management 26, 271-292 (1994).
2. W. Adamowicz, J. Swait, P. Boxall, J. Louviere, and M. Williams, Perception versus objective measures of environmental quality in combined revealed and stated preference models of environmental valuation, J. Environ. Econom. Management 32, 65-84 (1997).
3. J. Aitchison and J. A. C. Brown, “The Lognormal Distribution,” Cambridge University Press, Cambridge, UK (1957).
4. S. Beggs, S. Cardell, and J. Hausman, Assessing the potential demand for electric cars, J. Econometrics 16, 1-19 (1981).
5. M. Ben-Akiva and S. R. Lerman, “Discrete Choice Analysis,” MIT Press, Cambridge, MA (1985).
6. M. Ben-Akiva, T. Morikawa, and F. Shiroishi, Analysis of the reliability of preference ranking data, J. Bus. Res. 24, 149-164 (1992).
7. D. E. Bloom and C. L. Cavanagh, An analysis of the selection of arbitrators, Amer. Econom. Rev. 76, 408-422 (1986).
8. A. Borsch-Supan and V. A. Hajivassiliou, Smooth unbiased multivariate probability simulators for maximum likelihood estimation of limited dependent variable models, J. Econometrics 58, 347-368 (1993).
9. G. Brown, D. Layton, and J. Lazo, Valuing habitat and endangered species, Discussion Paper 94-1, University of Washington (1994).
10. D. Brownstone and K. Train, Forecasting new product penetration with flexible substitution patterns, J. Econometrics 89, 109-129 (1999).
11. D. S. Bunch, M. Bradley, T. F. Golob, R. Kitamura, and G. P. Occhiuzzo, Demand for clean-fuel vehicles in California: A discrete choice stated preference pilot project, Transportation Res. A 27, 237-253 (1993).
12. R. G. Chapman and R. Staelin, Exploiting rank ordered choice set data within the stochastic utility model, J. Marketing Res. 19, 288-301 (1982).
13. H. Z. Chen and S. R. Cosslett, Environmental quality preference and benefit estimation in multinomial probit models: A simulation approach, Amer. J. Agr. Econom. 80, 512-520 (1998).
14. B. Efron and R. J. Tibshirani, “An Introduction to the Bootstrap,” Chapman & Hall, New York (1993).
15. J. Geweke, Efficient simulation from the multivariate normal and Student-t distributions subject to linear constraints, in “Computing Science and Statistics: Proceedings of the Twenty-Third Symposium on the Interface” (E. M. Keramidas, Ed.), 571-578 (1991).
16. R. S. Hartman, M. J. Doane, and C. K. Woo, Consumer rationality and the status quo, Quart. J. Econom. 106, 141-162 (1991).
17. J. A. Hausman and P. A. Ruud, Specifying and testing econometric models for rank-ordered data, J. Econometrics 34, 83-104 (1987).
18. J. A. Hausman and D. A. Wise, A conditional probit model for qualitative choice: Discrete decisions recognizing interdependence and heterogeneous preferences, Econometrica 46, 403-426 (1978).
19. M. P. Keane, A computationally practical simulation estimator for panel data, Econometrica 62, 95-116 (1994).
20. I. Krinsky and A. L. Robb, On approximating the statistical properties of elasticities, Rev. Econom. Statist. 68, 715-719 (1986).
21. T. J. Lareau and D. A. Rae, Valuing WTP for diesel odor reductions: An application of contingent ranking technique, So. Econom. J. 55, 728-742 (1989).
22. D. McFadden, Conditional logit analysis of qualitative choice behavior, in “Frontiers in Econometrics” (P. Zarembka, Ed.), Academic Press, New York (1973).
23. J. Odeck, Ranking of regional road investment in Norway: Does socioeconomic analysis matter?, Transportation 23, 123-140 (1996).
24. R. A. Pollak and T. J. Wales, The likelihood dominance criterion: A new approach to model selection, J. Econometrics 47, 227-242 (1991).
25. R. H. Porter and J. D. Zona, Detection of bid rigging in procurement auctions, J. Polit. Econom. 101, 518-538 (1993).
26. D. E. Rae, The value to visitors of improving visibility at Mesa Verde and Great Smoky National Parks, in “Managing Air Quality and Scenic Resources at National Parks” (R. D. Rowe and L. G. Chestnut, Eds.), Westview Press, Boulder, CO (1983).
27. D. Revelt and K. Train, Mixed logit with repeated choices: Households’ choices of appliance efficiency level, Rev. Econom. Statist. 80, 647-657 (1998).
28. W. Schulze, G. McClelland, E. Balistreri, R. Boyce, M. Doane, B. Hurd, and R. Simenauer, An evaluation of public preferences for Superfund site cleanup, volume 2: Pilot study, report prepared for US EPA (1994).
29. W. Schulze, G. McClelland, M. Doane, E. Balistreri, R. Boyce, B. Hurd, and R. Simenauer, An analysis of stated preferences for superfund site cleanup, unpublished manuscript (1995).
30. V. K. Smith and W. H. Desvousges, “Measuring Water Quality Benefits,” Kluwer-Nijhoff, Boston (1986).
31. M. Stone, Cross-validation choice and assessment of statistical predictions, J. Royal Statist. Soc. B 36, 111-147 (1974).
32. L. A. Thomas, Brand capital and incumbent firms’ positions in evolving markets, Rev. Econom. Statist. 77, 522-534 (1995).
33. L. G. Thomas, Revealed bureaucratic preference: Priorities of the Consumer Product Safety Commission, RAND J. Econom. 19, 102-113 (1988).
34. K. E. Train, “Qualitative Choice Analysis: Theory, Econometrics, and an Application to Automobile Demand,” MIT Press, Cambridge, MA (1986).
35. K. E. Train, Recreation demand models with taste variation over people, Land Econom. 74, 230-239 (1998).