ORGANIZATIONAL BEHAVIOR AND HUMAN DECISION PROCESSES
Vol. 65, No. 3, March, pp. 194–200, 1996 ARTICLE NO. 0019
COMMENTARY
Maximizing Your Chance of Winning: The Long and Short of It Revisited
PAUL J. H. SCHOEMAKER*,† AND JOHN C. HERSHEY†
*Decision Strategies International, Inc., West Conshohocken, Pennsylvania; and †Department of Operations and Information Management, The Wharton School, University of Pennsylvania
Address correspondence and reprint requests to either author: Department of Operations and Information Management, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104.
0749-5978/96 $18.00 Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.

Lopes (1996) offers an interesting view of individual risk-taking. She develops four themes, concerning (1) the role of weighted averaging in unique vs repeated gambles, (2) the usefulness of probability-based choice rules, (3) multiple criteria used in judging gambles, and (4) conventional definitions of rationality. Her paper serves to remind us of the impressive collection of experiments she has conducted to study people’s preferences for multioutcome gambles.

A difficulty in commenting on this paper is the back-and-forth shifting between descriptive, prescriptive, and normative perspectives. For example, it is unclear whether and how single vs multiple plays of a gamble affect either the descriptive or normative adequacy of a weighted averaging approach. As Tversky and Bar-Hillel (1983) justly note, the normative appeal of expectation rules in modern choice theory has nothing to do with long run arguments or averaging over plays. It is also unclear how much Lopes’ article attacks the expected utility (EU) model as a normative theory. Some may prefer to violate some of its axioms knowingly, whereas others may find the normative ideal appealing, but hard to achieve in every circumstance. For the latter group, prescriptive heuristics that are less than perfect may be quite acceptable because they are “close enough” to the normative ideal.

The conceptual meaning of averaging rules, discussed in the first section, is not entirely clear. If it applies to any form of weighted averaging, without restrictions on the weighting functions applied to either probabilities or outcomes (other than monotonicity), then its normative validity is questionable, whereas its descriptive validity is purely an empirical matter. Similarly, Lopes’ claim that decumulative weighting functions are superior to weighted utility ones may be correct normatively (since the former circumvent violations of dominance), but tenuous descriptively given the greater cognitive effort involved in decumulating probabilities. Again, without clarity about the stance adopted, it is hard to assess these and other claims.

The next section and theme, dealing with probability-based rules and stochastic control, raise important and interesting issues regarding the conditions that favor probability-based approximations to an expected value or expected utility rule. Lopes revisits here the 1963 paper by Samuelson, and the criticism against her by Tversky and Bar-Hillel (1983) concerning her “undue extrapolation” of Samuelson’s theorem. Although this is a minor matter, in her original publication of 1981, Lopes failed to mention or consider the key qualifier about the gamble not being acceptable at any point within the range defined by the repeated plays. As such, she was overly critical of Samuelson while nonetheless raising intriguing issues about the long vs short run views. The perspective of stochastic control is especially interesting concerning its normative basis. Is it strictly rational for an EU believer to optimize a probability of coming out ahead? We think not. But does this heuristic offer a reasonable approximation to EU for repeated plays? Quite possibly, under the right circumstances, as we show later. In this light, maximizing the probability of coming out ahead can be sound prescriptive advice.

The third theme, about the need for dual rationality criteria, seems very reasonable descriptively but murky normatively. Descriptively, whether these criteria should focus on expectation measures vs probability of success metrics is an empirical issue. Of course, there
may be more than two criteria that matter to subjects. On the normative side, Allais’ view that it is in no way irrational to accept a heavy reduction of possible gain as the price of achieving certainty, although this same reduction would not be felt acceptable for the same increase in the probability of gain if that increase is far removed from certainty (1952/1979, p. 102), does contradict EU theory, as the example below illustrates.

Suppose you face an 80% chance of gaining $100 (and a 20% chance of $0). If you accept a 25% relative increase in the probability of winning at the cost of a prize reduction of $20 (so that you now have a 100% chance of getting $80), then you should also accept the same deal if the original probability had been, say, 10%. As Allais points out, however, you may nonetheless prefer a 10% chance at $100 over a 12.5% chance at $80. It is an old debate as to whether this is rational; to Allais it is, to EU believers it is not.

Rationality is a difficult concept, and is even more difficult to assess without a clear definition. At least EU proponents are very explicit about their particular definition of rationality. The weak link in their axiom system is the substitution axiom, and perhaps the compound probability axiom (which most people would prefer to relax before, say, transitivity). However, you can only beat a theory with a better theory, and from a normative view nothing exists to rival EU’s appeal to us.

The final theme in Lopes’ paper concerns the importance of coherence and consistency in human judgment. We agree that coherence is often violated descriptively and as such coherence should not feature prominently in behavioral models. Normatively, however, we are in the classicist camp and deem it an important condition of rational choice and judgment.

Among the most testable claims in Lopes’ paper are those concerning probability-based rules. We explore them further in the remainder of this note.

1. PROBABILITY-BASED RULES
Lopes (1981) and others have argued that “sensible people often base their choices on the probability of coming out ahead” (Lopes, 1996), especially when facing repeated plays of the same gamble. Samuelson (1963) proved, however, that an expected utility (EU) maximizer should never accept multiple plays of a gamble if a single play of that gamble would be unattractive at any wealth level within the range of outcomes defined by the multiple plays. Tversky and Bar-Hillel (1983) concurred, and so would most normative theorists. The more interesting question is how sensible choices are when based on maximizing the chance of coming out ahead, as compared to an expected value or expected utility benchmark. No one would expect this heuristic to be optimal, but is it reasonable?

We shall examine how well the MPW (Maximize the Probability of Winning) heuristic performs when tested with pairs of mixed gambles drawn randomly from a population of gambles whose overall expected value (EV) is zero. We are interested in the performance of the MPW heuristic (relative to the EV rule) as a function of the number of plays. In the next sections we explain (1) the design of our simulation, (2) the numerical results, and (3) our interpretation.

The exercise was motivated by our intuition that the MPW heuristic should perform better when the distribution of outcomes is more fine grained (i.e., when there are more branches in the lottery) and when it exhibits more central tendency. Playing a discrete gamble multiple times produces both effects. Also, it offers a test of Lopes’ implied conjecture that multiple plays may favor the MPW heuristic under certain conditions.¹

2. DESIGN OF SIMULATION
Two-outcome mixed gambles are drawn independently and at random as follows:
- Amount of win (W) is uniformly random on $0 to $100
- Amount of loss (L) is uniformly random on -$100 to $0
- Probability of winning (p) is uniformly random on 0 to 1; the probability of loss is (1 - p).

The notation [Wa, pa; La, (1 - pa)] refers to gamble A, offering a pa chance of winning Wa and a (1 - pa) chance of getting La. The expected value of gamble A, denoted EVa, is pa·Wa + (1 - pa)·La. If gamble A is played twice, it is denoted A2, and three times denoted A3. Table 1 shows the outcomes associated with double and triple plays of the gamble, assuming independence of each repeat.

Imagine we now draw a second gamble B, which we likewise play two and three times, denoted B2 and B3 respectively. Since the probability of winning (PW) may change under multiple plays for either or both gambles, the MPW heuristic may result in a different choice (unlike the EV rule). The question of interest is how the MPW heuristic fares on average, relative to the EV rule, under one, two, and three plays of each gamble in the pair.

¹ In the several papers cited, Lopes has argued that it might be very reasonable for people to reject a single play of a gamble, while accepting multiple plays of that same gamble, because people often focus on the probability of coming out ahead. She argues that this strategy is especially reasonable when the initial gamble offers a low probability of high gain. The present paper tests this strategy with gambles offering uniformly distributed probabilities and payoffs.

TABLE 1
Outcomes under Two and Three Plays
(Each cell gives the total payoff followed by its probability.)

(a) Outcomes under two plays

Outcome of first play    Second play: Win (W)    Second play: Lose (L)    Prob.
Win                      2W, p^2                 W + L, p(1 - p)          p
Lose                     L + W, p(1 - p)         2L, (1 - p)^2            (1 - p)
Prob.                    p                       (1 - p)

(b) Outcomes under three plays

Outcome of first two plays       Third play: Win (W)     Third play: Lose (L)    Prob.
Win twice (2W)                   3W, p^3                 2W + L, (1 - p)p^2      p^2
Win once, lose once (W + L)      2W + L, 2(1 - p)p^2     W + 2L, 2p(1 - p)^2     2p(1 - p)
Lose twice (2L)                  2L + W, p(1 - p)^2      3L, (1 - p)^3           (1 - p)^2
Prob.                            p                       (1 - p)

To assess this, we conducted the following simulation:
1. Draw two gambles at random within the uniform ranges specified above.
2. Repeat step 1 for a total of 1,000 pairs and ignore a pair with domination.
3. Calculate the EV of each gamble in the pair and select the higher EV one.
4. For that pair, does the MPW heuristic pick the same gamble as the EV rule? If yes, MATCH = 1; if no, MATCH = 0.
5. SCORE = [mean EV under MPW rule]/[mean EV of highest EV gamble]
6. Repeat steps 3–5 assuming each gamble is played twice.
7. Repeat steps 3–5 assuming each gamble is played three times.

Note that MATCH is an ordinal measure of performance, whereas SCORE is a parametric yardstick for the performance of the MPW heuristic. Although there
is a risk of dividing by zero on any single gamble pair, with a simulation based on a thousand draws the mean EV of the MPW heuristic (denoted EVheur) and the mean EV of the better gamble under the EV rule (denoted EVbest) are almost certainly both positive. Since the MPW heuristic can at best match the EV criterion on every pair, its upper benchmark is 100%. The lower benchmark we shall use is random choice. Since each randomly drawn gamble has an EV of zero in expectation, the strategy of choosing among them by flipping a coin will also have an EV equal to zero. Hence, our score measure is bounded from 0% to 100%.

Lastly, note that repeated plays do not by themselves change our score measure. Under, say, n repeated plays, the EV of each gamble is multiplied by n. Hence, if there is no change in the pattern of ordinal matches, both the EV of the MPW heuristic and that of the higher EV gamble will be multiplied by n, which in turn cancels out when dividing these two EVs. Thus, our parametric measure of performance is invariant under repeated plays, provided no ordinal inversions occur. Indeed, the only way the MPW heuristic can do better under repeated plays is if new matches are created. Since old matches may be lost, however, it is an empirical question whether the net effect on SCORE will be positive or negative.

Even if the ordinal score (MATCH) improves, it could in theory happen that the parametric measure SCORE declines. This would require that the newly created mismatches carry a large opportunity cost (relative to the old mismatches that are now eliminated). Since the analytics of this question quickly become intractable, especially for three or more plays, we assessed this research question via Monte Carlo simulation.

RESULTS OF SIMULATION
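Before turning to the numbers, the procedure laid out in the design section can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the authors' implementation: the function names, the seed, and in particular the reading of "domination" as being at least as good on W, L, and p are choices made here for illustration. Exact figures will differ from the published tables because the random draws differ.

```python
import random
from math import comb


def draw_gamble(rng):
    """One mixed gamble (W, L, p): W ~ U[0, 100], L ~ U[-100, 0], p ~ U[0, 1]."""
    return (rng.uniform(0, 100), rng.uniform(-100, 0), rng.random())


def expected_value(gamble, n=1):
    """EV of n independent plays, which is n times the single-play EV."""
    w, l, p = gamble
    return n * (p * w + (1 - p) * l)


def prob_win(gamble, n=1):
    """Exact probability that the total payoff over n plays is positive."""
    w, l, p = gamble
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)  # k = number of winning plays
        if k * w + (n - k) * l > 0
    )


def dominates(a, b):
    """ASSUMPTION: a dominates b when it is at least as good on W, L and p."""
    return all(x >= y for x, y in zip(a, b))


def simulate(n_pairs=1000, plays=(1, 2, 3), seed=1996):
    """Return pair count, ordinal match rate and SCORE for each number of plays."""
    rng = random.Random(seed)
    pairs = [(draw_gamble(rng), draw_gamble(rng)) for _ in range(n_pairs)]
    # Step 2: ignore pairs in which one gamble dominates the other.
    kept = [(a, b) for a, b in pairs if not (dominates(a, b) or dominates(b, a))]
    summary = {}
    for n in plays:
        matches, ev_heur, ev_best = 0, 0.0, 0.0
        for a, b in kept:
            best = a if expected_value(a, n) >= expected_value(b, n) else b
            pick = a if prob_win(a, n) >= prob_win(b, n) else b  # MPW choice
            matches += pick is best
            ev_heur += expected_value(pick, n)
            ev_best += expected_value(best, n)
        summary[n] = {
            "pairs": len(kept),
            "match_rate": matches / len(kept),
            "score": ev_heur / ev_best,  # ratio of means; the 1/N factors cancel
        }
    return summary


if __name__ == "__main__":
    for n, stats in simulate().items():
        print(n, stats)
```

Because the EV of n plays is n times the single-play EV, any change in the score across n in this sketch comes only from changes in which gamble the MPW rule picks, which is the invariance property described at the end of the design section.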
Table 2 offers some numerical examples of the above simulation, for one, two, and three plays of several gamble pairs. Each row shows the two gambles A and B, whether domination occurs, their respective EVs, the probabilities of winning for gamble A and B (denoted PWa and PWb respectively), EVbest, EVheur, and whether an ordinal match occurs. This is then repeated for two and three plays. The bottom of each section of the table shows the means of EVbest, EVheur, and MATCH. Also, we show the value of SCORE for each section.

Table 3 tabulates the result of our simulation for 1000 cases. We encountered 75.3% undominated pairs in this set of 1000 randomly selected pairs, with ordinal matches ranging from 69% to 86% (for the subset of undominated pairs). The measure SCORE rose from 55% to 87%, with the sharpest increase occurring in going from one to two plays. Figure 1 shows the rise in the mean values of SCORE, as well as the percentage of matches, visually.

These data strongly suggest that the MPW heuristic will generally perform better as the number of plays increases, albeit at a decreasing rate. Two factors can explain the monotone increase. First, the percentage of ordinal matches improves with more plays. Second, the mean opportunity loss associated with mismatches appears to become less per occurrence, as the gambles become more fine grained with repeated plays. The risk of major mismatches, which would result in low SCORE values, seems to decline. Interestingly, most of the improvement in SCORE is realized in going from one to two plays. It is unclear, without a factorial design, to what extent this reflects ordinal improvements in matches versus a reduction in the average opportunity cost per mismatch. We examine this issue further below, but first will comment on the simulation itself.

Our simulation was conducted in Microsoft EXCEL. To assess the quality of the simulation, we tested whether EVa and EVb were close to zero, whether pa and pb were close to .5, and whether the incidence of domination approximates the theoretical expectation of 1/4 (across the sample of 1000). Using z-tests, we concluded that each of these five hypotheses is accepted at the .05 level. Also, we tested whether the probability of winning under one play approximates 2/3 for the MPW rule. It is well known from order statistics (see David, 1981) that the random variable \(\tilde{Y} = \max[\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_n]\) has an expected value of \(n/(n + 1)\) if each \(\tilde{x}_i\) is independently uniform on [0, 1]. Since the MPW heuristic targets the higher of two independent uniform probabilities, the expected value of the higher probability is 2/3. Note that this test cannot be conducted for multiple plays, as PWa and PWb will no longer be uniform random variables.

TABLE 2
Numerical Examples of the Simulation for One, Two, and Three Plays
(EVbest = best EV; EVheur = EV of best gamble under MPW. The gamble pairs are the same in all three sections.)

Pair   Pa      Wa       La         Pb      Wb       Lb         Does either gamble dominate?
1      0.543   $50.63   ($71.60)   0.354   $0.13    ($22.45)   No
2      0.893   $4.58    ($42.38)   0.574   $4.56    ($96.64)   Yes
3      0.559   $74.85   ($77.49)   0.682   $52.83   ($6.82)    No
4      0.493   $4.50    ($30.14)   0.277   $49.18   ($17.91)   No

One play
Pair   EVa        EVb        EVbest     PWa     PWb     EVheur     MATCH
1      ($5.27)    ($14.45)   ($5.27)    0.543   0.354   ($5.27)    1
2      (dominated pair; no entries)
3      $7.74      $33.84     $33.84     0.559   0.682   $33.84     1
4      ($13.05)   $0.66      $0.66      0.493   0.277   ($13.05)   0
Mean                         $9.74                      $5.17      0.667
SCORE  0.531

Two plays
Pair   EVa        EVb        EVbest     PWa     PWb     EVheur     MATCH
1      ($10.55)   ($28.90)   ($10.55)   0.294   0.126   ($10.55)   1
2      (dominated pair; no entries)
3      $15.48     $67.68     $67.68     0.313   0.899   $67.68     1
4      ($26.11)   $1.31      $1.31      0.243   0.477   $1.31      1
Mean                         $19.48                     $19.48     1.000
SCORE  1.000

Three plays
Pair   EVa        EVb        EVbest     PWa     PWb     EVheur     MATCH
1      ($15.82)   ($43.34)   ($15.82)   0.564   0.044   ($15.82)   1
2      (dominated pair; no entries)
3      $23.23     $101.52    $101.52    0.589   0.968   $101.52    1
4      ($39.16)   $1.97      $1.97      0.120   0.622   $1.97      1
Mean                         $29.22                     $29.22     1.000
SCORE  1.000

Note. Dollar figures in parentheses refer to losses.

TABLE 3
Summary Statistics of Simulation

                              One play   Two plays   Three plays
Number of plays/gamble        1          2           3
Sample size of simulation     1000       1000        1000
No. of undominated pairs      753        753         753
No. of ordinal matches        522        603         644
Percentage of matches         69.32%     80.08%      85.52%
Average EV of heuristic       $8.40      $24.01      $39.52
Standard deviation            $34.56     $68.85      $102.83
Average EV of best EV         $15.17     $30.34      $45.52
Standard deviation            $33.77     $67.54      $101.31
Dollar difference in EVs      $6.77      $6.33       $6.00
SCORE (ratio of EVs)          55.37%     79.14%      86.82%
EV of heuristic per play      $8.40      $12.01      $13.17
EV of best EV per play        $15.17     $15.17      $15.17
Dollar difference per play    $6.77      $3.17       $2.00

INTERPRETATION
What intuition might explain the sharp improvement in the score of the MPW heuristic in going from one to two plays? Consider an example in which EVa > EVb, but B is preferred to A under the MPW rule. Say A = [$100, .4; -$50, .6] and B = [$40, .5; -$50, .5]. The only way the MPW rule can improve is if under repeated play some of the intermediate branches in the new lottery offer positive payoffs. Playing the above gambles twice yields the following: A2 = [$200, .16; $50, .48; -$100, .36] and B2 = [$80, .25; -$10, .5; -$100, .25]. Note that PWa is now higher (.16 + .48 = .64) than PWb = .25. Hence, by playing the gambles twice, ordinal consistency is restored, yielding a perfect value for SCORE. Of course, this does not always happen and it is not simple to calculate the probability of this occurring for all random pairs. Nonetheless, when there is an ordinal mismatch under one play (between EV and MPW), there is some chance that under multiple plays ordinal consistency gets restored.

FIG. 1. Performance of heuristic.

We must also consider, however, the converse scenario. Imagine we start with ordinal consistency: how likely is it that repeated plays introduce ordinal inconsistency? For example, let A = [$80, .6; -$100, .4] and B = [$60, .5; -$50, .5]. In this case, EVa = 8 and EVb = 5 while PWa = .6 and PWb = .5. Hence, both the EV and MPW rule favor gamble A. Now consider two plays of each: A2 = [$160, .36; -$20, .48; -$200, .16] and B2 = [$120, .25; $10, .5; -$100, .25]. Again, A > B under EV, but this time PW2b (= .75) > PW2a (= .36). Hence, playing the gambles twice destroys the ordinal consistency observed under single play. The empirical results suggest that this negative scenario happens less frequently, or to a lesser degree, than the positive one mentioned first. Why?

An intuition for these findings is that playing gambles more than once introduces (1) more outcomes and (2) central tendency. The effect of more outcomes reduces the probability that the MPW heuristic favors a highly skewed gamble offering a high chance of a minor gain (at the price of a significant loss). Central tendency redistributes the probability mass toward the middle of the outcome range, again reducing the weight of very high probabilities. The combined effect of fine-grainedness and central tendency makes it less probable that the MPW heuristic favors a gamble that happens to have a high PW but is poor in all other regards. As more branches come into play in the MPW heuristic, extremely bad choices (relative to the EV criterion) should occur with less frequency. These same arguments can also be used to explain why an otherwise attractive gamble that has a low initial p will be more likely to be chosen under the MPW heuristic as the number of plays increases.

To develop the intuition further, consider a very large number of plays so that we are dealing with distributions that are approximately Normal. Suppose we draw pairs of Normals whose means randomly deviate from zero. When the mean of A exceeds that of B, the positive probability mass under A (i.e., the mass associated with positive outcomes) will usually exceed the positive probability mass of B. There are exceptions, of course, if the variances are quite different or if the distributions are highly skewed. But in the limit, the MPW heuristic and the EV rule will correlate highly. Indeed, for distributions that are identical other than in their mean, the two rules will be equivalent for all possible cases.
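The two worked examples in this section (one where a second play restores agreement with the EV rule, one where it destroys it) can be checked with a short script. It is an illustrative sketch only; the helper names `prob_win` and `expected_value` and the dictionary labels are not from the original paper.

```python
from math import comb


def prob_win(win, loss, p, n=1):
    """Probability that the total payoff of n independent plays is positive."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)  # k = number of winning plays
        if k * win + (n - k) * loss > 0
    )


def expected_value(win, loss, p, n=1):
    """EV of n independent plays of the gamble [win, p; loss, 1 - p]."""
    return n * (p * win + (1 - p) * loss)


# Gambles written as (win, loss, p), taken from the two examples in the text.
examples = {
    "consistency restored": ((100, -50, 0.4), (40, -50, 0.5)),
    "consistency destroyed": ((80, -100, 0.6), (60, -50, 0.5)),
}

for label, (a, b) in examples.items():
    for n in (1, 2):
        ev_choice = "A" if expected_value(*a, n) > expected_value(*b, n) else "B"
        mpw_choice = "A" if prob_win(*a, n) > prob_win(*b, n) else "B"
        print(f"{label}, {n} play(s): EV rule picks {ev_choice}, "
              f"MPW picks {mpw_choice} "
              f"(PWa={prob_win(*a, n):.2f}, PWb={prob_win(*b, n):.2f})")
```

In both examples the EV rule picks A for one and for two plays; only the MPW choice changes with the number of plays.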
CONCLUSIONS

We have shown that for pairs of mixed gambles, randomly drawn from symmetric uniform distributions, the simple rule of choosing the gamble with the highest probability of winning offers a good approximation to the EV rule. When the gambles are played more than once, the heuristic does even better, with most of the improvement obtained in going from one to two plays.²

We would expect this heuristic to do even better compared to an expected utility rule (von Neumann and Morgenstern, 1947) under risk aversion, for the following reasons. A concave utility function gives decreasing weight to higher outcomes. Since the heuristic is insensitive to the spacing of outcomes, any compression of the outcome scale reduces the subjective cost of this insensitivity. A formal proof of this intuition is offered in Schoemaker (1989), where improvements in probability are contrasted with improvements in payoffs. The MPW heuristic only cares about improvements in probability. An EV rule gives equal weight to increments in probability or payoff. An EU rule for risk averters emphasizes the probability dimension over the payoff one, and as such is closer to the MPW heuristic than the EV rule would be.

As emphasized by Payne, Bettman and Johnson (1993), people are adaptive decision makers who try to balance the effort and accuracy of their choice strategies. This study examined only the accuracy of our heuristic, not its effort. In cognitive terms, however, the MPW heuristic is not especially effortful (Johnson and Payne, 1985). It entails an ordinal binary comparison with one play, and some additions and multiplications under multiple plays. Nonetheless, we need more systematic evaluations of both the accuracy and the cost of common choice heuristics, especially in the area of risky choice.

In conclusion, our simulation suggests that maximizing your chance of winning is a reasonable heuristic when facing risks that are repeated at least twice and not too skewed in distribution. In randomly selected pairs of gambles where neither one dominates the other, the MPW heuristic yielded 55% of the benefits obtained with the EV rule under single play, 79% when each was played twice, and 87% when played three times.

² For more formal analyses see Camacho (1979), Chew and Epstein (1988) or McGuire, Pratt, and Zeckhauser (1991). Keren and Wagenaar (1987) offer a descriptive analysis of behavior in single vs repeated plays; see also Schneider and Lopes (1986).

REFERENCES
Allais, M. (1952/1979). The foundations of a positive theory of choice involving risk and a criticism of the postulates and axioms of the American School. In M. Allais and O. Hagen (Eds.), Expected utility hypotheses and the Allais paradox (pp. 27–145). Dordrecht: Reidel.
Camacho, A. (1979). Maximizing expected utility and the rule of long run success. In M. Allais and O. Hagen (Eds.), Expected utility hypotheses and the Allais paradox (pp. 203–222). Dordrecht: Reidel.
Chew, S. H., and Epstein, L. G. (1988). The law of large numbers and the attractiveness of compound gambles. Journal of Risk and Uncertainty, 1(1, March), 25–132.
David, H. A. (1981). Order statistics. New York: Wiley.
Johnson, E. J., and Payne, J. W. (1985). Effort and accuracy in choice. Management Science, 31(4), 395–413.
Keren, G., and Wagenaar, W. A. (1987). Violation of utility theory in unique and repeated gambles. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, 387–391.
Lopes, L. L. (1981). Decision making in the short run. Journal of Experimental Psychology: Human Learning and Memory, 7(5), 377–385.
Lopes, L. L. (1996). When time is of the essence: Averaging, aspiration, and the short run. Organizational Behavior and Human Decision Processes, 65, 179–189.
McGuire, M., Pratt, J., and Zeckhauser, R. (1991). Paying to improve your chances: Gambling or insurance? Journal of Risk and Uncertainty, 4(4), 329–338.
Payne, J., Bettman, R., and Johnson, E. (1993). The adaptive decision maker. New York: Cambridge Univ. Press.
Samuelson, P. A. (1963). Risk and uncertainty: A fallacy of large numbers. Scientia, 98, 108–113.
Schneider, S. L., and Lopes, L. L. (1986). Reflection in preferences under risk: Who and when may suggest why. Journal of Experimental Psychology: Human Perception and Performance, 12(4), 535–548.
Schoemaker, P. J. H. (1989). Preferences for information on probabilities versus prizes: The role of risk-taking attitudes. Journal of Risk and Uncertainty, 2, 37–60.
Tversky, A., and Bar-Hillel, M. (1983). Risk: The long and the short. Journal of Experimental Psychology: Learning, Memory, and Cognition, 9, 713–717.
von Neumann, J., and Morgenstern, O. (1947). Theory of games and economic behavior, 2nd ed. Princeton, NJ: Princeton Univ. Press.
Received: October 4, 1995