JOURNAL OF MATHEMATICAL PSYCHOLOGY: 7, 163-187 (1970)
Gambling Behavior in Two-Outcome Multistage Betting Games¹
AMNON RAPOPORT AND LYLE V. JONES
University of North Carolina, Chapel Hill, North Carolina 27514
Received November 21, 1968
In a favorable multistage gambling process, a biased coin with the probability of “heads” $p > 1/2$ is tossed repeatedly. Before each toss, a gambler is required to bet a fraction of his capital on “heads.” His problem is to choose a betting policy that is optimal with respect to some criterion. A model yielding a proportional stationary Markov betting policy is presented and generalized. The sensitivity of the model to departures from optimality is investigated. Fifty-one subjects participated individually in a multistage gambling experiment. While most subjects performed efficiently in terms of the model, the model was not strictly supported, since both the current amount of capital and the pattern of previous outcomes affected subjects’ betting decisions.
Decision making in gambling has been studied experimentally mostly under static betting conditions (Becker, DeGroot, and Marschak, 1963; Coombs, Bezembinder, and Goode, 1967; Coombs and Pruitt, 1960; Davidson, Suppes, and Siegel, 1957; Edwards, 1954, 1955; Mosteller and Nogee, 1951). A typical gambling experiment consists of presenting a subject with a series of gambles. On each gamble, he is permitted either to bet a fixed amount of money, usually very small, or not to bet at all. Various amounts of money may be offered at various odds by the experimenter. The subject’s accumulated gain in typical gambling experiments cannot determine the order of presentation of the bets, nor can his decision on a given trial determine the amount of money he may bet on the next trial. In this paper, gambling behavior is studied under conditions where the quantity to be wagered is a variable determined by the subject rather than by the experimenter. Multistage gambling games that resemble certain economic decision processes are employed; each of the investor’s decisions regarding the outcome of an uncertain event affects the size of his capital on subsequent occasions and hence determines the amount of money subsequently available for investment.
¹ This research was supported by PHS Research Grant No. M-10006 from the National Institute of Mental Health, Public Health Service. The authors wish to thank Mr. Dean Hartley for assistance in data collection and Dr. Amos Tversky for a critical reading of the manuscript.
Consider the following multistage betting game MBG: A gambler is shown a biased coin with the probability $p > 1/2$ for heads. The gambler is required to place a bet on the event of heads. The betting process continues for $N$ plays (stages), and the tossings of the coin are independent at these stages. At each stage the gambler is allowed to bet a quantity $b_j$, subject to the restriction $0 \le b_j \le x_{j-1}$, where $x_j$, $j = 1, 2, \ldots, N$, denotes the amount of his capital after the $j$th play, $N$ denotes the number of plays (stages), and $x_0$ denotes the amount of his initial capital. If heads occurs, he retains the amount bet and wins an equal amount. Otherwise, he loses the amount bet. Knowing the value of $p$, how should he choose $b_j$? For the gambler, the problem is to choose a criterion and a betting policy that is optimal with respect to this criterion and that tells him exactly how much to bet at each stage. We consider only betting policies in which $b_j$ is a function of $X_0, X_1, \ldots, X_{j-1}$ and $p$, where (the random variable) $X_j$ denotes the gambler’s capital after play $j$. This restriction is hardly severe as it is difficult to see what other reasonable betting policies could be considered. When $b_j$ depends only on $X_{j-1}$ and $p$, the betting policy is called a stationary Markov betting policy (Ferguson, 1965). For these policies the notation $b_j(x, p)$ is used to represent the amount of money bet by the gambler at stage $j$ if $X_{j-1} = x$ and the value of $p$ is known. The MBG has been proposed as a model for certain multistage forecasting processes (Bellman and Kalaba, 1957; Kelly, 1956; Murphy, 1965), in particular for problems that arise in connection with economic forecasting. What is required for a meaningful application of the model is a system subjected to some simple underlying stochastic environmental process with a known parameter, and a decision maker who can reinvest part of his resources at every stage of the process.
It is easy to show that if a gambler participating in MBG wishes to maximize the expected value of his capital at the end of $N$ stages, $E(X_N \mid X_0)$, he should employ the stationary Markov policy
$$ b_j(x, p) = \begin{cases} x & \text{if } p > 1/2 \\ 0 & \text{if } p < 1/2 \end{cases} \tag{1} $$
for $j = 1, 2, \ldots, N$. If $p = 1/2$ it does not matter what betting policy he uses. (The gambler must always bet on heads. Were he allowed freedom in placing his bets, then in the case of $p < 1/2$ he should always bet his entire capital on tails.) The betting policy (1) maximizes the expected value of the gambler’s capital but, when $p > 1/2$, it also increases the fluctuations of his capital and hence the probability of ruin. When $p > 1/2$, a gambler subscribing to the betting policy (1) will be wiped out by stage $j$ with probability $1 - p^j$, which will approach 1 very rapidly as $j$ increases. As survival is highly desirable, the gambler’s problem, then, is to invest his capital in such a way as to increase his expected winnings while keeping low the probability of bankruptcy. Related to MBG is the famous St. Petersburg game discussed in 1738 by Bernoulli (1954). As Bernoulli pointed out, the paradoxical nature of the game lies in the fact
that while the expected gain for the game is infinite, no reasonable player would prefer the privilege of playing this game to the receipt of an appreciable finite amount of money. Most solutions to this paradox have been based on the notion of utility and the law of diminishing marginal utility. Bernoulli, with whom the notion of utility was first associated, suggested that the same proportionate addition to an amount of wealth carries to an individual the same absolute addition of utility, at all values of absolute level of wealth to which this addition is accruing. This assumption is equivalent to postulating that utility is equal to the logarithm (to any base) of the amount of wealth. This assumption is also found in Fechner’s Law relating subjective sensation to objective stimulation. Maximization of expected gain in both the St. Petersburg paradox and MBG is clearly incompatible with everyday observations of multistage betting behavior. Following Bernoulli, and economists who have proclaimed utility nonlinear in money, the gambler’s problem in MBG may be solved in the same way as the St. Petersburg paradox: When $p > 1/2$, the gambler should place his bets in such a way as to maximize the expected utility of his capital at the end of $N$ stages, $E(U(X_N) \mid X_0)$. If his utility function is assumed to be logarithmic, as suggested by Bernoulli (1954) and Savage (1954, p. 94), he should maximize the expected logarithm of his capital at the end of $N$ stages, $E(\log X_N \mid X_0)$. The betting policy for MBG which maximizes $E(\log X_N \mid X_0)$ has been investigated by Bellman and Kalaba (1957, 1958), Breiman (1961), Kelly (1956), and Murphy (1965). They have shown that it is a proportional stationary Markov betting policy, $b_j(x, p) = \pi(p)x$, where $0 < \pi(p) < 1$. It is the purpose of this paper to summarize their results and generalize them into a model for multistage gambling behavior.
A second purpose is to test the model with subjects betting real money in computer-controlled Multistage Betting Games.
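The trade-off attached to betting policy (1) above can be illustrated numerically. The following is a minimal sketch, assuming even-money bets and a gambler who stakes his whole capital at every stage; the function names are illustrative and do not come from the paper. It computes the ruin probability $1 - p^N$ alongside the expected final capital $x_0 (2p)^N$.

```python
def ruin_probability_all_in(p, n_stages):
    """Probability of being wiped out within n_stages when the whole capital
    is bet on heads at every stage: ruin occurs unless every toss is heads."""
    return 1.0 - p ** n_stages


def expected_capital_all_in(x0, p, n_stages):
    """Expected final capital under the same policy: the gambler holds
    x0 * 2**n_stages with probability p**n_stages and nothing otherwise."""
    return x0 * (2.0 * p) ** n_stages


if __name__ == "__main__":
    # Illustrative values only: p = 0.6 and an initial capital of 100 points.
    for n in (1, 5, 10, 20):
        print(n,
              round(ruin_probability_all_in(0.6, n), 4),
              round(expected_capital_all_in(100.0, 0.6, n), 2))
```

The expected capital grows geometrically while the chance of keeping any capital at all shrinks geometrically, which is the tension the logarithmic criterion is meant to resolve.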
A MODEL FOR MBG
Restricting himself to betting policies for MBG which require the amount bet to be a fixed proportion of the gambler’s capital at each stage, Kelly (1956) showed that the proportional stationary Markov betting policy
$$ b(x, p) = \begin{cases} (p - q)x & \text{if } p > 1/2 \\ 0 & \text{if } p < 1/2, \end{cases} \tag{2} $$
where $p + q = 1$ and $N \to \infty$, maximizes $E(\log X_N \mid X_0)$. Bellman and Kalaba (1957), using functional equation techniques, considerably extended this result and showed that Kelly’s betting policy (2) is unique and optimal within the class of all betting policies.
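Policy (2) can be checked numerically. The sketch below assumes even-money bets ($r = s = 1$ in the later notation) and natural logarithms; the function names are illustrative. It evaluates the one-stage expected log growth at the proportion $p - q$ and at nearby proportions.

```python
import math


def kelly_fraction(p):
    """Proportion of capital staked under policy (2): p - q if p > 1/2, else 0."""
    q = 1.0 - p
    return p - q if p > 0.5 else 0.0


def expected_log_growth(p, fraction):
    """One-stage E[log(X_1 / x)] when a fixed fraction (< 1) of capital is bet
    on heads at even money."""
    q = 1.0 - p
    return p * math.log(1.0 + fraction) + q * math.log(1.0 - fraction)


if __name__ == "__main__":
    p = 0.7
    best = kelly_fraction(p)
    for fraction in (best - 0.1, best, best + 0.1):
        print(round(fraction, 2), round(expected_log_growth(p, fraction), 5))
```

At the proportion $p - q$ the growth term equals $\log 2 + p \log p + q \log q$, the constant $K$ that appears in (4).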
Define $f_N(x)$ as the expected value of the logarithm of the capital after $N$ stages, starting with an initial capital $x$ and using the optimal betting policy. Bellman and Kalaba (1957) proved that for $N \ge 1$ and $p > 1/2$,
$$ f_N(x) = \log x + NK, \tag{3} $$
where
$$ K = \log 2 + p \log p + q \log q, \tag{4} $$
and the optimal betting policy, which is unique and independent of $N$, is given by (2). The gambler’s capital in MBG at the end of a one-stage process is given by $x + ry$ if he wins the bet, and $x - sy$ if he loses it, where $r = s = 1$, and $y = \pi(p)x$ is the quantity wagered. Under the more general case, $r \ne s$; i.e., the factor multiplying the gambler’s amount of money bet differs, depending on the outcome of the chance event. For example, when $r = 2$, a gambler will double the amount wagered if heads comes up. His capital after a successful bet will be $x + 2y$. The more general MBG with $r \ne s$ has been considered by Murphy (1965). Using dynamic programming he has proved that for $N \ge 1$, $p > 1/2$, and $r$ and $s$ known positive constants,
$$ f_N(x) = \log x + NK', \tag{5} $$
where
$$ K' = p \log p + q \log q + p \log\big((r + s)/s\big) + q \log\big((r + s)/r\big), \tag{6} $$
and the optimal betting policy is given by
$$ y = \begin{cases} \dfrac{pr - qs}{rs}\, x & \text{if } 0 < (pr - qs) < rs \\ 0 & \text{otherwise.} \end{cases} \tag{7} $$
Power Utility Functions
“That a betting system is optimal with respect to some arbitrarily chosen utility function is not a very convincing argument in favor of its use. It is preferable to consider criteria which are more intrinsic” (Ferguson, 1965, p. 799). Indeed, Kelly (1956) was not concerned with the notion of utility. His model is designed for maximizing the expected rate of growth of the gambler’s capital, subject to the restriction that the gambler had some finite amount of money at the beginning of each stage. He was able to show that maximizing the expected value of the logarithm of capital is the same as maximizing the expected rate of growth of the capital when $N \to \infty$. Breiman (1961) considered two reasonable criteria for MBG, which expanded the notion of “expected rate of capital growth.” One was the minimal expected number of stages to reach or to exceed a fixed amount of capital. The other was the maximal rate
at which the capital grows for a fixed number of stages. For both criteria Breiman showed that Kelly’s betting policy, maximizing $E(\log X_N \mid X_0)$, is asymptotically optimal. Despite the compelling reasonableness of a logarithmic utility function, it might nevertheless be asked whether there are other utility functions beside the logarithmic and linear functions that yield the attractive proportional stationary Markov betting policies for MBG. Bellman and Kalaba (1957), taking $p > 1/2$, have shown that utility functions yielding proportional stationary Markov betting policies are either logarithmic or linear or power functions given by
$$ U(X_N) = \frac{X_N^{M+1}}{M+1} + c = C X_N^{M+1}, $$
where $C = 1/(M + 1)$, $-1 < M < 0$, and $c = 0$ (without loss of generality). Stevens (1959) argued that such power functions constitute an appropriate class of utility functions for relating “utiles” to dollars. Below we derive the optimal betting policy and the expected utility of capital after $N$ stages for power utility functions which satisfy the law of diminishing marginal utility. For the one-stage process, the gambler is faced with the problem of maximizing his expected utility
$$ f_1(x, y) = pC(x + ry)^{M+1} + qC(x - sy)^{M+1}, \tag{8} $$
subject to the constraint $0 \le y \le x$. Setting $df_1/dy = 0$, we obtain
$$ pr(x + ry)^M = qs(x - sy)^M, \tag{9} $$
and the optimal amount of capital to be bet on the first stage is
$$ y = \frac{(qs)^{1/M} - (pr)^{1/M}}{s(qs)^{1/M} + r(pr)^{1/M}}\, x = Hx. \tag{10} $$
The constraint $0 < y/x < 1$ means that
$$ 0 < (qs)^{1/M} - (pr)^{1/M} < s(qs)^{1/M} + r(pr)^{1/M}, $$
and we get the optimal betting policy for the one-stage process:
$$ y = \begin{cases} Hx & \text{if } 0 < (qs)^{1/M} - (pr)^{1/M} < s(qs)^{1/M} + r(pr)^{1/M} \\ 0 & \text{otherwise.} \end{cases} \tag{11} $$
Substitution of (11) into (8) yields $f_1(x) = \max_{0 \le y \le x} f_1(x, y)$ as follows:
$$ f_1(x) = pC(x + rHx)^{M+1} + qC(x - sHx)^{M+1} = Cx^{M+1}K, \tag{12} $$
where we define
$$ K = \left[\frac{r + s}{s(qs)^{1/M} + r(pr)^{1/M}}\right]^{M+1} \left\{ p(qs)^{(M+1)/M} + q(pr)^{(M+1)/M} \right\}. \tag{13} $$
It can be proved now that, for $N \ge 1$,
$$ f_N(x) = Cx^{M+1}K^N, $$
where $K$ is given in (13), and the optimal betting policy is given by (11) and is independent of $N$. The proof is by induction. The Principle of Optimality of dynamic programming (Bellman and Kalaba, 1965, p. 35) yields the recurrence relation
$$ f_N(x) = \max_{0 \le y \le x} \big[ p f_{N-1}(x + ry) + q f_{N-1}(x - sy) \big], \qquad N \ge 2. $$
Begin with the known result (12) for $N = 1$, and assume that the result holds for $N$. Then for $N \ge 1$
$$ f_{N+1}(x) = \max_{0 \le y \le x} \big[ p f_N(x + ry) + q f_N(x - sy) \big] = \max_{0 \le y \le x} \big[ pC(x + ry)^{M+1} K^N + qC(x - sy)^{M+1} K^N \big]. \tag{14} $$
Substitute (11) for $y$ in (14). Then
$$ f_{N+1}(x) = pC(x + rHx)^{M+1}K^N + qC(x - sHx)^{M+1}K^N = Cx^{M+1}K^N \big[ p(1 + rH)^{M+1} + q(1 - sH)^{M+1} \big] = Cx^{M+1}K^{N+1}. \tag{15} $$
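The closed forms (10), (11), and (13) lend themselves to a direct numerical check. The following is a minimal sketch, assuming $0 < p < 1$ and $-1 < M < 0$; the function names are illustrative. It computes the proportion $H$, the factor $K$ in two ways, and the $N$-stage value $Cx^{M+1}K^N$.

```python
def power_utility_fraction(p, M, r=1.0, s=1.0):
    """Optimal proportion H of (10)-(11) for U(x) = x**(M+1) / (M+1).
    Assumes 0 < p < 1 and -1 < M < 0; returns 0 when the condition in (11) fails."""
    q = 1.0 - p
    a = (q * s) ** (1.0 / M)
    b = (p * r) ** (1.0 / M)
    numerator, denominator = a - b, s * a + r * b
    return numerator / denominator if 0.0 < numerator < denominator else 0.0


def growth_factor_direct(p, M, r=1.0, s=1.0):
    """K written as p(1 + rH)^(M+1) + q(1 - sH)^(M+1), the bracket in (15)."""
    q = 1.0 - p
    H = power_utility_fraction(p, M, r, s)
    return p * (1.0 + r * H) ** (M + 1.0) + q * (1.0 - s * H) ** (M + 1.0)


def growth_factor_closed_form(p, M, r=1.0, s=1.0):
    """K from (13); valid when the optimal bet is interior (H > 0)."""
    q = 1.0 - p
    a = (q * s) ** (1.0 / M)
    b = (p * r) ** (1.0 / M)
    e = (M + 1.0) / M
    return ((r + s) / (s * a + r * b)) ** (M + 1.0) * (p * (q * s) ** e + q * (p * r) ** e)


def optimal_expected_utility(x, p, M, n_stages, r=1.0, s=1.0):
    """f_N(x) = C x^(M+1) K^N with C = 1 / (M + 1)."""
    return x ** (M + 1.0) / (M + 1.0) * growth_factor_direct(p, M, r, s) ** n_stages


if __name__ == "__main__":
    # Square-root utility corresponds to M = -1/2.
    for p in (0.6, 0.7, 0.8):
        print(p, round(power_utility_fraction(p, -0.5), 3),
              round(growth_factor_closed_form(p, -0.5), 5))
```

For the square-root case with $r = s = 1$ the proportion reduces to $(p^2 - q^2)/(p^2 + q^2)$, which is a convenient hand check on the general formula.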
For example, suppose the utility function is proportionate to the square root of the gambler’s capital, a suggestion made by Cramer (Bernoulli, 1954) for solving the St. Petersburg paradox. Suppose further that the gambler either wins the amount of money he bets on each stage or loses it, i.e., $r = s = 1$, and that $p = 3/4$. Under these conditions the gambler’s optimal betting policy from (11) is to bet 80% of his capital on heads at each stage. For a logarithmic utility function the gambler’s optimal betting policy (2) is to bet exactly one half of his capital.
The Adaptive Gambler
Proportional stationary Markov betting policies have been derived for MBG assuming that the value of p is communicated to the gambler. In this section p is assumed to be a fixed but unknown parameter. The gambler is not told the true value of p at any stage in the MBG, but knows that it is fixed over time. His problem, then, is to learn the value of the parameter p of a binomial process and to make decisions simultaneously, trying to maximize the expected value of the utility of his capital after N stages.
To account for the stage-to-stage learning of the unknown value of $p$, a Bayesian learning model has been proposed in a second paper by Bellman and Kalaba (1958): Suppose the gambler’s convictions concerning $p$ are summarized by a personal probability distribution, a beta distribution with probability density function
$$ g_j(p) = \begin{cases} \dfrac{p^{\alpha_j - 1}(1 - p)^{\beta_j - 1}}{\int_0^1 p^{\alpha_j - 1}(1 - p)^{\beta_j - 1}\, dp} & \text{for } 0 \le p \le 1 \\ 0 & \text{elsewhere.} \end{cases} \tag{16} $$
Parameters of the prior distribution, with density function $g_0(p)$, are $\alpha_0 > 0$, $\beta_0 > 0$, representing the gambler’s knowledge and beliefs concerning $p$ at stage 0. Suppose further that in $j$ stages, $1 \le j \le N$, the gambler has won $u$ bets and lost $v$ bets, $u + v = j$. Then it can be shown (Raiffa and Schlaifer, 1961) that the parameters of (16), the posterior personal probability distribution after $j$ stages, are $\alpha_j = \alpha_0 + u$ and $\beta_j = \beta_0 + v$. The mean of $g_j(p)$ is given by
$$ p_j = \frac{\alpha_j}{\alpha_j + \beta_j}, \tag{17} $$
where $p_j$ represents the gambler’s revised estimate of $p$ at stage $j$, following $u$ wins and $v$ losses. For the logarithmic and power utility functions, the optimal betting policies after $j$ stages are given by (7) and (11), respectively, where $p_j$ calculated from (17) is substituted for $p$. For each utility function the optimal betting policy for the adaptive gambler is no longer a stationary policy. The fraction of capital that he should wager on stage $j$, $\pi(p_j)$, depends on $p_j$ and will change from one stage to another. However, as $p_j$ can be shown to be a consistent estimator of $p$, i.e., $\lim_{j \to \infty} p_j = p$, $\pi(p_j)$ will approach $\pi(p)$ when $N \to \infty$.
Sensitivity of the Optimal Betting Policy
The sensitivity of the optimal betting policy is studied by assessing the effect of departures from this policy on the expected utility in the MBG. Normatively, sensitivity studies have certain obvious merits, especially if the costs of searching for the optimal decision policy and executing it are high. In the MBG, for example, the adaptive gambler may be well advised not to expend too much effort in calculating the exact value of $p$, but to be satisfied with a rough estimate of it, if it can be shown that he may suffer only a negligible decrease in the expected utility of his winnings. The study of the sensitivity of the optimal betting policy also may be of value for explaining the gambler’s betting behavior. A gambler playing the MBG may depart
from the optimal betting policy for various reasons. He may utilize only part of the information regarding the outcomes of his previous decisions and hence his estimate of $p$ may be substantially in error. The cost of searching for the decision with the highest expected utility, sometimes called the “cost of thinking,” may be high. Or he may simply prefer to introduce variability into the betting instead of always wagering the same fraction of his capital even when the value of $p$ is known. Theoretically, one could obtain the utilities for “cost of thinking,” “variability preference,” and all other factors that could cause deviation from optimal betting behavior. These utilities, together with the utility of wealth, would be incorporated into a generalized utility function and an attempt would be made to derive the optimal betting policy. This approach, however, if feasible at all, would require tremendous effort to discover all factors that may cause discrepancies from optimal decision behavior and to measure their utilities. Furthermore, very little is presently known as to how these utilities should be combined into one general utility function. Presumably, a gambler may depart from the optimal policy if he can safely assume that his decision will result in only a slight decrease in the expected utility of his winnings. A second possible approach, then, to account for betting behavior, would be to derive the optimal betting policy for a given utility function, as we did above, and then to study its sensitivity to departures from optimality in terms of the expected utility for the MBG. The one-stage expected utility $E(\log X_{j+1} \mid x_j = 100)$ for a known value of $p$ and $r = s = 1$ is charted against values of $\pi(p)$ ranging between zero and one in Fig. 1 (for $p = .6$), Fig. 2 (for $p = .7$), and Fig. 3 (for $p = .8$).
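The one-stage curves described above can be tabulated with a few lines of code. This is a minimal sketch, assuming $r = s = 1$, a current capital of 100, and base-10 logarithms (an assumption suggested by the plotted values lying near 2.0); the function names and the grid resolution are illustrative.

```python
import math


def one_stage_expected_utility(p, fraction, x=100.0, utility=math.log10):
    """E[U(X_{j+1}) | x_j = x] when the proportion `fraction` of capital is bet
    on heads at even money."""
    q = 1.0 - p
    return p * utility(x * (1.0 + fraction)) + q * utility(x * (1.0 - fraction))


def best_fraction_on_grid(p, utility=math.log10, steps=1000):
    """Grid search for the proportion with the highest one-stage expected utility.
    The grid stops just short of 1 so that log(0) is never evaluated."""
    grid = [i / steps for i in range(steps)]
    return max(grid, key=lambda f: one_stage_expected_utility(p, f, utility=utility))


if __name__ == "__main__":
    for p in (0.6, 0.7, 0.8):
        print(p,
              best_fraction_on_grid(p),                       # logarithmic utility
              best_fraction_on_grid(p, utility=math.sqrt))    # square-root utility
```

Printing the utility at a handful of proportions on either side of the maximum shows how flat each curve is near its peak.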
The highest point on each of the three curves corresponds to the optimal betting proportion yielded by the betting policy (2), i.e., .2, .4, and .6, for $p = .6$, $p = .7$, $p = .8$, respectively. The one-stage expected utilities $E(\sqrt{X_{j+1}} \mid x_j = 100)$ are charted in Figs. 4, 5, and 6 for the same values of $r$, $s$, $p$, and $x$. The optimal proportions for a square-root utility function obtained from (10) are .385, .690, and .882, for $p = .6$, $p = .7$, and $p = .8$, respectively. Figures 1 to 6 indicate that the optimal betting policies for both the logarithmic and the square-root utility functions are relatively insensitive to discrepancies from optimality. The optimal betting proportion is surrounded in each of these figures by a fairly wide band of almost equally advantageous proportions. To investigate the sensitivity to discrepancies from the optimal betting policy and the width of the tolerance band, the following efficiency measure (Rapoport, 1968) is introduced:
$$ Z_N = \frac{U_N - U_N^-}{U_N^+ - U_N^-}. \tag{18} $$
In expression (18) $U_N^+$ denotes the expected utility for an $N$-stage MBG resulting from employing an optimal betting policy, assuming a linear, logarithmic, or a power utility function. $U_N^-$ denotes the expected utility for an $N$-stage MBG resulting from
FIG. 1. One-stage expected utility for a logarithmic function and p = .6.
FIG. 2. One-stage expected utility for a logarithmic function and p = .7.
FIG. 3. One-stage expected utility for a logarithmic function and p = .8.
FIG. 4. One-stage expected utility for a square-root function and p = .6.
FIG. 5. One-stage expected utility for a square-root function and p = .7.
FIG. 6. One-stage expected utility for a square-root function and p = .8.
(In each figure the horizontal axis is the betting proportion $\pi(p)$, from 0 to 1.0, and the vertical axis is the expected utility.)
using the worst policy (in terms of the expected utility) for the corresponding utility function. $U_N$ indicates the gambler’s expected utility for an $N$-stage MBG and the given utility function, which results from betting a fixed proportion of his capital all the time. Clearly, $U_N^- \le U_N \le U_N^+$, and hence $0 \le Z_N \le 1$. Fixing a value for $Z_N$, one may solve for the value of $U_N$ in (18). The calculated value of $U_N$ determines a tolerance band surrounding the optimal betting proportion with lower and upper limits, $\pi_N^-(p)$ and $\pi_N^+(p)$, respectively. Table 1 presents the values of $U_1$ and the corresponding band limits, $\pi_1^-(p)$ and $\pi_1^+(p)$, for $Z_1 = .99$ and for all combinations of the two utility functions (the logarithmic and the square root) and the three selected values of $p$. Table 2 presents similar results for $Z_1 = .95$.

TABLE 1
Value of $U_1$ and Lower and Upper Tolerance Limits of the Corresponding Proportion Band for $Z_1 = .99$

Utility function    p = .6               p = .7               p = .8
Logarithmic         2.0025 (.04-.36)     2.0315 (.27-.52)     2.0813 (.51-.68)
Square root         10.1811 (.28-.49)    10.7616 (.63-.74)    11.6464 (.83-.93)

TABLE 2
Values of $U_1$ and Lower and Upper Tolerance Limits of the Corresponding Proportion Band for $Z_1 = .95$

Utility function    p = .6               p = .7               p = .8
Logarithmic         1.9773 (.00-.54)     2.0145 (.10-.65)     2.0716 (.39-.76)
Square root         10.1126 (.14-.61)    10.7268 (.55-.81)    11.5799 (.75-.97)

Tables 1 and 2 show that when $Z_1$ is large the tolerance bands are approximately symmetric about the optimal proportions, and that they grow narrower as the value of $p$ increases. More importantly, the bands are seen to be wide even for very high values of $Z_1$ such as $Z_1 = .99$. Thus, for example, when $p = .7$, $r = s = 1$, $Z_1 = .99$, and the gambler’s utility function is logarithmic, the band covers one quarter of the range of $\pi(.7)$. To realize the implications of this result, consider a gambler who does not know the true value of $p = .7$, but wishes to estimate it. His estimate may be
erroneous, but if it falls between the proportions .635 and .760, the gambler employing the optimal betting policy (2) will perform efficiently without appreciable decrement in the one-stage expected utility of his winnings. A slightly lower measure of efficiency, $Z_1 = .95$, allows for an even wider tolerance band for the gambler’s estimate: .550 to .825. Alternatively, the gambler’s estimate of $p$ may be accurate but he may decide to bet any fraction of his capital between .10 and .65 without seriously decreasing the one-stage expected utility of his capital. Figures 1-6 together with Tables 1 and 2 show the low sensitivity of the optimal betting policy to discrepancies from optimality. The results are restricted, however, to the one-stage MBG only, and one may inquire how they are affected by increasing $N$. It can be shown (Breiman, 1961) that if a gambler playing the $N$-stage MBG selects a policy which, in the long run (when $N \to \infty$), does not become close to the optimal betting policy, then asymptotically he will be infinitely worse off. For the analysis of actual betting behavior in the MBG, where $N$ is finite and small, the asymptotic results are not of great value. In particular, they are of no help at all in analyzing the betting decisions of the subjects who participated in our experiments, where the mean value of $N$ for a problem was about 10. When the insensitivity of the optimal betting policy is measured by $Z_N$, it can be shown that the insensitivity is great even when $N$ increases. In particular, we show below that, in the case of a logarithmic utility function, $Z_N$ is independent of $N$. To show this we write $Z_1$ for the logarithmic utility function:
$$ Z_1 = \frac{[p \log(x + y) + q \log(x - y)] - [p \log(x + y') + q \log(x - y')]}{[p \log(x + y'') + q \log(x - y'')] - [p \log(x + y') + q \log(x - y')]} $$
$$ = \frac{p \log[x(1 + \pi)] + q \log[x(1 - \pi)] - p \log[x(1 + \pi')] - q \log[x(1 - \pi')]}{p \log[x(1 + \pi'')] + q \log[x(1 - \pi'')] - p \log[x(1 + \pi')] - q \log[x(1 - \pi')]} $$
$$ = \frac{p \log[(1 + \pi)/(1 + \pi')] + q \log[(1 - \pi)/(1 - \pi')]}{p \log[(1 + \pi'')/(1 + \pi')] + q \log[(1 - \pi'')/(1 - \pi')]}, \tag{19} $$
where $0 \le \pi < 1$ is a fixed proportion of capital bet by the gambler, $0 \le \pi'' < 1$ is the betting proportion obtained from (2), and $0 < \pi' < 1$ is the betting proportion resulting in the worst one-stage expected utility. The expected utility for the $N$-stage MBG resulting from betting repeatedly the same proportion $\pi$ is given by
$$ U_N = p^N \log[x(1 + \pi)^N] + C_1^N p^{N-1} q \log[x(1 + \pi)^{N-1}(1 - \pi)] + \cdots + C_R^N p^R q^{N-R} \log[x(1 + \pi)^R (1 - \pi)^{N-R}] + \cdots + q^N \log[x(1 - \pi)^N], \tag{20} $$
where $C_R^N = N!/[R!(N - R)!]$. Expanding and collecting terms in (20) we have
$$ U_N = \log x + N p^{N-1}[p \log(1 + \pi)] + (N - 1) C_{N-1}^N p^{N-2} q [p \log(1 + \pi)] + C_{N-1}^N p^{N-1}[q \log(1 - \pi)] + \cdots + R\, C_R^N p^{R-1} q^{N-R}[p \log(1 + \pi)] + (N - R) C_R^N p^R q^{N-R-1}[q \log(1 - \pi)] + \cdots + N q^{N-1}[q \log(1 - \pi)] $$
$$ = \log x + N[p \log(1 + \pi) + q \log(1 - \pi)]. \tag{21} $$
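The collapse of the binomial sum (20) into the closed form (21) can be confirmed numerically. The sketch below assumes $r = s = 1$ and natural logarithms; the function names are illustrative.

```python
import math


def expected_log_utility_binomial(x, p, fraction, n_stages):
    """Sum (20): expectation of log capital over the number of wins R
    in n_stages independent plays with a fixed betting proportion."""
    q = 1.0 - p
    total = 0.0
    for wins in range(n_stages + 1):
        losses = n_stages - wins
        probability = math.comb(n_stages, wins) * p ** wins * q ** losses
        capital = x * (1.0 + fraction) ** wins * (1.0 - fraction) ** losses
        total += probability * math.log(capital)
    return total


def expected_log_utility_closed_form(x, p, fraction, n_stages):
    """Closed form (21): log x + N [p log(1 + pi) + q log(1 - pi)]."""
    q = 1.0 - p
    return math.log(x) + n_stages * (p * math.log(1.0 + fraction)
                                     + q * math.log(1.0 - fraction))


if __name__ == "__main__":
    print(expected_log_utility_binomial(100.0, 0.7, 0.4, 10))
    print(expected_log_utility_closed_form(100.0, 0.7, 0.4, 10))
```

The agreement reflects the fact that the logarithm of the final capital is a sum of $N$ independent one-stage increments, so its expectation is $N$ times the one-stage expectation.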
$Z_N$ is given by
$$ Z_N = \frac{\log x + N[p \log(1 + \pi) + q \log(1 - \pi)] - \log x - N[p \log(1 + \pi') + q \log(1 - \pi')]}{\log x + N[p \log(1 + \pi'') + q \log(1 - \pi'')] - \log x - N[p \log(1 + \pi') + q \log(1 - \pi')]} $$
$$ = \frac{p \log[(1 + \pi)/(1 + \pi')] + q \log[(1 - \pi)/(1 - \pi')]}{p \log[(1 + \pi'')/(1 + \pi')] + q \log[(1 - \pi'')/(1 - \pi')]} = Z_1. \tag{22} $$
Thus, the lower and upper tolerance limits given in Tables 1 and 2 for the logarithmic utility function hold for every value of $N$. For the power utility function it can be shown in a similar way that $Z_N$ is given by
$$ Z_N = \frac{[p(1 + \pi)^{M+1} + q(1 - \pi)^{M+1}]^N - [p(1 + \pi')^{M+1} + q(1 - \pi')^{M+1}]^N}{[p(1 + \pi'')^{M+1} + q(1 - \pi'')^{M+1}]^N - [p(1 + \pi')^{M+1} + q(1 - \pi')^{M+1}]^N}, \tag{23} $$
where $\pi$, $\pi'$, and $\pi''$ are defined as before. Here $Z_N$ depends on $N$; it decreases when $N$ increases. The rate of decrease is very slow, however, especially for values of $\pi$ which are close to $\pi''$ and yield high values for $Z_N$. For example, consider the decrease in $Z_N$, assuming $p = .7$, $r = s = 1$, and a square-root utility function. The optimal betting proportion, $\pi''$, calculated from (11) is $\pi'' = .69$, and the worst betting proportion is $\pi' = 1.00$. For $\pi = .75$ we get from (23) $Z_1 = .9883$, $Z_2 = .9879$, $Z_3 = .9874$, ..., declining to .9857, .9822, and .9741 at successively larger values of $N$. The implications of the high insensitivity to discrepancies from the optimal betting policy for the analysis of betting behavior are explored in the following sections, where results from a computer-controlled MBG are presented and discussed.
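The slow decline of the efficiency measure for a power utility function can be reproduced from (23). The following is a minimal sketch, assuming $r = s = 1$; the function names are illustrative, and the values $p = .7$, $\pi = .75$, $\pi'' = .69$, $\pi' = 1.00$ are those of the square-root example above ($M = -1/2$).

```python
def stage_factor(p, fraction, M):
    """One-stage factor p(1 + pi)^(M+1) + q(1 - pi)^(M+1) appearing in (23)."""
    q = 1.0 - p
    return p * (1.0 + fraction) ** (M + 1.0) + q * (1.0 - fraction) ** (M + 1.0)


def efficiency_power(p, fraction, best, worst, M, n_stages):
    """Z_N of (23): position of the chosen proportion between the worst and
    the optimal proportion, measured in N-stage expected utility."""
    used = stage_factor(p, fraction, M) ** n_stages
    top = stage_factor(p, best, M) ** n_stages
    bottom = stage_factor(p, worst, M) ** n_stages
    return (used - bottom) / (top - bottom)


if __name__ == "__main__":
    for n in (1, 2, 3, 5, 10, 20):
        print(n, round(efficiency_power(0.7, 0.75, 0.69, 1.0, -0.5, n), 4))
```

Because intermediate rounding in the published figures is unknown, the printed values should be expected to agree with the text only to about three decimal places.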
METHOD
Subjects. The subjects were 51 male students from the University of North Carolina. They were assigned to one of three groups, I, II, or III, on the basis of order of appearance at the laboratory. Each subject served individually for one session in a computer-controlled decision experiment.
Task. Each subject was seated in front of the Flexowriter of an LGP-30 computer. The following printed instructions provided a “cover story” for MBG with the value of $p$ unknown, and $r = s = 1$.
Imagine that there are only two players, player 1 and player 2, participating in a sequence of games. There is no possibility of a tie game. Consider the computer as your agent, whose job is to collect prior information about the condition of the players and to use it to predict which player will win. The computer’s prediction, of course, may or may not turn out to be correct. The experiment is conducted as follows: There are several problems in the experiment. Each problem corresponds to a different week, in which the same two players can play any number of games. At the beginning of each problem the computer will print the problem number and an initial sum for this problem. This is your starting capital for your first decision. The computer will print out a message, predicting which player will win. (It will print either 1 or 2.) Before the game is played you will put a wager on the player predicted to win. This wager may range between zero and your total capital, i.e., you may wager no money at all, part of your capital, or all of your capital. The game will then be played and the computer will print out the outcome. If the player predicted to win the game actually wins, the amount of your wager will be added to your capital. Otherwise, you will lose your wager. The computer will print out your capital after you make your decision. This capital will be equal to your initial capital plus (if the prediction is correct) or minus (if the prediction is wrong) the amount of your wager. This capital is your starting capital for your next decision. You will then proceed to the next game. The same procedure will be followed; i.e., (1) the computer will tell you which player is predicted to win; (2) you will make your decision; and (3) the computer will print out the outcome of the game and then compute and print out your new capital. A problem will be terminated on one of two occasions:
1. After a predetermined number of games which may vary from problem to problem (i.e., from one week to another).
2. If you lose all of your capital.
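The procedure described in the instructions can be sketched as a short simulation. This is only an illustration under stated assumptions: the wager is taken to be a fixed proportion of current capital, which is one possible strategy rather than what subjects were told to do, and the function name and the outcome sequence are hypothetical.

```python
def play_problem(initial_capital, outcomes, fraction):
    """Play one problem of the MBG with r = s = 1.

    outcomes: sequence of booleans, True when the computer's prediction is correct.
    fraction: proportion of current capital wagered before each game (assumed fixed).
    Returns the capital after each game, starting with the initial capital.
    The problem ends after the last game or as soon as the capital is lost.
    """
    capital = float(initial_capital)
    history = [capital]
    for prediction_correct in outcomes:
        wager = fraction * capital
        capital = capital + wager if prediction_correct else capital - wager
        history.append(capital)
        if capital <= 0.0:
            break  # second termination rule: all capital lost
    return history


if __name__ == "__main__":
    # Hypothetical outcome sequence for a six-game problem.
    print(play_problem(80, [True, True, False, True, False, True], 0.2))
```

Setting `fraction` to 1.0 shows how a single wrong prediction ends a problem early, the situation that stopped several subjects.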
In addition, there were four features of the experiment that the subject was asked to remember: (a) The number of problems in the experiment was fixed and unknown. (b) The number of games (stages) in a given problem was fixed and unknown. It varied considerably from one problem to another. (c) The parametric proportion of correct predictions, $p$, assumed the same value for all problems. (d) The subject’s decision could not affect the outcome of a game. The values of $p$ were $p = .6$, $p = .7$, and $p = .8$ for Groups I, II, and III, respectively. For all subjects in each group, the same sequence of successes (correct predictions) and failures (wrong predictions) was presented. If a subject failed to complete a problem because of losing all of his capital he was moved to the next problem and received the same sequence of successes and failures as another subject who completed all games in the previous problem. The subject’s capital was expressed in points, each of which was worth 1/10 of a cent for subjects in Group I, 1/30 of a cent for subjects in Group II, and 1/100 of a cent for subjects in Group III. The accumulated points for each problem were converted into
money at the end of each problem. The accumulated gain was paid to a subject at the end of the experiment. The mean gain was $1.69, $0.97, and $1.14 for Groups I, II, and III, respectively. The difference among the means was not significant, F(2,48) = .71, p > .20. After reading the instructions, the subject was left alone to perform the task. The printed messages, subject’s bets, and the outcomes of the games formed a record to which he could refer throughout the experiment. The typical subject completed the task in about 75 min. The number of games (stages) in each of 10 problems, $n_w$, $w = 1, 2, \ldots, 10$, and the initial amount of capital (in points) for each problem, $x_{0,w}$, are given in the first two rows of Table 3.

TABLE 3
Number of Stages, Initial Capital, and Number of Subjects Completing Each Problem

                                Problem
               1     2     3     4     5     6     7     8     9    10
n_w            6    17    11     2    14     8    20     5    15    12
x_{0,w}       80   140   180   120    20   200    60   160   100    40
Group I       17    13    16    17     9    17    11    17    11    14
Group II      17    12    15    17    17    16     3    14    14    15
Group III     17    13    14    16    16    16     9    17    15    17

At the end of Problems 3, 6, and 9, the subject was asked to estimate the percentage of time that the computer would be correct in future predictions. He did so by typing a two-digit number on the Flexowriter. At the end of the experiment, following Problem 10, the subject was asked to answer several questions that appeared on a final questionnaire. In the last question, the subject was asked to estimate the proportion of successes (correct predictions) in the experiment.
RESULTS
The total number of games in the experiment was 110. Several subjects, however, lost their entire capital on certain problems, and played fewer than 110 games. The
mean number of games played was 94.88, 92.12, and 100.59 for Groups I, II, and III, respectively, with corresponding standard deviations of 12.96, 10.56, and 9.01. (Differences among the groups in the mean number of games played failed to reach significance at the .05 level.) The number of subjects completing each problem is listed in the last three rows of Table 3. To be stopped on a given problem a subject had to bet and lose his entire capital on the predicted message of a certain game. Table 3 shows that more than half the subjects in Groups II and III failed to complete the games in Problem 7, which had the largest number of games (n_7 = 20) and a relatively small initial capital. Eight out of 17 subjects in Group I bet all of their capital on the first game of Problem 5, the one with the smallest initial capital. On both problems, when their capital was relatively small, many subjects preferred to risk their entire capital, and thus either to increase it in the fastest way possible or to lose it and move to a more favorable game with a larger initial capital.
TABLE 4
Means and Standard Deviations of the Estimates of p at the End of Problems 3, 6, 9, and 10

                                     Problem
                 3               6               9               10
Group       Mean    SD      Mean    SD      Mean    SD      Mean    SD
I           .641   .080     .648   .137     .556   .139     .628   .041
II          .671   .092     .725   .104     .632   .136     .684   .074
III         .739   .154     .784   .140     .801   .078     .815   .070
Table 4 presents the means and standard deviations of the estimates of p at the end of Problems 3, 6, 9, and 10. It may be recalled that a subject’s estimates at the end of Problems 3, 6, and 9 are his typed estimates of the proportion of future successes, while the estimate after Problem 10 refers to the proportion of successes in the games already played. The mean estimates for all groups are seen to be rather accurate. In only one case is there a significant difference from the true value of p. The accuracy of the estimates is consistent with the prediction derived from the Bayesian learning model (17). The model predicts rapid learning of the value of p, say after two or three problems. Assuming the prior personal probability distribution
g_0(p) to be a beta distribution with α_0 = β_0 = 1, the Bayesian estimate for p = .6 at the end of Problem 3 (after 21 successes and 13 failures) equals .611, compared with the observed mean of .641 and standard deviation of .080. We will assume, then, that the value of p was estimated correctly from Problem 4 on, and analyze only the results of Problems 4-10. The main dependent variable for the analysis is the proportion of the amount of capital wagered by the subject on a given game j, y_{wj}, w = 4, 5, ..., 10, j = 1, 2, ..., n_w, where n_w is the number of games actually played by the subject on Problem w. The optimal betting policy model predicts that y_{wj} is constant over games. To test this prediction, each subject’s games in Problem w were divided into the first and second halves, discarding an odd game in the middle of the series if necessary. Only subjects who played at least two games in a problem were considered, and hence the results of several subjects were excluded from the analysis of a few problems. A simple test for whether the proportion bet is the same in the first and second halves is a paired t test on the difference scores. Twenty-one paired t tests were computed, 7 for each group. In 13 cases the mean difference scores were positive (the mean proportion bet was larger on the first half than on the second half), and in 8 cases they were negative. Only 4 t tests yielded significant differences at the .01 level. Thus, this simple test of stationarity shows only a few significant differences between the first and second halves of each problem in terms of the proportion of capital bet by subjects. However, it is a very weak test of the model. It provides no evidence concerning whether the proportion of capital bet is constant over subjects, nor whether the mean proportion bet by a subject has the same value in all problems.
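Two computations from the paragraph above can be made concrete. The sketch below assumes that the Bayesian estimate is the posterior mean of a beta distribution updated with binomial data, which reproduces the quoted value of .611, and that the difference score is the first-half mean minus the second-half mean. The function names are illustrative and do not come from the paper.

```python
def beta_posterior_mean(alpha0, beta0, successes, failures):
    """Posterior mean of p under a Beta(alpha0, beta0) prior and binomial data."""
    return (alpha0 + successes) / (alpha0 + beta0 + successes + failures)


def split_half_difference(bets):
    """Mean proportion bet in the first half minus the mean in the second half.

    `bets` lists one subject's proportions bet on the games of one problem.
    With an odd number of games the middle game is discarded, as in the text.
    """
    n = len(bets)
    if n < 2:
        raise ValueError("at least two games are needed")
    half = n // 2
    first = bets[:half]
    second = bets[n - half:]
    return sum(first) / half - sum(second) / half


# Uniform prior (alpha_0 = beta_0 = 1), 21 successes and 13 failures by the end of Problem 3.
estimate = beta_posterior_mean(1, 1, 21, 13)
print(round(estimate, 3))  # 0.611

# Hypothetical bets on a five-game problem; the third game is dropped.
print(split_half_difference([0.4, 0.3, 0.9, 0.2, 0.1]))
```

A positive difference score means that the subject bet a larger proportion in the first half of the problem than in the second.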
To study interproblem changes in the mean proportion bet, let \bar{y}_w indicate, for any subject, the mean proportion of capital bet on Problem w, i.e.,

\bar{y}_w = \frac{1}{n_w} \sum_{j=1}^{n_w} y_{wj}.
The means and standard deviations of \bar{y}_w, taken over subjects, are presented in Table 5 for each group separately. Table 5 shows large standard deviations and considerable changes in the mean proportion bet from one problem to another. One-way analysis of variance with repeated measures revealed problem effects significant at the .01 level for each of the three groups. The significant interproblem differences among the mean proportions bet provide evidence against the optimal betting policy model. They do not, however, suggest explanations or possible reasons for the discrepancy from optimal behavior. To understand better the factors affecting subjects’ betting policies and to demonstrate the nature of the inadequacy of the model, we can formulate and test two hypotheses which are directly implied by any stationary proportional Markov betting policy.
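The repeated-measures analysis mentioned above can be sketched as follows. This is a minimal illustration, assuming a complete subjects-by-problems matrix of mean proportions bet; how the original analysis treated subjects who were stopped early on a problem is not stated, and the data below are hypothetical.

```python
def repeated_measures_f(scores):
    """One-way repeated-measures ANOVA F ratio for the problem effect.

    scores[i][j] is the mean proportion bet by subject i on problem j.
    Every subject must have a value for every problem (an assumption here).
    Returns (F, df_problems, df_error).
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of problems
    grand = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    problem_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_problems = n * sum((m - grand) ** 2 for m in problem_means)
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    # What remains after removing problem and subject effects is the error term.
    ss_error = ss_total - ss_problems - ss_subjects

    df_problems = k - 1
    df_error = (k - 1) * (n - 1)
    f_ratio = (ss_problems / df_problems) / (ss_error / df_error)
    return f_ratio, df_problems, df_error


# Hypothetical data: three subjects, three problems.
example = [
    [0.30, 0.50, 0.25],
    [0.40, 0.55, 0.30],
    [0.20, 0.45, 0.28],
]
f_ratio, df1, df2 = repeated_measures_f(example)
print(f"F({df1},{df2}) = {f_ratio:.2f}")
```

Removing the between-subject sum of squares from the error term is what distinguishes this design from an ordinary one-way analysis, and it matters here because subjects differ widely in their overall level of betting.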
TABLE 5
Means and Standard Deviations of Mean Proportion Bet on Each Problem

                                  Problem
Group              4      5      6      7      8      9     10
I       Mean    .298   .518   .267   .420   .310   .431   .442
        SD      .206   .272   .216   .186   .236   .279   .290
II      Mean    .384   .351   .485   .660   .438   .421   .485
        SD      .183   .170   .217   .212   .273   .248   .250
III     Mean    .593   .450   .535   .696   .611   .542   .600
        SD      .280   .207   .203   .224   .265   .194   .188
The first hypothesis asserts that the proportion of capital wagered, y_{wj}, is independent of outcomes of games played at prior stages. The second hypothesis states that y_{wj} is independent of the amount of the subject’s capital, x_{wj}. The two hypotheses are dependent, since a run of several successful outcomes will increase the subject’s capital, while several consecutive failures will decrease it (even though it might still be large in its absolute value). To test the first hypothesis, x_{wj} should be fixed before the effect of outcomes of preceding games is assessed. To test the second hypothesis, the effect of the outcomes of preceding games should be controlled before the relation between x_{wj} and y_{w,j+1} is investigated. The second hypothesis is tested by studying the relation between x_{w,0}, the initial capital on Problem w, and y_{w1}, the proportion bet on the first game of Problem w. Since capital gained on Problem w - 1 cannot be wagered on Problem w, successes and failures in Problem w - 1 presumably had no effect on y_{w1}. Furthermore, x_{w,0} assumes the same value for all subjects. Table 6 presents the means and standard deviations, \bar{y}_{w1} and SD(y_{w1}) respectively, of y_{w1} for Problems 4 to 10. Results are presented separately for each group. One-way analysis of variance with repeated measures revealed significant problem effects at the .01 level for every group. Values of \bar{y}_{w1} as a function of x_{w,0} are plotted in Fig. 7. The problem number, w, is written below the corresponding x_{w,0}. Figure 7 shows clearly that the mean proportion bet on the first game is a decreasing function of initial capital for all groups, a finding that flatly rejects the second hypothesis. The drop in \bar{y}_{w1} is relatively large. For example, the differences between mean proportions bet on the first games of Problems 5 and 6, where x_{5,0} = 20 and x_{6,0} = 200, are .219, .310, and .137 for Groups I, II, and III, respectively.
TABLE 6
Means and Standard Deviations of First Proportion Bet on Each Problem

                                  Problem
Group              4      5      6      7      8      9     10
I       Mean    .264   .444   .225   .427   .359   .365   .540
        SD      .213   .354   .213   .298   .200   .302   .343
II      Mean    .428   .747   .437   .511   .487   .612   .640
        SD      .237   .291   .292   .282   .256   .363   .324
III     Mean    .554   .644   .507   .690   .555   .644   .610
        SD      .294   .271   .243   .234   .248   .266   .225
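The negative relation between initial capital and the first bet can be checked directly from the tabled means. The sketch below pairs the first-game means of Table 6 with the initial capitals for Problems 4-10 as read from Table 3 (the text itself confirms 20 points for Problem 5 and 200 points for Problem 6). The Pearson correlation is an illustrative summary chosen here; the paper reports the plot and the analysis of variance, not a correlation.

```python
from math import sqrt

# Initial capital in points for Problems 4..10, as read from Table 3.
initial_capital = [120, 20, 200, 60, 160, 100, 40]

# Mean proportion bet on the first game of Problems 4..10 (Table 6).
first_bet_mean = {
    "I": [0.264, 0.444, 0.225, 0.427, 0.359, 0.365, 0.540],
    "II": [0.428, 0.747, 0.437, 0.511, 0.487, 0.612, 0.640],
    "III": [0.554, 0.644, 0.507, 0.690, 0.555, 0.644, 0.610],
}


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


correlations = {
    group: pearson_r(initial_capital, means)
    for group, means in first_bet_mean.items()
}
for group, r in correlations.items():
    print(f"Group {group}: r = {r:.2f}")

# Drop in the mean first bet from Problem 5 (20 points) to Problem 6 (200 points).
for group, means in first_bet_mean.items():
    print(f"Group {group}: difference = {means[1] - means[2]:.3f}")
```

With only seven problems per group these correlations are descriptive at best, but their sign is the point: under a stationary proportional policy they should be near zero.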
FIG. 7. Mean proportion of capital bet on first trial as a function of initial capital on Problems 4-10. (Horizontal axis: initial capital, from 20 to 200 points, with the problem number in parentheses below each value.)

To test the first hypothesis there must be found, for each subject, a subset of games with the same value of available capital. Then the effect of outcomes of preceding games on the corresponding proportion bet should be assessed. Two difficulties are involved in finding the appropriate subset of games. First, there is more than one subset for
every subject, depending on the value selected for capital, x_{wj}. Secondly (and more importantly), inspection of individual data indicates that there were typically only two or three games in which x_{wj} assumed exactly the same value. To overcome these difficulties two alternative approaches may be pursued. The first consists of studying the effect of outcomes of preceding games on the proportion bet, disregarding the relation between y_{wj} and x_{wj}. This approach makes certain that all the games played by every subject will be included in the analysis. Let y(z) indicate the mean proportion of capital bet by a subject on a certain game after a run of z consecutive successes, z = 1, 2, ..., 5, and let y(z’) indicate the mean proportion bet by a subject after a run of z’ consecutive failures, z’ = 1, 2, 3. Longer runs of successes and failures are not considered, as they occur in the experiment with very low frequencies. Table 7 presents the means, M, of y(z) and y(z’) taken over
TABLE 7
Mean Proportion Bet after z Consecutive Successes and z’ Consecutive Failures

                          Successes                         Failures
Group          z=1    z=2    z=3    z=4    z=5      z’=1   z’=2   z’=3
I       M     .332   .306   .289   .276   .294      .379   .386   .318
        N       41     23     10      5      3        28     13      2
II      M     .374   .356   .324   .250   .215      .448   .542   .713
        N       50     31     18      9      5        19      4      1
III     M     .533   .516   .496   .461   .434      .575   .627   .627
        N       58     45     34     24     17        11      2      1
Problems 4-10 for all subjects in a given group. N refers to the number of trials for which the means were computed. Table 7 shows the mean of y(z) to be a decreasing function of z for all groups. The behavior of the mean of y(z’) is less similar from group to group. Interpretation of these results is complicated by the evidence for the negative relation between x_{wj} and y_{wj}. Were this relation partialed out, the decrease in the mean of y(z) as a function of z would probably be steeper, while the mean of y(z’) would increase as a function of z’. The second approach to testing the first hypothesis is to select a “reasonable” range of values for x_{wj}, x^- to x^+, in which the subject may be assumed to be in approximately the same financial status, and then to study the effect of preceding outcomes on the corresponding y_{wj}. Clearly, the wider the range selected, the more values of x_{wj}
and hence of y_{wj} are included in the analysis, but the weaker is the justification for assuming that the subject is in the same financial status. We selected x^- = 160 and x^+ = 200. The resulting mean number of games included in the analysis for which 160 < x_{wj} < 200 was 14.06, 15.71, and 11.29 for Groups I, II, and III, respectively. Because of the relatively small number of games analyzed, the effects of only four types of outcome run were studied: consecutive failures on Games j - 2 and j - 1 (-, -); a success on Game j - 2 followed by a failure on Game j - 1 (+, -); a failure on Game j - 2 followed by a success on Game j - 1 (-, +); and consecutive successes on Games j - 2 and j - 1 (+, +). For each run type a mean y_{wj} was computed for every subject on all games for which 160 < x_{wj} < 200. (The number of subjects differed from one group to another since there were several subjects, most of them in Group III, with no (-, -) or (-, +) runs in this restricted range of capital.) Then the mean and standard deviation were determined over subjects in each group (Table 8).
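Both approaches to the first hypothesis amount to grouping a subject's bets by what happened on the preceding games. The following sketch shows one way to do that grouping. It rests on two assumptions that the text does not settle: a "run of z" is read as the current streak length just before the bet, and `capital[j]` is taken to be the capital available when game j is bet on. Function names and sample data are hypothetical.

```python
from collections import defaultdict


def mean_bet_by_preceding_run(outcomes, bets):
    """Mean proportion bet, keyed by the streak that immediately precedes the game.

    outcomes[j] is True for a success on game j; bets[j] is the proportion
    of capital wagered on game j. Keys are (kind, length) with kind '+' or '-'.
    """
    grouped = defaultdict(list)
    streak_kind, streak_len = None, 0
    for outcome, bet in zip(outcomes, bets):
        if streak_len > 0:  # the first game of a problem has no history
            grouped[(streak_kind, streak_len)].append(bet)
        kind = "+" if outcome else "-"
        if kind == streak_kind:
            streak_len += 1
        else:
            streak_kind, streak_len = kind, 1
    return {key: sum(values) / len(values) for key, values in grouped.items()}


def mean_bet_by_two_outcome_run(outcomes, bets, capital, low=160, high=200):
    """Mean proportion bet after each (j - 2, j - 1) outcome pair.

    Only games with low < capital[j] < high are used; the strict bounds
    follow the inequality as it is printed in the text.
    """
    grouped = defaultdict(list)
    for j in range(2, len(bets)):
        if low < capital[j] < high:
            key = ("+" if outcomes[j - 2] else "-", "+" if outcomes[j - 1] else "-")
            grouped[key].append(bets[j])
    return {key: sum(values) / len(values) for key, values in grouped.items()}


# Hypothetical record of one subject on one problem.
outcomes = [False, False, True, True, False]
bets = [0.2, 0.3, 0.5, 0.4, 0.1]
capital = [170, 150, 165, 180, 190]
print(mean_bet_by_preceding_run(outcomes, bets))
print(mean_bet_by_two_outcome_run(outcomes, bets, capital))
```

In the analysis described above, these per-subject means would then be averaged over subjects within each group, pooling games from Problems 4-10.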
TABLE 8
Means and Standard Deviations of Proportions Bet after Each Type of Two-Outcome Runs

                              Run
Group            (-, -)   (+, -)   (-, +)   (+, +)
I       Mean      .330     .261     .245     .199
        SD        .200     .173     .174     .152
II      Mean      .380     .374     .400     .279
        SD        .226     .176     .253     .161
III     Mean      .665     .488     .512     .380
        SD        .228     .219     .270     .164
Two paired t tests were computed for each group, one comparing the effects of the (-, -) run and the (+, +) run, and the other comparing the effects of the (+, -) run and the (-, +) run. The first comparison yielded the following results: t(16) = 4.03, p < .01; t(15) = 1.78, p < .10; and t(7) = 2.11, p < .10, for Groups I, II, and III, respectively. The differences between the (-, +) run and the (+, -) run were not significant for any group. We conclude that there is a tendency in all groups to wager a larger proportion of capital following two failures than following two successes.
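The paired t statistic used in these comparisons is the mean of the within-subject differences divided by its standard error, with degrees of freedom equal to the number of pairs minus one (so t(16) corresponds to 17 subjects with both run types). A minimal sketch with hypothetical per-subject means:

```python
from math import sqrt


def paired_t(first, second):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Unbiased variance of the difference scores.
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t_value = mean_diff / sqrt(var_diff / n)
    return t_value, n - 1


# Hypothetical mean proportions bet by four subjects after (-, -) and (+, +) runs.
after_two_failures = [0.5, 0.6, 0.4, 0.7]
after_two_successes = [0.3, 0.5, 0.4, 0.4]
t_value, df = paired_t(after_two_failures, after_two_successes)
print(f"t({df}) = {t_value:.2f}")
```

The same function applies to the first-half versus second-half comparison used earlier as a test of stationarity.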
DISCUSSION
It is clear that many subjects did not consistently employ a proportional Markov betting policy. Their betting policies were not Markovian since they were affected (admittedly to a small extent) by the outcomes of the immediately preceding games. Furthermore, the proportion of capital bet was negatively related to the amount of capital available on a given game. At the same time, fairly good agreement was reached between the mean proportion bet and the optimal betting proportion for a logarithmic utility function. The mean proportion bet for Problems 4-10 in Groups II and III came close to the betting proportions derived from the optimal betting policy (2). The poorer fit for Group I is partially explained by subjects’ tendency in this group to overestimate the value of p. When p is replaced by its mean estimate the discrepancy between the mean proportion bet by subjects in Group I and the optimal betting decision is reduced. Discrepancies between the optimal betting proportions (2) and the mean proportions bet should be interpreted relative to the insensitivity of the optimal betting policy, studied in an earlier section. Comparison of results in Tables 5 and 2 shows that 20 out of 21 mean proportions fall within the .95 tolerance bands surrounding the optimal betting proportions for the logarithmic utility function. It may be recalled that the betting policy (2) may be derived from two different assumptions. One postulates a maximization of expected utility of capital at the end of N stages, where the utility function is taken to be logarithmic. The other postulates a maximization of expected rate of growth of a subject’s capital for the whole duration of the process. Thus, the maximization of E(log X_N | X_0) results from two conceptually different assumptions concerning a subject’s betting behavior. The second of these assumptions seems the more compelling.
First, it is consistent with findings from gambling experiments (Tversky, 1967) showing that, for the range of money used in these experiments, utility is more or less linear in money for most subjects. Secondly, the latter assumption comes closer to our notion of rational gambling behavior in the present experiment. Note that the first assumption postulates a maximization of the expected utility of capital at the end of the process, given the initial capital available to a subject. As the duration of a problem, n_w, was unknown in the experiment and varied considerably from one problem to another, it seems more instructive to assume that the subject was trying to maximize the expected rate of growth of his capital over all stages. The betting policy (2) is appealing, but also has its faults when taken to explain a subject’s betting behavior. First, it is based only on expectation and does not take into consideration possible preferences of a subject for various dispersions of the outcomes, which have been suggested by Coombs and Pruitt (1960) to be another significant variable affecting betting behavior. Secondly, it implies that the loss of a subject’s capital is the worst thing that could happen to him. This implication, which may hold
in certain economic multistage decision processes, is not sustained in the present experiment. It was stated by many subjects that they did not regard the loss of their capital on a given problem as disastrous since they could move to another problem and have “a new start.” Perhaps if the experiment were to consist of only one long problem, this implication would be more valid. The results suggest that subjects played the MBG efficiently in terms of maximization of expected rate of growth of their capital, allowing the amount of capital and outcomes of preceding games to have a relatively small effect on their betting decisions. This statement, however, is only a very gross summary of the subjects’ decision behavior in light of the insensitivity of the optimal betting policy, resulting in wide tolerance bands for even very high values of 2. It should better be taken as a hypothesis to be tested in further experimental MBGs with a known p and r ≠ s, which allow for greater sensitivity of the optimal betting policy to departures from optimality.
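The insensitivity referred to above can be illustrated numerically. The sketch below assumes even-money payoffs, so that a bet of fraction f multiplies capital by 1 + f on a success and by 1 - f on a failure; under that assumption the fraction maximizing the expected logarithm of capital is the well-known 2p - 1 (Kelly, 1956). Policy (2) itself is stated in an earlier section and is not reproduced here, so this is an illustration of the growth-rate criterion rather than of the paper's exact formula, and the function names are hypothetical.

```python
from math import log


def growth_rate(fraction, p):
    """Expected log-growth of capital per game at even money (assumed payoffs)."""
    return p * log(1 + fraction) + (1 - p) * log(1 - fraction)


def kelly_fraction(p):
    """Fraction maximizing growth_rate for a favorable even-money game (p > 1/2)."""
    return 2 * p - 1


def relative_efficiency(fraction, p):
    """Growth rate at `fraction` as a share of the maximum attainable growth rate."""
    return growth_rate(fraction, p) / growth_rate(kelly_fraction(p), p)


for p in (0.6, 0.7, 0.8):
    best = kelly_fraction(p)
    below = relative_efficiency(best - 0.1, p)
    above = relative_efficiency(best + 0.1, p)
    print(f"p = {p}: f* = {best:.1f}, "
          f"efficiency at f* - .1 = {below:.2f}, at f* + .1 = {above:.2f}")
```

Because the growth rate is flat near its maximum, betting fractions some distance from the optimum still achieve a large share of the maximal growth rate, which is why mean proportions alone discriminate poorly between optimal and near-optimal policies.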
REFERENCES
BECKER, G. M., DEGROOT, M. H., AND MARSCHAK, J. An experimental study of some stochastic models for wagers. Behavioral Science, 1963, 8, 199-202.
BELLMAN, R., AND KALABA, R. On the role of dynamic programming in statistical communication theory. IRE Transactions on Information Theory, 1957, IT-3, 197-203.
BELLMAN, R., AND KALABA, R. On communication processes involving learning and random duration. IRE National Convention Record, 1958, 4, 16-20.
BERNOULLI, D. Exposition of a new theory on the measurement of risk. Econometrica, 1954, 22, 23-36.
BREIMAN, L. Optimal gambling systems for favorable games. In J. Neyman (Ed.), Proceedings of the fourth Berkeley symposium on probability and mathematical statistics. Vol. 1. Berkeley: University of California Press, 1961. Pp. 65-78.
COOMBS, C. H., BEZEMBINDER, T. G. G., AND GOODE, F. M. Testing expectation theories of decision making without measuring utility or subjective probability. Journal of Mathematical Psychology, 1967, 4, 72-103.
COOMBS, C. H., AND PRUITT, D. G. Components of risk and decision making: Probability and variance preferences. Journal of Experimental Psychology, 1960, 60, 265-277.
DAVIDSON, D., SUPPES, P., AND SIEGEL, S. Decision-making: An experimental approach. Stanford: Stanford Univer. Press, 1957.
EDWARDS, W. Variance preference in gambling. American Journal of Psychology, 1954, 67, 441-452.
EDWARDS, W. The prediction of decisions among bets. Journal of Experimental Psychology, 1955, 51, 201-214.
FERGUSON, T. S. Betting systems which minimize the probability of ruin. Journal of the Society for Industrial and Applied Mathematics, 1965, 13, 795-818.
KELLY, J. L., JR. A new interpretation of information rate. The Bell System Technical Journal, 1956, 35, 917-926.
MARSCHAK, J. Decision making. Los Angeles: Western Management Science Institute, Working Paper No. 93 (Revised), 1966.
MOSTELLER, F., AND NOGEE, P. An experimental measurement of utility. Journal of Political Economy, 1951, 59, 371-404.
MURPHY, R. E., JR. Adaptive processes in economic systems. New York: Academic Press, 1965.
RAIFFA, H., AND SCHLAIFER, R. Applied statistical decision theory. Boston: Harvard University, Graduate School of Business Administration, 1961.
RAPOPORT, A. Choice behavior in a Markovian decision task. Journal of Mathematical Psychology, 1968, 5, 163-181.
SAVAGE, L. J. The foundations of statistics. New York: Wiley, 1954.
STEVENS, S. S. Measurement, psychophysics, and utility. In C. W. Churchman and P. Ratoosh (Eds.), Measurement: Definitions and theories. New York: Wiley, 1959.
TVERSKY, A. Utility theory and additivity analysis of risky choices. Journal of Experimental Psychology, 1967, 75, 27-36.