Optional stopping on ascending and descending series


ORGANIZATIONAL BEHAVIOR AND HUMAN PERFORMANCE 7, 53-62 (1972)

PHILIP BRICKMAN

Northwestern University

For an optional stopping task on which the value of the alternatives is a positive or a negative function of their sequential position, the optimal decision rule for unknown distributions ("sample the first X and then select the next new maximum") yields much too short search on the ascending series and much too long search on the descending series. One hundred and twelve undergraduate subjects playing such a game did in fact continue much longer on a descending series than on an ascending series. Though subjects playing the descending series eventually learned to optimize better than subjects playing the ascending series, the latter saw themselves as more successful on the task. Implications are discussed for real world situations in which people fail to maximize their gains in an improving environment, or fail to minimize their losses in a deteriorating environment.

© 1972 by Academic Press, Inc.

Since many decisions come in the form of a choice between accepting a currently available outcome or rejecting that outcome in favor of a continued and uncertain search process, the study of when people are willing to stop searching and when they prefer to continue promises to increase our understanding of how individuals make important life decisions and adaptations. Furthermore, as tasks for investigation, optional stopping problems have the advantage of having been extensively studied by mathematicians (e.g., Gilbert & Mosteller, 1966; MacQueen & Miller, 1960) so that optimal decision rules are known and subjects' behavior can be compared with the behavior prescribed by the normative model. Empirical research on subjects' behavior in optional stopping tasks has indicated that subjects in general behave quite optimally (Kahan, Rapoport & Jones, 1967; Rapoport & Tversky, 1966, 1970). No one has yet investigated, however, how readily subjects can cope with a series of observations that itself may be such as to reward or punish search, or how readily subjects can cope with shifts in the nature of the series they are searching.

The case of series whose alternatives increase or decrease in value over time is interesting for two reasons. First, if the distribution of values in the series is unknown to the subject, application of the optimal strategy derived for sampling stationary unknown series (Gilbert & Mosteller, 1966) will yield highly nonoptimal results. This optimal strategy requires the player to pass the first X observations (where X is a function of N, the number of possible observations) and then select the first alternative after X that exceeds the largest or most attractive choice viewed up to that point. In an ascending series, however, if one passes the first X values and then chooses the next new maximum, one will stop much too quickly. In a descending series, a new maximum beyond X will be impossible, so that application of this stopping rule will lead to search through less and less desirable outcomes until the end of the series.

Second, the case of ascending and descending series is interesting as a possible tool for exploring the dynamics of adaptation to changing environments. If subjects follow the decision strategy that is optimal for stationary series in such cases, they will be failing to maximize their gains in an improving environment, and failing to minimize their losses in a deteriorating environment, despite whatever evidence they accumulate about the nature of that environment (i.e., the form of the distribution). There are many real life situations in which people apparently fail to maximize their gains or minimize their losses. Our understanding of these instances will be increased if we can trace some of them back to a rational regulation of search behavior.

In addition to studying optional stopping on ascending and descending series, the present study will also examine how well subjects adapt when switched from ascending series to descending series, or vice versa, and how this adaptation may be affected by feelings of success or failure on earlier stopping problems.

METHOD
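The introduction's argument about the stationary-series rule can be illustrated with a minimal simulation sketch, which is not part of the original study. It assumes twenty-card decks of the kind used in this experiment (card value equal to plus or minus ten times the card's position, plus a two-digit random number) and a pass length of X = 7, which is only an assumption, since the text does not state the X it has in mind. The function names `make_deck`, `cutoff_rule_stop`, and `mean_stop` are hypothetical.

```python
import random


def make_deck(slope, rng, n_cards=20):
    """Build one deck: value at position t is slope*t plus a random 0..99.

    slope=+10 gives an ascending deck, slope=-10 a descending deck.
    """
    return [slope * t + rng.randint(0, 99) for t in range(1, n_cards + 1)]


def cutoff_rule_stop(deck, n_pass):
    """Return the 1-based card on which the pass-then-next-maximum rule stops.

    The first n_pass cards are always rejected; the first later card that
    beats everything seen so far is kept. If no such card appears, the
    last card must be accepted.
    """
    best_seen = max(deck[:n_pass])
    for pos in range(n_pass, len(deck)):
        if deck[pos] > best_seen:
            return pos + 1
    return len(deck)


def mean_stop(slope, n_pass=7, n_games=2000, seed=0):
    """Average stopping position of the rule over simulated games."""
    rng = random.Random(seed)
    stops = [cutoff_rule_stop(make_deck(slope, rng), n_pass)
             for _ in range(n_games)]
    return sum(stops) / n_games


if __name__ == "__main__":
    print("ascending decks, mean stopping card:", mean_stop(10))
    print("descending decks, mean stopping card:", mean_stop(-10))
```

Under these assumptions the rule should stop within a few cards of X on ascending decks and run to, or very nearly to, the last card on descending decks, which is the contrast the introduction describes.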

Subjects. Subjects were 113 University of Michigan undergraduates enrolled in the introductory psychology courses; they received course credit for their participation in the experiment. One subject was discarded for his failure to read the cards used in the game properly. Of the 112 remaining, 57 were males and 55 were females.

Experimenters. Five experimenters, all males, each ran subjects in all conditions.

The game. Subjects were told that they would play a simple game, the object of which was to win as many points as possible. In each game, the subject viewed a series of numbers, one at a time, by turning over the top card in a deck. He then decided either to keep that number or to reject it. The number he kept represented the number of points he won (or lost, if it was negative) on that game. If he rejected a card, he could turn over the next one; but once he had rejected a card he could not go back to it. Each deck contained 20 cards all told, and the subject was told that if he rejected the first 19 cards he would have to accept the last one.

Reinforcement of search. The ascending decks or A Decks were constructed by looking up two-digit numbers in a table of random numbers and then adding to them ten times the number of the trial. Thus the values in the A Decks are distributed as $Y = 10T + X$, where T, the trial number, ranges from 1 to 20 and X is a random variable uniformly distributed from 0 to 99. The descending decks or B Decks were constructed by subtracting 10T from the two-digit random numbers. Thus the values in the B Decks are distributed as $Y = -10T + X$. The addition of the random variable makes each series only approximately monotonic, but the general nature of the A Decks encouraged search while the B Decks discouraged it. In addition to containing score values, the cards in all decks were numbered from one to twenty so that subjects could keep track of how many cards they had viewed and how many were remaining for them to view.

Calculation of optimal strategies for A and B series. Most generally, the subject should not accept an outcome when the expected value for continued search exceeds the value of that outcome. For example, to decide whether to keep or reject card 18, we must know the expected value of a subsequent choice between keeping card 19 or going on to card 20. The optimal expected value of rejecting card 18, or the expected value of optimal choice among cards 19 and 20 (for the A Deck), is:

\[
\begin{aligned}
EV(T_{19}\ \text{or}\ T_{20}) &= EV(T_{19} \mid T_{19} > 250) \times P(T_{19} > 250) + EV(T_{20}) \times P(T_{19} < 250) \\
&= (270) \times (.4) + (250) \times (.6) = 258.
\end{aligned}
\]

Here 250 is the expected value of card 20, so card 19 is kept only if it exceeds 250. Thus card 18 should be rejected unless its value exceeds 258, in which case it should be accepted. An extension of the same reasoning yields a critical value of 260 (rounded) for card 17, and 261 for card 16. Since card 16 can never exceed 260, it, and all earlier cards in the A Decks, should never be selected.

In the B Decks, to calculate the appropriate decision number for the very first choice, i.e., card one, we must calculate the optimal expected value for the complex choice among the next nineteen cards, obtaining, in the process, the optimal decision values for all subsequent choices as well. The results, rounded up, rule that we should accept a value of greater than or equal to 46 on the first card; greater than or equal to 37 on the second card; and 26, 15, 6, and -4 for the third, fourth, fifth, and sixth choices, respectively.

Manipulation of feedback. At the end of the fifth game, the experimenter told the subject that he would add up the subject's scores and tell him how he was doing. Subjects were randomly assigned to one of two feedback conditions, success or failure. In the success condition, they learned that their performance on the first five games placed them at the 91st percentile for college students; in the failure condition, they were told that their performance placed them at the 11th percentile.

Postmeasure and debriefing. After playing game ten, subjects rated how well they themselves felt they had done on the first five games and the second five games. These questions were inadvertently omitted for the first 37 subjects (about equally distributed over conditions), so that results on these measures are based on only two-thirds of the total sample (i.e., N = 75). Finally, subjects were invited to discuss the experiment and give their impressions of it, and were in turn given oral and written feedback by the experimenter on the nature and purpose of the study.

RESULTS
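The critical values given under "Calculation of optimal strategies" can be checked with a short backward-induction sketch. This is an illustration under stated assumptions rather than the author's own computation: X is treated as continuous and uniform on 0 to 100 (which matches the rounded figures 250, 270, and .4 in the worked example, although the actual cards carried discrete two-digit numbers), and the names `expected_max` and `critical_values` are hypothetical.

```python
def expected_max(a, b, c):
    """E[max(Y, c)] for Y uniform on [a, b]: the value of facing a card
    when rejecting it is worth c."""
    if c <= a:
        return (a + b) / 2.0          # the card always beats c
    if c >= b:
        return c                      # the card never beats c
    keep_part = (b * b - c * c) / (2.0 * (b - a))   # Y > c: keep Y
    reject_part = c * (c - a) / (b - a)             # Y <= c: go on
    return keep_part + reject_part


def critical_values(slope, n_cards=20, width=100.0):
    """Map each card 1..n_cards-1 to the expected value of rejecting it.

    A card should be kept only if its value exceeds this number.
    slope=+10 models the A Decks, slope=-10 the B Decks.
    """
    # The last card must be accepted, so its worth is its mean value.
    value = slope * n_cards + width / 2.0
    crit = {}
    for t in range(n_cards - 1, 0, -1):
        crit[t] = value               # worth of optimal play from card t+1
        low = slope * t
        value = expected_max(low, low + width, value)
    return crit


if __name__ == "__main__":
    a_deck = critical_values(10)
    b_deck = critical_values(-10)
    for card in (18, 17, 16):
        print("A Deck, card", card, "critical value", round(a_deck[card], 2))
    for card in range(1, 7):
        print("B Deck, card", card, "critical value", round(b_deck[card], 2))
```

Under these assumptions the recursion reproduces the A Deck figures of 258, 260, and 261 for cards 18, 17, and 16. For the B Decks it gives values close to those quoted, though not necessarily identical after rounding, since the exact rounding and distribution conventions of the original calculation are not stated.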

First series. Initially subjects encountering the B Decks searched much more than subjects encountering the A Decks, but by the third game this relationship had reversed itself (see Fig. 1). The differences between the A and B Decks in mean search are highly significant (p < .001) at all five games. The variances of the search scores were not homogeneous; rather, variance was an inverted U function of mean search, i.e., cells with intermediate search had most variance while those with either very little or very much search had least variance. For the points represented in Figure 1, the mean standard deviation is 5.98, with the largest being 8.37 and the smallest 1.81. An arc sin transformation (Winer, 1962) did not change the results of the analyses of the untransformed data, nor did use of a nonparametric test (the median test; see Siegel, 1956).

The mean number of times subjects in each condition stopped on the cards recommended by the optimal decision rules is presented in Table 1. Subjects playing the B Series made significantly more optimal choices than subjects playing the A Series (F(1, 104) = 24.54, p < .001). On the other hand, subjects playing the B Series rated themselves as significantly less successful than subjects playing the A Series (F(1, 67) = 5.06, p < .05). Since in neither series was the subject allowed to see cards beyond his stopping point, only on the B Decks did nonoptimal subjects receive clear evidence of their nonoptimality (as alternatives got worse).

Feedback. The manipulation of feedback was quite successful in affecting subjects' perceptions of their success (F(1, 67) = 118.87, p < .001). But feedback itself did not affect subsequent length of search or optimal stopping, nor did it interact with experience on ascending or descending series in determining length of search, optimal stopping, or perceived success. A strong trend was that A-Deck subjects given failure feedback subsequently persisted longer than did A-Deck subjects given success feedback, while there was no corresponding difference for subjects initially experiencing the B Decks (F(1, 104) = 3.86, p < .06). A-Deck subjects told they were doing poorly could often guess that there must be better values later in the deck, while B-Deck subjects doing poorly apparently found failure less diagnostic of their difficulties.

[Figure 1: number of cards viewed (vertical axis, 2 to 20) plotted by trial, with separate First Series and Second Series panels.]

FIG. 1. Number of cards viewed on ascending and descending series.

TABLE 1
EFFECTS OF ASCENDING AND DESCENDING SERIES ON OPTIMAL STOPPING AND PERCEIVED SUCCESS

                     Trials 1-5                        Trials 6-10
                     Nature of series                  Nature of first series: A       Nature of first series: B
                                                       Nature of second series         Nature of second series
                     Ascending (A)   Descending (B)    A              B                A              B
                     N = 56          N = 56            N = 29         N = 27           N = 28         N = 28
Optimal stopping     0.30 (0.60)a    1.25 (1.27)       1.14 (1.41)    0.89 (0.89)      0.43 (1.07)    3.32 (1.06)
Perceived success    5.51 (3.59)     4.30 (3.89)       7.72 (1.36)    2.06 (1.71)      6.95 (2.46)    5.85 (1.69)

a Standard deviations are in parentheses.

Second series. Figure 1 also reveals that subjects who were shifted from one type of series to the other remained strongly influenced by their initial experience. Subjects who experienced the B Decks on the first five trials never at any point on trials 6-10 searched as long as subjects who experienced the A Decks (p < .001 at every trial). AA subjects searched most, but AB subjects persisted next most despite being negatively reinforced. BB subjects searched least, but BA subjects never searched very extensively despite occasions for positive reinforcement. As in the series prior to feedback, subjects encountering the B Decks were initially provoked into more extended, costly search, while subjects encountering the A Decks stopped prematurely with relatively modest gains. By game seven the direction of these differences is reversed, and thereafter subjects playing the A Decks search longer than those playing the B Decks.

In the optimality of their choices, subjects who encountered second series that were opposite to their initial experiences made many fewer optimal choices than subjects who encountered second series that were consistent with their initial experiences (see Table 1). Building on their earlier advantage, subjects who began on the B Series and continued on the B Series were most efficient of all in utilizing their knowledge of the distribution to make optimal choices.
BB subjects actually chose the best alternative, as specified by the optimal decision rule, about two-thirds of the time, while the next best group (the AA group) did no better than about one-fourth optimal choices. The superiority of the BB group contributes not only to the highly significant First Series × Second Series interaction (F(1, 104) = 69.06, p < .001), but also to the significant series main effects (both p < .001).

In their perceived success on the second five trials, subjects playing second decks consistent with their earlier experiences (i.e., the AA and BB groups) rated themselves as relatively more successful than subjects playing different decks (F(1, 67) = 27.25, p < .001). Subjects playing A Decks again rated themselves as more successful than subjects playing B Decks (F(1, 67) = 55.1[illegible], p < .001), and those initially playing the B Series rated themselves as more successful on the second series (F(1, 67) = 10.75, p < .005). The principal contribution to these effects is the very low perception of success by the AB group who persisted in their disastrous searches. The other group experiencing a shift in environment, the BA group, rated themselves as quite successful, though in fact their performance was far from optimal. Inopportune strategy on an ascending series thus seems less distressing, perhaps because it is less visible, than inopportune performance on a descending series.

DISCUSSION

First of all it should be noted that subjects in the present experiment were not informed about the nature of the series they would be viewing, or that these series would be nonstationary. Thus even if their initial behavior is considered nonoptimal (according to a decision rule that assumes full information about the nature of the distribution), this should not be taken to contradict previous findings indicating that subjects can respond optimally to nonstationarity when they know about it (Chinnis & Peterson, 1968, 1970). The task of the present study is to account for and consider the implications of subjects' stopping behavior on ascending and descending series that subjects may not recognize as being nonstationary.

Descending series and open-ended commitments. Why did subjects playing the B Decks incur such overwhelming losses, persisting in the face of unequivocal evidence that their options were getting worse? Sampling the first few alternatives in order to set standards for an acceptable outcome is of course the rational strategy. However, the expectancies so induced also seemed to cause subjects to shrug off evidence that the series is steadily descending. Subjects persisted hoping, despite the downward trend of the data, that things would take a turn for the better. In their post-experimental comments subjects indicated that having gone so far made it difficult to stop and somehow reasonable to continue just a little further. They had not initially committed themselves to risking the number of points they eventually lost.
After having rejected the first few options, however, they gradually became committed to recovering what they had lost. Thus a major commitment grew out of a step-by-step involvement that proved to be open-ended, a process that might be called "open-ended commitment." It may well be that more commitment processes occur in this fashion than in the relatively dramatic and discrete fashion implied in standard treatments of dissonance theory (e.g., Brehm & Cohen, 1962). For instance, such a process would seem to be a reasonable characterization of the American involvement in Vietnam. On another level, amateur investors and amateur card players are considered by professionals as particularly vulnerable to open-ended commitment. Stock brokers feel that amateur investors fail to minimize their losses because they are too reluctant to sell a stock when it begins to drop; professional gamblers capitalize on the amateur card player's inability to resist staying in a hand after the stakes (including his own) have become large, even though the odds against his winning have become even larger. Thus in both areas folk wisdom fruitlessly cautions "never throw good money after bad."

Ascending series and standards as ceilings. Not only did subjects playing the ascending series perform relatively less well than subjects playing the descending series, but they were blissfully ignorant of their under-achievement. Again, it is rational to take the initial alternatives of an unknown sequence as the standards which a future alternative must exceed. But if the commitment to these standards interferes with subjects' recognizing that the sequence of alternatives is increasing in value, these standards have acquired a value that may be more than rational. The standards then actually turn into ceilings on subjects' performances. The notion of standards as ceilings that limit performance has not been as clearly worked out as the complementary notion of standards as "floors" that motivate performance (Siegel, 1957), although sociologists have recognized this problem in standards that management sets for workers (Gouldner, 1954). The lack of attention to the ceiling function may have occurred because such ceiling functions are generally unobtrusive, whereas floor functions are motivationally salient. Nonetheless, factors that limit search have been demonstrated to have important social consequences. Beez (1968), for instance, showed that teachers with low expectancies for students may sharply limit how much the students learn simply by unwittingly placing a ceiling on students' exposure to information.

The most dramatic ceiling effects were shown by the group switched from a descending series to an ascending series (the BA group).
The switch for the BA group did not constitute a sharp expectancy disconfirmation since the initial values of the A and B series fall within the same range, so that there is no immediate clue that the decks have changed. The switch in decks emerges unmistakably only as search continues. The price the BA group paid for truncating their search thus resembles in some ways the price paid by the animals in traumatic avoidance studies who continue to leave the test chamber so quickly that they never learn that punishment has been discontinued (Solomon, Kamin & Wynne, 1953; Dinsmoor, 1954). Possibly, also, the apparent willingness of BA subjects to settle for modest immediate rewards rather than delay their choices in what they feel is the faint hope of larger rewards should call to mind the findings by Mischel (1958, 1961) that children from impoverished environments or children low in achievement motivation are less willing to seek delayed rewards. From all these cases, it seems clear that simply placing subjects in an environment that will reward exploration or investment will not automatically lead them to recognize and extract maximum benefits from their new opportunities.

Start and end points. In the present study the decision was made to begin the ascending and descending sequences at the same point, after which they diverged. This choice caused the mean absolute values of the two series to be different, so that subjects playing the A Decks could in fact earn more points than subjects playing the B Decks, a fact which may have contributed to the greater satisfaction of the A Deck people. An alternative procedure (see for instance Thibaut & Ross, 1969) would be to start the sequences at different points and have them converge or cross. This of course confounds type of series with location of starting point. Nonetheless, all these possibilities should be explored, for they would certainly affect the salience of environmental shifts and might well contribute to the building of a general model for search behavior.

ACKNOWLEDGMENTS

I would like to thank Robert Zajonc, William Batko, James Heisler, Richard Roistacher, Zick Rubin, and Lawrence Becker for their invaluable help at various stages of this study.

REFERENCES

BEEZ, W. V. Influence of biased psychological reports on teacher behavior and pupil performance. Proceedings of the 76th Annual Convention of the American Psychological Association, 1968, 605-606.
BREHM, J. W., & COHEN, A. R. Explorations in cognitive dissonance. New York: Wiley, 1962.
CHINNIS, J. O., JR., & PETERSON, C. R. Inference about a nonstationary process. Journal of Experimental Psychology, 1968, 77, 620-625.
CHINNIS, J. O., JR., & PETERSON, C. R. Nonstationary processes and conservative inference. Journal of Experimental Psychology, 1970, 84, 248-251.
DINSMOOR, J. A. Punishment: I. The avoidance hypothesis. Psychological Review, 1954, 61, 34-46.
GILBERT, J. P., & MOSTELLER, F. Recognizing the maximum of a sequence. American Statistical Association Journal, 1966, 61, 35-73.
GOULDNER, A. W. Patterns of industrial bureaucracy. Glencoe, Ill.: Free Press, 1954.
KAHAN, J. P., RAPOPORT, A., & JONES, L. V. Decision making in a sequential search task. Perception and Psychophysics, 1967, 2, 374-376.
MACQUEEN, J., & MILLER, R. G., JR. Optimal persistence policies. Operations Research, 1960, 8, 362-380.
MISCHEL, W. Preference for delayed reinforcement: An experimental study of a cultural observation. Journal of Abnormal and Social Psychology, 1958, 56, 57-61.
MISCHEL, W. Delay of gratification, need for achievement, and acquiescence in another culture. Journal of Abnormal and Social Psychology, 1961, 62, 543-552.
RAPOPORT, A., & TVERSKY, A. Cost and accessibility of offers as determinants of optional stopping. Psychonomic Science, 1966, 4, 145-146.
RAPOPORT, A., & TVERSKY, A. Choice behavior in an optimal stopping task. Organizational Behavior and Human Performance, 1970, 5, 105-120.
SIEGEL, S. Nonparametric statistics for the behavioral sciences. New York: McGraw-Hill, 1956.
SIEGEL, S. Level of aspiration and decision making. Psychological Review, 1957, 64, 253-262.
SOLOMON, R. L., KAMIN, L. J., & WYNNE, L. C. Traumatic avoidance learning: The outcomes of several extinction procedures with dogs. Journal of Abnormal and Social Psychology, 1953, 48, 291-302.
THIBAUT, J., & ROSS, M. Commitment and experience as determinants of assimilation and contrast. Journal of Personality and Social Psychology, 1969, 13, 322-329.
WINER, B. J. Statistical principles in experimental design. New York: McGraw-Hill, 1962.

RECEIVED: July 2, 1970