Search behavior: A correction procedure for three-choice probability learning

JOURNAL OF MATHEMATICAL PSYCHOLOGY: 2, 145-170 (1965)

Search Behavior: A Correction Procedure for Three-Choice Probability Learning¹

MICHAEL COLE

Stanford University, Stanford, California

A modified three-choice probability learning experiment, in which a subject was required to continue responding until he made the “correct” response on each trial, was run for three groups of 24 subjects over a series of 1000 trials. Groups differed in the experimenter-assigned probabilities, $\pi_i$, that each alternative was designated as correct. The main results of the experiment were: (a) First choice probabilities changed more rapidly than in traditional probability learning and for terminal trials approximated the value given by $P(A_i) = \pi_i^2 / \sum_j \pi_j^2$. (b) A positive recency effect as defined in ordinary probability learning was obtained. (c) A choice-point model derived from statistical learning theory provided accurate, parameter-free predictions of the asymptotic response probabilities for the first choices, and described the course of learning for first and second choices with the use of two parameters estimated from the data. Tests of the model using additional aspects of the data met with mixed results.

INTRODUCTION

Consider a person in the following situation: Before him are three buttons. Above each button is a light. He is asked to predict which of the lights has been chosen as “correct,” and to indicate his choice by pushing the appropriate button. He chooses, but finds that he has chosen incorrectly. The experimenter then asks him to choose again, this time to determine which of the two remaining lights will be correct. Choosing again, perhaps our hypothetical subject guesses correctly. The experimenter then lights the light above the correct button and the trial is terminated. Given a large number of such trials, the subject will most probably arrive at some “strategy” for maximizing the number of correct responses that he can make with the fewest possible guesses per trial. In short, he will develop a pattern of search behavior with respect to the lights in much the same way that a trouble-shooting operator will go through a series of steps, searching for a defective part whose location is always uncertain. It is with the development and structure of this type of “search” behavior that we will be concerned in the present paper.

¹ This article is based upon a dissertation submitted to Indiana University in partial fulfillment of the requirements for the degree of Doctor of Philosophy. The author is indebted to W. K. Estes who served as his advisor.

The search situation bears a close resemblance to a traditional probability learning experiment in which the subject is required to make a single choice from a well defined set of alternatives. In the probability learning situation, following the subject’s response, the experimenter indicates the correct choice for the given trial. The experimenter’s choice of correct alternatives during a sequence of trials is usually programmed according to an unrestricted random schedule in which each alternative is correct with probability $\pi_i$, where $\sum_i \pi_i = 1$.

As the hypothetical example above illustrates, the search procedure differs from ordinary probability learning in that the experimenter indicates the correct alternative only after the subject has made the correct choice. When the initial choice is incorrect, the subject must choose again, and continue choosing until he makes the correct response. Consequently, the search experiment may be viewed as probability learning with correction. For two choices such a procedure might seem redundant since a single response is sufficient to establish which alternative is correct. (The only data on this question failed to establish a difference between correction and noncorrection procedures (Burke, Estes, & Hellyer, 1954).) However, for more than two choices the requirement that the correct choice be made on every trial provides new data of both empirical and theoretical interest.

First of all, the data from the initial choices on each trial for the search subjects may be compared with data from three-choice probability learning experiments. Considered at the start of any trial, two subjects, one of them run with a search procedure and one with a traditional probability learning procedure, will be faced with much the same objective situation.
Both must try to choose the correct response from among the full set of alternatives and both have had the same reinforcement history (in the sense that both have received an identical series of trial-terminating events). A comparison of the two types of data should determine whether or not the correction procedure, i.e., the added responses on trials when the initial response is incorrect, significantly alters the course of learning as measured on the first choice.

Responses on the second choices will be of interest in themselves. It is not at all certain from a priori considerations how the subjects will respond to the subset present following an error on the initial response. Will such subsets be treated as separate learning problems within the larger problem or will some complex interactive process obtain?

Within this context it will be possible with the search procedure to investigate certain theoretical problems which do not arise with the traditional procedure. For instance, the situation facing the subject when he must choose from a subset of alternatives is similar to that dealt with by Luce (1959) and it is possible that a Luce-type formulation, in particular the constant ratio rule (Clarke, 1957), will apply to the data from the choices on subsets from the search experiment.

A second theoretical approach to search data may be termed a “choice-point” theory, various varieties of which have grown out of stimulus sampling theory (Bower, 1959; Estes, 1959b). A distinguishing feature of this type of theory is its detailed analysis of the stimuli present at the choice point, the response or response sequence that the subject must make, and the reinforcement contingencies. How best to interpret these ideas in the context of a search will be one of the major tasks of analysis in the experiment to be described.

In the study to follow, three groups (differing in the $\pi$ values assigned to the alternative responses) will be investigated in an attempt to shed light on the problems, both empirical and theoretical, raised above. Each set of $\pi$ values has been chosen for the comparisons that it could provide, either with other groups in the present experiment or with data collected from other experiments. Two of the groups have the same $\pi$ value, 0.667, for the response most frequently indicated as correct but differing $\pi$ values for the two remaining alternatives (0.222 and 0.111 for one group, 0.333 and 0.000 for the other). The $\pi$ values of the third group, 0.445, 0.333, and 0.222, correspond to the $\pi$ values used in a recent three-choice, noncontingent experiment (Estes, Cole, Keller, Polson, & Steiner, 1962). These values were chosen to make possible direct comparisons with data gathered under very similar conditions except for the amount of information feedback available to the subject following errors. In addition, the $\pi$ values for the different groups are arranged so that in each group, for one of the subsets formed when an error is made on the initial choice, the renormalized $\pi$ values associated with the remaining alternatives correspond to the nonzero first choice $\pi$ values for Group 1.

METHOD

Subjects. Seventy-two students at Indiana University during the spring semester of 1962 served as subjects. Thirty-six were fulfilling a class requirement by their participation in the experiment and thirty-six were obtained by means of an advertisement in the student newspaper. The latter were paid $3.75 for their participation. Subjects within the voluntary and paid halves of each experimental group were assigned randomly to experimental conditions in the order of their appearance at the laboratory.

Apparatus. The apparatus had two major components: the subject’s response board and the experimenter’s control equipment, the latter being located in an adjacent room. The subject sat in an open booth with the response board before him at table top height. The board was painted flat black. Centered on it in a triangular array were three response switches. These response switches were covered with an opaque, blue plastic cover beneath which was a small light which could be made to illuminate the opaque cover. The switching and lighting operations were independent of each other. Directly above each response button was a 28 V pilot lamp which served as a reinforcer. At the base of the board, below the apex of the triangle formed by the response buttons, was a handrest board. When this board was depressed, a light flashed on the control panel in the experimenter’s room.

The control equipment consisted of an IBM 526 Summary Punch and associated electronic programming devices. This apparatus presented reinforcements to the subject, controlled the timing of each trial, and recorded responses. For a detailed description of this equipment see Friedman, Burke, Estes, Cole, Keller, & Millward (1963).

Experimental Design. The subjects were divided into three groups of twenty-four. Procedures for these groups differed only in the proportion of trials on which each of the response alternatives was indicated as correct. The assignments of these proportions ($\pi_i$) for each group are presented in Table 1.

TABLE 1

REINFORCEMENT PROBABILITIES

| Group | $\pi_1$ | $\pi_2$ | $\pi_3$ | $\pi_{12}$ | $\pi_{13}$ | $\pi_{23}$ |
|---|---|---|---|---|---|---|
| 1 | 0.667 | 0.333 | 0.000 | 0.667 | 1.000 | 1.000 |
| 2 | 0.667 | 0.222 | 0.111 | 0.750 | 0.860 | 0.667 |
| 3 | 0.445 | 0.333 | 0.222 | 0.572 | 0.667 | 0.600 |

Note: $\pi_i$, $i = 1, 2, 3$, is the probability that $A_i$ is designated correct. $\pi_{ij}$, $i, j = 1, 2, 3$ ($i \neq j$), is the probability that $A_i$ is designated correct when the alternatives $A_i$ and $A_j$ are available for choice.
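The renormalized subset probabilities in Table 1 can be checked with a short calculation. The sketch below assumes that $\pi_{ij} = \pi_i / (\pi_i + \pi_j)$, which is how "renormalized" is read here; the function and variable names are illustrative and do not come from the article. Under that assumption the computed values agree with the tabled ones to within about 0.003.

```python
def renormalize(pi, i, j):
    """Probability that alternative i is correct when only i and j remain.

    Assumption: simple renormalization, pi_i / (pi_i + pi_j).
    """
    return pi[i] / (pi[i] + pi[j])


# First-choice reinforcement probabilities (pi_1, pi_2, pi_3) for each group.
GROUPS = {
    1: (0.667, 0.333, 0.000),
    2: (0.667, 0.222, 0.111),
    3: (0.445, 0.333, 0.222),
}


def subset_probabilities(pi):
    """Return pi_12, pi_13, and pi_23 for one group, rounded to three places."""
    pairs = ((0, 1), (0, 2), (1, 2))  # zero-based indices for (1,2), (1,3), (2,3)
    return {f"pi_{i + 1}{j + 1}": round(renormalize(pi, i, j), 3) for i, j in pairs}


for group, pi in GROUPS.items():
    print(group, subset_probabilities(pi))
```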

A separate sequence of trial-terminating, correct events was generated for each subject. The sequences were random with the sole restriction that each event occur on the proportion, $\pi_i$, of the trials. The Rand tape of one million random digits was used to generate the sequences. Each subject was given 1000 trials, divided into three sessions. During the first session, 330 trials were given, and during each of the second and third sessions, 335 trials.

The response, $A_i$, is defined as pushing the response button corresponding to the ith event light, that is, a prediction that the ith light will be lighted. In order to balance out possible position preferences, all possible orders for assigning $\pi$ values to physical locations were used. Since there were three possible events, six orderings for the physical location of the events were possible. Each of the six orders was presented to four subjects in each experimental group, two paid and two volunteer.

Procedure. Each trial began when the three response buttons were lighted. This was the signal for the subject to begin responding. The subject continued making responses until he had selected the correct button, i.e., the button corresponding to the reinforcing light selected for that trial. Thus, there might be from one to three responses per trial: one if the subject chose the designated alternative on the first attempt, two if the first choice was incorrect but the second correct, and three if both the first and second attempts proved incorrect.

After each response, whether or not it terminated the trial, the subject had to remove his hand from the response button, place it on the handrest, and then make the next response. If this routine was not followed, depressing a response button had no effect, so that returning the hand to the handrest board always had to occur between successive responses. This contingency was arranged by having the handrest board wired into the response circuit in a way which rendered the buttons ineffective until the handrest had been depressed. It was included to assure that response topography was relatively uniform among subjects and particularly to prevent a subject from resting his hand on a response button for an entire session.

This procedure resulted in variable intervals between the responses within a trial, since each subject could choose his own rate of responding, once the signal lamps were lighted. However, once the correct response had been made, a fixed sequence of events began. The reinforcing light above the correct response button appeared immediately following the correct response. This light remained on for 1 sec. Following its offset there was a 1 sec interval during which no lights were on, and then all the response buttons were illuminated again, marking the start of the next trial. Thus, though the interval between responses within a trial varied, a 2 sec interval always intervened between the last response on one trial and the onset of the signal for the next. All subjects quickly adjusted to this routine, and after the first few trials worked at an even pace throughout the session, which typically lasted 25-30 min.

Each experimental session began with the experimenter conducting the subject to the experimental room and reading the appropriate section of the instructions. At the first session, the whole set of instructions was read; at the later two sessions, only a summary of the procedure was given. The subject was told that on each trial of the experiment one of the response alternatives would be designated by the experimenter as “correct,” and that it was his job to find which alternative this was in as few guesses as possible. It was emphasized that the first choice on each trial should indicate the subject’s best guess as to the location of the correct alternative, and that a second guess, when necessary, should indicate which of the remaining pair of alternatives was thought to be correct. It was pointed out that the speed with which each experimental session came to a close was dependent on the subject’s accuracy in choosing the correct alternative; the sooner each correct choice was made the sooner a trial terminated and thus the sooner each session terminated.

In order to ensure that the set of alternatives was clear to the subject at all times, it was arranged that after a response button had been pressed (whether or not it was correct) the light within that button went out, leaving the remaining set of alternatives illuminated. Thus, if the first choice was incorrect, two response buttons remained lighted, indicating that the subject should choose between them. If both the first and second choices were incorrect, the one remaining alternative button remained lighted until it too had been pushed. When a correct response was made, all response lights went out and remained out until the next trial began. At the end of the first two sessions, the subject was asked to refrain from asking questions until all three sessions had been completed, at which time questions concerning the experiment were answered. Sessions were scheduled on successive days as often as possible, but in some instances there was a two or three day interval between sessions.
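The logic of the event schedule and of a single correction trial can be sketched in a few lines. This is an illustration only: the original sequences were drawn from the Rand table of random digits, whereas the sketch shuffles a list with fixed event counts, and the uniformly guessing subject, the rounding rule, and all names are assumptions introduced here.

```python
import random


def make_schedule(pi, n_trials, rng):
    """Random order of correct events with event i on about pi[i] of the trials."""
    counts = [round(p * n_trials) for p in pi]
    # Assumption: any rounding remainder is given to the most frequent event.
    counts[counts.index(max(counts))] += n_trials - sum(counts)
    schedule = [event for event, count in enumerate(counts) for _ in range(count)]
    rng.shuffle(schedule)
    return schedule


def run_trial(correct, choose, rng):
    """Collect responses until the correct button is pressed; return them in order."""
    lit = [0, 1, 2]  # buttons still illuminated
    responses = []
    while True:
        response = choose(lit, rng)
        responses.append(response)
        if response == correct:
            return responses
        lit.remove(response)  # a pressed button goes dark and cannot be chosen again


def random_subject(lit, rng):
    """Hypothetical subject who guesses uniformly among the lit buttons."""
    return rng.choice(lit)


rng = random.Random(0)
schedule = make_schedule((0.667, 0.222, 0.111), 1000, rng)
trials = [run_trial(correct, random_subject, rng) for correct in schedule]
print(sum(len(t) for t in trials) / len(trials))  # mean responses per trial
```

Because a pressed button is removed from the lit set, every trial ends within three responses, which mirrors the one-to-three responses per trial described under Procedure.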

RESULTS

PAID vs. VOLUNTEER SUBJECTS

Because half of the subjects in each group were paid and half volunteers, the question of possible differences between the two subject populations arises. A comparison was made of the average $A_1$ (first choice) proportions for the two subgroups in 100 trial blocks. A separate comparison was made for each of the three main experimental groups. The results indicated only extremely small differences between the two subgroups with respect to $A_1$ response probability. This evidence of homogeneity will serve as justification for combining the groups in future analyses.

AVERAGE FIRST AND SECOND CHOICE LEARNING CURVES

The average proportions of occurrence of each response on first choices are plotted in successive blocks of trials in Figs. 1-3. The significance of the theoretical curves will be discussed below. In order to show the early, rapidly changing part of each curve in some detail, the first 200 trials have been plotted in 25 trial blocks, whereas the remaining trials, during which change was quite gradual, are shown in 100 trial blocks.

FIG. 1. Mean response proportions plotted in successive blocks of trials for Group 1. The $A_3$, $A_2$, and $A_1$ responses are presented in the bottom, middle, and upper graphs respectively. The two $\theta$ values upon which the theoretical points are based are 0.002 and 0.080.

As may be seen in Fig. 1, the task for Group 1 quickly narrows down to a choice between two alternatives. By the second block of 25 trials only 2.5% of the total responses are $A_3$’s, and from this point onward there are almost no $A_3$’s. By the second block of trials the $A_1$ curve has already surpassed the $\pi_1$ value, and by Block 6, it appears to have leveled off. This impression of stability is supported by a paired t-test comparing the response level during trials 501-750 with that during trials 751-1000 (t = 0.50 with 23 d.f., p > 0.30). Since there is no evidence that response probability is changing during the last 500 trials, the average response proportion for this block of trials will be treated as an estimate of the asymptotic probability. Table 2 gives the average proportion of $A_i$ responses for the last 500 trials of the experiment for this and each of the other experimental conditions.

FIG. 2. Mean response proportions plotted in successive blocks of trials for Group 2. For interpretation see Fig. 1. The two $\theta$ values upon which the theoretical points are based are 0.005 and 0.030.

TABLE 2

TERMINAL MEANS AND STANDARD ERRORS OF FIRST CHOICES

| Group | $P(A_1)$ | $SE_1$ | $P(A_2)$ | $SE_2$ | $P(A_3)$ | $SE_3$ |
|---|---|---|---|---|---|---|
| 1 | 0.779 | 0.018 | 0.221 | 0.015 | 0.000 | 0.000 |
| 2 | 0.884 | 0.016 | 0.087 | 0.012 | 0.029 | 0.004 |
| 3 | 0.531 | 0.019 | 0.304 | 0.016 | 0.166 | 0.011 |
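The terminal means in Table 2 can be set beside the asymptotic value mentioned in the abstract. The sketch below assumes that value is $P(A_i) = \pi_i^2 / \sum_j \pi_j^2$, which is a reading of a garbled expression rather than a quoted formula; the names are illustrative. Under that assumption each predicted proportion falls within 0.03 of the observed terminal mean.

```python
def predicted_first_choice(pi):
    """Asymptotic first-choice probabilities under the assumed rule pi_i^2 / sum_j pi_j^2."""
    squares = [p * p for p in pi]
    total = sum(squares)
    return [s / total for s in squares]


# Reinforcement probabilities (Table 1) and observed terminal means (Table 2).
PI = {1: (0.667, 0.333, 0.000), 2: (0.667, 0.222, 0.111), 3: (0.445, 0.333, 0.222)}
OBSERVED = {1: (0.779, 0.221, 0.000), 2: (0.884, 0.087, 0.029), 3: (0.531, 0.304, 0.166)}

for group in PI:
    predicted = predicted_first_choice(PI[group])
    print(group, [round(p, 3) for p in predicted], OBSERVED[group])
```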

Initially the $A_1$ response curve for Group 2 (Fig. 2) is quite similar to that of Group 1. In fact, for the first 100 trials, the two curves may almost be superimposed. However, after Block 1, the $A_1$ curve for Group 2 continues to rise and remains substantially above that for Group 1. Judging by eye, there appears to be a slight upward trend which is continuing at the end of training. However, a t-test like that run in Group 1 to test for asymptotic response stability failed to yield a significant t (t = 0.805 with 23 d.f., p > 0.20). The curves for the $A_2$ and $A_3$ responses are even more stable than that for the $A_1$ response, with no discernible trend over the last 500 trials.

In Group 3 (Fig. 3), learning appears to be slower than in the other two groups, and there is a greater appearance of a continuing rise in $A_1$ response probability at the termination of training. However, as was the case for Group 2, this rise is extremely small over the last half of the series. Once again the t-test comparing the first and last halves of the last 500 trials did not yield a significant p value (t = 1.02 with 23 d.f., p > 0.10). What little rise there is in the $A_1$ curve comes almost entirely at the expense of one of the other two response curves; the remaining curve shows a change only of the order of 0.01 between Blocks 3 and 10.

FIG. 3. Mean response proportions plotted in successive blocks of trials for Group 3. For interpretation see Fig. 1. The two $\theta$ values upon which the theoretical points are based are 0.002 and 0.008.

The curves for the second choice trials are shown in Figs. 4 and 5. It should be emphasized that these second choice curves are plotted for blocks of 100 trials, not 100 presentations of the ij-th subset. Each curve represents the proportion of cases, $P_{ij}$, on which, presented a choice between alternatives i and j, the subjects chose i. Because the number of presentations of each subset is in part a function of how often each response is chosen on the first choices, the frequency with which each subset occurs will differ among subsets as well as among subjects.

FIG. 4. Mean response proportions for second choices, Group 2. The proportion of choices, $P_{ij}$, of response $A_i$ over $A_j$ when the subset $A_{ij}$ was presented is calculated for successive 100 trial blocks. The theoretical points have the same interpretation as in Fig. 1. The $\theta$ values used in generating these theoretical points were those used in Fig. 2.

FIG. 5. Mean response proportions for second choices, Group 3. The proportion of choices, $P_{ij}$, of response $A_i$ over $A_j$ when the subset $A_{ij}$ was presented is calculated for successive 100 trial blocks. The theoretical points have the same interpretation as in Fig. 1. The $\theta$ values used in generating these theoretical points were those used in Fig. 3. (Plotted points: data, theoretical mean, and Monte Carlo.)

The rapid elimination of $A_3$ responses in Group 1 results in a concomitant elimination of the subset ($A_1A_2$), which is almost never presented after Block 1b. Correspondingly, the probabilities $P_{13}$ and $P_{23}$ rapidly approach 1.00. Consequently, no second choice curves are presented for this group.

Reflecting the reduced number of trials represented at each data point, the curves for the second choices are less smooth than the first choice curves. They also appear to be changing more slowly as a function of trials, even when the $\pi_{ij}$ for the subset is higher than the $\pi_i$ for first choice trials (compare the first choice curves in Fig. 2 with the second choice curves in Fig. 4). However, it should be kept in mind that even the most frequently presented diad ($A_2A_3$) is presented only 25-30% as often as the triad ($A_1A_2A_3$), since the triad occurs at the beginning of every trial. The numbers of opportunities to respond to the other diads are still smaller.

Tests for asymptotic response stability, like those carried out on the first choice data, were performed for each curve in Figs. 4 and 5. In general, the results indicated that second choice response probability is changing even during later trials, although in some cases the amount of change is slight.

TABLE 3

TERMINAL MEAN AND STANDARD ERRORS; SECOND CHOICE RESPONSES

| Group | $P_{12}$ | $SE_{12}$ | $P_{13}$ | $SE_{13}$ | $P_{23}$ | $SE_{23}$ |
|---|---|---|---|---|---|---|
| 1 | 0.000 | 0.000 | 1.00 | 0.000 | 1.00 | 0.000 |
| 2 | 0.776 | 0.230 | 0.860 | 0.070 | 0.722 | 0.030 |
| 3 | 0.551 | 0.050 | 0.702 | 0.030 | 0.620 | 0.202 |

Note: $P_{ij}$ is the proportion of times that response i is chosen over response j when the subset of alternatives $A_iA_j$ is present. The standard errors were calculated from arcsine transformations of the raw scores and then reconverted to proportions.

The terminal proportions, $P_{ij}$, of choices of response i over response j for each of the diads and all groups are presented in Table 3, along with the standard errors of the means.

When an effort is made, by combining suitable blocks of second choice trials, to compare levels of first and second choice curves after approximately equal frequencies of presentation, a different picture emerges from that of Figs. 4 and 5, where the second choices are plotted in 100 trial blocks. It was calculated that for Group 2, Block 10 of ($A_2A_3$) trials follows 6013 presentations of the ($A_2A_3$) diad, an average of 250 presentations per subject. The triad ($A_1A_2A_3$) has been presented this often after 2½ blocks of trials, i.e., at about the middle of Block 3. Similarly, for Group 3, Block 10 of the diad ($A_1A_3$) follows 4541 presentations of this diad, or 181 presentations per subject. This frequency of presentation corresponds roughly to Block 2d for the first choice data. Both of these diads share a common $\pi$ value, 0.667, with the $A_1$ response alternative in Groups 1 and 2. When a comparison of the appropriate response proportions was made it was found that, when equated for number of presentations, the first and second choice response levels are quite similar. Thus, $P_{13}$ for Group 3 is 0.755, corresponding to $P(A_1)$ for Groups 1 and 2, which were 0.735 and 0.788 respectively. Similarly, $P_{23}$ for Group 2 is 0.774, similar to the $P(A_1)$ for Groups 1 and 2 of 0.750 and 0.821.

FIG. 6. Recency results for Group 1 (abscissa: number of preceding $E_1$’s and $E_2$’s; curves: data, Monte Carlo with learning curve $\theta$’s, and Monte Carlo with sequential $\theta$’s). The $\theta$ values upon which the Monte Carlo curves are based came from the estimates using the learning curves (Fig. 1) and the sequential data (Table 4).

FIG. 7. Recency results for Group 2 (abscissa: number of preceding $E_1$’s, $E_2$’s, and $E_3$’s). The $\theta$ values upon which the Monte Carlo curves are based came from the estimates using the learning curves (Fig. 2) and the sequential data (Table 4).

FIG. 8. Recency results for Group 3 (abscissa: number of preceding $E_1$’s, $E_2$’s, and $E_3$’s). The $\theta$ values upon which the Monte Carlo curves are based came from the estimates using the learning curves (Fig. 3) and the sequential data (Table 4).

RECENCY CURVES

Figures 6-8 contain the recency data. A recency curve is generated by calculating the probability of any response $A_i$, following k consecutive occasions on which the ith alternative was designated as correct. (In the discussion to follow, we will use the symbol $E_i$ to mean that the ith response alternative was designated as correct.) A great deal of interest has been generated by the question of whether the probability of $A_i$ increases or decreases as the number of directly preceding $E_i$’s increases. If $P(A_i)$, the probability of a response $A_i$, increases as the run of directly preceding $E_i$’s increases, the result is called positive recency. If $P(A_i)$ decreases as the run of directly preceding $E_i$’s increases, the result is called negative recency (Jarvik, 1951).

Because the present experiment departs from the usual procedure of probability learning studies, the manner in which the recency curves are calculated for our data must be explained. The procedure adopted here was to consider only the first choice on any trial, and to calculate the first choice probability as a function of the number of preceding trials, k, which terminated in $E_i$’s. Any second or third choices which intervened between a first choice and the reinforcement on a trial were ignored. Looked at in this way, the sequence of responses and events are in a form similar to that which would arise from a multiple-choice, noncontingent experiment.
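The recency calculation just described can be sketched as follows. The input is a pair of equal-length lists, one of first choices and one of trial-terminating events; second and third choices are assumed to have been dropped already. The pooling of long runs into a single point (`max_run`) and the tiny data set are illustrative assumptions, not values from the experiment.

```python
from collections import defaultdict


def recency_curve(first_choices, events, alternative, max_run=4):
    """P(first choice == alternative) as a function of k, the number of
    immediately preceding trials on which `alternative` was correct."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    run = 0  # length of the run of E_i's ending on the previous trial
    for choice, event in zip(first_choices, events):
        k = min(run, max_run)  # pool long runs into one point on the abscissa
        totals[k] += 1
        hits[k] += int(choice == alternative)
        run = run + 1 if event == alternative else 0
    return {k: hits[k] / totals[k] for k in sorted(totals)}


# Hypothetical six-trial record, for illustration only.
first_choices = [1, 1, 1, 2, 1, 1]
events = [1, 1, 2, 1, 1, 1]
print(recency_curve(first_choices, events, alternative=1))
```

A curve that rises with k corresponds to positive recency in the sense defined above, and a falling curve to negative recency.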

Contrary to results that might have been expected on the basis of earlier studies using two choices (Jarvik, 1951; Nicks, 1959), there is very little evidence for a negative recency effect in these data. There is only the slightest hint of a decrease in these data for runs of 3 or 4 preceding $E_i$’s and this is present only for the most frequently reinforced alternative. Rather, the present data resemble those obtained by Friedman et al. (1963) in a two choice situation. For a block of trials with $\pi = 0.80$ (which was preceded by a long training series with varying $\pi$ values) Friedman et al. found a positive recency effect which was greater for the alternative associated with the lower $\pi$ value. In both experiments, by far the greatest effect for the more frequently reinforced alternative follows the first occurrence of the appropriate event, with little or no change from there on (and a suggestion of some decrease for Group 2 (Fig. 6A) in the present experiment). The significance of the curves labelled “Monte Carlo” will be discussed below.

In addition to comparisons within groups, it is possible to compare recency curves across experimental groups. In this regard, it will be noted that the recency curves for the $A_1$ responses in Figs. 6A and 7A are extremely similar. The only consistent difference between the two groups is that the functions for Group 1 run about ten percentage points below the corresponding points for Group 2, in concert with the respective over-all response levels of the two groups. The initial portion of the $A_1$ function for Group 3 (Fig. 8A) is much steeper than its counterparts in Groups 1 and 2, reflecting in part the low level of the initial data point for Group 3.

Comparisons of the recency data for the less frequently reinforced alternatives across groups yield somewhat inconsistent results. On the one hand, the $A_2$ function for Group 2 (Fig. 7B) is steeper than the $A_2$ function for Group 1 (Fig. 6B), indicating that positive recency may be greater the rarer the corresponding event. On the other hand, the $A_3$ function for the second half of the series in Group 3 (Fig. 8C) is steeper than the corresponding data for Group 2 (Fig. 7C), although the probability of the event associated with $A_3$ is larger in Group 3 than in Group 2.

There is one real difficulty in assessing these recency data. The numbers of cases upon which data points in the graphs are based vary greatly between groups and between response alternatives within a group. For responses associated with a low $\pi$ value, only the first two or three points, k = 0, 1, 2, have high frequencies of occurrence, necessitating the combining of points along the abscissa.

RESPONSE SEQUENCES

The procedure used in the present experiment gives rise to two general types of response sequences. First, there are the sequences of first choice responses, considered by themselves; second and third responses which intervene between successive first choices being ignored. In an analysis of the first choices the data are treated much as if they came from a three-choice noncontingent experiment run under the traditional probability learning procedure.

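In order to provide sequential data against which to test theoretical formulations, the proportions of sequences of the type $A_{i,n}E_{j,n}A_{k,n+1}$ (where $i, j, k = 1, 2, 3$) observed on the final 500 trials have been tabulated in Table 4 for each group. These sequences will be referred to as the three-tuples. Also included in Table 4 are theoretical values generated using Monte Carlo subjects and a “dual-process” theory to be explained below.

TABLE 4
OBTAINED AND PREDICTED TERMINAL PROPORTIONS OF THE JOINT EVENTS $(A_{i,n}E_{j,n}A_{k,n+1})$

| $A_{i,n}$ | $E_{j,n}$ | $A_{k,n+1}$ | Group 1 Obs. | Group 1 Pred. 1a | Group 1 Pred. 2b | Group 2 Obs. | Group 2 Pred. 1a | Group 2 Pred. 2b | Group 3 Obs. | Group 3 Pred. 1a | Group 3 Pred. 2b |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 0.435 | 0.415 | 0.432 | 0.550 | 0.510 | 0.537 | 0.187 | 0.129 | 0.173 |
| 1 | 1 | 2 | 0.087 | 0.100 | 0.077 | 0.025 | 0.053 | 0.039 | 0.041 | 0.073 | 0.042 |
| 1 | 1 | 3 | 0.000 | 0.000 | 0.000 | 0.007 | 0.017 | 0.011 | 0.016 | 0.035 | 0.019 |
| 1 | 2 | 1 | 0.187 | 0.192 | 0.182 | 0.156 | 0.168 | 0.150 | 0.061 | 0.095 | 0.079 |
| 1 | 2 | 2 | 0.074 | 0.068 | 0.079 | 0.038 | 0.023 | 0.035 | 0.097 | 0.057 | 0.081 |
| 1 | 2 | 3 | 0.000 | 0.000 | 0.000 | 0.005 | 0.005 | 0.004 | 0.017 | 0.026 | 0.018 |
| 1 | 3 | 1 | - | - | - | 0.084 | 0.082 | 0.076 | 0.046 | 0.061 | 0.063 |
| 1 | 3 | 2 | - | - | - | 0.008 | 0.077 | 0.008 | 0.023 | 0.038 | 0.025 |
| 1 | 3 | 3 | - | - | - | 0.010 | 0.003 | 0.011 | 0.047 | 0.019 | 0.031 |
| 2 | 1 | 1 | 0.117 | 0.117 | 0.113 | 0.052 | 0.059 | 0.058 | 0.106 | 0.076 | 0.084 |
| 2 | 1 | 2 | 0.026 | 0.034 | 0.041 | 0.008 | 0.007 | 0.009 | 0.021 | 0.044 | 0.040 |
| 2 | 1 | 3 | 0.000 | 0.000 | 0.000 | 0.007 | 0.001 | 0.001 | 0.010 | 0.021 | 0.014 |
| 2 | 2 | 1 | 0.043 | 0.052 | 0.044 | 0.011 | 0.019 | 0.017 | 0.032 | 0.054 | 0.036 |
| 2 | 2 | 2 | 0.029 | 0.022 | 0.030 | 0.008 | 0.003 | 0.006 | 0.067 | 0.032 | 0.061 |
| 2 | 2 | 3 | 0.000 | 0.000 | 0.000 | 0.001 | 0.001 | 0.000 | 0.010 | 0.015 | 0.012 |
| 2 | 3 | 1 | - | - | - | 0.007 | 0.010 | 0.008 | 0.025 | 0.035 | 0.026 |
| 2 | 3 | 2 | - | - | - | 0.001 | 0.001 | 0.001 | 0.014 | 0.022 | 0.024 |
| 2 | 3 | 3 | - | - | - | 0.001 | 0.000 | 0.001 | 0.026 | 0.013 | 0.021 |
| 3 | 1 | 1 | 0.000 | 0.000 | 0.000 | 0.015 | 0.017 | 0.016 | 0.051 | 0.037 | 0.044 |
| 3 | 1 | 2 | 0.000 | 0.000 | 0.000 | 0.001 | 0.002 | 0.001 | 0.009 | 0.020 | 0.011 |
| 3 | 1 | 3 | 0.000 | 0.000 | 0.000 | 0.001 | 0.001 | 0.001 | 0.006 | 0.010 | 0.014 |
| 3 | 2 | 1 | 0.000 | 0.000 | 0.000 | 0.004 | 0.006 | 0.005 | 0.016 | 0.026 | 0.016 |
| 3 | 2 | 2 | 0.000 | 0.000 | 0.000 | 0.001 | 0.001 | 0.002 | 0.031 | 0.016 | 0.024 |
| 3 | 2 | 3 | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.005 | 0.009 | 0.008 |
| 3 | 3 | 1 | - | - | - | 0.002 | 0.002 | 0.002 | 0.012 | 0.019 | 0.012 |
| 3 | 3 | 2 | - | - | - | 0.000 | 0.000 | 0.000 | 0.008 | 0.011 | 0.009 |
| 3 | 3 | 3 | - | - | - | 0.001 | 0.000 | 0.000 | 0.018 | 0.005 | 0.014 |

$\theta$'s used for the predictions:
Group 1: a $\theta_o = 0.002$, $\theta_p = 0.080$; b $\theta_o = 0.10$, $\theta_p = 0.10$.
Group 2: a $\theta_o = 0.005$, $\theta_p = 0.030$; b $\theta_o = 0.05$, $\theta_p = 0.05$.
Group 3: a $\theta_o = 0.002$, $\theta_p = 0.006$; b $\theta_o = 0.05$, $\theta_p = 0.25$.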
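The tabulation of three-tuples described above can be sketched in a few lines of Python. This is only an illustration of the counting step, under the assumption that each subject's record is available as two parallel lists (first choices and trial-terminating events, coded 1 to 3); the function name `three_tuple_proportions` and the short sequences are hypothetical and are not data from the study.

```python
from collections import Counter
from itertools import product


def three_tuple_proportions(first_choices, events):
    """Proportion of each (A_n, E_n, A_{n+1}) triple over consecutive trials."""
    triples = list(zip(first_choices[:-1], events[:-1], first_choices[1:]))
    counts = Counter(triples)
    total = len(triples)
    # Report all 27 possible triples, including those never observed.
    return {key: counts.get(key, 0) / total
            for key in product((1, 2, 3), repeat=3)}


# Hypothetical six-trial record, for illustration only.
choices = [1, 1, 2, 1, 1, 3]
events = [1, 2, 1, 1, 3, 1]
props = three_tuple_proportions(choices, events)
print(props[(1, 1, 1)])
```

Pooling the counts over subjects and over the final 500 trials before dividing would give group proportions of the kind reported in Table 4.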


CONSTANT RATIO RULE

A second kind of response sequence, one that comes about as a result of the idiosyncratic procedure employed in the present study, involves the second choices. When an error is made on an initial choice, the subject is then presented with a subset of the total set of alternatives from which to choose. There are three such subsets, one corresponding to each of the three possible initial choices. Viewed in this way, it seems reasonable to inquire whether choices from the subsets will conform to the same general laws that have been established for choices among subsets in other experimental contexts. In particular, will the constant ratio rule, which has proven such an effective analytic tool in psychophysics (Clarke, 1957) and which forms a central part of Luce’s (1959) choice theory, apply in the present context? (It should be remembered that for the present experiment, presentation of subsets is subject controlled whereas the Clarke-Luce formulation was applied to situations in which presentation was experimenter controlled.) It is difficult to see, however, why the present procedure should not be amenable to the same analysis.

It will be recalled that the constant ratio rule states that the ratios of the probabilities of any two responses should be the same, regardless of whether they appear in a full matrix of stimulus objects, or only in a subset of the full set. Thus, if the full set is $(A_1A_2A_3)$, the ratio $P(A_1)/P(A_2)$ (for instance) must be invariant whether observed when $(A_1A_2A_3)$ is presented or when the subset $(A_1A_2)$ is presented.

TABLE 5
CONSTANT RATIO RULE ANALYSIS

| Group | | $A_{23}$ | $A_{13}$ | $A_{12}$ |
|---|---|---|---|---|
| 2 | Obs. | 0.722 | 0.860 | 0.776 |
| 2 | Pred. | 0.750 | 0.968 | 0.910 |
| 3 | Obs. | 0.620 | 0.702 | 0.551 |
| 3 | Pred. | 0.647 | 0.762 | 0.624 |

The application of the constant ratio rule reduces to asking whether the appropriate ratios of response probabilities from the top half of Table 2 will closely approximate the observed values in Table 3. The ratios calculated from the terminal first choice probabilities are compared with the observed values in Table 5. It is apparent from Table 5 that discrepancies exist in each group. However, the data are variable enough to make firm rejection of the rule difficult.
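As a rough illustration of how a constant ratio rule prediction is formed, the following Python sketch renormalizes two first choice probabilities over a two-alternative subset. Because Table 2 is not reproduced in this section, the inputs are the observed last-500-trial first choice proportions reported in Table 6, which is an assumption; the function name `constant_ratio_prediction` is hypothetical. With these inputs the Group 2 results agree with the printed predictions, while other entries may differ slightly from those in Table 5.

```python
def constant_ratio_prediction(p, subset):
    """Predicted probability of choosing the first member of `subset`
    when only the two members of `subset` are available."""
    a, b = subset
    return p[a] / (p[a] + p[b])


# Observed terminal first choice proportions (assumed inputs, see lead-in).
observed_first_choice = {
    2: {1: 0.884, 2: 0.087, 3: 0.029},
    3: {1: 0.531, 2: 0.304, 3: 0.166},
}
subsets = [(2, 3), (1, 3), (1, 2)]

predictions = {
    group: {s: constant_ratio_prediction(p, s) for s in subsets}
    for group, p in observed_first_choice.items()
}
for group, row in predictions.items():
    print(group, {s: round(v, 3) for s, v in row.items()})
```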


THEORY

The results discussed in the preceding pages provide basic data that a successful theory of search behavior must be able to account for. However, the number of theories which give an adequate description of even the asymptotic first choice probabilities is extremely limited, and only one seems to provide parameter free predictions of the asymptotes plus predictions for the mean learning curves. In their usual form, the linear and stimulus sampling models (Bush and Mosteller, 1955; Estes, 1959a) will not give accurate asymptotic predictions of these data. The very aspect of these theories which has been successful in many traditional probability learning experiments, the prediction of asymptotic probability matching, disqualifies them as contenders in the present context. In all search groups, the asymptotic first choice probabilities overshoot the respective $\pi$ values by a convincing margin. Various attempts were made to account for these search data using extant variations of the linear and stimulus sampling models (see Cole (1962) for a full discussion of the application of these models). These efforts met with some success in certain cases, but the difficulties encountered in applying the models encouraged us to look further for an account of these search data.

A DUAL PROCESS MODEL

While in the process of examining these models, it was noticed that the asymptotes of all groups closely approximated the value given by the ratio of the squares of the $\pi$ values. That is, the asymptotic probability of a response, $A_i$, corresponded closely to the value given by

$$P(A_i) = \frac{\pi_i^2}{\sum_{j=1}^{3} \pi_j^2}. \tag{1}$$

Values calculated from this formula are presented in Table 6.

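TABLE 6
THEORETICAL VS. OBSERVED ASYMPTOTIC PROBABILITIES, LAST 500 TRIALS

| | Group 1 $A_1$ | Group 1 $A_2$ | Group 1 $A_3$ | Group 2 $A_1$ | Group 2 $A_2$ | Group 2 $A_3$ | Group 3 $A_1$ | Group 3 $A_2$ | Group 3 $A_3$ |
|---|---|---|---|---|---|---|---|---|---|
| Observed | 0.779 | 0.221 | 0.000 | 0.884 | 0.087 | 0.029 | 0.531 | 0.304 | 0.166 |
| Theoretical | 0.800 | 0.200 | 0.000 | 0.880 | 0.097 | 0.023 | 0.553 | 0.310 | 0.137 |
| $\pi$ | 0.667 | 0.333 | 0.000 | 0.667 | 0.222 | 0.111 | 0.445 | 0.333 | 0.222 |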
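The theoretical row of Table 6 can be approximately reproduced from the $\pi$ row with a short calculation. The Python sketch below simply evaluates Eq. 1 for each group; the function name `asymptotic_first_choice` is hypothetical, and because the $\pi$ values are entered as the rounded three-decimal figures from the table, the results may differ from the printed theoretical values in the third decimal place.

```python
def asymptotic_first_choice(pi):
    """Eq. 1: each squared pi value divided by the sum of squared pi values."""
    squares = [p * p for p in pi]
    total = sum(squares)
    return [s / total for s in squares]


# Experimenter-assigned probabilities as printed in Table 6.
pi_values = {
    1: [0.667, 0.333, 0.000],
    2: [0.667, 0.222, 0.111],
    3: [0.445, 0.333, 0.222],
}

theoretical = {g: asymptotic_first_choice(pi) for g, pi in pi_values.items()}
for group, probs in theoretical.items():
    print(group, [round(x, 3) for x in probs])
```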


The close approximation of the values calculated from Eq. 1 to the observed first choice asymptotes suggests an interesting theoretical analysis within the context of a choice-point theory. The general line of development to be used follows closely an earlier discussion by Estes (1959b). The general conception of how the response process proceeds in this choice-point model can be characterized as follows:

Step 1. The trial begins with the presentation of a set of N stimulus objects.

Step 2. The subject orients toward (looks at, considers) one of the alternatives.

Step 3. At this point, one of two courses may be followed: (a) the subject may make an approach response to the object toward which he has oriented, in which case an overt response occurs, or (b) the subject may reorient and consider another alternative, returning him to Step 2.

Step 4. If the subject made an overt response in Step 3, one of two courses may again be followed: (a) the response is correct and the trial terminates, or (b) the response is incorrect and the subject reorients, this time orienting toward one of the two remaining response alternatives, in which case the sequence again follows the course from Step 2. This process continues until a correct response is made.

Thus, we have a dual-process system: an orienting response and an approach response, each of which must occur in order to give us the final, overt choice response. Before we can write down an expression describing the change in probability of the overt choice response, the basic assumptions of the model should be made explicit. These assumptions may be listed as follows:

1. Stimulus Assumptions

Each of the two component responses occurs in the presence of a specific kind of stimulus set.

(a) When the subject is at a choice point (i.e., when a set or subset of alternatives is presented) a set of “choice point stimuli” is present for sampling. Observing responses become conditioned to these choice point stimuli. Four such sets of choice point stimuli exist: (1) the set always present at the beginning of a trial; (2) the set present after an incorrect $A_1$ response; (3) the set present after an incorrect $A_2$ response; and (4) the set present after an incorrect $A_3$ response.

(b) There are three sets of “object stimuli,” one associated with each of the three alternative choices. The object stimuli from a given object set are sampled only when the subject orients toward that object. It is assumed that the subject samples from each object set on every trial. If a correct response is made before he has sampled (say) set X, he always orients toward (and thus samples from) set X before the next trial.²

² This assumption was introduced for mathematical simplicity. Other equally plausible assumptions leading to different models could be introduced at this point.

(c) Each set of stimuli is composed of a large number of elements from which the subject samples randomly on every trial.

(d) For the model discussed below, the stimuli are assumed to form mutually exclusive, nonoverlapping sets.

2. Conditioning Assumptions

(a) All stimuli sampled on a trial become conditioned to the response which is designated as correct at the end of the trial. The observing response becomes conditioned to choice point stimuli that have been sampled at any point during the trial. The approach response becomes conditioned to all the correct object stimuli that have been sampled, and the nonapproach response to all other object stimuli that have been sampled.

(b) Stimuli not sampled on a trial do not change their conditioning status.

(c) It will be assumed in the developments to follow that learning occurs in accordance with the linear difference equation that has been used to describe the change in response probability in statistical learning theory (Estes, 1959a). In the present context, these changes in response probability are said to take place with respect to each of the component responses, not the choice response, which is a compound of the two components.

3. Response Assumptions

(a) The probability of a response is equal to the proportion of stimulus elements that are conditioned to that response.

(b) There exist two kinds of responses: an orienting response and an approach response. The initial probability of the orienting response is equal to $1/r$, where $r$ is the number of alternatives. In the model to be discussed below, it is assumed that the initial probability of each approach response, $p_{i,1}$, is $\tfrac{1}{2}$; but it should be emphasized that this represents only a special case of a general class of theories differing in the initial approach probabilities that may be assumed to obtain.

(c) An approach response is made only to the alternative toward which the subject is oriented. The probability of orienting toward a response, $i$, will be designated $o_i$. The probability of approaching object $i$ after $o_i$ has occurred will be designated $p_i$.

(d) The probabilities $o_i$ and $p_i$ remain constant during a trial. Change in response probability can occur only following the trial-terminating event.

From these assumptions we may proceed to write down the probability of a first choice response $P(A_i)$ on any trial, $n$, of the experiment in terms of its component responses. The probability of the choice response will be the sum of the probabilities of all of the ways in which the orienting and approach responses can combine on a trial to result in that choice. When we begin to write out these various paths to the final response, we see that the probability that $A_i$ is chosen after the first orientation is $o_{i,n}p_{i,n}$. The probability that $A_i$ is chosen following exactly two orientations is

$$(1 - o_{i,n}p_{i,n} - o_{j,n}p_{j,n} - o_{k,n}p_{k,n})\,o_{i,n}p_{i,n},$$

and so forth ($i, j, k = 1, 2, 3$, and it is understood that $i$, $j$, and $k$ are all different). Generalizing these expressions and using standard summation notation we may write:

$$P(A_{i,n}) = o_{i,n}p_{i,n} \sum_{t=1}^{\infty} (1 - o_{i,n}p_{i,n} - o_{j,n}p_{j,n} - o_{k,n}p_{k,n})^{t-1} \tag{2}$$

where $t$ denotes the number of orientations made in arriving at the overt response. Once in this form, the series of events leading to the choice response may be seen to form a geometric series whose sum is:

$$P(A_{i,n}) = \frac{o_{i,n}p_{i,n}}{o_{i,n}p_{i,n} + o_{j,n}p_{j,n} + o_{k,n}p_{k,n}} \tag{3}$$
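The step from Eq. 2 to Eq. 3 can be checked numerically. The Python sketch below compares a truncated version of the series in Eq. 2 with the closed form in Eq. 3; the component probabilities are hypothetical values chosen only for illustration, and the function names are not from the original analysis.

```python
def overt_choice_closed_form(o, p, i):
    """Eq. 3: o_i * p_i divided by the sum of o_j * p_j over all alternatives."""
    products = [oj * pj for oj, pj in zip(o, p)]
    return products[i] / sum(products)


def overt_choice_partial_sum(o, p, i, terms=500):
    """Eq. 2 truncated after `terms` orientations."""
    products = [oj * pj for oj, pj in zip(o, p)]
    # Probability that a single orientation ends without an approach response.
    reorient = 1.0 - sum(products)
    return products[i] * sum(reorient ** (t - 1) for t in range(1, terms + 1))


# Illustrative component probabilities (hypothetical, not estimates from the data).
o = [0.5, 0.3, 0.2]   # orienting probabilities, summing to 1
p = [0.6, 0.4, 0.2]   # approach probabilities
closed = overt_choice_closed_form(o, p, 0)
partial = overt_choice_partial_sum(o, p, 0)
print(round(closed, 4), round(partial, 4))
```

With these values the truncated series and the closed form agree to well within rounding error, since the neglected tail shrinks geometrically.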

This expression gives the probability of choice response $A_i$ on any trial, $n$, in terms of the (as yet unspecified) probabilities, $o_i$ and $p_i$, of the orienting and approach responses. Before we can predict $A_{i,n}$, we must first be able to calculate the probabilities $o_{i,n}$ and $p_{i,n}$ for each trial of the experiment. The assumptions listed above suggest one method for obtaining these probabilities. According to assumption 2(c), the probabilities of the two component responses change according to the linear difference equation of statistical learning theory. Using this information, and taking as an example the orienting response (the same remarks will hold for the approach response), we have the following rules for change in response probability of a component response:

1. When the $i$th response is designated as correct on trial $n$,

$$o_{i,n+1} = (1 - \theta_o)\,o_{i,n} + \theta_o. \tag{4}$$

2. When the $i$th response is not designated as correct on trial $n$,

$$o_{i,n+1} = (1 - \theta_o)\,o_{i,n}. \tag{5}$$

The parameter $\theta_o$ in Eqs. 4 and 5 represents the proportion of stimulus elements from the choice point set that are sampled on a trial. Similarly, the $\theta_p$ in Eq. 7 below is the proportion of object stimuli sampled per trial. From Eqs. 4 and 5 we may proceed to derive the probability of the orienting response, $o_i$, on any trial $n$. This is done in the following manner: Eqs. 4 and 5 are first multiplied by the probability that they will apply on any trial. Thus, (4) is multiplied by $\pi_i$ and (5) is multiplied by $1 - \pi_i$. The two expressions are then summed, giving a single expression for the probability of the $o_i$ response on any trial $n + 1$:

$$o_{i,n+1} = (1 - \theta_o)\,o_{i,n} + \theta_o\pi_i.$$

When this difference equation is solved (see Estes (1959) for further discussion of this procedure) we arrive at the desired result for the probability of the $i$th orienting response on any trial $n$:

$$o_{i,n} = \pi_i - (\pi_i - o_{i,1})(1 - \theta_o)^{n-1} \tag{6}$$

(where $o_{i,1}$ is given by assumption 3(b)). By analogous reasoning, the following expression for the probability of the approach to response alternative $i$ may be written:

$$p_{i,n} = \pi_i - (\pi_i - p_{i,1})(1 - \theta_p)^{n-1}. \tag{7}$$
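The solution of the averaged difference equation can be verified by direct iteration. The following Python sketch iterates the expected one-trial update and compares each value with the closed form of Eq. 6; the parameter values are illustrative assumptions, and the function names are hypothetical.

```python
def expected_update(prob, theta, pi_i):
    """Averaged one-trial change: (1 - theta) * prob + theta * pi_i."""
    return (1.0 - theta) * prob + theta * pi_i


def closed_form(n, start, theta, pi_i):
    """Eq. 6 (or Eq. 7): component response probability on trial n."""
    return pi_i - (pi_i - start) * (1.0 - theta) ** (n - 1)


# Illustrative values: pi_i = 0.667, theta_o = 0.05, initial probability 1/3.
pi_i, theta_o, start = 0.667, 0.05, 1.0 / 3.0

trajectory = [start]              # trajectory[n - 1] holds the value for trial n
for _ in range(199):
    trajectory.append(expected_update(trajectory[-1], theta_o, pi_i))

max_gap = max(abs(value - closed_form(n, start, theta_o, pi_i))
              for n, value in enumerate(trajectory, start=1))
print(max_gap)
```

The same two functions apply to the approach response by substituting $\theta_p$ and the initial approach probability.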

In the above expressions, $o_{i,1}$ symbolizes the initial probability of making orienting response $i$, and $p_{i,1}$ the initial probability of making approach response $i$. The fact that the $\theta$'s in the two equations have different subscripts signifies that in general the approach and orienting responses will not be expected to have the same learning rates. By substituting appropriately for Eqs. 6 and 7 into Eq. 3, we may now obtain an expression for the probability of the overt choice response, $P(A_{i,n})$, on any trial $n$:

$$P(A_{i,n}) = \frac{[\pi_i - (\pi_i - o_{i,1})(1 - \theta_o)^{n-1}]\,[\pi_i - (\pi_i - p_{i,1})(1 - \theta_p)^{n-1}]}{\sum_{j=1}^{3} [\pi_j - (\pi_j - o_{j,1})(1 - \theta_o)^{n-1}]\,[\pi_j - (\pi_j - p_{j,1})(1 - \theta_p)^{n-1}]} \tag{8}$$

Although somewhat forbidding at first glance, Eq. 8 has a number of rather simple properties. There are two main elements to the equation: one element represents the average probability of the observing response and one the average probability of the approach response on any trial $n$. Consider the numerator: as the number of trials approaches $\infty$, the left hand element of the numerator, $\pi_i - (\pi_i - o_{i,1})(1 - \theta_o)^{n-1}$, approaches the asymptotic value, $\pi_i$. The same holds true for the right-hand element representing the average response probability of the approach response. Thus, in the limit, the numerator approaches $\pi_i\pi_i = \pi_i^2$. In the denominator, only the expression $\sum_{j=1}^{3} \pi_j^2$ remains in the limit. Thus, the asymptote given by Eq. 8 is exactly the equation we began with in the development of the dual-process model, Eq. 1, i.e.,

$$P(A_{i,\infty}) = \frac{\pi_i^2}{\sum_{j=1}^{3} \pi_j^2}.$$
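A short numerical sketch may help make the behavior of Eq. 8 concrete. The Python code below evaluates the product of Eqs. 6 and 7 for each alternative and normalizes, as in Eq. 3. The $\pi$ values are those of Group 2, and the $\theta$ values are the ones listed for Group 2 under prediction 1a in Table 4, used here only as plausible inputs; the function names are hypothetical, and the initial probabilities of 1/3 and 1/2 follow assumption 3(b).

```python
def component(n, pi_j, start, theta):
    """Eqs. 6 and 7: mean component response probability on trial n."""
    return pi_j - (pi_j - start) * (1.0 - theta) ** (n - 1)


def eq8(n, i, pis, theta_o, theta_p, o_start=None, p_start=0.5):
    """Eq. 8: approximate probability of first choice A_i on trial n."""
    if o_start is None:
        o_start = 1.0 / len(pis)   # assumption 3(b): 1/r for r alternatives
    products = [component(n, pj, o_start, theta_o) *
                component(n, pj, p_start, theta_p) for pj in pis]
    return products[i] / sum(products)


pis = [0.667, 0.222, 0.111]        # Group 2 pi values
theta_o, theta_p = 0.005, 0.030    # assumed inputs, see lead-in

for n in (1, 100, 500, 1000):
    print(n, round(eq8(n, 0, pis, theta_o, theta_p), 3))

asymptote = pis[0] ** 2 / sum(pj ** 2 for pj in pis)   # Eq. 1
```

On trial 1 every alternative has the same product of initial probabilities, so the curve starts at 1/3; for large $n$ it approaches the Eq. 1 value.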


There are two difficulties involved in the use of Eq. 8. First of all, Eq. 3 as here interpreted is in fact the mean of a ratio, not the ratio of means. Secondly, the terms on the right hand side of the equation are products, and the mean of a product equals the product of the means only if the covariance term is zero (a restriction which is implausible in the present context). Thus, though very convenient in applications, Eq. 8 can provide only an approximation to Eq. 3. Furthermore, it is unclear how adequate this approximation is. To overcome the objections to this line of derivation and to obtain an estimate of the adequacy of Eq. 8, Monte Carlo runs were made. The results of both types of analyses are discussed in turn.

In order to apply Eq. 8 to obtain predictions for the learning curves (Figs. 1-3) it is necessary to have an estimate of the two parameters, $\theta_o$ and $\theta_p$. By assumption, $p_{i,1}$ and $o_{i,1}$, the initial probabilities of the approach and orienting responses, are $\tfrac{1}{2}$ and $\tfrac{1}{3}$ respectively. The method used to obtain estimates of $\theta_o$ and $\theta_p$ in the present case was to determine by numerical means the curve which appeared to give the best fit to the observed curves. Estimates of $\theta_o$ and $\theta_p$ were determined for each group separately. The theoretical curves were then calculated for blocks of 25 trials, and the results included in Figs. 1-3. From the figures it is apparent that the results obtained with these $\theta$'s of “best fit” compare favorably with the linear model fits to ordinary probability learning data (Estes, 1959a). However, as the caption to each figure indicates, the best fitting $\theta$'s varied from group to group. As yet no way has been determined to distinguish $\theta_o$ from $\theta_p$, so that the designations in the graphs are only a convenient way of noting that two $\theta$'s with separate values were used in generating the theoretical curves.

In addition to the first choice learning curves, it is possible to make predictions about the course of learning on the second choices.
According to our assumptions, learning on the second choices follows exactly the same rules as learning on first choices, except that there are only two response alternatives to choose from. Thus, Eq. 8 will apply on the second choices, except that instead of summing over all three $\pi$ values to obtain the denominator, only the $\pi$'s associated with the subset to be chosen from need be considered. There is a problem, however, in deciding how to calculate the second choice learning curve for successive blocks of trials. This is the case because the number of second choice trials which occur for a given subset depends on both the first choice probabilities and the $\pi$ values. As a result, the number of trials for each subset will vary throughout the experiment, and application of Eq. 8 to the second choices becomes a problem.

Once the first choice theoretical curves have been obtained, there is a straightforward method for obtaining the second choice curves. By multiplying the average first choice probability of (say) response $A_i$ for each block of trials by the probability, $1 - \pi_i$, that the subject makes an error on an $A_i$ response, we may obtain the probability that the subset $A_jA_k$ occurs. Multiplying this probability in turn by the number of trials in a block we obtain the number of trials for the subset $A_jA_k$ in that block of trials. Thus, for each block of $A_i$ trials we may obtain the theoretical number of trials on which the subset $A_jA_k$ occurs. With this knowledge, we can then solve Eq. 8 using the appropriate number of second choices for each block of trials, and get theoretical curves for each of the second choice curves, as shown in Figs. 4 and 5.

In the present application, second choice learning curves were calculated for 100 trial blocks. The only change in Eq. 8, aside from the exclusion of the $\pi$ value that did not enter into the choice, was to change the initial probability of the orienting response. Instead of $o_{i,1} = \tfrac{1}{3}$, the $o_{i,1}$ was taken equal to $\tfrac{1}{2}$, since on second choices only two alternatives are present. The initial probability of the approach response is not affected by the change from three to two choices. In addition we will make the assumption that the same $\theta$ values apply on both first and second choices.

Figures 4 and 5 contain the predictions from Eq. 8 applied to the second choices in the manner described above. For convenience in interpreting the graphs, each curve has been plotted separately. In only one case ($P_{13}$ in Group 3, Fig. 5) does there appear to be any systematic deviation of the theoretical from the observed curves. In this one case it appears that slightly larger $\theta$ values are required to give the best fit to the data. However, even in this case, the deviations are not extreme. In each case, the fits to the most reliable curves (the $A_2A_3$ subsets have many more cases per data point than the $A_1A_2$ and $A_1A_3$ subsets) are almost on a par with the first choice curves.

The second theoretical analysis, the use of Monte Carlo subjects, proceeded directly from the assumptions of the model listed above. One hundred statistical subjects were generated for each experimental group using the same parameter values as were used in the application of Eq. 8.
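The bookkeeping for the second choice curves can be illustrated with a small calculation. The Python sketch below computes the expected number of trials in a block on which a given two-alternative subset is presented, following the multiplication described above. The function name `expected_subset_trials` and the numerical inputs are hypothetical and serve only to show the arithmetic.

```python
def expected_subset_trials(first_choice_prob, pi_i, block_size):
    """Expected number of trials in a block on which A_i is chosen first
    and is wrong, so that the remaining two alternatives are presented."""
    return block_size * first_choice_prob * (1.0 - pi_i)


# Hypothetical block: P(A_1) = 0.88 on first choices, pi_1 = 0.667, 100 trials.
n_subset = expected_subset_trials(0.88, 0.667, 100)
print(round(n_subset, 1))   # about 29.3 trials with the subset A_2 A_3
```

Repeating this for each block, with the first choice probability taken from the theoretical first choice curve, gives the varying number of second choice trials that the text says must be supplied to Eq. 8.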
The data obtained by this means were then submitted to exactly the same analysis as the data from the real subjects. The mean curves obtained in this manner are included in Figs. 1-5. From the figures it is apparent that Eq. 8 provides a rather good approximation to the “true” mean curve as represented by the Monte Carlo subjects.

An added advantage of the Monte Carlo procedure in the present case is that it permits us to evaluate the adequacy of the model with respect to two additional aspects of the data for which we do not have theoretical expressions: the recency curves and the three-tuples. It is evident from an inspection of Figs. 6-8 and Table 4 that the model is less successful in handling these aspects of the data than it was in describing the learning curves. In general, the theoretical values calculated on the basis of the $\theta$ values taken from the learning curves indicate that these $\theta$'s are too small to account for the observed sequential effects. This is a fairly common outcome in situations where the linear model of statistical learning theory has been applied (see Friedman et al., 1963).

Lacking theoretical expressions for the sequential statistics, it is a difficult task to determine whether or not $\theta$ values exist which will describe the present sequential data. The strategy used in the present case was to generate sets of 25 Monte Carlo subjects with various combinations of $\theta$ values to find if there were any $\theta$'s which could approximate the data, although an exact specification of the best fitting $\theta$ values could not be obtained in the absence of the appropriate theoretical expressions. It was found that plausible fits could be found for the three-tuples (Table 4), but when the appropriate $\theta$'s were then used in the recency analysis, the results were poor (Figs. 6-8). There seems to be no way, with the present assumptions, to obtain the large change in response probability between the “0” and “1” points and at the same time produce small changes following longer strings of $E_i$'s.
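The Monte Carlo procedure can be sketched along the following lines. This Python code is a simplified illustration under stated assumptions, not a reconstruction of the original program: it simulates first choices only, treats the component probabilities as updated on every trial by Eqs. 4 and 5 (which follows from the assumption that every object set is sampled on each trial), and uses the Group 2 $\pi$ values with $\theta$ values taken from Table 4 as plausible inputs. All function names are hypothetical.

```python
import random


def simulate_subject(pis, theta_o, theta_p, n_trials, rng):
    """First choices of one statistical subject over n_trials."""
    k = len(pis)
    o = [1.0 / k] * k      # orienting probabilities at the initial choice point
    p = [0.5] * k          # approach probabilities, one per object
    first_choices = []
    for _ in range(n_trials):
        # Orient, then approach or reorient, until an overt first choice occurs.
        while True:
            i = rng.choices(range(k), weights=o)[0]
            if rng.random() < p[i]:
                break
        first_choices.append(i)
        # The experimenter designates the correct alternative with probability pi.
        correct = rng.choices(range(k), weights=pis)[0]
        for j in range(k):
            target = 1.0 if j == correct else 0.0
            o[j] = (1.0 - theta_o) * o[j] + theta_o * target   # Eqs. 4 and 5
            p[j] = (1.0 - theta_p) * p[j] + theta_p * target
    return first_choices


def terminal_proportions(pis, theta_o, theta_p, n_subjects=25,
                         n_trials=1000, last=500, seed=0):
    """Pooled first choice proportions over the last `last` trials."""
    rng = random.Random(seed)
    counts = [0] * len(pis)
    for _ in range(n_subjects):
        for choice in simulate_subject(pis, theta_o, theta_p, n_trials, rng)[-last:]:
            counts[choice] += 1
    total = n_subjects * last
    return [c / total for c in counts]


props = terminal_proportions([0.667, 0.222, 0.111], theta_o=0.005, theta_p=0.030)
print([round(x, 3) for x in props])
```

Recording the trial-terminating events alongside the first choices would allow the same simulated records to be tabulated as three-tuples or recency curves, which is how the text describes the comparison with Table 4 and Figs. 6-8.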

DISCUSSION

The two most important classes of results that appear to distinguish the present experiment from its predecessors are: (1) fast learning with unusually large overshooting of the $\pi$ values, and (2) absence of a negative recency effect. In part the impression that learning in the search situation is faster than in usual probability learning is the result of the extensive series of trials that was used and the way in which the data were grouped. That is, the first choice asymptotes have been calculated for a series of trials that is more than twice as long as any that have been reported for three-choice probability learning. Even the grouping of the smallest blocks of trials, 25 to a block, is a much coarser grouping than one ordinarily encounters. The present data were regrouped to provide comparisons with the data from various three-choice experiments run under normal probability learning procedures (Cotton & Rechtschaffen, 1958; Estes et al., 1962; Gardner, 1957, 1958, 1961). The conclusion drawn from this comparison was that the search procedure yielded slightly more rapid learning and higher asymptotes than other three-choice studies (see Cole, 1962).

A general factor which may be operating in the search situation is the existence of a stronger incentive to be correct than is ordinarily present in conventional probability learning. The incentive referred to is the possible reduction in the duration of an experimental session for increased accuracy of choices. In the usual probability learning session, the duration of the experimental session is fixed. In the search experiment the fewer guesses required per trial, the less time it takes to complete each trial, and consequently, the less time it takes to complete the experimental session. This relationship between accuracy and the duration of the experiment is made clear to the subject in the instructions.
It seems reasonable to inquire, then, whether or not this incentive has the same effects on response probability as more conventional forms of reward such as money, poker chips, etc. In those experiments where rewards of various types for correct responses have been introduced into probability learning experiments, the asymptotic response probability has been found to be an increasing function of the size of the incentive: the greater the incentive for correct responding, the more the terminal response probability has exceeded $\pi$. Perhaps this same sort of effect is present in the search situation. Unfortunately, however, there is a paucity of probability-incentive data which will permit the comparisons necessary to properly assess the role of incentive factors in the search situation. No three-choice, incentive-probability learning experiments have been published to date. Moreover, the two-choice experiments have for the most part involved only a limited number of trials, and often no learning curves are presented even for the trials that are run (Taub & Myers, 1961; Myers, Reilly, & Taub, 1961).

The second general class of results from the present experiment which appear atypical are the recency data. It will be recalled that a positive recency effect is found throughout the trial series. Exactly how these results fit in with the recency effects reported during the last several years is not altogether clear. Early probability learning experiments yielded negative recency effects, and for a while seemed to represent a solid piece of evidence against the usual reinforcement interpretation of probability learning (Jarvik, 1951; Nicks, 1959). However, recent experiments employing considerably more extensive series of trials than were common in the early experiments have found the negative recency effect to be a transient phenomenon if it is found at all. Thus, Edwards (1961) reports a negative recency effect in early trials which gives way to positive recency in later trials, while Friedman et al. (1963) find no recency effect in early trials with a strong positive recency effect during later trials. In contrast, in the present experiment the recency functions were found to be similar when calculated for the first and last halves of the session (Cole, 1962).
There is some evidence that the recency results obtained in the present study are not peculiar to the search situation, but may be a general feature of three-choice probability learning. Although this aspect of the data has not been reported in the studies involving three choices cited above, preliminary analysis of the recency data from the Estes et al. (1962) three-choice data yielded recency data substantially the same as those reported here.

REFERENCES

BOWER, G. H. Choice-point behavior. In R. R. Bush & W. K. Estes (Eds.), Studies in mathematical learning theory. Stanford, Calif.: Stanford Univer. Press, 1959. Pp. 109-124.
BURKE, C. J., ESTES, W. K., & HELLYER, C. Rate of verbal conditioning in relation to stimulus variability. J. exp. Psychol., 1954, 48, 153-161.
BUSH, R. R., & MOSTELLER, F. Stochastic models for learning. New York: Wiley, 1955.
CLARKE, F. R. Constant-ratio rule for confusion matrices in speech communication. J. acoust. Soc. Amer., 1957, 29, 715-720.
COLE, M. Search behavior: a correction procedure for three-choice probability learning. Unpublished doctoral dissertation, Indiana Univer., 1962.
COTTON, J. W., & RECHTSCHAFFEN, A. Replication report: two- and three-choice verbal-conditioning phenomena. J. exp. Psychol., 1958, 56, 96.
EDWARDS, W. Probability learning in 1000 trials. J. exp. Psychol., 1961, 62, 385-394.
ESTES, W. K. The statistical approach to learning theory. In S. Koch (Ed.), Psychology: a study of a science. New York: McGraw-Hill, 1959a, Vol. 2.
ESTES, W. K. A random-walk model for choice behavior. In K. J. Arrow, S. Karlin, & P. Suppes (Eds.), Mathematical methods in the social sciences. Stanford, Calif.: Stanford Univer. Press, 1959b. Pp. 265-276.
ESTES, W. K., COLE, M., KELLER, L., POLSON, P., & STIENER, T. Three choice non-contingent probability learning with extensive trials. Unpublished data, 1962.
FRIEDMAN, M. P., BURKE, C. J., ESTES, W. K., COLE, M., KELLER, L., & MILLWARD, R. B. Two choice behavior under extended training with shifting probabilities of reinforcement. In R. C. Atkinson (Ed.), Studies in mathematical psychology. Stanford, Calif.: Stanford University Press, 1963. Pp. 250-316.
GARDNER, R. A. Probability learning with two- and three-choices. Amer. J. Psychol., 1957, 70, 174-185.
GARDNER, R. A. Multiple-choice decision-behavior. Amer. J. Psychol., 1958, 71, 710-717.
GARDNER, R. A. Multiple-choice decision-behavior with dummy choices. Amer. J. Psychol., 1961, 75, 205-214.
JARVIK, M. E. Probability and a negative recency effect in the serial anticipation of alternative symbols. J. exp. Psychol., 1951, 41, 291-297.
LUCE, R. D. Individual choice behavior. New York: Wiley, 1959.
MYERS, J. L., REILLY, R. E., & TAUB, H. A. Differential cost, gain, and relative frequency of reward in a sequential choice situation. J. exp. Psychol., 1961, 67, 357-361.
NICKS, D. C. Prediction of sequential two-choice decisions from event runs. J. exp. Psychol., 1959, 57, 105-114.
TAUB, H. A., & MYERS, J. L. Differential monetary gains in a two-choice situation. J. exp. Psychol., 1961, 61, 157-162.

RECEIVED: November 15, 1963