JOURNAL OF MATHEMATICAL PSYCHOLOGY: 4, 473-488 (1967)

Response Speeds in Probability Learning¹

JEROME L. MYERS, BLASE GAMBINO, AND MARI R. JONES

University of Massachusetts, Amherst, Massachusetts 01002

Response probabilities and response speeds were obtained from four groups of subjects who differed with respect to the level of π in a probability-learning task. A self-paced procedure was used to obtain a more sensitive measure of response speed. Response-probability data, both marginal and conditional statistics, were similar to those obtained in previous studies under an experimenter-paced procedure. Response speeds differed significantly as a function of the response made and the level of π, results not generally obtained under the experimenter-paced procedure. Speed data were also analyzed as a function of preceding runs of events and as a function of preceding response-event outcomes. The W-S model developed by Myers and Atkinson provided a good description of the response-probability data, but problems were encountered in fitting the speed data.

The present study investigates the relationship between event probability and response speeds in a probability-learning task. The results have implications for mathematical models as well as for methodology in probability-learning experiments. Extensive evidence exists that choice reaction times are inversely related to the probability of the choice (e.g., Falmagne, 1965). In view of this, it is somewhat surprising to find that few data are available for choice times in probability-learning tasks, and that those data that have been published frequently fail to reveal systematic variations in choice times as a function of changes in event probabilities. Stevenson and Zigler (1958) found that event probability had no effect in a contingent reinforcement situation with normal and feebleminded children. Using adult subjects, Gerjuoy, Gerjuoy, and Mathias (1964) obtained no evidence of a correlation between response probability and response latency. Similarly, Friedman, Burke, Cole, Estes, Keller, and Millward (1964) observed no systematic relationship between π (the probability of E1, the more frequent event) and response times, although the A1 response time was about 50 msec. shorter than the A2 response time when π was .8. Calfee (1963), using rats as subjects, failed to obtain a difference between mean latencies at π's of .65 and JO; however, he did find that A1 and A2 latencies were inversely correlated with the response probabilities.

¹ This research was supported by funds from NIH Grant MH-03803-05 and National Science Foundation Grant GS-386. The authors wish to thank the Research Computing Center, University of Massachusetts, for assistance in analyzing data.


Only two investigators have clearly found that π influences response times. Their studies differed from those cited above in that a discriminative stimulus, S_i, served as the signal for the subject's response. Wollen (1963) trained subjects on two types of randomly interspersed trials; although all four groups had a π of 1.00 on trials initiated by one signal, S1, they differed with respect to the event probabilities on trials initiated by the second signal, S2. The average latency for A1 and A2 responses combined decreased monotonically with increases in π on S2 trials. In a subsequent test, presentation of S2 with no feedback available following the response resulted in decreases in both A1 and A2 latencies with increased π. In an experiment by Erickson (1966), the discriminative signal indicated which of four two-choice games was to be played on any trial. A2 response times are not reported; the A1 latencies vary inversely with π.

Commenting on the failure of Friedman et al. (1964) to obtain π effects, Myers and Atkinson suggest ". . . that in the typical experimental situation the subject decides on his response prior to the signal to respond. Under these conditions, response time, measured from the onset of the signal, would reflect the speed of reaction to the trial signal, and not choice time" (p. 188). This hypothesis also accounts for the negative results of Gerjuoy et al. (1964) and Stevenson and Zigler (1958). It is also consistent with the positive findings of Erickson (1966) and Wollen (1963). In the latter two studies, the subject presumably delays his decision until the signal to respond because the signal provides information necessary to the decision. If this interpretation of latency findings in previous experiments is valid, it should be possible to demonstrate a relationship between response latency and event probability in a simple two-choice task that does not involve a discriminative stimulus.

In the present experiment, subjects are self-paced; latency is measured from the onset of the event on trial n to the occurrence of the response on trial n + 1. With this technique, latency should clearly reflect decision time, and with four levels of π, an extensive test of existing models is provided.

Only two of the many mathematical models developed to describe probability learning yield specific predictions for response latencies. These models are of particular interest since one, LaBerge's (1959), predicts no effect of π, while the second, Myers and Atkinson's (1964), predicts the type of inverse relationship between response probability and latency that has been observed generally in choice reaction-time studies, but infrequently in probability-learning studies. LaBerge has assumed that the proportion of stimulus elements connected to irrelevant responses determines response latency. During the course of learning, these originally neutral elements become conditioned to either the A1 or the A2 response (prediction of E1, the more frequent event, or of E2, the less frequent event). Latency therefore declines with practice; however, according to LaBerge's model, it is not influenced by π and it does not differ for the two responses.

In the Myers-Atkinson weak-strong (W-S) model, an extension of Estes' pattern model (1959), it is assumed that a single stimulus element is sampled on each trial, that this element becomes either strongly or weakly conditioned to either A1 or A2, and that one of two latency distributions is sampled on each trial depending on whether the element is in the weak or the strong state. Further, assuming that the mean of the latency distribution is less in the strong than in the weak state, Myers and Atkinson derive the prediction that response time will be less for A1 responses than for A2 responses, and that increases in π will result in shorter A1 response times and longer A2 response times. In summary, response times and response probabilities will be inversely related. In addition, the response time averaged over both responses will decrease with increases in π.

Models originally formulated to describe choice reaction-time data might also be considered in the probability-learning context. However, those which make a clear statement about the relationship between marginal response probabilities and marginal latencies (e.g., Audley's "implicit response" model, 1960; Bower's VTE model, 1959; Kintsch's random-walk model, 1963) make the same prediction that the W-S model does: response latency will be inversely related to response probability. The choice reaction-time models are of somewhat less interest than the W-S model since, without additional assumptions, the former do not describe the learning process over trials.

METHOD

Apparatus. Two 1" diameter event lights (E1 and E2) and three microswitch response buttons were mounted on a 124 '< g-in. wooden panel. The event lights were $-in. apart and each was placed directly above a response-indicator button. The third button, designated the decision key, was mounted centrally, t&in. below the level of the two response-indicator buttons. One ounce of pressure was required to operate the buttons. The panel itself formed a 20° angle with the table and was equipped with an adjustable elbow rest to ensure that the subject's arm rested comfortably. All programming and recording equipment was separated from the testing situation by a Masonite screen. A Tally tape reader was used to automate the sequences of reinforcing events. Decision time, the length of time the decision key was depressed, was recorded to the nearest millisecond. This measure, as well as the response and event, was automatically printed on each trial.

Procedure. Each subject was randomly assigned to one of four levels of π (90:10, 80:20, 70:30, 60:40). Five sequences were randomly generated for each π level. For half of the subjects in each of the 20 sequences, the left light was the more frequent event; for the remaining subjects, the right light was the more frequent event. All subjects received 400 self-paced trials. The subject started each trial by depressing the decision key, then indicated his prediction by pressing one of the response-indicator buttons. The correct event light came on when the subject pressed the decision key again, and it remained on until the subject released this key to make his next response. Subjects were instructed to try to outguess the experimenter on each trial. They were told not to rush, but to work at a comfortable pace. They were told that response time measurements were being recorded.
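The event schedule described in the Procedure can be sketched as follows. This is a minimal illustration only: the paper says the sequences were "randomly generated" but does not say how, so independent draws on each trial are an assumption, and the function name `make_event_sequence` and the seeds are hypothetical.

```python
import random

PI_LEVELS = (0.6, 0.7, 0.8, 0.9)  # probability of E1, the more frequent event
N_TRIALS = 400                     # self-paced trials per subject
SEQUENCES_PER_LEVEL = 5            # five sequences at each level of pi


def make_event_sequence(pi, n_trials=N_TRIALS, seed=None):
    """Return event codes: 1 (E1) with probability pi, otherwise 2 (E2).

    Assumption: events are drawn independently on every trial.
    """
    rng = random.Random(seed)
    return [1 if rng.random() < pi else 2 for _ in range(n_trials)]


# 4 levels x 5 sequences = 20 sequences; the seed strings are arbitrary.
sequences = {
    (pi, k): make_event_sequence(pi, seed=f"{pi}-{k}")
    for pi in PI_LEVELS
    for k in range(SEQUENCES_PER_LEVEL)
}
```

With independent draws the realized proportion of E1 in any one 400-trial sequence will differ somewhat from π; a schedule with fixed event proportions per block would be a different design choice.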

Subjects. The subjects were 80 summer school students at the University of Massachusetts. All were right-hand-dominant females. Each subject received $1.00 for participating in the experiment.

RESULTS AND DISCUSSION

RESPONSE PROBABILITY

Figure 1 presents P(A1), the proportion of predictions of the more frequent event (E1), for each block of 100 trials. The effect of π is clear, and the positions and shapes of the curves generally resemble those obtained under the more common procedure in which the pacing is automated and inter- and intra-trial intervals are constant over trials. P(A1) does exceed π, particularly at π's of .7 and .8, but no more than would be expected on the basis of previous studies using at least this many trials (Edwards, 1961). The changes in P(A1) from the third to the fourth block are not significant, indicating that the learning process has reached asymptote. This is important since it provides some justification for the subsequent application of asymptotic expressions derived from the W-S model.

FIG. 1. Probability of an A1 response for each block of 100 trials. [Axes: P(A1) against blocks of trials, 1 to 4; one curve for each π level, .6 to .9.]
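The block statistic plotted in Fig. 1 can be computed from a subject's response record along the following lines. This is a minimal sketch; the function name and the coding of responses as 1 (A1) and 2 (A2) are assumptions made for illustration.

```python
def block_proportions(responses, block_size=100, target=1):
    """Proportion of `target` responses in each consecutive block of trials."""
    blocks = [
        responses[start:start + block_size]
        for start in range(0, len(responses), block_size)
    ]
    return [sum(r == target for r in block) / len(block) for block in blocks]


# Hypothetical record: 60 A1 responses out of 100, then 80 out of 100.
example = [1] * 60 + [2] * 40 + [1] * 80 + [2] * 20
print(block_proportions(example))  # [0.6, 0.8]
```

For a 400-trial session this returns four values, one per block; group curves would then average these values over the subjects at each π level.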

Figure 2 presents the probability of an A1 response conditional upon the length of the preceding run of events.

FIG. 2. Probability of an A1 response conditional on the length of the preceding run.
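Run-conditional statistics of the kind shown in Fig. 2 can be tallied as in the sketch below. It is illustrative only: the paper does not state how long runs were pooled or how the first trials of a session were treated, so the `max_run` pooling and the decision to start counting runs at the first trial are assumptions, as are the function name and the 1/2 coding of events and responses.

```python
from collections import defaultdict


def p_a1_given_run(events, responses, max_run=5):
    """P(A1 on trial n+1 | event and run length ending on trial n).

    Returns {(event, run_length): proportion}. Runs longer than max_run
    are pooled into the max_run category (an assumption).
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [A1 count, total count]
    run_length = 0
    for n in range(len(events) - 1):
        if n > 0 and events[n] == events[n - 1]:
            run_length += 1
        else:
            run_length = 1
        key = (events[n], min(run_length, max_run))
        counts[key][0] += responses[n + 1] == 1
        counts[key][1] += 1
    return {key: a1 / total for key, (a1, total) in counts.items()}


events = [1, 1, 2, 1, 1, 1]
responses = [1, 1, 1, 2, 1, 2]
print(p_a1_given_run(events, responses))
# {(1, 1): 1.0, (1, 2): 0.5, (2, 1): 0.0}
```

Positive recency shows up as P(A1) rising with the length of an E1 run; negative recency shows up as P(A1) falling after several consecutive E1's.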

When π is .6, some suggestion of negative recency is obtained; in blocks 1 and 3, P(A1) declines after several consecutive E1's, and in all four blocks P(A1) increases following several consecutive E2's. Positive recency is the rule at the other π levels. The first in a run of E1's results in an increase in P(A1); this reinforcement effect diminishes as run length increases. A similar positive recency, or reinforcement, effect is found for runs of E2's at the three higher π levels. The finding of more negative recency at the π of .6, and particularly with E2 runs, is reasonable if we note that mean run lengths are shorter under these conditions. Several investigators have demonstrated that the degree of negative recency is inversely correlated with run length (e.g., Derks, 1963; Jones and Myers, 1966). In general, the run curves are similar to those obtained by several other investigators (Edwards, 1961; Heuckeroth, 1965; Jones and Myers, 1966; Lindman and Edwards, 1961), all of whom used one or more of the π levels used in the present study.

Another conditional statistic which is an important indicant of the type of process underlying the behavior is the probability of a response conditional upon the response-event outcome of the preceding trial, $P(A_{1,n} \mid E_{i,n-1}A_{j,n-1})$. Variations of the linear or the pattern model will typically yield the parameter-free prediction that the probability of an A1 response will be greatest following an E1A1 outcome and least following an E2A2 outcome. The observed conditional statistics in Table 1 are consistent with this prediction at all four π levels.

In order to supplement our later analysis of response speeds, as well as to provide a further point of comparison of the present data with data previously obtained under the experimenter-paced procedure, the W-S model was applied to the first-order conditional probabilities for the last block of 100 trials at each π level. The appropriate asymptotic expressions are available in several other papers (e.g., Heuckeroth and Myers, 1967; Myers and Atkinson, 1964; Myers, Suydam, and Heuckeroth, 1966), and the derivational techniques are presented in Atkinson and Estes (1963) and Suppes and Atkinson (1960). They will not be reproduced here.

Using a minimum χ² procedure (Myers and Atkinson, 1964), four sets of three parameters were estimated, one set for each level of π. The parameters are N, the number of stimulus elements; δ, the effect of an incorrect response; and μ, the effect of a correct response. The resulting predictions are presented in Table 1 in the column labeled I, together with observed values, parameter estimates, and the total χ² for the four data sets. This statistic, distributed on 4 df, is not significant at the 5% level. Since the statement of the model implies invariance of parameter values as a function of π, parameters were again estimated, this time under the restriction that N, δ, and μ be held constant over levels of π. The results appear in the column labeled II in Table 1. The χ², on 13 df, is significant and the fit, while respectable in comparison to many results in the literature, is markedly inferior to that obtained in column I. Thus, the first-order conditional response probabilities seem well described by the W-S model, but parameter values appear to be different at the four π levels. The same conclusion has been reached in previous investigations (Heuckeroth, 1965; Myers and Atkinson, 1964; Myers et al., 1966).

TABLE 1

OBSERVED ASYMPTOTIC VALUES OF $P(A_{1,n} \mid E_{i,n-1}A_{j,n-1})$ AND TWO SETS OF PREDICTIONSᵃ

 π     Preceding outcome    Observed    Predicted I    Predicted II
.6     E1A1                 .800        .785           .777
       E2A1                 .527        .544           .533
       E1A2                 .573        .589           .603
       E2A2                 .375        .337           .349
.7     E1A1                 .885        .883           .839
       E2A1                 .516        .578           .605
       E1A2                 .705        .708           .668
       E2A2                 .347        .336           .411
.8     E1A1                 .909        .904           .901
       E2A1                 .715        .732           .683
       E1A2                 .782        .796           .732
       E2A2                 .667        .551           .473
.9     E1A1                 .942        .942           .959
       E2A1                 .855        .845           .780
       E1A2                 .769        .751           .793
       E2A2                 .526        .634           .532

Total χ²                                9.358          59.665

PARAMETER VALUES

Case    π      N        δ       μ
I       .6     2.232    .582    .031
I       .7     1.828    .717    .089
I       .8     2.837    .715    .084
I       .9     3.245    .382    .009
II      all    2.342    .613    .032

ᵃ The two sets of predictions have been derived from the W-S model. In case I, parameters have been estimated at each level of π; in case II, a single set of parameter estimates has been employed. Consequently, the χ²'s are distributed on 4 and 13 df, respectively.

The analyses of this section yield two major conclusions. First, the results of these analyses are wholly consistent with those of previous studies even though the present study differed from its predecessors in using a self-paced procedure in order to better investigate response speeds. It would appear that the introduction of the self-paced procedure does not markedly alter the basic psychological process involved in probability learning. The second point worth noting is that, with the exception of some negative recency at π of .6, the data are completely consistent with stimulus sampling models, particularly with the W-S model, which is able to handle the overshooting of π in Fig. 1 and gives a good account of the first-order conditional response probabilities. Therefore, we have good reason to attempt to apply the model to the response time data, which will be considered next.

RESPONSE SPEEDS

In order to obtain a more symmetric distribution of the data, all analyses were performed on response speeds, the reciprocals of the latencies. The mean speeds for blocks of A1 and A2 responses for each of the experimental groups are plotted in Fig. 3. The results are clear: speeds increase during the experimental session; they increase as a result of increases in π, but more markedly for A1 responses than for A2 responses; and A1 responses are faster than A2 responses. The results of F tests of these effects (Trials, π, π × Response, and Response) were all significant at the .01 level. Although the curves are still rising slightly, the differences between response speeds for the third and fourth blocks are not significant. Therefore, the speeds in the fourth block were treated as asymptotic in the W-S analysis. These results are important for their methodological implications since response speeds and probabilities have not appeared to be correlated in previous simple choice studies (Friedman et al., 1964; Gerjuoy et al., 1964; Stevenson and Zigler, 1958).
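The speed transformation described above can be illustrated with a short sketch. It assumes latencies expressed in seconds, so that speeds are in responses per second; the paper records decision time to the nearest millisecond but does not state the unit used for the plotted speeds, and the function name is hypothetical.

```python
from statistics import mean


def mean_speed(latencies_sec):
    """Mean response speed: the mean of the reciprocal latencies."""
    return mean(1.0 / t for t in latencies_sec)


latencies = [0.5, 1.0]              # hypothetical latencies in seconds
print(mean_speed(latencies))        # 1.5
print(1.0 / mean(latencies))        # 1.333..., the reciprocal of the mean latency
```

The two printed values differ because averaging reciprocals is not the same as taking the reciprocal of an average; the analyses described here average the speeds themselves.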


We may better understand the discrepancy between our results and those of previous investigators if we assume that response time in the probability-learning situation consists of two components: decision time and reaction time. In previous studies, a trial consisted of a ready signal followed by a response which in turn was followed by the event for that trial; some fixed time later (usually two or three seconds), the signal appeared again, initiating the next trial. The decision for trial n + 1 is presumably begun at the onset of the event on trial n and is probably completed well before the signal for the next response. Since the response-time measure obtained in these studies is the time between the signal and the response, it reflects only simple reaction time, which is undoubtedly less sensitive to manipulations of π than is decision time. In the present study, the ready signal is absent. The subject moves at his own pace and the measure obtained is the time from the event on trial n to the response on trial n + 1. This measure should reflect both decision and reaction time, with π primarily influencing the more sensitive decision-time component. Further support for this argument may be found in discrimination studies (Erickson, 1966; Wollen, 1963) in which the ready signal is a discriminative stimulus which provides information necessary to the decision. In these studies, the measure of time between the signal and response on a trial does include a decision-time component, and therefore it is not surprising that π has had significant effects. Ideally, the investigator should measure response time from that point at which the subject begins to make a decision. This moment should coincide with the presentation of information relevant to the next choice; in the specific instance of the simple two-choice study, a self-paced procedure is an adequate solution to the problem of measuring decision time.

FIG. 3. Response speeds for each block and π level. [Panels: A1 and A2; axes: speed against blocks of trials, 1 to 4; one curve for each π level, .6 to .9.]

In addition to their methodological implications, the data of Fig. 3 are also of theoretical import. The results flatly contradict LaBerge's (1959) model. LaBerge assumes that neutral elements, not conditioned to either A1 or A2, are present at the beginning of the experiment. On each trial, samples of elements are obtained until an overt response occurs; all elements in the sample become conditioned to the correct response for that trial. In this way, the neutral elements are all asymptotically conditioned to an overt response. Only one sample will be required to evoke a response and, since response time depends on the number of samples of elements obtained on a trial, response time will be constant at asymptote. It should not vary as a function of either the value of π or the particular response made. Since both of these factors are very significant sources of variance in the present study, there is no reason to pursue the LaBerge model any further.

Predictions about the relation between A1 and A2 response times and about response time as a function of π can also be derived for the W-S model. Assuming that each stimulus element is either weakly or strongly conditioned to exactly one response, and further assuming that there is a distribution of response speeds with mean ψ associated with the strong state of conditioning, and a distribution with mean ω associated with the weak state of conditioning, the expected asymptotic speeds of A1 and A2 responses are, respectively,

(1)

(2)

where φ = δ/μ. Subtracting the second expression from the first, it is clear that asymptotic A1 speeds will be higher than asymptotic A2 speeds whenever ψ > ω. The significant difference between A1 and A2 response speeds can therefore be interpreted within the framework of the model. To obtain predictions about the role of π, we take the first derivative of each of the above expressions with respect to π. We find that asymptotic A1 speeds should increase, and asymptotic A2 speeds decrease, as a function of π if ψ > ω, as we have assumed. Note that this result depends on the assumption that φ, ψ, and ω are invariant over levels of π.

The A1 response speeds clearly increase as a function of π; unfortunately, so also do the A2 speeds. More precisely, the A2 response speeds show little differentiation at π's of .6, .7, and .8, but are quite high at π of .9. (If latencies are plotted, the data are even less in accord with the W-S model. The A2 latencies at .6, .7, and .8 exhibit about as much spread, and in the same direction, as do the A1 latencies.) The W-S model is unable to describe these results unless we assume that the values of the parameters of the model vary as a function of π. We will make this assumption when we later attempt to fit some conditional speed statistics.

The W-S model fails to describe the data of Fig. 3 (assuming parameter invariance over levels of π) because of the symmetric relation between Eqs. 1 and 2. Such symmetry is typical of the models developed to describe response probabilities and seems a natural extension to response speeds. Consequently, many models of choice behavior will, if extended to response speeds, incorrectly predict negative correlations between A1 and A2 response speeds over levels of π. For the same reason, models of choice reaction time such as those formulated by Audley (1960), Bower (1959), or Kintsch (1963) will also be inadequate. However, the predicted result is typical in studies of choice reaction time (e.g., Falmagne, 1965). Thus, the data are not only inconsistent with our models, they are inconsistent with results from a related area of inquiry. In the face of the models and the choice reaction-time data, it is tempting to conclude that our data are an artifact of our method; however, Wollen (1963) has obtained data consistent with ours using a discrimination situation. When a signal previously associated with partial reinforcement was presented in an extinction test (no event appeared following each prediction), both fast and slow responders exhibited a positive correlation between A1 and A2 response speeds over values of π. Considering that Wollen's data were obtained under conditions very different from those of the present experiment, but exhibited a similar trend, it appears that the results of Fig. 3 may be typical of prediction situations, both self- and experimenter-paced.
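The qualitative W-S predictions discussed above (A1 faster than A2, and A1 and A2 speeds moving in opposite directions as π increases) follow from treating each marginal speed as a weighted average of ψ and ω. The sketch below illustrates this under a stated assumption: the strong-state weights are taken to be π/(π + φ(1 − π)) for A1 and (1 − π)/((1 − π) + φπ) for A2, with φ = δ/μ. These weights are one form consistent with the statements in the text, not expressions quoted from the paper, and all function names and numerical values are hypothetical.

```python
def strong_weights(pi, phi):
    """Assumed asymptotic probability of the strong state, given A1 and given A2."""
    w1 = pi / (pi + phi * (1 - pi))
    w2 = (1 - pi) / ((1 - pi) + phi * pi)
    return w1, w2


def marginal_speeds(pi, phi, psi, omega):
    """Expected A1 and A2 speeds as weighted averages of psi (strong) and omega (weak)."""
    w1, w2 = strong_weights(pi, phi)
    return w1 * psi + (1 - w1) * omega, w2 * psi + (1 - w2) * omega


def solve_psi_omega(pi, phi, s1, s2):
    """Recover psi and omega from two marginal speeds (requires pi != 0.5)."""
    w1, w2 = strong_weights(pi, phi)
    psi_minus_omega = (s1 - s2) / (w1 - w2)
    omega = s1 - w1 * psi_minus_omega
    return omega + psi_minus_omega, omega


# Hypothetical parameter values, held constant over levels of pi.
phi, psi, omega = 8.0, 3.0, 1.4
for pi in (0.6, 0.7, 0.8, 0.9):
    s1, s2 = marginal_speeds(pi, phi, psi, omega)
    print(pi, round(s1, 3), round(s2, 3))
```

With ψ > ω and invariant parameters, the printed A1 speeds rise with π while the A2 speeds fall, which is the negative correlation that the observed data fail to show. `solve_psi_omega` mirrors the idea of solving two marginal-speed equations simultaneously for ψ and ω once the weights are fixed.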
Figures 4-7 present response speeds conditional on the length of the preceding run of homogeneous events. The most distinguishing characteristic of these data is their variability, particularly when π = .9 (Fig. 7). Nevertheless, some trends are at least indicated. The A1 response speed curve generally shows an increment with the first in a run of E1's, continues to drift upward (although the curves are rarely monotonic), and is usually higher than all other response speed curves. These results are consistent with the W-S model. The speed of an A1 response given j preceding E1's will be the weighted average of the parameters ψ and ω; the weights are the probability of being in the strong A1 state and the probability of being in the weak A1 state, given that an A1 response had been made and that j E1's had just occurred. From the axioms of the model (Myers and Atkinson, 1964), it is clear that the probability of being in the strong state, the weight for ψ, will increase as run length increases. The same reasoning should apply to A2 response speeds, conditional upon the length of the preceding run of E2's. However, in this case there are no clear indications of positive recency.

With the exception of the data for π = .7 (Fig. 5), there is a decreasing trend in the A1 response speeds conditional upon runs of E2's. This is again what the W-S model would predict. As the run of E2's grows longer, there is an increased probability that an element conditioned to A1 will be only weakly conditioned; consequently, the parameter ω receives an increasingly larger weight in the model. When π = .7, we observe what might be characterized as negative recency; in all four blocks, the initial decline in the speed of A1 responses conditional on E2 run length is followed by a subsequent rise. Such a result would not be predicted by a stimulus sampling model, nor is there any obvious reason why it should be observed only when π = .7. The A2 speeds, conditional on E1 runs, should also decline with run length. This is generally the case when π = .6. Under other conditions, no clear pattern emerges.

Although many of the trends in Figs. 4-7 are consistent with qualitative predictions derived from the W-S model, the curves fluctuate too much to permit a strong conclusion about the adequacy of the model. It is obvious that the present data set must be supplemented by others, preferably with more reliable data points. Future investigators might well employ many more subjects in each event sequence, and effort would best be repaid if concentrated on π's of .6 and .7, where substantial amounts of data can be obtained for E2 runs.

FIG. 4. Response speeds conditional upon the length of the preceding run at π = .6 (see Fig. 7 for legend).

FIG. 5. Response speeds conditional upon the length of the preceding run at π = .7.

FIG. 6. Response speeds conditional upon the length of the preceding run at π = .8.

FIG. 7. Response speeds conditional upon the length of the preceding run at π = .9. [Horizontal axis: run length.]

We next consider the effect of the response-event combination upon response speed on the following trial. Table 2 presents average response speeds following correct (s_cor) and incorrect (s_inc) responses for the fourth trial block. It is clear that, at all π levels, response speeds are higher following correct responses. Some insight into why this is so may be obtained by further analyzing these conditional speeds. In Table 3, both s_cor and s_inc have been separated into two components, speed of a repetition of the previous response and speed of a switch from the previous response. Thus we have four statistics: s_cor,rep, the average speed of a repetition of a previously correct response; s_cor,sw, the average speed of a switch from a previously correct response; s_inc,rep, the average speed of a repetition of a previously incorrect response; and s_inc,sw, the average speed of a switch from a previously incorrect response. Briefly, the results may be summarized as

$$s_{\mathrm{cor,rep}} > s_{\mathrm{cor,sw}} > s_{\mathrm{inc,rep}} \approx s_{\mathrm{inc,sw}}. \tag{3}$$

The only exception, s_cor,rep < s_cor,sw at π = .6, is reversed in earlier trial blocks.

TABLE 2

OBSERVED AND PREDICTED AVERAGE RESPONSE SPEEDS FOLLOWING CORRECT AND INCORRECT RESPONSES IN LAST BLOCKᵃ

            s_cor                    s_inc
 π     Observed   Predicted     Observed   Predicted      ψ        ω
.6     1.596      1.484         1.428      1.556          .000     1.604
.7     1.835      1.755         1.420      1.517         3.093     1.364
.8     2.144      2.152         1.611      1.932         3.386     1.605
.9     3.386      3.295         2.795      3.202         4.587     3.031

ᵃ The predictions in this table and in Table 3 are derived from the W-S model. The appropriate theoretical expressions appear in Myers and Atkinson (1964).

TABLE 3

OBSERVED AND PREDICTED AVERAGE RESPONSE SPEEDS OF SWITCHES AND REPETITIONS FOLLOWING CORRECT AND INCORRECT RESPONSES IN LAST BLOCK

            s_cor,rep                s_cor,sw                 s_inc,rep                s_inc,sw
 π     Observed   Predicted     Observed   Predicted     Observed   Predicted     Observed   Predicted
.6     1.522      1.471         1.816      1.525         1.467      1.554         1.392      1.559
.7     1.892      1.791         1.511      1.552         1.379      1.548         1.454      1.491
.8     2.320      2.203         1.945      1.760         1.565      1.975         1.659      1.886
.9     3.408      3.311         3.057      3.112         2.854      3.195         2.711      3.212

These relationships may be explained by assuming that subjects are reinforced for strategies involving a sequence of responses. Therefore, a wrong response increases response time because the subject is forced to consider alternative strategies. The switching from a previously correct response may be slower than the repetition of a previously correct response because the former may involve a momentary conflict between a strategy which dictates a switch in response and the tendency to repeat a reinforced response.

The relationships among response speeds, described above, are also consistent with the stimulus sampling model developed by Myers and Atkinson. Since that model can predict alternative trends as well, any meaningful evaluation requires that parameters be estimated and numerical predictions be generated. The estimates of μ, δ, and N are those which were obtained in fitting the conditional response probabilities (Case I of Table 1). Estimates of ψ and ω, presented in Table 2, were obtained by simultaneously solving Eqs. 1 and 2 with the observed marginal response speeds inserted. (At π = .6 the obtained estimate of ψ was negative and the value of zero was substituted.) The five parameter values were then inserted into the theoretical expressions presented in the appendix. The resulting predictions are presented in Tables 2 and 3. Considering Table 2 first, it is clear that the predicted values of s_cor generally exceed the predicted values of s_inc, as in the observed data; in this respect, the sole failure of the model is at π = .6. However, the predicted s_cor are consistently too low and the predicted s_inc are consistently too high. Turning to Table 3, we note that the predicted values also fail to reflect the greater speed of a switch following a correct response (s_cor,sw) when compared to speeds following incorrect responses.
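The conditional speed statistics of Tables 2 and 3 can be tallied from trial records along the lines sketched below. The function name, the 1/2 coding of events and responses, and the dictionary keys are assumptions made for illustration; `speeds[n]` is taken to be the speed of the response on trial n.

```python
from collections import defaultdict
from statistics import mean


def conditional_speeds(events, responses, speeds):
    """Mean speed on trial n+1, grouped by the outcome on trial n.

    Keys are (previous response correct?, previous response repeated?).
    """
    groups = defaultdict(list)
    for n in range(len(events) - 1):
        was_correct = responses[n] == events[n]
        repeated = responses[n + 1] == responses[n]
        groups[(was_correct, repeated)].append(speeds[n + 1])
    return {key: mean(values) for key, values in groups.items()}


events = [1, 1, 2, 1]
responses = [1, 2, 2, 1]
speeds = [2.0, 1.5, 1.0, 3.0]
print(conditional_speeds(events, responses, speeds))
# {(True, False): 2.25, (False, True): 1.0}
```

Pooling the two keys that share the same first element gives the speed following correct responses and the speed following incorrect responses; keeping them apart gives the repetition and switch components.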

CONCLUDING REMARKS

The methodological implications of the present study have been discussed in detail above. Briefly, the response-probability data are consistent with those previously obtained using an experimenter-paced procedure; however, the self-paced procedure of this study solves the problems associated with investigating response times.

The theoretical implications of the present study are somewhat more complicated. The LaBerge (1959) model clearly fails to describe any aspect of the speed data other than the increase in speed over trials. The Myers-Atkinson (1964) W-S model is able to account for the discrepancy in A1 and A2 speeds, the effect of π on A1 response speeds, the positive recency in the run curves, and the relative effects of correct and incorrect responses upon the speed of the next response. However, the model cannot account for the increase in A2 speeds with increased π without permitting the parameters to be independently estimated at each π level. Since the model implies parameter invariance over levels of π, and since no alternative theory of the relationship of parameters to π is available, the A2 marginal speeds present a problem for the model. Nor is the fit to conditional statistics satisfactory. Many of the differences between observed and predicted statistics are quite large and the predicted statistics are not consistently in the same order as the observed. Furthermore, at π = .6, the relative magnitudes of ψ and ω are not consistent with the assumption of the model. On balance, it appears that the model fails to describe the speed data adequately.

This failure may signal an inadequacy either in the underlying choice theory (i.e., stimulus sampling, weak and strong conditioning states) or in the extension of the theory to response speeds. The first alternative seems more likely since, if weak and strong conditioning states are assumed, the one further assumption necessary for dealing with speed data is very plausible; namely, that each state has an associated distribution of response speeds, the strong-state distribution having the higher mean. Difficulties in the underlying theory are also indicated by the consistent finding of parameter variability as a function of π (this paper as well as Heuckeroth, 1965; Myers and Atkinson, 1964; Myers et al., 1966), the failure to describe preasymptotic data (Heuckeroth and Myers, 1967; Heuckeroth, 1965), and the consistent underestimate of asymptotic variances of response probabilities (Heuckeroth, 1965; Heuckeroth and Myers, 1967). In the face of impressive fits to other statistics in the papers cited above, and in the absence of any better description of binary choice behavior, it may be premature to discard the W-S model. However, there is certainly sufficient evidence to suggest the need for the development and testing of alternative models. The response speed data, the inability of stimulus-sampling models to deal with certain sequential dependencies (Anderson, 1964), and the failure to describe preasymptotic data with the weak-strong model all suggest that a model which incorporates axioms about patterns of events and strategies of subjects may provide a viable alternative to the stimulus-sampling approach.

APPENDIX

The values of n_jk presented in Table 4 are the total numbers of asymptotic trials on which E_j and A_k both occurred, pooling over all subjects in each group. These values are the denominators for the first-order conditional statistics presented in Table 1.

TABLE 4

VALUES OF n_jk

π      n_11    n_12    n_21    n_22
.6      743     459     467     291
.7     1004     424     336     176
.8     1200     284     202      57
.9     1592     186     143      19
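As a rough consistency check on Table 4, the marginal proportions of E_1 events and A_1 responses can be computed from the cell counts. The sketch below is illustrative Python, not part of the original analysis. It assumes the columns are n_11, n_12, n_21, n_22 in that order (first subscript for the event, second for the response) and reads the garbled second entry of the π = .6 row as 459.

```python
# Cell counts from Table 4, keyed by the level of pi.
# Column order (n_11, n_12, n_21, n_22) is an assumption about the layout.
TABLE_4 = {
    0.6: (743, 459, 467, 291),
    0.7: (1004, 424, 336, 176),
    0.8: (1200, 284, 202, 57),
    0.9: (1592, 186, 143, 19),
}


def marginal_proportions(n11, n12, n21, n22):
    """Return (proportion of E_1 trials, proportion of A_1 trials)."""
    total = n11 + n12 + n21 + n22
    p_e1 = (n11 + n12) / total  # E_1 occurred, with either response
    p_a1 = (n11 + n21) / total  # A_1 occurred, with either event
    return p_e1, p_a1


proportions = {pi: marginal_proportions(*cells) for pi, cells in TABLE_4.items()}

for pi, (p_e1, p_a1) in sorted(proportions.items()):
    print(f"pi = {pi:.1f}: P(E_1) = {p_e1:.3f}, P(A_1) = {p_a1:.3f}")
```

At π = .6, for example, this gives an E_1 proportion of about .613 and an A_1 proportion of about .617, which is at least consistent with the assumed column order.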

REFERENCES

ANDERSON, N. H. An evaluation of stimulus sampling theory: Comments on Professor Estes' paper. In A. W. Melton (Ed.), Categories of human learning. New York: Academic Press, 1964. Pp. 129-144.

ATKINSON, R. C., AND ESTES, W. K. Stimulus sampling theory. In R. D. Luce, R. R. Bush, and E. Galanter (Eds.), Handbook of mathematical psychology, Vol. 2. New York: Wiley, 1963. Pp. 121-268.

AUDLEY, R. J. A stochastic model for individual choice behavior. Psychological Review, 1960, 67, 1-15.

BOWER, G. H. Choice-point behavior. In R. R. Bush and W. K. Estes (Eds.), Studies in mathematical learning theory. Stanford: Stanford Univer. Press, 1959. Pp. 109-124.

CALFEE, R. C. Long-term behavior of rats under probabilistic reinforcement schedules. Technical Report No. 59, October, 1963, Psychology Series. Institute for Mathematical Studies in the Social Sciences, Stanford University, Stanford, California.

DERKS, P. L. Effect of run length on the gambler's fallacy. Supplementary Report. Journal of Experimental Psychology, 1963, 65, 213-214.

EDWARDS, W. Probability learning in 1,000 trials. Journal of Experimental Psychology, 1961, 62, 385-394.

ERICKSON, J. R. On learning several simultaneous probability-learning problems. Journal of Experimental Psychology, 1966, 72, 182-189.

ESTES, W. K. Component and pattern models with Markovian interpretations. In R. R. Bush and W. K. Estes (Eds.), Studies in mathematical learning theory. Stanford: Stanford Univer. Press, 1959. Pp. 9-52.

FALMAGNE, J. C. Stochastic models for choice reaction time with applications to experimental results. Journal of Mathematical Psychology, 1965, 2, 77-124.

FRIEDMAN, M. P., BURKE, C. J., COLE, M., ESTES, W. K., KELLER, L., AND MILLWARD, R. B. Two-choice behavior under extended training with shifting probabilities of reinforcement. In R. C. Atkinson (Ed.), Studies in mathematical psychology. Stanford: Stanford Univer. Press, 1964. Pp. 250-316.

GERJUOY, I. R., GERJUOY, H., AND MATHIAS, R. Probability learning: Left-right variables and response latency. Journal of Experimental Psychology, 1964, 68, 344-350.

HEUCKEROTH, O. Choice behavior and reward structure: A test of three mathematical models. Technical Report No. 21, December, 1965; Studies in Choice Behavior. University of Massachusetts, Amherst, Massachusetts.

HEUCKEROTH, O., AND MYERS, J. L. Choice behavior and reward structure: Preasymptotic statistics. Journal of Mathematical Psychology, 1967, 4, 120-139.

JONES, M. R., AND MYERS, J. L. A comparison of two methods of event randomization in probability learning. Journal of Experimental Psychology, 1966, 72, 909-911.

KINTSCH, W. A response time model for choice behavior. Psychometrika, 1963, 28, 27-32.

LABERGE, D. A model with neutral elements. In R. R. Bush and W. K. Estes (Eds.), Studies in mathematical learning theory. Stanford: Stanford Univer. Press, 1959. Pp. 53-64.

LINDMAN, H., AND EDWARDS, W. Supplementary report: Unlearning the gambler's fallacy. Journal of Experimental Psychology, 1961, 62, 630.

MYERS, J. L., AND ATKINSON, R. C. Choice behavior and reward structure. Journal of Mathematical Psychology, 1964, 1, 170-203.

MYERS, J. L., SUYDAM, M. M., AND HEUCKEROTH, O. Choice behavior and reward structure: Differential payoff. Journal of Mathematical Psychology, 1966, 3, 458-469.

STEVENSON, H. W., AND ZIGLER, E. F. Probability learning in children. Journal of Experimental Psychology, 1958, 56, 185-192.

SUPPES, P., AND ATKINSON, R. C. Markov learning models for multiperson interactions. Stanford: Stanford Univer. Press, 1960.

WOLLEN, K. A. Relationship between choice time and frequency during discrimination training and generalization tests. Journal of Experimental Psychology, 1963, 66, 474-484.

RECEIVED: August 2, 1966