LEARNING AND MOTIVATION 23, 170-182 (1992)

The Effect of Signalled Reinforcement on a Synthetic VI Schedule

PHIL REED
University of Birmingham, United Kingdom

TODD R. SCHACHTMAN
University of Missouri-Columbia

AND

J. N. P. RAWLINS
University of Oxford, United Kingdom
The present experiment investigated whether a signal for reinforcement would facilitate the acquisition of a required response sequence, despite enhanced acquisition of the sequence resulting in a lower response rate. Two groups of hungry rats responded on a synthetic variable interval (VI) schedule for food reinforcement. On this schedule, reinforcement was available for the first response to the target lever after a VI 60-s schedule had elapsed. However, reinforcement was also made contingent on a second criterion, which required that the subject emit a specified number of responses to a second lever prior to the response to the target lever. Rats that received a brief tone simultaneously with the delivery of reinforcement performed the task requirements more accurately than rats lacking a reinforcement signal. This enhanced performance of the required response sequence was accompanied by a lower rate of response to the target lever. This finding provides support for the notion that reinforcement signals on a conventional VI schedule attenuate responding due to a more "accurately" structured emission of the response that precedes reward. © 1992 Academic Press, Inc.
This research was supported in part by a grant from the United Kingdom Medical Research Council. Requests for reprints should be addressed to Phil Reed, Department of Psychology, University College London, Gower Street, London WC1E 6BT, England, or to Todd R. Schachtman, Department of Psychology, University of Missouri, Columbia, MO 65211.

A brief, response-contingent stimulus presented immediately prior to the delivery of reward attenuates response rates on variable interval (VI) schedules
(Pearce & Hall, 1978; Tarpy, Lea, & Midgley, 1983; Williams, 1982). There have been several accounts of this finding, including the reinforcement of response bursts (e.g., Williams & Heyneman, 1982), sign-tracking (Iversen, 1981), response efficiency (Roberts, Tarpy, & Lea, 1984), and competition for associative strength (e.g., Pearce & Hall, 1978). An alternative view of this effect is that the subject learns about the particular sequence of behavior that most often precedes reinforcement on a free-operant schedule, and that the reinforcement signal promotes the emission of this sequence of responding (Reed, 1989). A number of authors have suggested that on VI schedules animals learn a response which comprises a long interresponse time (Morse, 1966; Wearden & Clark, 1988). Enhanced emission of a behavioral sequence which involves only a single response (e.g., a lever press or key peck) following a relatively long pause from responding would generate low levels of responding relative to a condition in which this response pattern was not promoted by the signal.

The generation of definable response sequences has been the focus of a number of investigations (e.g., Grayson & Wasserman, 1979; Hawkes & Shimp, 1975; Schwartz, 1980, 1982; Vogel & Annau, 1973; Wasserman, Deich, & Cox, 1984). Such studies have arranged reinforcement to be contingent upon one subset of all possible response sequences, and noted a selective increase only in the frequency of the reinforced sequence (see Grayson & Wasserman, 1979; Wasserman et al., 1984). In some studies, despite there being no constraint on the manner in which a specified number of responses was emitted, the subjects came to emit one response sequence in preference to all others (e.g., Schwartz, 1980, 1984). These results suggest that subjects can acquire a response sequence even when there is no explicit requirement to emit any one particular sequence.

Reed, Schachtman, and Hall (1991) demonstrated that a signal for reinforcement will enhance acquisition of the response sequence that precedes reinforcement. When reinforcement was contingent upon a particular response sequence, a signal for reinforcement promoted acquisition of this sequence, and also enhanced terminal levels of performance, relative to a condition lacking a signal. However, it is possible that in the study reported by Reed et al. (1991) a signal for reinforcement served only to promote response rate per se rather than the acquisition of a response sequence. This possibility arises because, in those experiments in which a signal-induced enhancement of response-sequence acquisition was noted, there was also a corresponding increase in overall response rate. Thus, the latter effect on response rate may have been, at least in part, responsible for the effect on response-sequence acquisition. To demonstrate that enhanced response rates are not entirely responsible for enhanced response-sequence acquisition, it is necessary to show that an increased emission of a response sequence is possible when there is a concomitant decrease in overall response rate.
On a free-operant VI schedule, a signal for reinforcement attenuates overall response rate. Thus, if the account given by Reed (1989) of the action of a reinforcement signal is correct, then it should indeed be possible to obtain an enhanced emission of a response sequence that leads to lower overall response rates.

Shimp (1982) devised a technique that enables a dissociation between the simple response-strengthening effects of a treatment and the shaping of particular response sequences. This technique has been termed a "synthetic VI schedule." On such a schedule, a response to one manipulandum (the target response) is reinforced, but only if two criteria are satisfied: first, the VI schedule has programmed a reinforcer; and second, the subject has previously emitted a specified number of responses to a second manipulandum (the nontarget response). On a simple VI schedule, reinforcement is available only after a period of time has elapsed, and should thus provide reinforcement for a long interresponse time (IRT). With the synthetic VI schedule, it is possible to observe the effectiveness of reinforcement in shaping a pattern of behavior performed during the IRT (e.g., the subject's performance on the nontarget manipulandum). The synthetic VI schedule has been extensively investigated, and the relationship between response and reinforcement rates on it has been found to be similar to that observed on a simple VI schedule (Shimp, 1979). Thus, the procedure appears to be a good analogue of a VI schedule, but it has the advantage of allowing a more precise analysis of the shaping properties of reinforcement in addition to the effects of reinforcement on response rate.

If a signal for reinforcement does promote the acquisition of response sequences on VI schedules, this would necessarily entail a reduction in overall response rate through the promotion of behavioral sequences comprising long IRTs. On a synthetic VI schedule, a signal for reinforcement should, therefore, lower the response rate to the target lever and also increase the accuracy with which the subject performs the required number of nontarget responses. Such a result would suggest that a signal for reinforcement promotes the acquisition of response sequences rather than merely response rate.
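To make the two-criterion contingency concrete, here is a minimal Python sketch. It is ours, not part of the original study; the function name is illustrative, and the reading that runs at least as long as the requirement satisfy the schedule is our assumption (the accuracy analysis reported later scores exact-length runs).

```python
def target_response_reinforced(interval_elapsed: bool,
                               consecutive_nontarget: int,
                               required_nontarget: int) -> bool:
    """Decide whether a press on the target lever is reinforced under a
    synthetic VI schedule: the VI criterion (a reinforcer has been
    programmed) and the sequence criterion (enough consecutive nontarget
    presses immediately preceded this press) must both hold."""
    return interval_elapsed and consecutive_nontarget >= required_nontarget

# e.g., with a requirement of five nontarget presses:
# target_response_reinforced(True, 5, 5)  -> True
# target_response_reinforced(True, 4, 5)  -> False
# target_response_reinforced(False, 5, 5) -> False
```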
METHOD

Subjects

The subjects were 32 male Lister hooded rats. They were all 3-4 months old at the start of the study, had a free-feeding body weight range of 290-435 g, and were maintained at 80% of this weight throughout the experiment.
The subjects had previously served in a study assessing the effect of prior exposure to a visual stimulus on subsequent conditioning of that stimulus with food reinforcement, but they were naive with respect to lever pressing and to the auditory stimulus used in the present experiment. The rats were housed in pairs and had water constantly available in the home cage.

Apparatus

Eight identical operant conditioning chambers (Campden Instruments Ltd.) were employed in the present experiment. Each chamber was housed inside a light- and sound-attenuating enclosure. A 65-dB(A) background masking noise was supplied by a ventilating fan. Each chamber was equipped with two retractable levers. The food tray, into which reinforcement (one 45-mg food pellet) could be delivered, was covered by a clear, hinged Perspex flap, centrally located between the levers. A speaker was mounted on the roof of the chamber, through which a 100-dB(A) tone could be delivered. The chamber was not illuminated during the present experiment.

Procedure

The subjects were magazine trained in two 30-min sessions, during which food pellets were delivered according to a variable time 60-s schedule. During the first session, the flap covering the magazine was taped open to allow easy access to the pellets. In the second session, the flap was lowered to its standard resting position. Subjects were then trained to lever press by reinforcing every response [i.e., a continuous reinforcement (CRF) schedule]. Two sessions of CRF were given, one on each lever, and each session lasted until 75 reinforcements had been earned. Following the second CRF session, the animals were divided into two equal groups (n = 16).

During Phase 1 of the experiment, reinforcement was programmed for the first press on the target lever (the right lever for half the subjects and the left lever for the remainder) after a VI 60-s schedule (range 3-180 s) had elapsed, but only if that target response was immediately preceded by at least one response emitted on the nontarget lever since the last reinforcer was delivered (the nontarget response could be made before or after the interval elapsed). If, after the interval had elapsed, the subject responded on the target lever without having emitted a nontarget response, or if the last response emitted prior to the target response had itself been to the target lever (i.e., two target-lever responses in a row), then reinforcement was withheld until a nontarget-target response sequence had been emitted. For one group (Group Unsig), these were the only contingencies in operation.
TABLE 1
Rates of Reinforcement (Reinforcers per min) Obtained during the Final Two-Session Block of Each Phase of the Study for Both Groups

                      Phase 1    Phase 2    Phase 3
Group signaled         0.92       0.89       0.45
Group unsignaled       0.95       0.91       0.44
The other group (Group Sig) also received a 500-ms, 100-dB(A) tone stimulus [35 dB(A) above background] which occurred simultaneously with the onset of the delivery of reinforcement. It should be noted that the operation of the pellet dispenser also provided a cue simultaneously with the delivery of reinforcement. This cue, however, produced only a click 10 dB(A) above background, and was presumably not salient enough to act as a reinforcement signal.

During Phase 2, the two groups received the same signalling contingencies as detailed above, but they now had to make five consecutive responses to the nontarget lever prior to a target response. The nontarget responses could be made before or after the interval elapsed, but the target response had to be made after the VI 60-s schedule had elapsed. In Phase 3, the same contingencies were in force for the two groups, except that the subjects now had to emit nine consecutive nontarget lever presses prior to the target lever response to earn reinforcement. Each phase of the experiment consisted of 28 40-min sessions.
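The contingencies above can be summarized in a short simulation. The following Python fragment is a hypothetical reconstruction, not the original control code: next_response, deliver_pellet, and present_tone are illustrative stubs, and the uniform sampling of intervals is our simplification (it respects the 3-180 s range but not necessarily the distribution that produced the 60-s mean).

```python
import random

def next_response():
    """Stub standing in for the rat: a random latency (s) and lever choice."""
    return random.expovariate(0.5), random.choice(['target', 'nontarget'])

def deliver_pellet():
    pass  # stand-in for operating the 45-mg pellet dispenser

def present_tone(duration_ms=500):
    pass  # stand-in for the 500-ms, 100-dB(A) tone given to Group Sig

def run_session(required_run, signalled, session_s=2400):
    """Simulate one 40-min session. required_run is 1, 5, or 9 (Phases 1-3);
    signalled is True for Group Sig."""
    armed_at = random.uniform(3, 180)  # VI interval, range 3-180 s (uniform
                                       # sampling is our assumption)
    clock, run_length, reinforcers = 0.0, 0, 0
    while clock < session_s:
        latency, lever = next_response()
        clock += latency
        if lever == 'nontarget':
            run_length += 1            # extend the run of consecutive nontarget presses
        elif clock >= armed_at and run_length >= required_run:
            deliver_pellet()           # both criteria met: interval elapsed, run complete
            if signalled:
                present_tone()
            reinforcers += 1
            run_length = 0
            armed_at = clock + random.uniform(3, 180)   # arm the next interval
        else:
            run_length = 0             # an unreinforced target press breaks the run
    return reinforcers
```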
RESULTS

The mean reinforcement rates obtained by both groups over the last two-session block of each phase of the experiment are displayed in Table 1. Inspection of these data reveals that the rates of reinforcement were very similar for both groups. An analysis of variance (ANOVA) with Group (Sig versus Unsig) and Phase as factors was conducted on the data. A rejection level of p < .05 was adopted for this and all subsequent analyses. It is also worth noting that all analyses in the present study initially included counterbalancing (i.e., left lever as target versus right lever as target) as a factor, but in no case did the main effect or any interaction involving this factor reach significance; the data were therefore reanalyzed with the groups collapsed over the counterbalancing condition. The ANOVA revealed a significant main effect of Phase [F(2, 60) = 10.31], but no main effect of Group nor an interaction of the two factors (Fs < 1).

FIG. 1. Group-mean response rates to the target lever during each session for both groups, over all three phases of the study. Group Sig, signaled reinforcement. Group Unsig, unsignaled reinforcement.
Figure 1 displays the group-mean response rates emitted on the target lever during sessions in all three phases of the experiment. The data from Phase 1 reveal that both groups started the study with comparable response rates, but that, over the course of Phase 1, Group Sig came to emit a slightly higher response rate on the target lever than Group Unsig. A two-factor ANOVA (Group x Block) conducted on these data revealed a main effect of Block [F(13, 390) = 20.39], but there was no main effect of Group, nor was there an interaction between the factors (ps > .20). During Phase 2, response rates for Group Unsig on the target lever increased, but rates for Group Sig decreased from their terminal Phase 1 levels. A two-factor ANOVA (Group x Block) conducted on the Phase 2 data revealed a main effect of Group [F(1, 30) = 6.70] and an interaction [F(13, 390) = 10.03], but no main effect of Block (p > .10). The group difference reflected the lower rate of responding in Group Sig. During Phase 3, Group Sig continued to emit target-lever responses at a lower rate than Group Unsig. A two-factor ANOVA (Group x Block) revealed a main effect of Group [F(1, 30) = 9.05], but no main effect of Block or interaction between the factors (Fs < 1).

Rates of responding on the nontarget lever were also measured. Averaged over the final eight sessions of Phase 1, Group Sig emitted a mean of 10.3 responses per min on the nontarget lever and Group Unsig a mean of 8.1 responses per min. During the final eight sessions of Phase 2, the nontarget-lever response rate was 7.8 responses per min for Group Sig and 6.9 responses per min for Group Unsig. Over the final four two-session blocks of Phase 3, Group Sig responded at a mean of 4.8 responses per min on the nontarget lever; the corresponding score for Group Unsig was 4.2 responses per min. A two-factor ANOVA (Group x Phase) conducted on these data revealed no significant effects (Fs < 1).
FIG. 2. Group-mean frequencies of emission of response sequences, defined in terms of the number of nontarget responses emitted prior to a target response, averaged over the last eight sessions of training. Group Sig, signaled reinforcement. Group Unsig, unsignaled reinforcement. (Abscissa: response sequence length; separate panels for each phase.)
The number of consecutive nontarget-lever responses emitted prior to a target-lever response during the final eight sessions of each phase is displayed in Fig. 2. Inspection of the data for Phase 1 (when a single nontarget response was required) reveals that the most frequent response sequence in both groups was that containing one nontarget response prior to the target response; other sequences were emitted with a much lower frequency. There was very little difference between the groups in the emission of this sequence as a result of their signalling treatments. Inspection of the Phase 2 data (when five nontarget responses were required), however, reveals that Group Sig generally emitted response sequences containing a greater number of nontarget responses than Group Unsig; the most frequently emitted response sequence for Group Sig contained five nontarget responses.
In contrast, the dominant response sequences for Group Unsig contained either one or three nontarget responses. In Phase 3 (when nine nontarget responses were required), the dominant sequence length for both groups increased: nine nontarget responses were most frequent for Group Sig, and six nontarget responses were most often emitted by Group Unsig.

These data were analyzed in two ways to examine the effect of the signal for reinforcement: first, the dominant sequence length was assessed; and second, the accuracy of performance of the required sequence was examined. The mean number of nontarget responses emitted by each of the groups in each phase was calculated. For Group Sig, these scores were 1.9 responses in Phase 1, 4.8 in Phase 2, and 7.2 in Phase 3. The corresponding scores for Group Unsig were 2.3, 3.3, and 6.1 responses. The number of nontarget responses contained in the dominant response sequence emitted by each subject, averaged over the last eight sessions of each phase, was analyzed by a two-factor ANOVA (Group x Phase). This analysis revealed main effects of Group [F(1, 30) = 10.53] and Phase [F(2, 60) = 37.10], and an interaction between the factors [F(2, 60) = 7.54]. Analysis of the simple main effect of Group for each phase revealed no difference during Phase 1 (F < 1), but Group Sig emitted response sequences containing more nontarget responses than Group Unsig in Phase 2 [F(1, 60) = 6.97] and Phase 3 [F(1, 60) = 16.43].

To assess more directly the influence of the reinforcement signal on the accuracy of performing the required sequence (i.e., the degree to which responding in a particular phase conformed to the performance specifically required for reinforcement), the percentage of response sequences emitted by each group that contained exactly the number of nontarget responses required for reinforcement was analyzed. For Group Sig, 48% of the sequences emitted in Phase 1 were reinforced; this figure decreased to 30% in Phase 2 and to 19% in Phase 3. In Group Unsig, 41% of the sequences emitted in Phase 1 were reinforced, 15% in Phase 2, and 11% in Phase 3. A two-factor ANOVA (Group x Phase) conducted on these data revealed main effects of Group [F(1, 30) = 4.01] and Phase [F(2, 60) = 10.83], but no interaction between these factors (p > .10).
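The two measures just described can be stated compactly. The sketch below is ours; it assumes each trial is recorded as the number of consecutive nontarget presses that preceded a target press, and the example record is fabricated purely for illustration.

```python
from collections import Counter

def dominant_sequence_length(runs: list[int]) -> int:
    """Length of the most frequently emitted sequence, where each entry in
    `runs` is the count of consecutive nontarget presses preceding one
    target press (our assumed data format)."""
    return Counter(runs).most_common(1)[0][0]

def percent_accurate(runs: list[int], required: int) -> float:
    """Percentage of sequences containing exactly the required number of
    nontarget responses (the accuracy measure described in the text)."""
    return 100.0 * sum(r == required for r in runs) / len(runs)

# Illustrative (made-up) Phase 2 record for one subject; requirement = 5:
runs = [5, 5, 3, 5, 1, 5, 4, 5, 5, 2]
print(dominant_sequence_length(runs))   # -> 5
print(percent_accurate(runs, 5))        # -> 60.0
```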
DISCUSSION

The present study provides support for the notion that a signal for reinforcement enhances the acquisition of the preceding pattern of behavior, even when the promotion of this response sequence leads to lower overall rates of response. Moreover, the reinforcement rates obtained by the signalled and unsignalled groups were comparable. This fact was important to establish so that any group differences observed for response rate and emission of response sequences could be attributed to the shaping properties of the reinforcer, rather than to different reinforcement frequencies.

Examination of the data on the length of the response sequence demonstrated that, during Phases 2 and 3, rats receiving signalled reinforcement in general emitted longer nontarget-response sequences than rats not receiving a signal, and that the former subjects performed the required sequence more accurately than the latter. The finding that the signalled group was more accurate was important to establish, since it is possible for a subject to emit a longer sequence while displaying poor accuracy; for example, a subject could emit many more nontarget responses prior to a target response than the schedule required. Such behavior would imply poor learning about the response requirements.

Enhanced accuracy in performing the required response sequence may be accounted for in terms of the response-efficiency notion described by Roberts et al. (1984), which suggests that a signal for reinforcement allows the subject to learn about the schedule in operation. Specifically, the signal may increase sensitivity to the molar feedback function (i.e., the relationship between the overall rate of response and the overall rate of reinforcement) of the schedule. On a VI schedule, increases in response rate above a certain minimal level produce no further increase in the obtained rate of reinforcement. Roberts et al. (1984) claim that a signal presented along with reinforcement helps the subject to learn this relationship, and so signalled subjects emit only a low level of responding. Conversely, on a variable ratio (VR) schedule, there is a direct relationship between rate of response and rate of reinforcement, and a reinforcement signal may help the subject to learn this relationship better and, consequently, to respond at a higher rate than subjects with no signal for reinforcement. In the present experiment, a signal for reinforcement may have facilitated acquisition of the overall requirements of the schedule and promoted more efficient and, hence, accurate performance. However, there is a growing body of evidence demonstrating that reinforcement signals do not always enhance efficiency of performance, and such findings are difficult for an "efficiency" view to accommodate (e.g., Reed & Hall, 1988; Schachtman & Reed, in press; Tarpy, St. Claire-Smith, & Roberts, 1986).
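The contrast between the two molar feedback functions can be made concrete with toy functions (our illustration, not a model taken from Roberts et al., 1984): on a VI schedule the obtained reinforcement rate saturates near the programmed rate, whereas on a VR schedule it remains proportional to response rate.

```python
def vi_feedback(response_rate, programmed_rate=1.0):
    """Approximate reinforcers/min on a VI schedule: once responding is fast
    enough to collect each programmed reinforcer, further increases in
    response rate yield nothing extra (a deliberately crude form)."""
    return min(response_rate, programmed_rate)

def vr_feedback(response_rate, ratio=30):
    """Reinforcers/min on a VR schedule: proportional to response rate."""
    return response_rate / ratio

for rate in (5, 20, 60):   # responses per min
    print(rate, vi_feedback(rate), vr_feedback(rate))
# The VI column is flat at 1.0 beyond low rates; the VR column keeps rising.
```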
Several previous studies have provided evidence consistent with the notion that a signal for reinforcement enhances learning about the preceding pattern of responding (Reed, 1989; Reed, Schachtman, & Hall, 1988a; Schachtman & Reed, in press). These previous studies have found, however, that changes in the emission of a response sequence occurred in the same direction as the change in response rate. The present study provides the first clear evidence that increased learning about a response pattern may produce lower overall rates of target responding. Levels of responding to the nontarget lever were not greatly different in the two groups of subjects. This aspect of the data also suggests that any unconditioned effects of the auditory stimulus did not affect the results in any systematic manner.

There are a number of alternative mechanisms which may have led to the enhanced response-sequence learning shown here by subjects receiving signalled reinforcement. It is possible that the signal acted as a secondary reinforcer that established and maintained behavior more effectively than a delayed primary reinforcer. Since it takes a short time for a food pellet to be delivered to the magazine, combined with the time the subject takes to retrieve the reward, the presentation of a stimulus may serve as an immediate (conditioned) reinforcer for the group receiving signalled reinforcement. Alternatively, the conditioned reinforcement produced by the stimulus combined with primary reinforcement may provide a greater magnitude of reinforcement than primary reinforcement alone (see Tarpy & Roberts, 1985). The signal may also have served as a marking stimulus that enhanced learning about the required response sequence (see Lieberman, McIntosh, & Thomas, 1979). On the basis of the present data, however, it would be unwise to speculate further on the possible involvement of such mechanisms.

It is worth noting that the signal for reinforcement could have overshadowed the target response; that is, the signal and the target response may have competed with each other for associative strength (Pearce & Hall, 1978). This competition would serve to reduce the rate of response to the target lever. However, this account cannot explain how the signal promoted more accurate emission of the required response sequence on the nontarget lever. Moreover, the instrumental "overshadowing" account (Pearce & Hall, 1978) cannot explain the enhanced rates of responding noted on some schedules of reinforcement (e.g., Reed et al., 1988a, 1988b; Schachtman, Reed, & Hall, 1987).

There are reports which suggest that, under some circumstances, a signal for reinforcement will reduce overall response rates in a manner consistent with overshadowing; for example, St. Claire-Smith (1987) and Williams (1982) both report such findings. However, support for the view that the process of overshadowing produced these effects is not compelling. St. Claire-Smith demonstrated that subjects which had previously experienced signalled reward on a VI schedule (an experience that led to a low rate of response) were not as affected by subsequent reinforcer devaluation as were subjects which previously had no signal. This result implies that subjects with signalled reward had not acquired a response-reinforcer association as strongly as subjects without previous signalled reward; since there was little associative strength to devalue in signalled-reward subjects, the devaluation treatment would not produce such pronounced results. Apart from complications in interpreting studies in which groups begin treatment phases with differing baselines, there is no reason to believe that anything other than generalization decrement produced these results.
The signalled-reinforcement group received a reinforcer consisting of an immediate secondary reinforcer and a slightly delayed primary reward, whereas the unsignalled-reward group received a reinforcer consisting of just a briefly delayed primary reward. In the devaluation procedure, only the primary reinforcer was devalued, and it is possible that devaluing only this reward was not sufficient to effectively devalue the full reinforcer that maintained behaviour for the group receiving the signal (i.e., the secondary plus the primary reward). Hence, it is not surprising that the devaluation treatment was not as effective for this group as for the unsignalled-reward group. The study by Williams (1982) is interpretable in terms of sign-tracking effects: a localized visual stimulus presented temporally proximate to reinforcement will evoke more sign-tracking responses than one temporally distant from the reward (see Reed, under review). This is, in fact, the pattern of results described by Williams (1982). With the present synthetic VI schedule, lower levels of performance on the target lever were noted with signalled relative to unsignalled reinforcement.

The present results also suggest that a signal for reinforcement acts to promote the preceding pattern of responding. On the simple VI schedule, the reinforced pattern of behavior comprises a long IRT; however, behavior other than pressing the lever which occurs during this IRT is not measured. On the synthetic VI schedule, the signal promoted a response pattern which consisted of a number of nontarget responses emitted prior to a target response. It may be that the nontarget responses promoted in the synthetic schedule are analogous to the unmeasured behavior involved in the long IRT on the simple VI schedule. One difference between the simple and synthetic schedules is that the response requirement in the synthetic schedule is fixed; that is, the animal has to emit certain nontarget behaviors to obtain reinforcement, whereas in the simple VI schedule the pattern of responding (besides the reinforcement-producing response itself) that obtains reinforcement is not predetermined. Nevertheless, the relationship between reinforcement rate and response rate is very similar on both schedules, indicating that the two are subject to similar controlling variables (see Shimp, 1979). Further, there is good evidence that, even in situations in which certain aspects of responding are not constrained by the schedule, subjects will come to emit a stereotyped response sequence (Schwartz, 1984). Moreover, under such conditions, a signal for reinforcement will promote the acquisition and emission of the stereotyped response sequence (Reed et al., 1991). In support of this contention are a number of previous studies showing that a signal for reinforcement promotes learning about specific IRTs (Reed, 1989; Tarpy & Roberts, 1985). It is reasonable to conclude that a signal can promote specific patterns of behavior even when this leads to a lower overall target response rate, and the present results support this conclusion.
REFERENCES
Grayson, R. J., & Wasserman, E. A. (1979). Conditioning of two-response patterns of key pecking in pigeons. Journal of the Experimental Analysis of Behavior, 31, 23-29.
Hawkes, L., & Shimp, C. P. (1975). Reinforcement of behavioral patterns: Shaping a scallop. Journal of the Experimental Analysis of Behavior, 23, 3-16.
Iversen, I. H. (1981). Response interactions with signalled delay of reinforcement. Behaviour Analysis Letters, 1, 3-9.
Lieberman, D. A., McIntosh, D. C., & Thomas, G. V. (1979). Learning when reward is delayed: A marking hypothesis. Journal of Experimental Psychology: Animal Behavior Processes, 5, 224-242.
Morse, W. H. (1966). Intermittent reinforcement. In W. K. Honig (Ed.), Operant behavior: Areas of research and application (pp. 52-108). Englewood Cliffs, NJ: Prentice-Hall.
Pearce, J. M., & Hall, G. (1978). Overshadowing the instrumental conditioning of a lever-press response by a more valid predictor of the reinforcer. Journal of Experimental Psychology: Animal Behavior Processes, 4, 356-367.
Reed, P. (1989). The influence of interresponse-time reinforcement on the signalled reward effect. Journal of Experimental Psychology: Animal Behavior Processes, 15, 224-231.
Reed, P. (under review). Signalled delay of reward: Overshadowing versus sign-tracking explanations.
Reed, P., & Hall, G. (1988). The schedule dependency of the signalled reinforcement effect. Learning and Motivation, 19, 387-407.
Reed, P., Schachtman, T. R., & Hall, G. (1988a). Potentiation and overshadowing in rats as a function of the schedule of reinforcement. Learning and Motivation, 19, 13-30.
Reed, P., Schachtman, T. R., & Hall, G. (1988b). Potentiation of responding on a VR schedule by a stimulus correlated with reinforcement: Effects of diffuse and localised signals. Animal Learning and Behavior, 16, 75-82.
Reed, P., Schachtman, T. R., & Hall, G. (1991). Effects of signalled reinforcement on the formation of behavioral units. Journal of Experimental Psychology: Animal Behavior Processes, 17, 475-485.
Roberts, J. E., Tarpy, R. M., & Lea, S. E. G. (1984). Stimulus-response overshadowing: Effects of signalled reward on instrumental responding as measured by response rate and resistance to change. Journal of Experimental Psychology: Animal Behavior Processes, 10, 244-255.
Schachtman, T. R., & Reed, P. (in press). Reinforcement signals facilitate learning about early members of a response sequence. Behavioural Processes.
Schachtman, T. R., Reed, P., & Hall, G. (1987). Attenuation and enhancement of instrumental responding by signals for reinforcement on a variable interval schedule. Journal of Experimental Psychology: Animal Behavior Processes, 13, 271-279.
Schwartz, B. (1980). Development of complex, stereotyped behaviour in pigeons. Journal of the Experimental Analysis of Behavior, 33, 153-166.
Schwartz, B. (1982). Interval and ratio reinforcement of a complex, sequential operant in pigeons. Journal of the Experimental Analysis of Behavior, 37, 349-357.
Schwartz, B. (1984). Creation of stereotyped, functional behavioural units. In M. L. Commons, R. J. Herrnstein, & A. R. Wagner (Eds.), Quantitative analyses of behavior. Vol. IV: Discrimination processes (pp. 139-158). Cambridge, MA: Ballinger.
Shimp, C. P. (1979). The local organisation of behaviour: Method and theory. In M. D. Zeiler & P. Harzem (Eds.), Advances in analysis of behaviour. Vol. 1: Reinforcement and the organisation of behaviour (pp. 261-298). Chichester: Wiley.
Shimp, C. P. (1982). Reinforcement and the local organisation of behavior. In M. L. Commons, R. J. Herrnstein, & H. Rachlin (Eds.), Quantitative analyses of behavior. Vol. II: Matching and maximizing accounts (pp. 111-130). Cambridge, MA: Ballinger.
St. Claire-Smith, R. (1987). Interaction of the effects of overshadowing and reinforcer devaluation manipulations on instrumental performance. Learning and Motivation, 18, 167-184.
Tarpy, R. M., Lea, S. E. G., & Midgley, M. (1983). The role of response-reward correlation in stimulus-response overshadowing. Quarterly Journal of Experimental Psychology B, 35, 53-65.
Tarpy, R. M., & Roberts, J. E. (1985). Effects of a signalled reward in instrumental conditioning: Enhanced learning on DRL and DRH schedules of reinforcement. Animal Learning and Behavior, 13, 6-12.
Tarpy, R. M., St. Claire-Smith, R., & Roberts, J. E. (1986). The effect of informational stimuli on instrumental response rate: Signalling reward versus signalling the availability of reward. Quarterly Journal of Experimental Psychology B, 38, 173-189.
Vogel, R., & Annau, Z. (1973). An operant discrimination task allowing variability of response patterning. Journal of the Experimental Analysis of Behavior, 20, 1-6.
Wasserman, E. A., Deich, J. D., & Cox, K. E. (1984). The learning and memory of response sequences. In M. L. Commons, R. J. Herrnstein, & A. R. Wagner (Eds.), Quantitative analyses of behavior. Vol. IV: Discrimination processes (pp. 99-113). Cambridge, MA: Ballinger.
Wearden, J. H., & Clark, R. B. (1988). Interresponse-time reinforcement and behaviour under aperiodic reinforcement schedules: A case study using computer modelling. Journal of Experimental Psychology: Animal Behavior Processes, 14, 200-211.
Williams, B. A. (1982). Blocking the response-reinforcer association. In M. L. Commons, R. J. Herrnstein, & A. R. Wagner (Eds.), Quantitative analyses of behavior. Vol. III: Acquisition (pp. 427-447). Cambridge, MA: Ballinger.
Williams, B. A., & Heyneman, N. (1982). Multiple determinants of "blocking" effects on operant behaviour. Animal Learning and Behavior, 10, 72-76.

Received July 25, 1989
Revised March 27, 1991