Behavioural Processes 49 (2000) 121–129 www.elsevier.com/locate/behavproc
Early-session increases in responding during extinction

François Tonneau *, Gerardo Ortiz, Felipe Cabrera

Centro de Estudios e Investigaciones en Comportamiento, Universidad de Guadalajara, 12 de Diciembre 204, Col. Chapalita, CP 45030, Guadalajara, Jalisco, Mexico

Received 22 July 1999; received in revised form 19 January 2000; accepted 28 January 2000
Abstract

After training under a variable-interval 60-s schedule of reinforcement, four rats were exposed to 30-min extinction tests, which occurred either at the start or at the end of the session (each session being 50 min long). Response rate in extinction decreased when the extinction test occurred at the end of the session, but first increased and then decreased when the extinction test occurred at the start of the session. Consistent with other recent results, this finding suggests that some variable, other than reinforcement, contributes to early-session increases in responding. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Within-session response pattern; Variable-interval schedule; Extinction; Lever press; Rat
1. Introduction

Operant behavior tends to change systematically within an experimental session: response rate usually increases, and then decreases, over the course of the session (e.g. McSweeney and Hinson, 1992). Although the existence of such bitonic within-session changes is now well documented (McSweeney and Hinson, 1992; McSweeney and Roll, 1993), the nature of the underlying mechanisms has remained unclear. Most controversies have revolved around the decrease of response rate in the later part of a session, which has usually been discussed in terms
* Corresponding author. Tel./fax: +52-3-1211158. E-mail address: [email protected] (F. Tonneau).
of satiation (Bizo et al., 1998; Palya and Walter, 1997) or habituation to the reinforcer (e.g. McSweeney et al., 1996). By comparison, the increases in response rate that can be observed at the start of a session have attracted less experimental attention. Two prominent explanations for such increases invoke sensitization to the reinforcer (McSweeney et al., 1996) or the cumulated arousal induced by successive food or water deliveries (Killeen et al., 1978; Killeen, 1995). In spite of genuine differences, both of these accounts assume that early-session increases in response rate are produced by the reinforcers (such as food or water) actually delivered within the experimental session. The possibility remains, however, that at least part of the early-session increase in responding arises from factors other than reinforcement. Evidence on this point has recently been provided by McSweeney et al. (1998). These authors exposed rats and pigeons to various multiple variable-interval schedules, the subjects being placed in the darkened experimental chamber 0, 5, 10, 15 or 30 min before the start of the session. Delaying the start of the session usually resulted in higher early-session response rates and smaller early-session increases in responding. Because the initial delay did not involve any reinforcer, these results suggest that early-session increases in responding are not entirely reinforcement-based; McSweeney et al. (1998) concluded that sensitization to the experimental context may have contributed. Exposure to the unreinforced context can affect later behavior in complex ways, however, increasing response rate even in a test session that is separated in time from the initial exposure (e.g. Reed and Reilly, 1990). Converging evidence that non-reinforcement processes affect the time course of early-session responding would therefore be useful.

One way to assess the role of reinforcers, such as food or water, in producing early-session increases in response rate is to determine whether such increases can occur in the absence of reinforcement, that is, during extinction. To this end, we trained rats on a variable-interval (VI) 60-s schedule to establish responding and then exposed them to test sessions that started with 30 min of extinction and ended with 20 min of reinforcement. If early-session increases in responding are actually driven by reinforcers such as food or water, they should not occur during extinction. To control for nonspecific effects of extinction, the rats were also exposed to alternative test sessions, which started with 20 min of reinforcement and ended with 30 min of extinction.
2. Materials and methods
2.1. Subjects

The subjects were four naive male Wistar rats (A52, A53, A80, A81), about 90 days old at the beginning of the experiment, bought from a local supplier of the University of Guadalajara. The rats were housed individually and maintained on a 12-h/12-h light/dark cycle.
2.2. Apparatus

Two operant chambers (24 × 31 × 29.5 cm) were equipped with an aperture (3 cm wide, 4 cm high, 4 cm deep) located 2 cm above the floor at the center of the right panel and giving access to a liquid dipper (Coulbourn model E14-06C-10). On each side of the aperture (5 cm from the aperture, 6 cm above the floor) was a response lever (3 cm wide, protruding 2 cm) requiring a minimum force of approximately 0.20 N to operate. In this experiment, only the left lever was operative. Each operant chamber was enclosed in a ventilated, sound-attenuating cubicle (Coulbourn model E10-20). A PC-AT computer controlled both chambers through Paraport interfaces.
2.3. Procedure

The rats were given free access to food and water for 15 days after arriving in the laboratory. During the next 10 days, they had access to water for only 30 min per day, whereas solid food (Purina Chow) remained continuously available. Then the experiment began. During the experiment, the rats were given access to water for only 30 min at the end of each daily session, at about 9 a.m. for A52 and A53 and at about 3 p.m. for A80 and A81. Solid food remained freely available in the home cages. Aside from shaping and extinction, all schedules used in this experiment reinforced a response with a probability that depended on the length of the preceding inter-response time and a preset duration t. The probability of reinforcement increased linearly from 0 to 1 for inter-response times ranging from 0 to t; any inter-response time longer than t resulted in a reinforcer (Weiss, 1970). This type of schedule closely mimics a traditional VI schedule and will be referred to as such in this paper. On such a 'VI' schedule, reinforcement rate approximately equals 1/t and can be manipulated by changing the value of t. A reinforcer consisted of the presentation of 0.10 cc of water for 5 s.
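For illustration, the contingency just described can be sketched in a few lines of Python. This is our own reconstruction for clarity, not the original control program (the experiment ran on a PC-AT with Paraport interfaces); the function and variable names are hypothetical.

```python
import random

def reinforced(irt, t=60.0):
    """Decide whether a response is reinforced under the 'VI'-like
    schedule of Weiss (1970) described above: the probability of
    reinforcement grows linearly from 0 to 1 as the inter-response
    time irt (in seconds) grows from 0 to t, and any inter-response
    time longer than t is always reinforced."""
    p = min(irt / t, 1.0)       # linear ramp, capped at 1
    return random.random() < p  # random.random() lies in [0, 1)
```

With t = 60 s, responses emitted after long pauses are virtually always reinforced, and the obtained reinforcement rate approximates 1/t, as in the baseline schedule.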
F. Tonneau et al. / Beha6ioural Processes 49 (2000) 121–129
Lever pressing was shaped by successive approximations until the rat responded regularly and obtained 100 reinforcers in a row. On the next day, each rat was allowed to lever press and obtain another series of 100 reinforcers. Each rat was then exposed to the following schedules: VI 5-s (that is, a 'VI' schedule with t equal to 5 s), VI 15-s, VI 30-s, and VI 45-s, each schedule being maintained for two consecutive days. Each session ended after 100 reinforcers or 50 min, whichever came first. Each rat was then exposed to twenty sessions of the VI 60-s schedule (the baseline schedule), followed by one test session, five VI 60-s sessions, one test session, five VI 60-s sessions, one test session, five VI 60-s sessions, and one test session, in this order. Each session was 50 min long. The test sessions were of two types, Early versus Late. Early test sessions started with 30 min of unsignalled extinction and ended with 20 min of VI 60-s. Late test sessions started with 20 min of VI 60-s and ended with 30 min of unsignalled extinction. Rats A52 and A53 received their test sessions in the order Late-Early-Late-Early, whereas rats A80 and A81 received their test sessions in the reverse order.
Fig. 1. Distribution of inter-reinforcement times in the baseline, VI 60-s schedule. Inter-reinforcement times were collected in 10-s intervals, pooling data across rats and across sessions 1–20 of exposure to the VI schedule. The vertical dotted line indicates the average inter-reinforcement time (58.90 s).
2.4. Data analysis

All response rates were corrected for reinforcement duration. Rates presented on a minute-by-minute basis in Figs. 4 and 5 below were smoothed with a moving average to avoid graph cluttering and to stress the main visual trends of the within-session response patterns. The average was centered on a window of five data points and used the following weights: 1/9, 2/9, 3/9, 2/9, 1/9. Averages in the first, second, last, and next-to-last positions in a series were computed with fewer than five points, the weights being adjusted accordingly (e.g. Makridakis et al., 1998). The 50 data points of each baseline session were treated as a single series. Each test session, however, was partitioned into two series: one, 20 points long, corresponding to the 20-min period with reinforcement, and the other, 30 points long, corresponding to the 30-min period of extinction. These two data series were smoothed separately.
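The smoothing procedure can be reproduced as follows. This sketch is ours; we assume that "adjusted accordingly" means the weights remaining inside the series are rescaled to sum to 1 at the edges, which is the conventional treatment.

```python
def smooth(series):
    """5-point weighted moving average with weights 1/9, 2/9, 3/9,
    2/9, 1/9, centered on each point. Near the edges of the series
    fewer than five points are available; the remaining weights are
    then rescaled to sum to 1 (cf. Makridakis et al., 1998)."""
    w = [1.0, 2.0, 3.0, 2.0, 1.0]  # relative weights; sum = 9 when all apply
    n = len(series)
    out = []
    for i in range(n):
        num = den = 0.0
        for k in range(-2, 3):     # window offsets -2 .. +2
            j = i + k
            if 0 <= j < n:
                num += w[k + 2] * series[j]
                den += w[k + 2]
        out.append(num / den)
    return out
```

A constant series is left unchanged by this filter, while an isolated one-minute spike of 9 responses/min is spread into a peak of 3 at its own position.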
3. Results

Fig. 1 shows the obtained distribution of inter-reinforcement times for the VI 60-s schedule, collected in 10-s intervals and aggregated over the four rats and the first twenty sessions of baseline training. The average inter-reinforcement time, indicated by the vertical dotted line, was appropriately close to 60 s (exact value: 58.90 s). Baseline training proceeded uneventfully, and response rates on the VI 60-s schedule showed no regular trend after session 20. Furthermore, each rat developed a regular within-session pattern of responding. Fig. 2 shows, for each rat, the proportion of total responding emitted in successive 5-min intervals within a session, the data being averaged over sessions 16–20 of baseline training (the last five baseline sessions before the first extinction test). Each rat exhibited an early increase in response rate, followed by a regular decrease in the later part of the session. (A52's and A53's tendency to slightly increase responding toward the end of the session has been observed in our laboratory in various conditions. The causes of this phenomenon remain unclear.)
Fig. 2. Proportion of total responding in successive 5-min intervals, averaged over the last five sessions of the first baseline phase (VI 60-s, sessions 16–20). Response rates were corrected for reinforcer duration. Each line represents a different rat.

Fig. 3 compares response and reinforcer rates in the first 15 min of sessions 16–25 of baseline training (these sessions include the last five baseline sessions before the first extinction test of each type, Early and Late). The regular response profiles for each rat (upper panel) contrast with the many irregularities in reinforcer rate (lower panel). Even across ten sessions, the VI schedule injected so much noise into the reinforcer profiles that no systematic trend is apparent. Linear regression of reinforcer rate on time in session (min 1–15) gave slopes of −0.04, 0.26, 0.19, and −0.03 for rats A52, A53, A80, and A81, respectively. (Under the standard assumptions of linear regression, the corresponding P-values would range from 0.35 to 0.91.) Thus, although response rate increased regularly at the start of the session, reinforcement rate did not seem to change systematically.

Fig. 3. Upper panel: response rate from min 1 to 15, averaged over sessions 16–25 of baseline training (VI 60-s). Response rate was computed on a minute-by-minute basis and corrected for reinforcer duration. Lower panel: reinforcer rate from min 1 to 15, averaged over sessions 16–25 of baseline training. Note the irregularity of the reinforcer-rate profile, even when averaged over ten sessions. Each line represents a different rat.

Fig. 4 presents for each rat, minute by minute, the smoothed response profiles obtained in each Late extinction test (filled circles connected with lines) and in the preceding baseline (continuous lines); baseline rates have been averaged over the last five VI 60-s sessions preceding a Late test session. Results for the first Late extinction test appear in the left panels, and results for the second Late extinction test in the right panels. Aside, perhaps, from a minor and short-lived increment in A80 (immediately to the right of the vertical line), the rats did not increase their response rate at the start of the extinction period.

Fig. 4. Smoothed response rate in the first (left panels) and second (right panels) Late test session for each rat. Filled circles connected with lines represent smoothed response rate in a single test session. Continuous lines show the smoothed response profiles averaged over the last five sessions of the baseline period preceding a test session. Vertical dotted lines indicate the transition from VI 60-s to extinction in a test session. Note the different vertical scale for each rat.

Fig. 5 presents the results obtained in Early extinction. On the first Early test session (left panels), each subject showed a systematic, regular increase in responding at the start of the extinction period. On the second Early test (right panels), the early increase was still evident in A53 and A81; it was diminished in A52 and absent in A80. The two rats for which the early-session increase shrank or vanished in the second Early test also showed lower rates or weaker early-session increases in baseline (A52, A80, Fig. 5, right panels). Conversely, the rat with the strongest increase in the second Early test, A53, also showed the strongest increase in the preceding baseline.

When reinforcers were reintroduced in the experimental situation (Fig. 5, min 31), responding resumed. The rats tended to 'restart' the increasing profile characteristic of their early-session behavior in baseline. A similar effect, observed within a session after 30 min of response interruption, has recently been reported by Cannon and McSweeney (1998, Experiment 1).
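The trend test reported above for reinforcer rate is ordinary least-squares regression on minute number. A minimal sketch (our own, for illustration; the function name is hypothetical):

```python
def ls_slope(y):
    """Ordinary least-squares slope of y on x = 1, 2, ..., len(y),
    e.g. reinforcer rate (reinforcers/min) regressed on minute
    number within the session."""
    n = len(y)
    x = list(range(1, n + 1))
    mx = sum(x) / n                     # mean of minute numbers
    my = sum(y) / n                     # mean of rates
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    return num / den
```

A slope near zero, as obtained for all four rats, indicates no systematic within-session change in reinforcer rate.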
Fig. 5. Smoothed response rate in the first (left panels) and second (right panels) Early test session for each rat. Vertical dotted lines indicate the transition from extinction to VI 60-s in a test session. Same conventions as in Fig. 4.

4. Discussion

Collecting response rates on a minute-by-minute basis, in individual subjects and along individual sessions, revealed many details of within-session response patterns. First, each rat showed increasing and then decreasing response profiles. This finding confirms the generality of within-session changes in responding (McSweeney and Roll, 1993) with water as a reinforcer and moderate reinforcement rates (about one water delivery per minute). Second, when response rate decreased, it rarely decreased in a strictly monotonic fashion, whether in the later part of a baseline session or in extinction. A global decrease in responding was often accompanied by short-term increases that appeared as small peaks or 'waves' along the smoothed within-session profiles (Figs. 4 and 5; see also Ferster and Skinner, 1957, p. 339 and Fig. 405).

Fig. 6. Average post-reinforcement rate profiles in baseline. The profiles were computed on a minute-by-minute basis, from min 1 to 5 after a reinforcer (response rate could not be reliably estimated beyond min 5). Each profile was smoothed by a moving average (see Section 2.4) to facilitate visual comparison with the smoothed response profiles in Early extinction. Each line represents a different rat.

The central result of this experiment is that responding tended to increase regularly at the start of Early extinction (Fig. 5). This phenomenon has proven replicable in our laboratory and has been observed after training Wistar rats on various variable-interval schedules. In Early extinction, response rate tends to increase regularly for several minutes before decreasing; in one extreme case the extinction profile reached its peak 20 min after the start of the session. To our knowledge, extinction profiles of this nature have not been previously documented. Skinner's (1950, Fig. 8) and Ferster and Skinner's (1957, pp. 346–351) cumulative records of extinction after VI training, for example, show no evidence of regularly increasing response rates. In these experiments, however, extinction began after several minutes of the baseline schedule, as in our Late extinction tests.

That responding would increase regularly in Early extinction is quite unanticipated. This increase cannot be plausibly attributed to our subjects having encountered higher and higher
reinforcement rates as time elapsed within a baseline session. As shown in Fig. 3, reinforcement rate did not increase systematically as time elapsed within baseline sessions. Nor can the increase of response rate in Early extinction be explained on the basis of a temporal discrimination acquired in baseline. If response rate on the baseline VI schedule had increased as a function of post-reinforcement time (cf. Catania and Reynolds, 1968), this trend could have manifested itself in extinction, producing an increasing profile in the absence of reinforcement for reasons wholly unrelated to early-session effects. However, the post-reinforcement rate profiles were slightly decreasing in our experiment (Fig. 6), presumably because of the prevalence of short intervals in the inter-reinforcement-time distribution (Fig. 1).

'Frustration' (Amsel, 1962) is another unlikely explanation of our results. After at least twenty sessions of mixed reinforcement and non-reinforcement on a VI 60-s schedule, frustration should be fully developed, lever pressing should be conditioned to it (see Amsel, 1994, pp. 287–288), and extinction should not bring about any further increase in response rate. Frustration theory may explain why responding persisted in extinction after VI 60-s reinforcement, but not why response rate increased. Furthermore, at the start of the Late extinction tests response rate was usually higher than at the start of the session (Fig. 4). On current theories of within-session patterns, this higher response rate indicates a higher reinforcement value after 20 min of VI 60-s than at the start of the session, this higher value reflecting in turn a higher level of combined sensitization and habituation (McSweeney et al., 1996) or a higher level of combined arousal and satiation (Killeen, 1995). A switch to extinction after min 20 (Late extinction) should thus have proved at least as frustrating as a switch to extinction at the start of the session (Early extinction). Yet responding generally failed to increase in Late extinction (Fig. 4); this failure cannot be due to a ceiling effect, because by min 20 response rate had already fallen below its maximum.
Finally, the observed increases consisted in systematically increasing profiles (as opposed to burst-and-pause patterns) that reached a peak more than 5, 10, or even 15 min after the start of extinction. In A53's second Early extinction test, for example, responding reached its highest level 19 min after the start of the session. The extent and regularity of these profiles seem inconsistent with the notion of frustration as an emotional reaction.

If reinforcement-based theories of early-session response patterns are correct, the response increase we observed in Early extinction cannot be explained by arguing that our subjects 'failed to discriminate' extinction from baseline. By definition, our rats failed to 'discriminate' extinction from baseline as long as they kept pressing the lever (instead of giving up). This 'discrimination failure' is a necessary condition for observing a response increase in extinction: rats that discriminate extinction should quit responding, and rats that quit responding (such as A80 in its second Early test, Fig. 5) cannot show an increasing response profile. However, if a failure to discriminate extinction is a necessary condition for observing an increasing profile, it is not, according to reinforcement-based theories, a sufficient condition for this profile; for responding cannot increase because of reinforcers that are not presented. Failure to discriminate extinction from the baseline schedule can explain why responding persisted in extinction, but not why response rate increased in extinction.

It is also difficult to argue that the increasing profile of lever pressing observed in baseline, although initially due to reinforcement, became a habit (Adams and Dickinson, 1981) and transferred intact to our extinction tests. For there is no such thing as a unitary, increasing pattern of lever pressing in baseline. What appears as a unitary profile of lever pressing in Fig.
3, for example (upper panel), is in fact an increasing profile interrupted by water deliveries (hence drinking) at irregular intervals averaging 1 min (Fig. 1). In extinction, on the other hand, no water was ever delivered, which prevented drinking and should have disrupted any behavioral chain or chunk that might have been established in baseline (however unlikely).
That response rate increased regularly at the start of Early extinction tests, often in ways reminiscent of the increasing profiles observed at the start of baseline sessions, thus seems problematic for theories that attribute early-session increases in response rate to the actual delivery of reinforcers such as food or water (e.g. Killeen, 1995).

Our results could be explained in at least three different ways. First, lever pressing might increase at the start of a session because of competition with exploratory behavior. Rats that explore the environment after being released into the chamber will not lever press; as the tendency to explore decreases with time of exposure to the chamber, their rate of lever pressing will increase. Second, responding on an operant schedule might depend on the activation of task-related memories by the context (cf. Spear, 1973); if memory retrieval takes time, the rate of lever pressing will increase through the session. Finally, operant responding might be sensitized by features of the context other than the reinforcers (McSweeney et al., 1998).

Each of these three explanations attributes at least part of the early-session increase in responding to a factor other than reinforcement. Hence, these explanations are compatible with the present data, which show an early-session increase in response rate in extinction. As rats are exposed to more and more extinction tests, responding in extinction should become weaker and weaker, precluding any possibility of observing an increasing response profile. Hence none of these explanations implies, incorrectly, that early-session response increases in extinction should be permanent. Moreover, all three explanations allow the course of early-session response patterns in baseline to depend on parameters of the reinforcer, such as its duration or its rate. Assume, for example, that early-session increases in lever pressing arise through interference with exploration. Reinforcing lever pressing more frequently should increase its competitiveness with respect to exploration (e.g. Staddon, 1978), producing stronger early-session increases in operant responding. Alternatively, assume that early-session response increases represent the progressive retrieval of task-related memories. Although mere
exposure to the context should suffice to retrieve the relevant memories (explaining the early-session response increases in extinction), reinforcers are presumably more potent retrieval cues; hence delivering them more frequently should accelerate the early-session response profile. A similar argument applies to the hypothesis that sensitization to the context contributes to early-session response patterns (McSweeney et al., 1998). The present study was not devised to discriminate among these tentative, non-reinforcement-based explanations, each of which is in need of further elaboration and testing. However, that some non-reinforcement mechanism contributes to early-session effects seems clear. By revealing the orderly nature of individual behavior on a minute-by-minute basis and in individual test sessions, our results should also encourage researchers to perform more fine-grained analyses of within-session effects than has usually been the case.
Acknowledgements

We thank the editor and two anonymous reviewers for their helpful comments. Parts of these data were presented as posters at the 14th Mexican Meeting/2nd Ibero-Interamerican Meeting of Behavior Analysis (Guadalajara, México, February 1999) and at the Annual Meeting of the Society for the Quantitative Analyses of Behavior (Chicago, IL, May 1999).
References

Adams, C., Dickinson, A., 1981. Actions and habits: Variations in associative representations during instrumental learning. In: Spear, N.E., Miller, R.R. (Eds.), Information Processing in Animals: Memory Mechanisms. Erlbaum, Hillsdale, pp. 143–165. Amsel, A., 1962. Frustrative nonreward in partial reinforcement and discrimination learning: Some recent history and
a theoretical extension. Psychol. Rev. 69, 306–328. Amsel, A., 1994. Précis of frustration theory: An analysis of dispositional learning and memory. Psychon. Bull. Rev. 1, 280–296. Bizo, L.A., Bogdanov, S.V., Killeen, P.R., 1998. Satiation causes within-session decreases in instrumental responding. J. Exp. Psychol. Anim. Behav. Process. 24, 439–452. Cannon, C.B., McSweeney, F.K., 1998. The effects of stopping and restarting a session on within-session patterns of responding. Behav. Process. 43, 153–162. Catania, A.C., Reynolds, G.S., 1968. A quantitative analysis of the responding maintained by interval schedules of reinforcement. J. Exp. Anal. Behav. 11, 327–383. Ferster, C.B., Skinner, B.F., 1957. Schedules of Reinforcement. Appleton-Century-Crofts, New York. Killeen, P.R., 1995. Economics, ecologics, and mechanics: The dynamics of responding under conditions of varying motivation. J. Exp. Anal. Behav. 64, 405–431. Killeen, P.R., Hanson, S.J., Osborne, S.R., 1978. Arousal: Its genesis and manifestation as response rate. Psychol. Rev. 85, 571–581. Makridakis, S., Wheelwright, S.C., Hyndman, R.J., 1998. Forecasting: Methods and Applications. Wiley, New York. McSweeney, F.K., Hinson, J.M., 1992. Patterns of responding within sessions. J. Exp. Anal. Behav. 58, 19–36. McSweeney, F.K., Hinson, J.M., Cannon, C.B., 1996. Sensitization-habituation may occur during operant conditioning. Psychol. Bull. 120, 256–271. McSweeney, F.K., Roll, J.M., 1993. Responding changes systematically within sessions during conditioning procedures. J. Exp. Anal. Behav. 60, 621–640. McSweeney, F.K., Swindell, S., Weatherly, J.N., 1998. Exposure to context may contribute to within-session changes in responding. Behav. Process. 43, 315–328. Palya, W.L., Walter, D.E., 1997. Rate of a maintained operant as a function of temporal position within a session. Anim. Learn. Behav. 25, 291–300. Reed, P., Reilly, S., 1990. Context extinction following conditioning with delayed reward enhances subsequent instrumental responding. J. Exp. Psychol. Anim. Behav. Process. 16, 48–55. Skinner, B.F., 1950. Are theories of learning necessary? Psychol. Rev. 57, 193–216. Spear, N.E., 1973. Retrieval of memory in animals. Psychol. Rev. 80, 163–194. Staddon, J.E.R., 1978. Theory of behavioral power functions. Psychol. Rev. 85, 305–320. Weiss, B., 1970. The fine structure of operant behavior during transition states. In: Schoenfeld, W.N. (Ed.), The Theory of Reinforcement Schedules. Appleton, New York, pp. 277–311.