ADVANCES IN THE STUDY OF BEHAVIOR, VOL. 11
Ontogeny and Phylogeny of Paradoxical Reward Effects¹
ABRAM AMSEL AND MARK STANTON
DEPARTMENT OF PSYCHOLOGY
THE UNIVERSITY OF TEXAS AT AUSTIN
AUSTIN, TEXAS
I. Introduction
II. Paradoxical Effects of Reinforcement
    A. A Little History: Why "Paradoxical"?
    B. The Paradoxical Effects
III. Frustration Theory as One Mechanism for the Paradoxical Effects
IV. The Comparative Analysis of Learning
    A. Emphasis on Behavioral Process
    B. Systematic Variation
    C. Species Differences in Paradoxical Effects
V. Toward an Ontogenetic Analysis of Paradoxical Effects
    A. The PREE and MREE
    B. Successive Negative Contrast (SNC) and Patterned Alternation (PA)
    C. The Overtraining Extinction Effect (OEE)
VI. Comments on the Neural Substrate of Paradoxical Effects
VII. Concluding Considerations: Implications for Behavior and Behavior Theory
    A. Ontogeny of Appetitive Behavior
    B. Ontogeny of Reward Learning: Difficulties for Theory
    C. Concluding Comments
References

I. INTRODUCTION
Among behavioral scientists and neuroscientists there is a broad segment of opinion suggesting that a fuller explanation of behavior will involve causal mechanisms derived from three levels of investigation: proximal, developmental, and evolutionary.

¹The work from our laboratory was supported by NSF Grant BMS74-19696 and by Grant R01MH-30778 from NIMH.
Copyright © 1980 by Academic Press, Inc. All rights of reproduction in any form reserved.
ISBN 0-12-004511-7
Proximal mechanisms are mechanisms that operate to affect behavior over a relatively short time course. Such mechanisms, actual or hypothetical, are the determinants of present behavior in the context of the immediate environment. Hinde (1966, p. 4) has suggested that in this level of analysis “we need not be chary of using the word ‘cause’ in its everyday sense.” Some of these proximal mechanisms have a historical basis, in the sense that associations formed at any stage of development may be cued and activated by contemporary determinants. Proximal mechanisms are organized in our thinking by terms such as sensory-perceptual factors, associative or learned factors, and motivational-emotional factors. At a less abstract level, present behavior in the immediate environment can be related to neural and hormonal systems.

The fact that differences exist in the operation of proximal mechanisms among different individuals, and even in the same individual at different points in the life span, brings us to the second broad area of investigation, development. Here questions about the historical basis for both associative and nonassociative factors in the lifetime of the individual are brought into sharper relief: How have genetic endowment, maturation, and early experience interacted to determine the nature and operation of the proximal mechanisms which determine the behavior of the adult? Is the individual’s machinery age dependent and, if so, by what developmental principles can this age dependence be understood?

Evolution, which may be generally defined as the interaction of natural selection with phylogenetic history, may also play a part in the history of the individual. Heredity and differential reproduction adapt the species to changing ecological contingencies. Adaptation, in turn, alters the environment. Proximal mechanisms may be modified by natural selection at all stages of phylogeny and ontogeny.
In this sense, also, are applications of evolutionary principles involved in the understanding of individual behavior: evolution explains differences among species and genetic principles partially explain differences in endowment among individuals of the same species.

The problem of accounting for behavior is, then, admittedly very complex. The number of research problems and theoretical issues contained at each level of explaining behavior is sufficient to occupy a great many investigators crossing many disciplines. It is not surprising, then, that these three levels of mechanism which share, at least in theory, complex interrelationships, are usually in practice studied separately, and often independently. This independence leads to methodological and philosophical differences in approach to the study of behavior: as an approach, the ethologist’s interest in evolution and adaptation, with its emphasis on ecology, genetic transmission, and species comparisons, seems very remote from that of, say, the sensory physiologist, whose object of study, the properties of sensory neurons, leads to a choice of subjects (model systems, preparations) and stimuli based on laboratory convenience and amenability to experimental control.
This usual independence of approach should not, however, be overstated. There are an increasing number of cases in which it has been profitable to combine considerations from at least two levels of mechanism in the study of behavior. Investigators interested mainly in developmental mechanisms have made species comparisons to understand development better (e.g., Kendler and Kendler, 1975; Rosenblatt, 1971); and sensory physiologists have studied perception by making appropriate developmental (e.g., Hubel and Wiesel, 1959) and comparative (Pettigrew, 1980) manipulations. To those whose major interest is in the mechanisms of proximal causation, species and age comparisons can be very illuminating. To students of brain and behavior, for example, the developing species or organism can serve as a preparation or model system for studying the relationships between behavioral and neural changes (e.g., Altman et al., 1973; Fibiger et al., 1970). Some investigators who pursue proximal causation at the level of behavior theory, for example in the study of learning and memory, have found consideration of species and age differences just as illuminating as other investigators whose main interest is in brain correlates. In the case of learning, the theoretical advantages of making phyletic comparisons have been argued cogently by Bitterman (1975). In this article, we will argue that the study of ontogeny provides similar theoretical advantages and is complementary to the study of phylogeny in this regard.

A summary of our present approach might take the form of identifying stages in the biopsychological study of related behavioral effects; and it might be termed a modified empirical construct approach. As we see it the stages are:
1. Observe and describe a number of apparently related effects;
2. Develop a conceptualization of these effects in terms of empirical constructs;
3. Study these effects phylogenetically and ontogenetically for their presence or absence, and particularly for order of their appearance;
4. Study the appearance and order of appearance of these effects in relation to the presence or absence of portions of, or activities of, the neural substrate;
5. Relate the findings from 3 and 4 to 2, i.e., to a conceptualization in terms of empirical constructs.

The particular behavioral phenomena we have chosen to address in this context are the paradoxical effects of reinforcement. There are many such effects, and we list a larger set of them in Table I, but we restrict our discussion to four of these effects that are of the between-subjects variety: the Partial Reinforcement Extinction Effect (PREE), the Magnitude of Reinforcement Extinction Effect (MREE), the Successive Negative Contrast (SNC), and the Overtraining Extinction Effect (OEE). We will deal also with another effect, Single Patterned Alternation (PA), which is not, strictly speaking, a paradoxical effect but, as we shall see, is bound theoretically to such effects. We have begun an ontogenetic study of these phenomena in the laboratory rat that stresses very early development. And we have decided to study these effects first because (a) they have been studied extensively in a number of species by Bitterman, Gonzalez, and others, and (b) they have, at one time or another, all been deduced from the same sets of theoretical premises, specifically in treatments of the role of appetitive nonreinforcement (nonreward) on behavior (e.g., Amsel, 1962, 1967; Capaldi, 1967).

The remainder of this article follows the five stages we have just outlined. Section II provides some theoretical background and describes the effects; Section III summarizes the frustration theory view of the role of appetitive nonreward in all these effects; Sections IV and V provide data from some phylogenetic, and our recent ontogenetic, experiments on the development of paradoxical effects; Section VI discusses some of our findings in relation to the neural substrate; and Section VII reflects on the implications of this work for behavioral theory.
II. PARADOXICAL EFFECTS OF REINFORCEMENT

A. A LITTLE HISTORY: WHY “PARADOXICAL”?
There are two senses in which the effects we have listed are paradoxical. The first, and simplest, sense is that in each case more produces less, and less produces more. The PREE is a case in which the lesser density of reward leads to greater resistance to (more trials to) extinction. In SNC and MREE greater magnitude of reward produces “abnormally” low levels of performance when reward is reduced or less resistance to extinction, respectively. An inverse relationship between number of acquisition trials and trials to extinction defines the OEE. In the within-subjects cases (Table I), to give some examples, the Partial and Magnitude of Reinforcement Acquisition Effects describe cases in which certain lesser percentages and magnitudes, respectively, of reward to one of two discriminanda result in relatively greater, rather than lesser, levels of performance. And Operant Behavioral Contrast describes the finding that reducing the value of reinforcement of one of two discriminative stimuli (S-) increases performance to the other (S+). Even patterned (single) alternation produces paradoxical effects in the sense that, early in training, a reward produces better performance on the next trial while a nonreward produces poorer performance, and after some amount of training reward and nonreward have the opposite effects on the next trial (e.g., Tyler et al., 1953).

TABLE I
PARADOXICAL EFFECTS OF APPETITIVE REINFORCEMENTᵃ

Between-subjects effects
    Partial Reinforcement Acquisition Effect (Haggard, 1959; Goodrich, 1959; Wagner, 1961)
    Partial Reinforcement Extinction Effect (Humphreys, 1939a,b, 1940, 1943)
    Magnitude of Reinforcement Extinction Effect (Hulse, 1958; Armus, 1959)
    Overtraining Extinction Effect (North and Stimmel, 1960)
    Successive Negative Contrast (Elliott, 1928; Crespi, 1942)

Within-subjects effectsᵇ
    Partial Reinforcement Acquisition Effect (Henderson, 1966; Amsel et al., 1966)
    Magnitude of Reinforcement Acquisition Effect (MacKinnon, 1965)
    Simultaneous Negative Contrast (Bower, 1961)
    Operant Behavioral Contrast (Reynolds, 1961, 1963)
    Peak Shift (Hanson, 1959; Honig et al., 1959)
    Pavlovian Positive Induction (Pavlov, 1927)
    The Overtraining Reversal Effect (Reid, 1953; Pubols, 1956)

ᵃThe references provided in each case are the early, seminal ones. For more recent references and more detailed discussion of these effects see Amsel (1967, 1971), Capaldi (1967), Gonzalez and Champlin (1974), Mackintosh (1974), and Rashotte (1979).
ᵇThe within-subjects effects are all demonstrated in the context of discrimination learning and differential conditioning.

The other sense in which these (and other) effects are paradoxical is more theoretical. The starting point is associationistic psychology, and more particularly that branch of associationism known as Learning Theory. All learning theories have had to deal with the fact of decreasing incremental changes in performance with successive reinforcements, and of decreasing decremental changes in performance with successive nonreinforcements. More specifically, changes in the probability or associative strength of a learned response in a nonchoice situation are often expressed in a simple linear equation of the general form

$$\Delta p_n = \beta (A - p_n)$$

where $A$ is the asymptote or limit to which probability can grow, $\beta$ is a parameter reflecting rate of growth to asymptote, and $\Delta p_n$ is the increment in probability on a given trial (with $p_n$ the probability on trial $n$), which obviously gets successively smaller as learning proceeds. The use of stochastic or linear models to describe the course of learning was brought into prominence by Estes (1950) and Bush and Mosteller (1951) and was subsequently elaborated by them and by other mathematically oriented psychologists. For example, Bush and Mosteller (1955) devised mathematical accounts of a variety of learning phenomena based on a linear model with assumptions
borrowed from Hull's (1943) theory; Lovejoy (1965) used a linear-model approach to develop a theory of attention; and more recently Rescorla and Wagner (1972) have used such a model, with simple but important modifications, to provide an account in associative terms of stimulus selection and attentional factors in Pavlovian conditioning.

One of the characteristic simplifying assumptions of such linear-model theorizing is that decremental and incremental effects are conceptualized as governed by the same mechanisms, a Guthrian view. This is the case whether such effects are regarded as changes in response probability (Bush and Mosteller, 1955; Estes, 1950), or as changes in associative strength (Rescorla and Wagner, 1972). According to such models a decrement (inhibition) is the simple inverse of an increment (excitation); extinction curves are of the same form as acquisition curves; and reductions in performance, associative strength, or response probability with decreases in reinforcement density or magnitude are to levels appropriate to those values of reinforcement. In short, linear models of learning describe simple monotonic relationships in both learning and extinction, and make no provision for what we call paradoxical effects of reinforcement. In this sense, then, the word paradoxical represents the departures from what is expected on the basis of simple linear-operator models of basic or classical learning theories. To take the most recent case as an example, the Rescorla-Wagner model, with all its successes in other respects, cannot account for any single paradoxical effect, neither those which we will elaborate here nor any of several others (see Table I). An earlier case in point is the basic equation for growth of habit strength in Hull's (1943) theory

$${}_{S}H_{R} = M - Me^{-iN}$$

In this equation the rate ($i$) and limit of growth ($M$) parameters play the same role as in all the linear models that were to follow. While habit strength in Hull's theory does not decrease in extinction, as does associative strength in the Rescorla-Wagner model, total inhibitory strength does subtract from excitatory strength and the rate of growth of inhibitory strength in extinction can be taken as a direct indicant of the strength of association formed in acquisition. Indeed, Hull (1943, p. 118) considered resistance to extinction one of a number of measures of habit strength:

    The strength of the habit is manifested indirectly by various measurable aspects of action: (1) reaction amplitude or magnitude (A), (2) reaction latency (t), (3) resistance to experimental extinction (n), and (4) probability (p) of occurrence, i.e., percent of appropriate stimulations which evoke the associated reaction.
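The monotonic character of these growth functions can be checked numerically. What follows is a minimal sketch and not code from any of the cited authors: the function names, the rate values (0.2), and the use of an asymptote of 1 for reward and 0 for nonreward are illustrative assumptions. It iterates a linear-operator update of the kind described above and evaluates a Hull-style exponential growth function.

```python
import math

def linear_operator_curve(start, asymptote, beta, n_trials):
    """Iterate value <- value + beta * (asymptote - value), one step per trial."""
    value = start
    curve = []
    for _ in range(n_trials):
        value += beta * (asymptote - value)
        curve.append(value)
    return curve

def hull_habit_strength(n_reinforcements, limit, rate):
    """Hull-style growth of habit strength: M - M * exp(-i * N)."""
    return limit - limit * math.exp(-rate * n_reinforcements)

# Assumed illustrative values: reward pulls toward 1.0, nonreward toward 0.0.
acquisition = linear_operator_curve(start=0.0, asymptote=1.0, beta=0.2, n_trials=10)
extinction = linear_operator_curve(start=acquisition[-1], asymptote=0.0, beta=0.2, n_trials=10)

# Trial-by-trial increments during acquisition shrink as learning proceeds.
increments = [later - earlier for earlier, later in zip([0.0] + acquisition, acquisition)]
```

Under these assumptions more reinforced trials can only raise the learned quantity, and a shift to a lower asymptote can only bring it down toward that asymptote, never below it. With a rate of -ln(1 - beta) the exponential function passes through the same points as the trial-by-trial update, which is why the two formulations share the same limitation.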
In this regard Hull's 1943 theory suffered from the same limitations as its linear-model successors. Spence (1960) pointed this out in the context of an early systematic discussion of the partial reinforcement acquisition effect (PRAE), a
discovery by Haggard (1959) and Goodrich (1959) that rats trained under a 50% schedule of reward reached higher asymptotes of running speed than those under 100% reward. To reiterate, none of these classical associationistic theories can account for the PRAE or for any of the other effects we will describe as paradoxical.

B. THE PARADOXICAL EFFECTS
Even before Hull’s Principles was published there was already evidence from learning experiments that at least one extinction effect was not in accord with his view of habit measurement. Soon there were other findings that pointed to an absence of the expected direct relation between habit strength and performance reflecting resistance to extinction. Reinforcement manipulations which would presumably yield weaker habit strength were followed by stronger rather than weaker resistance to extinction.

Perhaps the most seminal paradoxical effect of reinforcement to be discovered was the finding of Humphreys (1939a) that random interspersal of reinforcement and nonreinforcement in human eyeblink conditioning yielded inferior performance in acquisition but greatly increased resistance to extinction. The partial reinforcement extinction effect (PREE), as this effect came to be known, was shown by Humphreys (1943), as it had been shown in a different context by Skinner (1938), to occur in appetitive (reward) learning in rats as well as in defense conditioning in humans, and much of the later work on this and other effects has been done in the context of reward learning. The PREE, or “Humphreys Paradox” as the effect has been called (Kimble, 1961, p. 287), became a major problem for investigation in learning theory, inspiring a vast amount of research, several theories, and many applications. Broad reviews exist elsewhere (Lewis, 1960; Robbins, 1970).

Another paradoxical effect of appetitive reinforcement, first reported by North and Stimmel (1960) in rats, was the finding that giving a large number ($N$, in Hull’s equation) of reinforcements in acquisition results in faster extinction than giving a smaller number. This phenomenon has been termed the overtraining extinction effect (OEE) and has held up under a variety of conditions (see Mackintosh, 1974, pp. 423-426, for a summary). It is not in line with that portion of Hull’s equation that makes habit strength a direct function of $N$, at least not if strength of habit is inferred from resistance to extinction.

At about the same time, it also was shown in rats that a large magnitude of reward in acquisition results in faster extinction than a small reward magnitude (Hulse, 1958; Wagner, 1961). The magnitude of reward extinction effect (MREE), as it has been termed, has since been confirmed in several experiments (e.g., Gonzalez and Bitterman, 1969; Ison and Cook, 1964; Traupmann, 1972). Again an extinction measure of performance seems to contradict Hull’s view that habit strength (${}_{S}H_{R}$) is a direct function of reinforcement magnitude ($w$).
As early as 1928, Elliott showed that when rats were trained to learn a more complex maze, and their incentive was changed in mid-course from bran mash to sunflower seed, their performance (number of errors) deteriorated to a level below that of a control group rewarded with sunflower seeds throughout. Crespi’s (1942) more influential demonstration of what has come to be known as successive negative contrast (SNC) or, as Crespi termed it, the depression effect, dealt with contrast in terms of speeds in a straight alley rather than errors in a maze. It was as damaging to the 1943 Hullian view, and the later linear-model views, as were the PREE, the OEE, and the MREE. The term SNC refers not to an extinction effect, but to the finding that a shift from a large to a small magnitude of reward causes a reduction in performance, not only to the level of performance of controls trained throughout on the lower magnitude of reward, but transiently to a level lower than the small-reward controls. This effect has been viewed quite reasonably by some (e.g., Gonzalez and Bitterman, 1969) to be different from the MREE only in degree: in the MREE the downward shift is to zero reward, and there is evidence for subzero extinction performance, or performance below a never-rewarded “operant” level; in the case of the SNC the downshift in reward is simply not to zero.

To reemphasize the point, these effects, when they occur, represent changes in performance that are not in accord with decremental assumptions in Hull’s (1943) theory or in any of the simple linear models of learning that followed. It is noteworthy, however, that these paradoxical effects are much more robust in instrumental than in classical conditioning, especially in animals, and do not seem to occur in the “purer” forms of classical conditioning from which preparatory responding (in animals) and cognitive involvement (in humans) have been removed or minimized (Amsel, 1972b).
Indeed, there are levels of functioning both in fish and reptiles (and presumably other lower animals) and in humans (these might be characterized as relatively “primitive” levels) in which the reward effects appear in the nonparadoxical form predicted in 1943 by Hull and the linear-model approaches. It does not seem unreasonable to think that such results are obtained when acquisition, extinction, and reduction in reward magnitude do not arouse mediating expectancies about reward or anticipatory goal responses (or cognitions) either in animals, because the mechanisms do not exist, or in humans, because they are not engaged. In Sections IV and V we will examine some of these nonparadoxical cases.

III. FRUSTRATION THEORY AS ONE MECHANISM FOR THE PARADOXICAL EFFECTS
In a series of theoretical papers, Amsel (1958, 1962, 1967, 1971, 1972a,b) has developed an account of a variety of effects, including the various acquisition and
extinction effects related to appetitive continuous and partial reinforcement and discrimination learning, and discriminative versus nondiscriminative contrast effects. A major assumption of this account is that in instrumental (and perhaps Pavlovian) but not in simple classical conditioning (see Amsel, 1972b), response diminution, including extinction, consists not of a decrement in the associative strength or response probability established in acquisition, but rather of the learning of a new association based on the frustrative properties of nonreward. These properties are aversive, and they produce escape and avoidance responses that are incompatible with the appetitive approach response established in acquisition (for extensive evidence see Daly, 1974). This is an "active" characterization of the effects of nonreward, or reduced or delayed reward, the response decrement reflecting learning of a new competing response tendency rather than unlearning of the old.

More specifically, the theory assumes that an anticipation or expectancy of reward ($r_R$-$s_R$) is established and comes to control the goal approach response in instrumental learning. (As we shall see, it is a mediating mechanism of this kind that may be absent in the learning of fish and turtles, humans operating relatively noncognitively, and "precognitive" infant rats.) Once $r_R$-$s_R$ is established and mediates approach responding, nonreward unconditionally elicits an aversive reaction termed primary frustration ($R_F$). This reaction has drive properties (Amsel and Roussel, 1952); its reduction can reinforce escape responses (Daly, 1974); and it has feedback stimulus ($s_F$) properties which can cue, guide, and direct behavior (Amsel and Ward, 1954). These feedback, drive-stimulus properties are analogous to the stimulus aftereffects of nonreward (Capaldi, 1967; Sheffield, 1949) or memories of nonreward (Capaldi, 1972) postulated in sequential theory, and are particularly important in any analysis in which reward expectancy or anticipation is not thought to play a major role in learning. With repeated nonrewarded trials, primary frustration can serve as a US for the conditioning of anticipatory or conditioned frustration ($r_F$-$s_F$) which, when established to accompanying cues (CSs), provides its own feedback stimulation that can evoke goal-avoidance responses ($R_{Avd}$). It is these $s_F$-connected avoidance tendencies that constitute the new learning that occurs in expectancy-mediated extinction, or indeed, in any situation in which reward is reduced or delayed.

The frustration theory account of the paradoxical reward effects rests on the assumption that the conditioning of the avoidance tendency in extinction, contrast, or other effects of relative reward reduction depends on the magnitude of primary frustration ($R_F$) which in turn depends on the discrepancy between the anticipated ($r_R$-$s_R$) and the realized reward. Factors that increase the strength or value of the expected reward will, by increasing the magnitude of $R_F$ when reward is absent, lead to enhanced avoidance of the goal and, consequently, to more rapid extinction or, in general, to greater reward-reduction effects. In this way, the larger number of training trials or reward magnitude in acquisition will
produce faster extinction than fewer training trials or smaller rewards (the OEE and MREE, respectively). Successive negative contrast may be thought of as a less drastic version of the MREE, the downshift in reward being to a low reward magnitude rather than to zero.

A theoretically consistent account of the partial reinforcement extinction effect follows from an additional assumption of the theory, that of counterconditioning, specifically the counterconditioning of feedback stimuli from conditioned frustration ($s_F$) to approach, rather than avoidance, of the goal area in the face of anticipated frustration. This is assumed to occur on a partial reinforcement schedule in which rewards and nonrewards are interspersed during acquisition. Both $r_R$-$s_R$ and $r_F$-$s_F$ gain strength under such a schedule, but, given that the approach tendency is dominant, $s_F$ acquires some association with the approach response. If these mechanisms operate to retard extinction it is because in partially reinforced animals continued approach is a response to feedback cues from anticipatory frustration ($s_F$-$R_{App}$) whereas continuously reinforced animals, experiencing frustration for the first time in extinction, learn an avoidance tendency ($s_F$-$R_{Avd}$) unopposed by the counterconditioned approach tendency.

With this particular theoretical integration of the paradoxical effects of reinforcement as a set of guiding principles or heuristic, we have begun to chart the course of development in infant rats of these effects and of their underlying processes. As we have already noted, many comparative investigations of these effects, mainly by M. E. Bitterman, R. C. Gonzalez, and their co-workers, are already in existence. And these studies are comparative in the phylogenetic sense; they look for the presence or absence of these effects in fish, reptiles, and birds, as well as in mammals.
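The qualitative logic of this account can be made concrete with a toy simulation. The sketch below is not a formal statement of frustration theory and is not taken from the sources cited; every numerical rate, the extinction criterion, the assumption that approach strength saturates regardless of reward size, and the assumption that the counterconditioned link weakens on unrewarded trials are ours, introduced only so that the example runs. It tracks reward expectancy ($r_R$-$s_R$), conditioned frustration ($r_F$-$s_F$), and the share of $s_F$ connected to approach, and counts trials to an extinction criterion.

```python
def train_then_extinguish(schedule, magnitude, criterion=0.3, max_extinction_trials=200):
    """Return the number of extinction trials until net approach falls to criterion.

    schedule  -- string of 'R' (rewarded) and 'N' (nonrewarded) acquisition trials
    magnitude -- reward size delivered on 'R' trials
    All rates below are arbitrary illustrative values, not estimates from data.
    """
    expectancy = 0.0   # r_R-s_R: anticipated reward, grows toward the reward received
    habit = 0.0        # approach strength, assumed to saturate at 1.0 for any reward size
    cond_frust = 0.0   # r_F-s_F: conditioned (anticipatory) frustration
    counter = 0.0      # share of s_F connected to approach (counterconditioning)

    def trial(reward):
        nonlocal expectancy, habit, cond_frust, counter
        if reward > 0:
            if cond_frust > 0:
                # Approach in the presence of s_F is rewarded: s_F-R_App strengthens.
                counter += 0.2 * (1.0 - counter)
            expectancy += 0.05 * (reward - expectancy)
            habit += 0.5 * (1.0 - habit)
        else:
            # Primary frustration R_F: discrepancy between expected and obtained (zero) reward.
            primary_frust = expectancy
            cond_frust += 0.2 * (primary_frust - cond_frust)
            # Assumed weakening of s_F-R_App when approach goes unrewarded.
            counter -= 0.05 * counter

    def net_approach():
        # s_F feeds avoidance in proportion (1 - counter) and approach in proportion counter.
        return habit - (1.0 - 2.0 * counter) * cond_frust

    for symbol in schedule:
        trial(magnitude if symbol == "R" else 0.0)

    for n in range(1, max_extinction_trials + 1):
        trial(0.0)
        if net_approach() <= criterion:
            return n
    return max_extinction_trials

continuous = train_then_extinguish("R" * 60, magnitude=2.0)
partial = train_then_extinguish("RN" * 30, magnitude=2.0)
small_reward = train_then_extinguish("R" * 60, magnitude=1.0)
brief_training = train_then_extinguish("R" * 10, magnitude=2.0)
print(continuous, partial, small_reward, brief_training)
```

With these assumed values the continuously rewarded, large-reward, extensively trained group reaches the criterion in the fewest trials, which is the ordering the PREE, MREE, and OEE describe. The ordering depends on the assumption that expectancy keeps growing after approach strength has saturated; the sketch should be read as an illustration of the argument, not as evidence for it.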
Our recent ontogenetic work in this area owes much, not only to the earlier work of these investigators with a number of species, but also to discussions, particularly by Bitterman (1960, 1965, 1975), of methodological problems inherent in comparative analyses. From the viewpoint of methodological and theoretical analysis it may turn out that in ontogenetic and in phylogenetic analysis the problems are similar, although perhaps somewhat less severe in the former. In the next sections we will look at some examples of comparative studies of paradoxical effects from the phylogenetic and ontogenetic perspectives. But first, we will provide a brief review of some problems encountered in comparative analysis as explicated by Bitterman and others (Bitterman, 1960, 1975; Sutherland and Mackintosh, 1971).

IV. THE COMPARATIVE ANALYSIS OF LEARNING
The essential features of the comparative study of learning are: (a) emphasis on “process,” (b) “systematic variation” and “functional analysis” as a solution to the control problems that are inherent in cross-species (and, in our case, cross-age) comparisons, and (c) characterization of species (and, in our case, age) differences through the study of behavioral phenomena that can be taken to reflect the operation of underlying processes.

A. EMPHASIS ON BEHAVIORAL PROCESS

From the point of view of theoretical analysis, emphasis on process is perhaps the most important distinguishing feature of the comparative analysis of learning. It should be noted, however, that this emphasis works in both directions: a comparative approach based on considerations of learning theory yields comparative observations and information that is perhaps more powerful than would be available from behavioral comparisons not guided by theoretical considerations. This distinction between investigation of behavioral process and investigation of behavior cannot be overemphasized. Characterization of behavior is an empirical matter while characterization of process is necessarily theoretical. Often the same behavioral effect can be understood in terms of a number of theoretical processes and, conversely, the manifestation of the same process may be inferred from a variety of behavioral effects. As an example of the former, the overtraining reversal effect (ORE), faster reversal following extended discrimination training, has been taken to reflect the operation both of attentional (Sutherland and Mackintosh, 1971) and frustrative (Amsel, 1962) processes. Similar considerations are familiar to physiological investigators. For example, in attempting to assess the effects of brain lesions on memory, one must be aware that the lesion may also produce sensorimotor or motivational changes that result in behavior indicative of “forgetting.”
By virtue of its theoretical nature, any analysis at the level of behavioral processes is cast in terms that are abstract and general, the concern being to provide an account of the functional properties of molar laws of behavior, rather than the specifics of stimulus, response, response topography, and so on, that occur in a given situation. In this regard, questions of process are quite distinct, in our view, from questions of “belongingness” and “species adaptiveness,” determinants of behavior long recognized (Thorndike, 1911) but not studied intensively until recently (Hinde and Stevenson-Hinde, 1973; Seligman and Hager, 1972). It is one thing to ask which stimuli and which responses are more likely than others to enter into an associative relationship (Garcia and Koelling, 1966), or which reinforcers are more likely than others to strengthen or weaken a particular response (Shettleworth, 1978), and quite another thing to ask if the functional relationships derived from these novel experimental observations reflect, on the whole, the operation of new or essentially the same processes. Taste aversion learning provides, perhaps, the best illustration of this point. In view of the fact that the conditioning of taste aversions occurs rapidly and with very long CS-US intervals, some investigators (e.g., Kalat and Rozin, 1972) used their failure to obtain “blocking” in this system to argue that taste aversion learning is a more “primitive” learning system, involving more rudimentary processes, than other learning systems. After a large amount of systematic research, however, it appears that the same processes seem to operate in taste aversion learning as in Pavlovian conditioning in general (Domjan, this volume) and that the faster conditioning and extended CS-US interval can be regarded as extremes in these two dimensions. We are not arguing that differences between various behavioral systems do not exist or that such differences are unworthy of study; on the contrary, we have used the example of conditioning of taste aversion to point to the distinction between questions of belongingness or preparedness and questions of process, and to emphasize the point that results appearing to suggest radical differences in kind may not reflect actual differences in process. So much depends on the level of abstraction with which one views the phenomena, as well as on the amount of systematic research and the overall pattern of data available for examination.

B. SYSTEMATIC VARIATION
The major control problem in any comparative analysis is posed by the changes or divergences that occur across species (or across ages within a species). In the case of comparisons of learning across species, for example, it is impossible to equate for level of motivation induced by deprivation procedures or for the reinforcing value of a goal event, to mention just two of many relevant factors. Bitterman (1965, 1975) has proposed systematic variation as a partial solution to this control problem. There is little we can add to Bitterman’s well-known discussion of this research strategy: the essence of it is that any conclusion about a difference in behavior (and in process) among species or ages in development should not depend on the application of any one set of values of a parameter. For any species or for any age level in ontogeny it should be possible, for example, to determine an effective range of reward conditions that produce differences in performance and, presumably, in learning. For another species or age, the range may be different. In the case of paradoxical effects, for example, the conclusion that an effect exists at one level of phylogenetic or ontogenetic development but not at others should be reached only after extensive systematic variation of a number of relevant parameters has been accomplished. It goes without saying that “complete” systematic variation is an ideal, and that most investigators, including ourselves, are likely to suggest developmental transitional stages and divergences after very little systematic variation of parameters has been undertaken. These suggestions can properly be regarded as working hypotheses and nothing more.
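The logic of systematic variation can be sketched in a few lines of Python. This is only an illustrative sketch: the parameter names, their values, and the `effect_observed` callable are hypothetical stand-ins, not Bitterman's procedure and not data from any experiment. It is meant to show the asymmetry between the two conclusions: a single positive cell suffices to say that an effect exists, whereas its absence can be asserted only for the range of conditions actually tried.

```python
from itertools import product


def systematic_variation(parameters, effect_observed):
    """Try every combination of parameter values in turn.

    parameters: dict mapping a parameter name to the list of values tried.
    effect_observed: callable taking one combination (a dict) and returning
        True if the paradoxical effect appeared under those conditions.
    Returns (conclusion, combinations_tested).
    """
    names = list(parameters)
    tested = []
    for values in product(*(parameters[name] for name in names)):
        combo = dict(zip(names, values))
        tested.append(combo)
        if effect_observed(combo):
            # One positive cell is enough to show that the effect exists.
            return "present", tested
    # Absence is a working hypothesis limited to the range tested.
    return "absent within tested range", tested


# Hypothetical grid: the effect is assumed to appear only with massed trials.
grid = {
    "intertrial_interval_sec": [86400, 600, 10],
    "reward_units": [4, 40],
}
conclusion, tested = systematic_variation(
    grid, lambda combo: combo["intertrial_interval_sec"] <= 10
)
```

With a narrower grid that omitted the 10-sec interval, the same hypothetical effect would be reported as absent, which is the sense in which a negative conclusion remains a working hypothesis.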
C. SPECIES DIFFERENCES IN PARADOXICAL EFFECTS
In fish and turtles, four paradoxical effects shown in adult rats (and also in birds), namely successive negative contrast (SNC), the magnitude of reward extinction effect (MREE), the overtraining extinction effect (OEE), and the partial reinforcement extinction effect (PREE), have failed to appear, except in a few cases under conditions of highly massed trials. Consider the following experiment on SNC by Lowes and Bitterman (1967). Goldfish were trained to strike an illuminated target in order to receive Tubifex worms as a reward. Training was given at the rate of one trial a day for a 63-day period. Group 4 received a 4-worm reward and Group 40 a 40-worm reward on each trial throughout the training period, while Groups 4-40 and 40-4 had their reward magnitudes shifted approximately midway through training from 4 to 40 worms and from 40 to 4 worms, respectively. The group mean log target striking latencies are plotted as a function of three-trial blocks in Fig. 1. The results show a clear reward-magnitude effect in the preshift phase of the experiment. In the postshift phase, Group 40-4 nevertheless failed to increase its latency of responding (decrease its speed) even to the level of Group 4, much less above that level. Group 4-40, on the other hand, improved its performance to the level of Group 40. This absence of incentive contrast in fish cannot easily be attributed to a lack of discriminability of the reward magnitudes.
[FIG. 1. Group mean log latency of target striking in goldfish as a function of three-trial blocks for Groups 4, 40, 4-40, and 40-4. Data of Lowes and Bitterman (1967).]
FIG. 2. Acquisition and extinction of an instrumental response in goldfish as a function of reward magnitude. From Gonzalez et al. (1972).
With regard to the MREE, we turn to an experiment by Gonzalez et al. (1972). Three groups of goldfish were given one trial a day in an aquatic runway for a total of 24 acquisition and 39 extinction trials. The groups differed in terms of reward magnitude during acquisition, the different magnitudes being 1, 4, or 40 Tubifex worms. As Fig. 2 shows, extinction performance was directly related to acquisition performance; there was no MREE. In an experiment providing information on the OEE in fish (Gonzalez et al., 1967b), the number of trials per session (10 vs 20) was varied factorially with reward magnitude (1 vs 10 Tubifex worms) during acquisition of a target-striking response. After 12 training sessions, the four groups were extinguished. The results (Fig. 3) give no evidence of the OEE; persistence (resistance to extinction) was directly related to the number of acquisition trials. This absence of paradoxical extinction appears also to characterize the behavior of turtles. In a single study, Pert and Bitterman (1970) have tried and failed to demonstrate SNC, MREE, or PREE in turtles (Chrysemys picta picta). The one exception to the generalization that these three effects fail to appear in any form in fish and turtles is that while neither fish nor turtles show the PREE in widely spaced trials (Gonzalez et al., 1965; Longo and Bitterman, 1960; Shutz and Bitterman, 1969), the effect can apparently be demonstrated in massed trials in goldfish (Gonzalez and Bitterman, 1967) and turtles (Murrillo et al., 1951). On the basis of results such as these Bitterman has said that the fish and turtle are “Hullian animals,”* meaning that their performance in relation to reward reduction or omission (extinction) is not out of line with the theory of simple learning proposed by Hull in his Principles of Behavior (1943). The argument is that, in the fish and turtle, reinforcement (reward) has a direct effect on the association (the greater number of rewards resulting in greater associative strength), but that these animals do not learn “about reward,” which is to say they do not form anticipations or expectancies about reward or nonreward. As there seems to be some agreement that birds and mammals share a common ancestry in the class Reptilia dating back about 250 million years (Colbert, 1955), it could be the case that the ability to learn about rewards is a system in vertebrate evolution.
*Personal communication. An exact quote (Bitterman, 1960, p. 708) is: “We might . . . conclude that contemporary S-R theory is appropriate at the level of the fish, although new processes of learning come into operation at the level of the rat.”
There also appear to be cases in which adult humans operate as “Hullian animals.” In one of his last articles Spence (1966) summarized and interpreted a series of experiments from his laboratory and elsewhere having to do with cognitive and drive factors in human eyelid conditioning. He sought to integrate findings from a series of experiments in eyelid conditioning in which a “masking” procedure had been employed. In the masking situation, the conditioning of the eyeblink response is imbedded in a larger set of experimental manipulations designed to reduce the influence of cognitive (expectancy) factors and particularly to minimize awareness of the transition from acquisition to extinction. A task devised by Estes and Straughan (1954) was used to mask the intent of the investigation. Subjects were told to guess which of two side lights would come on next when a center signal light appeared, and to press a right or left button to
FIG. 3. Extinction of an instrumental response in goldfish as a function of reward magnitude (high vs low) and number of acquisition training trials (10 vs 20 per session). From Gonzalez et al. (1967b).
so signify. The “cover story” was that the experiment had to do with the effects of distraction on their performance on this task, and that distracting stimuli in the form of a tone and an air puff to their eye would be given between pressing the button and lighting of one of the side lamps. Of course the tone and the air puff were the CS and US, respectively, and the guessing task was designed to minimize cognitive involvement in the conditioning. Under these masking conditions, Spence and his students found that extinction following continuous reinforcement is very greatly retarded and is very much like the rate found in experiments with lower animals; under the masking conditions the partial reinforcement extinction effect is absent in humans as it often is in animal experiments on classical aversive conditioning (see Spence, 1966; Wagner et al., 1967). For our present purposes, these findings suggest that there is a level of adult human functioning, corresponding perhaps to functioning at lower phylogenetic and ontogenetic levels, in which learning and extinction (reaction to reinforcement change) can proceed without benefit of cognitive mediation (goal anticipation). This point, which is reminiscent of our earlier quote from Bitterman (1960), has also been made by Amsel (1972b) in the context of a distinction between pure classical and preparatory Pavlovian conditioning, and by Wickelgren (1979), who has very recently pointed out that the Thorndike-Hull S-R associative theorizing may turn out to be an important and appropriate model for learning at noncognitive levels. Our developmental work suggests that transitions between nonparadoxical and paradoxical effects of reward from one infant-rat age to another may be the ontogenetic counterpart of (a) the transitions between the fish-turtle and the bird-mammal stages phylogenetically, and (b) the transitions between masked and unmasked extinction effects in adult human eyeblink conditioning.
The implication in the latter case is that the fish-like infant becomes the adult but remains the fish-like infant at the same time. It would be interesting to show, in this regard, that extinction in infant human eyeblink conditioning has the characteristics of extinction under masked conditions in adults.
V. TOWARD AN ONTOGENETIC ANALYSIS OF PARADOXICAL EFFECTS
We are now ready to summarize some of our recent work on the ontogeny of paradoxical effects in the rat. In general, our strategy has been to start with juvenile- and weanling-age rats and to work with younger and younger animals, concentrating finally on the age range 11 to 14 days. In our presentation, we will combine the treatments of some of these effects because in many cases it was convenient to study two or more of them in the same experiment. Within each
subsection we will present data from experiments over the last 4 or 5 years more or less in the order in which these experiments were performed. At the end of this section we will provide a tabular status report of the ages at which the various paradoxical effects have first been observed in our experiments.
A. THE PREE AND MREE
Our earliest work in this area dealt with the ontogeny of the PREE. As a first approximation to answering questions about the age at which persistence resulting from PRF treatments can first be acquired and retained, we reported differences in PREE between young rats whose runway training started at 30 days of age and weanling rats whose training started at day 18 (Chen and Amsel, 1975). Extinction of running after CRF training was very slow and gradual in the weanling rats, but extinction in the older, but still young, rats was more like the adult pattern, abrupt and negatively accelerated. A comparison of extinction performance following CRF and PRF acquisition at both ages showed a very durable PREE, in the sense that it could be demonstrated following a 45-day vacation period and a phase of CRF reacquisition. In experiments that followed we restricted the training to narrower age ranges and studied younger pups. In the first of these (Burdette et al., 1976), there were four age groups, and at each age original training was restricted to a 2.5-day period, followed by a 2-day extinction period which began 12 hr after terminal acquisition. These experiments revealed a very clear PREE when the CRF and PRF treatments were given as early as in the 18- to 21-day range. In a second experiment, another group was added at each age to equate the PRF and CRF groups for rewards rather than trials. This rewards-equated group is designated PRF-R. Using only the two extreme age groups, trained at 18-21 or 35-38 days, the result was the same: while persistence was not different between PRF and PRF-R conditions at either age, a clear PREE emerged in comparisons between the two PRF groups and the CRF group at both ages. In two experiments we looked for the magnitude of reward extinction effect (MREE) at various ages.
In the first of these (Burdette et al., 1976, Experiment 3) rats were trained at 18-21 or at 36-39 days of age under CRF conditions with either a 45- or 300-mg food pellet as reward. The MREE was shown to operate in preweanling and juvenile rats as it does in adults: larger reward in CRF acquisition led to faster extinction at both ages. It was also the case that preweanling rats were more persistent following CRF training than juveniles at both reward levels. In a second experiment (Chen and Amsel, cited in Amsel, 1979) PRF or CRF acquisition was combined factorially with 45- or 300-mg reward at three ages: preweanling age (17-20 days), and at 30-33 and 55-58 days of age. (At preweanling age an intermediate level of reward, 97 mg, was also included.) Our data show that at all ages, the size of the PREE was greater following training
FIG. 4. Acquisition and extinction of a running response in the rat as a function of age in acquisition (17-20, 30-33, and 55-58 days), reward schedule (CRF vs PRF), and reward magnitude (45-, 97-, and 300-mg food pellets). From Amsel (1979).
with large than with small reward (Fig. 4), and that at all ages the effect is attributable to a direct relation between CRF reward size and rate of extinction, the MREE. To examine more closely the development of the PREE over the preweanling age range, we (Amsel and Chen, 1976, Experiment 1) included three groups spanning 17 to 21 days of age at the start of training along with three older groups for comparison. The older groups started at days 28, 35, and 65. Training was restricted to a 2-day period. A clear PREE was present at all ages. There was an inverse relationship between resistance to extinction and age, particularly after PRF training (Fig. 5). In a second experiment retention of persistence over 45 days was demonstrated at three different ages, including the youngest group, trained at 17-18 days. These experiments confirmed that rats trained in a runway at preweanling, weanling, and juvenile ages show the PREE and MREE in an immediate extinction test, that the PREE is, if anything, greater in preweanlings than in the older rats, and that it is retained into young adulthood. The very interesting questions that arise from an ontogenetic perspective are: (a) Is there a transitional age range for the PREE, the MREE, and other paradoxical effects? (b) What is the order of their first occurrence? (c) Do the approximate ages of first appearance of these effects correspond to periods during which other significant behavioral and physiological changes are occurring? In most of our recent experiments, we have settled on the range 11 to 14 days as an apparently
FIG. 5. Acquisition and extinction of a running response in the rat as a function of age (in days) and reward schedule (CRF vs PRF). From Amsel and Chen (1976).
important one for the appearance of the PREE, and we have moved from this finding to the investigation of other ages and effects. Because rat pups cannot be weaned much, if any, earlier than 16 days of age we needed to develop a procedure that would enable us to study reward-schedule effects in a seminatural setting. The general procedure is as follows: Pups are culled to litters of eight at 3 days of age, are separated from their mothers at times appropriate to deprivation conditions, and are placed in a plastic chamber maintained at 33°C (the average temperature in an undisturbed nest). Experimental training is in a clear Plexiglas alley (usually 32 cm long and 8 cm wide) also maintained at 33°C. Approximately 20 min prior to the first trial, a lactating dam receives an injection of a general anesthetic producing a surgical level of anesthesia and blocking milk release (Lincoln et al., 1973; Wakerley and Lincoln, 1971). Preliminary training in these experiments consists of one or more “priming” trials in which each pup is placed either in the goal chamber or directly against the dam and allowed to attach to a nipple and suckle for a short period of time. At the end of this time the pup is gently detached from the nipple, carried by hand to the opposite end of the alley, and allowed to approach the dam. On reward (R) trials the dam is positioned on her side at the end of the alley and the pup is permitted to attach to a nipple, usually for 15 to 30 sec. On nonreward (N) trials, the dam is removed from the alley or the pup is prevented from reaching her. Using such a procedure we were able to demonstrate appetitive learning and 5-trial patterned alternations of acquisition and extinction of an approach (“crawling”) response in 10-day-old rat pups (Amsel et al., 1976). In this experiment, the dam was removed from the goalbox on N trials. 
Because of the possibility of differential distance cues related to the presence and absence, respectively, of the dam on R and N trials (the “homing” factor), we changed the apparatus and the procedure in later experiments so that the dam was in the goal box on all trials but was inaccessible to the pup on N trials. (In later experiments we also added an exhaust fan to the chamber in which the dam was placed.) This was accomplished with an apparatus in which the anesthetized dam was in the rear portion of the goalbox and could be made accessible or inaccessible to the pup by a door that bisected the box. The most important feature of the procedure was that there was no possible differential stimulation from the mother’s presence or absence while the pup was in the runway and the approach response was being measured (Amsel et al., 1977). Using this changed procedure we repeated the 5-trial ALT experiment of Amsel et al. (1976) and manipulated two factors-presence/absence of the dam on N trials, and 0- versus 15-sec detention in the goalbox on N trials. There was no effect in acquisition attributable to either factor. In extinction, where the dam was either present but inaccessible on all trials or absent on all trials, and where detention in the goal box was 0 or 15 sec, approach responding extinguished in
all groups, the No-Detention groups being somewhat less resistant to extinction than the Detention groups, and the Dam-Present groups extinguishing more slowly than the Dam-Absent groups. The procedure in this experiment eliminates any nonassociative interpretation of acquisition, patterned alternation, and extinction in 11-day-old rats, lending confidence that we are dealing with a true form of learning maintained by the reinforcing properties of suckling/contact. A later experiment (Amsel et al., 1977, Experiment 3) showed that simple contact with a male or female conspecific can also serve as a reinforcer for 11-day-old pups. It was a recent series of three experiments (Letz et al., 1978) that led to the conclusion that the period between 11 and 14 days of age is transitional for the PREE in the rat. In one of these experiments (see Fig. 6 for results) the reinforcer was either 30 sec of dry suckling or 5 sec of milk from an anesthetized dam induced to lactate by oxytocin injection. While the most interesting extinction effect for our purposes was the occurrence of a PREE in 14- but not 11-day-olds, another interesting result was that, at both ages and reinforcement schedules, the Milk groups extinguished more slowly than No-Milk groups and, in fact, the 11-day-old Milk groups showed little evidence of extinction. If suckling with and
FIG. 6. Acquisition and extinction of an approach response in a runway in infant rats as a function of age, reward schedule, and reward magnitude (“milk” = nutritive suckling, “no milk” = dry suckling, on an anesthetized dam). See text for details. From Letz et al. (1978).
without milk can be regarded as two reward magnitudes, which seems reasonable on the basis of the acquisition data, then the MREE might have been expected to occur in the CRF groups. It was not observed at either age. As we have seen, this effect is found in rats as young as preweanling age, 17-20 days, when the reward is dry food rather than milk (Burdette et al., 1976, Experiment 3; Chen and Amsel, cited in Amsel, 1979). In other recent work in our laboratory (Chen and Amsel, 1980a) we have again found evidence of the PREE in preweanling rats given acquisition and extinction between 14 and 15 days of age. In one experiment we used as the reinforcer a restrained unanesthetized lactating female rat that was given periodic ip injections of the hormone oxytocin to stimulate milk release. Nonreinforcement was absence of the dam. There were 28 training trials, CRF or PRF, and 20 trials of extinction. There was also a nonreinforcement (NRF) control. Subjects in the PRF groups were more persistent in extinction than those in the CRF group. The NRF group showed no significant acquisition or “extinction.” In another experiment using a similar procedure where reward was an anesthetized lactating (oxytocin-injected) female we found that a PREE in pups trained for 40 trials between 14 and 15 days was retained over a 13-day interval. In a third experiment (Chen and Amsel, 1980b), we showed a PREE in rats trained at 12-13 or 11-12 days, with extinction on day 13, but not in pups trained between 10 and 11 days, with extinction on day 11 or day 13. We have now made PRF-CRF extinction comparisons under a variety of different conditions, including a very recent experiment involving 120 acquisition trials (Stanton et al., 1980), and we have been unable to induce persistence in 11-day-old rats, but have been able to do so in rats 1 to 3 days older. In the case of the MREE, we have shown this effect in preweanlings but not in 14-day-olds.
In mature rats the strength of the PREE and MREE depends on a number of training variables: number of acquisition trials, trial spacing, deprivation level, reward magnitude, and others. Our failure to demonstrate a PREE at day 11 or MREE at day 14 may therefore reflect, in Bitterman’s terms, too narrow a range of systematic variation. On the other hand, it may possibly reflect true transitional periods in development of the PREE and MREE. There are many observations to suggest that in several other respects the 10-15 day age range does include important transitional periods. Infant rats spend at least 12 hr each day attached to a nipple but receive milk in discrete episodes following the milk ejection reflex triggered by the release of the neurohypophysial hormone, oxytocin (Wakerley and Lincoln, 1971). These brief, intermittent episodes of milk release are separated by 5-15 min of non-milk suckling. Two changes in the suckling behavior have been reported in rat pups 12-14 days of age (Hall et al., 1975): (a) a sharp increase in the number of pups detaching from a nipple and scrambling for another immediately after milk ejection (see also Drewett et al., 1974), and (b) an inverse relation between duration of food
deprivation and latency to attach to a nipple beginning at this age, but not at younger ages. At around 14 days of age pups first open their eyes, gain the ability to thermoregulate, begin to leave the nest, and begin to meet their nutritional needs in ways other than suckling (Bolles and Woods, 1964). In addition, this age range includes the age at which the maternal pheromone is first reported to appear (Holinka and Carlson, 1976; Leon, 1974; Leon and Moltz, 1972). It seems to be the case, then, that the mechanisms responsible for maintaining the mother-infant bond undergo a change at about 2 weeks of age. Up to about 2 weeks of age, suckling, even during the long no-milk intervals, and contact with the mother may be viewed as involving a kind of built-in persistence essential for survival. This suggests the possibility that as eating and drinking come more and more under direct instrumental control of the pup, externally imposed differential-reward schedules and magnitudes may become more and more effective determinants of learned persistence, and of the MREE.
B. SUCCESSIVE NEGATIVE CONTRAST (SNC) AND PATTERNED ALTERNATION (PA)
Information on the ontogeny of SNC has been sparse. In one experiment (Sayheed and Wolach, 1972), SNC was not found in "immature" (45-day-old) rats even though training was conducted in such a way that they reached maturity at the end of the experiment. The rewards used, however, were not different enough to produce SNC in the adult (110-day) control condition. Another study (Roberts, 1966) has shown significant contrast in adult (180-day) but not in immature (25-day) rats following 21 preshift trials given at the rate of one trial a day. Here, too, maturation was confounded with learning; the immature rats were 46 days old at the time of the shift. In any case, neither of these studies used animals young enough to be in the age range we have identified as transitional for the PREE. In one experiment, we (Stanton and Amsel, 1980) examined the effects of downshifts in reward in 11- and 14-day-old rat pups under conditions that were similar to those which produce the PREE (Letz et al., 1978). The major difference between the procedures of Letz et al. and those used in this and other recent experiments in our laboratory is that while Letz et al. used a nutritive suckling condition involving injections of oxytocin to induce milk release, we have recently changed to an oral cannulation procedure in which light cream is injected as reward while pups suckle on an anesthetized dam. The cannulation technique, which offers several advantages, such as control over the timing and amount of milk delivery, was adapted from Hall and Rosenblatt (1977). The cannula, a piece of PE-10 tubing, has a flanged end which rests securely on the tongue; it exits at the ventral surface of the jaw, is pulled through a small fold of skin behind the head, and is secured there with two heat-flanged washers.
Our main experiment on SNC is a 2 x 2 x 3 factorial design, the factors being age (11 vs 14 days), deprivation (16 vs 24 hr), and groups. The groups, representing reward manipulations, are as follows: large reward control (M-M), small reward control (M̄-M̄), and downshifted group (M-M̄). Large reward (suckling/milk, designated “M”) consisted of 30 sec of suckling on the anesthetized dam plus infusion of 0.03 ml commercially available “Half and Half” at room temperature. The infusion, which took about 5 sec, reliably elicited the “stretch reflex” (Lincoln et al., 1973; Vorherr et al., 1967). Small reward (suckling/no-milk, designated “M̄”) consisted of 30 sec of suckling on the anesthetized dam with no diet infused. Figure 7 presents the results from each age-deprivation condition in a separate panel. There were clear and highly significant effects in acquisition of both reward level and deprivation; but at neither age nor deprivation condition was there any evidence of SNC. In each case, the performance of the downshifted group declined to the level of the dry suckling controls but not below. This suggests that preweanling rats do not possess the processes responsible for the expression of SNC. In a second experiment, we sought to produce an SNC effect by running older subjects: 16-day-olds were run under the 24-hr deprivation condition. We chose 16-day-olds because they are still young enough for the suckling/no-milk condition to be an effective lower level of reward. The results were the same: there was no evidence of SNC at 16 days of age; but again, at age 16 days, shifts in levels from high to low reward produced significant shifts in performance appropriate to the low-reward level. The failure to find SNC over the age range 11 to 16 days could have a number of explanations, and it is necessary to summarize these as a lead-in to the rationale for our work on the ontogeny of patterned (single) alternation.
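As a reading aid, the design just described can be enumerated in Python. This is a minimal sketch, not the authors' analysis: the group names are written out as plain strings, and the `shows_snc` helper with its speed arguments is hypothetical. The helper encodes the criterion used in the text, namely that SNC requires the downshifted group to fall below the small-reward control, not merely to reach its level.

```python
from itertools import product

AGES_DAYS = (11, 14)
DEPRIVATION_HR = (16, 24)
GROUPS = {
    # group name: (preshift reward, postshift reward)
    "large_control": ("milk", "milk"),
    "small_control": ("no_milk", "no_milk"),
    "downshifted": ("milk", "no_milk"),
}

# Every cell of the 2 x 2 x 3 factorial design: (age, deprivation, group).
cells = list(product(AGES_DAYS, DEPRIVATION_HR, GROUPS))


def shows_snc(downshifted_speed, small_control_speed):
    """Hypothetical criterion: speeds are in arbitrary units, larger = faster.

    SNC is scored only when postshift performance of the downshifted group
    is below that of the small-reward control.
    """
    return downshifted_speed < small_control_speed
```

Under this criterion a downshifted group that merely declines to the control level, as reported here, does not count as showing contrast.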
As you will recall, an explanation offered for failure to obtain SNC in goldfish (Lowes and Bitterman, 1967), one that could apply equally here, is that instrumental performance in this species (at this age, in our case) is not governed by the incentive (or “reward expectancy”) mechanisms that have been advanced to explain SNC in adults. Instead, the level of learned performance in pups aged 11-16 days reflects the direct reinforcing action of the reward on an S-R association, and the reinforcing action is directly related to the magnitude of the reward. Lowered reward anticipation (or increased frustration) is not a factor at this age. In brief, as Bitterman has said of goldfish and turtles, and as we said of humans undergoing classical conditioning without cognitive involvement, the 11- to 16-day-old rat pup acts like a “Hullian” animal, whose performance, reflecting habit strength, is directly related to magnitude of reinforcement, not to incentive motivation. A second interpretation of these results is in terms of the sequential hypothesis of incentive contrast (Capaldi, 1967; Mackintosh, 1974). The guiding principle
FIG. 7. Effect of reward reduction on performance in a runway as a function of deprivation period (16 vs 24 hr) and age (11 vs 14 days). Group M-M̄ was shifted from suckling/milk to suckling/no-milk. The other groups were not shifted, Group M-M receiving suckling/milk and Group M̄-M̄ receiving suckling/no-milk in both phases. From Stanton and Amsel (1980).
here is that instrumental responding on Trial N + 1 is conditioned to the stimulus trace (or memory) of the reward outcome on Trial N. In the usual contrast experiment the inferior performance of the shifted animals in the postshift phase is the result of generalization decrement: the feedback stimulus from large reward on Trial N, that controls the response on Trial N + 1, is replaced by a feedback stimulus from small reward. A sequential interpretation of the failure to observe SNC in our infant rats would be that this failure reflects their inability to discriminate the aftereffects of suckling/milk from suckling/no-milk. A similar explanation has been offered by Mackintosh (1971) for the absence of SNC in goldfish. A third interpretation of these results derives from frustration theory (Amsel, 1958, 1967), which, you will recall, accounts for SNC in terms of an aversive state (primary frustration, RF), that occurs unconditionally when expected reward exceeds realized reward, and the conditioned form of that state, anticipatory frustration (rF-sF), that arouses an avoidance tendency. This avoidance tendency, added to other possible decremental consequences of reduced reward, produces
the below-baseline performance in downshifted animals. On such a theory, the absence of SNC in preweanling rats reflects (a) absent (or weak) RF, (b) absent (or weak) rF-sF, (c) an absent (or weaker-than-in-adults) connection between sF and avoidance, or (d) some combination of these. In another experiment, this time using only 14-day-old pups, we tried again to produce SNC and, at the same time, to provide some evidence for or against these various interpretations. First, we increased the number of trials, both preshift and postshift. Increasing the number of preshift trials, we reasoned, should increase the strength of the reward expectancy (rR-sR) and hence of primary frustration (RF) following the downshift. It should also increase the effectiveness of generalization decrement when the downshift occurs. The number of postshift trials was increased for similar reasons. It is possible that rF is conditioned very slowly at this age and requires more trials to be effective. We further tested the hypothesis that frustrative processes are weak at this age by extinguishing the low- and high-reward control groups (Groups M̄-M̄ and M-M) at the end of the postshift phase to see if the MREE would appear. (As you recall, we had not found it in earlier experiments with fewer trials at this age.) Finally, and this is important, we assessed the role of carryover (and the discriminability of the large- and small-reward magnitudes) by including a group that was trained on a patterned (single) alternation (PA) schedule. This schedule consisted of regular, single alternations of the large (suckling/milk) and small (suckling/no-milk) rewards used in the SNC condition. When adult rats are run on this schedule, they acquire a tendency to run fast on large- and slow on small-reward trials.
The most popular interpretation of this result, particularly in massed-trials conditions (Mackintosh, 1974), is that animals learn to respond discriminatively based on the previous trial outcome, the S+ being some trace or memory of the previous small reward and the S- that of the previous large reward. Performance on a single-alternation schedule, then, should tell us whether or not sequential processes (Capaldi, 1967) are operating in our 14-day-old rats. Subjects were divided into four groups: the usual three SNC groups, M-M, M̄-M̄, and M-M̄ (N = 12/group from 12 litters), and a group run on an alternation schedule, Group PA-M-M̄ (N = 8 from at least 5 litters). Subjects in Group PA-M-M̄ received suckling/milk reward on odd-numbered trials and suckling/no-milk reward on even-numbered trials. Training was carried out in a single day in three sessions of 40 trials each (120 trials total). The intersession interval was 5-6 hr. For all groups, the intertrial interval was 8 sec. For the three SNC groups, the first session and a half (60 trials) served as the acquisition phase. The postshift phase was 30 trials long, consisting of the second half of session two and the first 10 trials of session three. At that point, extinction began and continued for the remaining 30 trials of session three. Only Groups M-M and M̄-M̄ were extinguished. Group M-M̄ was terminated at the end of the postshift phase. Group PA-M-M̄ received patterned alternation training for all 120 trials of
the experiment. The deprivation interval was 24 hr. The large reward was 0.03 ml Half-and-Half within a 30-sec suckling period; the small reward was 30 sec of dry suckling. Figure 8 summarizes the results. Performance of the three SNC groups over all phases of the experiment is in panels A, B, and C, and performance of Group PA-M-M̄ is in panel D. As in earlier experiments, there was a magnitude-of-reward effect in acquisition: pups approached milk reward significantly faster than dry suckling reward. Group M-M̄ rapidly declined to the level of Group M̄-M̄ performance but not below; there was no SNC. In extinction, there was the suggestion of an MREE, which would seem to fall in line with the failure to find this effect with fewer trials at this age (Letz et al., 1978, Experiment 2), on the one hand, and, on the other hand, with the fact that it is clearly present in preweanlings (Burdette et al., 1976). The important finding of this experiment, in our view, is the presence of patterned responding to single alternations of reward magnitudes in the absence of SNC in 14-day-old rat pups, despite extended preshift training in which they manifested a very large reward-magnitude effect in acquisition.
FIG. 8. A comparison of SNC and patterned alternation in 14-day-old rats. The two control groups, M-M and M̄-M̄, are rewarded with nutritive and dry suckling, respectively, in acquisition (A, B) and this is followed by extinction (C). The shifted group shows the effects of a reduction from nutritive to dry suckling (Group M-M̄, A, B). The PA results are shown in (D). From Stanton and Amsel (1980).
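The trial schedule of this experiment (three sessions of 40 trials; acquisition on Trials 1-60, postshift on Trials 61-90, extinction on Trials 91-120) can be laid out as a minimal sketch. The function names, the group key spellings ("Mbar" standing for the no-milk condition), and the outcome labels are illustrative assumptions, not notation from the original report; the text does not specify what the goal event was on extinction trials, so those trials are labeled only "extinction".

```python
# Hypothetical encoding of the schedule described in the text.
# "milk" = suckling/milk (large reward), "dry" = suckling/no-milk (small reward).
TRIALS_PER_SESSION = 40
ACQUISITION_END = 60   # first session and a half
POSTSHIFT_END = 90     # second half of session two plus first 10 trials of session three
TOTAL_TRIALS = 120

# (preshift outcome, postshift outcome, extinguished after postshift?)
SNC_GROUPS = {
    "M-M": ("milk", "milk", True),        # high-reward control
    "Mbar-Mbar": ("dry", "dry", True),    # low-reward control
    "M-Mbar": ("milk", "dry", False),     # shifted group, terminated after postshift
}


def session_of(trial):
    """Session number (1-3) for a 1-indexed trial."""
    return (trial - 1) // TRIALS_PER_SESSION + 1


def outcome(group, trial):
    """Outcome label for a 1-indexed trial, or None if the group is no longer run."""
    if group == "PA":
        # Patterned alternation: milk on odd trials, dry suckling on even trials.
        return "milk" if trial % 2 == 1 else "dry"
    pre, post, extinguished = SNC_GROUPS[group]
    if trial <= ACQUISITION_END:
        return pre
    if trial <= POSTSHIFT_END:
        return post
    return "extinction" if extinguished else None


if __name__ == "__main__":
    for group in ("M-M", "Mbar-Mbar", "M-Mbar", "PA"):
        marks = [outcome(group, t) for t in (1, 60, 61, 90, 91, 120)]
        print(group, marks)
```

Laid out this way, the only trial-to-trial change in outcome for the SNC groups occurs once (at Trial 61 for the shifted group), whereas Group PA experiences a change on every trial.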
In one of another set of experiments (Chen et al., 1980) involving different feeding systems (eating of dry food and drinking milk out of a cup), adult subjects showed the classic Crespi-Zeaman depression (contrast) effect with dry food as reward, while weanling rats (17 to 24 days old) showed no evidence of SNC with these dry food pellets. In another experiment comparing pups 25-26 and 35-36 days old using food pellet rewards, the 35- to 36-day-olds showed a strong SNC effect and the 25- to 26-day-olds showed a small and marginal SNC effect. When milk taken from a cup was the reward at three ages, 25- to 26-day-olds showed clear SNC, there was a suggestion of SNC in 20- to 21-day-olds, but none in the 16- to 17-day-olds. The large reward was 0.3 ml, the small reward 0.02 ml. A further experiment attempting to produce contrast effects in pups 16-17 days old under massed-trial conditions, and also to determine the effects of magnitude of reward on extinction in 16- to 17-day-olds, failed to show either SNC or the MREE.
FIG. 9. A comparison of patterned (single) alternation in a runway at 11 and 14 days of age. Reward (on odd-numbered trials) was nutritive suckling. Nonreward (on even-numbered trials) was goalbox confinement without access to the dam. From Stanton et al. (1980).
We have been able to demonstrate PA not only at 14 days of age, but even at 11 (Stanton et al., 1980). Three conditions have been employed: (a) suckling with milk delivery alternated with nonreward (no contact with dam); (b) suckling without milk delivery (dry suckling) alternated with nonreward; and (c) alternation of milk-suckling and dry suckling. The apparatus, general methods, and procedures of the experiments were the same as in the previous ones. The results of these experiments can be summarized briefly: all three alternating reward conditions resulted in patterning in both 14- and 11-day-olds. In every experiment, plots of blocks of R against N trials, and trial-by-trial plots of average speeds over the last 10 trials, showed this effect. The data from the first experiment are shown as an example in Fig. 9. A series of control conditions was also run for comparison. One group (CRF) was rewarded on every trial for 120 trials. Another group (PRF) received 120 trials on a 50% reward schedule such that the outcome of Trial N had no predictive value for Trial N + 1. As expected, no suggestion of patterned alternation in running speeds appeared in either of these groups. These experiments show that infant rats, 11 and 14 days of age, can learn patterned responding on the basis of single alternations of various combinations of reward, reduced reward, and nonreward. This learning is most efficient when milk-suckling is alternated with no-suckling nonreward, but occurs as well when dry suckling and nonreward or milk- and dry-suckling reward are alternated. Taken together, these experiments indicate that neither milk-delivery-related nor suckling-contact-related cues are necessary for patterning but that either is sufficient. It seems reasonable, then, to summarize the conditions that have produced patterning in these experiments as any that create a reward discrepancy.
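What it means for the outcome of Trial N to have, or to lack, predictive value for Trial N + 1 can be made concrete with a small sketch that estimates the probability of reward on the next trial given the current outcome. The function name is hypothetical, and the "RRNN" sequence is only an assumed stand-in for a 50% schedule with balanced first-order transitions; the actual PRF sequence used in these experiments is not given in the text (and "RRNN" is, of course, predictable from two trials back).

```python
from collections import Counter


def next_reward_probability(schedule):
    """Estimate P(reward on Trial N+1 | outcome on Trial N).

    `schedule` is a string of 'R' (reward) and 'N' (nonreward).
    Returns a dict keyed by the Trial N outcome; outcomes that are
    never followed by another trial are omitted.
    """
    pairs = Counter(zip(schedule, schedule[1:]))
    probs = {}
    for prev in "RN":
        total = pairs[(prev, "R")] + pairs[(prev, "N")]
        if total:
            probs[prev] = pairs[(prev, "R")] / total
    return probs


single_alternation = "RN" * 60   # reward on odd trials, nonreward on even trials
crf = "R" * 120                  # reward on every trial
balanced_prf = "RRNN" * 30       # assumed illustration of a 50% schedule

if __name__ == "__main__":
    for name, sched in [("PA", single_alternation), ("CRF", crf), ("PRF", balanced_prf)]:
        print(name, next_reward_probability(sched))
```

Under single alternation the previous outcome predicts the next one perfectly (reward never follows reward and always follows nonreward), whereas in the assumed 50% sequence the two conditional probabilities are both close to one-half, so discriminative responding based on the previous outcome has nothing to exploit.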
C. THE OVERTRAINING EXTINCTION EFFECT (OEE)
Our investigations of the OEE are not as far along as is our work on the ontogeny of the other reward effects, but our work on this phenomenon also suggests that a theoretically meaningful transitional period can be determined. We have completed three experiments (Stanton and Amsel, 1980). In the first experiment, we investigated the OEE at 14-15 days of age, and we used the cannulation technique with nutritive suckling on an anesthetized dam as the reward. One group received 85, the other group 25 acquisition trials. We reasoned that if frustrative processes are weak or absent at this age, as is suggested by our data on SNC, the OEE should not be observed. The methods were generally the same as in the SNC work. The apparatus was the same one used in our investigations of contrast and patterning. The entire experiment took six sessions, three sessions per day for 2 days. Group 85 received 20 acquisition trials in each of the sessions from one through four and 5 acquisition trials at the beginning of session five. At this point, extinction began
without signal or delay and continued for 25 trials. An additional 30 extinction trials were given in session 6. Group 25 was treated like Group 85 except that it received handling during the first three sessions rather than acquisition training. The ITI was about 90 sec. In an attempt to equate groups for deprivation, subjects were taken at the end of these sessions to another room and “postfed” at the nipple in a plastic cage bearing no auditory, tactile, or thermal resemblance to the goalbox. Group 85 subjects were fed one reward (0.03 ml) and Group 25 subjects were fed one reward plus the amount received by the Group 85 subjects during the training session. The results can be summarized briefly: There was a significant asymptotic difference, the group given 85 acquisition trials reaching higher speeds than the one given 25 trials. This difference remained throughout the extinction phase: there was no evidence of the OEE. In a second experiment we tested animals 60-70 days of age in a runway using the same spacing and distribution of trials as were used in the first experiment. The reward was a single 190-mg Noyes food pellet. The straight runway was 103 cm, instead of 38 cm, long, and was made of wood instead of Plexiglas. The acquisition results were very similar to those of the first experiment. The greater number of trials resulted in greater asymptotic speeds. In extinction an OEE showed up in each of three measures, and, of course, in an overall speed measure. The number, distribution, and spacing of training trials that failed to produce an OEE at 14-15 days were sufficient to do so at 60-70 days. It is possible, however, that the other procedural differences, not developmental differences, between the two experiments account for their respective outcomes.
To avoid this problem in a third experiment, we compared the performance of preweanling (18- to 19-day-old) and weanling (25- to 26-day-old) rats in the same apparatus and with the same reward, milk in a cup. (The experimental design was the same one used in the previous experiments.) Our use of these ages and this reward was based on the experiments of Chen et al. (1980) showing SNC under these conditions. You will recall that in these experiments, where reward consists of milk in a cup, SNC appears at 25-26 days, being absent at younger ages. Our investigations of the ontogeny of the MREE showed that by 18-19 days this effect also occurs (Amsel and Chen, 1976; Burdette et al., 1976; Letz et al., 1978; Stanton and Amsel, 1980). On the basis of these data, we expected to find the OEE at 25-26 days but not at 18-19 days. The apparatus was the same one used in the second experiment except that the food trough was replaced by a stainless-steel cup. Reward consisted of 0.10 ml commercial Half-and-Half in the cup. In all other respects, the procedure in this experiment was the same as in the second experiment. As in the previous experiments, the 85-trial subjects ran significantly faster than the 25-trial subjects. This was true at both ages. A summary of the extinction data is as follows: while there was some evidence of the OEE at 25 days, it was not strong, and a clear “reverse-OEE” was found at 18 days.
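The six-session design shared by these OEE experiments can be tabulated with a minimal sketch. The names `ACQ_SLOTS`, `EXTINCTION`, and `oee_plan` are hypothetical; the trial counts come from the description of the first experiment, in which Group 25 is handled rather than trained during the first three sessions.

```python
# Acquisition trials available in sessions 1-5 (session 5 opens with 5 trials).
ACQ_SLOTS = [20, 20, 20, 20, 5]
# Extinction trials: 25 in the remainder of session 5, 30 more in session 6.
EXTINCTION = {5: 25, 6: 30}
# Number of initial sessions in which a group is handled instead of trained.
HANDLED_SESSIONS = {"85": 0, "25": 3}


def oee_plan(group):
    """Return a list of (session, acquisition_trials, extinction_trials)."""
    handled = HANDLED_SESSIONS[group]
    plan = []
    for session in range(1, 7):
        trained = handled < session <= len(ACQ_SLOTS)
        acq = ACQ_SLOTS[session - 1] if trained else 0
        ext = EXTINCTION.get(session, 0)
        plan.append((session, acq, ext))
    return plan


if __name__ == "__main__":
    for group in ("85", "25"):
        plan = oee_plan(group)
        print("Group", group, plan, "total acquisition:", sum(a for _, a, _ in plan))
```

Both groups therefore enter extinction at the same point in session five and receive the same 55 extinction trials; they differ only in the number of preceding acquisition trials.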
TABLE II
CURRENT PATTERN OF RESULTS ON REWARD EFFECTS AT VARIOUS AGES

                       Reward effect
Age (days)    PA     PREE    MREE    SNC    OEE
11            Yes    No      No      No     -
14-20         Yes    Yes     No      No     No
21-25         -      Yes     Yes     No     -
25+           -      Yes     Yes     Yes    Yes
Adult         Yes    Yes     Yes     Yes    Yes

Table II provides an idealized summary of our results on the ontogeny of paradoxical effects, showing the approximate age at which we have first seen each effect in our experiments. Our basis for identifying the “transitional” ages is stronger for some effects than for others. Even in our strongest case (the PREE), however, a greater degree of systematic variation could produce a different result. It would therefore be wise to regard this summary as an interim report, and we offer the following admonitions: (a) It would be wiser to think in terms of ease of producing an effect at a particular age than in terms of an absolute age at which an effect first appears. (b) The order of appearance in ontogeny of an effect, as we present this order on the basis of a small number of experiments, should, as we have urged previously, be taken as nothing more than a working hypothesis.

VI. COMMENTS ON THE NEURAL SUBSTRATE OF PARADOXICAL EFFECTS
Because, in the altricial rat, rapid postnatal behavioral and physiological development occur in parallel, the rat’s behavioral ontogeny is particularly open to physiological interpretations. Toward this end, a thorough developmental psychobiology (biopsychology) is a necessary long-run strategy, a fact attested to by the exponential increase in the amount of work we see in these fields. In the short run, however, a simpler strategy is available: combine what is known of the physiological basis of behavior in adults with data on neural development and see if this leads to any understanding of ontogenetic trends in behavior. This short-run strategy has obvious dangers. First, since many neural changes occur simultaneously during development, we must either designate one (or some) of these changes as particularly salient while ignoring the others, a very risky undertaking, or consider all (or many) of the changes simultaneously, in which case we are quickly back to the long-run strategy. Second, the techniques employed in deriving the
adult “reference data” often are not the kind that can provide information that meshes with the developmental data. For example, among the neural developments that occur in ontogeny are changes in the organization, orientation, or afferent-efferent relations of the gross anatomical structures. The common techniques of adult physiology (lesions, stimulation, and recording) do not address the effects of these changes. Third, the short-run strategy adopts, but does not test, the assumption that adult brain-behavior relationships apply as well to infants, that as a portion of the brain achieves adult-like status a corresponding adult-like behavior will be possible. That these examples have been taken from ontogenetic development should not provide comfort to comparative investigators; the problems described in these examples are, if anything, worse in phylogeny (Nauta and Karten, 1970). In spite of these and other dangers, the short-run strategy has value, at least as a preliminary heuristic. It is in this spirit that we offer a selective review of what is known so far of the biopsychology of the paradoxical effects, and search for convergences and divergences between data on physiological and behavioral development. It has been over four decades since Papez (1937) made his now classical proposal of a limbic substrate for emotion. An account of the paradoxical effects in terms of frustration makes it natural to look to the limbic system for their neural substrate. There is now much evidence for limbic involvement in the paradoxical effects. For example, sensitivity to reward changes has been disrupted by lesions of the amygdala (Schwartzbaum, 1960), and SNC has been eliminated by lesions of cingulate cortex (Gurowitz et al., 1970), and by administration of the minor tranquilizers (Rosen et al., 1967; Rosen and Tessel, 1970), which alter the electrical activity of limbic structures (Guerrero-Figueroa et al., 1973).
The component of the limbic circuit which we emphasize in this discussion is the septohippocampal system. The postnatal development of the dentate gyrus of the hippocampus has been studied extensively (Altman and Bayer, 1975). In the rat 90% of the granule cells develop postnatally. Differentiation increases rapidly at about 12-14 days and reaches adult levels at 25-30 days. This has led some writers (Altman et al., 1973; Douglas, 1975) to propose that hippocampal development plays a critical role in the ontogeny of “behavioral inhibition.” These writers, however, differ somewhat in the way in which they characterize inhibition. Altman et al. adhere to a response inhibition view in the simple descriptive sense of peripheral response suppression. Douglas’ (1975) view is that the hippocampus is the substrate of Pavlovian internal inhibition, which is not equivalent to simple response suppression because response suppression may simply involve the competition of two excitatory tendencies, one appetitive, the other aversive. Pavlovian inhibition would operate equally to counteract excitatory tendencies of both kinds (see also Amsel, 1972b). Both parties base their views largely on studies of habituation to novel environments, spontaneous alternation,
and passive avoidance learning. In all of these situations, performance decrements are found both in hippocampally damaged adult rats and in intact infant rats, the maturation of adult-like performance at 25-30 days coinciding with that of the hippocampus. It is also noteworthy that the development of behavioral responsiveness to cholinergic drugs in the rat also coincides with that of the hippocampus (Campbell et al., 1969). Douglas (1975) has documented the evidence that, in adults, drug treatments which alter cholinergic function exert their behavioral effects via their action on the hippocampus, and therefore describes the system which is developing as the “hippocampal-cholinergic inhibitory system” (1975, p. 338). There is a third view on the nature of the behavioral inhibition subserved by the hippocampus. While it does not specifically address a psychobiological theory of behavioral maturation, it brings our work into line with such theories because it is based on a physiological analysis of the PREE in the context of the concept of conditioned frustration. This third view is that of Jeffrey Gray (1975, 1976, 1977) and his colleagues (Gray et al., 1978). According to Gray (1977), the septohippocampal system is an important component of the “behavioral inhibition system,” a system responsible for halting ongoing behavior and increasing attention to the environment in the presence of cues associated with punishment, frustrative nonreward, or novelty. Gray’s theory also considers the septohippocampal system the substrate of maintained responding in the face of such cues (or habituation to their disruptive effect) under conditions in which such responding is the “best alternative” for the organism. [In this respect the system appears to act as the neural substrate for general persistence (Amsel, 1972a).]
Gray’s is a more specific and detailed account of the functioning of the septohippocampal system than the response-braking idea (e.g., Altman et al., 1973), and while the counterconditioning (habituation) aspect resembles Douglas’ (1972) Pavlovian inhibition view, these two positions differ in at least three respects. First, as it is based on two-process learning theory, Gray’s position draws a distinction between inhibition of Pavlovian and of instrumental responding (Gray et al., 1979). This could correspond to what we have referred to, respectively, as unmediated and mediated inhibition (after Amsel, 1972b). Contrary to Douglas (1972, 1975), Gray’s view is that instrumental “inhibition” is subserved by the septohippocampal system. Second, according to Gray this is a system that suppresses, and builds persistence in, responses that have aversive outcomes (punishment or nonreward). Finally, Gray’s position must be distinguished from the others on the basis of the central position he gives to the role of the hippocampal theta rhythm. As Gray sees it, the relation between hippocampal theta and the operation of the septohippocampal system is as follows: Frequencies of theta associated with voluntary behavior (and the absence of reflexive behavior or fixed action patterns) lie above 8 Hz; those associated with reflexive behavior (and the absence
of voluntary behavior) lie below 7 Hz; and theta frequencies in the 7 to 8 Hz range (specifically 7.7 Hz) occur when the behavioral inhibition system (specifically conditioned frustration) is active. The evidence relating this frequency of theta to conditioned frustration is quite convincing and includes the following: (a) 7.7 Hz theta occurs in response to nonreward in the goalbox of a runway (Gray and Ball, 1970). (b) Artificially driving theta (by medial septal stimulation) on 50% of the trials in a goalbox in which rats are always rewarded makes them more persistent than nonstimulated CRF controls (produces a “PREE”); 7.7 Hz theta-driving in extinction reduces persistence (Gray, 1972). (c) Eliminating theta in PRF animals, either by overdriving with high-frequency septal stimulation in acquisition (Gray et al., 1972a), or by medial septal (Gray et al., 1972b) or full septal (Henke, 1974, 1977) lesions in both acquisition and extinction, eliminates the PREE. Thus, all three techniques, recording, stimulation, and lesioning, converge to suggest the same conclusion. Gray’s results have been corroborated in our laboratory by Glazer. He has shown that persistence in instrumental responding can be increased by inducing 7.5-8.5 Hz theta through pharmacological means, instrumental conditioning of theta, or electrical means (Glazer, 1972, 1974a, b, respectively). The effect on the PREE of blocking theta consists of an increase in resistance to extinction in the CRF animals coupled with a decrease in the PRF animals, a pattern that suggests interference with conditioned frustration. Essentially the same effect occurs in adult rats when maturation of the dentate gyrus is disrupted by neonatal X-irradiation (Brunner et al., 1974), in rats treated with antianxiety drugs (Gray, 1977), and in rats subjected to noradrenaline depletion in the dorsal bundle by injection of 6-hydroxydopamine (e.g., Gray et al., 1979; Mason and Iversen, 1978; Owens et al., 1977).
Gray has argued that all of these agents exert their effects by acting on the 7.7 Hz theta rhythm, the hippocampal activity that he proposes as the substrate for frustrative inhibition and for the counterconditioning mechanism in persistence. (Mason and Fibiger, 1978, take a different view: that the dorsal bundle extinction effect, as it is called by them, affects attentional mechanisms in acquisition, and that persistence increases for that reason.) In support of this position, Gray et al. (1979) present data on the effects of various drugs on the “theta-driving curve.” This curve is obtained by plotting the current threshold for driving theta as a function of the stimulation frequency applied to the septal pacemaker cells. In the normal rat, this curve is V-shaped with the minimum threshold lying right at 7.7 Hz. Both the minor tranquilizers and 6-OHDA selectively raise the driving threshold at 7.7 Hz, producing a flatter (rather than a biphasic) theta-driving curve. The model presented in Fig. 10 summarizes Gray’s thinking and may be regarded as a hypothetical neural circuit for conditioned frustration. A number of other views on the functional significance of the hippocampal theta rhythm exist (see Isaacson and Pribram, 1975), and it seems likely that in its broad spectrum
FIG. 10. Hypothetical neural circuit of conditioned frustration and nonreward persistence. Signals predicting a nonreward response enter the medial septum (Med. sept.) and travel to the hippocampus via the theta-producing fibers of the dorsal fornix. The hippocampus then either inhibits nonrewarded behavior or causes such behavior to persist by sending signals via the fimbria to the lateral septum (Lat. sept.), which inhibits the activities of the medial septum. The entire septohippocampal system is innervated by noradrenergic fibers of the dorsal bundle which originate in the locus coeruleus. The effects of antianxiety (antifrustration) drugs and of dorsal bundle lesions on persistence are thus explained. From Gray et al. (1978).
conditioned frustration is, at best, just one of the many concomitants of theta. Nevertheless, there appears to be a plain empirical relation between frustrative suppression, the PREE, and hippocampal theta in adult rats, and this suggests the question: Do 11-day-old rat pups, which show no evidence of the PREE, also fail to show hippocampal theta? Existing evidence (Vanderwolf et al., 1975) is that the age of first appearance of hippocampal theta in the rat is in the 12- to 14-day range, precisely the period that we have identified as transitional for the PREE. The fact that theta can be recorded at this age indicates the onset of at least some rudimentary functioning of the hippocampus. There is, interestingly enough, parallel evidence that the corticosterone level, which Levine and others (e.g., Coover et al., 1971) have shown to be elevated in appetitive extinction, and have taken to represent a physiological response to frustrative nonreward (Levine et al., 1972), is virtually absent in the rat between 6 and 12 days of age, and shows a significant rise at day 14 that continues until it peaks at day 24 (Henning, 1978). McEwen et al. (1975) have surveyed evidence for an interrelation between this steroid hormone and hippocampal function and in a recent report (Micco et al., 1979) have implicated corticosterone in nonreward persistence (Fig. 10). In addition to the work of Gray’s group on the PREE, there exist lesion studies with adult rats which implicate the septohippocampal system in the other paradoxical effects of reward. Hippocampal lesions eliminate SNC (Franchina and Brown, 1971), and septal lesions appear to abolish the MREE (Wolfe
et al., 1966). In the runway, hippocampal damage either eliminates (Franchina and Brown, 1970; Brunner et al., 1974, Experiment 3) or delays (Brunner et al., 1974, Experiment 2) the appearance of patterned alternation. In a Skinner box go/no-go paradigm, hippocampectomy eliminates patterning at an 80-sec ITI (Walker et al., 1972), enhances it at an 8-sec ITI (Means et al., 1970), and delays it at a 5-sec ITI (Warburton, 1969). Cholinergic antagonists also impair PA in normal rats in this situation (Warburton, 1972). It is worth considering, therefore, how our data as a whole relate to the issue of hippocampal-cholinergic maturation and the ontogeny of inhibition. This involves two considerations: the degree of resemblance of the infant rats in our appetitive learning situation to adults with impaired hippocampal-cholinergic function, and the nature of the inhibition involved (response-braking, Pavlovian, or frustrative). With regard to the resemblance issue, the answer depends on the paradoxical effect in question and on whether, to go back to an earlier point, one chooses to speak in relative or absolute terms. In relative terms, our data indicate that inhibition increases with age. Extinction following both CRF and PRF training becomes more rapid with age (Fig. 5) and the appearance of some of the continuous-reward paradoxical effects (OEE and SNC) does coincide with the “completion” of hippocampal maturation. In absolute terms, however, we have documented instances of response inhibition, which in adults depends on hippocampal-cholinergic function, at ages (11 and 14 days) which clearly precede the proliferation of granule cells in the dentate gyrus: Acquisition and retention of the PREE are shown in the preweanling period; the MREE, which appears at 18-22 days, antedates hippocampal maturation; and patterned alternation, where response inhibition is very clear, is present even at 11 days of age (Fig.
9).³ Clearly, a deficit in, or absence of, response-braking (Altman et al., 1973) does not fit the performance of these 11-day-olds. They have no trouble “slamming on the brakes” after initiating responses on nonrewarded trials. A conclusion on the matter of whether the inhibition deficit is of a Pavlovian or frustrative nature is more difficult. It is of course possible that instrumental response suppression represents the summation of both kinds of “inhibition.” What our data show is not an absence of inhibition in the hippocampally immature rat, but rather the absence of paradoxical inhibition. While preweanlings do not show SNC, their performance does decline with reward reduction to the level of the low-reward controls. Similarly, our failure to find the MREE until about 18 days of age, or the OEE until even later, does not reflect an absence of inhibition (in this case, extinction). Extinction is quite clearly present, but its
³While it is true that hippocampal lesions do not abolish PA in all situations (see above references) and that there is even evidence for a noncholinergic system mediating PA (Warburton, 1969), in the runway apparatus and with the training parameters (ITI, number of trials) we have employed, our 11- to 14-day-old rats bear a much closer resemblance to intact adults than to adults with hippocampal lesions in their ability to pattern on a single-alternation schedule.
strength is directly, rather than inversely, related to acquisition reward factors. It is just not of the paradoxical sort. Possibly this nonparadoxical inhibition is present at birth. A recent report purports to show the reversal of an appetitive discrimination at 1 day of age (Johanson and Hall, 1979). Two recent reports (Campbell and Raskin, 1978; Smith and Spear, 1978) may be important in understanding the apparent contradiction between our findings and those on which the views of Altman and Douglas are based. These reports stress the importance of “nest cues” as a factor in the learning and/or motivation of infant rats. Campbell and Raskin (1978) point out that evidence for the Jacksonian principle of caudal-rostral (noradrenergic/excitatory-cholinergic/inhibitory) brain development may depend on the testing situation used. They found that the peak in activity of rats at day 15, previously thought to reflect noradrenergic/excitatory maturation occurring in advance of cholinergic/inhibitory maturation (Fibiger et al., 1970), could be eliminated by testing the animals in the presence of nest cues. Their conclusion was that novel environments are particularly activating to young rats, and that activity measures in these environments are inappropriate to investigations of the Hughlings Jackson principle. Smith and Spear (1978) showed that the ability of 16-day-old rats to display inhibition in passive avoidance and spontaneous alternation tasks, precisely the situations that are the source of Altman’s and Douglas’ views, was dramatically affected by the presence of nest stimuli. When shavings from the nest were present, performance on all these tasks was adult-like. When clean shavings or no shavings were present, however, the classic inhibition deficit was found.
If one assumes that an anesthetized dam constitutes a nest environment stimulus, as Smith and Spear do, our findings of simple response inhibition (but not necessarily paradoxical effects) in rats in the 11- to 16-day age range provide additional support for the idea that infants will be shown to have greater inhibitory capacity when they are tested in an environment that is not strongly arousing. Perhaps the rudimentary functioning of the hippocampus, suggested by the appearance of theta at 12-14 days, has behavioral impact in situations of low but not high arousal. Such an assumption would reconcile our data with the failures of others to find inhibition at this age.

VII. CONCLUDING CONSIDERATIONS: IMPLICATIONS FOR BEHAVIOR AND BEHAVIOR THEORY
A. ONTOGENY OF APPETITIVE BEHAVIOR
While the main thrust of this article concerns theoretical considerations related to the ontogeny of learning and the paradoxical effects of appetitive reward, our work has some bearing on the ontogeny of appetitive behavior. Evidence has accumulated (Hall and Rosenblatt, 1977, 1978; Hall et al.,
1975, 1977) which points to the development, between 10 and 20 days of age in the rat, of “the ability to pair the physiological consequences of suckling with the suckling behavior itself” (Hall and Rosenblatt, 1978, p. 424). The evidence comes from studies of nipple attachment and detachment. Attachment comes increasingly under the control of nutritive deprivation during this developmental period (Hall et al., 1977). Nipple detachment and termination of suckling appear at 10 days of age to be due to stomach fill per se (Hall and Rosenblatt, 1978), whereas at 20 days of age, detachment occurs before the stomach is full and depends very much on the nutritive value of what is ingested. What appears to be occurring is an increase with age in the ability to inhibit attachment and suckling in the absence of hunger. That suckling in the absence, as well as in the presence, of milk delivery reinforces approach and attachment in deprived 11- and 14-day-old infant rats is consistent with investigations of suckling behavior (Blass et al., 1977; Hall and Rosenblatt, 1977; Hall et al., 1977). Our finding (Fig. 7) that deprivation increases approach speeds at 14 days but not at 11 days of age also is consistent with one of the aforementioned findings: that the incentive properties of suckling are becoming more deprivation-related over this age range. Deprivation increases approach speeds whether suckling/milk or suckling/no-milk is the reward. Another of our findings relevant to the work on the ontogeny of appetitive behavior is that, with deprivation held constant, suckling/milk appears to be a larger reward than suckling/no-milk, even at 11 days of age. This is true both between-subjects, as in our studies involving CRF acquisition (Figs. 6, 7, and 8), and within-subjects, as in our studies of patterned alternation (Fig. 9; Stanton et al., 1980).
Whereas this finding does not contradict the suckling work in any direct empirical sense, since that work rests on a consummatory rather than an instrumental response measure, it is contradictory to a possible implication of the suckling work that, at 11 days of age, dry-suckling and suckling for milk are about equally rewarding.⁴ These young animals seem more sensitive to the nutritive consequences of their instrumental behavior than they are to the consequences of suckling itself. This may mean that instrumental response measures are more sensitive than consummatory measures to motivational reward differences, possibly because the latter are more difficult to inhibit (Hall and Rosenblatt, 1978, p. 424; Martin and Alberts, 1979), or that milk intake has a rewarding effect on appetitive behavior at an earlier age than satiation suppresses it. Whether this facilitating effect of milk intake depends on the gastric consequences of milk ingestion or on some other mechanism remains to be determined.

⁴This conclusion was drawn in a recent report by Kenny et al. (1979), who tested preferences for nutritive and dry suckling in a Y-maze. Our ability to demonstrate differential effects of these events at 11 days is attributable to our testing procedure and training parameters. In our patterned alternation procedure, which most resembles their preference test in that it provides a within-subjects assessment of nutritive and dry suckling, we employed an 8-sec ITI and 120 trials. On the basis of unpublished data from our laboratory, we believe that had we used a 30-sec ITI and only 60 trials, as Kenny et al. did, our results would agree with theirs.

B. ONTOGENY OF REWARD LEARNING: DIFFICULTIES FOR THEORY

The developmental data we have presented pose a problem for accounts of instrumental learning that attribute PA on the one hand, and the PREE, SNC, MREE, and OEE on the other, to the same mechanisms. The basic discrepancy that emerges out of our preliminary ontogenetic work can be summarized as follows: reward effects which involve intermittent reward (patterning, the PREE) develop sooner than those involving continuous reward (the MREE, SNC, the OEE). One can get patterned alternation at an earlier age than the PREE, and the PREE, in turn, at an earlier age than SNC, the MREE, and the OEE.

The appearance of PA at an earlier age than other paradoxical effects indicates that the failure of these young animals to respond in an adult-like manner to incentive reductions cannot be attributed to the absence of sequential (e.g., memorial) processes (Capaldi, 1972). This finding is noteworthy because of the emphasis these processes have been given by Mackintosh (1974) and others in explanations of the paradoxical effects of instrumental reward such as SNC. Earlier experiments with adult rats (e.g., Amsel et al., 1969; Surridge and Amsel, 1966) had failed to show patterned alternation under conditions (24-hr ITI) that are nevertheless capable of yielding SNC, the PREE, the OEE, the MREE, and other instrumental reward effects (but see Capaldi and Spivey, 1964, for the opposite results). Our earlier results seemed to show that sequential processes are not necessary for the occurrence of these effects. Our finding that infant rats show PA but do not show SNC or a clear OEE or MREE seems to permit the stronger statement that sequential processes are not sufficient to account for these effects either.
If sequential interpretations of incentive reduction hinge on a “correlation between successive contrast and alternation learning” (Mackintosh, 1974, p. 401), our experiments showing alternation learning in 11- and 14-day-olds suggest that this correlation is limited, at best.

This dissociation of patterned alternation and these other paradoxical effects appears also to be characteristic of the behavior of “lower” organisms. Fish and turtles do not show the MREE, SNC, OEE, or a spaced-trials PREE. They do, however, show a massed-trials PREE and, with sufficient training, fish show patterned alternation (Gonzalez, 1972).

The dissociation of the PREE and the continuous reward paradoxical effects (MREE, SNC, OEE) in the preweanling period also poses problems for Frustration Theory. One solution to the problem, suggested by Bitterman (1975) for instrumental learning in nonmammalian species (e.g., fish and turtles), has been to make the distinction between the “massed-trials PREE,” reflecting reward-aftereffect mechanisms, and the “spaced-trials PREE,” reflecting frustration mechanisms, and to attribute incentive-shift phenomena to “spaced-trial” (i.e., frustration) mechanisms. Accordingly, a PREE-SNC dissociation in the 14- to 16-day age range may indicate the presence of sequential processes at this age and that the PREE obtained at this age is of the massed-trial variety and not dependent on frustrative processes. By this reasoning, the persistence we have demonstrated so far in rats 12-16 days of age is a sequential-processes PREE, and our failure to find SNC, the MREE, and the OEE in this age range is due to the absence of the required frustrative processes.

There is another possible hypothesis to explain the apparent ontogenetic dissociation of SNC, the MREE, and OEE on the one hand, and the PREE on the other, within the context of frustration theory. It is that, while frustration may be conditioned in 14-day-old rats, its response-suppressive properties are weak. (See our earlier reference to the marginal presence of hippocampal theta and the corticosterone response at this age.) It would follow from such a hypothesis that phenomena requiring very great (paradoxical) suppression (SNC, MREE, and OEE) do not for this reason appear at this age. In the case of the PREE, however, where persistence requires the counterconditioning of rF-sF to approach, the weaker response-suppressive effects of rF-sF would not rule out and might even facilitate such counterconditioning. It is in this case a matter only of counterconditioning to weaker stimuli.
C. CONCLUDING COMMENTS

As we pointed out earlier, the biopsychological study of paradoxical effects can bring together learning theory, brain physiology and neuroanatomy, and the phylogenetic and ontogenetic study of behavior. This article was an attempt to move in this direction. To paraphrase an earlier statement: We have begun to show that there is an ontogenetic level that appears to correspond to lower phylogenetic levels and perhaps even to a lower level of adult human functioning. At this level, learning, extinction, and other reactions to reinforcement change occur without benefit of goal anticipation (cognitive mediation). Our developmental work suggests that transitions between nonparadoxical and paradoxical effects of reward from one infant-rat age to another may be the ontogenetic counterpart of (a) the transition between the fish-turtle and bird-mammal stages phylogenetically, and (b) the transitions between, for example, masked and unmasked extinction effects in adult human eyeblink conditioning. Our review of the literature of brain function over the period of our developmental stages has shown that there are concurrent changes in brain neurophysiology and neuroanatomy, particularly in the limbic system (specifically in the most-studied structure, the hippocampus), which has been identified as an organ of inhibition, frustration, and anticipation (memory), all of which are involved in what we have called paradoxical functioning.
References

Altman, J., and Bayer, S. 1975. Postnatal development of the hippocampal dentate gyrus under normal and experimental conditions. In “The Hippocampus: Structure and Development” (R. L. Isaacson and K. H. Pribram, eds.), Vol. 1, pp. 95-122. Plenum, New York.
Altman, J., Brunner, R. L., and Bayer, S. A. 1973. The hippocampus and behavioral maturation. Behav. Biol. 8, 557-596.
Amsel, A. 1958. The role of frustrative nonreward in noncontinuous reward situations. Psychol. Bull. 55, 102-119.
Amsel, A. 1962. Frustrative nonreward in partial reinforcement and discrimination learning: Some recent history and a theoretical extension. Psychol. Rev. 69, 306-328.
Amsel, A. 1967. Partial reinforcement effects on vigor and persistence: Advances in frustration theory derived from a variety of within-subjects experiments. In “The Psychology of Learning and Motivation” (K. W. Spence and J. T. Spence, eds.), Vol. 1, pp. 1-65. Academic Press, New York.
Amsel, A. 1971. Positive induction, behavioral contrast, and generalization of inhibition in discrimination learning. In “Essays in Neobehaviorism” (H. H. Kendler and J. T. Spence, eds.), pp. 217-236. Appleton, New York.
Amsel, A. 1972a. Behavioral habituation, counterconditioning, and a general theory of persistence. In “Classical Conditioning II: Current Research and Theory” (A. H. Black and W. F. Prokasy, eds.), pp. 409-426. Appleton, New York.
Amsel, A. 1972b. Inhibition and mediation in classical, Pavlovian and instrumental conditioning. In “Inhibition and Learning” (R. A. Boakes and M. S. Halliday, eds.), pp. 275-300. Academic Press, New York.
Amsel, A. 1979. The ontogeny of appetitive learning and persistence in the rat. In “Ontogeny of Learning and Memory” (N. E. Spear and B. A. Campbell, eds.), pp. 189-224. Erlbaum, Hillsdale, New Jersey.
Amsel, A., and Chen, J. 1976. Ontogeny of persistence: Immediate and long-term persistence in rats varying in training age between 17 and 65 days. J. Comp. Physiol. Psychol. 90, 808-820.
Amsel, A., and Roussel, J. 1952. Motivational properties of frustration: I. Effect on a running response of the addition of frustration to the motivational complex. J. Exp. Psychol. 43, 363-368.
Amsel, A., and Ward, J. S. 1954. Motivational properties of frustration: II. Frustration drive stimulus and frustration reduction in selective learning. J. Exp. Psychol. 48, 37-47.
Amsel, A., Rashotte, M. E., and MacKinnon, J. R. 1966. Partial reinforcement effects within subject and between subjects. Psychol. Monogr. 80, Whole No. 628.
Amsel, A., Hug, J. J., and Surridge, C. T. 1969. Subject-to-subject trial sequence, odor trials, and patterning at 24-h ITI. Psychon. Sci. 15, 119-120.
Amsel, A., Burdette, D. R., and Letz, R. 1976. Appetitive learning, patterned alternation, and extinction in 10-day-old rats with nonlactating suckling as reward. Nature (London) 262, 816-818.
Amsel, A., Letz, R., and Burdette, D. R. 1977. Appetitive learning and extinction in 11-day-old rat pups: Effects of various reinforcement conditions. J. Comp. Physiol. Psychol. 91, 1156-1167.
Armus, H. L. 1959. Effect of magnitude of reinforcement on acquisition and extinction of a running response. J. Exp. Psychol. 58, 61-63.
Bitterman, M. E. 1960. Towards a comparative psychology of learning. Am. Psychol. 15, 704-712.
Bitterman, M. E. 1965. Phyletic differences in learning. Am. Psychol. 20, 396-410.
Bitterman, M. E. 1975. The comparative analysis of learning. Science 188, 699-709.
Blass, E. M., Teicher, M. H., Cramer, C. P., Bruno, J. P., and Hall, W. G. 1977. Olfactory, thermal, and tactile controls of suckling in preauditory and previsual rats. J. Comp. Physiol. Psychol. 91, 1248-1260.
Bolles, R. C., and Woods, P. C. 1964. The ontogeny of behavior in the albino rat. Anim. Behav. 12, 427-441.
Bower, G. H. 1961. A contrast effect in differential conditioning. J. Exp. Psychol. 62, 196-199.
Brunner, R. L., Haggbloom, S. J., and Gazzara, R. A. 1974. Effects of hippocampal x-irradiation-produced granule-cell agenesis on instrumental runway performance in rats. Physiol. Behav. 13, 485-494.
Burdette, D. R., Brake, S., Chen, J., and Amsel, A. 1976. Ontogeny of persistence: Immediate extinction effects in preweanling and weanling rats. Anim. Learn. Behav. 4, 131-138.
Bush, R. R., and Mosteller, F. 1951. A mathematical model for simple learning. Psychol. Rev. 58, 313-323.
Bush, R. R., and Mosteller, F. 1955. “Stochastic Models for Learning.” Wiley, New York.
Campbell, B. A., and Raskin, L. A. 1978. Ontogeny of behavioral arousal: The role of environmental stimuli. J. Comp. Physiol. Psychol. 92, 176-184.
Campbell, B. A., Lytle, L. D., and Fibiger, H. C. 1969. Ontogeny of adrenergic arousal and cholinergic inhibitory mechanisms in the rat. Science 166, 635-637.
Capaldi, E. J. 1967. A sequential hypothesis of instrumental learning. In “The Psychology of Learning and Motivation” (K. W. Spence and J. T. Spence, eds.), Vol. 1, pp. 67-156. Academic Press, New York.
Capaldi, E. J. 1972. Memory and learning: A sequential viewpoint. In “Animal Memory” (W. K. Honig and P. H. R. James, eds.), pp. 111-154. Academic Press, New York.
Capaldi, E. J., and Spivey, J. E. 1964. Stimulus consequences of reinforcement and nonreinforcement: Stimulus traces or memory. Psychon. Sci. 1, 403-404.
Chen, J., and Amsel, A. 1975. Durability and retention of persistence acquired by young and infant rats. J. Comp. Physiol. Psychol. 89, 238-245.
Chen, J., and Amsel, A. 1980a. Learned persistence at 11-12 but not at 10-11 days in infant rats. Dev. Psychobiol. (in press).
Chen, J., and Amsel, A. 1980b. Retention under changed-reward conditions of persistence learned by infant rats. Dev. Psychobiol. (in press).
Chen, J., Gross, K., and Amsel, A. 1980. Ontogeny of successive negative contrast and its dissociation from other paradoxical reward effects in preweanling rats. J. Comp. Physiol. Psychol. (in press).
Colbert, E. H. 1955. “Evolution of the Vertebrates.” Wiley, New York.
Coover, G. D., Goldman, L., and Levine, S. 1971. Plasma corticosterone increases produced by extinction of operant behavior in rats. Physiol. Behav. 6, 261-263.
Crespi, L. P. 1942. Quantitative variation of incentive and performance in the white rat. Am. J. Psychol. 55, 467-517.
Daly, H. B. 1974. Reinforcing properties of escape from frustration aroused in various learning situations. In “The Psychology of Learning and Motivation” (G. H. Bower, ed.), Vol. 8, pp. 187-231. Academic Press, New York.
Douglas, R. J. 1972. Pavlovian conditioning and the brain. In “Inhibition and Learning” (R. A. Boakes and M. S. Halliday, eds.), pp. 529-553. Academic Press, New York.
Douglas, R. J. 1975. The development of the hippocampal function. In “The Hippocampus” (R. L. Isaacson and K. H. Pribram, eds.), pp. 327-361. Plenum, New York.
Drewett, R. F., Stratham, C., and Wakerley, J. B. 1974. A quantitative analysis of the feeding behaviour of suckling rats. Anim. Behav. 22, 907-913.
Elliott, M. H. 1928. The effect of change of reward on the maze performance of rats. Univ. of California Pub. in Psychology No. 4, pp. 19-30.
Estes, W. K. 1950. Toward a statistical theory of learning. Psychol. Rev. 57, 94-107.
Estes, W. K., and Straughan, J. R. 1954. Analysis of a verbal conditioning situation in terms of statistical learning theory. J. Exp. Psychol. 47, 225-234.
Fibiger, H. C., Lytle, L. D., and Campbell, B. A. 1970. Cholinergic modulation of adrenergic arousal in the developing rat. J. Comp. Physiol. Psychol. 72, 384-389.
Franchina, J. J., and Brown, T. S. 1970. Response patterning and extinction in rats with hippocampal lesions. J. Comp. Physiol. Psychol. 70, 66-72.
Franchina, J. J., and Brown, T. S. 1971. Reward magnitude shift effects in rats with hippocampal lesions. J. Comp. Physiol. Psychol. 76, 365-370.
Garcia, J., and Koelling, R. A. 1966. Relation of cue to consequence in avoidance learning. Psychon. Sci. 4, 123-124.
Glazer, H. I. 1972. Physostigmine and resistance to extinction. Psychopharmacologia 26, 387-394.
Glazer, H. I. 1974a. Instrumental conditioning of hippocampal theta and subsequent response persistence. J. Comp. Physiol. Psychol. 86, 267-273.
Glazer, H. I. 1974b. Instrumental response persistence following induction of hippocampal theta frequency during fixed-ratio responding in rats. J. Comp. Physiol. Psychol. 86, 1156-1162.
Gonzalez, R. C. 1972. Patterning in goldfish as a function of magnitude of reinforcement. Psychon. Sci. 28, 53-55.
Gonzalez, R. C., and Bitterman, M. E. 1967. Partial reinforcement effect in goldfish as a function of amount of reward. J. Comp. Physiol. Psychol. 64, 163-167.
Gonzalez, R. C., and Bitterman, M. E. 1969. Spaced-trials partial reinforcement effect as a function of contrast. J. Comp. Physiol. Psychol. 67, 94-103.
Gonzalez, R. C., and Champlin, G. 1974. Positive behavioral contrast, negative simultaneous contrast and their relation to frustration in pigeons. J. Comp. Physiol. Psychol. 87, 173-187.
Gonzalez, R. C., Behrend, E. R., and Bitterman, M. E. 1965. Partial reinforcement in the fish: Experiments with spaced trials and partial delay. Am. J. Psychol. 78, 198-207.
Gonzalez, R. C., Behrend, E. R., and Bitterman, M. E. 1967a. Reversal learning and forgetting in bird and fish. Science 158, 519-521.
Gonzalez, R. C., Holmes, N. K., and Bitterman, M. E. 1967b. Resistance to extinction in the goldfish as a function of frequency and amount of reward. Am. J. Psychol. 80, 269-275.
Gonzalez, R. C., Potts, A., Pitcoff, K., and Bitterman, M. E. 1972. Runway performance of goldfish as a function of complete and incomplete reduction in amount of reward. Psychon. Sci. 27, 305-307.
Goodrich, K. P. 1959. Performance in different segments of an instrumental response chain as a function of reinforcement schedule. J. Exp. Psychol. 57, 57-63.
Gray, J. A. 1972. Effects of septal driving of the hippocampal theta rhythm on resistance to extinction. Physiol. Behav. 8, 481-490.
Gray, J. A. 1975. “Elements of a Two-Process Theory of Learning.” Academic Press, London.
Gray, J. A. 1976. The behavioral inhibition system: A possible substrate for anxiety. In “Theoretical and Experimental Bases for the Behaviour Therapies” (M. P. Feldman and A. Broadhurst, eds.), pp. 3-41. Wiley, New York.
Gray, J. A. 1977. Drug effects on fear and frustration: Possible limbic site of action of minor tranquilizers. In “Handbook of Psychopharmacology, Vol. 8: Drugs, Neurotransmitters and Behavior” (L. L. Iversen, S. D. Iversen, and S. H. Snyder, eds.), pp. 433-529. Plenum, New York.
Gray, J. A., and Ball, G. C. 1970. Frequency-specific relation between hippocampal theta rhythm, behavior, and amobarbital action. Science 168, 1246-1248.
Gray, J. A., Arujo-Silva, M. T., and Quintao, L. 1972a. Resistance to extinction after partial reinforcement training with blocking of the hippocampal theta rhythm by septal stimulation. Physiol. Behav. 8, 497-502.
Gray, J. A., Quintao, L., and Arujo-Silva, M. T. 1972b. The partial reinforcement extinction effect in rats with medial septal lesions. Physiol. Behav. 8, 491-496.
Gray, J. A., Feldon, J., Rawlins, J. N. P., Owens, S., and McNaughton, N. 1978. The role of the septo-hippocampal system and its noradrenergic afferents in behavioural responses to nonreward. In “Functions of the Septo-Hippocampal System” (Ciba Found. Symp. 58), pp. 275-300. Elsevier, Amsterdam.
Gray, J. A., Rawlins, J. N. P., and Feldon, J. 1979. Brain mechanisms in the inhibition of behavior. In “Mechanisms of Learning and Motivation: A Memorial Volume to Jerzy Konorski” (A. Dickinson and R. A. Boakes, eds.), pp. 295-316. Wiley, New York.
Guerrero-Figueroa, R., Gallant, D. M., Guerrero-Figueroa, C., and Gallant, J. 1973. Electrophysiological analysis of the action of four benzodiazepine derivatives on the central nervous system. In “The Benzodiazepines” (S. Garattini, E. Mussini, and L. O. Randall, eds.), pp. 489-511. Raven, New York.
Gurowitz, E. M., Rosen, A. J., and Tessel, R. E. 1970. Incentive shift performance in cingulectomized rats. J. Comp. Physiol. Psychol. 70, 476-481.
Haggard, D. F. 1959. Acquisition of a simple running response as a function of partial and continuous schedules of reinforcement. Psychol. Rec. 9, 11-18.
Hall, W. G., and Rosenblatt, J. S. 1977. Suckling behavior and intake control in the developing rat pup. J. Comp. Physiol. Psychol. 91, 1232-1247.
Hall, W. G., and Rosenblatt, J. S. 1978. Development of nutritional control of food intake in suckling rat pups. Behav. Biol. 24, 413-427.
Hall, W. G., Cramer, C. P., and Blass, E. M. 1975. Developmental changes in the suckling of rat pups. Nature (London) 258, 318-320.
Hall, W. G., Cramer, C. P., and Blass, E. M. 1977. Ontogeny of suckling in rats: Transitions toward adult ingestion. J. Comp. Physiol. Psychol. 91, 1141-1155.
Hanson, H. M. 1959. Effects of discrimination training on stimulus generalization. J. Exp. Psychol. 58, 321-334.
Henderson, K. 1966. Within-subjects partial-reinforcement effects in acquisition and in later discrimination learning. J. Exp. Psychol. 72, 704-713.
Henke, P. G. 1974. Persistence of runway performance after septal lesions in rats. J. Comp. Physiol. Psychol. 86, 760-767.
Henke, P. G. 1977. Dissociation of the frustration effect and the partial reinforcement extinction effect after limbic lesions in rats. J. Comp. Physiol. Psychol. 91, 1032-1038.
Henning, S. J. 1978. Plasma concentrations of total and free corticosterone during development in the rat. Am. J. Physiol. 235, 451-456.
Hinde, R. A. 1966. “Animal Behaviour: A Synthesis of Ethology and Comparative Psychology.” McGraw-Hill, New York.
Hinde, R. A., and Stevenson-Hinde, J. 1973. “Constraints on Learning: Limitations and Predispositions.” Academic Press, New York.
Holinka, C. F., and Carlson, A. D. 1976. Pup attraction to lactating Sprague-Dawley rats. Behav. Biol. 16, 489-505.
Honig, W. K., Thomas, D. R., and Guttman, N. 1959. Differential effects of continuous extinction and discrimination training on the generalization gradient. J. Exp. Psychol. 58, 145-152.
Hubel, D. H., and Wiesel, T. N. 1959. Receptive fields of single neurones in the cat’s striate cortex. J. Physiol. 148, 574-591.
Hull, C. L. 1943. “Principles of Behavior.” Appleton, New York.
Hulse, S. H., Jr. 1958. Amount and percentage of reinforcement and duration of goal confinement in conditioning and extinction. J. Exp. Psychol. 56, 48-57.
Humphreys, L. G. 1939a. The effect of random alternation of reinforcement on the acquisition and extinction of conditioned eyelid reactions. J. Exp. Psychol. 25, 141-158.
Humphreys, L. G. 1939b. Acquisition and extinction of verbal expectations in a situation analogous to conditioning. J. Exp. Psychol. 25, 294-301.
Humphreys, L. G. 1940. Extinction of conditioned psychogalvanic responses following two conditions of reinforcement. J. Exp. Psychol. 27, 71-75.
Humphreys, L. G. 1943. The strength of a Thorndikian response as a function of the number of practice trials. J. Comp. Physiol. Psychol. 35, 101-110.
Isaacson, R. L., and Pribram, K. H. 1975. “The Hippocampus, Vol. 1: Structure and Development and Vol. 2: Neurophysiology and Behavior.” Plenum, New York.
Ison, J. R., and Cook, P. E. 1964. Extinction performance as a function of acquisition trials. Psychon. Sci. 1, 245-246.
Johanson, I. B., and Hall, W. G. 1979. Appetitive learning in 1-day-old rat pups. Science 205, 419-421.
Kalat, J. W., and Rozin, P. 1972. You can lead a rat to poison but you can’t make him think. In “Biological Boundaries of Learning” (M. E. P. Seligman and J. L. Hager, eds.), pp. 115-122. Appleton, New York.
Kendler, H. H., and Kendler, T. S. 1975. From discrimination learning to cognitive development: A neobehavioristic odyssey. In “Handbook of Learning and Cognitive Processes” (W. K. Estes, ed.), Vol. 1, pp. 191-247. Erlbaum, Hillsdale, New Jersey.
Kenny, J. T., Stoloff, M. L., Bruno, J. P., and Blass, E. M. 1979. Ontogeny of preference for nutritive over nonnutritive suckling in albino rats. J. Comp. Physiol. Psychol. 93(4), 752-759.
Kimble, G. A. 1961. “Hilgard and Marquis Conditioning and Learning” (2nd ed.). Appleton, New York.
Leon, M. 1974. Maternal pheromone. Physiol. Behav. 13, 441-453.
Leon, M., and Moltz, H. 1972. The development of the pheromonal bond in the albino rat. Physiol. Behav. 8, 683-686.
Letz, R., Burdette, D. R., Gregg, B., Kittrell, M. E., and Amsel, A. 1978. Evidence for a transitional period for the development of persistence in infant rats. J. Comp. Physiol. Psychol. 92, 856-866.
Levine, S., Goldman, L., and Coover, G. D. 1972. Expectancy and the pituitary-adrenal system. In “Physiology, Emotion and Psychosomatic Illness” (Ciba Found. Symp. 8). Elsevier, Amsterdam.
Lewis, D. J. 1960. Partial reinforcement: A selective review of the literature since 1950. Psychol. Bull. 57, 1-28.
Lincoln, D. W., Hill, A., and Wakerley, J. B. 1973. The milk-ejection reflex of the rat: An intermittent function not abolished by surgical levels of anaesthesia. J. Endocrinol. 57, 459-476.
Longo, N., and Bitterman, M. E. 1960. The effect of partial reinforcement with spaced practice on resistance to extinction in the fish. J. Comp. Physiol. Psychol. 53, 169-172.
Lovejoy, E. 1965. An attention theory of discrimination learning. J. Math. Psychol. 2, 342-362.
Lowes, G., and Bitterman, M. E. 1967. Reward and learning in the goldfish. Science 157, 455-457.
McEwen, B. S., Gerlach, J. L., and Micco, D. J. 1975. Putative glucocorticoid receptors in hippocampus and other regions of the rat brain. In “The Hippocampus, Vol. 1: Structure and Development” (R. L. Isaacson and K. H. Pribram, eds.), pp. 285-322. Plenum, New York.
MacKinnon, J. R. 1966. Interactive effects of the two rewards in a differential magnitude of reward discrimination. J. Exp. Psychol. 75, 329-338.
Mackintosh, N. J. 1971. Reward and aftereffects of reward in the learning of goldfish. J. Comp. Physiol. Psychol. 76, 225-232.
Mackintosh, N. J. 1974. “The Psychology of Animal Learning.” Academic Press, New York.
Martin, L. T., and Alberts, J. R. 1979. Taste aversions to mother’s milk: The age-related role of nursing in acquisition and expression of a learned association. J. Comp. Physiol. Psychol. 93, 430-445.
Mason, S. T., and Fibiger, H. C. 1978. Noradrenaline and partial reinforcement in rats. J. Comp. Physiol. Psychol. 92, 1110-1118.
Mason, S. T., and Iversen, S. D. 1978. Reward, attention and the dorsal noradrenergic bundle. Brain Res. 150, 135-148.
Means, L. W., Walker, D. W., and Isaacson, R. L. 1970. Facilitated single alternation go, no-go performance following hippocampectomy in the rat. J. Comp. Physiol. Psychol. 72, 278-285.
Micco, D. J., Jr., McEwen, B. S., and Shein, W. 1979. Modulation of behavioral inhibition in appetitive extinction following manipulation of adrenal steroids in rats: Implications for involvement of the hippocampus. J. Comp. Physiol. Psychol. 93, 323-329.
Murillo, N. R., Diercks, J. K., and Capaldi, E. J. 1961. Performance of the turtle, Pseudemys scripta troostii, in a partial-reinforcement situation. J. Comp. Physiol. Psychol. 54, 204-206.
Nauta, W. J. H., and Karten, H. J. 1970. A general profile of the vertebrate brain, with sidelights on the ancestry of cerebral cortex. In “The Neurosciences: Second Study Program” (F. O. Schmitt, ed.), pp. 7-25. Rockefeller Univ. Press, New York.
North, A. J., and Stimmel, D. T. 1960. Extinction of an instrumental response following a large number of reinforcements. Psychol. Rep. 6, 227-234.
Owens, S., Broader, M. R., and Gray, J. A. 1977. The effects of depletion of forebrain noradrenaline on the runway behavior of rats. Exp. Brain Res. 28, R22-R23.
Papez, J. W. 1937. A proposed mechanism of emotion. Arch. Neurol. Psychiat. 38, 725-743.
Pavlov, I. P. 1927. “Conditioned Reflexes” (G. V. Anrep, trans.). Oxford Univ. Press, London and New York.
Pert, A., and Bitterman, M. E. 1970. Reward and learning in the turtle. Learn. Motiv. 1, 121-128.
Pettigrew, J. 1980. Hierarchical, binocular feature filtering: Parallel evolution of the same strategy in cat and owl. In “The Role of Feature Detectors in the Analysis of Visual Form and Pattern” (D. G. Albrecht and L. Davis, eds.) (in preparation).
Pubols, B. H., Jr. 1956. The facilitation of visual and spatial discrimination reversal by overlearning. J. Comp. Physiol. Psychol. 49, 243-248.
Rashotte, M. E. 1979. Reward training: Contrast effects. In “Animal Learning: Survey and Analysis” (M. E. Bitterman, V. M. LoLordo, J. B. Overmier, and M. E. Rashotte). Plenum, New York.
Reid, L. S. 1953. The development of noncontinuity behavior through continuity learning. J. Exp. Psychol. 46, 107-112.
Rescorla, R. A., and Wagner, A. R. 1972. A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In “Classical Conditioning II: Current Research and Theory” (A. H. Black and W. F. Prokasy, eds.), pp. 64-99. Appleton, New York.
Reynolds, G. S. 1961. Relativity of response rate and reinforcement frequency in a multiple schedule. J. Exp. Anal. Behav. 4, 179-184.
Reynolds, G. S. 1963. Some limitations on behavioral contrast and induction during successive discrimination. J. Exp. Anal. Behav. 6, 131-139.
Robbins, D. 1970. Partial reinforcement: A selective review of the alleyway literature since 1960. Psychol. Bull. 76, 415-431.
Roberts, W. A. 1966. The effects of shifts in magnitude of reward on runway performance in immature and adult rats. Psychon. Sci. 5, 37-38.
Rosen, A. J., and Tessel, R. E. 1970. Chlorpromazine, chlordiazepoxide and incentive-shift performance in the rat. J. Comp. Physiol. Psychol. 72, 257-262.
Rosen, A. J., Glass, D. H., and Ison, J. R. 1967. Amobarbital sodium and instrumental performance following reward reduction. Psychon. Sci. 9, 129-130.
Rosenblatt, J. S. 1971. Suckling and home orientation in the kitten: A comparative-developmental study. In “The Biopsychology of Development” (E. Tobach, L. R. Aronson, and E. Shaw, eds.), pp. 345-410. Academic Press, New York.
Sayheed, H., and Wolach, A. H. 1972. Successive incentive shifts with immature and mature rats. Jap. Psychol. Res. 14, 54-60.
Schutz, S. L., and Bitterman, M. E. 1969. Spaced trials partial reinforcement and resistance to extinction in the goldfish. J. Comp. Physiol. Psychol. 68, 126-128.
Schwartzbaum, J. S. 1960. Changes in reinforcing properties of stimuli following ablation of the amygdaloid complex in monkeys. J. Comp. Physiol. Psychol. 53, 388-395.
Seligman, M. E. P., and Hager, J. L. 1972. “Biological Boundaries of Learning.” Appleton, New York.
Sheffield, V. F. 1949. Extinction as a function of partial reinforcement and distribution of practice. J. Exp. Psychol. 39, 511-526.
Shettleworth, S. J. 1978. Reinforcement and the organization of behavior in golden hamsters: Pavlovian conditioning with food and shock unconditioned stimuli. J. Exp. Psychol. Anim. Behav. Proc. 4, 152-169.
Skinner, B. F. 1938. “The Behavior of Organisms.” Appleton, New York.
Smith, G. J., and Spear, N. E. 1978. Effects of the home environment on withholding behaviors and conditioning in infant and neonatal rats. Science 202, 327-329.
Spence, K. W. 1960. “Behavior Theory and Learning.” Prentice-Hall, New York.
Spence, K. W. 1966. Cognitive and drive factors in the extinction of the conditioned eye-blink in human subjects. Psychol. Rev. 73, 445-458.
Stanton, M., and Amsel, A. 1980. Adjustment to reward reduction (but no negative contrast) in rats 11, 14, and 16 days of age. J. Comp. Physiol. Psychol. 94, 446-458.
Stanton, M., Daily, W., and Amsel, A. 1980. Patterned (single) alternation in 11- and 14-day-old rats under various reward conditions. J. Comp. Physiol. Psychol. 94, 459-471.
Surridge, C. T., and Amsel, A. 1966. Acquisition and extinction under single alternation and random partial-reinforcement conditions with a 24-hour intertrial interval. J. Exp. Psychol. 72, 361-368.
Sutherland, N. S., and Mackintosh, N. J. 1971. “Mechanisms of Animal Discrimination Learning.” Academic Press, New York.
Thorndike, E. L. 1911. “Animal Intelligence: Experimental Studies.” Macmillan, New York.
Traupmann, K. L. 1972. Drive, reward, and training parameters, and the overlearning-extinction effect (OEE). Learn. Motiv. 3, 359-368.
Tyler, D. W., Wortz, E. C., and Bitterman, M. E. 1953. The effect of random and alternating partial reinforcement on resistance to extinction in the rat. Am. J. Psychol. 66, 57-65.
Vanderwolf, C. H., Kramis, R., Gillespie, L. A., and Bland, B. H. 1975. Hippocampal rhythmic slow activity and neocortical low-voltage fast activity: Relations to behavior. In “The Hippocampus, Vol. 2: Neurophysiology and Behavior” (R. L. Isaacson and K. H. Pribram, eds.), pp. 101-128. Plenum, New York.
Vorherr, H., Kleeman, C. R., and Lehman, E. 1967. Oxytocin-induced stretch reaction in suckling mice and rats: A semiquantitative bioassay for oxytocin. Endocrinology 81, 711-715.
Wagner, A. R. 1961. Effects of amount and percentage of reinforcement and number of acquisition trials on conditioning and extinction. J. Exp. Psychol. 62, 234-242.
Wagner, A. R., Siegel, L. S., and Fein, G. G. 1967. Extinction of conditioned fear as a function of percentage of reinforcement. J. Comp. Physiol. Psychol. 63, 160-164.
Wakerley, J. B., and Lincoln, D. W. 1971. Intermittent release of oxytocin during suckling in the rat. Nature (London) 233, 180-181.
Walker, D. W., Messer, L. G., Freund, G., and Means, L. W. 1972. Effect of hippocampal lesions and intertrial interval on single-alternation performance in the rat. J. Comp. Physiol. Psychol. 80, 469-477.
Warburton, D. M. 1969. Effects of atropine sulfate on single alternation in hippocampectomized rats. Physiol. Behav. 4, 641-644.
Warburton, D. M. 1972. The cholinergic control of internal inhibition. In “Inhibition and Learning” (R. A. Boakes and M. S. Halliday, eds.), pp. 431-460. Academic Press, New York.
Wickelgren, W. A. 1979. Chunking and consolidation: A theoretical synthesis of semantic networks, configuring in conditioning, S-R versus cognitive learning, normal forgetting, the amnesic syndrome, and the hippocampal arousal system. Psychol. Rev. 86, 44-60.
Wolfe, J. W., Lubar, J. F., and Ison, J. R. 1966. Effects of medial cortical lesions on appetitive instrumental conditioning. Physiol. Behav. 2, 239-244.