BEHAVIORAL AND NEURAL BIOLOGY 29, 359-364 (1980)
Adaptation to Rewarding Brain Stimuli of Differing Amplitude

J. A. DEUTSCH
Department of Psychology, University of California, San Diego, La Jolla, California 92093

DRAKE CHISHOLM
Department of Psychology, Bridgewater State College, Bridgewater, Massachusetts 02324

AND

P. A. MASON¹
Department of Psychology, University of California, San Diego, La Jolla, California 92093

The prolongation of a brain reward stimulus produces no further increment of reward past a certain point. Such a phenomenon can be predicted from Gallistel's (1978, Journal of Comparative and Physiological Psychology, 92, 977-998) leaky integrator model or from adaptation. Adaptation takes a longer time to occur if the intensity of the stimulus is raised. No such effect of intensity is predicted from the leaky integrator model. Equivalent increase of reward magnitude (either by frequency or current increase) produces a longer time to adaptation in brain reward stimulation.
¹ We thank Warren G. Young for the design and construction of the electronic apparatus and Professor C. R. Gallistel for valuable discussion. Supported by Grant NSF BNS 78-01605 to J. A. Deutsch.

0163-1047/80/070359-06$02.00/0 Copyright © 1980 by Academic Press, Inc. All rights of reproduction in any form reserved.

As the length of a rewarding brain stimulus is increased beyond about 1 sec there is very little increase in its reward value. This has been shown both in the runway situation where the measure of reward is running speed (Gallistel, 1974, 1978) and in the Y-maze situation where the measure is proportion of choice (Deutsch, Roll, & Wetter, 1976). Such an effect could be produced by adaptation of the type familiar in sensory systems. Adaptation to centrally applied electrical stimulation has been shown elsewhere to occur in the pain system (Deutsch & Dennis, 1975)
and is therefore not a feature of peripheral receptors exclusively. Alternatively, such a diminution of effect as the length of the stimulus train increases could be a property of the leaky integrator model proposed by Gallistel (1974, 1978). In such a model the input charges a network. The decay rate in the network is such that the effective range of temporal integration is about 2 sec. "If the rewarding effect (the memory of reinforcement) depends only on the peak value of the output from the neural network, then the animal should have little preference between two equal-strength inputs both of which last longer than the duration at which the output approaches its peak value" (Gallistel, 1978).

The adaptation model, on the other hand, assumes that it is not the peak signal that is utilized. Instead it is assumed that the instantaneous values of the reward signal, above a certain threshold, are summed. However, the value of a constant signal fed in declines as a function of time from the beginning of the signal.

These two ways of looking at the same phenomenon generate some different predictions. If we consider the phenomenon as an example of adaptation, we would expect a stronger reward signal to take longer to adapt, by analogy with other sensory systems and with results on central stimulation (Dennis, 1976). The leaky integrator model does not appear to generate such a prediction, but instead would predict the signal to peak at the same time regardless of amplitude, at least until saturation begins to occur. Then a stronger signal should actually reach ceiling first.

To decide between these two viewpoints we therefore ran an experiment in a Y-maze at two levels of reward. The two levels of reward were obtained in two ways. In the first we kept the frequency of the stimulus the same but used two levels of current. In the second we kept the current level the same but used two pulse frequencies to produce two levels of reward equivalent to those used in the first part of the experiment.
At each level we had rats compare the relative effectiveness of pulse trains of different lengths against a standard length of 1 sec.

METHOD

Two male Sprague-Dawley rats (350 g) were implanted with bipolar electrodes aimed at the lateral hypothalamus (-0.5 from bregma, 1.7 L, -8.2 from dura, incisor bar always set at +5.0). The two rats were trained to run in an automated Y-maze. There they were trained to discriminate intensity differences until no further improvement seemed to be occurring.
Apparatus

The apparatus consisted of a Y-maze with a lever in the starting alley and in each of the two goal arms. The starting alley was 23 in. long. The goal arms, set at right angles to each other and at 135° to the starting alley, were 16 in. long. To decrease the rats' speed between the start and the goal, the rats had to climb over an obstacle 3.5 in. high and 9.5 in. long, 12 in. from the start of the alley. Pressing the lever once in one of the goal arms produced one brain reward. Another press in the same goal arm could not produce another brain reward until the rat had run back to the start and pressed the lever there. The start lever functioned not only as a reset but also delivered a brain reward, though only for the first press. After that, another reward was available only after the rat had pressed one of the levers in the goal arms. The brain reward in the starting alley was of 1-sec duration, with each pulse 0.1 msec in duration (100 pps at 400 µA).
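The lever contingency described above amounts to a two-state rule, which the following minimal Python sketch illustrates. The class and lever names are hypothetical. Two points that the description leaves open are treated as assumptions: that a session begins with the start lever armed, and that a rewarded press in either goal arm disarms both goal levers until the start lever is pressed again.

```python
class YMazeLevers:
    """Illustrative model of the reward contingency (names are hypothetical)."""

    def __init__(self):
        # Assumption: the session begins with the start lever armed.
        self.start_armed = True
        self.goal_armed = False

    def press(self, lever):
        """Return True if this lever press delivers a brain reward."""
        if lever == "start":
            if not self.start_armed:
                return False  # only the first start press is rewarded
            self.start_armed = False
            self.goal_armed = True  # the start press resets the goal levers
            return True
        if lever in ("left", "right"):
            if not self.goal_armed:
                return False  # no further goal reward until the start reset
            # Assumption: one goal reward disarms both goal levers.
            self.goal_armed = False
            self.start_armed = True
            return True
        raise ValueError("lever must be 'start', 'left', or 'right'")


maze = YMazeLevers()
for lever in ("start", "start", "left", "left", "start", "right"):
    print(lever, maze.press(lever))
```

Under these assumptions a rat can collect at most one start reward and one goal reward per run, which is what makes each goal-arm press a discrete choice trial.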
Procedure

The basic paradigm in all conditions was that the rat was required to choose between two stimuli differing in duration for 80 trials. In each set of 80 trials, one reward stimulus always lasting 1 sec was available on one side of the maze. On the other side, another stimulus (either ⅛, ¼, ½, 1, or 2 sec) was available. In any set of 80 trials the lengths of the two stimuli to be compared were kept constant. Also, in each set of 80 trials the sides of the two stimuli were reversed after 40 trials. At the outset of each set of trials and after the side reversal following 40 trials the rat was forced to sample each side to make sure that he visited one of the sides at least three times in the first 15 trials of each condition. The first 15 trials of each condition (original training and reversal) were discarded and only the last 25 trials of each condition were combined to form a preference score and used for statistical purposes.

The sets of trials were run in the following order (1-sec pulse train always the standard):

1. ⅛ sec
2. ¼ sec
3. ½ sec
4. 1 sec
5. 2 sec
(sets 1-5 all at 100 pps at 400 µA)

6. ⅛ sec
7. ¼ sec
8. ½ sec
9. 1 sec
10. 2 sec
(sets 6-10 all at 100 pps, at 290 µA for rat 10 and at 310 µA for rat 9)

These sets of trials were then repeated another three times for each rat. This procedure enabled us to locate the site of probable difference between the two curves generated. As this was located at the ½-sec point, this critical point was then repeated eight times for each rat in the following way. A set of trials at the higher intensity was alternated with a set at the lower intensity (both comparing a 1-sec with a ½-sec train).

This first experiment compared two intensities of reward, varying amplitude but holding frequency constant. The second experiment increased the intensity of reward to the same extent as the first, but using the same amplitude while increasing frequency. In order to equate a stimulus of 300 µA with one at 100 pps at 400 µA, the latter was placed as reward in one
goal box and the former was placed in the other, both stimuli lasting for 1 sec. The frequency of the 300-µA stimulus was then adjusted from one block of trials to the next until a preference score of 25/50 was obtained. This frequency value was used for the second experiment. Again the values of ⅛ to 2 sec were compared with 1 sec in sequence, this being repeated four times. To make sure that no change in criterion had occurred for the two rats, the ½-sec value was then run 12 times, in strict alternation between the three conditions of the experiment. A set of 80 trials was run at the high frequency and low amplitude (high reward), low frequency and high amplitude (same high reward), and low intensity and frequency (low reward), and this was repeated four times. No change in criterion was found.
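The scoring rule in the Procedure (discard the first 15 trials of each 40-trial half, then pool the last 25 of each) can be sketched in Python as follows. The function name and data layout are hypothetical. The sketch also assumes that the score counts choices of the comparison (non-standard) stimulus out of the 50 retained trials; the text states only that scores are out of 50.

```python
def preference_score(first_half, second_half, discard=15):
    """Count comparison-stimulus choices over the retained trials of one set.

    first_half and second_half hold the 40 choices made before and after the
    side reversal; an entry is True when the rat chose the comparison stimulus.
    Assumption: the score counts comparison choices (the text gives only the
    denominator, 50).
    """
    for block in (first_half, second_half):
        if len(block) != 40:
            raise ValueError("each half of a set must contain 40 trials")
    # Drop the first 15 trials of each half and pool the remaining 25 + 25.
    kept = list(first_half[discard:]) + list(second_half[discard:])
    return sum(bool(choice) for choice in kept)


# Hypothetical data: a rat that always picks the comparison stimulus.
always = [True] * 40
print(preference_score(always, always))  # 50 of the 50 retained trials
```

On this reading, a score near 25 indicates indifference between the two trains, which is the criterion used above to equate the 300-µA stimulus with the 400-µA one.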
RESULTS

In all conditions there was no consistent difference between the choice of a 1-sec train against itself and the choice of a 1-sec train against a 2-sec train identical in all other respects (Table 1). It therefore seems that at the reward values tested there is little or no increment in reward value when a stimulus is prolonged from 1 to 2 sec. The 1-sec train is overwhelmingly preferred over ⅛ and ¼ sec at all values of amplitude and frequency. However, the situation is rather different when ½ sec is compared with 1 sec.

Rat 9
For rat 9, the 1-sec train at the high reward value is chosen much more frequently than the ½-sec value (mean = 15.95). When we compare this mean with the mean choice of 1 sec vs 1 sec at the high reward value (mean = 24.88) there is a highly significant difference (t(26) = 8.024, p = 1.7 × 10⁻⁸). At the low reward value the 1-sec train is chosen only a little more frequently than the ½-sec value (mean 22.33). When we compare this mean with the mean choice of 1 sec vs 1 sec at the low reward value (mean 25.75) the difference (3.42) is significant only on a one-tailed test (t(14) = 2.08, p < .05). Comparing the high and low levels of reward directly for rat number 9 at the 1-sec vs ½-sec conditions, the difference between the high and low reward means is 6.38 trials (t(30) = 6.26, p = 6.76 × 10⁻⁷).

Rat 10
Similarly for rat 10, the 1-sec train at the high reward value is chosen much more frequently than the ½-sec value (mean = 12.65). When we compare this mean with the mean choice of 1 sec vs 1 sec (mean 25.0) there is a highly significant difference (t(26) = 10.04, p = 1 × 10⁻⁹). At the low reward value, the 1-sec train is chosen only slightly more frequently than the ½-sec value (mean 22.42). When we compare this mean with the mean choice of 1 sec vs 1 sec at the low reward value (mean 24.75) the difference is significant only on a one-tailed test (t(14) = 2.02, p < .05). Comparing the high and low levels of reward directly at the 1-sec vs ½-sec conditions, the difference between the high and low reward means is 9.77 trials (t(30) = 9.169, p = 1 × 10⁻⁹).

[TABLE 1: values not recoverable]

DISCUSSION

The results show unequivocally that at higher intensities of reward it takes longer before no further increment of reward occurs. In this way brain reward behaves analogously to adaptation in sensory systems, where stimuli of higher intensity take longer to adapt. On the other hand, the results pose a difficulty for the leaky integrator model (Gallistel, 1974, 1978). Our results are also inconsistent with the assumption that the chronaxie of the strength-duration function for trains of pulses is constant, or independent of the experimenter's choice of values of the independent variables such as pulse frequency. It appears to be constant only if reward level is kept the same, and Gallistel's (1978) experiment was done without varying the reward level.

On the other hand, our results confirm other conclusions reached by Gallistel (1976, 1978). We have again shown a reciprocity between the number of pulses and the current required to produce a given level of reward. Our results also support the notion put forward by Gallistel (1978) that the reduction in effectiveness of a brain stimulus as train length increases is due to the properties of the neural network after brain stimulation has been translated into reward. This is because we have shown that the rate of adaptation depends on the magnitude of reward rather than the intensity or frequency of stimulation.
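The contrast between the two accounts can be illustrated numerically. The Python sketch below is only a minimal illustration under stated assumptions, not either author's formal model. The leaky integrator is taken to be a linear first-order system whose peak output determines reward. The adapting signal is taken to decay exponentially from train onset, with reward equal to the integral of the signal while it exceeds threshold. The time constant, threshold, and amplitudes are hypothetical values chosen for illustration and are not fitted to the data.

```python
import math


def leaky_peak(amplitude, duration, tau):
    """Peak output of a first-order leaky integrator (assumed linear) driven
    by a constant input of the given amplitude for `duration` seconds."""
    return amplitude * (1.0 - math.exp(-duration / tau))


def adaptation_cutoff(amplitude, threshold, tau):
    """Time at which an exponentially adapting signal (assumed form) falls
    to threshold; zero if the signal never exceeds threshold."""
    if amplitude <= threshold:
        return 0.0
    return tau * math.log(amplitude / threshold)


def adaptation_reward(amplitude, duration, threshold, tau):
    """Integral of the adapting signal over the part of the train during
    which it is still above threshold."""
    t_end = min(duration, adaptation_cutoff(amplitude, threshold, tau))
    return amplitude * tau * (1.0 - math.exp(-t_end / tau))


TAU = 0.5        # hypothetical time constant (sec), not fitted to the data
THRESHOLD = 1.0  # hypothetical threshold, arbitrary units

for amplitude in (2.0, 4.0):  # hypothetical low and high reward signals
    ratio_leaky = leaky_peak(amplitude, 0.5, TAU) / leaky_peak(amplitude, 1.0, TAU)
    ratio_adapt = (adaptation_reward(amplitude, 0.5, THRESHOLD, TAU)
                   / adaptation_reward(amplitude, 1.0, THRESHOLD, TAU))
    print(amplitude, round(ratio_leaky, 3), round(ratio_adapt, 3))
```

With these illustrative values, the ratio of ½-sec to 1-sec reward under the leaky integrator is the same at both amplitudes, since amplitude cancels from the ratio. Under the adaptation sketch the ratio is 1 at the lower amplitude, because the signal has already fallen to threshold before ½ sec, and is below 1 at the higher amplitude. Only the second pattern resembles the finding that ½ sec and 1 sec are nearly equivalent at low reward but not at high reward.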
REFERENCES

Dennis, S. G. (1976). Adaptation of aversive brain stimulation. II. Effects of current level and pulse frequency. Behavioral Biology, 18, 515-530.
Deutsch, J. A., and Dennis, S. (1975). Adaptation of aversive brain stimulation: Effects of pulse frequency. Behavioral Biology, 13, 245-250.
Deutsch, J. A., Roll, P. L., and Wetter, F. (1976). Choice between rewarding brain stimuli of differing length. Behavioral Biology, 18, 369-377.
Gallistel, C. R. (1974). Note on temporal summation in the reward system. Journal of Comparative and Physiological Psychology, 87, 870-876.
Gallistel, C. R. (1976). Spatial and temporal summation in the neural circuit subserving brain-stimulation reward. In A. Wauquier & E. T. Rolls (Eds.), Brain-Stimulation Reward. New York: American Elsevier.
Gallistel, C. R. (1978). Self-stimulation in the rat: Quantitative characteristics of the reward pathway. Journal of Comparative and Physiological Psychology, 92, 977-998.