J. Theoret. Biol. (1963) 4, 193-214

Learning and Electrical Self-Stimulation of the Brain

J. A. DEUTSCH

Departments of Psychiatry and Psychology, Stanford University, Palo Alto, California, U.S.A.

(Received 22 September 1961 and, in revised form, 9 November 1962)

A theory of learning concerning approach behavior in the rat is applied to the phenomena of electrical self-stimulation of the brain. First, the theory is outlined and its application to learning in the intact animal is shown. Examples of subsequently verified predictions are then given. The theory is then applied to the phenomena of intracranial self-stimulation. Here learning is induced by the direct stimulation of parts of the central nervous system. However, such learning differs in many important respects from normal learning. It is shown that its anomalies are to be predicted from the theory. Further experimental verifications to test the theory in its application to intracranial self-stimulation are also described. The reason why a theory based on behavioral evidence can be used to predict at a more physiological level is that it is a hypothesis about what type of system it is which produces observed behavior.

Introduction

It is the purpose of this paper to show how certain phenomena of learning produced by interference with the normal functioning of the central nervous system can be explained by a theory of learning. This is not a descriptive theory. That is, it does not describe learning as do most learning theories up to the present time, for instance by seeking to state in mathematical terms the relation of learning to variables which the experimenter can manipulate. Instead, this theory describes a system such as would exhibit the observed peculiarities of learning. Such a system is independent of the precise physical properties of the components out of which a machine embodying this system could be built. The properties manifested by the system are not a reflection of the properties of the components of the system, but a result of the interrelations of the components, and it is these interrelations which form the system and produce the behavioral properties which the theory seeks to explain. Such a theory can explain behavior even if the identity of its components is not specified. However, the usefulness of such a theory can be even further extended if its components can be given an actual identification in terms of neural elements. Given such an identification, it is possible to use the theory not only for the

purposes of behavioral prediction, but for physiological prediction also. We shall consider how a particular system, behavioral predictions from which have been tested, can be applied to behavior where physiological interference with the nervous system has taken place. The theory to be discussed was developed in an attempt to provide an explanation of learning as we see it manifested in the performance of the rat. Not only can the theory account for a wide variety of experimental results from drinking to reasoning, but it has also successfully predicted further phenomena in these fields (Deutsch, 1960b). This theory has also been translated into a machine manifesting the behavior which the theory claims to explain (Deutsch, 1954).

The Theory

Within the space of this paper it will only be possible to give a sketch of those parts of the theory relevant to the phenomena under discussion. (For a fuller statement of the theory and the evidence for it, see Deutsch, 1960b.) For ease of exposition the formal properties of the system and their identification will be presented together.

(1) It is assumed that there are units or elements in the nervous system (to be called primary links) which are excited by various physiological states.

(2) An increase in such states produces an increase in excitation in such primary links. For instance, an increase in osmotic pressure will increase excitation in one primary link; an increase in testosterone level will increase excitation in another.

(3) Each primary link has connected to it an "analyser", i.e. a receptor structure, sensitive only to one class of stimuli in the environment. For instance, the analyser connected to the primary link sensitive to osmotic pressure may be identified in part with the water-salt receptor (Deutsch & Jones, 1960).

(4) The sensitivity of a primary link to a relevant physiological state (such as osmotic pressure) will be reduced by messages relayed to the link by the analyser connected to it. Such a reduction in sensitivity will dissipate with time. This arrangement permits the animal to make up a deficit quickly, but without overshooting. By the time the reduction in sensitivity has worn off, the physiological change which irritated the primary link initially will also have been reversed (see Fig. 1).

(5) There is also a large number of secondary links. A secondary link is connected to the motor system and also to an analyser in such a way that when the secondary link is excited (conditions for this are stated below), the analyser and motor system form a loop, which acts to maximize the excitation of the analyser.
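The self-limiting arrangement of rule (4) can be illustrated numerically. Everything in the sketch below — the units, the firing threshold, the depression factor, the recovery rate and the absorption delay — is an illustrative assumption, not a quantity from the theory; the point is only that immediate depression of the link lets a delayed physiological correction catch up without overshoot.

```python
# Sketch of rule (4): excitation = sensitivity x deficit. Each drink
# fires the analyser and depresses the link's sensitivity at once,
# while the swallowed water corrects the deficit only after a delay;
# the depression then dissipates with time. Constants are illustrative.

def simulate(depression=0.9, steps=400, absorb_delay=25):
    deficit = 100.0        # physiological deficit (arbitrary units)
    sensitivity = 1.0      # irritability of the primary link
    pending = []           # (absorption_step, amount) of water drunk
    drunk = 0.0
    for t in range(steps):
        deficit -= sum(a for ts, a in pending if ts == t)   # water absorbed
        pending = [(ts, a) for ts, a in pending if ts > t]
        if sensitivity * max(deficit, 0.0) > 10.0:          # link fires: drink
            pending.append((t + absorb_delay, 2.0))
            drunk += 2.0
            sensitivity *= depression    # analyser message depresses the link
        sensitivity = min(1.0, sensitivity + 0.005)  # depression dissipates
    return deficit, drunk

print(simulate(depression=0.9))   # rule (4) in force: deficit made up, no overshoot
print(simulate(depression=1.0))   # rule (4) removed: drinking runs far past zero
```

With the depression removed, drinking continues at full rate until the (delayed) absorption finally pushes excitation below threshold, and the deficit is heavily overshot; with it, intake slows as the link desensitizes and settles near the point of repletion.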

FIG. 1. The primary link is irritated by a certain change in physiological state. This primary link passes the excitation caused by the irritation to secondary links and so causes specific types of goal-seeking activity. As the analyser attached to the primary link is fired by events occurring at the periphery (such as various tastes) the irritability of the primary link by the change in physiological state is temporarily reduced.

The connection between links which conveys excitation originating in a primary link will, for the sake of convenience, be called a motivational pathway. The connection between links down which passes a message which causes the motivational pathway to become functional will be called the reinforcement pathway. When the analyser attached to a link has ceased to be stimulated, the link will for a brief period remain in an altered state. If during this altered state of one link a message arrives down the reinforcement pathway, signalling that the analyser connected to another link is being stimulated, then the motivational pathway between the two links becomes functional (Fig. 2). For the purpose of the phenomena to be discussed in this paper, a full discussion of this assumption is unnecessary. It will be sufficient to remark that the postulated feedback loops are such as to effectuate the approximation of the animal or a part of its body to a specific target, be it visual or somesthetic.

FIG. 2. Links are connected together by a motivational pathway and a reinforcement pathway. The motivational pathway between two links becomes functional when a signal passes down the reinforcement pathway.

(6) In order that a secondary link should become excited, excitation from a primary link must reach it. Such excitation originating in a primary link is passed from any link to any other provided that a functional connection exists between them.

(7) Such a connection becomes functional when the analysers belonging to two links are stimulated in close temporal succession. In order for this to happen, a message must pass down another pathway already connecting the two links, causing the first pathway to become functional.

(8) While the stimulation of two analysers in sequence will cause the motivational pathway between their two links to convey an increased amount of excitation, stimulation of the first analyser without the second will cause a decrease in the amount of excitation travelling from one to the other.

(9) If there are two analysers being stimulated at the same time, and both their links are receiving motivational excitation, then the link receiving the larger amount of motivational excitation will determine which cue is approached.

(10) A link will not pass excitation to any other link while its analyser is being stimulated, but will divert excitation which it receives into the motor system so as to cause the animal to approach the cue by which its analyser is being stimulated.

(11) There are also some primary links excited by certain classes of analysers such as those signalling pain. It is assumed that other analysers fired just prior to the excitation of these links become connected to them and so excite them when these other analysers are again fired. Links whose analysers fire after the cessation of firing of such analysers become connected to these primary links in the normal manner.

As a result of the operation of the above system, sequences of links are formed which will generally be connected also to primary links.
Then if, for instance, the animal becomes thirsty the excitation generated by the primary link sensitive to extracellular osmotic pressure will flow down the sequence of links which has in the past become connected to the primary link for thirst. It will pass down these sequences of links until it reaches one whose analyser is being stimulated. The animal will approach the cue to which the analyser is sensitive. As it approaches, the next stimulus will be detected by the analyser belonging to the link next in the sequence. The link attached to this analyser will now divert the excitation reaching it into its own motor system. Thus the animal will approach this next stimulus. Such a process will continue, as the animal approaches those stimuli which are nearer water in the sequence which has in the past led to water, until water is reached.
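The traversal just described can be sketched in a few lines; the list encoding of a learned sequence and the cue names are illustrative assumptions, not the paper's notation.

```python
# Sketch of rules (6) and (10): motivational excitation enters the
# learned sequence at the goal end and passes from link to link until
# it meets a link whose analyser is currently firing; that link diverts
# the excitation into the motor system, so its cue is approached.

def approached_cue(sequence, active_cues):
    """sequence lists cues from the goal backward toward the start."""
    for cue in sequence:          # nearest-to-goal link is tried first
        if cue in active_cues:
            return cue            # rule 10: excitation diverted to motor here
    return None                   # no analyser firing: no approach behavior

to_water = ["water", "cue4", "cue2"]   # links formed on the way to water

# Early in the run only cue 2 is visible, so cue 2 is approached...
print(approached_cue(to_water, {"cue2"}))           # -> cue2
# ...but once cue 4 fires its analyser, link 4 stops passing excitation
# on to link 2, and the animal switches to approaching cue 4.
print(approached_cue(to_water, {"cue2", "cue4"}))   # -> cue4
```

The switch from the farther cue to the nearer one falls out of rule (10) alone; this is the mechanism at work in the behavioral examples of the next section.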

Behavioral Applications

It seems best to illustrate the theory's application to behavior by means of a few concrete examples, beginning with an experiment by Kendler (1946). This worker used a T-maze with water in one goal box and food in the other (Fig. 3). At the starting point of this maze he placed rats which were simultaneously hungry and thirsty. They were rewarded with food if they turned one way and with water if they turned the other. In this situation the theory assumes that two sequences of links are formed, following a single sequence up to the link whose analyser is stimulated by cues at the choice point (Fig. 4).

FIG. 3. Diagram of T-maze indicating position of rewards and cues.

FIG. 4. Diagram of the system set up according to the theory when the animal learns the maze in Fig. 3. The arrows indicate flow of motivational excitation.

At the end of one sequence is the link whose analyser is stimulated by water, and at the other the link whose analyser is stimulated by food. If the rat is made thirsty, excitation from the link sensitive to extracellular osmotic pressure (the "water" link) is transmitted down the sequence of links. We shall assume that the rat, when it is placed in the maze, sees cue 2 immediately. As excitation from the "water" link will be reaching link 2, whose analyser is sensitive to cue 2, the rat will approach cue 2. As the rat approaches cue 2, cues 3 and 4 will stimulate the analysers attached to links 3 and 4. As excitation is reaching link 2 through link 4, as soon as the analyser attached to link 4 is stimulated by cue 4, cue 2 will no longer be approached. This is in accordance with rule 10, for link 4 will no longer pass excitation to any other link, and so no more excitation will reach link 2. Instead, as link 4 is the only one of those which remain excited whose analyser is being stimulated, the animal will approach cue 4. The animal will then reach water by a repetition of such a process. Analogously, when it is hungry it will reach food instead. This deduction agrees with Kendler's experimental result. This evidence, however, presents great difficulties to other theories (for instance, Hull (Deutsch, 1960b)), strangely enough, as such a result would be expected on the grounds of common sense.

However, let us consider an experiment with a more paradoxical outcome. Hull (1933) used a maze in which there was a single choice point, as in a T-maze. The two arms of the maze converged again after the choice point to end in a single box (Fig. 5). The rats in his experiment were made hungry on one day and thirsty on the next. In order to obtain food in the goal box they always had to turn one way for food at the choice point and in the opposite way for water.

FIG. 5. Diagram of maze in which the rat obtains food by going one way to the food box and water by going the other.

The arrangement of links set up in this


situation according to the theory is illustrated in Fig. 6. The arrangement is the same as in Kendler’s experiment quoted above except that the link sensitive to cues in the goal box is common to both sequences past the choice point. Therefore, when the rat is made, say, thirsty, excitation from the “water” link will pass to the “goal-box” link (rule 6) and will then pass to both sequences of links (rule 6) which represent the two stems of the maze past the choice point. Therefore, excitation will reach both sets of links at either side of the choice point. To determine a choice of one side or the other such excitation has to be unequal (rule 9). Hull did find in this experiment that the problem is almost insoluble to a rat. It has been shown by means of control experiments (Deutsch, 1959) that the difficulty

FIG. 6. Diagram of system set up when animal learns the problem illustrated in Fig. 5. It can be seen that motivational excitation spreads in such a way as to make the problem insoluble.

is due to the common goal box where the animal is given either the food or the water. It has also been shown (Leeper, 1935; Bolles & Petrinovich, 1954; Deutsch, 1959) that if two separate goal boxes are used, rats learn to solve the problem with ease. That such should be the case on the theory can be seen by inspecting Fig. 4. Here excitation generated by, say, the "water" link does not excite the sequences after the choice point equally. If the "water" link is excited (when the animal is thirsty), this excitation passes exclusively down the sequence of links which "represent" the succession of cues which led to water. Thus when a link common to both sequences (past the "choice point" link) is removed, the problem becomes a soluble one for the system.

So far the illustrations have shown how the theory can be used to make predictions about behavior. Many of the theory's postulates can be tested to some extent independently of each other. For instance, it is possible to test rule 10, which states that any link whose analyser is being stimulated will not pass excitation to any other link. If we suppose that the link gradually recovers its conductivity after the stimulation has ceased, other consequences emerge besides that of efficient progression toward a goal. Consider the situation where a rat is trained hungry in a T-maze and rewarded equally whichever goal box it chooses. The theory would predict that if the stimuli in one arm or goal box have impinged on the rat's analysers, the rat will then run up the other arm in the succeeding trial. The links attached to the analysers that have been stimulated will not transmit as much excitation as those "representing" the other side. As the amount of excitation in both arms would otherwise be equal, the links representing the other side will have a larger amount of excitation reaching them. Therefore, according to rule 9, the rat should choose this other side. It has frequently been demonstrated that in the situation just described, a rat exhibits a regular alternation, this being particularly marked between the first and second trials of the day. It has also been shown by Montgomery (1952) that the animal alternates stimulus alternatives rather than turning movements. Further, the theory predicted (1953) and experiment showed that if the animal was simply placed in one of the goal boxes and allowed to feed (after it had become familiar with the situation) the animal would tend to run to the other goal on the next trial (Sutherland, 1957; Glanzer, 1958). These last experiments are important because they test the notion that excitation reaches a unit underlying performance through other units nearer the final goal. If these other units are rendered less conductive then the unit receiving its excitation through them is less able to determine behavior.
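The alternation deduction (rules 9 and 10, plus gradual recovery of conductivity) can be sketched in a few lines. The conductivity values, the complete depression after stimulation, and the recovery rate are illustrative assumptions.

```python
# Sketch of spontaneous alternation: the links last stimulated conduct
# less (the aftermath of rule 10), recover gradually, and rule 9 awards
# the choice to whichever arm's links pass more excitation.

def run_trials(n_trials=6, recovery=0.5):
    conduct = {"left": 1.0, "right": 1.0}     # conductivity of each arm's links
    choices = []
    for _ in range(n_trials):
        side = max(conduct, key=conduct.get)  # rule 9: larger excitation wins
        choices.append(side)
        conduct[side] = 0.0                   # just-stimulated links conduct less
        for s in conduct:                     # partial recovery before next trial
            conduct[s] = min(1.0, conduct[s] + recovery)
    return choices

print(run_trials())   # -> ['left', 'right', 'left', 'right', 'left', 'right']
```

The same depressed-conductivity bookkeeping reproduces the goal-box placement result: depressing the links of the side in which the animal was fed makes the other side win the comparison on the next trial.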
Another way of testing this notion, important when we come to consider intracranial brain stimulation, is by using rule 8. This states that when two analysers fail to be stimulated after each other as they had been previously, then the connection between the links to which these analysers are attached will become less permeable to excitation originating in primary links. It has been shown also, as would be predicted, that the rat, when directly placed in the goal box where it does not find food, is also less likely, when placed in the starting alley, to run to the goal box (Seward & Levy, 1949; Deese, 1951). In this last case, the goal box cues are not followed by the food cues as was the case during learning. As a result, less excitation will flow from the food link to the goal box link. As the excitation to the rest of the links representing the maze passes through the goal box link, a smaller amount of excitation will reach these, with the consequence that the animal is less likely to run the maze but more likely to scratch itself, or indulge in some other previously less pressing pursuit.


Deutsch & Clarkson (1959a,b) tested the assumption about the transmission of excitation in a more searching manner. They trained rats in a maze with one entrance and two alleys, which ended in a single goal box (Fig. 5). The animals could secure a reward of food in the goal box, whichever of the two routes they took. Having learned to run to the goal box in this situation, the animals were tested under two conditions. In the first condition, an animal met a block before the goal box in the arm of the maze which it had chosen. In the second condition the animal ran to the goal box, which it found empty.
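On the theory, the two conditions amount to depressing different pathways of one small network (rule 8), and the asymmetry of the prediction can be seen in a few lines. The weights are illustrative; excitation from the food link must pass through the goal-box link on its way to either alley's links (rule 6).

```python
# Sketch of the Deutsch & Clarkson test: excitation flows
# food link -> goal-box link -> left/right alley links, and rule 8
# depresses whichever pathway ceased to be followed by its usual cues.
# The weight values are illustrative.

def alley_excitation(w_food_goal, w_goal_left, w_goal_right, drive=1.0):
    """Excitation reaching the links of each alley (rule 6)."""
    at_goal = drive * w_food_goal
    return at_goal * w_goal_left, at_goal * w_goal_right

# Trained animal: every pathway fully functional, so no preference.
print(alley_excitation(1.0, 1.0, 1.0))   # -> (1.0, 1.0)

# Condition 1: a block, so goal-box cues stop following the blocked
# alley's cues; only that goal-to-alley pathway is depressed.
print(alley_excitation(1.0, 0.5, 1.0))   # -> (0.5, 1.0): choose the other alley

# Condition 2: empty goal box, so food cues stop following goal-box
# cues; the depression sits upstream of both alleys.
print(alley_excitation(0.5, 1.0, 1.0))   # -> (0.5, 0.5): preference unchanged
```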
FIG. 7. Diagram of system when the rat has learned to obtain food by going either way in a maze illustrated in Fig. 5, on the trial after it had found a block in one of the arms.

The overwhelming majority of the rats chose the other alley on the next trial. On the trial after that on which they found no food in the goal box, the number of animals taking the opposite path was very close to that taking the opposite path after a rewarded trial. This prediction was made in the following way. In the first condition, when the rat finds a block in one of the alleys, goal box cues no longer succeed the cues of this alley. Hence, less excitation will be transmitted from the link to which analysers sensitive to the goal box are attached to the link "representing" a part of this alley (Fig. 7). However, the transmission of excitation from the "goal box" links to the "alley" links on the other side will remain unaffected. We should therefore deduce that the rat will approach the cues on this other side. However, when the rat in the


second condition finds the goal box empty, both sets of links representing both sides will be equally affected. Less excitation will reach the goal box links and such a diminution of excitation will therefore flow to both sides equally (see Fig. 8). Hence, the tendency to choose one alley or the other should be unaffected.

FIG. 8. Diagram of the system where the rat has learned to obtain food by going either way in a maze illustrated in Fig. 5, on the trial after it had found no food when it reached the food box.

Learning and Intracranial Self-Stimulation

In 1954 Olds and Milner reported that electrical stimulation of certain subcortical regions can evoke a repetition of any action which just precedes the stimulus. For instance, if by pressing a lever the rat could close a circuit delivering 3.0 microamperes of current for half a second in the medial forebrain bundle and other structures (Olds, Travis & Schwing, 1960), it would learn to press the lever at extremely high rates. The other two most striking characteristics of such habits, as compared with those executed for a reward of food, are their insatiability (the animals often press for hours until exhaustion supervenes) and their extremely fast cessation once the electric stimulus is disconnected. The discovery of this phenomenon was very important. One of the reasons for this is that it can be used to test the adequacy of existing theories, as Olds & Milner (1954) themselves rightly emphasized. So far such theories have found the phenomenon with its striking characteristics a complete enigma.


It is the main purpose of this paper to postulate an explanation of the various properties of electrical self-stimulation by supposing that the stimulating electrode is placed in a certain part of the system described above, and to show how such an assumption is borne out by the experimental evidence. In brief, it is assumed that the electrode stimulates both the motivational pathway and the reinforcement pathway between two links. When an animal presses a lever, certain analysers send messages to the links to which they are attached. The lever press also causes the implanted electrode to stimulate a reinforcement pathway leading to these links, whose analysers were stimulated by the animal pressing the lever. When this occurs then, by rule 7, the motivational pathway leading to the same links as the reinforcement pathway that has just been stimulated will become functional. Electrical stimulation of such a motivational pathway will then result in excitation, which will reach the links connected to analysers fired by the lever-press. Therefore, according to rule 5, when such excitation reaches these links the animal will attempt to lever press. Each time a lever press occurs, while the current source is turned on by the lever, it will produce the excitation which produces the next press, and so this habit will seem insatiable. Also, when the current source is disconnected, cessation of response will be swift, as a press will no longer induce the excitation which caused it. Thus, the apparently swift extinction of the habit that appears when this occurs would on this explanation be regarded rather as the swift decay of a drive process.

Such a decay of an electrically-induced drive has been observed in other contexts. It has been possible to elicit eating and drinking by the stimulation of hypothalamic structures (e.g. Andersson & McCann, 1955; Smith, 1956; Wyrwicka, Dobrzecka & Tarnecki, 1959). Habits learned for food or water can also be elicited by such stimulation.
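The press-to-press loop just described can be put in a toy simulation. This is a sketch under stated assumptions: the decay constant, threshold and time scale are illustrative, and "excitation" is a single number standing for activity in the stimulated motivational pathway.

```python
# Sketch of the self-stimulation loop: while the current source is
# connected, each press re-excites the motivational pathway, and that
# excitation is what drives the next press (rules 5 and 7). When the
# current is disconnected, responding outlasts it only as long as the
# residual excitation. All constants are illustrative.

def session(current_on_steps, total_steps, decay=0.7, threshold=0.1):
    excitation = 1.0           # a first ("priming") stimulus has been given
    presses_on = presses_off = 0
    for t in range(total_steps):
        if excitation > threshold:        # rule 5: excitation drives a press
            if t < current_on_steps:
                presses_on += 1
                excitation = 1.0          # the stimulus re-excites the pathway
            else:
                presses_off += 1          # the press now goes unstimulated
        excitation *= decay               # drive decays between moments
    return presses_on, presses_off

on, off = session(current_on_steps=100, total_steps=200)
print(on, off)   # -> 100 6: "insatiable" while connected, swift cessation after
```

On this account the handful of unstimulated presses is not extinction of a habit but the tail of a decaying drive; a single "free" stimulus, resetting the excitation to its primed value, would restart responding at once.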
However, decay of activities elicited by such stimulation is rapid once no further stimulation occurs. It is also interesting to note from the standpoint of the theory that severance of the pathways leading from such loci in the hypothalamus produces an animal which eats but no longer initiates chains of activity leading to eating (Morgane, 1961). This supports the notion that hunger-motivated behavior is produced by the flow of excitation from one set of structures (which are identified as the primary links) to others within the nervous system.

Extinction or Drive Decay

Animals cease very quickly to respond when a response is no longer followed by an intracranial stimulus. For instance, Seward, Uyeda & Olds (1959) reported that a group of animals with electrodes implanted in the hypothalamic area made only 18% of the number of lever presses they

were giving at the end of training during the first two days of extinction tests when the intracranial stimulus was no longer available. The group with septal electrodes made only 11% of the responses of its final level during training. Even these figures are high in the experience of the writer, who observed that if threshold levels of current are used the animal seldom gives more than thirty presses to "extinction". The dependence of performance under conditions of intracranial brain stimulation on closely spaced electrical stimuli has been noted under other situations. For instance, Sidman et al. (1955) found that animals will not learn to press the bar if the intervals between the electrical "rewards" are too long, as in cases where the animal must press a number of times in order to secure one "reward". The schedules of reinforcement used by Sidman et al. produce efficient responding when the animals are hungry or thirsty and working for a conventional reward. However, in the case of brain stimulation, it is usually found that gaps of more than a few seconds between "rewards" are inconsistent with performance of the habit. Brodie et al. (1960) have obtained performance with some animals when gaps between electrical rewards are of the order of minutes. However, the current strengths used were very high and it is noted in their report that there is a strong correlation between length of gap tolerated and the number of responses to extinction, which again was unusually high. Such findings would be expected if the "motivation" for the next lever press were derived from the previous press. Brady & Conrad (1960) also report that they successfully trained rats to obtain a brain stimulus on the average of one every sixty seconds. However, the stimulus here seems high (100 cps, 0.2 msec pulse duration, 25 peak milliamperes, train duration 0.5 sec).
Further, these animals were periodically given painful electric shocks, which has been shown in some placements to motivate responding for brain stimulation (Deutsch & Howarth, 1962). It is difficult to compare and evaluate reports of responding where there are long gaps between brain stimuli. First, we often do not know how high above threshold the brain stimulus was. Second, we do not know about the relation of the placement to drive pathways (such as those of fear or sex) in which there may be excitation independent of the stimulating current. Third, it is difficult to compare the strength of tendencies acting against responding between the various situations which have been used. The writer has trained rats to press a lever to obtain a brain stimulus only when a light was on. Using a lever which required less than 5 g of pressure to produce a threshold stimulus, animals would return about three minutes after the last set of stimuli to secure another set of "rewards". (The presence of endogenously occurring motivational excitation here was probable.)


They were then trained to press a 25 g lever which they would continue to press when responding was uninterrupted. However, when the light was switched off for periods of a minute, the rats quickly ceased to return to the lever. Further analysis of the behavior showed that the pressure exerted on the lever decreased with time since the last brain stimulation. Therefore, the probability that an animal would press hard enough to obtain another brain stimulus falls off with time, and the gap between brain stimuli consistent with maintenance of a habit will be a function of the pressure (or effort) required to execute it. We might therefore expect that the harder the task the less tolerant its executor would be of gaps between brain stimuli. Such data are again consistent with the notion that there is a decay in motivational excitation after the occurrence of a brain stimulus. Such considerations make it difficult to evaluate experiments such as that of Stein (1958) from the standpoint of the present theory. Stein paired a tone with intracranial brain shock 400 times. He found a slightly elevated rate of pressing a lever which produced such a tone, without accompanying brain shock, in a subsequent session. Such a finding would only be predicted on the present theory if some endogenously produced motivational excitation was present in the pathways stimulated by the electrodes. Whether such was indeed the case is impossible to judge from the available report. Anomalies of behavior which have been obtained when animals are trained to run mazes for intracranial stimulation in these regions can also be explained on the same grounds. Olds (1956) compared the learning of a maze for a food reward and for an electrical brain stimulus in rats which were eighteen to twenty-four hours hungry.
He found that the rats running for an electrical "reward" would improve their performance markedly during the block of fifteen closely spaced trials each day, in contrast with the animals running for a food reward. However, there was a sharp overnight decrement, the rats seeming to lose interest in running until stimulation was applied the next day. Seward, Uyeda & Olds (1960) compared animals in runway and grid crossing situations under conditions both of massed and spaced practice. There was again a tendency for sharp improvement within one day's session for the group receiving massed training, and an overnight decrement in performance. These two anomalies are absent in the group receiving their trials fifteen minutes apart. However, the rate of learning taking the first trials of the day only is comparable for both groups. (I am greatly indebted to Dr. Seward for supplying me with the data on this point.) The features noted, of a sharp improvement within a day's trials if these are closely spaced and of the overnight decrement in performance, agree with the theoretical interpretation. However, the day-to-day improvement


(although small) in the last experiment where the animals were neither water nor food deprived cannot be explained unless we assume that some other drives (such as fear) were present. That such is not altogether unlikely will be shown below. A further feature of learning with a brain shock as a reward, as contrasted with learning for a food or water reward, is that responding after extinction begins quite suddenly when the current is again switched on (Olds, 1958; Lilly, 1958). It has been observed that many animals that have lost all interest in the lever after the current has been switched off will, if placed on it again and thus given one "free" electrical stimulus, resume pressing as if they had never been away from the lever. Similarly, it is a general observation that one "free" stimulus administered to an animal that has ceased to respond will recall the animal to the lever, where it will immediately begin responding at its old rate (the "priming" phenomenon). Such observations accord well with the interpretation in terms of the above theory, that an excitation usually stemming from primary links is induced in the system by the electrical stimulus. This interpretation of extinction provided by the theory has recently been more directly tested (Howarth & Deutsch, 1962). When an animal learns a habit for a reward such as water or food, it will continue to perform such a habit a certain number of times after the food or water is withdrawn. That is, the number of responses to extinction tends to remain constant. The number of times a habit is repeated after reward has been withdrawn is constant regardless of the time after the training trials that the extinction session occurs (e.g. Skinner, 1950). However, on the present interpretation, cessation of responding in the situation where the "reward" is an electrical stimulus should be a function solely of the time since the electrical stimulus was switched off and not of the number of unrewarded presses executed.
Very strong evidence was obtained for this interpretation in the experiment by Howarth & Deutsch (1962) where the lever was withdrawn for varying amounts of time after the current was switched off. The number of responses to “extinction” was smaller the longer the lever was withdrawn, until there was no lever pressing to “extinction” at all if the lever were absent for twenty seconds after the current had been switched off. The diminution of the number of responses due to lever withdrawal for a given time was accurately predictable by counting the number of responses in that time during control sessions when the lever was not removed. It was also possible to show that the time to cessation of responding increases with the intensity of the previous electrical stimulation and with its previous duration (both up to a certain maximum). Such findings support the present interpretation of the fact that it is difficult to train animals in situations where electrical “rewards” are spaced in time and that higher

current intensities produce more successful performances in these situations. Further evidence supporting the present interpretation comes from two recent experiments. Deutsch & Rimm (unpublished data) trained two groups of rats to cross a grid following delays of 0 sec and 30 sec after a reward in a goal box had been administered. One group of rats was run thirsty to a water reward; the other was run for an intracranial stimulus as a reward. After the 30 sec delay, the animals running to intracranial stimulation almost trebled their running time as compared with the 0 sec delay, whereas the animals running to water actually speeded up after the delay by some 25%. Deutsch, Adams & Metzner (unpublished data) showed that the probability of choice of intracranial stimulation when put in conflict with water fell as a function of time since the last intracranial stimulus. This result was obtained with animals trained in a T-maze with thirsty animals when water was available on one side of the T and an intracranial stimulus on the other. We should also predict from the theory that increases in stimulating current should produce greater reward and motivation both by stimulating a larger number of pathways and by stimulating some more intensely. However, it has been found (Olds, 1958; Reynolds, 1958) that rates of lever pressing are not always linear with the strength of electrical stimulus before the seizure threshold is reached and that they often decline as stronger electrical stimuli are used. Olds himself attributes this to the involvement of aversion centers as current spreads further at higher intensities. That such an explanation is right, at least in some placements, is made very probable by other work (e.g. Bower & Miller, 1958; Roberts, 1958). That there are other placements where other factors besides aversion slow down responding at higher intensities has been shown by Hodos & Valenstein (1962).
Rats in their experiment would prefer to press a lever which supplied a higher current, though their rate of responding was lower on such a lever than on one which supplied a lower intensity stimulus. Motor arrest and motor twitches are suggested as an explanation of such slowing of response. Motor interference is indeed a very pronounced feature of stimulation at high intensities. Using placements in the ventral tegmentum and hypothalamus, Howarth and the writer have shown that stronger electrical stimuli always lead to a preference for that side of the T-maze where they are administered and also to a longer time to cessation of responding in a lever-pressing situation, which agrees with the prediction from the theory. However, such larger stimuli only sometimes lead to an increased rate of responding in the lever-pressing situation and often to a decrease. It is therefore probable that response rates are not good measures of the effect under consideration.

Interactions with Normal Drive

Another set of expectations about the electrical self-stimulation phenomenon, if the theory is right, concerns itself with its relation to normal motivations or drives, such as hunger and thirst. If the electrode is stimulating connections between links which ordinarily convey excitation from the primary links, then such excitation should sum with the excitation which is electrically produced to enhance performance in various ways. For instance, Sidman et al. (1955) found that animals, when hungry, would work for electrical stimuli more widely spaced in time. The summation of excitation here would presumably lead to a longer decay time, as such decay would start from an initially higher level. Margules & Olds (1962) found that electrical self-stimulation could only be obtained from loci in portions of the hypothalamus from which hunger effects could also be elicited. Hoebel & Teitelbaum (1962) report that manipulation of loci in the hypothalamus in ways which increase hunger enhances performance during electrical self-stimulation. Also, manipulations that reduce hunger decrease electrical self-stimulation. Brady et al. (1957) and Olds (1958) observed that the threshold current necessary to obtain electrical self-stimulation and the rates of self-stimulation were affected by drive states such as hunger and sex. A further test was carried out (Deutsch & Howarth, 1962) because although these results can be explained in terms of the theory, they could be accounted for in other ways. For instance, it could be said that owing to changes in biochemical conditions the stimulated tissues were more sensitive electrically, or it could be held that the effect of the stimulus was more rewarding. However, such arguments could not be used if it could be shown that a habit previously learned under conditions of electrical stimulation could be evoked by a specific drive in the absence of further intracranial stimulation. 
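The expected summation can be stated as a one-line corollary of the decay account (the numbers and the name `time_above_threshold` below are hypothetical): if drive-produced excitation sums with the electrically induced excitation, decay to a fixed response threshold starts from a higher level and therefore takes longer, which is the direction of the Sidman et al. (1955) finding:

```python
import math

# Hypothetical decay constant (s) and response cut-off.
TAU, THRESHOLD = 6.0, 1.0

def time_above_threshold(e_stim, e_drive=0.0):
    """Seconds until the summed excitation decays below threshold."""
    return TAU * math.log((e_stim + e_drive) / THRESHOLD)

# Excitation from a normal drive (e.g. hunger) summing with the electrically
# induced excitation lengthens the decay time:
assert time_above_threshold(10.0, e_drive=5.0) > time_above_threshold(10.0)
```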
If the pathways conveying excitation from a certain primary link are connected to a secondary link by electrical stimulation, then the excitation reaching the secondary link should cause the appropriate action. It should not matter whether such excitation is generated by the electrical stimulus or the primary link. Hence, a habit “motivated” only by the electrical stimulus should also be evoked in the absence of this stimulus, when the appropriate primary link is excited. This is shown to be the case most conveniently in the case of fear, where large and sudden increments are easily produced (Deutsch & Howarth, 1962). In this experiment two groups of rats were taught to press a lever for an intracranial electric stimulus. One group had received implantations of electrodes in the ventral tegmental region, close to the midline. The other group’s electrodes were more anterior, being in the hypothalamus and one was as far anterior

as the diagonal band of Broca. Apart from the difference in electrode placement, both groups received the same treatment. During training to press the lever, they were repeatedly removed to make sure that they had learned where the lever was in relation to the rest of the box. After such training was complete, the animals were held away from the lever by hand for about one minute. During this period, the animal would “lose interest” in the lever. Then we would attempt to frighten the animal with a buzzer, or if this failed, we would pair the buzzer with a mild electric shock. There was a large difference between the two groups of animals in response to the fear-producing stimulus. The animals with tegmental placements would resume lever pressing even though this now produced no further brain stimulus, whereas the other group with more anterior electrode placements tended not to react in this manner. It was also possible to show in another experiment that time to extinction was extremely prolonged in animals with tegmental electrodes when they were frightened, whereas those with hypothalamic electrodes were not affected. This confirms the notion that the electrodes producing electrical self-stimulation are stimulating pathways which convey excitation from specific primary links. When such excitation is evoked from the relevant primary link, the habit learned under the artificial insertion of excitation into the pathway reappears. It is assumed that the pathway from a primary link being stimulated in this experiment leads from a primary link which is excited by noxious stimuli. Such excitation passing down the pathway we stimulated reached secondary links. Therefore the cues that stimulate the analysers connected to these links will become attractive in the presence of noxious stimuli. This experiment therefore supports the interpretation of previous data made on the theory. 
Deutsch & Metzner (unpublished data) have obtained further evidence about the relation of drives to intracranial stimulation. Measuring running speed in a T-maze to intracranial stimulation after various delays, they found that running speed decreased as a function of time since the last brain stimulus, whether this was applied in the goal box or not. They further found that the rate at which running speed decreased with time since the last brain stimulus was significantly altered by thirst in some electrode placements.

Differentiation of Drive and Reward Effects

So far the evidence that has been presented suggests that drive effects are produced by intracranial stimulation by the stimulation of pathways conducting excitation from primary links. This has been argued from results showing the insatiability of responding under this stimulation, the fast decay of responding, and the interaction of normal drive states with

the intracranial stimulus. We have not yet discussed any direct evidence that two types of pathways are involved as the theory suggests. There are, however, plausible reasons to believe that this is the case in the phenomenon under consideration. As was mentioned above, it is possible to evoke drives such as hunger and thirst by direct hypothalamic stimulation (Andersson & McCann, 1955; Smith, 1956; Wyrwicka et al., 1959) and to produce the performance of habits relevant to the satiation of such drives. However, such drive evocation does not have rewarding properties. That is, it does not appear that animals can be trained to acquire new habits to obtain such stimulation. It seems that electrical evocation of drive does not necessarily produce reward, and that therefore two different systems are present, as the theory proposes. An experiment that suggests more directly that two systems are involved in the phenomenon under discussion has been performed (Deutsch et al., 1962). If there are two systems being simultaneously stimulated, it would be likely that they should have different thresholds of electrical excitability. Preliminary experiments with animals implanted in the ventral tegmental region suggested that the drive pathway might have a lower electrical threshold than the reinforcement system. It was argued that excitation of the drive pathway would maintain the performance of a learned response when such stimulation is non-contingent upon the performance of such a response. If, for instance, it were possible to stimulate the drive pathway while keeping the stimulation of the reinforcement system at a minimum, then a learned response should be maintained even when the stimulus occurred randomly in relation to the animal’s actions. However, if the reinforcement system were also excited with each stimulus to a significant extent, random stimulation would quickly reward new actions and so lead to a diminution of responding instead of its maintenance. 
Therefore, we compared response rates of a previously learned lever-pressing habit under four conditions: (1) When the electrical stimulus at threshold was administered contingent on the performance of the lever press; (2) When the electrical stimulus at threshold was given randomly and irrespective of a lever press; (3) When the electrical stimulus, at approximately half the threshold value, was given contingent on the performance of the lever press; (4) When the electrical stimulus, at approximately half the threshold value, was given randomly irrespective of a lever press. Threshold was defined as that strength of electrical stimulus sufficient to produce learning and maintain performance in the animal. It was found that at the sub-threshold intensity of electrical stimulus, the non-contingent stimulus, administered throughout the length of one-minute test periods

(condition 4) was more effective in sustaining the performance of the learned habit than the same stimulus when it was contingent on the performance of the learned act (condition 3). Such an effect has been found in nine out of the ten animals with tegmental placements so far tested, and occurs to a marked and statistically significant extent in most of them. On the other hand, a comparison of the effects of contingent and non-contingent threshold stimulation (conditions 1 and 2) shows that the non-contingent stimulus, administered at the same rate in the threshold and sub-threshold condition (conditions 2 and 4) roughly at the rate of the animal’s own response, significantly and markedly decreases the number of already learned responses in a test session. Observation of the animals under test confirms the belief that new responses are acquired and interfere with the execution of the previously learned habit when the threshold stimulus is not contingent on the performance of the previously learned habit. No such interference is seen during the sub-threshold condition. It may be asked why, during the sub-threshold non-contingent condition, more responses were emitted than during the sub-threshold contingent condition. The reason is apparent on observing the animal. The rate at which it administers the sub-threshold stimulus to itself (contingent, condition 3) quickly becomes low (presumably because of the low drive produced) and soon ceases altogether. However, in the non-contingent condition (4) the animal receives the stimulus even when it is not responding and so returns to the lever even after pauses in responding. The effect of non-contingent supra-threshold stimulation upon the maintenance of a learned habit has been independently studied by Herberg (personal communication), who reports that such maintenance is generally of short duration.
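The logic of these four conditions can be made concrete in a small deterministic sketch. The model below is hypothetical throughout (the thresholds, decay and learning constants, and the name `session` are illustrative choices, not the authors' model); it assumes only that the drive pathway has the lower electrical threshold, that drive excitation decays between stimuli, and that a supra-threshold stimulus strengthens whatever behaviour is in progress when it arrives:

```python
# Assumed current thresholds of the two postulated pathways: the drive
# pathway fires at a lower intensity than the reinforcement pathway.
DRIVE_THR, REWARD_THR = 0.4, 0.9
DECAY = 0.8    # per-step decay of drive excitation
LEARN = 0.05   # strengthening per supra-threshold stimulus

def session(intensity, contingent, steps=300):
    """Expected number of lever presses in one test session."""
    drive, lever, other = 1.0, 1.0, 0.2   # excitation; strengths of the habit and of competing acts
    presses = 0.0
    for _ in range(steps):
        p = drive * lever / (lever + other)      # chance of a press this step
        presses += p
        stim_press = p if contingent else 0.5 * p            # stimulus arrives while pressing
        stim_other = 0.0 if contingent else 0.5 * (1.0 - p)  # stimulus arrives during other acts
        s = stim_press + stim_other
        if intensity >= DRIVE_THR:    # even the weak stimulus re-excites the drive pathway
            drive = s * max(drive, intensity) + (1.0 - s) * drive
        if intensity >= REWARD_THR:   # only the strong stimulus reaches the reward pathway
            lever += LEARN * stim_press
            other += LEARN * stim_other   # random rewards build competing responses
        drive *= DECAY                # excitation decays between stimuli
    return presses

# Conditions (1)-(4): threshold contingent/non-contingent,
# then sub-threshold contingent/non-contingent.
c1, c2 = session(1.0, True), session(1.0, False)
c3, c4 = session(0.5, True), session(0.5, False)
assert c1 > c2   # random threshold stimuli reward competing acts
assert c4 > c3   # random weak stimuli keep recalling the rat to the lever
```

Run over the four conditions, the sketch reproduces both reported orderings: contingent threshold stimulation sustains more presses than non-contingent (conditions 1 vs 2), while at sub-threshold intensity the non-contingent schedule sustains more (conditions 4 vs 3).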
Another way of distinguishing the characteristics of the drive system from the reward system may be obtained in the following way. Deutsch (unpublished data) has found that at relatively high frequencies of burst repetition, rats prefer the higher frequency, even though the number and size of bursts is kept the same. Rats are trained in a T-maze and rewarded with ten 1/5 sec bursts of 60 cps intracranial stimulus whichever arm of the maze they choose. On one side the bursts occur with 1/5 sec gaps between them. Though the side where the bursts occur closer together is chosen when the above values are used, any preference disappears when the choice is between the same pulses spaced 1/2 sec apart and 1 sec apart. It seems reasonable to suppose that a greater reward results if successive pulses are injected so that each follows before some process from the previous pulse has died down. Interaction of successive excitations will thus occur and, if this is the case, we have a measure of the time in which such a process in the reinforcement system decays to baseline. From the

data quoted above it would seem that such decay occurs within one half second, as beyond this value there is no preference for the more frequent bursts. Data bearing on a similar decay of drive, as measured by running speed, indicate that such decay is much slower. Such determinations make it possible to look for the two systems on a neurophysiological level, as it will be possible to state, using these methods, what parameters of facilitation (and perhaps also adaptation) to expect in the two postulated pathways. Another experiment performed by the writer which shows that there are two different physiological systems underlying drive and reward demonstrates that they are differentially stimulable by frequencies of imposed stimulation. A rat with an implanted electrode is run in a T-maze with a different frequency of stimulus available in each of the goal boxes. There is a “reward” of 2,000 cps in one box and of 60 cps in the other, both lasting for one second. The amplitudes of the two frequencies are then adjusted until the rat chooses the higher frequency somewhat more often. After this the running speeds of the animal are measured during blocks of trials with one of the two frequencies at the previously determined amplitudes available in both goal boxes. It is found that, controlling for motor effects, the less preferred (i.e. the lower) frequency produces the higher running speed during these block trials. It is therefore concluded that the drive pathway is more sensitive to lower frequencies and the reinforcement pathway more sensitive to higher frequencies of stimulation. This explains the predicted paradox that the animal should run more quickly to a stimulus which it prefers less.

Conclusions

An example was given of a structural hypothesis about the nervous system. Some tests of this hypothesis in its application to learning were quoted, and it was shown how it could be used to explain experimental evidence concerning behavior. The theory was then applied to a detailed examination of learning consequent upon electrical self-stimulation of the brain. It was shown how the unusual features of this learning could be understood. Within the space of the present paper, it has not been possible to trace other ramifications of the theory, as much of the detail has been omitted. Other tests of the theory have been devoted to the area of normal reward and satiation. Here it has been possible to identify some of the postulated elements of the theory with neurophysiological entities and so to predict paradoxical relations between reward and satiation values of ingested solutions (Deutsch & Jones, 1959, 1960). The use of hypotheses about the kind of system which produces the observed behavior has also been applied to the explanation of experimental evidence in other areas. The method

has been applied to shape recognition in the rat (Deutsch, 1955, 1960b), in the octopus (Deutsch, 1960a), in the bee and other nervous systems (Deutsch, in press), and to figural after-effects (Deutsch, 1956, 1960b), and to problems of arousal and selective attention (Deutsch & Deutsch, in press).

My thanks are due to Dr. L. J. Herberg, Dr. C. I. Howarth, and my wife for many helpful discussions and valuable suggestions. I am also indebted to Dr. D. A. Hamburg for his help and encouragement. This work was wholly supported by N.I.M.H. Grant U.S.P.H.S. M4563 and N.S.F. Grant G21376.

REFERENCES

ANDERSSON, B. & MCCANN, S. M. (1955). Acta Physiol. Scand. 33, 333-346.
BOLLES, R. & PETRINOVICH, L. (1954). J. comp. physiol. Psychol. 47, 378-380.
BOWER, G. H. & MILLER, N. E. (1958). J. comp. physiol. Psychol. 51, 669-674.
BRADY, J. V. (1958). In “Biological and Biochemical Bases of Behavior” (Harlow, H. F. & Woolsey, C. N., eds.). Univ. of Wisconsin Press.
BRADY, J. V. & CONRAD, D. S. (1960). J. comp. physiol. Psychol. 53, 128-137.
BRADY, J. V., BOREN, J. J., CONRAD, D. & SIDMAN, M. (1957). J. comp. physiol. Psychol. 50, 134-137.
BRODIE, D. A., MALIS, J. C., MORENO, O. M. & BOREN, J. J. (1960). Science 131, 929-930.
DEESE, J. (1951). J. comp. physiol. Psychol. 44, 362-366.
DEUTSCH, J. A. (1953). Brit. J. Psychol. 44, 304-317.
DEUTSCH, J. A. (1954). Quart. J. exp. Psychol. 6, 6-11.
DEUTSCH, J. A. (1955). Brit. J. Psychol. 46, 30-37.
DEUTSCH, J. A. (1956). Brit. J. Psychol. 47, 208-215.
DEUTSCH, J. A. (1959). Quart. J. exp. Psychol. 11, 155-163.
DEUTSCH, J. A. (1960a). Nature 185, 442-446.
DEUTSCH, J. A. (1960b). The Structural Basis of Behavior. Chicago: Chicago University Press.
DEUTSCH, J. A. (1963). Psychol. Rev. In press.
DEUTSCH, J. A. & ANTHONY, W. (1958). Quart. J. exp. Psychol. 10, 22-28.
DEUTSCH, J. A. & CLARKSON, J. R. (1959a). Quart. J. exp. Psychol. 11, 150-154.
DEUTSCH, J. A. & CLARKSON, J. R. (1959b). Quart. J. exp. Psychol. 11, 155-163.
DEUTSCH, J. A. & DEUTSCH, D. (1963). Psychol. Rev. In press.
DEUTSCH, J. A. & HOWARTH, C. I. (1962). Science 136, 1057-1058.
DEUTSCH, J. A. & JONES, A. D. (1959). Nature 183, 1472.
DEUTSCH, J. A. & JONES, A. D. (1960). J. comp. physiol. Psychol. 53, 122-127.
DEUTSCH, J. A., HOWARTH, C. I., BALL, G. G. & DEUTSCH, D. (1962). Nature 196, 699-700.
GLANZER, M. (1958). J. comp. physiol. Psychol. 51, 332-335.
HERBERG, L. J. (1962). Nature 195, 628.
HODOS, W. & VALENSTEIN, E. S. (1962). J. comp. physiol. Psychol. 55, 80-84.
HOEBEL, B. G. & TEITELBAUM, P. (1962). Science 135, 375-377.
HOWARTH, C. I. & DEUTSCH, J. A. (1962). Science 137, 35-36.
HULL, C. L. (1933). J. comp. Psychol. 16, 255-273.
KENDLER, H. H. (1946). J. exp. Psychol. 36, 212-220.
LEEPER, R. (1935). J. genet. Psychol. 46, 3-40.
LILLY, J. C. (1958). In “The Reticular Formation of the Brain” (H. H. Jasper et al., eds.). Henry Ford Hospital International Symposium. Boston: Little, Brown & Co. pp. 705-721.
MARGULES, D. L. & OLDS, J. (1962). Science 135, 374-375.
MASSERMAN, J. H. (1941). Psychosom. Med. 3, 1-25.
MONTGOMERY, K. C. (1952). J. comp. physiol. Psychol. 45, 287-293.
MORGANE, P. J. (1961). Science 133, 887-888.
OLDS, J. (1956). J. comp. physiol. Psychol. 49, 507-512.
OLDS, J. (1958). Science 127, 315-324.
OLDS, J. & MILNER, P. (1954). J. comp. physiol. Psychol. 47, 419-427.
OLDS, J., TRAVIS, R. P. & SCHWING, R. C. (1960). J. comp. physiol. Psychol. 53, 23-32.
REYNOLDS, R. W. (1958). J. comp. physiol. Psychol. 51, 193-198.
ROBERTS, W. W. (1958). J. comp. physiol. Psychol. 51, 400-407.
SEWARD, J. P. & LEVY, N. (1949). J. exp. Psychol. 39, 660-668.
SEWARD, J. P., UYEDA, A. & OLDS, J. (1959). J. comp. physiol. Psychol. 52, 294-299.
SEWARD, J. P., UYEDA, A. & OLDS, J. (1960). J. comp. physiol. Psychol. 53, 224-228.
SIDMAN, M., BRADY, J. V., BOREN, J. J., CONRAD, D. G. & SCHULMAN, A. (1955). Science 122, 830-831.
SKINNER, B. F. (1950). Psychol. Rev. 57, 193-216.
SMITH, D. A. (1956). Anat. Rec. 124, 263-264.
STEIN, L. (1958). Science 127, 466-467.
SUTHERLAND, N. S. (1957). J. comp. physiol. Psychol. 50, 358-362.
WYRWICKA, W., DOBRZECKA, C. & TARNECKI, R. (1959). Science 130, 336-337.