Brain Research, 195 (1980) 337-344 © Elsevier/North-Holland Biomedical Press
OPERANT CONTROL OF PRECENTRAL NEURONS: BILATERAL SINGLE UNIT CONDITIONING
ALLEN R. WYLER, STEPHAN C. LANGE and CAROL A. ROBBINS Department of Neurological Surgery, University of Washington, Seattle, Wash. 98195 (U.S.A.)
(Accepted February 14th, 1980) Key words: operant conditioning -- cortex -- pyramidal tract neurons -- monkey
SUMMARY
Two Macaca mulatta monkeys were reinforced to operantly control a precentral neuron's firing pattern while a contralateral unit was monitored simultaneously. The results from 38 complete experiments indicate the following: (a) upon alerting to the operant task, both the contingent and the non-contingent neurons changed firing patterns from preconditioning levels. However, as the monkey brought the contingent unit under operant control, there were no significant changes in the firing pattern of the non-contingent neuron; (b) when the contingencies were reversed so that the monkeys were reinforced to control the originally non-contingent neuron, the firing pattern of the originally contingent neuron returned to near baseline levels. These data indicate that although many precentral units may change firing patterns when the monkey attends to the operant task, the reinforced changes in firing pattern are not the result of a generalized phenomenon at the spinal level.
INTRODUCTION
Previous studies have demonstrated that monkeys may be operantly conditioned to control the firing rate and pattern of precentral neurons 2-5. Recent work from this laboratory has been directed at determining how this operant control is mediated 6-10. Various data indicate that monkeys achieve control over precentral neurons by making subtle peripheral movements which activate mechanoreceptors that provide afferent feedback to the cortical neurons 6-10. However, we have also noted that as individual neurons are brought under operant control, they often demonstrate relatively fixed modal interspike intervals (ISIs) near 30 msec and that these modal ISIs are not significantly different from ISIs observed from neurons involved in repetitious peripheral movements 11. This raises the question of whether the change in
firing patterns observed during conditioning experiments is due merely to the monkey alerting and increasing muscle tone or whether the monkey controls the neuron by finding subtle specific peripheral movements that coactivate with the firing of the central unit. To evaluate these two possibilities, the present experiments were conducted in which units were recorded bilaterally from homologous regions of precentral cortex.
METHODS
Data are from two male Macaca mulatta monkeys prepared for chronic experiments using previously described methods 8, with the exception that recording mounts and chronic pyramidal tract (PT) stimulating electrodes were implanted bilaterally.
Operant task
The operant task is termed Differential Reinforcement of Tonic Patterns (DRTP) and requires the monkey to change the neuron's firing pattern from phasic to tonic 8. Functionally, the monkey is reinforced to produce consecutive ISIs within a requisite range called the target. Since the operant task is similar to a tracking task, the monkey's performance is quantified in the following manner: every 15 sec an on-line PDP8/e computer sums, in milliseconds, the time the neuron fired within and outside the ISI target. These two values are termed hits and error, respectively. At the end of each behavioral period, the mean and standard deviation of hits and error across the twenty 15 sec epochs are calculated. A more detailed description of the operant task can be found in ref. 8.
Experimental protocol
The monkeys underwent daily experiments in which initially only 1 unit was conditioned. The side from which the unit was recorded was alternated each day such that the monkey received equal amounts of training in controlling units from both hemispheres. After the monkey demonstrated convincingly that he could control 3 units from each hemisphere, bilateral experiments were begun. After a neuron had been isolated unequivocally from each hemisphere, it was tested for an antidromic response to PT stimulation. After a 5 min preconditioning period, 5 min DRTP periods were alternated with 5 min time out (TO) periods until 10 sequentially numbered DRTP periods had been accrued. After the fifth DRTP period, the contingencies were changed such that the neuron which was initially non-contingent became contingent and vice versa. Therefore, in each individual experiment, the monkey was reinforced to control each neuron for 5 consecutive DRTP periods. Each day, the side from which the originally contingent unit was isolated was alternated.
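The hit and error scoring described above can be illustrated with a minimal Python sketch. It is not the original PDP8/e program. The function names, the use of spike times in milliseconds, and the rule that each interval is credited in full to the epoch in which it ends are assumptions made for illustration; the 30-60 msec target and the twenty 15 sec epochs are taken from the Methods.

```python
from statistics import mean, stdev

TARGET_MS = (30.0, 60.0)   # ISI target range stated in the Methods
EPOCH_MS = 15_000.0        # one scoring epoch = 15 sec


def score_epochs(spike_times_ms, n_epochs=20, target=TARGET_MS, epoch_ms=EPOCH_MS):
    """Return (hits, errors): per-epoch msec spent inside / outside the ISI target.

    Assumption (not from the paper): each interspike interval is credited,
    with its full duration, to the epoch in which it ends.
    """
    hits = [0.0] * n_epochs
    errors = [0.0] * n_epochs
    lo, hi = target
    for prev, curr in zip(spike_times_ms, spike_times_ms[1:]):
        isi = curr - prev
        epoch = int(curr // epoch_ms)
        if epoch >= n_epochs:
            break  # spikes beyond the scored period are ignored
        if lo <= isi <= hi:
            hits[epoch] += isi
        else:
            errors[epoch] += isi
    return hits, errors


def summarize(values):
    """Mean and standard deviation across the epochs of one behavioral period."""
    return mean(values), stdev(values)
```

With this scoring, a perfectly regular 40 msec spike train accumulates only hits, whereas a single 100 msec pause adds 100 msec to the error of the epoch in which it ends.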
For a unit to be considered operantly controlled by the monkey, the first DRTP period (DRTP1) was used as baseline and subsequent DRTP periods were required to show a statistically significant (Student's t-test, P < 0.05) increase in time-on-target with a decrease in error (time-off-target). For units which were initially non-contingent, DRTP6 (the monkey's first period in which reinforcement was contingent for that unit) served as baseline against which subsequent periods were compared. DRTPb is the period subsequent to baseline in which the monkey's performance was best. In all experiments, the initial ISI target range was 30-60 msec. Data were analyzed on-line by a PDP8/e computer and all data were recorded on an Ampex 7-channel FM tape recorder for additional off-line analysis. Audio and visual feedback to the monkey was always from the contingent unit.
RESULTS
Data were accepted for this analysis only if the following criteria were met: (a) each unit had been unequivocally isolated and uninjured throughout the entire experiment and (b) both units were recorded through the preconditioning and 10 consecutive DRTP and time out periods. Thirty-eight experiments conducted on two monkeys met these criteria. Of the 76 units, 51 were PTNs with antidromic latencies ranging from 0.8 to 3.6 msec. Of the 38 experiments, 7 involved bilateral PTNs, 11 bilateral non-PTNs, and 15 one PTN and one non-PTN. In these experiments, 28 of the initially contingent units were operantly controlled. Upon reversing the contingencies, 26 of the initially non-contingent neurons were controlled. However, there was a trend that the majority of noncontrolled units were recorded during initial experiments, and as the monkeys gained further experience, they became more capable of controlling units within 5 operant periods. Fig. 1 is an example of 1 experiment involving bilateral non-PTNs. Row A illustrates the unit recorded from the right precentral cortex whereas row B is the unit recorded from the left precentral cortex. Note that during the DRTP1 periods both
Fig. 1. Interspike interval histograms from selected behavioral periods of one experiment involving two non-PTNs. Each histogram represents all ISIs within each 5 min behavioral period. PRE, preconditioning; C, contingent; NC, non-contingent. E is the mean error in milliseconds for each period. Bin width 2 msec. Cursors = 30-60 msec ISI target.
Fig. 2. Interspike interval histograms of selected periods from one experiment involving bilateral PTNs. All histograms comprise all ISIs for each 5 min behavioral period. C, contingent; NC, non-contingent; E is 5 min mean error; H is 5 min mean time-on-target; M is modal interspike interval. Bin width 2 msec. Cursors mark 30-60 msec ISI target window.
neurons showed an increased number of ISIs within the 30-60 msec target and hence the error for both units decreased. Between DRTP1 and DRTP5, the monkey significantly controlled the right precentral unit, and decreased the mean DRTP5 error to 669 msec/15 sec. Between these same periods there were no significant changes in the error or hit scores for the non-contingent neuron. At the fifth time out period, the contingencies were reversed so that the monkey was reinforced to control the left-sided unit and, between DRTP5 and DRTP6, the originally non-contingent unit decreased its error from 4570 msec to 2798 msec. By DRTP10, the monkey had decreased further the error score for the left-sided unit. Because the monkey did not significantly improve the hit or error scores for this unit between DRTP6 and DRTP10, it would not meet the criteria for being operantly controlled. However, periods DRTP6-10 are significantly different from those in which the monkey was controlling the right-sided unit. Thus, one sees two effects on this unit's ISI distribution: (a) the effect due to the monkey alerting to the operant task in general, and (b) the effect of the monkey attempting to operantly control the units. Between DRTP6 and DRTP10, the firing patterns of the right precentral unit decreased progressively and during DRTP10 the error score was not significantly different from that recorded during the preconditioning period. Fig. 2 is a subsequent experiment from the same monkey in which bilateral PTNs were recorded. In addition to the mean error scores, the mean time-on-target (H) and the modal ISI (M) are given.
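As a rough illustration of the quantities shown in Figs. 1 and 2, the following Python sketch builds an ISI histogram with the 2 msec bin width given in the captions and reads off a modal ISI. How M was actually obtained is not stated in the paper; taking the center of the most populated bin, with ties resolved toward the shorter interval, is an assumption, as are the function names.

```python
from collections import Counter

BIN_MS = 2.0  # bin width given in the figure captions


def isi_histogram(spike_times_ms, bin_ms=BIN_MS):
    """Count interspike intervals per bin; keys are bin lower edges in msec."""
    counts = Counter()
    for prev, curr in zip(spike_times_ms, spike_times_ms[1:]):
        isi = curr - prev
        counts[int(isi // bin_ms) * bin_ms] += 1
    return counts


def modal_isi(counts, bin_ms=BIN_MS):
    """Center of the most populated bin (ties go to the shorter ISI).

    Assumption: the paper does not say how M was read from the histogram.
    Raises ValueError if the histogram is empty.
    """
    best_edge = min(counts, key=lambda edge: (-counts[edge], edge))
    return best_edge + bin_ms / 2.0
```

A modal ISI computed this way can then be compared directly with the 30-60 msec cursors to see whether the mode lies inside the target window.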
Essentially, the findings are the same as previously illustrated except that the monkey's performance plateaued for the initially contingent unit by the third DRTP period; i.e. the monkey's ability to rapidly control units had improved. It should be noted that for the non-contingent neuron, the histogram during DRTP1 became more gaussian in shape when compared to preconditioning, and the neuron's modal ISI entered the target range although the monkey was not reinforced to control this unit. In comparing this experiment to the previous one, note that during DRTP6 the mean error and hit scores returned to DRTP1 levels for the right-sided unit (row A) and did not significantly change through the subsequent DRTP periods when this unit was non-contingent. For the unit in row A, the modal ISI changed significantly during the DRTP periods in which this unit was non-contingent; however, it did not change significantly during the DRTP periods when this unit was contingent. The same finding holds true for the left-sided unit (row B); i.e. the modal ISI became relatively stable during periods in which the monkey was operantly controlling the neuron. Table IA summarizes the changes in error scores between DRTP1 and DRTP5 for the initially contingent and non-contingent neurons. For the initially contingent neurons, 28 showed statistically significant decreases in error between DRTP1 and DRTP5 and 10 showed no significant change in error; i.e. were not operantly controlled. For the non-contingent neurons during the same operant periods, 10 showed a significant decrease in error, 8 showed a significant increase in error, and 20 showed no change. Table IB summarizes the data for the same units after the contingencies had been reversed. In this group there were slightly more neurons which were not statistically controlled and, in fact, 2 neurons showed statistically significant increases in error between DRTP6 and DRTP10. Data were further analyzed to compare the error scores of non-contingent neurons during periods in which contingent neurons were not controlled.
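The decrease / increase / unchanged classification summarized in Table I can be sketched in Python as a comparison of the twenty 15 sec error scores from two periods. The paper specifies only Student's t-test at P < 0.05; the two-tailed, equal-variance, independent-samples form used here, the function names, and the hard-coded critical value (2.024 for 38 degrees of freedom, i.e. twenty epochs per period) are assumptions for illustration.

```python
from math import sqrt
from statistics import mean, variance

T_CRIT_DF38 = 2.024  # two-tailed critical t for P < 0.05 with 38 degrees of freedom


def pooled_t(a, b):
    """Student's two-sample t statistic (equal-variance form) for mean(b) - mean(a)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(b) - mean(a)) / sqrt(pooled_var * (1.0 / na + 1.0 / nb))


def classify_error_change(baseline_errors, later_errors):
    """Label the change in mean error as 'decrease', 'increase' or 'unchanged'.

    Assumes twenty epochs per period and non-zero pooled variance.
    """
    if len(baseline_errors) != 20 or len(later_errors) != 20:
        raise ValueError("critical value is tabulated for twenty epochs per period")
    t = pooled_t(baseline_errors, later_errors)
    if t <= -T_CRIT_DF38:
        return "decrease"
    if t >= T_CRIT_DF38:
        return "increase"
    return "unchanged"
```

Under the criterion stated in the Methods, a contingent unit would count as operantly controlled only if its error showed a significant decrease and its hit score, tested the same way, showed a significant increase.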
In general, no change in error values occurred in the non-contingent neurons while the monkey was attempting, unsuccessfully, to control the contingent neurons, although 60% of non-contingent units' error scores were lower during these periods than during the preconditioning periods. Table II gives the mean and standard deviation for contingent and non-contingent neurons' modal ISIs during the preconditioning, baseline (DRTP1) and DRTPb periods. It can be seen that simply alerting the monkey to the operant task decreased the mean and variance of the modal ISIs of the non-contingent units.

TABLE I
(A) Number of neurons showing statistically significant changes in mean error scores between DRTP1 and DRTP5

                  Decrease   Increase   Unchanged
Contingent           28          0         10
Non-contingent       10          8         20

(B) Number of neurons showing statistically significant changes in mean error scores between DRTP6 and DRTP10

                  Decrease   Increase   Unchanged
Contingent           26          2         10
Non-contingent        6          6         26

TABLE II
Mean and standard deviations of modal ISIs (in msec) for 38 contingent and non-contingent neurons during preconditioning, baseline (DRTP1 or DRTP6) and DRTPb periods

          Contingent      Non-contingent
Pre       46.5 ± 19.6     47.5 ± 19.8
DRTP1     36.7 ± 18.4     35.4 ± 18.5
DRTPb     35.5 ± 12.9     35.4 ± 15.0

If the changes in unit firing patterns are due to a generalized response (such as increased γ efferent tone) rather than specific manipulations of peripheral mechanoreceptors (as previous data suggest), changes in the units' firing patterns should significantly covary. For periods DRTP1, DRTP5, DRTP6 and DRTP10, correlations (Pearson's r) between the units' 15 sec hit scores were computed. In no case was a significant correlation found (sign ignored); i.e. hits did not positively or negatively covary between units during operant periods. It should be noted that all units were recorded within the hand-arm region of both precentral cortices, as confirmed anatomically.
DISCUSSION
These data help clarify results from other single unit conditioning studies. In a previous report 8, it was noted that many neurons demonstrated significantly different firing patterns when comparing the pre- and postconditioning periods; neurons successfully controlled showed more tonic firing patterns during the postconditioning period. Since the monkey's behavioral state between pre- and postconditioning remained the same, it was suggested that the change in firing pattern might reflect either a change in the efficiency of the synaptic pathways which mediated the response, or that the operant response itself became a secondary reinforcer which did not extinguish immediately after reinforcement was withdrawn. The present experiments indicate that the latter is more likely since, as the monkey became more proficient in changing contingencies from one neuron to the other, the firing behavior of the originally contingent neuron rapidly reverted to baseline patterns. Figs. 1 and 2 represent 2 experiments from the same monkey at different points in time. Fig. 1 is an early experiment with this monkey, while Fig. 2 is an experiment conducted a month later. If one examines the DRTP1 and DRTP6 periods for the unit in row A of these figures, it will be noted that during the earlier experiment (Fig. 1) the histogram in DRTP6 is significantly different from that in DRTP1. On the other hand, in Fig. 2, the DRTP6 histogram more closely resembles that of DRTP1. These data indicate that previously reported sustained postconditioning changes in firing pattern are probably secondary to the monkey's anticipation of being reinforced again for the operant response.
Evarts 1 reported that the mean of modal ISIs for precentral neurons is approximately 45 msec during periods in which monkeys are awake but motionless. We have reported that when monkeys operantly control similar neurons the modal ISIs are near 30 msec even though the monkeys are motionless 10,11. In the present experiments, it is shown that even neurons in cortex contralateral to operantly controlled units generate average modal ISIs of 35.4 msec. Therefore, with a target range of 30-60 msec, this operant task requires the monkeys to increase the occurrence of intervals which are physiologic for the units. Single unit operant control is encumbered after dorsal column section 7 and abolished with contralateral ventral rhizotomy 9. These and other data 6-10 strongly suggest that the operant response is mediated by the monkey manipulating the peripheral mechanoreceptors which secondarily generate afferent feedback to the cortical neurons. However, those data do not rule out the possibility that the operant response is mediated by the monkey simply learning to make some non-specific response at the spinal level, such as increasing γ motoneuron activity. The results from these bilateral experiments demonstrate quite clearly that although most contingent and non-contingent neurons change their ISI distributions into the 30-60 msec target range when the monkey alerts to the operant task, only the contingent neurons show longitudinal increases in time-on-target with concomitant decreases in mean error. These data indicate that, at least between hemispheres in cortical areas receiving projections from the same spinal segments, the monkey is indeed operantly controlling the firing pattern of the contingent neuron above that which is secondary to a generalized alerting effect.
Therefore, although these results do not address the question of how specific the afferent pathways are that mediate the operant response, they make it highly unlikely that the reinforced operant response is generated from a generalized spinal mechanism.
ACKNOWLEDGEMENTS
This research was supported by NIH Research Grants NS 04053, NS 14590, and Teacher Investigator Award (ARW) NS 00195 awarded by the National Institute of Neurological and Communicative Disorders and Stroke, PHS/DHEW. A.R.W. is an affiliate of the Child Development and Mental Retardation Center, University of Washington, Seattle, Washington.
REFERENCES
1 Evarts, E. V., Temporal patterns of discharge of pyramidal tract neurons during sleep and waking in the monkey, J. Neurophysiol., 27 (1964) 152-171.
2 Fetz, E. E., Operant conditioning of cortical unit activity, Science, 163 (1969) 955-958.
3 Fetz, E. E. and Baker, M. A., Operantly conditioned patterns of precentral unit activity and correlated responses in adjacent cells and contralateral muscles, J. Neurophysiol., 36 (1973) 179-204.
4 Fetz, E. E. and Wyler, A. R., Operantly conditioned firing patterns of epileptic neurons in the monkey motor cortex, Exp. Neurol., 40 (1973) 586-607.
5 Schmidt, E. M., Bak, M. J., McIntosh, J. S. and Thomas, J. S., Operant conditioning of firing patterns in monkey cortical neurons, Exp. Neurol., 54 (1977) 467-477.
6 Wyler, A. R. and Burchiel, K. J., Factors influencing accuracy of operant control of pyramidal tract neurons in monkey, Brain Research, 152 (1978) 418-421.
7 Wyler, A. R. and Burchiel, K. J., Operant control of pyramidal tract neurons: the role of spinal dorsal columns, Brain Research, 157 (1978) 257-265.
8 Wyler, A. R. and Finch, C. A., Operant conditioning of tonic firing patterns from precentral neurons in monkey neocortex, Brain Research, 146 (1978) 51-68.
9 Wyler, A. R., Burchiel, K. J. and Robbins, C. A., Operant control of precentral neurons in monkeys: evidence against open loop control, Brain Research, 171 (1979) 29-39.
10 Wyler, A. R. and Robbins, C. A., Operant control of precentral neurons: the role of reinforcement schedules, Brain Research, 173 (1979) 341-343.
11 Wyler, A. R., Lange, S. C., Neafsey, E. J. and Robbins, C. A., Operant control of precentral neurons: control of modal intervals, Brain Research, in press.