Physiology & Behavior, Vol. 48, pp. 113-120. © Pergamon Press plc, 1990. Printed in the U.S.A.
0031-9384/90 $3.00 + .00
Learning and Rhythmic Human EMG in Ecological Perspective

MARY C. WETZEL

Department of Psychology, University of Arizona, Tucson, AZ 85721

Received 19 June 1989
WETZEL, M. C. Learning and rhythmic human EMG in ecological perspective. PHYSIOL BEHAV 48(1) 113-120, 1990. Previous evidence of strong interactions between learning and human treadmill locomotion led to a simplified system for studying learned rhythms in a framework of behavioral ecology. Motor control was combined with instrumental conditioning in a rhythmic hand task with repeating trials, blocks, and complete regimens. Regimen contexts differed with respect to the pattern of stimulation before and after an electromyographic (EMG) response. Both an antecedent stimulus (a light flash) and a consequent stimulus (a tone indicating success or failure) were necessary for conditioning. Arguments were given for defining reinforcement as a composite of interdependent and size-scaled processes, some including knowledge of results, instead of as a single event after a response.

Instrumental learning; Behavioral ecology; EMG biofeedback; Reinforcement; Motor control; Rhythmic behavior; Manual skill

THIS investigation of the ecology of highly skilled rhythmic electromyographic (EMG) activity by human hand muscles was prompted by findings from studies whose first focus was ensemble locomotor behavior in animals and humans (41, 42, 47). In a number of later experiments a discrete learned behavior was introduced, one stimulus together with a fine-grained EMG response, that incorporated elements of EMG biofeedback training. The new behavioral unit affected fine details of ongoing EMG in neighboring muscles as well as overall interlimb patterns for the two legs during walking (44, 46, 48). This and other evidence (43, 45) for multiple mechanisms driving patterned locomotion encouraged a more integrative approach to learning that would emphasize its context and in which small-to-large scale behaviors and their interconnections could be studied.

Four general principles concerned with analysis of a behavioral ecosystem were assembled and made the basis of the present work. None is unique to this study, and all four were implicit in an earlier review (43) that questioned a view of instrumental (operant) reinforcement as a mechanism that simply strengthens responses.

1. Relational Definitions of Stimulus and Response That Incorporate Reinforcement. In a 1986 review (43), the author advanced empirical and theoretical reasons for defining all stimuli in terms of the response(s) they control (S.R) and for defining reinforcement as establishing new stimulus control. In the case of instrumental (operant) learning, an antecedent stimulus, S1, acquires a discriminative stimulus function, Sd. The convention has the advantage of accommodating multiple mechanisms for accomplishing discriminative control, just as Thorndike (39) pointed out that "The word 'association' may cover a multitude of essentially different processes" (p. 378).

2. Single Stimuli and Responses Combine to Ensembles of Multiple Behavior. Individual behaviors can be identified and studied further in ensembles. Multiple stimulus functions produce multiple responses by way of a nervous system whose anatomy and physiology provide for pattern generation as well as individual connections. The principle is very general and well documented within neuroscience. Numerous discrete but integrated pathways have been identified since the time of Sherrington (35), as well as central pattern generator networks whose outputs consist of rhythmic neural activity and motor performance (13, 14, 34, 36, 38).

3. Stimulus-Response Units Not Independent of Context. Behaviors, whether defined as small units or large ensembles, are shifting processes that come and go. Their weights move up or down depending upon the external situation and the presence of other larger or smaller ongoing behaviors; for example, tapping a finger while talking. Neural evidence of the importance of context is overwhelming in the case of such "learning" parts of the brain as the hippocampus (23), and behavioral emphasis on context has increased from Tolman's (40) time to more recent years (3).

4. Taking Behavioral Scale Into Account. Scale involves more than the number of S.R units comprising a given behavior (No. 2 above); scale carries the sense of size as well as number. Large scale molar acts have been emphasized so much in psychology that one author wrote "a 'response' refers to the way in which an organism affects its environment; it does not refer to a muscle twitch" [(25), p. 155]. In contrast, most physiologists have selected muscle twitches and their electrical activity for understanding skilled behavior (16). Schöner and Kelso (34) made a case for understanding "at different scales of analysis (including behavioral patterns, neural networks, and individual neurons) and the linkage between them" (p. 25).

Within the framework of behavioral ecology outlined above a simplified experimental system was constructed for studying learned rhythms in context. The study merged principles of instrumental learning with those of motor control physiology.
One set of elements was included that have been widely accepted as basic for instrumental learning from at least the time of Thorndike (39) and his Law of Effect: an antecedent stimulus, a response, and a consequent stimulus called "reinforcing" because it is essential to the establishment or strengthening of new behavior. Utilized from the field of motor control were computer-assisted laboratory procedures for measuring details of EMG electrical potentials and biofeedback (4, 9, 10).

A related source of information was available on skilled human motor performance involving response consequences or feedback (1, 2, 26, 31, 33). In the performance literature a common expression is "knowledge of results" or "Information provided after a response that tells of the learner's success in meeting the environmental goal" [(31), p. 355]. The expression implicates a person's previous social experiences and awareness more broadly than does the usual view of the Law of Effect. The instrumental conditioning literature has also acknowledged a more complex, learned form of reinforcing consequence than innate properties of, for example, food or water. For the most part the concept and study have been confined, however, to conditioned reinforcement based on Pavlovian conditioning (6, 17, 18, 20, 21, 25, 32).

A simpler context was used here than in the previous natural treadmill locomotion setting, in which many muscles and processes were combined. A test response, R, consisted of strictly rhythmic bilateral EMG activity by a single hand muscle. The protocol was 1) simple in that it tested the contribution to learning (and extinction, omission of reinforcement after acquisition) of one antecedent stimulus, S1, and one consequent stimulus, S2, but it was 2) comprehensive in defining acquisition at all scales: single trials, blocks of trials, and complete learning regimens. The study also was comprehensive in providing different tests in which S1 (a brief light flash) and S2 (high or low tone for success or failure, respectively) were presented alone or as a pair with the EMG response, R.
EMG was not measured in any single way but by multiple computer-scored criteria of amplitude, duration, and timing that are characteristic of biofeedback experiments but not of most learning experiments. To exclude moment-to-moment interference by the subjects' personal behavior, as far as possible, demanding time constraints of a few hundred msec for performance were selected that in previous work (46, 48) had prevented subjects from consciously planning a response strategy for use within a trial. An unexpected outcome was that S2 failed to reinforce when it was presented in isolation, calling for a reevaluation of both the instrumental Law of Effect and concepts of knowledge of results, at least for rhythmic skills.

METHOD
Subjects and Introduction to Situation

All six subjects, four men (Nos. 47, 49, 50, and 51) and two women (Nos. 45 and 48), were new to the laboratory and without previous experience involving EMG "biofeedback." Before the experiment began, the same environment was arranged by a single experimenter for all subjects, who were tested separately. The person was seated in a quiet room in a padded armchair with forearms resting comfortably on wooden arms of adjustable height and position. After electrodes were attached (see below) the hands were placed with the palms downward and held in place with Velcro straps. No restriction was made on finger movement. EMG as a biopotential was explained to the subject by means of visible and audible accompaniments of his or her EMG activity from a monitor oscilloscope and microphone during practice sessions. The experimenter interpreted brief practice EMG bursts as resembling those to be produced during the experimental conditions, but pointed out that the computer would measure success or failure of EMG activity and not finger movement per se. At this point all EMG indicators were discontinued, and the particular experimental condition and protocol for the first (S1) and second (S2) stimulus were described thoroughly. As in previous work from the same laboratory, the test relied on the efficacy of the experimental treatments to overrule any private strategies by subjects, so no attempt was made to conceal the particular protocol. In this regard, subjects were advised that responses were always required at precise 2-second intervals.
Experimental Arrangements of Stimuli and Response

The first stimulus, S1, was antecedent: a brief light flash before R. The second stimulus, S2, was a brief tone sounding after R. From these elements three conditions were formed.

1) S1.R.S2 (light and tone). This condition included both antecedent and consequent stimuli and was expected to favor learning if both were important to the reinforcement mechanism. Light preceded the EMG response, R, and tone followed R (all subjects except No. 51). Extinction of this behavior, in which light flashed at the usual time but no tone followed R, was tested after conditioning in those subjects for whom sufficient weeks remained in the study (Nos. 45, 47, and 50).

2) S1.R (light only). Without distinctive stimulation after the response there was no known possibility of reinforcement. The response was entirely new in each subject's experience (see section below defining the total set of scoring requirements), to exclude wholesale transfer of stimulus control from any previously learned hand movement. As expected from similar unpublished results during locomotion, no conditioning occurred for one subject, No. 51, despite many consecutive trials. No shaping (37) of the precise, computer-defined EMGs occurred, and their configuration and timing remained highly variable. Moreover, S1 did not provide for immediate learning within the first block of trials for any subject, even when coupled with the tone, S2 (see the Results section). The demanding computer-scored requirements, therefore, ensured the light's original neutrality.

3) R.S2 (tone only). Here a tone sounded after R (Nos. 45, 47, 48, and 49), in a conventional arrangement usually assumed to be sufficient for reinforcing responses. Since no conditioning occurred within the lengthy duration of the investigation, no extinction (absence of tone) could be tested.
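The three arrangements can be made concrete as event schedules for one block. The sketch below is illustrative only and is not the laboratory's MicroNova program. It assumes the timing values given in the trial description of this Method section (a 2-sec cycle, a 100-msec light at time zero, and a 100-msec tone at the end of the 700-msec performance duration). The names `Event`, `block_schedule`, and the label `response_window` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Timing values taken from the Method section.
CYCLE_MS = 2000        # trials repeat every 2 sec
PDUR_MS = 700          # performance duration; the tone sounds at its end
STIM_MS = 100          # light and tone each last 100 msec
TRIALS_PER_BLOCK = 20

# condition name -> (light scheduled, tone scheduled)
CONDITIONS = {
    "S1.R.S2": (True, True),   # light and tone
    "S1.R": (True, False),     # light only (also the extinction arrangement)
    "R.S2": (False, True),     # tone only
}


@dataclass(frozen=True)
class Event:
    trial: int       # 0-based trial index within the block
    name: str        # "light", "response_window", or "tone" (illustrative labels)
    onset_ms: int    # time from the start of the block
    offset_ms: int


def block_schedule(condition: str) -> List[Event]:
    """List the scheduled events of one 20-trial block for a condition."""
    light, tone = CONDITIONS[condition]
    events: List[Event] = []
    for trial in range(TRIALS_PER_BLOCK):
        t0 = trial * CYCLE_MS  # time zero of this trial
        if light:
            events.append(Event(trial, "light", t0, t0 + STIM_MS))
        events.append(Event(trial, "response_window", t0, t0 + PDUR_MS))
        if tone:
            tone_on = t0 + PDUR_MS
            events.append(Event(trial, "tone", tone_on, tone_on + STIM_MS))
    return events
```

Under these assumptions the interval from tone onset to the next time zero is 2000 - 700 = 1300 msec, whichever condition is scheduled.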
Trials: Definition and Context of Stimuli and Responses

Figure 1 shows elements of the most complete trial condition, S1.R.S2, composed of light, EMG burst, and tone. S1 was a green light that flashed for 100 msec. The source was a light-emitting diode with rise time far less than 1 msec, mounted on a panel facing the subject. S2 was a Sonalert tone sounded for 100 msec. If the burst met criteria (see below) S2 was a high frequency tone (4500 Hz); if requirements were not met, a low frequency tone (2900 Hz) sounded instead to indicate failure. These features of the light and tones were identical to those in previous locomotor studies and were completely described ahead of time to the subjects. Figure 1 (right) diagrams a successful trial, in which S1 acquires an Sd function, thereby identifying S2 as a reinforcer, S+ [positive in this case; see (18) or (37) for details]. Although all subjects probably had experienced previous learning involving similar lights and tones as cues or reinforcers in natural situations, their new stimulus role was novel in the experimental context.

A silent electrical trigger pulse (24) was given manually by the experimenter when the subject indicated readiness to initiate a block of 20 trials whose events were repeated automatically every 2 sec by means of an electronic timer. The zero point in time (every 2 sec) for every successive trial marked the start point that timed a performance duration of 700 msec (PDUR in Fig. 1), the period within which the experimental EMG burst had to occur. If scheduled, as in Fig. 1, the light flashed in front of the subject at each time zero, and at PDUR termination the tone sounded.

FIG. 1. Trial: Sd.R.S+. Small scale behavior unit defined by antecedent and consequent stimulation: S1.R.S2. Elements of a conditioning trial used to construct a simple periodic learned behavior using a human hand muscle (first dorsal interosseous). Light, the antecedent S1, was originally neutral in having no effect upon EMG but in all subjects S1 rapidly gained control as discriminative stimulus, Sd (see text and Fig. 4). Bilateral responding was successful if it met a predefined filtered amplitude (not shown but estimated by dashed line superimposed on raw EMG) within 300 msec (RTmax) and was maintained for at least 100 but no longer than 400 msec within the PDUR (performance duration) of 700 msec. Tone was high for pass or low for fail. In Trial 1, the response failed because latency exceeded 300 msec. Similar EMG with shorter reaction time in Trial 2 was scored as passing.

The first dorsal interosseous muscle was selected for production of the response, R, because of its relatively small number of motor units and discrete action to abduct the index finger (11). A bilateral EMG response (to equalize sensory input and motor outflow on left and right sides) was predefined in terms of four features, all of which had to be satisfied for a high tone to follow: threshold amplitude (see below), duration (100-400 msec), onset latency defined as a reaction time of 300 msec or less, which had been achieved readily in previous experiments by Wetzel and Gorman (44), and occurrence within the performance duration (700 msec).

For computer definition and scoring, the test EMG was rectified and run through two low pass filters: an analog filter in a TECA preamplifier, set at 16 Hz, and a digital filter in a MicroNova computer program, with a cutoff frequency of 9.95 Hz. An arbitrary filtered threshold amplitude of 40 µV was required for acceptance by the computer of the experimental EMG burst. The criterion value was set by visually aligning a comparison voltage oscilloscope trace with the oscilloscope's calibration voltage and is represented by a dashed horizontal line in Fig. 1. This voltage level was identical to the standard value that had been used for subjects in previous studies in the same laboratory.
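The scoring rules above lend themselves to a short computational sketch. The following is not the original MicroNova program, whose details are not given; it is a minimal illustration under stated assumptions: one sample per millisecond, a first-order recursive low-pass filter standing in for the unspecified digital filter, and a burst defined as the first unbroken run of samples at or above threshold inside the performance duration. All function names are hypothetical, and `square_burst` produces synthetic data only.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence

# Values stated in the text.
THRESHOLD = 40.0     # filtered threshold amplitude (units as in the text)
RT_MAX_MS = 300      # latest allowed burst onset after time zero
DUR_MIN_MS = 100     # shortest acceptable burst
DUR_MAX_MS = 400     # longest acceptable burst
PDUR_MS = 700        # performance duration
CUTOFF_HZ = 9.95     # cutoff frequency of the digital low-pass filter

# Assumption for illustration only: one sample per millisecond.
SAMPLE_RATE_HZ = 1000.0


def envelope(raw: Sequence[float], cutoff_hz: float = CUTOFF_HZ,
             fs: float = SAMPLE_RATE_HZ) -> List[float]:
    """Full-wave rectify, then smooth with a first-order low-pass filter.

    The filter type is an assumption; the text gives only the cutoff.
    """
    dt = 1.0 / fs
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out: List[float] = []
    y = 0.0
    for x in raw:
        y += alpha * (abs(x) - y)
        out.append(y)
    return out


@dataclass
class BurstScore:
    onset_ms: Optional[int]   # None if threshold was never reached
    duration_ms: int          # length of the burst inside the PDUR
    passed: bool


def score_burst(env: Sequence[float]) -> BurstScore:
    """Score one hand; env[0] is time zero, one sample per msec."""
    window = env[:PDUR_MS]    # only activity inside the PDUR is scored
    onset = next((i for i, v in enumerate(window) if v >= THRESHOLD), None)
    if onset is None:
        return BurstScore(None, 0, False)
    end = onset
    while end < len(window) and window[end] >= THRESHOLD:
        end += 1
    duration = end - onset
    passed = onset <= RT_MAX_MS and DUR_MIN_MS <= duration <= DUR_MAX_MS
    return BurstScore(onset, duration, passed)


def score_trial(left_env: Sequence[float], right_env: Sequence[float]) -> bool:
    """A trial passes (high tone) only if both hands pass."""
    return score_burst(left_env).passed and score_burst(right_env).passed


def square_burst(onset_ms: int, duration_ms: int,
                 level: float = 50.0) -> List[float]:
    """Synthetic rectangular envelope, for demonstration only."""
    return [level if onset_ms <= t < onset_ms + duration_ms else 0.0
            for t in range(PDUR_MS)]
```

For example, `score_burst(square_burst(250, 200))` passes, whereas the same burst with an onset of 320 msec fails on latency alone, which is the kind of narrow distinction between the two trials of Fig. 1. With an onset limit of 300 msec and a maximum duration of 400 msec, a single passing burst necessarily ends within the 700-msec performance duration.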
The possibility of spurious contraction-produced timing cues due to extra finger movements was ruled out by monitoring EMG and verifying that it did not occur at times other than those specified by the experimental protocol (see other comments below about raw EMG records). Other obvious extraneous sources of control, such as rhythmic foot tapping, were ruled out by inspection of the subjects as they worked.

In addition to measurement by computer of filtered EMG, raw EMG activity (as shown in Fig. 1) was examined later by inspection of taped cycles. Raw EMGs were recorded on magnetic tape at 38.1 cm/sec, as was the signal from the switch used by the experimenter to initiate a block of trials. These records were played back at slower speed, 19.1 cm/sec, to a Gould 2400 strip chart recorder for subsequent visual analysis. Surface bipolar EMG was distinctive against a background of little or no electrical noise when recorded by a TECA electromyographic machine whose filters were set to record a bandwidth from 16 to 8000 Hz. EMG amplitude varied little from day to day as a result of changes in electrode position or impedance. Inspection of raw EMG was important to reveal configuration of learned bursts. It also verified that successful activity was confined to the PDUR. If rhythmic contractile activity had begun at some time other than when set by the trial conditions, the resulting proprioceptive afferent input could have served as a cue to add to or replace the experimental light flash, but no such activity occurred in the test muscles (see Results section and Fig. 5).
Behavioral Ecosystem: Blocks to Regimens

Following preliminary practice in producing EMG bursts and description of the S1 and/or S2 arrangement, the operator initiated a MicroNova lab computer program, as described above, to control events of 20 trials in that particular block and regimen. Figure 1 shows two consecutive trials and an intertrial interval for the S1.R.S2 condition, in which light was presented with tone. Note that the fine distinction between a passing and a failing trial is not readily discernible from visual inspection, nor could subjects voice its characteristics. Failure in Trial 1 consisted of a latency only slightly exceeding the 300-msec maximum.

A complete regimen, or reinforcement history, consisted of consecutive 20-trial blocks on the same day or subsequent days until a criterion was met of two consecutive blocks in which 90% or more responses were successful or until conditioning was seen to have failed. Tape recordings were made for all blocks and most trials throughout training to provide information about details of raw EMG. Since the S1.R condition was essentially a control for neutrality of the antecedent light, the main effects examined, in different orders across subjects to equalize experience or sequence effects of previous regimens, were the traditional conditioning arrangements of S1.R.S2 and R.S2 (plus extinction trials where appropriate for three persons). The particular orders were scheduled as follows.

Regimens in sequence one: R.S2, S1.R.S2. The computer scored each EMG response and followed it with the appropriate tone (high for success or low for failure, as described previously). For two of five subjects, Nos. 48 and 49 by random selection, this treatment formed the initial regimen. After 25 blocks of trials, criterion was not met by either subject, and the frequency of correct EMG bursts remained less than 50% (see the Results section). At this point conditioning attempts were terminated, and subjects 48 and 49 experienced a second regimen, S1.R.S2, named for the trial type depicted in Fig. 1. This regimen was successful for both subjects (see the Results section).

Regimens in sequence two: S1.R.S2, S1.R (extinction), R.S2. Three subjects, Nos. 45, 47, and 50, received the two-stimulus regimen first by random assignment and, since conditioning was successful, extinction was attempted: trials in which the light flashed at its usual time but no tone followed R. When little or no change in responding occurred after 25 blocks of extinction trials, i.e., there was no decrease in the performance level, Nos. 45 and 47 experienced the R.S2 condition (time did not permit testing No. 50). As with the other 2 subjects, criterion percentages of successful responses were not achieved after 25 blocks, so testing was ended.

RESULTS
The combination of stimulus and response elements, not their individual contributions, proved to be the most important variable. Under the particular set of rhythmic conditions, reinforcing efficacy of S2 following the bilateral EMG response, R, was entirely dependent upon whether or not there had been a designated antecedent stimulus, S1, preceding R. This finding completely outweighed other variables such as the particular subject or the sequence of regimens.
Consequences (Tone) Alone Did Not Shape Operant Behavior

The R.S2 condition was so deficient that criterion was not achieved by any of the four subjects. Subjects frequently mentioned their confusion about what to do and, if previously successful with the S1.R.S2 regimen, reported that the tone by itself had a disruptive effect. Figure 2 shows that whether (A) or not (B) a subject had any previous experience with the green light as S1, little or no improvement occurred across 25 blocks of trials (500 single trials). Subjects in A whose first regimen included light as S1 had somewhat higher frequencies, overall, of correct responses, but there was only one instance of 90% performance and for only one block (see Fig. 2A, No. 47, filled circle).

Figure 3A-D shows representative consecutive trials, three for each, for the same four subjects whose conditioning records appear in Fig. 2. EMG burst configurations were highly variable across subjects. Timings were imprecise, and sometimes there was no burst at all (see example in Fig. 3B, second trial for No. 47). The poor performances were striking. Subjects spontaneously reported that the tone interfered with their learning instead of strengthening it, a finding not at all predicted by most views of the reinforcing power of response consequences (37), especially since sounding of the tone was completely predictable at every 2 sec.

FIG. 2. No reinforcement by consequent stimulus alone; R.S2. Tone alone did not construct accurate, rhythmic learned behavior. (A) Shaping was attempted with tone (S2) alone immediately after a successful S1.R.S2 regimen (subjects 45 and 47). (B) Identical to A except that shaping was attempted with S2 in naive subjects (48 and 49), before the acquisition regimen that added light as S1 to tone as S2. Criterion (90% passage in two blocks of 20 trials each) was not met by any subject. [Axes: number of correct EMG bursts per block versus consecutive blocks of trials.]
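The acquisition criterion applied to every regimen (two consecutive 20-trial blocks with 90% or more successful responses, with unsuccessful regimens ended after 25 blocks) can be written as a short check. This is a minimal sketch, not the procedure's original code; the function names are hypothetical, and any block counts passed to it here would be invented values for illustration rather than data from the study.

```python
from typing import Optional, Sequence

# Values stated in the Method section.
TRIALS_PER_BLOCK = 20
CRITERION_PERCENT = 90     # 90% or more successful responses
CONSECUTIVE_BLOCKS = 2     # in two consecutive blocks
MAX_BLOCKS = 25            # unsuccessful regimens were ended after 25 blocks


def block_meets_criterion(correct: int) -> bool:
    """True if a block has at least 90% passing trials (18 of 20)."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return correct * 100 >= CRITERION_PERCENT * TRIALS_PER_BLOCK


def blocks_to_criterion(correct_per_block: Sequence[int]) -> Optional[int]:
    """Return the 1-based number of the block that completes the criterion.

    Returns None if two consecutive criterion blocks never occur within
    the first MAX_BLOCKS blocks.
    """
    run = 0
    for number, correct in enumerate(correct_per_block[:MAX_BLOCKS], start=1):
        run = run + 1 if block_meets_criterion(correct) else 0
        if run >= CONSECUTIVE_BLOCKS:
            return number
    return None
```

A single block at 90%, such as the one isolated instance for subject 47 in Fig. 2A, does not satisfy this rule because the run resets as soon as the next block falls below 18 correct.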
Reinforcing With S1 and S2 Highly Effective

Trials combining light (S1) and tone (S2) in which the reinforcement regimen constructed a well-defined instrumental unit of S and R gave an overwhelming advantage for conditioning. Two blocks of consecutive trials met the 90% criterion of success at or within 10 total blocks for all five subjects, and Fig. 4 (left side) shows the rapid conditioning for the three subjects who also experienced extinction trials. There was little evidence of extinction for 25 subsequent blocks (right side) in which the tone never occurred. Conditioning, in summary, was both rapid and durable. Figure 5 shows that brief, successful EMGs and onset latencies were similar within and across subjects 45, 47, and 50 during criterion trials (A-C) and at the termination of testing for extinction (samples from last block in D-F). In sharp contrast to Fig. 3, fine-grained bilateral responding, evident in individual records, had been achieved by the three-term sequence.

FIG. 3. Examples of shaping failures in R.S2 trials. Defective rhythmic EMGs appear in trials with S2 alone (T = tone); 3 consecutive trials for each of the same 4 subjects as in Fig. 2 and same labeling conventions as in Fig. 1. Burst configuration for a given person was similar from first to third trial, but timings varied. (A) Second trial failed because the EMG bursts began too soon, with the portion within the PDUR less than 100 msec long. (B) First 2 trials failed because burst was too late or did not occur. (C) Two trials (first and third) failed because EMGs were too long. (D) Failed EMGs (second two trials) were too long.

DISCUSSION
Relational Definitions of Stimulus and Response That Incorporate Reinforcement

Neither light alone (S1) nor tone alone (S2) directed or improved responding. The result was expected in the case of the antecedent S1, since it carried no indication about the amplitude or duration of the response to be produced. In contrast, the failure of the consequent S2, voiced by some subjects as interference, was not expected ahead of time. When the experiment was designed it had been speculated that the tone alone might be somewhat less effective than light and tone together, since tone marked 2-sec intervals but did not signal the start of a trial or burst. Nevertheless, S2 should have been at least partially effective since tone followed the response immediately in the accepted instrumental fashion (5), was understood clearly by the subjects as indicating success (if high) or failure (if low), and was repeated after precisely two seconds, an elapsed time presumably long enough to prevent difficulties with reaction time that have been reported for shorter rhythmic intertrial intervals of 2 per sec (22). It is unlikely that any qualities of tone per se, apart from its feedback role, were responsible since the auditory modality is known to favor reproduction of rhythms (12).

Other evidence about reinforcement points to more than one process, which may help to explain rapid acquisition under the present S1.R.S2 condition. Experiments since Thorndike's time have generated at least three views of how instrumental reinforcement operates, as reviewed by Rescorla (27, 28), Rescorla and Holland (29), and Colwill and Rescorla (5). In the first view an S-R association is the reinforced entity (7, 15, 18, 19), with the reinforcer spoken of as having a "catalyst" action and not otherwise participating in the bond. A second view, sometimes called "two-process" (30), focuses on findings that the particular reinforcer may have effects beyond just establishing an association; specifically "a second association, that between the antecedent stimulus and the reinforcer" [(5), p. 56] and formulated as a kind of "parallel Pavlovian association that forms in the course of instrumental learning" (p. 56). A third view has been attributed by Rescorla and his collaborators to Bolles, Mackintosh and Dickinson, and it is also largely true of Skinner (37) and many other operant behaviorists: that the response, R, is reinforced by the subsequent reinforcing stimulation.

FIG. 4. From S1.R.S2 to Sd.R.S+. Rapid and durable conditioning with light as S1 and tone sounding as S2 at end of PDUR. Three subjects (Nos. 45, 47, and 50 from top to bottom) were asked to produce an EMG burst when green light flashed. Criterion (same as in Fig. 2) was met in 10 or fewer blocks (left side). Extinction (right side) trials showed that when tone was omitted, the learned behavior defined as Sd.R continued at or near criterion performance for many blocks. [Axes: number of correct EMG bursts per block versus consecutive blocks of trials.]

For reasons that are not completely clear, but that almost certainly derived from the demanding response and timing criteria, the present study constructed a system in which relational aspects of reinforcement were accentuated. The first two suggestions listed above about reinforcement also relate stimulus and response. They remain plausible, although a relationship between S1 and S2 need not be confined to a catalytic or Pavlovian association, neither of which seems comprehensive enough to account for the complexities of human learning in the rhythmic task. The third, most "absolute" view, that reinforcement strengthens responses, was challenged previously for its failure to take S1 and the context into account, especially in formation of the discriminative stimulus, Sd (43). The challenge was upheld by the present data.

What the S1.R.S2 arrangement provided, and what was missing in the R.S2 treatment, was a mechanism for successfully modifying (shaping) all features of the EMG response across trials and synchronizing its timing in relation to S1 through reaction time and performance duration. The inadequacy of S2 alone was immediately obvious in the malformed and poorly timed EMGs of Fig. 3. A finely detailed and demanding response required an equally detailed and inclusive set of teaching stimuli, whose features are set out below by reference to three other principles contributing to a behavioral ecology.
Single Stimuli and Responses Combine to Ensembles of Multiple Behavior

The finding that efficacy of S2 was conditional on S1 in the rhythmic task is in accordance with research on weighting, competition, and behavioral inhibition, in which it has been pointed out that "stimulus control is inherently conditional in nature" [(49), p. 474]. The two sources of stimulus control also formed an ensemble learning mechanism that is compatible with a nervous system composed of "multiple parallel connections and multiple associations [combining] to build an overall representation" of the environment [(28), p. 159, see also (3)].

Probably several interconnected processes were involved in reinforcement and/or knowledge of results. The sequence of moment-to-moment events was largely unconscious. Perceptions of success and failure appeared to be associated with blocks of trials as a whole but not with individual trials repeating so rapidly, in what Adams (2) called "features of movement that are too fine-grain for cognitive representation" (p. 22). At the same time, consciousness and explicit knowledge of results were undoubtedly involved in a larger sense. Each experimental condition was fully described ahead of time and formed an overall background for that regimen. All subjects were aware that the light flash, when present, was a signal to produce an EMG burst, and they knew that a high tone meant success and a low tone meant failure.

Beyond the issue of conscious awareness, the combined power of S1 and S2 depended on previous experience in ways beyond those usually identified with conditioned reinforcement. Reinforcing power did not derive from any known pairing of tone with an innately reinforcing consequence, as stipulated in most animal research (18, 37). There was no evidence, moreover, that tone acted as a discriminative stimulus, although "As is well known, a long-standing claim has been that the necessary and sufficient condition for a stimulus to become a conditioned reinforcer is that it serve as a discriminative stimulus" [(49), p. 474]. Tone in one trial did not become a discriminative stimulus for producing a successful EMG response in the next trial, which was not surprising because of the long 1300 msec delay.
Stimulus-Response Units Not Independent of Context

Knowledge of the rhythmic context was important to revealing a dependency between the discrete S1 and S2. The EMG response was also set in context and consisted of far more than a single property such as amplitude or duration. R included both of these properties and further defined them in a complete temporal context of performance duration and reaction time within each trial. Finally, overall context was a pattern of strictly rhythmic trials fixed for each regimen.

Taking Behavioral Scale Into Account

With a complete behavioral system defined, both large and small scale events could be seen in size perspective. Large scale learning across trials and regimens was measured with conventional criteria but was almost all-or-none: rapid success or total failure to achieve the 90% criterion of a regimen, together with total retention of the rhythmic skill if it had been acquired. At small scale, the starting point was an individual EMG response defined in fine detail. Each trial was a self-contained all-or-none test of conditioning whose significant events occurred within periods encompassing a few tens of msec. Reinforcement could be measured at this small scale, therefore, as well as in terms of overall acquisition that surely involved knowledge of results in an important general sense.
CONCLUSIONS
Instrumental rhythmic learning proved to be a patterned, context-dependent phenomenon. There seems to be no previous study of a simple behavior ecosystem "idealized" by strict periodicity, despite an enormous literature pioneered by work of Ferster and Skinner (8) and concerning repetitive responding under a variety of reinforcement schedules. The strong effects seen here along a complete scale of measurement under strict constraints raised basic questions about how response consequences participate in reinforcement and knowledge of results. These questions warrant fairly extensive reexamination of old assumptions in new learning systems, whether constructed in the laboratory or occurring in natural environments.

FIG. 5. EMG timings and form shaped in trials of S1.R.S2, resulting in fine-grained successful responses. Individual records (as in Fig. 3, with L = light as S1 and T = tone as S2) showing two consecutive trials for subjects whose records appear in Fig. 4. Performances during criterion trials (A-C) and extinction trials (D-F sampled at the termination of testing) were indistinguishable for a single subject and were also quite similar across subjects.
ACKNOWLEDGEMENT
The study was supported in part by USPHS Grant AM29660 to Mary Wetzel.
REFERENCES
1. Adams, J. A. Response feedback and learning. Psychol. Bull. 70:486-504; 1968.
2. Adams, J. A. Learning of movement sequences. Psychol. Bull. 96:3-28; 1984.
3. Balsam, P. D.; Tomie, A., eds. Context and learning. Hillsdale, NJ: Lawrence Erlbaum Associates; 1985.
4. Basmajian, J. V. Muscles alive. Their functions revealed by electromyography, 4th edition. Baltimore: Williams and Wilkins; 1978.
5. Colwill, R. M.; Rescorla, R. A. Associative structures in instrumental learning. In: Bower, G. H., ed. The psychology of learning and motivation, vol. 20. Orlando: Academic Press (Harcourt Brace Jovanovich); 1986:55-104.
6. Dinsmoor, J. A. Observing and conditioned reinforcement. Behav. Brain Sci. 6:693-728; 1983.
7. Estes, W. K. Learning theory and the new "mental chemistry." Psychol. Rev. 67:207-223; 1960.
8. Ferster, C. B.; Skinner, B. F. Schedules of reinforcement. Englewood Cliffs, NJ: Prentice-Hall; 1957.
9. Fetz, E. E. Operant control of single unit activity and correlated motor responses. In: Chase, M. H., ed. Perspectives in the brain sciences, vol. 2. Operant control of brain activity. Los Angeles: University of California, Brain Information Service/Brain Research Institute; 1974:61-89.
10. Fetz, E. E.; Finocchio, D. V. Operant conditioning of specific patterns of neural and muscular activity. Science 174:431-435; 1971.
11. Garnett, R.; Stephens, J. A. The reflex responses of single motor units in human first dorsal interosseous muscle following cutaneous afferent stimulation. J. Physiol. 303:351-364; 1980.
12. Glenberg, A. M.; Mann, S.; Altman, L.; Forman, T.; Procise, S. Modality effects in the coding and reproduction of rhythms. Mem. Cog. 17:373-383; 1989.
13. Graham Brown, T. On the nature of the fundamental activity of the nervous centres; together with an analysis of the conditioning of rhythmic activity in progression, and a theory of the evolution of function in the nervous system. J. Physiol. (Lond.) 48:18-46; 1914.
14. Grillner, S. Neurobiological bases of rhythmic motor acts in vertebrates. Science 228:143-149; 1985.
15. Guthrie, E. R. Association as a function of time interval. Psychol. Rec. 40:355-367; 1933.
16. Henneman, E.; Shahani, B. T.; Young, R. R. Voluntary control of human motor units. In: Shahani, M., ed. The motor system: neurophysiology and muscle mechanisms. Amsterdam: Elsevier; 1976:73-78.
17. Himadi, W. G. Conditioned reinforcement from shock termination. Doctoral dissertation, University of Arizona; 1982.
18. Hull, C. L. Principles of behavior. An introduction to behavior theory. New York: Appleton-Century-Crofts; 1943.
19. James, W. The principles of psychology, vol. 1. New York: Dover; 1950. (Reprinted from Henry Holt, Co.; 1890.)
20. Kelleher, R. T.; Gollub, L. R. A review of positive conditioned reinforcement. J. Exp. Anal. Behav. 5(Suppl.):543-597; 1962.
21. Keller, F. S.; Schoenfeld, W. N. Principles of psychology. A systematic text in the science of behavior. New York: Appleton-Century-Crofts; 1950.
22. Klemmer, E. T. Rhythmic disturbances in a simple visual-motor task. Am. J. Psychol. 70:56-63; 1957.
23. Nadel, L.; Willner, J.; Kurz, E. M. Cognitive maps and environmental context. In: Balsam, P. D.; Tomie, A., eds. Context and learning. Hillsdale, NJ: Lawrence Erlbaum Associates; 1985:385-406.
24. Patterson, F. R.; Gorman, L. K.; Wetzel, M. C. Advantages of a simple contact switch for human locomotion. Am. J. Phys. Med. 63:11-17; 1984.
25. Perkins, C. C., Jr. An analysis of the concept of reinforcement. Psychol. Rev. 75:155-172; 1968.
26. Powell, D. A. Cognitive and affective components of reinforcement. Am. Psychologist 42:409-410; 1987.
27. Rescorla, R. A. A Pavlovian analysis of goal-directed behavior. Am. Psychologist 42:119-129; 1987.
28. Rescorla, R. Pavlovian conditioning. It's not what you think it is. Am. Psychologist 43:151-160; 1988.
29. Rescorla, R. A.; Holland, P. C. Behavioral studies of associative learning in animals. Annu. Rev. Psychol. 33:265-308; 1982.
30. Rescorla, R. A.; Solomon, R. L. Two-process learning theory: relationships between Pavlovian conditioning and instrumental learning. Psychol. Rev. 74:151-182; 1967.
31. Salmoni, A. W.; Schmidt, R. A.; Walter, C. B. Knowledge of results and motor learning: a review and critical reappraisal. Psychol. Bull. 95:355-386; 1984.
32. Schoenfeld, W. N.; Antonitis, J. J.; Bersh, P. J. A preliminary study of training conditions necessary for secondary reinforcement. J. Exp. Psychol. 40:40-45; 1950.
33. Schmidt, R. A. Anticipation and timing in human motor performance. Psychol. Rev. 70:631-646; 1968.
34. Schöner, G.; Kelso, J. A. S. Dynamic pattern generation in behavioral and neural systems. Science 239:1513-1520; 1988.
35. Sherrington, C. The integrative action of the nervous system. Cambridge: University Press; 1906 (second edition with a new foreword, 1947).
36. Shik, M. L.; Orlovsky, G. N. Neurophysiology of locomotor automatism. Physiol. Rev. 56:465-501; 1976.
37. Skinner, B. F. Science and human behavior. New York: Macmillan; 1953.
38. Stein, R. B.; Pearson, K. G.; Smith, R. S.; Redford, J. B., eds. Control of posture and locomotion. New York: Plenum; 1973.
39. Thorndike, E. L. Animal intelligence. In: Psychological review monograph supplements; 1898, No. 8. Reprinted in: Dennis, W., ed. Readings in the history of psychology. New York: Appleton-Century-Crofts; 1948:377-387.
40. Tolman, E. C. Cognitive maps in rats and men. Psychol. Rev. 55:189-208; 1948.
41. Wetzel, M. C. Controlled human locomotion on a treadmill. J. Hum. Move. Stud. 7:177-198; 1981.
42. Wetzel, M. C. Operant control and cat locomotion. Am. J. Phys. Med. 61:11-25; 1982.
43. Wetzel, M. C. Operant conditioning in motor and neural integration. Neurosci. Biobehav. Rev. 10:387-429; 1986.
44. Wetzel, M. C.; Gorman, L. K. Learning and locomotor reaction times. Hum. Move. Sci. 5:75-100; 1986.
45. Wetzel, M. C.; Howell, L. G. Properties and mechanisms of locomotion. In: Towe, A. L.; Luschei, E. S., eds. Motor coordination, vol. 5. Handbook of behavioral neurobiology. New York: Plenum Press; 1981:567-625.
46. Wetzel, M. C.; Pierce, D. L. Two-muscle coordination versus natural treadmill locomotion. Am. J. Phys. Med. 66:371-385; 1988.
47. Wetzel, M. C.; Stuart, D. G. Ensemble characteristics of cat locomotion and its neural control. Prog. Neurobiol. 7:1-98; 1976.
48. Wetzel, M. C.; Wetzel, R. E. Integration of learned and naturally occurring flexor EMG in the human step cycle. Physiol. Behav. 38:41-51; 1986.
49. Williams, B. Stimulus control and associative learning. J. Exp. Anal. Behav. 42:469-483; 1984.