JOURNAL OF EXPERIMENTAL CHILD PSYCHOLOGY 17, 322-331 (1974)

The Role of Instrumental Responding and Contiguity of Stimuli in the Development of Infant Secondary Reinforcement¹

ALBERT SILVERSTEIN²
University of Rhode Island

AND

LEWIS P. LIPSITT
Brown University

Ten-month-old infants received contingent pairings of a tone (T+) and a food reinforcer. Groups Sr and SD received the food on an FI 23-sec schedule for target touching, the former group receiving T+ immediately after the response and 1.5 sec prior to food and the latter group receiving T+ at the end of the intertrial interval. Group SC received food reinforcers 1.5 sec after T+ with no response required. A second tone (T0) was heard by all groups once during each intertrial interval, at randomly determined points. All groups subsequently were given a spatial discrimination task, receiving T+ for one alternative and T0 for the other. Group Sr gave significantly more responses for T+ than for T0, but neither of the other two groups produced a superiority for T+. Thus, both contiguity with a primary reinforcer and the presence of an operant during training appear to be necessary for a neutral signal to acquire the ability to enhance responding.

¹ This research was performed while the first author was a USPHS Special Research Fellow (MH 13492) at Brown University. The program of research was prepared with the support of a travel grant from the University of Rhode Island, research Grant-in-aid No. 132-173. Research facilities were available from USPHS Grant No. HD-03911 to the second author. The authors wish to thank Clement DeLucia, Brown University Psychology Department, who was responsible for constructing the apparatus, and Mrs. Helen Haeseler and the staff of the Brown University Child Study Center for their help in providing facilities and scheduling services.
² Requests for reprints should be sent to: Albert Silverstein, Department of Psychology, University of Rhode Island, Kingston, Rhode Island 02881.
Copyright © 1974 by Academic Press, Inc. All rights of reproduction in any form reserved.

A common strategy in psychological analyses of human motivation has been to appeal to the operation of learned rewards whose power is derived from their association with biologically basic rewards. When a
learned reward is used experimentally to maintain prior behavior or to strengthen new behavior, it is called a secondary reinforcer (Sr). The concept of secondary reinforcement often has been used successfully in accounting for the socialization of children (e.g., Bijou & Baer, 1961, 1965), on the assumption that its operation begins early in infancy. Yet, despite a voluminous literature on Sr (cf. Hendry, 1969; Wike, 1966), the first experimental demonstration of the phenomenon in human infants was only recently produced (Silverstein, 1972). In that study, a secondary reinforcer was used successfully to produce a spatial preference with 10-month-old infants during a single 25-min session. The occurrence of a tone which had previously been paired with the delivery of food (T+) was made contingent on touching either the left or right of two identical targets, while a neutral tone (T0) was made contingent on the other alternative. During a subsequent 5-min discrimination task, a significantly greater number of responses was made to the side that produced T+ than to the side that produced T0.

The definition of a secondary reinforcer used both in the Silverstein (1972) experiment and in the present study is a purely empirical one: a stimulus that enhances the level of a preceding response (beyond that of an appropriate control stimulus) because of its prior contingent presentation with a primary reinforcer. This definition does not commit one to any particular theoretical account of either the development of Sr or the mechanism by which Sr enhances level of responding. The verification of any theory of the process by which stimuli acquire the power to enhance responding in infants requires the exact specification of what experimental arrangements are necessary during the pairing of a neutral stimulus with a primary reinforcer for the former to produce a preference such as that found by Silverstein (1972).
In common with most animal studies of Sr, that experiment included both the presence of an instrumental response and an invariably close contiguity between T+ and food delivery. Ss were required to touch a centrally located, illuminated target to receive each paired presentation of T+ and food, and they received T0 once during each intertrial interval at randomly determined times. The present experiment was designed to determine whether both the instrumental response requirement and the strict stimulus contiguity are necessary conditions for the effect found in the prior study. In the conditioning phase of the present experiment, one group of infants (Sr) replicated the conditions of the Silverstein (1972) experiment, a second group (SC, for sensory conditioning) passively received the paired T+ and food presentations, and a third group (SD) received T+ at the end of the intertrial interval as a signal that touching the target would bring food reinforcement. If the training procedure in Group SD
were completely successful, then T+ would exclusively control the target touching response, but it would still be more separated in time from food delivery (by the occurrence of the response) than was the case for Groups Sr and SC. To the extent that T+ failed to become a true SD for that group, its occurrence would be even more separated from food delivery. In order to make Group SD comparable to the other two, both T+ and T0 were presented for 3 sec, although it would have been a more efficient form of discrimination training to have T+ continue until S touched the target. In all three groups, the discrimination-learning task was identical to that of the prior study (Silverstein, 1972). To evaluate the effect of S's initial hand preference, half the Ss in each group were given T+ on their preferred side and half were given T+ on their nonpreferred side.

METHOD

Subjects

A total of 48 Ss, ranging in age from 9.87 months to 10.60 months, with a mean of 10.25 months, were used in this experiment. The Ss were randomly assigned to conditions within blocks of three, with the restriction that approximately the same ratio of males to females appear in each group. In two of the groups there were eight male and eight female Ss, while in Group Sr there were nine males and seven females. An additional 16 Ss were removed from the design either because of apparatus failure or failure to meet the minimum requirements for full participation (see below). Nine of these Ss had been assigned to Group Sr, four to Group SD, and three to Group SC. All Ss, whose names were drawn from the files of the Child Study Center at Brown University, were normal, healthy, full-term babies.

Apparatus

A complete description of the apparatus and a photograph may be found in Silverstein (1972). Each S was seated in a feeding table which was set into a gray wooden frame. A rotating feeder was attached to the top of this frame to deliver Froot Loops to the tray of the table via a concealed tube. Photocells were embedded into the top of the feeding table at points corresponding to where a center target and two side targets were projected. The center target used in the conditioning phase was a patterned black and yellow cross, and the two side targets used in the discrimination task were identical red circles separated by 15 in. Each target was projected by a separate slide projector housed above S's head at a mirror which reflected the image downward at an angle of approximately 65 degrees onto the tray. The tones used for T+ and
T0 were generated by two speakers enclosed together in a small metal box and placed directly beneath S's feet. One was a continuous tone of 4500 cps and the other was a pulsating tone of 2500 cps. Onset and offset of the tones and the operation of the feeder were controlled by a Hunter six-channel timer (model 1516). Touching one of the targets broke a photocell beam which triggered the delivery of food or tone reinforcement, and touches were registered on a chart recorder.

Procedure

Prior to the beginning of the experiment, E determined each S's current hand preference by offering him two identical toys on an otherwise empty table and noting which one was selected most often. Then, S was assigned T+ on the same or opposite side as his preference according to a counterbalancing procedure. Each S was placed in the feeding table by his mother, who then sat opposite him, in view, during the remainder of the experiment. The E sat in an adjacent room. The center target was visible throughout the conditioning phase for Groups Sr and SD but was not seen by Group SC. The sound of the projector fan was left on for Group SC to keep the auditory conditions equal to those of the other groups. For Group Sr, when S touched the target, one of the two tones was automatically activated for 3 sec and one Froot Loop fell in the region of the target 1.5 sec following tone-onset. For 20 sec following tone-offset S could receive no further cereal or T+, and at randomly predetermined times during each of these intertrial intervals, the second tone (T0) was presented for 3 sec. The two tones were assigned to be T+ and T0 in a counterbalanced order for each group. A total of 20 reinforcements was the maximum possible for each S in the conditioning phase. Group SC received an equivalent number of paired T+ and food presentations, but was required to make no response to obtain them.
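
The timing rules stated above for Group Sr can be laid out as a small simulation. The following Python sketch is illustrative only: the function and field names are hypothetical, the response latencies passed in are made-up inputs, and the rule for placing T0 (uniformly at random within the time-out, starting no sooner than 2 sec after food and ending before the time-out ends) is an assumption about how "randomly predetermined" was realized. The exclusion of the 5 sec before food delivery, which the Discussion also mentions, is not modeled.

```python
import random
from dataclasses import dataclass

TONE_SEC = 3.0           # duration of T+ and of T0
FOOD_DELAY_SEC = 1.5     # food follows T+ onset by 1.5 sec
TIMEOUT_SEC = 20.0       # no food or T+ for 20 sec after T+ offset
POST_FOOD_GAP_SEC = 2.0  # T0 is kept out of the 2 sec after food
MAX_REINFORCEMENTS = 20  # ceiling per S in the conditioning phase


@dataclass
class Trial:
    touch: float        # reinforced target touch; T+ starts here
    food: float         # Froot Loop delivery
    t_plus_off: float   # T+ offset; the time-out starts here
    t_zero_on: float    # onset of the neutral tone T0
    timeout_end: float  # the next touch after this time is reinforced


def sr_trial(touch: float, rng: random.Random) -> Trial:
    """Lay out one Group Sr trial starting from a reinforced touch."""
    food = touch + FOOD_DELAY_SEC
    t_plus_off = touch + TONE_SEC
    timeout_end = t_plus_off + TIMEOUT_SEC
    # Assumption: T0 onset is uniform over the part of the time-out in
    # which the whole 3-sec tone fits and the post-food gap is respected.
    earliest = max(t_plus_off, food + POST_FOOD_GAP_SEC)
    latest = timeout_end - TONE_SEC
    t_zero_on = rng.uniform(earliest, latest)
    return Trial(touch, food, t_plus_off, t_zero_on, timeout_end)


def sr_session(latencies, seed=0):
    """Chain trials; each latency is measured from the end of a time-out."""
    rng = random.Random(seed)
    trials, available_at = [], 0.0
    for latency in latencies[:MAX_REINFORCEMENTS]:
        trial = sr_trial(available_at + latency, rng)
        trials.append(trial)
        available_at = trial.timeout_end
    return trials


if __name__ == "__main__":
    # Hypothetical latencies (sec) for three reinforced touches.
    for t in sr_session([12.0, 15.5, 9.0]):
        print(f"touch={t.touch:6.1f}  food={t.food:6.1f}  "
              f"T0 on={t.t_zero_on:6.1f}  next available={t.timeout_end:6.1f}")
```

Because the time-out is timed from tone-offset, food becomes available again 23 sec after each reinforced touch, which agrees with the FI 23-sec schedule named in the abstract.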
To keep the temporal conditions for Group SC similar to those of Group Sr, the mean latencies for each of the 20 target-touching responses obtained in the conditioning phase of the previous experiment were added to the time-out period to determine when each tone-food pair was presented to S. Group SC also received T0 once during each 20-sec period following food delivery according to a randomly predetermined schedule. For Group SD the occurrence of T+ signalled the end of the time-out period and meant that the next target-touching response would bring a Froot Loop. For these Ss, the delivery of food was not accompanied by T+, but T0 was again delivered at random times during each subsequent time-out period. All Ss were shifted to the spatial-discrimination test 30 sec following
the conclusion of the training phase. Here, the two side targets were visible continuously for 5 min, during which time each response to the positive circle (C+) produced T+ for 3 sec and each response to the neutral circle (C0) produced T0 for 3 sec. All responses made to either target, except those made while a tone was sounding, produced the appropriate tone. On those rare occasions when S touched both circles simultaneously, both tones sounded for 3 sec. S could not keep a tone on for more than 3 sec by resting his hand upon one of the targets; he had to withdraw and touch again. Any S who either failed to respond for 3 min or who cried continuously for that long during the conditioning phase received no further training. If he had already received at least ten reinforcements at that point, he was shifted into the discrimination task, and his data were retained if he stopped crying and responded during the first minute of testing. In Group Sr three Ss received only 18 reinforcements, one received 16 reinforcements, and one received 14 reinforcements, while in Group SD two Ss received only 16 reinforcements. Those Ss who failed to receive at least 12 reinforcements were rejected.

RESULTS

Conditioning Phase

For Groups Sr and SD, latency from the end of each time-out period was obtained for the reinforced target touching response in order to determine whether Ss learned to discriminate the temporal interval following a reinforcement during which no further reinforcement was possible. Table 1 shows the mean latencies for successive blocks of two reinforcements for those Ss who received at least 18 reinforcers (N = 14 for both groups).

TABLE 1
MEAN RESPONSE-LATENCIES FOR TWO-TRIAL BLOCKS IN CONDITIONING

Block        1      2      3      4      5      6      7      8      9
Group Sr    17.3   14.0   14.2   15.0   17.9   15.3   19.5   10.4    9.7
Group SD    16.1   12.9    9.6   16.4   14.4   13.2   12.9   15.0    6.9

Both groups show a basically flat curve, and there appears to be no substantial difference between the latencies obtained for the two groups. These observations are supported by an analysis of variance which produced values of F of less than one for the factor of stimulus contingency and for its interaction with successive reinforcements. The F(8,208) for the reinforcements effect was only 1.72,
considerably less than needed for significance. These data show that Ss in Group Sr did not learn to discriminate the length of the nonreinforcement period, and that providing the Ss in Group SD with an external signal did not aid them in forming such a discrimination. The degree of responsiveness to the target during the intertrial interval was also not affected by the presence or absence of a cue signalling the beginning of a new trial. The mean number of responses per trial was 2.96 for Group SD and 3.01 for Group Sr, F(1,30) < 1. Both groups showed a tendency to increase the number of intertrial responses with successive reinforcements, as had Ss in Silverstein's (1972) study, but it cannot be shown that this trend was exclusively the result of the food delivery.

Test for Secondary Reinforcement

Table 2 shows the mean number of responses made to C+ and C0 during the 5-min test period by Ss in each of the conditions, and for subgroups that received C+ on their preferred side or on their nonpreferred side. In reading this table it must be noted that the placement of C0 for each subgroup is opposite to the placement of C+. Thus, Ss in each group who received C+ on their nonpreferred side received C0 on their preferred side.

TABLE 2
MEAN NUMBER OF RESPONSES TO EACH CIRCLE IN TESTING ACCORDING TO PLACEMENT OF C+ AT PREFERRED OR NONPREFERRED SIDE

            C+ at pref. side     C+ at nonpref. side     Combined
Group        C+       C0          C+       C0             C+       C0
Sr          17.72     6.82       10.08    12.77          13.90     9.81
SD           9.75    14.50        5.63    10.63           7.69    12.56
SC          15.62    11.75       11.38    13.12          13.50    12.44

A highly visible effect in this table is that of initial hand-preference. Ss in Groups Sr and SC clearly gave more responses to their preferred side, whether it was C+ or C0. There also appears to be a sizeable interaction between experimental condition and secondary reinforcement: Group Sr showed a superiority for C+ responses, Group SD
showed a superiority for C0 responses, and Group SC showed virtually no difference. An analysis of variance confirmed both these observations: the F(1,42) for the interaction between hand-preference assignment and C+ vs C0 = 5.75, p < .05, and the F(2,42) for the interaction of experimental conditions and C+ vs C0 = 3.25, p < .05. The value of F(2,42) for experimental conditions was only 1.92, p > .10, and no other F approached significance. The overall percentages of responses that were made to C+ were 58.2 for Group Sr, 40.2 for Group SD, and 52.6 for Group SC. The first value is very close to that obtained for the identically treated group in the Silverstein (1972) experiment. These values may be broken down into the number of responses made to C+ and C0 at each of the 5 min for the three experimental groups (see Fig. 1).

FIG. 1. Mean number of responses per minute for Group Sr (top), Group SD (middle), and Group SC (bottom).

An analysis of variance was performed on the data from each of these groups. For Group Sr, significant effects were found for both the time variable, F(4,135) = 6.59, p < .01,
and the secondary reinforcement variable, F(1,135) = 3.91, p < .05, indicating that S's responsiveness decreased across the 5-min period and that more responses were made for T+ than for T0. Despite the apparent diminution in superiority for C+ responses toward the end of the test period seen in Fig. 1, the interaction between time and secondary reinforcement was not significant (F < 1). Group SD also showed significant effects of both the time variable, F(4,135) = 5.88, p < .01, and the secondary reinforcement variable, F(1,135) = 13.16, p < .01. In this case, however, it was T0 that produced significantly more responses. In addition, the interaction of the two variables was significant, F(4,135) = 2.69, p < .05, indicating that the superiority of T0 diminished with time. Group SC responded virtually identically to the two targets, and only the effect of time was significant for that group, F(4,135) = 8.16, p < .01, the other values of F falling short of one.

DISCUSSION

The major conclusion to be drawn from the present experiment is that a tone paired only 20 times with food reinforcement can acquire the capacity to produce a subsequent spatial preference with infants if that tone's presentation is both in close contiguity with food delivery and contingent on the infant's making an instrumental response. This confirms a prior finding of Silverstein (1972) and extends it by showing that the absence of either of these presentation conditions precludes the tone's serving to enhance subsequent responding. The requirement that S perform an instrumental act in order to obtain the paired tone-food presentations, shown by the equality of responses to T+ and T0 in Group SC, is the more empirically significant finding. But this experiment does not permit us to decide whether S had to learn merely a contingency between any instrumental response and reinforcement during training, or whether the response had to be a "replicate" of the criterion response in the testing phase (Reynolds, 1949). The latter hypothesis is predicted by an Incentive Theory of Sr (Bolles, 1967) which emphasizes the role of reinforcers as conditioned stimuli for approach responding. In Group Sr, the presentation of T+ immediately after S touched the target and 1.5 sec prior to food delivery made that tone a CS for touching the cereal in the vicinity of the target. It is reasonable to assume that this CR could easily be transferred to a similar response in the testing phase. This hypothesis could be tested by manipulating the compatibility of the responses required during the training and testing phases of the experiment. Keehn (1962) found that the Sr value of an auditory signal for rats' approaching a food cup could be maintained during the test phase only if the training and test responses were compatible, and a similar manipulation should be attempted with infants.
T+ failed to become a real discriminative stimulus in the SD condition: during the training phase Ss in that group showed neither a progressive decrease in response latency nor shorter mean latencies than the Ss in Group Sr, who received no signal for the end of the intertrial interval. This failure restricts the empirical applicability of the finding that T+ did not produce superior responding to T0 in that condition. If T+ had shown SD properties during training and still had been less effective in producing the test response than was T+ in Group Sr, then the requirement of stimulus contiguity would have been established with greater generality. On the other hand, it is possible that if T+ had remained on until S touched the target during training, it might have become an SD and also might have produced more responding in the test phase. But it is clear that this would have required presenting T+ for longer periods than T0, a condition that did not obtain for either of the other groups. Also, the fact remains that an SD training procedure was less effective in producing a preference for one of the tones than was a procedure which emphasized the temporal contiguity of the tone and food. Indeed, T+ was significantly inferior to T0 in Group SD, a finding for which no consistent and convincing explanation is available but which may be related to T0 having been more contiguous to Ss' reaching for the food than was T+ for the majority of that group.
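
The group comparisons discussed here rest on the means in Table 2, and a short worked check may help in reading them. The Python sketch below is illustrative: it takes the Table 2 cell values to be those listed in the code, assumes that each preference subgroup contained 8 of a group's 16 Ss (half of each group, per the design), and uses hypothetical function names. It recomputes each combined mean as the average of the two subgroup means, and derives from the combined means both the share of responses made to C+ and the total responses to both circles.

```python
# Table 2 means as (C+, C0) responses for each group and subgroup.
TABLE_2 = {
    "Sr": {"pref": (17.72, 6.82), "nonpref": (10.08, 12.77), "combined": (13.90, 9.81)},
    "SD": {"pref": (9.75, 14.50), "nonpref": (5.63, 10.63), "combined": (7.69, 12.56)},
    "SC": {"pref": (15.62, 11.75), "nonpref": (11.38, 13.12), "combined": (13.50, 12.44)},
}


def pooled(group):
    """Average the preferred-side and nonpreferred-side subgroup means.

    This equals the combined mean only if the two subgroups are the
    same size (assumed here: 8 Ss each).
    """
    pref = TABLE_2[group]["pref"]
    nonpref = TABLE_2[group]["nonpref"]
    return tuple((a + b) / 2 for a, b in zip(pref, nonpref))


def percent_c_plus(c_plus, c_zero):
    """Share of all test responses that went to the C+ circle."""
    return 100.0 * c_plus / (c_plus + c_zero)


if __name__ == "__main__":
    for group, rows in TABLE_2.items():
        c_plus, c_zero = rows["combined"]
        p_plus, p_zero = pooled(group)
        print(f"Group {group}: pooled=({p_plus:.2f}, {p_zero:.2f})  "
              f"reported=({c_plus:.2f}, {c_zero:.2f})  "
              f"% to C+={percent_c_plus(c_plus, c_zero):.1f}  "
              f"total={c_plus + c_zero:.2f}")
```

Computed this way, the shares come out at roughly 58.6, 38.0, and 52.0 for Groups Sr, SD, and SC, close to but not identical with the reported 58.2, 40.2, and 52.6. One possible reason, offered only as a guess, is that the reported percentages were averaged over individual Ss rather than taken from group means. The totals (about 23.7, 20.3, and 25.9 responses) are of similar size in all three groups.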
It is surprising that the various stimulus contingencies in Phase I of the experiment did not affect the absolute number of responses made to both targets in Phase II, especially since Group SC saw no target in Phase I and thus received no practice in target touching.³ While admittedly speculative, the most plausible explanation for this is that the task and situational constraints produced a constant level of overall responsiveness and that the tones used could only alter the distribution of responses. Some support for this explanation was found in the data from pilot Ss run to determine (a) whether there were a priori preferences for the two tones and (b) whether Froot Loops were better reinforcers than tones. Despite wide individual differences, the mean sum of responses to both targets was quite similar in these two small groups (20.7 vs 24.8) and comparable to the means of the experimental groups reported here.

³ However, this finding is supported by the absence of any correlation in Groups Sr and SD between responsiveness to the target in Phase I (mean number of intertrial responses) and absolute number of responses to both targets in Phase II. The values of r were .300 and .239, respectively, both far short of significance.

The ability of the food-contingent tone in Group Sr to produce more responses than did T0 cannot have been the result of the latter having
acquired aversive properties, since T0 was presented at almost totally randomly determined times, excluding only the 5 sec prior to food delivery and the 2 sec following it (in order to avoid S hearing both tones simultaneously). This conclusion is substantiated by the fact that T0 produced more responses than did T+ in Group SD, even though it was more remote from food delivery in this group than in Group Sr. Since T0 did not actively weaken responding in Group SD, there is no basis for supposing that it did so in Group Sr. Indeed it seems quite possible that T0 acquired some reinforcement value of its own by virtue of its occasional contiguity with the food being picked up and eaten or manipulated.

While a progressive separation of the T+ and T0 curves in Group Sr would have provided the strongest evidence for a conditioned reinforcement effect, this is not a necessary criterion. The measurement of a conditioned reinforcer involves the concurrent extinction of its reinforcement capacity, except for those studies in which a complex schedule continues to allow the delivery of primary reinforcers. Thus it would be expected that a procedure involving only one short session and a maximum of 20 primary reinforcements would produce only a very transient superiority of T+. This, of course, limits the practical application of such a procedure. More stable effects would be expected if a lengthy series of such sessions was given to a few Ss in order to recondition T+ (Wike, 1969). Moreover, with repeated training sessions, a color or form discrimination could be employed, which would remove the problems of initial position preferences.

REFERENCES

BIJOU, S. W., & BAER, D. M. Child Development: I. A Systematic and Empirical Theory. New York: Appleton-Century-Crofts, 1961.
BIJOU, S. W., & BAER, D. M. Child Development: II. Universal Stage of Infancy. New York: Appleton-Century-Crofts, 1965.
BOLLES, R. C. Theory of Motivation. New York: Harper & Row, 1967.
HENDRY, D. P. (Ed.) Conditioned Reinforcement. Homewood, Illinois: Dorsey, 1969.
KEEHN, J. D. The effect of post-stimulus conditions on the secondary reinforcing power of a stimulus. Journal of Comparative and Physiological Psychology, 1962, 55, 22-26.
REYNOLDS, B. The acquisition of a black-white discrimination habit under two levels of reinforcement. Journal of Experimental Psychology, 1949, 39, 760-769.
SILVERSTEIN, A. Secondary reinforcement in infants. Journal of Experimental Child Psychology, 1972, 13, 138-144.
WIKE, E. L. (Ed.) Secondary Reinforcement. New York: Harper & Row, 1966.
WIKE, E. L. Secondary reinforcement: Some research and theoretical issues. In W. J. Arnold & D. Levine (Eds.), Nebraska Symposium on Motivation. Lincoln, Nebraska: University of Nebraska Press, 1969. Pp. 39-82.