Implicit rewards as reinforcers and extinguishers

JOURNAL OF EXPERIMENTAL CHILD PSYCHOLOGY 37, 31-40 (1984)

CHRISTOPHER F. SHARPLEY
Monash University, Clayton, Victoria

Previous studies of implicit reinforcement with children have presented the implicit reward phase after baseline conditions. The present study replicated this design and compared these effects within a design where implicit rewards were presented after direct rewards to both targets and peers. Thirty-two fourth grade children copied the letters of the alphabet under varying reward conditions. Data indicated the presence of reinforcement effects when presented after baseline, and extinction effects when presented after direct reward conditions.

The reinforcing effects of rewarding a “target” subject while “peers” are observing (but not performing a task) have been well documented in reviews of “vicarious” reinforcement (see Flanders, 1968; Kazdin, 1977, for reviews). In addition, several studies have been carried out to measure “implicit” reward effects in classroom situations (e.g., Broden, Bruce, Mitchell, Carter, & Hall, 1970; Christy, 1975; Drabman & Lahey, 1974; Kazdin, 1973; Patterson, 1974; Reppucci & Reiss, 1970; Scott & Bushell, 1974; Sechrest, 1963). The “implicit” reward situation has been defined by Bandura (1971, p. 234) as a situation where both targets and peers are performing the task, but only the target subjects are receiving direct rewards. The outcomes of these studies suggest that implicit “rewards”¹ also act as reinforcers. However, two other studies of implicit “rewards” (Ward & Baker, 1968; Sharpley, Irvine, & Hattie, 1980) did not support this suggestion. Ward and Baker (1968) found no significant effects upon peers, while Sharpley et al. (1980) noted quite opposite effects wherein these implicit “rewards” acted (apparently) as extinguishers upon both target and peer subjects. While Bandura (1971, Note 1) has noted that it is not clear whether implicit “rewards” will have punishing, extinguishing, or reinforcing effects, these apparent contradictions require further investigation.

Requests for reprints should be sent to C. F. Sharpley, Faculty of Education, Monash University, Clayton, Victoria 3168, Australia.

¹ “Rewards” refers here to “something given or received in return for doing something they do not necessarily increase the probability of the behavior they follow” (Kazdin, 1980, p. 29).

0022-0965/84 $3.00 Copyright © 1984 by Academic Press, Inc. All rights of reproduction in any form reserved.

Scrutiny of these previous research studies revealed that there were differences in design between those studies which showed reinforcing effects and those which showed extinguishing effects from implicit “rewards.” Two contrasting designs became evident: ABB1 and AB1B (where A = Baseline, B = Direct rewards, B1 = Implicit rewards). Because the conflicting data (in terms of whether implicit “rewards” acted as reinforcers or extinguishers) arose from the Sharpley et al. (1980) study wherein an AB1 design was utilized, the present study sought to clarify this issue by combining both designs in a comparative study (this may be seen in Fig. 1: there were two classes, each with two groups, targets and peers, with the order of presentation of reward conditions varying between the two classes).

METHOD

Subjects and Setting

Subjects were 32 children (20 boys, 12 girls) from a typical school in a large country town in New South Wales, Australia. All children were from grade 4 (CA range = 9.0 to 10.4, mean = 9.6). There were no outstanding academic or behavioral problems noted, and all children were volunteers from a typical middle-class neighborhood. All children had been born in Australia and all were monolingual.

The study was carried out in two normal classrooms during one morning (9 AM to 12 noon) of a typical school day. The principal, teachers, and children had agreed to the study on the basis that it would improve the children’s handwriting. (Reference to Fig. 1 will show that at the end of the study all groups were performing at levels superior to those measured during Baseline. This promise was, therefore, fulfilled, and the study seen to be of obvious benefit in these terms to school staff and children.) The author was able to make this promise on the basis of previous research incorporating these procedures for teaching handwriting (Sharpley et al., 1980).

Subjects were given a prestudy writing test which comprised the printing of the 26 letters of the alphabet and then grouped into matched pairs according to their performance. These pairs were then split to form two equal groups of 16 each, and one group was assigned to each treatment (i.e., 8 targets and 8 peers).

Experimenters

There were two experimenters assigned to each condition, thus four in all. Two of these were male, with the overall mean age being 27.1 years (range 21 to 45). All of the experimenters were postgraduate students in special education.

Dependent Variable

Handwriting has been suggested as a particularly sensitive measure of the transient motivational state of children (Flunckiger, Tripp, & Weinbeck, 1961). The rise in concern for the basic skills of literacy in schools (Baum, 1976; Corbin, 1976), plus the advantage that handwriting provides a permanent product of responses which is available for later rescoring (Hall, 1970), suggested the printing of letters by the children in this study as the task activity. This was to be performed upon prepared slips of paper appropriate to the grade. All subjects were required to write their names and the session number upon their sheet of paper prior to beginning the task. Depending upon the particular intervention strategy, all children received their papers from the previous session and the appropriate intervention response before completing the task for that session.

The letters of the alphabet were arranged in random order on a separate stimulus sheet for each trial of each particular intervention period. There were five trials during each intervention, thus five different stimulus sheets were prepared. The letters were initially printed by an experienced primary teacher who copied them from the official Course of Study (NSW Department of Education) as used in the school. The stimulus was presented on an overhead projection screen 2.5 m from the children in a normal classroom with random seating which was maintained throughout all phases of the study.

The scoring of handwriting is often associated with subjective responses on the part of the scorer (Anderson, 1965; Feldt, 1962), but recent methods have produced high levels of reliability (Helwig, Johns, Norman, & Cooper, 1976). Using one of these methods, transparent overlays were constructed to measure deviations in children’s responses to the model letter from 0 to 2 mm. The precise criteria used are detailed in Helwig et al. (1976, p. 232). The correction of each session’s responses was carried out immediately after the session by eight independent evaluators who were naive as to the actual nature and phase of the study. These evaluators were all postgraduate students in special education who had received a minimum of 2; hr training in the use of the specified criteria and the transparent overlay as a correction method. Reliability during training was 100% agreement between all of the evaluators and the trainer. During the actual period of the study, these evaluators were seated outside the classrooms so as not to be visible to the children inside. The slips of writing paper were collected by the two experimenters, taken out of the room to the evaluators, and returned the same way. Poststudy checks were carried out between evaluators on one session per phase.

Procedure: “Rewards”

The “rewards” chosen were feedback, improvement indicators (“Smiley” vs “Grumpy” faces), verbal praise, and Smarties (a sweet similar to M & M’s chocolate candies). The potential of these rewards as reinforcers was demonstrated by measuring the effects of their introduction to subjects under the ABB1 design. Comparisons of responses from phase A to B were made by time-series analysis (Glass, Willson, & Gottman, 1975), and showed that these rewards could act as reinforcers for these subjects and task (targets: t(6) level = 4.01, p < .05; t(6) slope = 10.62, p < .05; peers: t(6) level = 4.65, p < .05; t(6) slope = 6.33, p < .05).

Experimental Conditions

All children were exposed to the same reward conditions unless specified because of the nature of the reward phase. Children were asked to “copy these letters as well as you can onto the piece of paper we gave you.” No specific instruction was then given to copy the letters perfectly, rather to copy them as well as each child was able to in terms of ability and motivation.

A: Baseline. Handwriting was recorded for each subject over the five sessions of this intervention, with the experimenters’ only comment being “Thank you” as the individual pieces of writing paper were collected. Any questions were politely answered in a noncommittal fashion by the experimenters.

B: Intervention 1 (direct rewards to all subjects). During this phase all children received their previous session’s writing sheet back with the following experimenter response:

(a) those subjects who had improved upon the previous trial were told, “Good work. That’s an improvement. Have a smartie.” This experimenter response was also given to subjects who repeatedly scored 26/26;

(b) those subjects who had not improved upon the previous trial were told, “That’s not an improvement,” said without criticism.

After perusal of these sheets for about 30 sec, children were given a blank sheet and asked to copy the trial’s exercise onto it. Both sheets were collected at the end of the session.

B1: Intervention 2 (implicit rewards). Prior to this intervention, each treatment group was subdivided into two subgroups of equal size and ability. One subgroup became “targets” (who received their writing sheets and the experimenter response as during Intervention 1) and the other became “peers” (who did not receive either their writing sheets back or the experimenter’s comments; this constituted a return to Baseline conditions for those peer subjects). When the experimenters were questioned as to why only the targets received their sheets back (and the accompanying experimenter response), the experimenters replied that “The person who corrects them only gave me these back.” Because of seating arrangements, all children were able to see and hear these implicit reward conditions.

Following the final session, the experimenters thanked all the children for their participation and the implicit reward conditions were explained “as a way in which we were trying to find out what happened if some teachers forgot to give every child a reward for good work.”
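The contingencies described for the three phases can be summarized as a small decision rule. The following is only an illustrative sketch, not part of the original procedure: the function name, the phase labels "A", "B", and "B1", and the reading of "repeatedly scored 26/26" as two consecutive perfect sheets are all assumptions made for the example.

```python
MAX_SCORE = 26  # one point per letter of the alphabet

PRAISE = "Good work. That's an improvement. Have a smartie."
NO_IMPROVEMENT = "That's not an improvement."
THANKS = "Thank you"


def experimenter_response(phase: str, role: str, earlier: int, latest: int) -> str:
    """Return the experimenter's comment for one child in one session.

    phase   -- "A" (baseline), "B" (direct rewards), "B1" (implicit rewards)
    role    -- "target" or "peer" (only matters during B1)
    earlier -- score on the sheet before the one being returned
    latest  -- score on the sheet being returned
    """
    # Feedback is given to everyone in B, but only to targets in B1.
    gets_feedback = phase == "B" or (phase == "B1" and role == "target")
    if not gets_feedback:
        # Baseline, and peers during B1: papers are collected with thanks only.
        return THANKS

    improved = latest > earlier
    # Assumption: "repeatedly scored 26/26" means two perfect sheets in a row.
    repeated_ceiling = latest == MAX_SCORE and earlier == MAX_SCORE
    return PRAISE if improved or repeated_ceiling else NO_IMPROVEMENT
```

Under this reading, a peer and a target with identical scores receive identical comments during B but different ones during B1, which is the contrast the implicit reward phase is designed to create.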

In order to test the hypothesis that there would be no effect due to the order in which the implicit reward phase was presented, time-series analysis of baseline vs implicit reward vs direct reward phase data was planned. The use of time-series statistics to analyze data from planned changes of intervention over time has been documented elsewhere (e.g., Glass et al., 1975; Jones, Vaught, & Weinrott, 1977) as overcoming the confounding effects due to serial autocorrelation which emerge when traditional data analysis procedures are used with operant data (see Gottman & Glass, 1978; Sharpley & Rogers, 1981), and produces t statistics for changes in level (i.e., mean changes in performance) as well as slope (i.e., alteration in the trend of data over phases).

RESULTS

Reliability

Interscorer reliability was maintained by the training procedure which required 100% agreement, and by using a large group of scorers (8) to correct each session’s writing sheets. As an additional check on reliability, scorers were given children’s sheets on a random basis for each session, thus ensuring that no single child’s responses were biased by being scored by only one scorer for all sessions. Finally, 50 sheets were chosen at random and rescored by the author after the study. Reliability for this postcheck was .961, which maintains the high level achieved during training. The scoring procedure is therefore generalizable across conditions and consistently dependable over the total span of the study.

Data from the Study

Graphed representations of session by session group means are shown in Fig. 1. As was performed for the comparison between A and B phases for the experimental group ABB1, further analyses were carried out by use of time-series procedures and are presented below.

(1) AB1B (Fig. 1a)

Following a decreasing performance of correct responses during A, both target and peer subjects increased in level and/or slope for correct responses during B1 (targets: t(6) level = 1.94, p < .10; t(6) slope = 2.48, p < .05; peers: t(6) level = 2.60, p < .05; t(6) slope = 3.39, p < .05). This is a replication of previous studies mentioned above. The application of direct rewards for all children during B led to significant increases in the level of correct responses for both targets (t(6) level = 2.91, p < .05) and peers (t(6) level = 4.40, p < .05), thus demonstrating that the reinforcing power of the rewards used had not been negated by the previous implicit reward phase in this experimental group.

(2) ABB1 (Fig. 1b)

Both target and peer subjects’ responses decreased during Baseline, but showed significant increases in level and slope during condition B (p < .05). The reward thus acted as a reinforcer for correct handwriting responses during this second phase.

[FIG. 1a. Correct responses over trials, AB1B design subjects. Separate curves for targets and peers; horizontal axis: trials 1-5 within each phase.]

[FIG. 1b. Correct responses over trials, ABB1 design subjects. Separate curves for targets and peers; horizontal axis: trials 1-5 within each phase; phase labels include direct reward to both (B) and vicarious rewards (B1).]

However, data from phase B1 indicate that there were downward trends in the number of correct responses for both targets and peers during this phase (targets: t(6) slope = -9.14, p < .05; peers: t(6) slope = -8.67, p < .05). Level changes were not significant, although these may have become so if there had been more sessions during this phase. These downward trends during implicit reward phases for those subjects under design ABB1 constitute a major finding which suggests the presence of extinction effects during the implicit (B1) reward condition for this design.

The homogeneity of target and peer responses during B1 was tested by repeated measures analysis of variance upon target and peer data from B1 with data from B covaried out. No significant differences were noted between the peers (who were undergoing extinction conditions) and targets (who were still receiving previously reinforcing “rewards”): F(5, 8) = 2.84, n.s.

It is difficult to accept that satiation effects were confounding these data when the responses from a similar group of subjects (AB1B, Fig. 1a) showed no evidence of satiation, and actually were increasing correct responses after the same number of trials as those subjects who were decreasing correct responses within the ABB1 treatment group. It was thus apparent that the “reward” which had shown reinforcing effects during the second phase (i.e., B) for this experimental group, was now effective as a punisher during the final “reward” phase.
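To make the distinction between a change in level and a change in slope concrete, the sketch below fits a separate straight line to each phase and compares them. This is a deliberately simplified ordinary least-squares illustration and is not the Glass, Willson, and Gottman (1975) procedure used in the study, which models serial autocorrelation before producing its t statistics. The function names and the session means are hypothetical and are not data from the study.

```python
from statistics import mean


def fit_line(ys):
    """Ordinary least-squares fit y = a + b*x with x = 0..n-1; returns (a, b)."""
    xs = range(len(ys))
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def phase_change(before, after):
    """Compare two adjacent phases of session means.

    level_change: first fitted value of the new phase minus the value the
                  old phase's trend line would have predicted for that session.
    slope_change: new phase slope minus old phase slope.
    """
    a0, b0 = fit_line(before)
    a1, b1 = fit_line(after)
    projected = a0 + b0 * len(before)  # old trend extended one session forward
    return {"level_change": a1 - projected, "slope_change": b1 - b0}


def lag1_autocorrelation(ys):
    """Lag-1 serial correlation, the dependency that plain OLS ignores."""
    y_bar = mean(ys)
    num = sum((ys[i] - y_bar) * (ys[i + 1] - y_bar) for i in range(len(ys) - 1))
    den = sum((y - y_bar) ** 2 for y in ys)
    return num / den


# Hypothetical group means over five sessions per phase (not study data).
baseline = [18, 17, 16, 15, 14]
direct_reward = [16, 18, 20, 22, 24]
change = phase_change(baseline, direct_reward)
```

With these made-up values the level rises by 3 correct responses relative to the projected baseline trend and the slope changes from -1 to +2 per session. The nonzero lag-1 autocorrelation of the baseline series illustrates why the paper relies on time-series statistics rather than treating sessions as independent observations.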

DISCUSSION

The present study has replicated previous results which suggest that certain implicit “reward” conditions may possess reinforcing properties. Of major interest, however, is the finding that these same implicit “reward” conditions can also possess extinctive properties when applied after direct “reward” conditions. Two points emerge from the data concerning this finding. First, while these same implicit “reward” conditions were reinforcing for peers under an AB1B design, they were extinctive for peers under an ABB1 design. Second, direct “reward” conditions which were reinforcing for targets under the AB1B design were extinctive for targets under the ABB1 design. Thus, the temporal order of the entire implicit reward condition appears to be a major determining factor in explaining the disparity between studies which was referred to above. Although these two “reward” conditions may be treated separately, they are both necessary for any discussion of the “implicit reward” paradigm.

While the application of implicit “rewards” prior to direct “reward” conditions (i.e., AB1B) may constitute an incentive condition for peers and not disturb the reinforcer power of “rewards” received by target subjects (Broden et al., 1970, lends support to this position), the inclusion of implicit “rewards” after such a direct “reward” condition (i.e., ABB1) has been shown to act as an extinction process for both targets and peers. Although not unexpected for peers who are undergoing a typical extinction paradigm during the implicit “reward” conditions, the discovery that targets also experience an extinction condition has not previously been reported in the wider literature. It appears that the presentation of the nil-reward aspect of implicit “rewards” must constitute an extinction condition for peers in order to act as an (implicit) extinction condition for targets also (who have actually now become an implicit “reward”-as-punisher peer group).
That a “reward” can possess reinforcing powers under one set of temporal order presentations, and then show extinguishing properties under a separate temporal order suggests the presence of other factors than those simply inherent in the nature of the “reward” stimuli themselves and reiterates fundamental definitions of “reinforcers” and “punishers” as determined by their effects rather than as a priori evaluation. Explanations of the previously noted vicarious “reward” effects have suggested that some cognitive judgments may have been made by peers (e.g., Christy, 1975; Kazdin, 1977). A similar process may be operating during the implicit extinction phase of the present study. It may be that the reinforcing power of the “rewards” used decreased for target subjects when the implication of extinction was present under the ABB1 design. That is, just as a subject may deduce “If he got rewarded for doing that, then maybe I’ll be rewarded for doing it too” (Flanders, 1968, p. 320) when vicarious or implicit reinforcers are applied, so a subject may deduce “If he lost his reward in spite of still doing that, maybe I’ll lose my reward soon also,” thus responding to an “implicit extinction” paradigm. These suggestions lend support to the recently presented view that behavior-modification procedures need to consider more variables than just stimulus and response factors (e.g., Bandura, Note 1).

There is no one theoretical position which fully encompasses the data collected here, and further investigations are necessary before a reliable theoretical statement could be made. What has emerged quite clearly, however, is that the effects noted here have not been previously investigated and that these effects call for a reformulation of some previous ideas regarding the efficacy of “rewards” as reinforcers. Bandura’s (1971, p. 234) caution that it is not possible to determine if implicit “rewards” will have “rewarding, punishing or extinctive” effects has been removed to some extent by the present research. Certain specific conditions have been shown to possess reinforcing effects (i.e., when implicit “rewards” are administered before direct “rewards”) and others to possess extinguishing effects (i.e., when administered after direct “rewards”). Further research into the nature of the hypothesized cognitive evaluations is called for to determine if these do occur and, if so, what is the nature of such evaluations. Implications from the present data call for caution when applying “rewards” within groups.

REFERENCES

Anderson, D. W. Handwriting research: movement and quality. Elementary English, 1965, 42, 45-53.
Bandura, A. Vicarious and self-reinforcement processes. In R. Glaser (Ed.), The nature of reinforcement. New York: Academic Press, 1971.
Baum, J. The politics of back-to-basics. Change, 1976, 8, 32-36.
Broden, M., Bruce, C., Mitchell, M. A., Carter, V., & Hall, R. V. Effects of positive attention on attending behavior of two boys at adjacent desks. Journal of Applied Behavior Analysis, 1970, 3, 199-203.
Christy, P. R. Does use of tangible rewards with individual children affect peer observers? Journal of Applied Behavior Analysis, 1975, 8, 187-196.
Corbin, D. The classroom teacher on the spot. Teacher, 1976, 93, 22, 24-25.
Drabman, R. S., & Lahey, B. B. Feedback in classroom behavior modification: Effects on the target and her classmates. Journal of Applied Behavior Analysis, 1974, 7, 591-598.
Feldt, L. S. The reliability of measures of handwriting quality. The Journal of Educational Psychology, 1962, 53, 288-292.
Flanders, J. P. A review of research on imitative behaviour. Psychological Bulletin, 1968, 69, 316-337.
Flunckiger, F. A., Tripp, C. A., & Weinbeck, G. H. A review of experimental research in graphology 1933-1960. Perceptual and Motor Skills, 1961, 12, 67-90.
Glass, G. V., Willson, V. L., & Gottman, J. M. Design and analysis of time-series experiments. Boulder: Colorado Associated Univ. Press, 1975.
Gottman, J. M., & Glass, G. V. Analysis of time-series experiments. In T. R. Kratochwill (Ed.), Single subject research: Strategies for evaluating change. New York: Academic Press, 1978.
Hall, R. V. Managing behavior: Behavior modification, Part I. Measuring behavior. Merriam, Kan.: H & H Enterprises, 1970.

Helwig, J. J., Johns, J. C., Norman, J. E., & Cooper, J. O. The measurement of manuscript letter strokes. Journal of Applied Behavior Analysis, 1976, 9, 231-236.
Jones, R. R., Vaught, R. S., & Weinrott, M. Time-series analysis in operant research. Journal of Applied Behavior Analysis, 1977, 10, 151-166.
Kazdin, A. E. The effects of vicarious reinforcement on attentive behavior in the classroom. Journal of Applied Behavior Analysis, 1973, 6, 71-78.
Kazdin, A. E. The token economy: Review and evaluation. New York: Plenum Press, 1977.
Kazdin, A. E. Behavior modification in applied settings. Homewood, Ill.: Dorsey, 1980.
Patterson, G. R. Interventions for boys with conduct problems: Multiple settings, treatments and criteria. Journal of Consulting and Clinical Psychology, 1974, 42, 471-481.
Reppucci, N. D., & Reiss, S. Effects of operant treatment with disruptive and normal elementary school children. Proceedings of the 78th Annual Convention of the American Psychological Association, 1970, 5, 743-744.
Scott, J. W., & Bushell, D., Jr. The length of teacher contacts and students’ off-task behavior. Journal of Applied Behavior Analysis, 1974, 7, 39-44.
Sechrest, L. Implicit reinforcement of responses. Journal of Educational Psychology, 1963, 54, 197-201.
Sharpley, C. F., Irvine, J. W., & Hattie, J. A. Changes in performance of children’s handwriting as a result of varying contingency conditions. Alberta Journal of Educational Research, 1980, 26, 183-193.
Sharpley, C. F., & Rogers, H. J. Means, graphs, ts and Fs: Inadequate measures for operant research? Paper presented at the Fourth Australian Behaviour Modification Association Conference, Sydney, May 1981.
Ward, M. H., & Baker, B. L. Reinforcement therapy in the classroom. Journal of Applied Behavior Analysis, 1968, 1, 323-328.

REFERENCE NOTE

1. Bandura, A. Personal communication. March 1979.

RECEIVED: May 3, 1982; REVISED: February 2, 1983.