JOURNAL OF VERBAL LEARNING AND VERBAL BEHAVIOR 8, 725-731 (1969)
Chunking: Associative Chaining versus Coding¹

NEAL F. JOHNSON

The Ohio State University, Columbus, Ohio 43210
When Ss learn response sequences, they organize them into subunits, or chunks. If all the information within a chunk is stored in the same memory code, then if the code is lost from memory, all the information should be unavailable to the S. During original learning (OL), Ss learned responses consisting of three chunks (SBJ FQL ZNG). The interpolated-learning (IL) responses were identical to OL except that one letter in each of two chunks was changed within each paired-associate response, with no changes in the other chunk. The OL recall data indicated a marked loss of both changed and unchanged letters within changed chunks but no loss of unchanged chunks. The results were interpreted as indicating a loss of the OL codes from memory.
¹ The study reported here was supported by grant MH11236 from the National Institute of Mental Health, United States Public Health Service.

During the past few years it has become increasingly clear that when Ss learn a response sequence, they do so by organizing the sequence into subunits. It would appear that they integrate the subunits separately, and then learn the order of the subunits (Johnson, 1968). That conclusion is also supported by the early work from the laboratories of both Müller (see Woodworth, 1938, for a review) and Thorndike (1932). The data reported by these investigators show relatively close associative relationships among the members of subunits, but the individual members of one subunit do not seem to be related to the members of the next subunit.

Miller (1956) introduced the term "chunk" to refer to these subunits, and the most commonly used operational definitions for the term are based on the assumption that the items within a chunk will tend to occur in an all-or-none manner or adjacently in an S's recall attempt. The use of clustering in free recall (Cohen, 1966) and transitional-error probabilities (TEPs) in serial recall (Johnson, 1968) are illustrations of these definitions.

From a theoretical point of view, it seems reasonable to assume that a chunk is a set of response items which is stored within the same memory code. Such an assumption can explain why the items tend to be recalled in an all-or-none manner and adjacently. That is, given that the code is recovered, then all the information currently stored in the code is also available, and the S should produce all that information before attempting to recover the next code.

A set of experiments has been described elsewhere (Johnson, 1968) which suggests that it might be reasonable to view these codes as if they are opaque containers. If that is the case, then it should be possible for an S to be in a state of implicit code recall without having immediate knowledge of the information represented by the code. Having recovered the code, the S would know that he has recovered all the appropriate information which is available to him from his memory, but until he decodes the code itself he would not know what specific information the code contained.

The present study was designed to explore further the hypothesis that all the information within a chunk is stored in a common opaque code. If an S forms a code to represent a chunk such as SBJ in his memory, then the recovery and decoding of that code will entail the production of S, B, and J, in that order.
If the Ss are then asked to learn a new chunk, SBX, a new code will have to be formed because the old code generates SBJ. If these chunks are learned as responses to a common stimulus, then the stimulus-to-code relationship between the two tasks will form an A-B, A-C retroactive-inhibition (RI) paradigm. Therefore, after second-list learning the first-list code should be unavailable to the S, and he should not be able to recall either the changed or the unchanged information within that code. However, if within that same sequence there is another chunk which is not changed from the first task to the second, the S can use the same code for that chunk on the two tasks. If a new code is not used on the second task, then there should be no loss of that first-list code, and the S's ability to recall the information within that chunk should be unaffected by the second learning task.

In the present study, each S learned two paired-associates which had the digits 1 and 2 as stimuli and each response consisted of a sequence of nine consonants. When the Ss were shown the responses during the study interval, the letters were grouped into three groups of three consonants each, as, for example, SBJ FQL ZNG. Prior data had indicated that Ss would use the grouping as a basis for chunking (Johnson, 1965). After the original learning (OL) task, the Ss were asked to learn another list of two paired-associates. The interpolated-learning (IL) task was identical to the OL task with the exception that one letter in each of two chunks was changed; for example, SBJ FQL ZNG changed to SXJ FQL ZNK.

It was predicted that the unchanged chunk (UCC), FQL, would be recalled about as well as for a rest control group, because the first-list code would not be lost. However, the unchanged letters S, J, Z, and N within the changed chunk (UCL) should not be recalled, because the codes for those two chunks would be changed on the second list and the first-list code would be forgotten. Finally, the changed letters (CL) also should be forgotten, because they were stored in the changed code which was forgotten. The critical comparison is between the UCL and the UCC, because both letter types are common to the two lists, with the only difference being whether they appeared in a chunk which was changed. The recall of the CL and the UCL should be equal because they are represented by the same code.

METHOD

Materials
The letter sequences were constructed using the Underwood and Schulz (1960) letter-association norms, such that adjacent letters had zero or near-zero associative probabilities. In addition, no letter was used more than once; that is, 18 different letters were used. The two OL responses for half the Ss were SBQ JHF ZCL and MKX VGW PNY. The other half of the Ss had PWG VKZ JHY and RDS BQC XFN as responses during OL.

For each pair of sequences above, nine subgroups were used to ensure that all the letter positions within the sequences would be changed in IL. The rules used for changing letters in the IL list were: (a) whatever changes were made in one of the sequences, exactly the same change was made in the other; (b) within a sequence, the position of the changed letter within the first changed chunk was not the same as the position of the changed letter within the other chunk; and (c) the IL sequences had to maintain the same low interletter associations as occurred in the OL sequences. The ordinal positions of the changed letters within each of the nine conditions were: (1) 1 and 5, (2) 2 and 7, (3) 3 and 8, (4) 1 and 9, (5) 2 and 6, (6) 3 and 4, (7) 5 and 9, (8) 6 and 7, and (9) 4 and 8. Each of the nine letter positions was changed for two conditions, and a variety of combinations were included. Each chunk was a UCC for three conditions.

Because there are only 21 consonants, with 18 occurring on OL, there were only three new consonants for the four needed letter changes in each pair of IL sequences. Therefore, it was necessary to use as one of the "new" letters one of the changed consonants that had occurred during OL. The restriction placed on that letter repetition was that in the IL list it had to be in a different sequence, a different chunk position, and a different position within the chunk than in OL.
For example, if the repeated letter occurred in the second position within the first chunk of one of the sequences during OL, in the IL list it had to be in the other sequence, in the second or third chunk, and in either the first or third position of that chunk.

In addition to the experimental conditions, two control conditions were used. Neither control condition learned an IL list, and one was tested for OL recall immediately after OL, while the other engaged in symbol cancellation for 6 min. and 40 sec. before they took the test (IL took that amount of time). These two conditions did not differ in OL recall.
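The counterbalancing claims in this section can be checked mechanically. The Python sketch below is an illustration only: the function names and data layout are assumptions introduced here, not part of the original report. It labels each of the nine letter positions as CL, UCL, or UCC for a given pair of changed positions, confirms the stated properties of the nine conditions, and reproduces the 6 min. 40 sec. figure under the assumption that each of the two pairs receives a 4-sec. anticipation interval and a 4-sec. study interval per trial, followed by a 4-sec. intertrial interval, for 20 trials.

```python
# Sketch only: names and data structures are illustrative assumptions,
# not part of the original report.

# Ordinal positions (1-9) of the two changed letters in each IL condition,
# as listed under Materials.
CONDITIONS = [(1, 5), (2, 7), (3, 8), (1, 9), (2, 6), (3, 4), (5, 9), (6, 7), (4, 8)]


def chunk_of(position):
    """Chunk (1-3) that contains a letter position (1-9)."""
    return (position - 1) // 3 + 1


def place_in_chunk(position):
    """Position (1-3) of a letter inside its own chunk."""
    return (position - 1) % 3 + 1


def classify(changed):
    """Label every position as CL, UCL, or UCC for one condition."""
    changed_chunks = {chunk_of(p) for p in changed}
    labels = {}
    for position in range(1, 10):
        if position in changed:
            labels[position] = "CL"
        elif chunk_of(position) in changed_chunks:
            labels[position] = "UCL"
        else:
            labels[position] = "UCC"
    return labels


def changed_positions(ol, il):
    """Positions (1-9) at which two nine-letter sequences differ."""
    return tuple(i + 1 for i, (a, b) in enumerate(zip(ol, il)) if a != b)


def il_duration_seconds(pairs=2, anticipation=4, study=4, intertrial=4, trials=20):
    """Total list time under the assumed presentation schedule."""
    return trials * (pairs * (anticipation + study) + intertrial)


# Each letter position should be changed in exactly two conditions.
times_changed = {p: sum(p in c for c in CONDITIONS) for p in range(1, 10)}

# Each chunk should be the unchanged chunk (UCC) in exactly three conditions.
ucc_counts = {1: 0, 2: 0, 3: 0}
for cond in CONDITIONS:
    (unchanged,) = {1, 2, 3} - {chunk_of(p) for p in cond}
    ucc_counts[unchanged] += 1

# Rule (b): the two changed letters occupy different places within their chunks.
rule_b_holds = all(place_in_chunk(a) != place_in_chunk(b) for a, b in CONDITIONS)

# Worked example from the introduction: SBJ FQL ZNG -> SXJ FQL ZNK.
example = classify(changed_positions("SBJFQLZNG", "SXJFQLZNK"))

print(times_changed)
print(ucc_counts, rule_b_holds)
print(example)
print(il_duration_seconds())
```

Under these assumptions every position is changed in two conditions, every chunk serves as the UCC in three, and the schedule totals 400 sec., that is, 6 min. 40 sec.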
Procedure

Standard anticipation learning was used for both OL and IL. The materials were presented at a 4:4-sec. rate with a 4-sec. intertrial interval. During the 4-sec. interval when the stimulus digit was presented alone, the Ss were asked to recall as many of the letters as they could. If Ss were not attempting to guess after a few trials, the experimenter encouraged them to do so during the intertrial interval and continued the encouragement until they started anticipating. The experimenter kept a verbatim record of the Ss' responses during both learning tasks. Learning continued for 20 trials for both OL and IL.

After IL, the Ss were given a sheet of paper on which the digits 1 and 2 were printed. Following each digit, there were two rows of nine dashes and the Ss were asked to write both the OL and IL sequence that went with each stimulus, putting one letter on each dash. After writing the letters, the Ss were told to indicate which sequences were in the first list and which were in the second list. Recall, then, was an MMFR. The control conditions recalled only the OL sequences.

Subjects

The Ss were 132 introductory psychology students who participated as part of a course requirement. There were 12 Ss in each of the 11 conditions (9 experimental and 2 control), and the Ss were assigned to conditions in alternating order as they appeared for the experiment.

RESULTS

Degree of Learning

The degree of original learning was estimated from the recall performance of the two control conditions. The mean recall for each of the nine positions for each control condition is given in Table 1. Overall recall was 68%, and the two conditions were not different, F < 1.00. The decrease in recall across the nine positions was significant, F(8,176) = 4.19, p < .001, but the conditions by positions interaction was not significant, F < 1.00.

Degree of second-list learning was estimated from the experimental Ss' recall of the second list on the MMFR. These results also are given in Table 1. Mean recall was 83%, and, as for OL, there was a significant drop in recall across the nine positions, F(8,971) = 6.10, p < .001. While the mean recall of the second list by the Ss in the experimental conditions was greater than the recall of the first list by the Ss in the control condition, the difference was not significant, t(130) = 1.12, p > .05.

TABLE 1
PROPORTION OF ITEMS RECALLED PER SEQUENCE POSITION

                          Sequence position
Condition     1     2     3     4     5     6     7     8     9
Delay        .83   .79   .79   .75   .75   .58   .50   .54   .58
No delay     .92   .75   .75   .63   .58   .58   .58   .63   .67
IL           .93   .87   .85   .80   .80   .79   .82   .80   .79
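As a quick arithmetic check, the overall means quoted in the text can be recomputed from the Table 1 entries. The short Python sketch below assumes the row assignment Delay, No delay, IL for the 27 tabled values; the dictionary and variable names are introduced here for illustration only.

```python
# Table 1 proportions by sequence position (row assignment is an assumption
# based on the order in which the values appear in the table).
TABLE_1 = {
    "Delay":    [.83, .79, .79, .75, .75, .58, .50, .54, .58],
    "No delay": [.92, .75, .75, .63, .58, .58, .58, .63, .67],
    "IL":       [.93, .87, .85, .80, .80, .79, .82, .80, .79],
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# The two control conditions are pooled for the OL estimate.
control_mean = mean(TABLE_1["Delay"] + TABLE_1["No delay"])
il_mean = mean(TABLE_1["IL"])

print(round(control_mean, 2), round(il_mean, 2))
```

The rounded means agree with the 68% and 83% reported in the text.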
Transitional-Error Probabilities

Stop TEPs (Johnson, 1969) are reported for each within-sequence transition for both OL and IL in Table 2, along with the mean frequency the preceding word was correct. For any one transition, the stop TEP was computed by counting the frequency during the 20 learning trials that the Ss recalled the item before the transition correctly, but completely omitted all the items following the transition. That frequency was divided by the frequency that the item before the transition was correct.

TABLE 2
TRANSITIONAL ERROR PROBABILITIES FOR EACH TRANSITION AND THE MEAN FREQUENCY THE PRECEDING WORD WAS CORRECT

                                    Transition
Condition                1     2     3     4     5     6     7     8
OL: TEP                 .02   .03   .28   .11   .13   .47   .11   .06
    Preceding word       28    26    23    15    14    12     8     8
IL: TEP                 .02   .02   .20   .05   .08   .35   .07   .06
    Preceding word       33    31    29    22    21    19    14    14
As can be seen from Table 2, the TEPs for IL were at or below those for OL at every transition. The important issue, however, is the fact that during both OL and IL the TEP spikes occurred on the transitions where the blanks appeared (Transitions 3 and 6). Therefore, the S-defined chunks conformed to the E-defined groups.
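The stop-TEP definition given above translates directly into a small computation. The following Python sketch is a minimal illustration under stated assumptions: each anticipation attempt is represented as a nine-character string aligned with the correct sequence, an underscore marks an omitted letter, and "correct" means the right letter in the right position. The function name, the placeholder, and the toy responses are all hypothetical, not data from the study.

```python
OMITTED = "_"  # assumed placeholder for a letter the S did not produce


def stop_tep(responses, target, transition):
    """Stop TEP for the transition between positions `transition` and `transition + 1`.

    responses  -- anticipation attempts, each a string as long as `target`
    target     -- the correct sequence
    transition -- 1-based index of the item just before the transition
    Returns None when the preceding item was never correct.
    """
    before = transition - 1  # 0-based index of the item before the transition
    preceding_correct = 0
    stops = 0
    for response in responses:
        if response[before] != target[before]:
            continue  # the item before the transition was not correct
        preceding_correct += 1
        # A "stop" means every item after the transition was omitted.
        if all(letter == OMITTED for letter in response[before + 1:]):
            stops += 1
    if preceding_correct == 0:
        return None
    return stops / preceding_correct


TARGET = "SBJFQLZNG"
RESPONSES = [
    "SBJ" + OMITTED * 6,     # stops at the first chunk boundary
    "SBJFQL" + OMITTED * 3,  # stops at the second chunk boundary
    "SBJFQLZNG",             # complete recall
    "SB" + OMITTED * 7,      # stops inside the first chunk
]

teps = {t: stop_tep(RESPONSES, TARGET, t) for t in range(1, 9)}
print(teps)
```

In this toy record the largest stop TEPs fall on Transitions 3 and 6, the pattern that Table 2 shows for the actual data.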
Recall Data

The mean OL recall for the UCC, UCL, and CL for each of the nine counterbalancing conditions is given in Table 3. The total number of letters each S recalled of each letter type was divided by the number of letters of that type within each sequence. These scores for each letter type could vary from 0 to 2. For the control Ss, total letters recalled was divided by 9. In that recall was about the same for the two control conditions, they were combined, and in Table 3 recall is expressed as a proportion of the combined performance of the control conditions.

The expected difference in recall for the three letter types was obtained, F(2,198) = 96.65, p < .001, but the mean recall for the nine conditions did not show significant variation, F < 1.00. There was a significant conditions x letter-type interaction, F(16,198) = 2.84, p < .005, but the expected difference in recall for the three letter types was obtained for each of the nine conditions. An inspection of the values in Table 3 suggests the interaction may have stemmed from the unusually wide variation across conditions in recall level for the UCC. In general, recall for the UCC was very high for those conditions where the first chunk was the UCC.

TABLE 3
OL RECALL FOR THE EXPERIMENTAL CONDITIONS EXPRESSED AS A PROPORTION OF THE PERFORMANCE OF THE CONTROL CONDITION

                     Letter type
Condition      UCC     UCL     CL
3 and 4        .92     .59     .24
5 and 9       1.11     .50     .34
3 and 8        .79     .65     .47
1 and 9        .64     .44     .34
2 and 6        .82     .59     .28
6 and 7       1.30     .37     .16
4 and 8        .97     .23     .16
1 and 5       1.05     .42     .31
2 and 7        .76     .61     .39
All            .92     .49     .29

The difference in recall between the UCL and the CL was significant beyond the .001 level, t(107) = 7.36, p < .001, as was the difference between the UCL and the UCC, t(107) = 9.59, p < .001. The difference between the UCC and the combined control conditions was not significant, t(130) = .75, p > .40. These results conform quite closely to those anticipated from the model. That is, if a chunk was not changed during IL, the recall of the items in that chunk was about the same as for a control condition which had no interfering interpolated activity. However, if unchanged letters appeared in a chunk which had a change, these letters showed a marked decrement in recall.

One difference between the UCC and the UCL was that the UCL were generally closer to the CL than were the items in the UCC. It is possible that the proximity to the CL resulted in the lower recall of the UCL than the UCC. To check on that possibility, the Ss in the conditions which had CLs on either side of the two between-chunk boundaries were used. The Ss were scored for their recall of the CL, the UCL which was adjacent to it within the same chunk, and the UCC letter which was adjacent to the CL, except it was on the other side of the chunk boundary. In half the conditions, the UCC preceded the CL and the UCL followed it, and the reverse was true for the other half of the conditions. In all conditions, the proximity of the UCC and the UCL to the CL was the same. Mean recall of the CL for these conditions was 32%. The mean recall of the adjacent UCC was 86%, and recall of the adjacent UCL was 54%. The UCC-UCL difference was significant beyond the .01 level, t(47) = 2.60.
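The selection of conditions for this adjacency check can be reconstructed from the changed positions listed under Materials. The Python sketch below is an assumption-laden illustration, not the author's procedure: it treats a condition as qualifying when a CL sits directly beside a chunk boundary and the letter on the other side of that boundary belongs to the unchanged chunk. All names are hypothetical.

```python
# Changed positions (1-9) for the nine conditions, as listed under Materials.
CONDITIONS = [(1, 5), (2, 7), (3, 8), (1, 9), (2, 6), (3, 4), (5, 9), (6, 7), (4, 8)]
BOUNDARIES = [(3, 4), (6, 7)]  # positions flanking the two between-chunk boundaries
SS_PER_CONDITION = 12


def chunk_of(position):
    """Chunk (1-3) that contains a letter position (1-9)."""
    return (position - 1) // 3 + 1


def adjacency_triples(changed):
    """Find CLs next to a chunk boundary with a UCC letter on the far side.

    Returns dicts giving the positions of the CL, the UCC letter across the
    boundary, and the UCL beside the CL inside its own chunk. Because each
    changed chunk contains only one CL, that same-chunk neighbour is a UCL.
    """
    changed_chunks = {chunk_of(p) for p in changed}
    triples = []
    for left, right in BOUNDARIES:
        # (candidate CL, neighbour across the boundary, neighbour in the same chunk)
        for cl, across, same_chunk in ((left, right, left - 1), (right, left, right + 1)):
            if cl in changed and chunk_of(across) not in changed_chunks:
                triples.append({"cl": cl, "ucc": across, "ucl": same_chunk})
    return triples


qualifying = [c for c in CONDITIONS if adjacency_triples(c)]
n_subjects = SS_PER_CONDITION * len(qualifying)
df = n_subjects - 1

# Number of qualifying cases in which the UCC letter precedes the CL.
ucc_first = sum(t["ucc"] < t["cl"] for c in qualifying for t in adjacency_triples(c))

print(qualifying, n_subjects, df, ucc_first)
```

Under these assumptions four conditions qualify, the UCC letter precedes the CL in two of them, and 12 Ss per condition gives 48 Ss, which is consistent with the 47 degrees of freedom reported for the t test.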
The adjacent-letter data would suggest that the difference in overall recall for the UCC and the UCL cannot be explained in terms of the differential proximity to the CL, because the difference is obtained even when proximity is controlled.

One final issue of interest is the possibility that Ss might integrate chunks by forming directional interitem associations between the letters. If that were the case, then if the middle letter within a chunk was the CL, the recall of the first letter within the chunk should be unaffected, and the last letter should have a recall level no higher than for the CL. That might explain why the recall for the UCL was significantly higher than for the CL.

The results of the analysis are reported in Table 4. Recall is expressed as a percentage of perfect performance. Because of the nature of the design, separate analyses had to be performed for each chunk, and, in each case, the issue is the extent to which the difference between recall of the first and last letter is greater for the experimental conditions than for the control. For all three chunks the recall of the UCL was below that of the control condition, with the smallest effect for the middle chunk, F(1,46) = 6.09, p < .05. In addition, recall of the last letter was below that of the first for the first and second chunks for both the experimental and control conditions. The difference was significant beyond the .01 level for both chunks, with the smallest difference for the middle chunk, F(1,46) = 7.41, p < .01. The advantage for the final letters in the last chunk fell just short of significance, F(1,46) = 3.99, with 4.05 needed for the .05 level.
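The comparison at issue, whether the first-minus-last difference is larger for the experimental conditions than for the control, can be laid out descriptively. The Python sketch below is illustrative only: the dictionary layout and function names are assumptions, the values are those printed in Table 4 under the column assignment assumed here (Control and Experimental rows, First and Last letters within each chunk), and the contrast is a simple difference of differences on the table's own scale, not the F test reported in the text.

```python
# Table 4 values as (first letter, last letter); column assignment is assumed.
TABLE_4 = {
    "first chunk":  {"control": (1.75, 1.46), "experimental": (0.83, 0.50)},
    "second chunk": {"control": (1.38, 1.13), "experimental": (0.71, 0.63)},
    "third chunk":  {"control": (1.04, 1.25), "experimental": (0.42, 0.46)},
}


def first_minus_last(pair):
    """Recall advantage of the first letter over the last letter."""
    first, last = pair
    return first - last


def chaining_contrast(row):
    """Experimental first-minus-last difference minus the control difference.

    Associative chaining predicts a positive value: losing the middle letter
    should hurt the last letter more than the first.
    """
    return first_minus_last(row["experimental"]) - first_minus_last(row["control"])


contrasts = {chunk: round(chaining_contrast(row), 2) for chunk, row in TABLE_4.items()}
print(contrasts)
```

A positive contrast is in the direction an associative-chaining account would predict; with the values as read here, the contrast for the middle chunk is negative.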
None of the interactions were significant, with the largest effect for the middle chunk, F(1,46) = 1.84, p > .05, and that interaction was in the wrong direction. The insignificant interactions indicate that there was no significant differential effect of the CL on the recall of the first and last letters within these chunks. It would appear difficult to reconcile an associative-chaining explanation of chunk integration with these results. Furthermore, learning the order of the three chunks seems not to be a matter of associative chaining either, because recall of the third chunk when it is a UCC is even better with respect to the control performance than is recall of the first chunk when it is a UCC. If the chunks are chained, loss of the first two should result in a loss of the third.

TABLE 4
PROPORTION OF FIRST AND LAST LETTERS RECALLED FOR EACH CHUNK

                 First-chunk letters    Second-chunk letters    Third-chunk letters
Condition         First     Last          First     Last          First     Last
Control           1.75      1.46          1.38      1.13          1.04      1.25
Experimental       .83       .50           .71       .63           .42       .46

DISCUSSION

There are several issues pointed out by these data regarding the nature of chunks. The first point is the idea that all the items within a chunk share a common storage in the form of an opaque container. That hypothesis would imply that anything that would tend to depress the probability of recalling the container or code would influence all the information within the code uniformly. The marked difference in recall between the UCC and the UCL offers support for that hypothesis. Both the UCC letters and the UCL were common to the two lists, and when adjacency to the CL was controlled, the UCC and UCL recall was significantly different.

The only inconsistency between that interpretation and the data is the fact that the loss within the changed chunks was not uniform. The CLs were lost to a greater degree than the UCLs. One possible interpretation of that effect is that during IL Ss may have, on some trials, implicitly or explicitly recalled the OL code for that chunk position, as well as the items within the code. When the memory drum turned for the study interval, the S would be in a state of OL chunk recall, and when he saw the correct chunk, there would be differential confirmation of the CL and the UCLs. That differential confirmation may explain why there is some selective loss of the CLs within the changed code.

To examine that possibility, Barron (1968) used a procedure whereby Ss were signaled between the anticipation interval and the study interval regarding the correctness of their recall. If, during IL, Ss recalled an OL code but they were signaled that it was wrong before they had an opportunity to see the correct response, then they would not be in a state of OL chunk recall when the correct items appeared. Given that, there would be no opportunity for differential CL-UCL confirmation.

Barron used a three-step learning task in which digits were the stimuli, and S anticipated with the name of a color patch. Then the color patch appeared, and he anticipated a trigram, which then appeared. On IL, one letter was changed in each trigram (either the first or the last), and the trigrams were matched with their correlated OL digits. For half the Ss the OL color patch was used for each triplet, and for the other half the colors were different during IL.

When a digit appeared, if the Ss anticipated the OL responses, that recall would be reinforced if the OL color patch then appeared. The Ss might then anticipate the OL trigram. Therefore, when the drum turned, there would be an opportunity for differential confirmation of the CL and the UCL within the OL code. If the color patch was changed on IL, the OL recall would be rejected when the IL color patch appeared. Therefore, the Ss would not be recalling implicitly the OL trigram when the IL trigram appeared, so there would be no differential CL-UCL confirmation.

The results of the study supported the predictions. After IL, the Ss were asked to recall the OL responses, and there was a CL-UCL difference for the group that had the same color patches in OL and IL, but there was no difference for the group that had the color patch changed. Another way this same issue could be examined more simply would be to separate anticipation and study using a study-test procedure. Under study-test, the Ss would not be anticipating during study, so there should be no differential confirmation of the CLs and UCLs within the OL codes.

One qualification which should be imposed on the interpretation of the results of the present study is the relatively low level of original learning. It seems reasonable to assume that if the Ss had highly overlearned the OL sequences, they would have immediately recognized the relationship between the OL and IL lists and simply learned the IL list as corrections on the OL codes. That is, if an OL chunk is SBJ and the IL chunk in that position is SBQ, the IL chunk may be learned as "OL code, but change J to Q." Under these circumstances, one would expect a very high level of performance on that chunk during IL, as well as near-perfect recall of the OL chunk. In the present study, OL was only about 70%, and the relationship between the two lists may not have been easily recognized by the Ss. Given that, they would adopt an entirely new code rather than attempt to place a correction on the established code. The important point is that with the level of OL used in the present study, Ss do not seem to be able to correct old codes.

Another question raised by the data is why the UCC recall was not elevated by the IL task. If the OL code was used during IL, then there should have been an increased learning of the chunk. The fact that recall of the UCC was only slightly over 60% suggests there
would be room for improvement had it occurred. One possibility may be that if Ss learn a response which is identical to a previously learned response, there will be no summation of the learning from the two tasks unless the Ss are aware of the relationship between them. In addition, it would be necessary to assume that there would be no interference under these circumstances, which is not unreasonable.

Another issue which needs to be considered is the question of the relationship of the items within a chunk to each other. The early work of Müller and Thorndike suggested that the items within one chunk were associatively independent of items within other chunks. The present data might be interpreted as indicating that chunks are not integrated by the formation of a sequence of interitem associations. If Ss did use such associative chaining, and if the middle item of a chunk were subjected to RI and lost from memory, then the last item also should be lost, because the stimulus which elicits it, that is, the middle item, would not be present. From the same argument, there would be no reason to believe that forgetting the middle item would have any influence on the recall of the first item. At the very minimum, one would expect somewhat greater loss of the final item. The results indicated that if the middle item was changed, there was no differential loss of the last letter over the first.

An alternative view of chunk integration is that Ss incorporate the information into a common code and that each item is directly related to the code but not to any of the other items within the sequence. Under this view, the only relationship that one item would have to another is that they would both be represented in memory by the same code. Any operation that would result in a loss of the code would affect all the items equally. Furthermore, if one item was lost from memory, but the code was retained, then all the other items should be retained and it should make no difference which specific item was lost. While there is no definitive evidence supporting this particular view, it does appear to be somewhat more consistent with the present data than an associative-chaining position.

REFERENCES
BARRON, R. W. Differential loss of information in chunks as a function of chunk integration. Unpublished Master's thesis, The Ohio State University, 1969.
COHEN, B. H. Some-or-none characteristics of coding behavior. Journal of Verbal Learning and Verbal Behavior, 1966, 5, 182-187.
JOHNSON, N. F. Linguistic models and functional units of language behavior. In S. Rosenberg (Ed.), Directions in psycholinguistics. New York: Macmillan, 1965. Pp. 29-65.
JOHNSON, N. F. Sequential verbal behavior. In T. Dixon & D. Horton (Eds.), Verbal behavior and general behavior theory. Englewood Cliffs, N.J.: Prentice-Hall, 1968. Pp. 421-450.
JOHNSON, N. F. The effect of a difficult word on the transitional error probabilities in sentences. Journal of Verbal Learning and Verbal Behavior, 1969, 8, 518-523.
MILLER, G. A. The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 1956, 63, 81-97.
THORNDIKE, E. L. The fundamentals of learning. New York: Teachers College, 1932.
UNDERWOOD, B. J., & SCHULZ, R. Meaningfulness and verbal learning. Philadelphia: Lippincott, 1960.
WOODWORTH, R. S. Experimental psychology. New York: Holt, 1938.

(Received May 28, 1969)