JOURNAL OF VERBAL LEARNING AND VERBAL BEHAVIOR
11, 497-511 (1972)
Context Effects in Recognition Memory 1
DONALD M. THOMSON 2
University of Toronto, Toronto, Ontario, Canada

The role of context in recognition memory was examined in seven experiments. In the first four experiments a context word was added to or deleted from to-be-remembered units. Recognition was impaired both when the context word added or deleted was associatively related and when it was associatively unrelated. The effects of changing context disappeared when context was only added in Experiment 5, but were still present when context was only deleted in Experiment 6. In Experiment 7, recognition performance was studied over several retention intervals, with critical words tested in changed or unchanged context. The deleterious effects of changing context increased with intralist retention interval. Context effects observed in these experiments are interpreted as evidence for retrieval processes in recognition memory.
Previous Research

There are a number of published studies in which verbal context has been manipulated and recognition performance observed. Some of these studies are recent (Cofer, Segal, Stein, & Walker, 1969; Hall & Crown, 1970; Light & Carter-Sobell, 1970; Tulving & Thomson, 1971; Winograd, Karchmer, & Russell, 1971), and some are old (Feingold, 1915; Meyer, 1914). The results of all these experiments indicate that recognition of to-be-remembered (TBR) items is impaired when their context is changed from input to test phase. However, two of the context studies (Cofer et al., 1969; Hall & Crown, 1970) are open to other interpretations, since, in both studies, the manipulation of context was confounded with the total amount of material presented in the input phase. Further, in Tulving's (1968) experiment, context effects in recognition memory were inferred from the data somewhat indirectly. Only the five remaining studies (Feingold, 1915; Light & Carter-Sobell, 1970; Meyer, 1914; Tulving & Thomson, 1971; Winograd, Karchmer, & Russell, 1971) were explicitly designed to study, and yielded data directly relevant to, the effects of changes in the context of TBR items upon their recognition.

1 This paper is based on a thesis submitted to the Department of Psychology, University of Toronto, in partial fulfillment for the PhD degree. The author expresses his appreciation to Endel Tulving, Bennet B. Murdock, and Robert S. Lockhart for their advice and encouragement. This research was partially supported by Grant No. APA-39 from the National Research Council of Canada to Endel Tulving.
2 Requests for reprints should be sent to the author at the Department of Psychology, Monash University, Clayton, Victoria 3168, Australia.
This paper reports the results from seven experiments in which the verbal context of to-be-remembered (TBR) items, common English words, was experimentally manipulated and the effect of these manipulations on recognizability of TBR items observed. The theoretical importance of context effects in recognition memory lies in the fact that they cast doubt on those theories of recognition memory that assume the functional identity of nominally identical verbal items, or theories that assume that there is no problem of "access" to the stored trace of an old item, when an old item is shown to subject (S) in the test phase. Such theories would have to be modified, or perhaps even rejected, if context plays an important role in recognition memory.
Copyright © 1972 by Academic Press, Inc. All rights of reproduction in any form reserved.
In Meyer's (1914) experiment, recognition of previously presented nonsense syllables was tested under four conditions: (a) an "old" syllable preceded by the syllable which had immediately preceded it in the input phase, (b) an "old" syllable presented alone, (c) a "new" syllable preceded by an "old" one, and (d) a "new" syllable presented alone. Meyer found that a word was better recognized if it followed the "old" syllable which was its immediate antecedent in the input list. Feingold's (1915) Ss were successively presented two lists of items and asked to classify the second list into one of four categories: (a) all second list items different from first list items, (b) all but one different, (c) only one different, (d) none different. Fewer second lists containing only one first list item were correctly classified than second lists comprising all the first list items, a finding that indicated recognition of an item was impaired when it was tested in a new context. Light and Carter-Sobell (1970) manipulated the context of homographs so that different contexts suggested different meanings of the homographs. Testing the homographs in the context they had in the input phase resulted in superior recognition. In the Tulving and Thomson (1971) study, recognition of TBR words was impaired by the addition or deletion of associatively related context words. Winograd, Karchmer, and Russell (1971) found that testing TBR words without their input cues resulted in poorer recognition when Ss had been asked at input to form bizarre images between TBR words and cues, but not when Ss had been asked to relate TBR and cue words.

The seven experiments reported below were designed to further explore the conditions under which context effects are found in recognition memory and the nature of these effects.
The rationale underlying the design in all seven experiments was the same: if input conditions are held constant, differences in recognition performance in different test conditions can be attributed to factors operating at the time of the recognition test rather than to differences in amount and organization of stored information. The first four experiments were concerned with the effect on recognition performance of adding and deleting context. Experiment 5 examined the effect on recognition performance when context was only added. Conditions were included in Experiment 6 in which the context word accompanying the TBR word in the input phase was replaced by another word in the test phase. Experiment 7 investigated the effect of context on recognition as a function of intralist retention interval.

EXPERIMENTS 1-4: SINGLE- AND DOUBLE-WORD INPUT LISTS

Since Experiments 1, 2, 3, and 4 were all variations of the same basic experimental design, the design and procedure of these four experiments are described together. The design and procedure of Experiment 1 are first outlined and then the ways in which Experiments 2, 3, and 4 differed from Experiment 1 are discussed.

Method

Materials and lists. The critical experimental material was a basic pool of 54 word triplets selected from two sources of free association norms (Bilodeau & Howell, 1965; Riegel, 1965). Each triplet consisted of a response word, R, and two stimulus words, Sw and Ss. According to the norms, R was weakly associated to Sw (1%), and strongly associated to Ss (53% on the average). The selection of triplets was constrained by the condition that Sw and Ss not be associatively related. Some examples of Sw, Ss, R sets were cheese, grass, green; deep, bed, sleep. In addition to the words contained in the word pool, other words--single words, weakly associated pairs, and strongly associated pairs--were selected to serve as filler items in the input lists and as "new" items in the test lists.

There were 110 words in the input list--34 single words and 38 pairs. The first 25 and the last 25 words in the list--5 single words and 10 pairs in each block--served as filler items and did not appear in the test list. The remaining 60 words in the middle of the list--24 single words and 18 pairs--constituted the critical experimental material, and were subsequently tested for recognition. The two categories of words, single words and pairs, defined the two input conditions.
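The list composition described above can be checked with a little arithmetic. The following is a minimal sketch rather than part of the original method: the counts come from the text, while the helper name `word_count` and the variable names are illustrative assumptions.

```python
# Illustrative check of the Experiment 1 input-list composition.
# Counts are taken from the text; all names here are hypothetical.

def word_count(singles: int, pairs: int) -> int:
    """Number of words contributed by single words and two-word pairs."""
    return singles + 2 * pairs

# One filler block opens the list and one closes it; each holds 5 singles and 10 pairs.
filler_block = word_count(singles=5, pairs=10)
critical = word_count(singles=24, pairs=18)
input_list = word_count(singles=34, pairs=38)

print(filler_block, critical, input_list)
# The whole list should equal two filler blocks plus the critical items.
print(input_list == 2 * filler_block + critical)
```

Counting a pair as two words is what reconciles the 34 single words and 38 pairs with the stated total of 110 words.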
A test list was made up of the 60 "old" words, plus 60 "new" words. The 120 words of the test list consisted of 48 single words and 36 pairs. Half of the 48 single words were old, and half were new. Similarly, the composition of the 36 pairs was equated with respect to old and new words.

Design. Two main input conditions--absence and presence of context words--were combined orthogonally with two main test conditions--absence and presence of context words--to yield four main experimental conditions. Four categories of words represented the main experimental conditions: (a) a TBR word presented and tested singly--condition single-single; (b) a TBR word presented singly, but tested in the company of a new associatively related word--condition single-pair; (c) two TBR words presented as members of an associatively related pair and then tested singly--condition pair-single; (d) two TBR words presented and tested as members of an associatively related pair--condition pair-pair. Each word in a pair served the dual roles of target and context for the other member of the pair.

The four main experimental conditions are illustrated in Table 1 with a small sample of input and test items. Excluding the filler items of the input list, the words in Table 1 represent one-sixth of an input list, and one-sixth of a corresponding test list. Careful study of the examples in Table 1 should facilitate understanding of the experimental design. Table 1 shows four single words in the input list: dirty, cut, bug, minor. In the test list dirty and cut were presented as single words (condition single-single); bug and minor were paired with new, associatively related words--insect bug; minor music (condition single-pair). Of the three pairs of input words, the pair sky blue was split up and tested as single words (condition pair-single), and the other two input pairs--low high; fruit flower--remained intact in the test list (condition pair-pair).

Examples of new single test words in Table 1 have been arbitrarily assigned to conditions single-single and pair-single. In actuality, all the new single test words provided the basis of estimating the false positive rate for both conditions single-single and pair-single. The new words used as a measure of false positives in condition single-pair were pairs of new, associatively related words. In condition pair-pair, responses to new words paired with old associatively related words provided the false positive estimate.

In both input and test lists there were an equal number of pairs containing weakly and strongly associated words. The reason for balancing the number of "weak" and "strong" pairs was to attempt to distribute evenly any systematic effect associative strength might have across all relevant experimental conditions.

Four different input lists were used in the experiment. The initial and terminal filler items were identical for all four lists. Critical words in each input list were drawn from the basic pool of 54 triplets, but specific words contributing data to different treatment conditions were always different in the four lists. The 24 single words and 18 pairs were arranged in a randomly determined order within each list. The presentation order of each test list was also randomly determined. Thus the four input-test list combinations constituted four replications of the basic experiment.
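The orthogonal crossing of input and test contexts described in the Design section can be enumerated in a few lines. This is a minimal sketch under stated assumptions: the condition names follow the text, but the functions `condition_label` and `context_changed` are hypothetical helpers introduced only for illustration.

```python
from itertools import product

# Input and test contexts are crossed orthogonally, as in the Design section.
CONTEXTS = ("single", "pair")

def condition_label(input_context: str, test_context: str) -> str:
    """Return the condition name used in the text, e.g. 'single-pair'."""
    return f"{input_context}-{test_context}"

def context_changed(input_context: str, test_context: str) -> bool:
    """True when a context word was added or deleted between input and test."""
    return input_context != test_context

conditions = [condition_label(i, t) for i, t in product(CONTEXTS, CONTEXTS)]
print(conditions)

# The two changed-context conditions are single-pair (addition) and pair-single (deletion).
changed = [c for c in conditions if context_changed(*c.split("-"))]
print(changed)
```

Holding the input condition constant and comparing across test conditions is what allows differences to be attributed to events at test rather than at storage.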
TABLE 1
DESIGN OF EXPERIMENT 1 a

                                                          Test condition
                                            Single                           Pair
Input condition  Sample words               Old items       New items        Old items           New items
Single (24)      dirty; cut; bug; minor     (12) dirty      (12) game        (12) insect bug     (12) lamb sheep
                                                 cut             loud             minor music         whisky water
Pair (36)        sky blue; low high;        (12) sky blue   (12) flag        (24) low high       (12) (insect bug) b
                 fruit flower                                    bone             fruit flower        (minor music) b

a An illustrative sample of words presented in input and test lists and the experimental conditions. Old words on test list are italicized. The number of words presented in each study and test list condition is shown in parentheses.
b These pairs are shown in parentheses because the responses to the old item provided the hit data for the single-pair condition and the pairs have been included there. In this condition the responses to the new item provide the false positive data.
Procedure. The Ss assigned to each replication were usually tested in groups of ten. The Ss were told: A series of words will be shown on the TV screen at the front of the room. Either a single word or a pair of words will appear at a time. Close attention should be paid to each word since your memory for these words will be tested in a subsequent recognition test. Whenever two words appear together on the screen you should try to relate the two words as this might be helpful in remembering the two words.

The test booklets distributed to Ss were not opened until the input list had been presented. The input list, consisting of 50 filler words and 60 TBR words, was presented on a TV screen by means of a closed circuit system. The rate of presentation was 1 word/sec: a single word was shown for 1 sec, a pair for 2 sec. After the input list had been presented, Ss opened their test booklets and read the recognition-test instructions. These instructions stated that test words would be presented on the TV screen, sometimes single words, sometimes pairs. For each word that appeared on the screen Ss had to circle either a "yes" or "no" on a numbered line in the booklet, depending on whether they thought the word had occurred in the input list. It was pointed out that half the test items were old and half were new. In addition Ss were to rate the confidence of their judgments on a 3-point scale depending on whether they were "very sure," "quite sure," or "guessing." A judgment of "yes" or "no" as well as a confidence rating had to be made for each individual word; no omissions were permitted. The test-list words were then presented at the rate of 5 sec/word.

Experiment 2. Experiment 2 differed from Experiment 1 in only one respect: single words were presented for 2 sec instead of 1 sec in the input phase. The reason for this manipulation was to determine if the pattern of results found in Experiment 1 would be altered by increasing study time for single words.

Experiment 3. The one difference between Experiment 1 and Experiment 3 was that, in Experiment 3, Ss were explicitly instructed to disregard the context of input items. In the input instructions of Experiment 3, Ss were told that whenever two words appeared together on the TV screen they should try to study and remember each word independently.

Experiment 4. In contrast to Experiment 1, the members of a word pair were not associatively related but were selected randomly. A basic pool of 120 words was drawn from the same sets of free association norms (Bilodeau & Howell, 1965; Riegel, 1965) used in Experiment 1 with the restriction that the words selected were not, according to these norms, associated. Half of these words became members of the input list and hence old items in the test list, and half were new items in the test list. In addition to the basic pool, 50 other unrelated words were selected from the norms to serve as filler items in the input list.

Subjects. One hundred and sixty female students attending psychology classes at the University of Toronto participated as Ss in the four experiments, 40 Ss in each.
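The two response measures reported in the results, hits and false positives, follow directly from the yes/no judgments described in the Procedure. The sketch below shows one way such scoring could be done; the `Response` record, the `rate` helper, and the sample responses are all hypothetical and invented for illustration, and confidence ratings are left out.

```python
from dataclasses import dataclass

@dataclass
class Response:
    word: str
    is_old: bool     # True if the word appeared in the input list
    said_yes: bool   # True if S circled "yes" (judged the word old)

def rate(responses, old: bool) -> float:
    """Proportion of "yes" responses among old words (hits) or new words (false positives)."""
    relevant = [r for r in responses if r.is_old == old]
    return sum(r.said_yes for r in relevant) / len(relevant)

# Invented responses for one S; these are not data from the experiment.
sample = [
    Response("dirty", True, True),
    Response("cut", True, False),
    Response("bug", True, True),
    Response("game", False, False),
    Response("loud", False, True),
    Response("flag", False, False),
]

hit_rate = rate(sample, old=True)              # 2 of 3 old words called old
false_positive_rate = rate(sample, old=False)  # 1 of 3 new words called old
print(round(hit_rate, 2), round(false_positive_rate, 2))
```

In the experiments these proportions were computed separately for each input-test condition, so the same helper would be applied to the subset of responses belonging to one condition at a time.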
Results and Discussion

Hits. Mean proportions of hits (correct identification of old items) obtained for the four input-test conditions in Experiments 1, 2, 3, and 4 are presented in Table 2. Analyses of variance were performed on hit scores, first within each input condition for each experiment with test-context and list as variables, and then across input conditions or experiments with input condition or interexperimental manipulation and list as variables.

TABLE 2
MEAN PROPORTION OF HITS IN EXPERIMENTS 1, 2, 3, AND 4 FOR THE FOUR INPUT-TEST CONTEXT CONDITIONS a

                                   Test condition
Experiment   Single-single   Single-pair   Pair-single   Pair-pair
1            .62 (0.18)      .48 (0.18)    .76 (0.12)    .85 (0.12)
2            .75 (0.18)      .61 (0.17)    .79 (0.15)    .88 (0.10)
3            .70 (0.16)      .56 (0.18)    .76 (0.16)    .85 (0.13)
4            .68 (0.17)      .54 (0.19)    .71 (0.14)    .80 (0.25)

a Standard deviations are shown in parentheses. Hit rates are based on 480 observations except for the pair-pair condition where there are 960.

The most striking feature of the hit scores presented in Table 2 is the impaired recognition of TBR words whose context was changed from input to test phase. The extent of this impairment is remarkably consistent across the four experiments: addition of a context word decreased hit scores by .14, deletion of a context word by .09. In all four experiments, the effect of adding or deleting context was significant at the .01 level, F(1, 36) ≥ 12.2.

Two other aspects of the data presented in Table 2 are of interest: the effect of interexperimental manipulations, and recognition of words tested singly as a function of being presented singly or in pairs in the input phase. The effect of interexperimental manipulations may be summarized as follows:

1. Increasing the time to study single input words in Experiment 2 increased the recognition of these words--from .62 and .48 in Experiment 1 to .75 and .61 in Experiment 2, F(1, 36) = 13.09, p < .01--without diminishing the deleterious effect on recognition of adding a new context word. Recognition of words presented in pairs in the input list also tended to improve with increased study time for single input words, an outcome which may have been due to Ss rehearsing some of the pair-words in the extra time given for single words.

2. Instructing Ss to study each word independently in Experiment 3 had no effect on the overall recognition of words presented in pairs in the input phase nor on the magnitude of recognition impairment when a context word was deleted--hit rates were .85 and .76 in Experiment 1 and .85 and .76 in Experiment 3. The hit rates for words presented singly in the input phase did, however, improve when Ss had been instructed to study each word independently--from .62 and .48 in Experiment 1 to .70 and .56 in Experiment 3, F(1, 36) = 6.25, p < .05. Perhaps the instructions resulted in Ss paying more attention to single words, but if this were the case, it was not at the expense of attention paid to words presented in pairs.

3. Words presented in randomly chosen input pairs were less well recognized than words presented in associatively related pairs--hit rates were .85 and .76 for the associatively related pairs of Experiment 1, and .80 and .71 for the randomly chosen pairs of Experiment 4, F(1, 36) = 4.23, p < .05. This finding suggests that time and effort spent in relating randomly chosen words was at the expense of other phases of memory.

The other interesting aspect of the data contained in Table 2 has to do with recognition of words tested singly. In Experiment 1, recognition of items tested singly was higher if these items had been presented as members of pairs than if they had been presented as single words, F(1, 36) = 16.27, p < .01. A similar outcome, though not statistically significant, occurred in Experiment 2 even when equal study time was given for single words and pairs of words.

False positives. Table 3 contains the mean proportions of false positives (incorrect identification of new items as old) for the four test conditions in Experiments 1, 2, 3, and 4. The false positive data were provided by old responses to new words in the same type of test-context in which old words were being labeled old. The main feature of the data presented in Table 3 is, with two exceptions, the absence of any systematic variation in false positive rates as a function of test conditions in the four experiments. The two exceptions are the greater number of false recognitions for new words paired with old words in Experiments 2 and 3, F(2, 36) ≥ 8.13, p < .01.
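The constancy of the context effect on hits reported for Table 2 (.14 for addition, .09 for deletion) can be recomputed from the tabled means. The following is a minimal sketch: the hit rates are those given in Table 2, while the dictionary `HITS` and the function `context_effects` are illustrative names, not part of the original analysis.

```python
# Hit rates from Table 2, per experiment:
# (single-single, single-pair, pair-single, pair-pair).
HITS = {
    1: (0.62, 0.48, 0.76, 0.85),
    2: (0.75, 0.61, 0.79, 0.88),
    3: (0.70, 0.56, 0.76, 0.85),
    4: (0.68, 0.54, 0.71, 0.80),
}

def context_effects(single_single, single_pair, pair_single, pair_pair):
    """Return (cost of adding context, cost of deleting context) in hit rate."""
    addition = round(single_single - single_pair, 2)
    deletion = round(pair_pair - pair_single, 2)
    return addition, deletion

effects = {exp: context_effects(*rates) for exp, rates in HITS.items()}
for exp, (addition, deletion) in effects.items():
    print(f"Experiment {exp}: addition {addition:.2f}, deletion {deletion:.2f}")
```

Each effect is computed within an input condition (single-single versus single-pair, pair-pair versus pair-single), which mirrors the rationale that input conditions are held constant while test context varies.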
TABLE 3
MEAN PROPORTION OF FALSE POSITIVES IN EXPERIMENTS 1, 2, 3, AND 4 FOR THE THREE TEST CONTEXT CONDITIONS a

                             Test condition
Experiment   Single        Paired with new word   Paired with old word
1            .23 (0.15)    .18 (0.13)             .24 (0.15)
2            .19 (0.14)    .21 (0.11)             .27 (0.15)
3            .26 (0.14)    .27 (0.18)             .34 (0.10)
4            .23 (0.12)    .24 (0.13)             .25 (0.16)

a Standard deviations are shown in parentheses. False positive rates are based on 980 observations except in the pair-pair condition where there are 480.

Two explanations of the higher false positive rates for new words paired with old associatively related words in Experiments 2 and 3 than in Experiment 1 may be entertained, given that these old words were better recognized in Experiments 2 and 3 than in Experiment 1. The first explanation is that Ss were more likely to respond yes to a pair member because the other member had been recognized. The second explanation is that Ss' judgments of old are made on the basis of features or attributes of words being recognized. If one assumes that associatively related words have many features in common, as the recognition of old words increases, so too will the false recognition of new, associatively related words. However, neither explanation seems to be consistent with the finding that the false positive rate in Experiment 1 was the same for new single words and new words paired with old associatively related words. Similarly, the false positive rates in Experiment 4 appear to be inconsistent with the first, "response bias," explanation. Nevertheless, a subsequent analysis of false positive rates, to be presented shortly, suggests that rejection of the second, "common feature," explanation would be premature.

Subsidiary findings. Some other findings, derived from subsidiary analyses of the data obtained in Experiments 1, 2, 3, and 4, suggest that conditions necessary for recognition to be impaired by addition and deletion of context are quite different.

The input lists of Experiments 1, 2, and 3 consisted of single words and pairs of associatively related words. Half of the input pairs comprised words with a strong associative relationship, and half with a weak associative relationship. Further, a word in an input pair occupied the left or right side of that input pair. Thus, recognition scores for words from input pairs can be examined as a function of test context, strength of associative relationship of input pairs, and the side occupied in an input pair. These recognition scores, hit rates, averaged across Experiments 1, 2, and 3 are presented in Table 4. The false positive rates for test conditions corresponding to the hit rates are also contained in Table 4.

The hit data in Table 4 show one striking exception to the uniformity of hit rates across test-context conditions: deletion of context impaired recognition considerably more when the word had occupied the right side of a weak associatively related input pair. This pattern of results was observed in all three experiments.

TABLE 4
MEAN PROPORTION OF HITS FOR WORDS PRESENTED IN PAIRS IN THE INPUT LIST AS A FUNCTION OF ASSOCIATIVE STRENGTH OF INPUT PAIRS, SIDE OF INPUT PAIRS, AND TEST CONTEXT a

                                      Associative strength and side of input pair
Response measure   Test context   Weak-left   Weak-right   Strong-left   Strong-right
Hits               Single         .84         .64          .82           .79
                   Pair           .86         .84          .89           .86
False positives    Single         .23         .23          .23           .23
                   Pair           .24         .23          .34           .31

a Mean proportions of false positives corresponding to hit entries are also shown. Means presented were obtained by averaging over Experiments 1, 2, and 3. False positive entries for items tested singly were all derived from the same data base.

In Experiment 4, words of an input pair were randomly chosen. The hit rate for words presented on the left and right sides of an input pair was .79 and .81, respectively, when the words were tested with their input context, and .74 and .68, respectively, when tested singly. Thus, as in Experiments 1, 2, and 3, the deletion of context tended to impair recognition more for words occupying the right side of an input pair, F(1, 108) = 3.42, .10 > p > .05. When the hit scores obtained in Experiment 4 are compared with those presented in Table 4, it can be seen that deleting context appears to be less deleterious to the
recognition of words occupying the right side of randomly selected input pairs than to the recognition of words occupying the right side of input pairs with a weak associative relationship. This outcome is surprising, as it would seem that randomly chosen pairs in Experiment 4 would have weaker associative relationships than the weak associative relationships in Experiments 1, 2, and 3, and this weaker associative relationship would interact with test context and side to produce greater rather than less recognition impairment.

A comparison of the false positive data in Table 4 shows similar false positive rates for new words tested singly and for new words tested on the left or right side of a weakly associated old word, but higher false positive rates for new words tested on the left or right side of a strongly associated old word. This pattern of false positive rates was found consistently in Experiments 1, 2, and 3. This finding is consistent with the "common feature" hypothesis of false recognitions discussed earlier, if one assumes that strongly associated words have significantly more features in common than weakly associated words.

The hit rate of words presented singly and tested in the context of a new word can be examined as a function of the associative strength between members of a test pair in Experiments 1, 2, and 3 and the side a word occupied in a test pair in Experiments 1, 2, 3, and 4. A similar analysis of false positive rates can be made of responses to new words tested in the context of new words. The finding from these analyses was that both hit and false positive rates were unaffected by the side a test word occupied in a pair or by the strength of its associative relationship to the other member of the pair. Thus, addition of context impairs recognition of old words, irrespective of the side on which a new context word is added, or the associative relationship between old and new words.

In summary, the results of Experiments 1, 2, 3, and 4 showed that the effect of context is a robust one. Specifically, it was found that: (a) addition or deletion of context impairs recognition of TBR words, and (b) context plays an important role in false recognition. Finally, the subsidiary findings suggest that the necessary conditions for impairing recognition of TBR words differ when context is added and when context is deleted.

EXPERIMENT 5: SINGLE INPUT ITEMS
The question Experiment 5 was designed to answer was whether recognition of TBR items would be impaired by the addition of a context word when the input list contained only single words. In addition, the effect on recognition of adding a context word in the test phase was examined when the context word was also a member of the input list.
Method

Materials. The word pool from which test words were drawn consisted of 40 pairs of words which, according to free association norms (Bilodeau & Howell, 1965; Riegel, 1965), had a mean association strength of 27%, plus an additional 48 words selected from these norms such that they were not associated to any of the words in the word pairs, nor to one another. A further 40 unrelated words were taken from the norms to serve as buffer items in the input lists.

Design. In this experiment the input list contained only single words. Altogether there were five experimental conditions: (a) a TBR word was tested singly; (b) a TBR word was tested in the presence of a new associatively related word; (c) a TBR word was tested in the presence of a new associatively unrelated word; (d) a TBR word was tested in the presence of an associatively related word which had been a member of the input list; (e) a TBR word was tested in the context of an associatively unrelated word which had also been a member of the input list.

The input list comprised 96 single words. The first and last 20 words served as fillers and were not shown again in the recognition test. The 56 words in the middle of the list were the critical experimental material and occurred in the recognition test as old items together with 56 new words which had not been previously presented. Thus the test list contained a total of 112 words.

Four different input-test list combinations were used in this experiment. The critical items in each input list were drawn from the same basic pool, but specific items contributing data to the treatment conditions were always different in the four lists. The buffer items for the input lists were the same for all four lists. Order of presentation of critical words in the input list and all words in the test lists was determined in a random fashion. All experimental conditions appeared within a single test list following a given study list; hence all comparisons were within-subject comparisons.

Subjects. The 36 Ss were male and female students attending summer classes at the University of Toronto. Nine Ss were assigned to each of four replications of the experiment.

Procedure. Nine Ss at a time were scheduled to participate in each experimental session, but whenever there were fewer than nine, an additional session was held to make up the numbers. The Ss were told they would see a list of words on the TV screen which should be studied carefully, as a recognition test for these words would be given later. The input list of 40 filler and 56 critical words was then presented on the TV screen. Each word was displayed for 1 sec. When the input list was completed, Ss were asked to open their booklets and read the recognition instructions contained in the booklet. The instructions and the recognition test procedure were the standard ones employed in Experiments 1, 2, 3, and 4.
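The list lengths given for Experiment 5 can be checked against one another, and the observation counts in the note to Table 5 can be used to work out how many old words fell in each test condition. The sketch below is an illustration under stated assumptions: the per-condition word counts are not given in the text and are inferred here by dividing the reported observation counts by the number of Ss, and all variable names are hypothetical.

```python
SUBJECTS = 36            # Ss in Experiment 5
FILLER_WORDS = 2 * 20    # first 20 and last 20 input words
CRITICAL_WORDS = 56      # old words later tested for recognition

# Observations per hit cell for test conditions 1-5, from the note to Table 5.
HIT_OBSERVATIONS = (288, 288, 288, 576, 576)

# Assumption: each S contributes one observation per old word in a condition,
# so old words per condition = observations / number of Ss.
old_words_per_condition = [n // SUBJECTS for n in HIT_OBSERVATIONS]

input_list_length = FILLER_WORDS + CRITICAL_WORDS
test_list_length = 2 * CRITICAL_WORDS  # 56 old words plus 56 new words

print(old_words_per_condition, sum(old_words_per_condition))
print(input_list_length, test_list_length)
```

Under that assumption the five conditions account for exactly the 56 critical words, with twice as many old words in the two conditions where both members of a test pair were old.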
Results and Discussion

The mean proportions of hits and false positives for the five experimental conditions are presented in Table 5. The hit data show that the change of context did not impair recognition of TBR words, F(4, 128) = 1.11; it did not matter whether the context word added was or was not associatively related, or whether the context word added was new or was itself a TBR word. The type of context did, however, affect false recognition, F(4, 128) = 4.89, p < .01. The false positive rate for new words paired with old, associatively related words was higher than the next highest false positive rate, t(128) = 2.31, p < .05. Further discussion of these results will be postponed till the results of Experiment 6 are presented.
EXPERIMENT 6: DOUBLE-WORD INPUT LIST

The main purpose of Experiment 6 was to examine the effect of deleting context on recognition when the input list contained only associatively related pairs of words. Thus, Experiment 6 can be thought of as being complementary to Experiment 5. Other conditions were included in Experiment 6 to determine the effect on recognition of replacing the input context with another context word.
Method
Materials. The basic word pool comprised 12 single words, 36 word pairs, and 12 word quadruplets. The single words, pairs, and three words of the quadruplets were drawn from association norms (Bilodeau & Howell, 1965; Riegel, 1965). The pairs were selected such that the right member occurred as a free association response to the left member (mean associative strength of 27). Each quadruplet consisted of a response (R1) word from the norms and two stimulus words, S1 and S2, which evoked R1 as a response. The S1 and S2 words were selected in such a fashion that, according to the free association norms, they were not associated with one another. The fourth word (R2) was, in the judgment of the writer, a word which could be easily related to S2. An example of a quadruplet is: noise (S1), blow (S2), wind (R1), hit (R2). With the exception of R2, the words used in this experiment were the same as those used in Experiment 1. In addition to the pool described above, 30 other associatively related words were selected from the norms to serve as filler items in the input list.

TABLE 5
MEAN PROPORTION OF HITS AND FALSE POSITIVES AS A FUNCTION OF TEST CONTEXT a

                                                              Test context
Response measure  Experiment  1: Nil      2: New associated word  3: New random word  4: Old associated word  5: Old random word
Hits              5           .74 (0.19)  .71 (0.20)              .70 (0.23)          .76 (0.16)              .74 (0.16)
                  6           .75 (0.19)  .74 (0.20)              .71 (0.27)          .88 (0.13)              .71 (0.16)
False positives   5           .17 (0.15)  .23 (0.19)              .21 (0.16)          .29 (0.24)              .21 (0.21)
                  6           .18 (0.19)  .22 (0.20)              .18 (0.14)          .27 (0.20)              .21 (0.19)

a Standard deviations are shown in parentheses. Hit and false positive rates are based on 288 observations in Experiment 5 and 180 observations in Experiment 6, except for hits in conditions 4 and 5 and false positives in conditions 2 and 3, which are based on 576 observations in Experiment 5 and 360 in Experiment 6.

Design. There were five different experimental conditions. All words were presented in the input list as members of associatively related pairs, and tested: (a) singly; (b) in the context of a new associatively related word; (c) in the context of a new associatively unrelated word; (d) as a member of an undisturbed input pair, that is, in the context of an old associatively related word; and (e) in the new context of an associatively unrelated word which had been presented with some other word in the input list. A total of 60 associatively related pairs was presented in the input list. The first and last 15 pairs were filler items and did not appear in the test list. The remaining 30 pairs constituted the critical experimental material. Forty-two words of the experimental material occurred in the recognition test as old words together with 42 new words. The 84 words in the test list consisted of 12 single words and 36 pairs.
In each test condition there were an equal number of left and right side members of pairs presented in the input phase. A word tested as a pair member always occupied the same side as it did in the study list. Three different lists were used in this experiment. The critical words in each input-test list were drawn from the same basic pool, but the specific words contributing data to the different test conditions were always different in the three lists. The initial and final filler items were, however, identical for all three lists. The 30 critical pairs in the input list were arranged in a randomly determined order. Similarly, the order of presenting the 12 single words and 36 pairs of the test list was a random one. Thus, the three input-test lists constituted three replications of the experimental treatment.
Subjects. The 30 Ss were male and female psychology students attending summer classes at the University of Toronto. Ten Ss were tested in each of the three replications of the experiment.
Procedure. Instructions preceding the presentation of the input list were similar to those of earlier experiments. The Ss were informed that they would be shown a list of word pairs on the TV screen and that their memory for these words would be tested in a recognition test. The Ss were advised to try to relate the two members of a pair as this might assist them to remember. The input list consisting of 30 filler pairs and 30 critical pairs was then displayed on a TV screen at a rate of 2 sec/pair.
At the conclusion of the input list, test booklets were opened and the standard recognition instructions were read.
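The observation counts given in the Table 5 footnote can be cross-checked against the list sizes stated in the Method sections. The sketch below is only an illustration of that arithmetic, not part of the original procedure: it divides each observation count by the number of Ss to recover how many words each S contributed to each test condition, and then sums over conditions. All names (SUBJECTS, words_per_subject, and so on) are hypothetical.

```python
# Observation counts per test condition, taken from the Table 5 footnote.
# Conditions: 1 = nil, 2 = new associated word, 3 = new random word,
#             4 = old associated word, 5 = old random word.
SUBJECTS = {5: 36, 6: 30}        # number of Ss in Experiments 5 and 6
BASE_OBS = {5: 288, 6: 180}      # observations in most cells
DOUBLE_OBS = {5: 576, 6: 360}    # observations in the larger cells
HIT_DOUBLE = {4, 5}              # hit cells based on the larger count
FP_DOUBLE = {2, 3}               # false positive cells based on the larger count


def words_per_subject(experiment, doubled):
    """Words each S contributed to conditions 1-5 for one response measure."""
    n = SUBJECTS[experiment]
    return [
        (DOUBLE_OBS[experiment] if cond in doubled else BASE_OBS[experiment]) // n
        for cond in range(1, 6)
    ]


for exp in (5, 6):
    old = words_per_subject(exp, HIT_DOUBLE)   # old test words (hits)
    new = words_per_subject(exp, FP_DOUBLE)    # new test words (false positives)
    print(f"Experiment {exp}: old {old} = {sum(old)}, new {new} = {sum(new)}")

# Experiment 6 test list: 12 single words plus 36 pairs.
test_list_words = 12 + 2 * 36
print(test_list_words)  # 84
```

Under these counts each S in Experiment 5 was tested on 56 old words, matching the 56 critical words of the input list, and each S in Experiment 6 on 42 old and 42 new words, matching the 84-word test list.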
Results and Discussion
Table 5 contains the mean proportion of hits and corresponding false positives for the five experimental conditions. The major finding of this experiment with respect to the hit data was that recognition of TBR items was impaired when their context was changed from input to test phase, F(4, 108) = 5.12, p < .01. It did not seem to matter how the context was changed, just so long as it was changed. An examination of the false positive data in Table 5 shows that context plays an important role in new words being falsely recognized, F(4, 108) = 2.74, p < .05. New words tested in the presence of old associatively related words were more likely to be falsely recognized than new words tested singly, t(108) = 2.83, p < .01, with false recognitions of new words tested in the other conditions falling in between these two conditions.
Experiments 5 and 6 were designed to determine whether the impaired recognition found in Experiments 1, 2, 3, and 4 when context was changed depended on the presence of single words and pairs of words in the input list. The results of Experiments 5 and 6 indicate that the presence of both single words and pairs of words in the input list is a critical condition for recognition to be impaired by addition of context, but not for recognition to be impaired by deletion of context. This finding can be construed as further evidence that the conditions necessary for recognition to be impaired by the addition of context differ from those necessary for it to be impaired by the deletion of context. The second point to be noted has to do with false recognition of new words. Whether the input list comprised single words or pairs of words seems to be of little consequence, as the pattern and magnitude of false positive rates for the same five test conditions in Experiments 5 and 6 were very similar.
EXPERIMENT 7: SUCCESSIVE CONTEXT
In Experiments 1-6 a study-test procedure was employed. Using this procedure it is difficult to determine the effect of retention interval on recognition performance when context is changed. To examine this question, a continuous recognition task was employed in Experiment 7. In a continuous recognition task a series of items is presented sequentially to Ss who must judge whether each item has or has not occurred earlier in the series. Retention interval in such a task can be defined in terms of lag, that is, the number of items intervening between successive presentations of an item.
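The definition of lag can be made concrete with a short sketch. The function below is a hypothetical helper, not anything used in the experiment: it scans a series of items and, for every repeated item, reports the number of items that intervened since its previous presentation.

```python
def lags(series):
    """Return (position, item, lag) for every repeated presentation.

    Lag is the number of items intervening between a presentation of an
    item and the immediately preceding presentation of the same item.
    """
    last_seen = {}
    repeats = []
    for position, item in enumerate(series):
        if item in last_seen:
            repeats.append((position, item, position - last_seen[item] - 1))
        last_seen[item] = position
    return repeats


# Illustrative series: "iron" recurs with two other words in between.
series = ["copper", "zinc", "iron", "wash", "mend", "iron"]
print(lags(series))  # [(5, 'iron', 2)]
```

In this illustrative series the two words between the presentations of "iron" give a lag of 2, the shortest lag used in Experiment 7.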
Method
Materials. Two hundred and thirty homographs were chosen from Roget's thesaurus. For each word two meanings were employed, and the selection of the homographs was restricted by the condition that, for each of the two meanings of the homograph, three other words could be found which were synonymous or which belonged to the same conceptual category. The following example illustrates two groups of words employed with the two meanings ("A" and "B") of "iron" and "gorge":
A. copper, zinc, steel, iron
B. sweep, wash, mend, iron
A. glutton, cram, guzzle, gorge
B. canyon, crevasse, gully, gorge
Two lists of words were then constructed. In List 1, each homograph was typed on a line together with the three other related words belonging to the A meaning. Likewise, List 2 comprised each homograph with the three related words belonging to the B meaning. The two meanings of a homograph were randomly assigned to the A and B classifications. Booklets were then made of the two lists, and 14 graduate-student judges rated each group of four words according to the "degree of coherence" of the group on a 10-point scale. Half of the judges rated List 1 first and then, two days later, rated List 2; for the other judges the order of rating was reversed. From the 230 homographs, 48 were chosen such that the average coherence rating for the words of both meanings in their groupings was at least 7.0. These 48 homographs, together with the three other related words for each of the two meanings of the homographs, made up the experimental word pool. The related words were used to provide verbal context in which to present a homograph. In addition to the basic pool just described, 40 triplets of words were selected such that the three words comprising the triplet were synonyms or belonged to the same category. These 120 words were to serve as filler items in the experimental list. No word appeared more than once in the word pool.
The experimental list comprised 408 words including 48 critical "study" words and 48 critical "test" words. Critical study words and test words were always preceded by two conceptually related words, which formed the context of the critical study and test words. Critical study words were homographs and were followed at lags of 2, 5, 17, or 62 words by critical test words. There were two types of critical test words: old words (homographs repeated) and new words. The two types of critical test words were tested in one of two contexts, old or new. Thus a critical test word could have been (a) a critical study word repeated in its old context, (b) a critical study word repeated in a new context, (c) a new word in an old context, or (d) a new word in a new context, that is, the critical test word was unrelated to the critical study word. Thus, for example, the study words pull, drag, draw may be followed by the test words pull, drag, draw; or paint, illustrate, draw; or pull, drag, haul; or paint, illustrate, sketch. All 16 conditions appeared within a single experimental list; hence all comparisons are within-subject comparisons. Eight different experimental lists were constructed so that both meanings of a homograph contributed data to each of the four critical test word-context conditions.
Subjects. Forty female psychology students attending summer classes at the University of Toronto were used as Ss. Five Ss were assigned to each of the eight replications of the experiment.
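The design described above can be laid out as a small sketch. It is an illustration only, with hypothetical names: it enumerates the 16 conditions as the crossing of four lags, two types of critical test word, and two test contexts; labels the four example test triplets relative to the study triplet pull, drag, draw; and checks that the stated word counts add up to the 408-word list.

```python
from itertools import product

LAGS = (2, 5, 17, 62)          # lags in words, as given in the Method
WORD_TYPES = ("old", "new")    # critical test word: repeated homograph or new word
CONTEXTS = ("old", "new")      # preceding two words: same as at study, or different

conditions = list(product(LAGS, WORD_TYPES, CONTEXTS))
print(len(conditions))  # 16


def classify(study, test):
    """Label a test triplet relative to its study triplet as (word type, context)."""
    word_type = "old" if test[-1] == study[-1] else "new"
    context = "old" if test[:2] == study[:2] else "new"
    return word_type, context


study = ("pull", "drag", "draw")
tests = [
    ("pull", "drag", "draw"),
    ("paint", "illustrate", "draw"),
    ("pull", "drag", "haul"),
    ("paint", "illustrate", "sketch"),
]
labels = [classify(study, test) for test in tests]
print(labels)
# [('old', 'old'), ('old', 'new'), ('new', 'old'), ('new', 'new')]

# List length: each of the 96 critical words is preceded by two context
# words, and the 40 filler triplets contribute three words each.
list_length = (48 + 48) * 3 + 40 * 3
print(list_length)  # 408
```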
Procedure. Typically Ss were tested in groups of five. After Ss received their answer booklets, the experimental instructions were read to them. These instructions stated that Ss would see a list of words presented one at a time on the TV screen. For each word presented, they had to decide whether or not it had previously appeared in the list and circle a "yes" or "no" in an answer booklet. In addition, Ss had to rate the confidence of their decision on a 3-point confidence scale. Ample time was given all Ss to complete their answers although they were encouraged to respond briskly.
Results and Discussion
Hit and false positive rates for the different context conditions over the four lags are presented in Table 6. A comparison of hit rates in Table 6 reveals that the effect of context in recognition increased as retention interval increased. The interaction of Test Context × Lag was significant at the .01 level, F(3, 224) = 8.57. At Lag 2 there was a relatively small difference between the hit rates of words tested in old and new contexts (.98 and .93, respectively), but at Lag 62 the difference was quite large (.92 and .65), t(224) = 8.09, p < .01. The main effects of test context, F(1, 224) = 46.13, and lag, F(3, 224) = 22.28, were both significant at the .01 level.

TABLE 6
MEAN PROPORTION OF HITS AND FALSE POSITIVES AS A FUNCTION OF CONTEXT AND LAG IN EXPERIMENT 7 a

                                                        Lag
Response measure  Context condition  2           5           17          62
Hits              Old                .98 (0.07)  .98 (0.07)  .96 (0.11)  .92 (0.15)
                  New                .93 (0.14)  .93 (0.16)  .85 (0.23)  .65 (0.32)
False positives   Old                .06 (0.15)  .16 (0.23)  .14 (0.24)  .20 (0.25)
                  New                .09 (0.20)  .05 (0.12)  .06 (0.15)  .06 (0.15)

a Standard deviations are shown in parentheses. Hit and false positive means are based on 240 observations.
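
The comparisons drawn from Table 6 can be reproduced with a short sketch. It is offered as an illustration, not as the analysis reported in the paper: it takes the tabled means, uses the Method value of 17 for the third lag, and computes the old-minus-new context difference at each lag for hits and for false positives. For comparison it also computes two combined indices of the kind considered under Measures of Recognition, a corrected score (hits minus false positives) and d'; the paper itself reports hits and false positives separately. All names are hypothetical.

```python
from statistics import NormalDist

LAGS = (2, 5, 17, 62)  # third lag taken as 17, the value given in the Method
HITS = {"old": (0.98, 0.98, 0.96, 0.92), "new": (0.93, 0.93, 0.85, 0.65)}
FALSE_POSITIVES = {"old": (0.06, 0.16, 0.14, 0.20), "new": (0.09, 0.05, 0.06, 0.06)}


def old_minus_new(measure):
    """Old-context value minus new-context value at each lag (2 decimals)."""
    return [round(o - n, 2) for o, n in zip(measure["old"], measure["new"])]


def d_prime(hit, false_positive):
    """Equal-variance signal detection index: z(hit) - z(false positive)."""
    z = NormalDist().inv_cdf
    return z(hit) - z(false_positive)


hit_effect = old_minus_new(HITS)
fp_effect = old_minus_new(FALSE_POSITIVES)

# Combined indices, computed separately for each context and lag.
corrected = {
    ctx: [h - f for h, f in zip(HITS[ctx], FALSE_POSITIVES[ctx])]
    for ctx in ("old", "new")
}
dprime = {
    ctx: [d_prime(h, f) for h, f in zip(HITS[ctx], FALSE_POSITIVES[ctx])]
    for ctx in ("old", "new")
}
corrected_effect = old_minus_new(corrected)
dprime_effect = old_minus_new(dprime)

print(hit_effect)        # [0.05, 0.05, 0.11, 0.27]
print(fp_effect)         # [-0.03, 0.11, 0.08, 0.14]
print(corrected_effect)  # [0.08, -0.06, 0.03, 0.13]
print(dprime_effect)
```

On these means the context difference in hits grows from .05 at Lag 2 to .27 at Lag 62, whereas the same difference in the corrected score at Lag 62 is only .13, because old context raises false positives as well as hits.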
False positives made to new words tested in old context increased over lag, F(3, 224) = 4.27, p < .01, from .06 at Lag 2 to .16 at Lag 5 and then more gradually to .20 at Lag 62. Statistical tests on the false positive data show that there were more false positives for new words tested in an old context than in a new one, F(1, 224) = 16.96, p < .01. The important conclusion to be drawn from both the hit and false positive data of this experiment is that the role of context in recognition memory increases as retention interval increases.
GENERAL DISCUSSION
There are three main sections in this final discussion. In the first section, the major findings of the reported experiments are summarized, the implications of these findings for theories of recognition are discussed, and speculations are made to account for some of the specific findings. The second section compares aspects of recognition data obtained in these studies with previously reported recall data. In the final section, measures of recognition are considered.
Empirical Findings
The most important thing to be learned from the experiments reported in this paper is the ubiquity of context effects in recognition memory. Context effects were obtained: (a) when context of TBR words was changed by adding, deleting, or substituting context words, (b) with homographs, associatively related, and associatively unrelated words as the recognition material, and (c) in study-test and continuous recognition tasks. However, the addition of context to TBR words had no effect on recognition performance when the input list comprised only single, unrelated words (Experiment 5). One other finding pertaining to context effects is also of importance: context becomes more crucial in successfully recognizing an old item as retention interval increases (Experiment 7).
Two sources of forgetting can usefully be distinguished: stored information may be displaced from the memory store during the retention interval (the information is unavailable), or the information cannot be retrieved from the store despite its availability (the information is inaccessible). In the design of all experiments reported here the amount and organization of information in the memory store at the end of the presentation of the input list for words within a given input condition was equated. Therefore, within a given input condition, the differences found in recognition performance as a function of different test conditions must reflect the accessibility rather than the availability of the relevant stored information. This finding clearly runs counter to those views of recognition memory which explicitly deny access difficulties as a source of forgetting in recognition memory (e.g., Bower, Clark, Lesgold, & Winzenz, 1969; Kintsch, 1969, 1970; Murdock, 1968). In the light of the generality of context effects reported in this paper and other similar findings (e.g., Feingold, 1915; Light & Carter-Sobell, 1970; Meyer, 1914; Tulving & Thomson, 1971; Winograd, Karchmer, & Russell, 1971), revision of such theories of recognition memory seems to be necessary.
The explanation to be offered here for the general finding that changing the context of a word impairs its recognition revolves around two propositions. The first proposition is that the encoding of a word is determined by its physical features and its cognitive environment, which may be defined by, among other things, its verbal context. The second proposition is that access is gained to a stored trace by a test word to the extent that the encoding of the test word matches the encoded trace. Thus, the more the encoding of a TBR word in the test phase varies from the encoded trace laid down during the input phase, the less probable it is that access is gained to the stored trace. This is the "encoding variability" hypothesis.
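The two propositions can be illustrated with a toy sketch. It is not a model proposed in the paper: the feature sets, the sense labels "metal" and "laundry", and the overlap measure are all assumptions made for illustration. A word's encoding is represented as a set of physical features plus one context-dependent semantic feature, and the degree of match is the proportion of the stored trace's features that reappear in the test encoding.

```python
def encode(word, sense):
    """Toy encoding: two physical features plus one context-dependent sense."""
    return frozenset({f"orth:{word}", f"phon:{word}", f"sem:{sense}"})


def match(test_encoding, trace):
    """Proportion of the stored trace's features present in the test encoding."""
    return len(test_encoding & trace) / len(trace)


# "iron" studied in a metal context (e.g., after "copper" and "zinc").
trace = encode("iron", "metal")

same_context = encode("iron", "metal")       # tested in the old context
changed_context = encode("iron", "laundry")  # tested after "wash" and "mend"

print(match(same_context, trace))     # 1.0
print(match(changed_context, trace))  # two of three features match
```

In this sketch a change of context leaves the physical features intact but alters the semantic feature, so the test encoding matches the trace less completely; on the second proposition, access to the trace is correspondingly less likely.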
The effect of changing the context of a TBR word in the test phase is to increase the likelihood that its encoding differs in the input and test phases, and thus to decrease the likelihood that the test word gains access to the trace of the "same" word stored during the input phase.
The finding in Experiment 7 that the deleterious effect on recognition of changing the context of a TBR word increased over time suggests one or both of two things. The first is that the stored trace of the old context, rather than the new context, may define Ss' cognitive environment. For example, having just seen "iron" in the context of "copper" and "zinc," Ss may perceive "iron" as a metal when "iron" is presented again soon after, even though "iron" is presented in the context of "wash" and "mend." As the retention interval increases, the trace of the old context becomes unavailable or inaccessible and the immediate context then provides the cognitive environment for the word. Alternatively, the effect of retention interval may be explained by assuming that the presentation of a word results in many of its features being stored (e.g., semantic, orthographic, phonemic; see Underwood, 1969). Information about the orthographic and phonemic features becomes unavailable or inaccessible more rapidly than semantic information. Thus, at short retention intervals a word can readily be recognized on the basis of its orthographic or phonemic features even though the semantic features are different. Once the orthographic and phonemic features are lost, access to the stored trace can be gained only by matching semantic features.
The encoding variability hypothesis can be further explored in relation to some of the specific findings. Consider the finding of Experiments 1, 2, and 3 that deletion of context had a minimal deleterious effect on recognition of words from strong associatively related input pairs, and maximal effect on recognition of words from the right side of weak associatively related input pairs.
Given that (a) the relationship between words used in pairs in Experiments 1, 2, and 3 was generally bidirectional for strong associatively related pairs and unidirectional for weak associatively related pairs, and (b) single words elicit associates as implicit responses, the following explanation seems plausible. At the time of testing, associates elicited by single words served as context for these words, and thus test words from strong associatively related input pairs were likely to have their input context implicitly reinstated, whereas the input context of words from the right side of input pairs was unlikely to be implicitly reinstated. There are, however, certain aspects of the data in the experiments reported here that are not consistent with this explanation: for example, the relatively high recognition of words tested singly which were left-side members of weakly associated input pairs in Experiments 1, 2, and 3, and the failure to find that recognition was more impaired by the substitution of new context words than by the mere deletion of context words in Experiment 6. A satisfactory account of the conditions necessary for recognition impairment by changing context will probably not be forthcoming until further investigation has been carried out.
Comparison of Recognition and Recall Data
The recognition data obtained in Experiments 1, 2, 3, and 4 for single input words take on further significance when aspects of these data are compared with results of cued recall studies. The finding in Experiments 1, 2, 3, and 4 that the presence at testing of a new word having a strong associative relationship to the TBR word impairs recognition performance contrasts with a common finding in cued recall tasks (e.g., Thomson & Tulving, 1970) that the presence at recall of a new word having a strong associative relationship to the TBR word facilitates recall. The contrasting outcomes of the presence of a new, strong associatively related word at the time of output would appear to be important as another instance of recall and recognition being differentially affected by the same variable.
A simple explanation of the differential effect of new, strong associatively related words in cued recall and recognition tests can be offered. In cued recall tasks, new, strong associatively related words are more effective retrieval cues than the cues that are available in free recall situations. However, in recognition memory, the optimal retrieval cue of a TBR unit is a unit nominally identical to that TBR unit. The addition of a strong associatively related word to an old single word produces a unit that is a less effective retrieval cue than the old word by itself, and consequently recognition performance is impaired.
Measures of Recognition
In a typical recognition memory experiment such as those reported in this paper, two separate measures are obtained: hits and false positives. Almost invariably these two measures are combined or transformed into a single index, such as d', or some form of corrected recognition score. In this final section, it is argued that the decision to combine these two measures should be determined by the purpose of the experiment. It has become increasingly popular to blur any distinction between recognition and discrimination. Yet a cursory examination of a dictionary reminds us that recognition and discrimination are not synonymous. English and English (1958), for example, define recognition as "awareness of an object as one that has been previously experienced," and discrimination as the "perception of differences." Recognition memory paradigms can be and often are used in discrimination studies; for example, the similarity of new and old items is manipulated and Ss' ability to discriminate between these items is measured. If the experiment is a discrimination one, then combining hits and false positives is appropriate, such measures as d' and corrected scores being adopted directly from psychophysics. On the other hand, if the purpose of the experiment is to investigate variables that influence Ss' awareness that certain events have occurred before, then it seems particularly inappropriate to combine measures. If, for example, old context caused Ss to call more old items "old" and more new items "old," and hits and false positives are combined subtractively, the effect of context is largely eliminated in the consequent index. Thus, by automatically combining measures in recognition tasks, investigators have overlooked a potentially fruitful area of research: the systematic exploration of variables affecting Ss' recognition of items, old and new.
REFERENCES
BILODEAU, E. A., & HOWELL, D. C. Free association norms. Washington, D.C.: United States Government Printing Office, 1965. Cat. No. D210.2:F87.
BOWER, G. H., CLARK, M. C., LESGOLD, A. M., & WINZENZ, D. Hierarchical retrieval schemes in recall of categorized word lists. Journal of Verbal Learning and Verbal Behavior, 1969, 8, 323-343.
COFER, C. N., SEGAL, E., STEIN, J., & WALKER, H. Studies on free recall of nouns following presentation under adjectival modification. Journal of Experimental Psychology, 1969, 79, 254-264.
ENGLISH, H. B., & ENGLISH, A. V. A comprehensive dictionary of psychological and psychoanalytical terms. New York: McKay, 1958.
FEINGOLD, G. Recognition and discrimination. Psychological Monographs, 1915, 78, 1-128.
HALL, J. W., & CROWN, I. Associative encoding of words in sentences. Journal of Verbal Learning and Verbal Behavior, 1970, 9, 303-307.
KINTSCH, W. Recognition and free recall of organized lists. Journal of Experimental Psychology, 1969, 78, 481-488.
KINTSCH, W. Models for free recall and recognition. In D. A. Norman (Ed.), Models of memory. New York: Academic Press, 1970.
LIGHT, L., & CARTER-SOBELL, L. Effects of changed semantic context on recognition memory. Journal of Verbal Learning and Verbal Behavior, 1970, 9, 1-12.
MEYER, H. W. Bereitschaft und Wiedererkennen. Zeitschrift für Psychologie, 1914, 70, 161-221.
MURDOCK, B. B. Modality effects in S.T.M.: Storage or retrieval? Journal of Experimental Psychology, 1968, 77, 79-86.
RIEGEL, K. F. Free associative responses to the 200 stimuli of the Michigan restricted association norms. Report No. 8. Ann Arbor: University of Michigan Press, 1965.
THOMSON, D. M., & TULVING, E. Associative encoding and retrieval: Weak and strong cues. Journal of Experimental Psychology, 1970, 86, 255-262.
TULVING, E. When is recall higher than recognition? Psychonomic Science, 1968, 10, 53-54.
TULVING, E., & THOMSON, D. M. Retrieval processes in recognition memory: Effects of associative context. Journal of Experimental Psychology, 1971, 87, 116-124.
UNDERWOOD, B. J. Attributes of memory. Psychological Review, 1969, 76, 559-573.
WINOGRAD, E., KARCHMER, M. A., & RUSSELL, L. S. Role of encoding unitization in cued recognition memory. Journal of Verbal Learning and Verbal Behavior, 1971, 10, 199-206.
(Received January 15, 1972)