JOURNAL OF VERBAL LEARNING AND VERBAL BEHAVIOR
12, 440-447 (1973)
Frequency Judgments and Recognition of Homonyms¹

EDWARD J. ROWE

Memorial University of Newfoundland, St. John's, Newfoundland, Canada

Two groups of subjects made absolute frequency judgments of homonyms presented one to five times in different phrase contexts designed to evoke either the same or different semantic encodings of the homonyms on each presentation. Same-meaning homonyms produced lower frequency estimates than the two control conditions, which consisted of either repetition of identical phrases or repetition of the homonyms alone. Different-meaning homonyms produced the lowest estimates of all. Recognition memory was not differentially affected by the three phrase context conditions, suggesting that the processes which mediate frequency judgments and recognition memory are not equivalent. The results are discussed in relation to the Anderson and Bower (1972) model of recognition memory.
¹ This research was supported by Grant A8580 from the National Research Council of Canada. Lee Ann Montgomery assisted in the data collection and analysis.

Copyright © 1973 by Academic Press, Inc. All rights of reproduction in any form reserved. Printed in Great Britain.

The human subject's memory system is sensitive to variations in the frequency of occurrence of presented items. Both absolute and comparative frequency judgments of words presented varying numbers of times in a list reliably reflect differences in presentation frequency (for example, Hintzman, 1969; Underwood, Zimmerman, & Freund, 1971). An important advance in the understanding of the processes which mediate frequency judgments has been made by Hintzman and Block (1971). These investigators demonstrated that subjects can retain separate frequency information for the same words occurring at different frequencies in two successive lists, thereby suggesting that each repetition of an item establishes a new memory trace instead of simply increasing the strength of an already existing representation.

The first experiment to be reported here was concerned with the nature of these independent representations which serve as the basis for frequency estimation. The specific question investigated was whether frequency judgments are dependent on the semantic properties of words; that is, will absolute frequency judgments be affected if the meaning of a word is different each time it is presented? To this end, homonyms were presented from one to five times in a list in the context of different phrases which were intended to evoke either the same semantic encoding (for example, chocolate bar, candy bar) or a different encoding (for example, chocolate bar, sand bar) on each presentation, followed by frequency judgments of the homonyms alone. Two control lists, in which the same phrase was repeated intact or only the homonyms were presented, were also included.

The second experiment was designed to provide some information about the degree of commonality between frequency judgment and recognition memory processes. Prima facie, the two types of tasks seem to be intrinsically related. On the one hand, the fact that a person can report that a word has been presented a certain number of times in a list implies that this item was recognized as the same word on each presentation. Thus, recognition may be viewed as a process which is basic to frequency estimation. On the other hand, it seems equally apparent that recognition of an item means that the subject has some knowledge of its frequency of occurrence, namely, that it has been experienced at least once in
some reference situation, for example, a study list. The latter position has been made explicit by Underwood and his associates (Underwood, 1972; Underwood et al., 1971) in the frequency theory of recognition memory, which proposes that, other things being equal, subjects discriminate between old and new items on a recognition test by virtue of the fact that old items have acquired a situational frequency of one whereas new items have a situational frequency of zero. Experiment II of the present study was a replication of the first experiment, except that the subjects were required to make recognition confidence ratings instead of frequency judgments at the time of test. To the extent that the two tasks share the same underlying mechanisms, we expect to obtain parallel effects of semantic context in both experiments.

Two recent investigations, one involving a frequency judgment task and the other recognition, bear directly on the present research. Jacoby (1972), as part of a more complex experimental design, examined frequency judgments of words presented one, two, or four times with zero, three, or 10 other items intervening between successive repetitions. The words were presented in sentence contexts which were either identical, similar, or different on each repetition, but with each word retaining the same meaning throughout. The average frequency estimate was higher for words presented in the context of similar modifiers than for the identical-context or different-context conditions, which were about the same, especially at longer lags. These results suggest that frequency judgments should be the same for homonyms repeated in identical phrases or in different phrases which evoke the same meaning. There appears to be no empirical precedent to indicate what effect, if any, changing the semantic encoding of a word will have on frequency judgments. However, Winograd and Raines (1972) found no overall difference in recognition memory between conditions where homonyms were presented twice in either identical or different semantic contexts. This finding leads us to expect a similar result for recognition memory in the present study, at least as far as low levels of presented frequency are concerned.

EXPERIMENT I

Method

Subjects and design. The subjects were 80 volunteers from introductory psychology classes at Memorial University who were paid $1.00 each for their participation. Most of the subjects were tested in groups of four, with one subject assigned to each of the four experimental conditions: Repeated words (RW), Repeated phrases (RP), Same-meaning phrases (SM), and Different-meaning phrases (DM). When one or more of the subjects failed to appear for the experiment, additional subjects were run at a later time to equate the number in each condition at 20. The items studied by the subjects occurred at one of five frequencies within the constraints of each experimental condition. The design was, therefore, a 4 × 5 factorial, with repeated measures on the second factor.

Materials. One hundred twenty homonyms were selected from various dictionary sources such that 24 had at least five distinct meanings, 24 had at least four distinct meanings, and 24 each had at least three, two, and one meaning, respectively. List DM was constructed by embedding the homonyms representing each of these five categories in short phrases not more than four words in length designed to evoke each of the different meanings of the word. For example, the phrases used to convey the five different meanings of the word bar were sand bar, bar service, chocolate bar, called to the bar, and bar from entering. The items for list SM were obtained by selecting one phrase from the set drawn up for each word of list DM and composing additional phrases to denote the same meaning of the word, e.g., chocolate bar, candy bar, piece of bar, block of bar, a caramel bar.
Each of these two lists was reviewed independently by four judges (the author, another faculty member, a graduate student, and a research assistant), who checked to see that the phrases were meaningful and that they conveyed either the same meaning (list SM) or different meanings (list DM) for each word. Phrases which were not unanimously accepted were replaced. The remaining two lists were drawn up by randomly selecting a phrase for each word in list DM and presenting either the word alone (list RW) or the entire phrase (list RP) from one to five times within the list. Each item was typed in upper case pica on a 3 × 5-in. white index card. Two separate decks of cards were constructed from the basic list of items, with 12 words representing each level of frequency in each deck.

The cards were assigned to positions in the deck by a modified block-randomization procedure to ensure that items representative of each frequency level would be distributed evenly throughout the deck. This was accomplished by dividing the deck into six sections. Two of the 12 items which occurred only once were then assigned to each section. Repeated items were always assigned to adjacent sections, with different sections being used as the starting point equally often. For example, for items occurring three times, there were four possible sections in which to begin a particular item: 1, 2, 3, and 4. Three of the 12 items at this frequency appeared for the first time in each of these four sections. This procedure resulted in 17, 33, 40, 40, 33, and 17 cards occurring in each successive section of the deck. The cards in each section were then shuffled thoroughly, giving an average lag of 25 between repeated presentations of an item.

Procedure. The subjects went through the deck of cards in time with the clicks from an electronic metronome, which sounded every 3 sec. They were told to read the words to themselves in preparation for a later memory test, but the nature of the test was not specified. Half of the subjects in each condition received one set of items and the remainder the other set. Practice in going through the cards at the 3-sec rate was provided by a set of 10 cards, each of which had a three-digit number typed on it. Immediately after going through the deck, each subject was given an answer booklet with the following instructions on the first page:

Some of the words that you saw on the cards, along with some new words, are printed on the following pages with a blank beside each. You have to estimate how often you saw each word in the deck of cards by writing a number in each blank. For example, if you think you saw the word DOG twice, you would write the number 2 in the blank beside "dog." You may use any number you wish as your estimate. If you don't remember seeing a word when you went through the cards, write a 0 in the blank. Work quickly but carefully through these sheets. Write a number in every blank, that is, do not leave any out.

Each answer booklet contained four pages, with 20 words per page. Three of the words from each frequency level occurred on each page in a random order, along with five words which had not occurred in the deck. The same answer booklets were used for each experimental group, but there were, of course, different booklets for the two different decks. In addition, half of the booklets for each deck had the pages in the reverse order from the other half.

Results and Discussion
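As a check on the deck construction described under Materials, the block-randomization procedure can be sketched in a few lines of Python. The sketch is not part of the original procedure: the function names, item labels, and random seed are hypothetical. One further assumption is flagged in the code. Twelve twice-presented items cannot be divided equally among five starting sections, so a 2-3-2-3-2 split is assumed here because it reproduces the reported section sizes.

```python
import random

N_SECTIONS = 6        # the deck was divided into six sections
ITEMS_PER_LEVEL = 12  # 12 words at each presented frequency (1 to 5) per deck


def start_counts(freq):
    """Number of items at this frequency that begin in each allowable section."""
    n_starts = N_SECTIONS - freq + 1  # a run of `freq` adjacent sections must fit
    if ITEMS_PER_LEVEL % n_starts == 0:
        return [ITEMS_PER_LEVEL // n_starts] * n_starts
    # Assumption (not stated in the text): only frequency 2 reaches this branch,
    # since 12 items cannot be split evenly over 5 starting sections. A
    # 2-3-2-3-2 split reproduces the reported section sizes.
    return [2, 3, 2, 3, 2]


def build_deck(seed=0):
    """Assign every presentation of every item to a section, then shuffle sections."""
    rng = random.Random(seed)
    sections = [[] for _ in range(N_SECTIONS)]
    for freq in range(1, 6):
        item = 0
        for start, count in enumerate(start_counts(freq)):
            for _ in range(count):
                label = f"F{freq}-{item:02d}"  # hypothetical item label
                item += 1
                # Repetitions occupy adjacent sections, one card per section.
                for s in range(start, start + freq):
                    sections[s].append(label)
    for cards in sections:
        rng.shuffle(cards)  # cards within a section were shuffled thoroughly
    return sections


sections = build_deck()
section_sizes = [len(cards) for cards in sections]
print(section_sizes)
```

Under these assumptions the sketch prints [17, 33, 40, 40, 33, 17], the section sizes reported above, for a total of 180 cards per deck.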
The basic datum was the mean frequency estimate assigned to the items at each frequency level by each subject. The mean of these means for each experimental condition is shown in Figure 1. The data for zero-frequency words were analyzed separately by a one-way analysis of variance, which showed that the estimates assigned to these items did not differ for the four experimental groups (F < 1). The remaining data were subjected to a 4 × 5 analysis of variance with groups and frequency level as factors. Significant effects were found for both groups, F(3, 76) = 13.0, and frequency, F(4, 304) = 110, and the groups × frequency interaction, F(12, 304) = 3.77 (all ps < .001).
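The degrees of freedom attached to these F ratios follow from the design (four groups of 20 subjects, five frequency levels as a repeated measure). A minimal sketch of the arithmetic is given here, assuming the standard partition for one between-subjects factor and one within-subjects factor; the function name and dictionary keys are hypothetical.

```python
def mixed_anova_df(n_groups, n_per_group, n_levels):
    """Degrees of freedom for one between-subjects and one repeated factor."""
    n_subjects = n_groups * n_per_group
    df_groups = n_groups - 1
    df_between_error = n_subjects - n_groups      # subjects within groups
    df_levels = n_levels - 1
    df_interaction = df_groups * df_levels
    df_within_error = df_between_error * df_levels
    return {
        "groups": (df_groups, df_between_error),
        "frequency": (df_levels, df_within_error),
        "groups x frequency": (df_interaction, df_within_error),
    }


df = mixed_anova_df(n_groups=4, n_per_group=20, n_levels=5)
for effect, (numerator, denominator) in df.items():
    print(f"{effect}: F({numerator}, {denominator})")
```

With 80 subjects the partition gives (3, 76), (4, 304), and (12, 304), which agree with the values reported for the groups, frequency, and interaction effects.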
[Figure 1: mean frequency estimate plotted against PRESENTED FREQUENCY (0 to 5), with separate curves for RW, RP, SM, and DM.]
FIG. 1. Absolute frequency judgments as a function of presentation frequency and list type (repeated words, RW; repeated phrases, RP; same-meaning homonyms, SM; different-meaning homonyms, DM).

Pairwise comparisons among the four experimental groups by the Newman-Keuls procedure with the data collapsed across frequency level revealed significant differences (p < .05) between all conditions except RW versus RP. The basis of the groups × frequency interaction was further examined by separate one-way analyses of variance followed by Newman-Keuls comparisons at each level of presented frequency. The pattern of results was exactly the same for frequency levels 1, 2, and 3, where all comparisons were significant (p < .05) except DM versus SM and RW versus RP.² For presented frequencies of 4 and 5, the pattern was somewhat different. Here conditions RW and RP still did not differ significantly, nor did RP versus SM, but conditions DM versus SM did. All other comparisons were also significant for frequency levels 4 and 5.

Three main conclusions may be drawn from the data. First, frequency judgments appear to be equivalent for words which are presented either singly or in the same phrase context on each repetition. Even though the estimates given by the RP group were consistently lower than those given by the RW group for items occurring more than once, the statistical analysis provided no basis for rejection of the null hypothesis for these two list conditions. The analysis did, however, show that these groups produced estimates which were reliably higher and generally more accurate than the remaining two conditions.

Second, words which are repeated within the same context produce consistently higher estimates than words which retain the same meaning but occur in varied context, with the difference being reliable only at lower levels of presented frequency. This result at first glance seems contrary to that reported by Jacoby (1972), who found no difference between identical-context and different-context conditions analogous to those studied here. However, the number of items intervening between successive repetitions, which averaged 25 in the present study, was not greater than 10 in Jacoby's experiment, suggesting a possible explanation for the discrepancy. Also, the indication of a generalized set effect in the estimates for condition RP cautions against an interpretation of the difference between conditions RP and SM in terms of item context differences alone.

² Strictly speaking, there is no reason why the frequency estimates assigned to once-presented words should differ for the various context conditions. The fact that significant differences did occur indicates the presence of a generalized set on the part of subjects in conditions SM and DM to assign lower estimates than the subjects in conditions RW and RP. Thus, the obtained differences between conditions RW and RP on the one hand and SM and DM on the other for items presented more than once may be exaggerated.
The third conclusion to be drawn from the results of Experiment I is that words which retain the same meaning within different phrases produce higher estimates than words which have a different meaning on each presentation, especially when presented frequency is high (that is, greater than 3). Thus the question raised at the outset as to whether frequency judgments are influenced by semantic attributes can be answered in the affirmative. This finding will be discussed in more detail presently.

It would be useful to know if the depressed performance in conditions SM and DM is related to a failure in recognition memory. Winograd and Raines (1972) found no difference in recognition of homonyms presented under conditions similar to SM and DM, but it could be that differences would emerge for presented frequencies greater than two. A measure of the recognizability of the words presented in the various experimental conditions may be obtained by considering any non-zero frequency judgment for an old item to reflect recognition of the item. Similarly, any non-zero judgment of a new item constitutes a false alarm. The proportion of correct recognitions and false alarms for each condition, calculated in this way, is shown in Figure 2. The results differ markedly from the frequency judgment data. An analysis of variance of the false alarms showed no significant variation attributable to the experimental conditions (F < 1). A 4 × 5 analysis of variance of the hit data with groups and frequency as factors yielded a significant main effect for frequency, F(4, 304) = 55.1, p < .001, but no significant effect of groups, F(3, 76) = 2.60, or the groups × frequency interaction (F < 1).

The preceding analysis shows that item recognition improved as a function of item repetition but was unaffected by context, suggesting that the obtained differences in frequency estimation among the four context conditions cannot be readily interpreted as reflecting an underlying recognition memory effect. This conclusion indicates the existence of a frequency judgment process which is, in some respects at least, separate from recognition memory, and vice versa. Since the recognition measure derived above might be a relatively crude indicator of recognition effects, a second experiment was conducted in which the subjects were asked to make recognition confidence ratings of words presented under the same conditions as in Experiment I.

[Figure 2: recognition responses plotted against PRESENTED FREQUENCY, with separate curves for RW, RP, SM, and DM.]

FIG. 2. Recognition responses as a function of presentation frequency and list type (repeated words, RW; repeated phrases, RP; same-meaning homonyms, SM; different-meaning homonyms, DM).

EXPERIMENT II

Method

The lists of items from Experiment I were presented using the same procedure as in that experiment, the only difference being in the instructions which accompanied the test booklet. The instructions paralleled those of Experiment I but asked the subjects to indicate how certain they were that each word in the test booklet had occurred in the study deck by writing a number from 1 to 5 in the blank beside each word. The number 1 was used to signify certainty that the word had not occurred and 5 to signify certainty that it had. Again, the nature of the memory test was not specified until the end of the study trial. Twenty paid volunteers from undergraduate psychology classes were assigned to each of the four experimental conditions (RW, RP, SM, and DM).
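For comparison with the confidence ratings collected here, the cruder recognition measure derived from the frequency judgments of Experiment I can be sketched as follows. The scoring rule is the one stated in the text (any non-zero estimate for an old item counts as a correct recognition, and any non-zero estimate for a new item counts as a false alarm); the function name and the example data are hypothetical and are not taken from the experiment.

```python
def recognition_from_estimates(estimates, presented):
    """Score frequency judgments as recognition responses.

    estimates: frequency judgments written by one subject.
    presented: true presentation frequency of each test word (0 = new word).
    Returns (hit rate, false alarm rate).
    """
    old = [e for e, p in zip(estimates, presented) if p > 0]
    new = [e for e, p in zip(estimates, presented) if p == 0]
    hit_rate = sum(1 for e in old if e > 0) / len(old)
    false_alarm_rate = sum(1 for e in new if e > 0) / len(new)
    return hit_rate, false_alarm_rate


# Hypothetical responses: two new words and four old words.
presented = [0, 0, 1, 2, 3, 5]
estimates = [0, 1, 0, 2, 2, 4]
hit_rate, false_alarm_rate = recognition_from_estimates(estimates, presented)
print(hit_rate, false_alarm_rate)
```

In this made-up case three of the four old words receive a non-zero estimate and one of the two new words does, so the sketch prints 0.75 and 0.5. The measure discards the size of the estimate, which is one reason it may be a crude indicator of recognition.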
Results

The results, presented in Figure 3, closely resemble those obtained for the recognition analysis of Experiment I. An analysis of variance of the false alarms yielded an F ratio which was marginally significant at the .05 level, F(3, 76) = 2.75, probably reflecting the somewhat inflated false alarm rate for the RP condition. The analysis of the hit data produced significant main effects of groups, F(3, 76) = 8.11, p < .001, and frequency, F(4, 304) = 78.9, p < .001, and a nonsignificant interaction. Newman-Keuls comparisons among the four experimental groups with the data collapsed across frequency showed that the RW group gave higher confidence ratings than the other three (p < .05), which did not differ among themselves. The overall superiority of the RW group is probably due to the smaller ensemble of individual words encountered in the study list.

Apart from the significant superiority exhibited by the RW group, these results substantiate the recognition data of Experiment I. It therefore appears that recognition memory and frequency judgments are affected in different ways by changes in the type of context which accompanies repeated presentations of a word, and thus cannot be linked to the same underlying mechanisms.

[Figure 3: recognition confidence ratings plotted against PRESENTED FREQUENCY, with separate curves for RW, RP, SM, and DM.]

FIG. 3. Recognition confidence ratings as a function of presentation frequency and list type (repeated words, RW; repeated phrases, RP; same-meaning homonyms, SM; different-meaning homonyms, DM).

DISCUSSION
The discussion will center around two related points: the representation of frequency
information and the distinction between frequency judgments and recognition memory processes. Underwood (1969) has listed frequency as one of a number of attributes which can be stored in the memory complex set up by a presented item. This view implies that some sort of frequency counter is associated with each stored representation, which increments upon each repetition of the item recognized as such by the subject. A simple model of this type can account for certain experimental findings, such as the extremely accurate performance exhibited in a continuous frequency judgment task, where the subject is required to give a frequency estimate each time an item is presented in the study list (Begg & Rowe, 1972), but its role in the kind of task used in the present study is unclear. One alternative would be to assign frequency the status of a derived attribute which is realized, not as part of an item encoding, but as a retrieval process based on the number of encodings present for a to-be-judged word at the time a judgment is made. This alternative seems more readily compatible with the idea that item repetitions have independent representations (Hintzman & Block, 1971), and would permit a frequency count for each item at retrieval without the need of specifying a separate frequency attribute at storage. Evidence related to the two alternatives has been provided by Jacoby (1972), who has shown that subjects are capable of estimating the frequency of certain presented events (sentences) independently of the frequency of their constituent components (words). To explain this finding in terms of an encoded-attribute interpretation of frequency would require that a frequency attribute be formed for each memory trace and, in addition, for each discriminable subunit of a trace. As Jacoby suggested, such an analysis soon becomes unwieldy. It is, therefore, less desirable than the simpler interpretation provided by the derived-attribute view. 
Given that frequency is a derived attribute, questions regarding the nature of the representation of item frequency become trivial, in that they reduce to questions regarding the nature of memory representation per se. Thus, any encoded version of an item, regardless of the encoding dimension activated, can be used in frequency estimation, provided that the various encodings of repeated items can be located and recognized as representing the same word at retrieval time. For example, as far as the present results are concerned, it would seem to be easier to retrieve and relate three encodings of the word bill presented as grocery bill, dentist's bill, and doctor's bill than would be the case if the encodings were based on grocery bill, duck's bill, and Buffalo Bill, where the overlapping semantic features or attributes are fewer in the second instance than in the first. The results of Experiment I lend some support to this interpretation.

The present approach to the understanding of frequency judgment processes, as well as their relation to recognition memory, can be elaborated in the context of the model of recognition memory proposed by Anderson and Bower (1972). This model operates within the framework of an associative network theory of human memory. It is assumed that, upon presentation of an item in a study list, a corresponding memory node in the network is activated. In addition, simultaneous activation of a number of nodes or elements which represent prevailing contextual stimuli also occurs, this collection of elements being then recorded by a unique "second-order" node called a list marker. A word is encoded in memory by virtue of an associative link being set up between its node and the currently active list marker. At the time of test, the decision whether or not a given test item was present during study is based upon the number of contextual elements corresponding to the study situation which are associated to the list marker.
When the number of such elements exceeds some criterion threshold, the item is designated as old and a recognition response made. If an item is presented more than once in the list, a new list marker, summarizing the
extant contextual stimuli, becomes available for association to each repetition, allowing frequency estimates to be made by simply counting the number of retrieved list markers associated to the item.

Let us see how the model can be applied to the results obtained with lists RP, SM, and DM in the present study. The list markers set up for a repeated item would be expected to be more similar for repeated presentations (list RP) than for the other two lists, where the changing phrase context will contribute different contextual elements to each successive marker. Furthermore, the conditions of list DM will provide additional distinctive contextual elements for the different "repeated" list markers, since each repetition should also activate a different internal semantic context.

At least two types of errors are possible in making frequency estimates. The first is a failure to associate some item presentations with a list marker, which can be viewed as a failure of encoding or trace formation. The second is a failure to locate and count all the list markers associated to an item at the time the frequency judgment is made, which involves a failure of the retrieval system. The present experiments cannot decide between the relative influence of these two sources of error on the differential performance in the three list conditions. However, since the phrases in all conditions were similarly constructed and presented at the same rate, there is no reason to suspect that any given presentation would have a lower average probability of association to a list marker in any of the three lists. On the other hand, in accord with previous discussion, it is not difficult to imagine how the location of different list markers during the search phase could vary as a function of similarity of context.
List markers associated with repeated items in list DM would share fewer contextual elements than would the corresponding list markers in list SM, which would in turn have fewer elements in common than those of list RP. Thus, the probability of rejecting a relevant list marker would increase in order
from list RP to SM to DM to produce a corresponding decrease in the number of list markers located and, in turn, the frequency value assigned.

Consider now what might happen when a recognition response is required instead of a frequency estimate. According to the model, an item will be given a confidence rating in accordance with the number of contextual elements associated with the retrieved list marker. With repeated presentations of an item, when there is more than one list marker available, a correct recognition response, or high confidence rating, need depend on the retrieval of only one relevant list marker. This might be the first one found, or the search might continue for a "stronger" or less ambiguous list marker if the first one produces an equivocal response. We cannot, at this point, be more specific about what happens under these circumstances, but the point is that the degree of relatedness or associability of different list markers for the same item need not affect recognition memory. There is little reason to suppose that, on the average, the item presentations in either list RP, SM, or DM would be any less likely to arouse sufficient contextual cues to mediate a correct recognition. Thus, frequency judgments, which take into account all recognized list markers related to a given item, would reflect different patterns of effects from recognition ratings, which can be based on only the strongest list marker available. The increase in recognition which accompanies increases in presentation frequency presumably reflects the increased probability of finding at least one relevant list marker as more become available.
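The contrast drawn above between counting all list markers and finding at least one can be illustrated with a toy calculation. This is only a sketch under strong simplifying assumptions and is not the formal Anderson and Bower (1972) model: every presentation is assumed to form a list marker, each marker independently gives entry to the item with a probability that is the same in all lists, and once one marker is found each remaining marker is located with a linking probability that decreases from list RP to SM to DM. All parameter values and names are hypothetical.

```python
ENTRY_PROB = 0.6  # hypothetical: chance that any one marker supports recognition
LINK_PROB = {"RP": 0.8, "SM": 0.6, "DM": 0.4}  # hypothetical context similarity


def p_recognize(n_presentations, entry_prob=ENTRY_PROB):
    """Probability that at least one of the item's list markers is found."""
    return 1.0 - (1.0 - entry_prob) ** n_presentations


def expected_estimate(n_presentations, list_type):
    """Expected number of markers counted: the entry marker plus linked ones."""
    link = LINK_PROB[list_type]
    return p_recognize(n_presentations) * (1.0 + (n_presentations - 1) * link)


for n in range(1, 6):
    estimates = {lt: round(expected_estimate(n, lt), 2) for lt in LINK_PROB}
    print(n, round(p_recognize(n), 3), estimates)
```

Under these assumptions recognition rises with presentation frequency and is identical for the three lists, whereas the expected frequency estimate is the same for once-presented items and diverges in the order RP, SM, DM as frequency increases. The sketch shows only that the verbal account is internally coherent; it is not a fit to the data.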
REFERENCES

ANDERSON, J. R., & BOWER, G. H. Recognition and retrieval processes in free recall. Psychological Review, 1972, 79, 97-123.
BEGG, I., & ROWE, E. J. Continuous judgments of word frequency and familiarity. Journal of Experimental Psychology, 1972, 95, 48-54.
HINTZMAN, D. L. Apparent frequency as a function of frequency and the spacing of repetitions. Journal of Experimental Psychology, 1969, 80, 139-145.
HINTZMAN, D. L., & BLOCK, R. A. Repetition and memory: Evidence for a multiple-trace hypothesis. Journal of Experimental Psychology, 1971, 88, 297-306.
JACOBY, L. L. Context effects on frequency judgments of words and sentences. Journal of Experimental Psychology, 1972, 94, 255-260.
UNDERWOOD, B. J. Attributes of memory. Psychological Review, 1969, 76, 559-573.
UNDERWOOD, B. J. Word recognition memory and frequency information. Journal of Experimental Psychology, 1972, 94, 276-283.
UNDERWOOD, B. J., ZIMMERMAN, J., & FREUND, J. S. Retention of frequency information with observations on recognition and recall. Journal of Experimental Psychology, 1971, 87, 149-162.
WINOGRAD, E., & RAINES, S. R. Semantic and temporal variation in recognition memory. Journal of Verbal Learning and Verbal Behavior, 1972, 11, 114-119.

(Received January 17, 1973)