2.05 Serial Learning
A. F. Healy and W. J. Bonk, University of Colorado, Boulder, CO, USA
© 2008 Elsevier Ltd. All rights reserved.
2.05.1 Concepts
2.05.2 Tasks
2.05.3 Results
2.05.4 Theories
2.05.4.1 Classic Theories
2.05.4.1.1 Associative chaining
2.05.4.1.2 Positional coding
2.05.4.1.3 Positional distinctiveness
2.05.4.2 Contemporary Theories
2.05.4.2.1 Perturbation model
2.05.4.2.2 Start-end model
2.05.4.2.3 Primacy model
2.05.4.2.4 OSCAR
2.05.4.2.5 TODAM
2.05.5 Theoretical Issues and Conclusions
References
In many activities in everyday life, we are required to learn the serial order of a set of elements. For example, whenever we acquire a new word, we must learn a novel order of sounds. As Lashley (1951) pointed out in his classic paper, the problem of learning serial order is that elementary movements occur in many different orders in different actions. How, then, can an individual who knows the elementary movements in an action learn to produce them in the correct sequence? For example, how can a pianist who knows the notes occurring in a given song learn to produce the notes in the correct order? That is the central problem involved in serial learning.
2.05.1 Concepts To address this problem, we need to define several important concepts and make some crucial distinctions: Performing a serial task requires subjects to display knowledge of both the elements in the task and their arrangement. The task elements are items, and their arrangement is their order. In discussing serial learning, order is usually based on the temporal sequence in which the items occur, but in some cases order is based instead on the spatial locations of the items. Thus, letters are the items in words, and the letters must appear in a fixed order for a given word to be identified. The words ‘tap’ and ‘pat’ have the same items, but their different orders create different meanings. We can describe the order of the items either in terms of their ordinal positions without reference to their relational sequence (in ‘tap,’ ‘t’ is first, ‘a’ is second, and ‘p’ is third) or in terms of their relational sequence without reference to their ordinal positions (in ‘tap,’ ‘t’ precedes ‘a’ and ‘p’ follows ‘a’).

In cognitive psychology, learning typically refers to the process of acquiring information over time, whereas memory usually refers to the retention (or forgetting) of information. Thus, in the study of serial learning, we are most concerned with the acquisition of order information, but we also need to understand the underlying memory processes that provide the foundation for such learning over time. In practice, assessments of learning typically involve multiple study and test episodes, but assessments of memory usually involve a single study episode followed by a single test. Thus, memory research can be viewed as providing a snapshot of the first stage of the learning process.
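The two ways of describing order for ‘tap’ and ‘pat’ can be made concrete with a small sketch. The function names `ordinal_code` and `relational_code` are hypothetical labels chosen for illustration and do not come from the chapter.

```python
from collections import Counter

def ordinal_code(seq):
    """Map each 1-based ordinal position to the item found there."""
    return {pos: item for pos, item in enumerate(seq, start=1)}

def relational_code(seq):
    """List (predecessor, successor) pairs without any position numbers."""
    return list(zip(seq, seq[1:]))

tap, pat = "tap", "pat"

# Same items, different order: the item sets match but the codes do not.
same_items = Counter(tap) == Counter(pat)
same_order = ordinal_code(tap) == ordinal_code(pat)

print(ordinal_code(tap))     # positions paired with letters
print(relational_code(tap))  # which letter precedes which
print(same_items, same_order)
```

Either code is enough to distinguish the two words, which is why the distinction matters mainly for theories of how order is stored rather than for scoring.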
2.05.2 Tasks The original procedure used to investigate serial learning was established by Ebbinghaus (1885/1913).
In this repeated study–test procedure, a list of items is studied and then tested by requiring recall of the items in the order in which they were shown. This procedure is repeated until the subject reaches a criterion of recalling the list without error. Later researchers replaced this procedure with the method of anticipation, in which subjects are shown one item for a fixed amount of time and are then required to anticipate the next item in the sequence. Subsequently, the next item is shown, which provides feedback to subjects as to the correctness of their last response. This procedure continues throughout the presentation of a list, and list presentation is repeated until the subject reaches a criterion, perhaps one time through the list without error. The investigator tabulates how many presentations of the list are required to reach the criterion. In recent investigations, the focus has shifted from the learning of order information to immediate memory for order information. Consequently, the most popular procedure is that of serial recall. In this case, subjects are given a series of items to study and are then required to recall the entire list in sequential order. Serial recall can be contrasted with free recall, in which subjects are free to report the items in any order they want and do not need to indicate the sequence information in any way. In serial recall tasks, subjects learn multiple different lists rather than the same list repeatedly. Often the investigator includes a delay between the presentations of successive items (interitem interval) or between the presentation of the last item on the list and the recall test (retention interval). Sometimes extraneous distracting items are interpolated during either the interitem interval or the retention interval to prevent subjects from rehearsing (practicing) the items during those intervals. In the recall procedures, to respond correctly, subjects must remember the items. 
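The contrast between serial recall and free recall can be illustrated by how a single response would be scored under each task. This is a minimal sketch that assumes every list item is unique; the function names are hypothetical, and the strict position-by-position rule is only one of several scoring conventions an investigator might adopt.

```python
def score_serial_recall(presented, recalled):
    """Count items reported in exactly the position where they were shown."""
    return sum(p == r for p, r in zip(presented, recalled))

def score_free_recall(presented, recalled):
    """Count presented items that appear anywhere in the response."""
    return len(set(presented) & set(recalled))

presented = "ABCDEF"
recalled = "ACBDEF"   # B and C reported in each other's positions

print(score_serial_recall(presented, recalled))
print(score_free_recall(presented, recalled))
```

The same response is perfect under free recall scoring but loses two positions under serial recall scoring, because only the latter requires order information.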
Another method was developed to isolate memory for order even further by eliminating the need for the subjects to remember the items. Specifically, the items are given to the subjects either in advance or during the trial, and the subjects simply have to reconstruct the order in which the items occurred. For example, for the list ABCDEF the subjects might be told that the items were BFACED, and they would have to rearrange the items into the correct sequence by placing A into the first slot, B into the second slot, and so on. In reconstructing the order, the subjects thus place each item into its appropriate position, perhaps in a horizontal array of slots, but the slots do not necessarily have to be filled in order from left to right. If a left-to-right response is required, the task is serial
reconstruction of order, whereas if no constraints on response order are specified, the task is free reconstruction of order, using the same distinction described earlier for serial and free recall. A new repeated study–test procedure has been developed to investigate both memory for and learning of serial lists, with successive snapshots of the learning process taken until a criterion is reached (Bonk, 2006). This procedure can be viewed as a combination of three common tasks already described: serial learning, serial recall, and serial reconstruction of order. Under this procedure, subjects view a display showing a set of items including both targets and distractors. The targets are then highlighted one at a time to indicate the required sequence. Subjects observe this presentation and then reconstruct the sequence by choosing one item at a time. The items can vary in type, but in the initial study were clip art pictures. The sequences can vary in length, and in the initial study they were from 6 to 15 items long. To respond correctly, subjects must remember both the identity of the target items and the order in which they occurred. A given sequence of target items is shown and recalled multiple times until the subject reaches the criterion of two perfectly recalled sequences in a row.
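The stopping rule in the repeated study–test procedure (two perfectly recalled sequences in a row) can be sketched as a count of trials to criterion. The function name, the `run_length` parameter, and the example responses are invented for illustration and are not data from Bonk (2006).

```python
def trials_to_criterion(target, responses, run_length=2):
    """Return the 1-based trial on which the criterion is first met, or None.

    The criterion is `run_length` consecutive responses that match the
    target sequence exactly.
    """
    streak = 0
    for trial, response in enumerate(responses, start=1):
        # A single error resets the run of perfect recalls.
        streak = streak + 1 if list(response) == list(target) else 0
        if streak == run_length:
            return trial
    return None

# Hypothetical responses across five successive tests of one sequence.
responses = ["ACBDEF", "ABCDEF", "ABCDFE", "ABCDEF", "ABCDEF"]
print(trials_to_criterion("ABCDEF", responses))
```

Setting `run_length=1` gives the looser criterion of one perfect trial that the earlier anticipation studies typically used.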
2.05.3 Results The most widely cited experimental result in the study of serial order is the serial position function, first described by Nipher (1878; see also Stigler, 1978) for serial recall. To obtain this function, every position in the list is scored separately, and the total number of correct responses at a given position is computed either across repetitions of the list (in a learning paradigm such as the method of anticipation) or across different lists (in a memory paradigm such as serial recall). The function typically takes on a bow shape (like a bow in archery), wherein items at the start and end of the list are remembered better than intermediate items. The advantage for the initial items is the primacy effect, and the advantage for the final items is the recency effect. In serial learning, the primacy advantage is typically much larger and includes more items than the recency advantage, which sometimes includes only a single item. Asymmetrical bow-shaped functions for the initial test of a given list in the new repeated study–test procedure developed by Bonk (2006) are shown in Figure 1 for each of 10 list lengths. Asymmetrical
Figure 1 Mean proportion of correct responses as a function of list length and serial position for initial tests in serial learning experiment by Bonk (2006). [Plot omitted: proportion correct (0.0 to 1.0) against serial position (1 to 15), one curve per list length from 6 to 15.]

Figure 2 Mean proportion of correct responses as a function of list length and serial position for all attempts through the first perfect recall of a given list by a given subject in serial learning experiment by Bonk (2006). [Plot omitted: proportion correct (0.0 to 1.0) against serial position (1 to 15), one curve per list length from 6 to 15.]
bow-shaped functions are shown in Figure 2 for all tests of a given list through the first perfect recall, again for the same 10 list lengths. Figure 1, thus, shows curves reflecting serial recall, whereas Figure 2 shows curves reflecting serial learning. The curves are different, but both show an asymmetrical bow shape. Although the level of performance in serial recall or serial learning may depend on many factors, such as the rate at which the items are presented or the familiarity of the items, the serial position curve typically takes on the same shape when it is normalized. Normalization requires computing the proportion of all correct responses that occur at each serial position of a given list by a given subject. For example, if a subject on a six-item list took three attempts to reach the criterion and during those attempts made a total of 15 correct responses,
with 3 of them on the first serial position, the normalized proportion correct for that position would be 3/15 = 0.20. The fact that the shape of the normalized serial position function is constant across serial learning conditions was first demonstrated by McCrary and Hunter (1953). The normalized functions for the serial learning results of Bonk (2006) are shown in Figure 3, again for all 10 list lengths. Other widely studied results involve the errors made by subjects in tasks requiring serial order. A frequent type of error is one in which the correct item is given but is not placed into its correct position. In a serial recall task, this is a transposition error. Typically, transposition errors occur in pairs because the positions of two adjacent items are confused. For example, if subjects are given the list ABCDEF and they recall ACBDEF, they have transposed a pair of
Figure 3 Mean normalized proportion of correct responses as a function of list length and serial position for all attempts through the first perfect recall of a given list by a given subject in serial learning experiment by Bonk (2006). [Plot omitted: normalized proportion correct (0.0 to 0.2) against serial position (1 to 15), one curve per list length from 6 to 15.]
letters, B and C. Such a paired transposition would result in errors at two of the six list positions, positions 2 and 3. A nontransposition error is any other type of error in this task, such as when an item that did not occur in the list is substituted for a correct letter. For example, if subjects respond with YBCDEZ to the sample list, they have made nontransposition errors at two of the six list positions, positions 1 and 6. Even with shorter lists, a bow-shaped serial position function is found for serial recall. In a procedure known as the distractor paradigm, items are briefly presented, followed by a retention interval filled with an interpolated task, often consisting of items of a different type, all of which must be read aloud by the subjects. For example, a list of four letters might be presented followed by a variable number of digits, with subjects reading aloud both the letters and digits before they recall the letters in the order shown. This procedure allows the investigators to examine the amount of information remaining in memory after various delays when rehearsal of the information is prevented. Using this procedure and differentiating between transposition and nontransposition errors, Bjork and Healy (1974) found that symmetrical bow-shaped serial position functions were found for total errors at each of three different retention intervals (3, 8, or 18 interpolated digits). These functions, when decomposed into transposition and nontransposition errors, showed a bow shape only for the transposition errors; the functions for nontransposition errors were much flatter, as shown in Figure 4. Transposition errors can be further described in terms of a positional uncertainty gradient, which is a function of the distance between the input positions
of the correct item and the item that substitutes for it in the subject’s recall response. When the distance is short, the probability of an error is typically larger than when the distance is long. Such error gradients are shown in Figure 5 for two different conditions in which order information was isolated by telling the subjects in advance which items would occur and using the same set of items on every trial of the experiment (Healy, 1975). The list items occurred one at a time in different spatial locations arranged in a row, with the spatial and temporal positions independently manipulated. In the temporal condition, the items occurred in fixed spatial locations, so only the temporal sequence of the items needed to be learned and remembered. In the spatial condition, the items occurred in a fixed temporal sequence, so only the spatial locations of the items needed to be learned and remembered. The items in this experiment were four consonant letters in each condition. As in the experiment by Bjork and Healy (1974), there were three different retention intervals, with 3, 8, or 18 interpolated digits. These functions show three striking differences in the retention of temporal and spatial order information. First, the decline in accuracy across retention intervals is sharp in the temporal condition but modest in the spatial condition. Second, the serial position function (evident by examining correct responses) is bow-shaped in the temporal condition but not in the spatial condition. Third, the error gradients are steeper in the temporal condition than in the spatial condition. One specific type of nontransposition error that often occurs is a confusion error, in which a given item is replaced with another item that is confusable
with it. In our example, a confusion error would occur in the second position if subjects respond with AGCDEF. The confusion error is presumably a result of similarity between the original item and the one replacing it, such as the similarity in sound between the letters B and G in the example. Such an error could be classified as a phonological confusion error. Other types of confusion errors are also possible. For example, if the items were words, the confusions could be based on similarity of meaning rather than sound, in which case they would be semantic confusion errors (e.g., replacing ‘cot’ with ‘bed’).

Often a nontransposition error is not based on item similarity but, rather, on positional similarity. Specifically, subjects show a tendency to replace an item in a given list with an item from the same position in an earlier list (e.g., Conrad, 1960; Estes, 1991). For example, if subjects see the list ABCDEF followed by the list GHIJKL and they recall the second list as GHIJEL, they have replaced the item in the fifth position of the second list with the item in the same position of the previous list. This type of error is a serial order intrusion. Both the serial position functions and the different types of errors give us clues that help us understand the cognitive processes underlying memory for and learning of serial order information.

Figure 4 Mean proportion of errors as a function of retention interval (i.e., for 3, 8, and 18 interpolated digits), and serial position for all errors (total) and separately for transposition and nontransposition errors in serial recall experiment by Bjork and Healy (1974). [Plot omitted: panels (a) 3 digits, (b) 8 digits, and (c) 18 digits; proportion of errors (0.0 to 0.7) against serial position (1 to 4), one curve per error type.]

2.05.4 Theories

2.05.4.1 Classic Theories The classic theories are largely theories of serial learning because the most popular experimental paradigm used at the time they were developed was the method of anticipation, and this paradigm provided the data that were to be explained by the models.

2.05.4.1.1
Associative chaining An early description of serial learning was based on an associative chaining model wherein one item in a sequence was linked to (associated with) the next item in a chain (see, e.g., Crowder, 1968). This model was a natural outgrowth of the serial learning task involving the method of anticipation in which each item in the list is explicitly given as a cue for the next item. In our example of the list ABCDEF, the letter A would be linked to the letter B, B to C, and so on. However, even for that task, the simple associative chaining model may not be appropriate, as is evident intuitively from the observation that missing one item in a serial list does not lead to failure to report all subsequent list items. For example, a chaining model may predict that, in memorizing a complete poem, if any word is forgotten then it would be impossible to recall subsequent words in the poem. This particular problem is overcome if there are associative links of
Figure 5 Mean positional uncertainty gradients for temporal and spatial conditions of experiment by Healy (1975). The point plotted for position ij represents the proportion of instances in which the response occurring at position i was the item appearing in input position j of the trial. The label C indicates correct responses. [Plot omitted: panels (a) Temporal condition and (b) Spatial condition; proportion of responses (0.0 to 1.0) against positions 11 to 44, one curve per retention interval of 3, 8, or 18 digits.]
varying strength among all items in the list, not just neighboring items, with the associations for adjacent items stronger than the more remote associations linking items that are not adjacent in the list. Thus, if a word in a poem is forgotten, subsequent words could still be recalled on the basis of remote associations from earlier words in the poem that could serve as cues. However, even such compound chaining could not overcome other types of evidence against this class of models. For example, in one experiment using the method of anticipation, subjects learned a
serial list of adjectives to a criterion of one perfect trial. Then they were given a task to learn a set of paired associates, with experimental pairs formed from adjacent adjectives in the previous list and control pairs formed from unrelated adjectives. Subjects learned the experimental pairs no faster than they learned the control pairs in the paired associate task (Young, 1962), which seems inconsistent with the assumption that in the serial learning task subjects formed strong associations between adjacent items
(but see Crowder, 1968, for counterevidence supporting the existence of such associations).

2.05.4.1.2
Positional coding Another early description of serial learning is also based on associations between stimuli and responses; it involves a simple positional coding model. In this case, the associations are not from one item to the next but, rather, between a given item and its ordinal position (see, e.g., Young et al., 1967). In our example, the letter A would be associated with ordinal position 1, B would be associated with ordinal position 2, and so on. One version of this theory is a box model (Conrad, 1965), according to which each successive item in a list is entered into a box, with the boxes preordered in memory. Item information in the boxes gets degraded with the passage of time, and at recall subjects output items for each box in turn using whatever information is still available. Transposition errors occur in this model not because of a reordering of the boxes but, rather, because information about an item in a given box is degraded so that the remaining partial information may be consistent with another list item, leading to report of that other item rather than the correct one for that position.

This simple model was also refuted by experiments testing it. For example, in a study like the earlier one testing the chaining model, subjects learned, using the method of anticipation, an ordered list of adjectives to a criterion of one perfect trial. Then they were given a task to learn a set of paired associates, in this case with the ordinal position numbers as stimuli and the serial list adjectives as responses. Subjects did not perform as well on the paired associate task, at least on the intermediate items, as they should have if they had in effect learned those associations previously during the serial learning task (Young et al., 1967).

2.05.4.1.3
Positional distinctiveness A simple but powerful model was proposed by Murdock (1960) to account for the serial position function in serial learning solely in terms of the distinctiveness of the positions. By this model, a given position’s distinctiveness is determined merely by comparing its ordinal position value to the values of all of the other list positions. For example, in a five-item list, the difference between the ordinal position value for the first position and the value for the other positions is the sum of $|1 - 2|$, $|1 - 3|$, $|1 - 4|$, and $|1 - 5|$, which is $1 + 2 + 3 + 4 = 10$. In contrast, a similar calculation for the third position yields $|3 - 1|$, $|3 - 2|$, $|3 - 4|$, and $|3 - 5|$, which is $2 + 1 + 1 + 2 = 6$. Thus, as is also clear intuitively, the first position is more different from the other positions than is the third position. The actual calculation of distinctiveness is a bit more complex because log values are used instead of the ordinal numbers themselves. The use of log values allows the model to account for the finding that primacy effects are typically stronger than recency effects. According to this model, the serial position function should be the same shape for all lists of a given length, even if the lists vary in terms of their presentation time or the familiarity of the items that comprise them. Indeed, as mentioned earlier, normalized serial position functions have the same shape across all experimental conditions (McCrary and Hunter, 1953).

2.05.4.2
Contemporary Theories
Unlike the classic theories, contemporary theories are largely theories of immediate serial memory because the most popular experimental methodology became the immediate serial recall paradigm, and this paradigm provided much of the data that were to be explained by the theories. Thus, the emphasis has shifted from the learning of serial order information to immediate memory for serial order information. The new data by Bonk (2006), which examine serial recall on successive learning trials, provide an empirical integration of serial memory and serial learning results, but little theoretical integration has yet been proposed.

2.05.4.2.1
Perturbation model An elegant model was proposed by Estes (1972) to account for serial recall performance in the distractor paradigm. Like the classic models, the perturbation model is based on simple associations. However, the associations in this case are between an individual list item and a control element, which represents the given context or environment in which the list was learned. At the core of the model is the concept of a reverberating loop that links the control element to a given list item, with a recurrent reactivation of the list item each time the control element is accessed. Because all the items in a list are associated to the same control element, the difference in reactivation times reflects their input order. The timing of the reactivations, thus, provides the basis for knowledge of the order of the items in a list. This knowledge is assumed to be perfectly stored in memory immediately after the list is presented. Loss of such
information, resulting in failure to recall the items in the correct order, may then occur for one of two reasons. First, the subject may lose access to the control element, perhaps because the experimental context has shifted as a function of time or because of some interpolated, interfering activity. Second, there may be perturbations, or disturbances, in the timing of the recurrent reactivations, presumably resulting from random neural activity. If the timing perturbations are large enough, two adjacent list items may be interchanged so that the later item is reactivated before the earlier item, thus leading to transposition errors in recall. The perturbation process can account for the symmetrical bow-shaped serial position functions found in immediate serial recall of short lists because the likelihood of interchanges resulting from timing perturbations is greater for intermediate list items (which have neighboring items on both sides) than for end items (which have a neighboring item on only one side). This same mechanism easily accounts for the positional uncertainty gradients observed for temporal (but not spatial) order recall, wherein the likelihood of a transposition error decreases as the distance in time increases between the input positions of the correct item and the one replacing it. After its original formulation, the perturbation model was refined to account for the fact that order information can be viewed as hierarchical (Lee and Estes, 1981). If lists are divided into subsets, perhaps by adding pauses between groups of items, then subjects need to know on which list a given item occurred and in which subset of the list it occurred, as well as its relative position in the subset. According to the refined version of the perturbation model, each item is coded for its placement in this three-tier hierarchy. 
The hierarchy of codes is repeatedly reactivated, and the perturbation process applies independently at each level, so at each reactivation there is a probability that the relative position of adjacent lists, subsets, or items will be disturbed. This hierarchical perturbation process produces serial order intrusion errors, when an item in a given list or list subset is replaced by an item from the same position in an earlier list or list subset.
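The core perturbation idea, that timing disturbances occasionally interchange adjacent items and that end items have only one neighbor, can be explored with a small simulation. This is a loose sketch under stated assumptions rather than the formulation of Estes (1972): the swap rule, the parameter `theta` (chance of a disturbance per slot per reactivation cycle), and the number of cycles are all invented for illustration, and loss of access to the control element and the hierarchical levels are left out.

```python
import random

def perturb_once(order, theta, rng):
    """One reactivation cycle: each slot may swap with a random neighbor."""
    order = list(order)
    for i in range(len(order)):
        if rng.random() < theta:
            j = i + rng.choice((-1, 1))
            # End slots have no neighbor on one side, so that draw does nothing.
            if 0 <= j < len(order):
                order[i], order[j] = order[j], order[i]
    return order

def simulate(items, theta, cycles, trials, seed=0):
    """Proportion of trials on which each position still holds its original item."""
    rng = random.Random(seed)
    correct = [0] * len(items)
    for _ in range(trials):
        order = list(items)
        for _ in range(cycles):
            order = perturb_once(order, theta, rng)
        for pos, (original, current) in enumerate(zip(items, order)):
            correct[pos] += original == current
    return [c / trials for c in correct]

# Assumed parameter values; more cycles stand in for a longer retention interval.
print(simulate("ABCD", theta=0.1, cycles=10, trials=4000, seed=1))
```

Because interior slots can be disturbed from both sides, accuracy in such a simulation tends to be lower in the middle of the list than at the ends, which is the qualitative bow shape the model is meant to capture.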
2.05.4.2.2 Start-end model The start-end model (Henson, 1998) was proposed to account for the serial position functions, the positional uncertainty gradients, and the distributions of different types of errors in the serial recall task. At the heart of this model is the observation that the start and end of a list are most salient and therefore serve as anchors, or markers, to code for each item’s position in the list (see Feigenbaum and Simon, 1962, for an earlier use of the notion of list end items serving as anchors). Each item gets a two-value code based on the strength of both the start and end markers at that point in the list. The start marker is assumed to be strongest at the beginning of the list and to get progressively weaker for subsequent list items. In contrast, the end marker is assumed to be weakest at the beginning of the list and to get progressively stronger for subsequent list items. Although the end is not evident at the start of the list, subjects anticipate the end (at least when they know the list length), and that expectation allows for the use of the end marker. The model reproduces the general finding that primacy effects are larger than recency effects by giving greater strength to the start marker than to the end marker.

This model makes use of a distinction between types and tokens as a way of representing items. A given item, such as a word, may occur in multiple lists or on multiple occasions in a given list. Each time the word occurs, the item is the same type, but the different instances of the word constitute different tokens. In the start-end model it is assumed that each item token codes both identity and positional information. The identity information specifies the content of that item (e.g., which word has occurred). The positional information is derived from the strength of the start and end markers for that item token. According to the model, the item tokens are unordered in memory; instead, they are ordered at the time of recall. Specifically, at recall the position of a given item is cued by its start and end marker strength values; the identity of the item that matches the cued strength values most closely is recovered and then recalled at that position. Another assumption made by the model is that once an item is recalled, its type is suppressed so that subjects will be less likely to recall a given item type more than once in a trial. This aspect of the model allows it to account for the Ranschburg effect (e.g., Jahnke, 1969), whereby subjects are likely to fail to recall second occurrences of a given item.

2.05.4.2.3
Primacy model The primacy model (Page and Norris, 1998) is related to both the perturbation and start-end models but was formulated to account for a different set of results. The results in this case are those that formed the basis of Baddeley and Hitch’s (1974) model of the phonological loop, which is a qualitative
description of working memory that describes rehearsal processes but does not provide any specific mechanisms for serial recall. Thus, the primacy model can be viewed as a computational version of the phonological loop model (see also Burgess and Hitch, 1999, for an alternative quantitative version of this model). The primacy model does not specifically code position information, but such information is derived at the time of recall from the relative activation strengths of list items. These activation strengths vary as a function of the time when the list items occurred, forming a primacy gradient, with the strength greatest for the first item and declining for successive items in the list. These activation strengths can be thought of as reflecting the degree to which the context defining the start of the list is associated with each successive list item. By this view, the start-of-the-list context resembles both the control element of the perturbation model and the start marker of the start-end model. However, unlike the start-end model, there is no corresponding end marker in the primacy model. To model the recall process, the primacy model implements the assumption that in a repeating cycle, the item with the greatest activation is selected for recall, and after it is recalled, it is suppressed. Subsequently, the item with the next highest activation is recalled and then suppressed, and so on. During the recall process, the activations for all list items decay exponentially with time. Errors result from the fact that there is noise in the process of selecting the item with the strongest activation (which can be viewed as noise in the perception of the activation strengths), even though there is no noise in the activation strengths themselves.
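The recall cycle just described (select the most active item under noisy perception, recall it, suppress it, let the rest decay) can be sketched in a few lines. This is a rough illustration, not the implementation of Page and Norris (1998): the linear gradient, the parameter names `peak`, `step`, `noise_sd`, and `decay`, and their default values are assumptions, items are assumed to be unique, and suppression is simplified to removing the recalled item from the candidate set.

```python
import random

def recall(items, peak=1.0, step=0.1, noise_sd=0.05, decay=0.9, seed=0):
    """Recall a list from a primacy gradient with noisy winner selection."""
    rng = random.Random(seed)
    # Primacy gradient: the first item is strongest, later items weaker.
    activation = {item: peak - step * i for i, item in enumerate(items)}
    output = []
    while activation:
        # Noise enters only the perceived strengths used for selection;
        # the stored activations themselves are left noise-free.
        perceived = {it: a + rng.gauss(0.0, noise_sd)
                     for it, a in activation.items()}
        winner = max(perceived, key=perceived.get)
        output.append(winner)
        del activation[winner]  # suppression, simplified to removal
        # Remaining activations decay exponentially over recall steps.
        activation = {it: a * decay for it, a in activation.items()}
    return output

print(recall("ABCDEF", noise_sd=0.0))   # no selection noise
print(recall("ABCDEF", noise_sd=0.08))  # noisy selection may reorder neighbors
```

With the noise set to zero the gradient alone reproduces the list in order; raising `noise_sd` makes neighboring items, whose activations are closest, the most likely to exchange places.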
Primacy effects fall out of the model naturally because of the primacy gradient, but recency effects occur because end items can only participate in a paired transposition error in one direction (i.e., with one neighbor), whereas intermediate items can participate in paired transposition errors in both directions (i.e., with neighbors on both sides). Paired transposition errors occur in this model whenever the perceived activation strength of a given list item is either less than the perceived activation strength of a subsequent list item or greater than the perceived activation strength of a preceding list item. Such paired transposition errors also rely on a property of the model called fill in, which is the assumption that when an item is missed in recall because of a transposition it is likely to be recalled in the next position. This model is, thus, consistent with the observation from Bjork and
Healy (1974) that transposition errors show a bowed serial position function but nontransposition errors do not. Nontransposition errors typically increase as a function of serial position. To account for this finding, the primacy model assumes that once the item with the strongest perceived activation is selected, the activation is compared to a threshold value. If the activation is above threshold, the item will be recalled, whereas if it is below threshold, it will be omitted and the subject will resort instead to guessing an item, with this threshold comparison subject to noise. Thus, the primacy model can account for nontransposition errors as well as transposition errors.

2.05.4.2.4 OSCAR

A novel approach to explaining serial recall was taken by Brown et al. (2000) in their oscillator-based computational model OSCAR. Oscillators are timing mechanisms that generate continuously changing rhythmic output. Oscillators occur at different frequencies, with high-frequency oscillators repeating more often than low-frequency oscillators. An analogy can be made to the hands in a clock face. The second hand completes its cycle more rapidly than the minute hand, which in turn completes its cycle more rapidly than the hour hand. OSCAR accounts for the learning of order by making use of oscillator timing mechanisms presumed to occur naturally in the mind. In OSCAR, during list presentation, associations are formed between a vector (an ordered series of numbers) representing a list item and a vector representing successive states of the learning context. The learning context is the current state of the dynamically changing internal set of timing oscillators. Thus, OSCAR, like other models, makes use of associations between items and a representation of the learning context. However, in OSCAR, unlike other models, the learning context changes continuously during list presentation. Just as Lee and Estes (1981) postulated a hierarchy of codes in the perturbation model, the oscillators in OSCAR vibrate at different rates, reflecting different levels of a three-tier hierarchy, including item position within a subset, subset position within a list, and list position within a session. Unlike the perturbation model, however, order errors arise in OSCAR solely during the retrieval stage. Specifically, at the time of retrieval, a sequence is recalled by reinstating the states of the set of oscillators that comprise the learning context. Each successive learning context vector is used as a probe recovering the list item vector that is
associated with it. Retrieval errors occur based on the quality of the learning context vector and the extent to which that vector is specific to a particular item. Items occurring close together in time have similar learning context vector values; thus, noise in the retrieval process leads to positional uncertainty gradients like those found for the temporal condition in the study by Healy (1975). This model, unlike some of the others, can thereby explain observed differences between recall of temporal and recall of spatial order information.

2.05.4.2.5 TODAM

Unlike the other contemporary theories reviewed here, which are restricted to memory for serial order, a model by Lewandowsky and Murdock (1989) is designed to account for serial learning as well as serial recall. This theory of distributed associative memory (TODAM) also differs from the other contemporary models in being based on associative chaining. Although, as mentioned earlier, problems had been found for the classic associative chaining model, these were largely overcome in TODAM. A third difference between TODAM and the other models reviewed here is that TODAM provides a more general account of memory, not being restricted to serial order (see Anderson et al., 1998, for another general model incorporating serial recall). A fourth difference is that the memory representations in TODAM are not localized but are, rather, distributed. Specifically, in TODAM the representations of all list items are stored together in a common memory vector. The numbers making up the memory vector in TODAM represent values of individual features. Successive items are associated using a mathematical operation, convolution, that blends the constituent item vectors. The resulting convolution is also added to the common memory vector. If all information is contained in a single memory vector, how can the model recover the individual list items when needed? The retrieval mechanism used for this purpose is correlation, which is the inverse of convolution (i.e., it essentially undoes that operation). Thus, a memory probe representing a particular stimulus item can be correlated with the common memory vector to yield another vector that approximates the response item with which it had been associated. Once the approximation to the response item is generated via the correlation process, it must be deblurred (interpreted) before it can be recalled. If the deblurring process yields an overt recall response,
the new vector resulting from that response can then be used as a stimulus probe to recover the next item in the list. The deblurring process might not result in an actual overt response. Nevertheless, the recall process can move forward to the next item in the list because the vector approximation can be used as a stimulus for a subsequent response. This implementation allows TODAM to overcome one of the key problems, mentioned earlier, that plagued the classic chaining model: in TODAM, missing one item in a serial list does not lead to failure to report all subsequent items. A subsequent version of TODAM (Murdock, 1995) also uses associations between higher-order chunks of items to avoid problems with simple associative chaining models. To model serial learning occurring across repeated presentations of the same list, in a closed-loop variant of TODAM, the new information added to the memory vector for an item is reduced by the amount of information already present in the vector. This aspect of the model captures the idea that gradually less is learned about each item during successive repetitions of a list.
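The storage and chaining operations described in this section can be illustrated with a small sketch. It is not the Lewandowsky and Murdock (1989) implementation, and several choices are assumptions made only for the example: circular convolution and circular correlation stand in for the model's operations, items are random unit-length vectors of 512 features, the first item is supplied as the starting cue, deblurring is reduced to choosing the best-matching item that has not yet been recalled, and the closed-loop weighting of new information is omitted. All function names are hypothetical.

```python
import math
import random


def random_item(n, rng):
    """Random feature vector scaled to unit length (an assumed item code)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def convolve(a, b):
    """Circular convolution: blends two item vectors into one association."""
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]


def correlate(probe, memory):
    """Circular correlation: approximately undoes convolution with `probe`."""
    n = len(probe)
    return [sum(probe[j] * memory[(i + j) % n] for j in range(n))
            for i in range(n)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def study(items):
    """Add every item and every successive-pair association to one vector."""
    memory = [0.0] * len(items[0])
    for vec in items:
        memory = [m + x for m, x in zip(memory, vec)]
    for a, b in zip(items, items[1:]):
        memory = [m + x for m, x in zip(memory, convolve(a, b))]
    return memory


def recall_chain(memory, items, start=0):
    """Chain through the list, using each recalled item as the next probe."""
    order = [start]
    while len(order) < len(items):
        blurry = correlate(items[order[-1]], memory)
        # Deblurring, simplified: best match among items not yet recalled.
        remaining = [i for i in range(len(items)) if i not in order]
        order.append(max(remaining, key=lambda i: dot(blurry, items[i])))
    return order


rng = random.Random(0)
items = [random_item(512, rng) for _ in range(5)]
memory = study(items)
order = recall_chain(memory, items)
```

Because the convolution used here is symmetric, probing with an item retrieves an approximation to its predecessor as well as its successor; excluding already-recalled items during deblurring is the assumption that keeps this sketch moving forward through the list. With vectors of this length, the retrieved approximation is normally much closer to the correct successor than to any other item, so order should usually match the studied sequence.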
2.05.5 Theoretical Issues and Conclusions

A variety of theoretical mechanisms have been proposed to account for serial memory and learning, but there is little consensus as to which is the best. The various models differ along numerous important dimensions, such as the relation between item and order information and whether or not position or sequence information is explicitly coded. Some models do not discriminate between temporal and spatial order, whereas others apply only to temporal order. Crucially, most models do not attempt to provide a theoretical integration of serial memory and serial learning results. Thus, despite the theoretical insights and innovations in the five decades since Lashley (1951) first discussed the importance of this problem, we have not yet achieved a full and widely accepted understanding of the processes underlying serial order behavior, which provides the foundation for many activities in everyday life.
Acknowledgments

Preparation of this chapter was supported in part by Army Research Institute Contract DASW01-03-K-0002 and Army Research Office Grant W9112NF05-1-0153 to the University of Colorado.
References

Anderson JR, Bothell D, Lebiere C, and Matessa M (1998) An integrated theory of list memory. J. Mem. Lang. 38: 341–380.
Baddeley AD and Hitch GJ (1974) Working memory. In: Bower GH (ed.) The Psychology of Learning and Motivation: Advances in Research and Theory, vol. 8, pp. 47–89. San Diego, CA: Academic Press.
Bjork EL and Healy AF (1974) Short-term order and item retention. J. Verb. Learn. Verb. Be. 13: 80–97.
Bonk WJ (2006) Sequence memory with visual item, spatial, and order information. Paper presented at the 76th annual convention of the Rocky Mountain Psychological Association, Park City, UT.
Brown GDA, Preece T, and Hulme C (2000) Oscillator-based memory for serial order. Psychol. Rev. 107: 127–181.
Burgess N and Hitch GJ (1999) Memory for serial order: A network model of the phonological loop and its timing. Psychol. Rev. 106: 551–581.
Conrad R (1960) Serial order intrusions in immediate memory. Brit. J. Psychol. 51: 45–48.
Conrad R (1965) Order error in immediate recall of sequences. J. Verb. Learn. Verb. Be. 4: 161–169.
Crowder RG (1968) Evidence for the chaining hypothesis of serial verbal learning. J. Exp. Psychol. 76: 497–500.
Ebbinghaus H (1913) Memory: A Contribution to Experimental Psychology (trans. Ruger HE and Bussenius CE). New York: Teachers College. (Translated from the 1885 German original.)
Estes WK (1972) An associative basis for coding and organization in memory. In: Melton AW and Martin E (eds.) Coding Processes in Human Memory, pp. 161–190. Washington, DC: V. H. Winston.
Estes WK (1991) On types of item coding and sources of recall in short-term memory. In: Hockley WE and Lewandowsky S (eds.) Relating Theory and Data: Essays on Human Memory in Honor of Bennet B. Murdock, pp. 155–173. Hillsdale, NJ: Erlbaum.
Feigenbaum EA and Simon HA (1962) A theory of the serial position effect. Brit. J. Psychol. 53: 307–320.
Healy AF (1975) Coding of temporal-spatial patterns in short-term memory. J. Verb. Learn. Verb. Be. 14: 481–495.
Henson RNA (1998) Short-term memory for serial order: The start-end model. Cognitive Psychol. 36: 73–137.
Jahnke JC (1969) The Ranschburg effect. Psychol. Rev. 76: 592–605.
Lashley KS (1951) The problem of serial order in behavior. In: Jeffress LA (ed.) Cerebral Mechanisms in Behavior, pp. 112–146. New York: Wiley.
Lee CL and Estes WK (1981) Item and order information in short-term memory: Evidence for multilevel perturbation processes. J. Exp. Psychol-Hum. L. 7: 149–169.
Lewandowsky S and Murdock BB Jr. (1989) Memory for serial order. Psychol. Rev. 96: 25–57.
McCrary J and Hunter WS (1953) Serial position curves in verbal learning. Science 117: 131–134.
Murdock BB Jr. (1960) The distinctiveness of stimuli. Psychol. Rev. 67: 16–31.
Murdock BB (1995) Developing TODAM: Three models for serial-order information. Mem. Cognition 23: 631–645.
Nipher FE (1878) On the distribution of errors in numbers written from memory. Transactions of the Academy of Science of St. Louis 3: ccx–ccxi.
Page MPA and Norris D (1998) The primacy model: A new model of immediate serial recall. Psychol. Rev. 105: 761–781.
Stigler SM (1978) Some forgotten work on memory. J. Exp. Psychol-Hum. L. 4: 1–4.
Young RK (1962) Tests of three hypotheses about the effective stimulus in serial learning. J. Exp. Psychol. 63: 307–313.
Young RK, Hakes DT, and Hicks RY (1967) Ordinal position number as a cue in serial learning. J. Exp. Psychol. 73: 427–438.