COGNITIVE PSYCHOLOGY 18, 123-157 (1986)

Modality and Suffix Effects in Memory for Melodic and Harmonic Musical Materials

LINDA A. ROBERTS

AT&T Bell Laboratories and Rutgers University
Four experiments examined modality and suffix effects in serial recall of musical materials. In the first two experiments, recall patterns for melodic and harmonic sequences were compared with recall for digits. Whereas melodic, harmonic, and language sequences all demonstrated strong primacy effects, the modality effect (superior recall at recency positions for auditory relative to visual presentation) occurred only for linguistic materials; for melodic and harmonic sequences, recency effects were of comparable magnitude regardless of whether the presentation modality was visual or auditory. In the third and fourth experiments, suffix effects for melodic and harmonic lists were measured, using either a single note or a chord suffix. Experiment 3 examined suffix effects for visual materials and Experiment 4 was the auditory analog of Experiment 3. For both modalities, note and chord suffixes resulted in diminished recency recall (the suffix effect) for melodic materials but only the chord suffix interfered with recall of recently presented harmonic items. Findings of recency and suffix effects for written music refute PAS (precategorical acoustic store), primary language, and static vs changing state theories of modality and suffix effects. Rather, these results support more general sensory or short-term memory theories. © 1986 Academic Press, Inc.
Reprint requests may be sent to AT&T Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974. This research was supported by AT&T Bell Laboratories and by a postgraduate scholarship from the Natural Sciences and Engineering Research Council of Canada. Some of this material was described in a doctoral dissertation in psychology submitted to Rutgers University. I am especially indebted to Vivien C. Tartter for her invaluable comments and enthusiasm throughout the project, to Mark Altom, David Krantz, and Michael Kubovy for helpful discussions and suggestions, and to Max Mathews for his continued support, encouragement, and advice. I am grateful to Caroline Palmer for her many useful suggestions as well as for her assistance in collecting the data. I also thank David Millen and Steve Poltrock for their help in programming the experiments and Gary Perlman for his statistical programs. 0010-0285/86 $7.50 Copyright © 1986 by Academic Press, Inc. All rights of reproduction in any form reserved.

This study examines modality and suffix effects in memory for music. This is a potentially fruitful endeavor because results for music might be able to differentiate some of the current theories of the modality effect observed in language. Another aim of this research is to compare recall patterns for music and language. This is of interest, particularly in view of recent analogies that have been drawn between these two communication systems (see, for example, Lerdahl & Jackendoff, 1977). A further
goal is to use some of the techniques developed to study memory for language in order to discover more about musical memory.

In studies of memory for language, it is well documented that recall of lists of auditorily presented items is superior, at terminal list positions, to recall of the same items presented visually. This phenomenon, termed the modality effect, has been reliably demonstrated in a wide variety of tasks, including free recall (e.g., Engle, 1974; Murdock & Walker, 1969), serial recall (e.g., Baddeley, 1968; Conrad & Hull, 1968), and serial probe (Murdock, 1966; Norman, 1966). In free recall tasks, both visual and auditory presentation of items produce a bow-shaped serial recall function, in which items are relatively well recalled early (primacy effect) and late (recency effect) in the list. Here the modality effect is due to a recency advantage for auditory presentation. For serial recall and serial probe tasks, there is a strong recency effect for heard items, whereas the recency effect for seen items is virtually absent.

The most widely accepted theory of the modality effect, called PAS theory, was put forth by Craik (1969) and by Crowder and Morton (1969). According to this view, enhanced recall for terminal auditorily presented list items is due to a relatively long lasting auditory sensory store (called precategorical acoustic store, or PAS), which supplements information in short-term memory. In contrast, the corresponding visual sensory store is of too short a duration to influence recall for terminal visually presented list items. There is evidence of a fairly long lasting auditory sensory store (Cole, Coltheart, & Allard, 1974; Darwin, Turvey, & Crowder, 1972), which would support this explanation. There is also evidence that auditory information, per se, is important.
For example, it has been shown that auditory presentation and visual presentation in which subjects vocalize the list items have virtually identical serial recall functions (Henmon, 1912; Murray, 1965; Wong & Blevings, 1966).

The strongest support for PAS theory is found in suffix experiments. In this paradigm, a redundant, verbal suffix is appended to the end of a list of to-be-remembered items with instructions to use it simply as a recall cue. Here the recency effect is greatly attenuated (suffix effect) when both list and suffix items are auditory. A corresponding visual suffix appended to visually presented list items has little impact on recall functions (e.g., Morton & Holloway, 1970). Morton, Crowder, and Prussin (1971) found reduced suffix effects when suffixed items differed in spatial location, loudness, or pitch, relative to the memory items. Thus, the suffix effect occurs primarily when there is a physical match between the list and suffix items, which supports the view that recency information is stored in relatively raw form.

However, a growing body of evidence has accumulated which refutes PAS accounts of modality and suffix effects. Shand and Klima (1981)
found that recency and suffix effects also occur in a visual language, American Sign Language (ASL), for its native users. For hearing subjects, it has been shown that the auditory modality is not necessary to produce modality and suffix effects (as originally proposed by Crowder & Morton, 1969): Suffix and recency effects have been observed for lip-read (Campbell & Dodd, 1980; Spoehr & Corin, 1978) and mouthed (Nairne & Walters, 1983) verbal materials. Other results have shown that recency effects which have been ascribed to language processing may not be specific either to the auditory modality or even to language (Campbell & Dodd, 1980). Recency and suffix effects have been found for nonspeech sounds (Spoehr & Corin, 1978) as well as for pictures of abstract objects (Broadbent & Broadbent, 1981; Hines, 1975; Phillips & Christie, 1977a, 1977b).

The most widely accepted alternative account of modality and suffix effects emphasizes short-term memory mechanisms (e.g., Ayres, Jonides, Reitman, Egan, & Howard, 1979; Nairne & Walters, 1983). According to this view, auditory information is encoded more easily into short-term memory because this store is primarily acoustic or phonological in nature (Laughery & Fell, 1969; Posner, 1967; Sperling & Spielman, 1970; Spoehr & Corin, 1978). Whereas auditory material is directly recoded into this store, visual material must be translated into some form of auditory or phonological code, requiring an extra stage of processing. Shand and Klima (1981) have generalized this short-term memory explanation to account for their findings for American Sign Language, cited earlier. They argue that the recency advantage is not due to auditory presentation per se, but rather occurs for material presented in the subject’s primary language. Support for these short-term memory theories of the modality effect includes findings that the modality effect is diminished with slower presentation rates (Craik, 1969; Murdock & Walker, 1969).
This suggests an extra step in visual processing (or in the processing of nonprimary language items). Other theoretical positions, emphasizing precategorical processing, have recently been put forth. Broadbent and Broadbent (1981) have proposed an “undecaying visual sensory memory” to account for recency and suffix effects for pictures. They argue that sensory (feature) rather than categorical information best predicts the recall functions for both pictures and language materials; graphic lists of words show little recency because their rapid categorization minimizes the role of a visual sensory store in recall. Campbell, Dodd, and Brasher (1983) suggested that results for lip-read materials may reflect a tendency for changing state information to be processed differently from static information (such as written words). They argued that “movement itself, or features such as direction that are given by movement, may themselves be coded as a feature”
(p. 585), which might result in a durable representation in visual sensory memory. Thus, there may be a general tendency, not specific to any one modality, for changing-state information to produce a recency advantage.

Music is a natural candidate for further studies of modality, recency, and suffix effects. Not only can the study of music further assess the generality of recency and suffix effects for nonlinguistic phenomena, but, unlike nonspeech sounds and pictures, music has both auditory and visual presentation modalities, as does language. Thus, issues such as modality differences, primary language processing, acoustic factors in PAS or short-term memory, and static vs changing-state processing can be addressed.

One purpose of this study is to test the various theoretical positions of modality and recency effects. If information residing in an auditory sensory store is responsible for enhanced recall of auditorily presented linguistic items, then this store should also be accessible for auditory presentation of musical materials, resulting in better recency recall for auditory than visual presentation of musical materials. This result might also be expected if primary language processing is critical; since we hear music long before we learn to read it, heard music would appear to be primary in nature. Furthermore, changing-state information (i.e., auditory presentation of music) should enjoy privileged processing in short-term memory compared to static information (i.e., written music). Results of a recent study on short-term recall in music (Roberts, Millen, Palmer, & Tartter, 1983) refute each of these theories (PAS, changing state, and primary language). I reserve discussing potential explanations for findings observed for music until I have described the Roberts et al. research.

A second purpose is to compare memory patterns for music and language.
Recently, several writers have drawn parallels between music and language (Brown, 1980; Delis, Fleer, & Kerr, 1978; Jackendoff & Lerdahl, 1980; Lerdahl & Jackendoff, 1977; Sloboda, 1977). For example, Jackendoff and Lerdahl (1980) have discussed similarities between musical and linguistic grammars. In experimental research, there have been reports of processing similarities between music and language. A number of studies have observed categorical perception for musical stimuli (Burns & Ward, 1974, 1978; Locke & Keller, 1973; Siegel & Siegel, 1977) very much like the effects observed with speech (e.g., Liberman, Harris, Hoffman, & Griffith, 1957). Several researchers have demonstrated the importance of phrase structure in music (Gregory, 1978; Sloboda, 1976, 1977), and similar laterality effects have been observed with both music and language (Bever & Chiarello, 1974; Reineke, 1981; Salis, 1980). Given these apparent similarities between music and language, we might expect to find similar modality and suffix effects.
A third purpose of this research is to study musical memory. Whereas studies of memory for music can perhaps shed some light on modality and suffix effects previously observed in language, the methodologies used in language research may also aid in understanding more about the nature of memory for music. In particular, recall patterns were compared for both melodic and harmonic musical materials. In addition, I observed the effects of chordal and single-note suffixes on these materials. Results of both manipulations should tell us whether or not melodic and harmonic patterns are processed similarly.

Three studies have looked at serial recall patterns for musical materials (Deutsch, 1981; Halpern & Bower, 1982; Roberts et al., 1983). However, the first two studies did not examine possible auditory-visual differences in music. Deutsch (1981) showed that serial-recall functions for heard melodies had the typical bow-shaped curve observed for spoken language. Halpern and Bower (1982) looked at recall for “good,” “bad,” and “random” visually presented melodies. They reported that there were “no serial position effects” for their musicians. However, a closer examination of their data (the authors kindly sent me their raw data to reanalyze) revealed strong primacy effects but no recency effects for all three melodic conditions. Together, these results suggest that processing of music is similar to language processing.

More recently, however, Roberts et al. (1983) showed that modality and suffix effects differ for music and language. Unlike previous studies (Deutsch, 1981; Halpern & Bower, 1982), serial-recall patterns were observed for both auditory and visual presentation of music, within one experiment. In the Roberts et al. study, musicians were presented with series of musical notes (visual) and tones (auditory) for immediate serial recall.
In the first experiment, subjects saw and heard musical sequences, drawn from an eight-item vocabulary (the Dorian mode), presented at a rate of one item every 2 s. Unlike findings for language, there were strong primacy and recency effects for both visual and auditory presentation modalities. Unfortunately, the first experiment of the Roberts et al. study yielded an overall visual advantage which made it difficult to compare recency effects between auditory and visual presentation. In an attempt to solve this problem, a second experiment was run in which the stimulus set size was decreased and presentation rate was increased. In this experiment, musicians saw and heard eight-item lists drawn from a three-note vocabulary (the notes of a minor triad), at a two-item per second presentation rate. In this experiment, strong primacy and recency effects were again observed for both presentation modalities. However, unlike the first experiment, where an overall visual advantage was observed, there was an overall auditory advantage here. It was not clear whether these general
auditory and visual superiorities were simply random findings, due to differences in task difficulty or to differences in presentation rate. Consequently, the first two experiments of the present study were designed in part to replicate these findings and in part to resolve some of these methodological difficulties.

The most important results of the Roberts et al. study are the strong primacy and recency effects for both visual and auditory presentation of music. Although results for auditory presentation support the previous findings of Deutsch (1981), the recency effect for visual presentation does not agree with results reported by Halpern and Bower (1982), described earlier. It is likely that differences occurred because the tasks differed in the two studies: In the Roberts et al. study, subjects were presented with each item sequentially (for either 0.5 or 2 s) and subjects recalled these items in their exact serial order. In contrast, subjects in the Halpern and Bower study were presented with each 10-item melody for a total duration of 4 s, and no specific instructions were given for recall. Perhaps in the latter study subjects did not see the final list items in every trial either because they focused their efforts on remembering the earlier list items or because they did not have the time to do so.

One explanation for this recency effect for written musical materials is that of Broadbent and Broadbent (1981), described earlier, which implies similar results for written music as for pictures. It may be that, like abstract pictures, written musical materials are retained in a fairly long lasting sensory store, resulting in enhanced recall at terminal positions. This might be expected if written music is coded as a pattern of note movements on a staff, for example. A second explanation is that musicians form both visual and auditory representations of written music, thus resulting in strong recency effects similar to that observed for auditory presentation.
Perhaps differential treatment of music and language occurs because, unlike written language, written music is usually accompanied by a corresponding sound. That is, when we read language, we rarely say the words aloud. In contrast, when musicians read music, they usually play or sing what they see. This description does not imply that no acoustic recoding occurs for linguistic materials; there is clearly evidence of some articulatory encoding of these materials (e.g., Schiano & Watkins, 1981). Rather, the association between written music and its corresponding auditory representation might be stronger and more automatic than that between written language and its auditory representation. Perhaps the association is stronger for mouthed and lip-read items as well, since sound normally accompanies these movements. Although this notion of auditory imagery has not been studied to any great extent, there is considerable literature for visual imagery (e.g.,
Cooper & Shepard, 1973; Shepard & Metzler, 1971) and anecdotal evidence that people form mental images of auditory stimuli. For example, people will say that a harp sounds more like a guitar than a bassoon and this judgment can be easily made without actually hearing these instruments (Brown & Herrnstein, 1981). As another example, many have experienced an almost involuntary replay (in the mind) of a piece of music which was heard perhaps as long as a few days ago.

One goal of the first two experiments was to replicate the Roberts et al. findings for melodic patterns. Their observation of comparable recency recall patterns for auditory and visual presentation was the basis for the hypothesis that processing of visually presented musical patterns entails both auditory and visual representations. Thus, it is critical that these findings are shown to be reliable. The next two experiments assessed suffix effects for visual and auditory presentation of musical patterns. In this case, note and chord suffixes were presented at the end of melodic and harmonic lists. If similar suffix effects occur for both auditory and visual presentation, then this would provide additional evidence that similar processes occur in the two modalities for music.

EXPERIMENT 1
In this experiment, serial recall of eight-item lists of “melodies” is contrasted with serial recall for digits. Digits are assumed to characterize linguistic materials, since there is an arbitrary relationship between the written symbol and its corresponding meaning. This contrasts with the more analog representation of written music, in which, for example, the notes move higher as the pitch increases. In an earlier study, Roberts et al. (1983) observed an overall visual advantage for melodic materials when trial sequences were drawn from an eight-note vocabulary, but an overall auditory advantage occurred when trial sequences were drawn from a three-note set. Because these results were confounded with presentation rate, it was not clear whether the general auditory or visual advantages were due to differences in task difficulty or to presentation rate differences. The stimulus set used in this experiment comprised the notes of a pentatonic scale (F,G,A,C’,D’). This scale has the advantages of (1) a smaller subset of notes than other scales and (2) “musical sequences” that occur given any ordering of the notes. (This scale was used by the composer and music educator, Carl Orff, in his teaching method for children, in which children can improvise by playing any ordering of these notes without creating discordant sequences.) It was hoped that, by using a five-note vocabulary (this stimulus-set size is midway between the previous two set sizes used), the two modalities would be equated for difficulty.
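The eight-item lists used in this experiment, random orderings of the five-note vocabulary with no item repeated in adjacent positions (as specified in the Method), can be illustrated with a short sketch. The sampling scheme below is an assumption: the text states only the constraint, not how the lists were actually drawn, and the names `VOCABULARY`, `make_list`, and `has_adjacent_repeat` are hypothetical.

```python
import random

# Five-note pentatonic vocabulary from the text; the apostrophe marks the upper notes.
VOCABULARY = ["F", "G", "A", "C'", "D'"]


def make_list(vocabulary, length=8, rng=random):
    """Return a random list in which no item repeats in adjacent positions.

    Assumed scheme: draw each item uniformly from the vocabulary,
    excluding only the item that was just presented.
    """
    items = [rng.choice(vocabulary)]
    while len(items) < length:
        candidates = [v for v in vocabulary if v != items[-1]]
        items.append(rng.choice(candidates))
    return items


def has_adjacent_repeat(items):
    """True if any item equals its immediate predecessor."""
    return any(a == b for a, b in zip(items, items[1:]))


if __name__ == "__main__":
    print(make_list(VOCABULARY))
```

Excluding only the previous item still allows a note to recur later in the list, which is consistent with drawing eight items from a five-item vocabulary.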
Method

Subjects. Subjects were 18 professional musicians from the New Jersey area. Subjects were chosen who had good relative pitch, as measured in a baseline task (described below) prior to the experiment. Subjects ranged in age from 24 to 79 and each subject was paid $10 for participating.

Design. All subjects participated in four conditions. In the two melodic conditions, subjects saw and heard random sequences drawn from a one-octave range of the pentatonic scale (F,G,A,C’,D’). In the two “language” conditions, subjects saw and heard random sequences of digits (1 to 5).

Stimuli. In each condition, subjects recalled eight-item lists, presented at the rate of 2 items/s. Lists were random orderings of melodies and digits, with the constraint that there were no sequential repetitions of the items in a given list. Visually presented melodic sequences were generated on an Apple computer. Throughout this condition, five lines representing a musical staff were centered on the screen. Subjects were told to assume a treble clef. The musical notes were presented in the center of the screen, in their appropriate places on the staff. Visually presented digits were presented in the center of an Apple computer screen. Each item was seen for 350 ms. Auditorily presented melodic sequences were generated on a SEL 32 computer, using the Music V sound synthesis program (Mathews, 1969). The waveform samples were read from a disk, converted to an analog signal with a 16-bit digital-to-analog converter at a rate of 20,000 samples per second, filtered with a Rockland 752A low-pass filter set at 8 kHz, and taped onto a NAGRA IV-S tape recorder. Each tone contained the fundamental frequency and 9 upper partials whose amplitude decreased with frequency relative to the fundamental (9 dB per octave). Phases of the partials were randomly selected to be 0 or 180 degrees, to achieve a tone with a relatively low peak factor.
The tones began with a linear attack lasting 15 ms and ended with a linear decay lasting 15 ms. The resulting sound may be described as a typical bland electronic organ timbre. The five tones comprising the stimulus set were F (349 Hz), G (392 Hz), A (440 Hz), C’ (523 Hz), and D’ (586 Hz). In a list, each item was heard for a 350-ms duration. Auditorily presented digits were spoken by a female speaker (the author) and were taped onto a Sony tape recorder. The speaker used a metronome in order to present the items at a 2-item/s rate. She attempted to minimize the stress of naturally spoken stimuli by using a monotone voice.

Procedure. Prior to the experiment, subjects were given a baseline task in which they were presented with two random eight-note sequences of tones drawn from a one-octave range of the Dorian mode. They heard one item (1-s duration) every 3 s and were required to write the letter names of the notes as they were presented. The note D just above middle C was given prior to each trial so that they could make relative pitch judgments. Only those subjects who made fewer than two errors participated in the experiment. Subjects were tested either individually or in groups of two in a double-walled soundproof chamber. Subjects received different random orderings of the four conditions and were given a short break after each condition. For each condition there were 17 trials, the first 2 of which were practice. At the beginning of a trial, subjects heard a warning signal. For the two visual conditions, this signal was a beep; for the auditory digit condition, they heard a verbal “ready” signal; and for the auditory music condition, they heard the tone F so that they could make relative pitch judgments of the tones comprising a trial.
These warning signals were included for all conditions so that the conditions would be equally subject to prefix effects (e.g., Jahnke & Perez, 1981). The warning signal was followed, after 2 s, by an eight-item sequence, presented at the rate of 2 items/s. They then had 15 s in which to make their responses on the answer sheets provided before the onset of the next trial. For the two digit conditions, they wrote the digits, and for the music conditions, they wrote the
appropriate notes on a musical staff. Subjects were told to recall the items in the exact order they were seen or heard, beginning with the first item and proceeding to the last item. They were also told not to go back to correct earlier positions and were not permitted to articulate or hum during presentation and/or recall. These instructions were monitored throughout the experiment. Subjects were advised to either guess or leave blanks when they were unable to recall given items.
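The tone specification given under Stimuli (a fundamental plus 9 upper partials falling off at 9 dB per octave, partial phases of 0 or 180 degrees, 15-ms linear attack and decay, 350-ms duration, 20,000 samples per second) can be sketched as simple additive synthesis. This is only an illustration, not the Music V implementation used in the study. It assumes that the partials are integer multiples of the fundamental and that phases are taken relative to a sine wave; it does not model the 8-kHz low-pass filter or the digital-to-analog conversion. All function names are hypothetical.

```python
import math
import random

SAMPLE_RATE = 20_000        # samples per second (from the text)
RAMP_S = 0.015              # linear attack and decay duration (from the text)
ROLLOFF_DB_PER_OCTAVE = 9   # amplitude rolloff of the partials (from the text)


def partial_amplitude(k):
    """Linear amplitude of partial k relative to the fundamental (k = 1).

    Assumes harmonic partials, so partial k lies log2(k) octaves above
    the fundamental.
    """
    octaves_above = math.log2(k)
    return 10 ** (-ROLLOFF_DB_PER_OCTAVE * octaves_above / 20)


def envelope(i, n):
    """Linear attack and decay: 0 at the first and last sample, 1 in between."""
    ramp = round(RAMP_S * SAMPLE_RATE)
    return min(1.0, i / ramp, (n - 1 - i) / ramp)


def synthesize_tone(f0, duration_s=0.350, n_partials=10, rng=random):
    """Return one tone as a list of float samples."""
    n = round(duration_s * SAMPLE_RATE)
    # A phase of 0 or 180 degrees is a sign of +1 or -1 on a sine component.
    signs = [rng.choice((1.0, -1.0)) for _ in range(n_partials)]
    samples = []
    for i in range(n):
        t = i / SAMPLE_RATE
        value = sum(
            signs[k - 1] * partial_amplitude(k) * math.sin(2 * math.pi * k * f0 * t)
            for k in range(1, n_partials + 1)
        )
        samples.append(envelope(i, n) * value)
    return samples


if __name__ == "__main__":
    tone_a = synthesize_tone(440.0)   # the A of the stimulus set
    print(len(tone_a), max(abs(s) for s in tone_a))
```

Under the harmonic assumption, the highest partial of the highest tone (10 × 586 Hz) stays below the 8-kHz filter cutoff stated in the text.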
Results
A series of analyses of variance were performed on the data, using the total number correct at each position as the dependent variable. Figures 1a and 1b show the serial-recall patterns for the digit and melody conditions, respectively. Mean percentage correct is plotted as a function of serial position. The solid lines indicate recall for auditory presentation while the broken lines indicate recall for visual presentation. As shown in Table 1, digits were recalled better than melodies [F(1,17) = 52.15, p < .001]. This superiority was manifested at every serial position, for both modalities. Additionally, auditorily presented items were recalled better than visually presented items [F(1,17) = 38.58, p < .001] at every serial position for both stimulus types. As is evident in Table 1, the difference in recall between visual and auditory presentation was greater for digits (19.4%) than for melodic sequences (8.47%), indicated by a significant stimulus-set type (digit vs melody) × modality (auditory vs visual) interaction [F(1,17) = 5.68, p = .029]. However, since the functions do not cross, this interaction could be interpreted as a floor effect in the case of musical stimuli. The most important result is that there was a greater modality difference at recency positions for digits than for melodies: The stimulus-set type × position × modality interaction was statistically significant [F(7,119) = 2.79, p = .01]. In order to substantiate this conclusion, separate analyses were performed for the two stimulus-set types.

Digits
For digits, primacy effects are similar for both modalities but recency effects differ. Figure 1a shows that, although recall was better for auditory presentation at the first four serial positions, recall patterns decreased at about the same rate. In a separate analysis of the first four serial positions no interaction was observed between the factors of position and modality (p = .318). In contrast, results for the final four serial positions show that recency recall for auditory presentation is much greater than that for visual presentation. This effect is indicated by a significant position × modality interaction [F(3,51) = 9.38, p < .001]. This modality effect for digits was verified by a post hoc comparison (Scheffé test) of these recency effects for the two modalities, using differences between Positions 8 and 6 as the dependent variable. Differences between visual and auditory presentation were statistically significant (p < .05).

FIG. 1. Mean percentage correct recall at each serial position for digits (a) and for melodies (b) in Experiment 1. Solid lines indicate results for auditory presentation and dashed lines indicate results for visual presentation.

Melodic Patterns
As with digits, auditorily presented melodic items were recalled better than visually presented items [F(1,17) = 6.08, p = .025]. Unlike results for digits, however, both primacy and recency effects were comparable for the two melodic conditions. There were no significant interactions in separate analyses of the first four and final four positions. In addition, post hoc comparisons demonstrated no significant recency differences between the two modalities. Thus, for these melodic patterns, modality differences are due to enhanced recall for auditorily presented items at all positions in the curve.

TABLE 1
Mean Percentage Recall for Digits and Melodies in Visual and Auditory Presentation Modalities (Experiment 1)

            Auditory    Visual    Mean
Digits        84.1       64.7     74.4
Melodies      58.8       50.4     54.6
Mean          71.5       57.4

Discussion
These results give more direct support for previous suggestions that there are differences in recall patterns for music and language (Roberts et al., 1983). There was an overall auditory superiority for both stimulus-set types, but auditory-visual differences were greater for digits than for melodies. For digits, a modality effect was observed in which auditory presentation resulted in superior recency recall relative to visual presentation; primacy effects were comparable for the two presentation modalities. In contrast, enhanced recall for auditory (vs visual) presentation of melodies occurred about equally at all portions of the curve; primacy and recency effects were of the same magnitude for the two musical modalities. It might be argued, however, that any differences are primarily due to superior recall for auditory digits. Although it is inappropriate to make direct comparisons of the different stimulus types, it is curious that the shape of the curve differs for the auditory digit condition relative to the remaining three conditions. For auditory digits, a bow-shaped curve is observed in which recall is comparable at the first and last serial position. In contrast, recall is superior at primacy positions for the visual digit and the two music conditions. Perhaps there is something special about spoken language, which then results in exclusive access to PAS. However, it seems unparsimonious to suggest different processing for different types of auditory stimuli, particularly at a precategorical stage. The vocabulary set used in this experiment did not eliminate previously observed overall auditory-visual differences. Although the observed auditory superiority in the present experiment was smaller than in
the Roberts et al. (1983) experiment (where the same presentation rate but a smaller vocabulary set was used), it appears that presentation rate is also an important factor in recall for visual vs auditory presentation of melodic sequences.

EXPERIMENT 2
In this experiment, 10-item lists of digits, melodic materials, and chords were presented for serial recall. The list length was increased from 8 to 10 items because of the extremely high recall rates for spoken digits in Experiment 1. For all three stimulus-set types, materials were presented both visually and auditorily. One purpose was to observe whether findings for melodic sequences can be generalized to other aspects of music: in particular, whether harmonic sequences (chords) have primacy and recency recall patterns. Harmonic patterns differ from melodic ones in that they are less “singable” and therefore may be less conducive to an auditory representation in short-term memory. Although harmonic sequences rely on pitch information (as do melodies), their processing requires both vertical and horizontal scanning, unlike the more simple horizontal aspects of melody. As a result, there may not be enough time to form auditory image translations for these harmonic materials. There is some difficulty in creating harmonic patterns that are uncontaminated by melodic properties. This problem was dealt with by using chords in which only the inner voices differ since it seems reasonable that melodic properties, such as contour, are perceived primarily when pitch changes occur at the highest and lowest pitches of chord sequences. In addition to observing recall patterns for harmonic materials, this experiment was also aimed at equating the visual and auditory musical tasks for difficulty. In the Roberts et al. (1983) study, there was an overall visual advantage when items were presented at a slow rate (1 item/2 s) and an overall auditory advantage when items were presented at a faster rate (2 items/s). (Recall that the presentation rate in Experiment 1 was 2 items/s.)
Although the vocabulary used in the first experiment of the present study appeared to have some influence on decreasing previously observed auditory-visual differences, it seemed desirable to present the items at a slower rate than in Experiment 1. Consequently, presentation rates of 1 item/s were used for all conditions in this experiment (midway between the rates used in the Roberts et al. study). Method Subjects. Subjects were 19 undergraduate and graduate music students from The Juilliard School in New York. They ranged in age from 18 to 32. Each subject was tested for 2 h and was paid $20 for participating in the experiment. Design. All subjects participated in six conditions. The melodic and digit conditions were
the same as in Experiment 1: Subjects saw and heard random sequences of musical notes (F,G,A,C’,D’) and digits (1 to 5). In the two harmonic conditions, subjects saw and heard random sequences drawn from a stimulus set of three chords (C major, G major, and E minor). An obvious flaw in this design is that the harmonic conditions have different vocabulary sizes than the melody and digit conditions. Unfortunately, there appeared to be no way to overcome this problem without making the digit and melody conditions too easy (given a three-item vocabulary). Furthermore, for the harmony conditions, there was no way of increasing the vocabulary without the potential contamination of melodic properties such as contour. Stimuli. Lists were random orderings of the three stimulus-set types (melody, harmony, and digits), with the constraint that none of the items were repeated in successive positions in the list. For the melodic and digit conditions, the stimulus sets and their generation were the same as Experiment 1. Visually presented harmonic sequences were generated on an Apple computer. As with the visual melodic condition, five lines representing a musical staff were centered on the screen throughout this portion of the experiment. Subjects were told to assume a treble clef. The chords were presented in the center of the screen in their appropriate positions on the staff. Each chord was seen for a duration of 800 ms. (This duration was also used in the other conditions in which this kind of control was possible.) A vocabulary of three 4-note chords was used. For each chord, the bass note was G above middle C and the highest note in pitch was G an octave higher. The three chords, shown in Fig. 2, were a G major chord in root position, a C major chord in second inversion, and an E minor chord in first inversion. An advantage of using these chords is that, should subjects attend to changes occurring in one of the inner voices, any clues about chord type would be ambiguous. 
The change of the inner note from C to B could reflect movement to either the G or the E chord. And the change from D to E could indicate movement to either the C or the E chord. Consequently, it was hoped that subjects would attend to the overall impressions generated by these chords. Auditorily presented harmonic sequences were generated as in Experiment 1. The stimuli were the same as in the visual harmonic condition, with the exception that the chords were presented an octave lower in pitch. Procedure. Subjects were tested in groups of one to three in a quiet room at The Juilliard School in New York. Each subject participated in all six conditions, ordered according to a Latin square design (six different orderings). Subjects were randomly assigned to the condition orderings and were given a 5-min break after every two conditions. Before each of the four music conditions, subjects were required to write the names of the notes or chords as they were seen or heard. For each condition, there were three randomly ordered trials of 10 stimuli per trial, presented at the rate of 1 item/3 s. Before the two harmony conditions, the experimenter named one 10-item list of the chords. The purpose of this task was to give the subjects experience in both hearing and naming the various musical items so that perceptual and/or recording problems were minimized during the experiment. Each condition consisted of 17 trials, of which the first two were unscored practice trials.
FIG. 2. The three chords used as harmonic stimuli. [Panel labels: root position, 2nd inversion, 1st inversion.]
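The inner-voice ambiguity described for these chords can be checked mechanically. The following is a minimal sketch, assuming the voicings implied by the text (bass G, top G an octave higher, two inner notes per chord); the pitch spellings without octave numbers and the helper names `inner_voices` and `chords_containing` are illustrative assumptions, not part of the original materials.

```python
# Voicings implied by the description: bass G above middle C, top G an octave higher.
# Octave numbers are omitted; this spelling is an illustrative assumption.
CHORDS = {
    "G": ("G", "B", "D", "G"),  # G major, root position
    "C": ("G", "C", "E", "G"),  # C major, second inversion
    "E": ("G", "B", "E", "G"),  # E minor, first inversion
}

def inner_voices(chord):
    """Return the two inner notes of a four-note voicing."""
    return CHORDS[chord][1:3]

def chords_containing(note):
    """List the chords whose inner voices include the given note."""
    return sorted(name for name in CHORDS if note in inner_voices(name))

if __name__ == "__main__":
    # An inner voice arriving on B or on E is compatible with two chords each.
    for note in ("B", "E"):
        print(note, "->", chords_containing(note))
```

Under these assumptions, a move to B is compatible with the G and E chords and a move to E with the C and E chords, which matches the ambiguity the design relies on.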
At the beginning of a trial, subjects heard a warning signal. These signals consisted of a beep for the three visual conditions, a verbal “ready” signal for the auditory digit condition, the tone F for the auditory melody condition, and the tone C’ for the auditory harmony condition. The warning signal was followed, after 2 s, by a 10-item sequence, presented at the rate of 1 item/s. Subjects then had 18 s in which to make their responses on the answer sheets provided. For the digit conditions, they wrote the digits; for the melodic conditions, they wrote the names of the notes (F,G,A,C,D); and for the harmonic conditions, they identified the chords by their root names (G,C,E). Instructions for serial recall were the same as in Experiment 1.
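The list-construction rule stated under Stimuli (random orderings with no item repeated in successive positions) can be illustrated with a short sketch. The function name `make_list`, the seed, and the sampling scheme (uniform choice among the items that differ from the previous one) are assumptions made for illustration; the original generation program is not described at this level of detail.

```python
import random

NOTES = ["F", "G", "A", "C'", "D'"]   # melodic vocabulary
DIGITS = [1, 2, 3, 4, 5]              # digit vocabulary
CHORD_NAMES = ["G", "C", "E"]         # harmonic vocabulary (root names)

def make_list(vocabulary, length=10, rng=None):
    """Return a random sequence in which no item repeats in successive positions."""
    if rng is None:
        rng = random.Random()
    items = [rng.choice(vocabulary)]
    while len(items) < length:
        # Exclude the previous item so that no immediate repetition can occur.
        candidates = [v for v in vocabulary if v != items[-1]]
        items.append(rng.choice(candidates))
    return items

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demonstration repeatable
    for vocab in (DIGITS, NOTES, CHORD_NAMES):
        print(make_list(vocab, rng=rng))
```

With a three-item vocabulary this constraint leaves only two candidates at each position after the first, which is relevant to the later point about guessing in the harmonic conditions.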
Results
As in Experiment 1, an analysis of variance was performed on the overall data, and then separate analyses were undertaken for each stimulus-set type. The dependent variable was the total number correct at each serial position. Figure 3 gives the serial-recall patterns for all three stimulus-set types. Mean percentage correct is plotted as a function of serial position. Solid lines represent the auditory conditions while the broken lines represent the visual conditions. Table 2 gives the mean percentage recall for all three stimulus-set types, for both visual and auditory presentation. As in Experiment 1, digits were recalled most accurately, indicated by a significant main effect of stimulus-set type [F(2,36) = 5.62, p = .008]. Contrary to what was expected, recall was better for harmonic than for melodic patterns. It was expected that the task would be harder when each list item consisted of four notes (harmonic conditions) than when it consisted of one note (melodic conditions), particularly in the visual conditions. Auditory materials were recalled better than visual materials [F(1,18) = 6.98, p = .017], as shown in Table 2. However, this effect of modality (a mean difference of 2.8%) was much smaller than was observed in Experiment 1 (a difference of 13.9%). This auditory superiority is due only to the digit conditions (a mean difference of 8.5%). As is evident in Table 2, there was only a slight auditory advantage for the harmony conditions (0.83%) and a slight visual advantage in the melody conditions (1.02%). There was a significant interaction between stimulus-set type and modality [F(2,36) = 5.79, p = .007]. As shown in Fig. 3, primacy effects appear to be comparable for these conditions, but recency effects differ for digits; for digits, the difference between auditory and visual presentation at recency positions is much greater than auditory-visual differences for melody or harmony. 
There was a significant stimulus-set type × position × modality interaction [F(18,324) = 4.73, p < .001]. In order to verify these differential modality effects, separate analyses were performed for the three stimulus-set types.
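As an illustration of the dependent variable (number correct at each serial position, reported as mean percentage correct), the sketch below scores recall position by position. It assumes a strict serial-position criterion, in which an item is correct only if written in the position where it was presented, and it represents blanks as `None`; both choices, and the function names, are assumptions for this example rather than details given in the text.

```python
def score_trial(presented, recalled):
    """Return 0/1 scores, one per serial position (strict position scoring)."""
    return [int(p == r) for p, r in zip(presented, recalled)]

def percent_correct_by_position(trials):
    """Mean percentage correct at each position over (presented, recalled) pairs."""
    n_positions = len(trials[0][0])
    totals = [0] * n_positions
    for presented, recalled in trials:
        for i, s in enumerate(score_trial(presented, recalled)):
            totals[i] += s
    return [100.0 * t / len(trials) for t in totals]

if __name__ == "__main__":
    # Two hypothetical three-item trials; None marks a blank response.
    trials = [
        ([1, 2, 3], [1, 2, None]),
        ([3, 1, 2], [3, 2, 2]),
    ]
    print(percent_correct_by_position(trials))
```

Plotting the returned values against position would give a serial-position curve of the kind summarized in Fig. 3.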
[Figure 3: serial-position curves in three panels, including panels labeled MELODY and HARMONY; horizontal axis: POSITION (1 to 10).]
FIG. 3. Results for Experiment 2. Mean percentage correct is plotted as a function of serial position for auditory (solid lines) and visual (dashed lines) presentation modalities, for each of the three stimulus-set types.
TABLE 2
Mean Percentage Recall for the Three Stimulus-Set Types for Both Modalities (Experiment 2)

            Auditory    Visual    Mean
Digits        46.4       37.9     42.2
Melody        34.7       35.7     35.2
Harmony       38.9       38.1     38.5
Mean          40.0       37.2
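The post hoc comparisons reported for each stimulus-set type use a recency difference score, the difference in recall between Position 10 and Position 8. The sketch below shows that computation for one serial-position curve; the function name, its parameters, and the example percentages are hypothetical and are not data from the experiment.

```python
def recency_difference(percent_by_position, last=10, earlier=8):
    """Difference in percent correct between two serial positions (1-indexed)."""
    return percent_by_position[last - 1] - percent_by_position[earlier - 1]

if __name__ == "__main__":
    # Hypothetical 10-position curve for a single subject and condition.
    curve = [80.0, 70.0, 60.0, 50.0, 40.0, 35.0, 30.0, 40.0, 50.0, 75.0]
    print(recency_difference(curve))             # Position 10 minus Position 8
    print(recency_difference(curve, earlier=9))  # Position 10 minus Position 9
```

A larger positive score indicates a steeper recency rise; computing one score per subject and condition gives the values that a post hoc test would compare across modalities.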
Digits
As occurred in Experiment 1, recall patterns for digits differed only at recency positions. In an analysis of the first 4 positions, the only significant finding was a main effect of position [F(3,54) = 19.62, p < .001]. Although primacy effects are of equal strength, the recency effect is much more pronounced for the auditory digit condition than for the visual digit condition. In an analysis of the last 4 positions, there was a significant position × modality interaction [F(3,54) = 17.33, p < .001]. To further validate these differences in recency recall for the two modalities, a Scheffé test was performed, using difference scores between Positions 10 and 8 as the dependent variable. As expected, significant differences were observed between the two modalities (p < .01).
Melodic Patterns
For melodic patterns, primacy and recency effects of about equal strength occurred in both modalities. An analysis of the first four positions showed only a significant main effect of position [F(3,54) = 58.05, p < .001], as did an analysis of the last four positions [F(3,54) = 23.21, p < .001]. Post hoc comparisons did not yield significant recency differences between the two modalities, which further indicates that, unlike for digits, recency effects are comparable for these two conditions.
Harmonic Materials
For harmonic materials, an analysis of the first four positions yielded a significant position × modality interaction [F(3,54) = 5.10, p = .004]. This interaction is due to superior recall for visual presentation in the first 2 positions and approximately equal recall for auditory and visual presentation in the next 2 positions. It is unlikely that this result is meaningful. An analysis of the last four positions also showed a position × modality interaction [F(3,54) = 4.09, p = .01]. This interaction is due to a stronger recency effect from Positions 8 to 10 for visual presentation, which is an unusual result. Results of a Scheffé test (using difference scores between
Positions 10 and 8) indicated no differences between the two modalities at recency positions.
Discussion
As in Experiment 1, recall was better for digits than for musical materials and was better for auditory than visual presentation. Similar to the findings of Experiment 1, both differences were due primarily to superior recall for auditory digits, particularly at recency positions. However, unlike Experiment 1, recency recall was better in the four music conditions than in the visual digit condition. Thus, in view of other results demonstrating little or no recency for visually presented linguistic materials, it is likely that the relatively strong recency effect for visual digits in Experiment 1 was a chance occurrence. Nevertheless, as indicated earlier, caution must be exercised in comparing results across different stimulus types. The most important finding is that results differed for music and language. As occurred in Experiment 1, modality differences at recency positions occurred only for digits. In contrast, comparable recency effects were observed for both visual and auditory presentation of musical materials. This latter observation replicates the findings of Experiment 1 and of Roberts et al. (1983). It is particularly interesting that these patterns occur for both melodic and harmonic patterns. It would seem, then, that previous findings for melodic patterns can be extended as a general musical phenomenon. However, it should be pointed out that, despite the care taken to ensure harmonic processing of the chord sequences, there is no way of ruling out the possibility that this subject population generated some less obvious listening strategies. A slower presentation rate, relative to Experiment 1, resulted in some interesting effects for both digits and melodies. The influence of presentation rate on recall for linguistic items is well documented.
Several researchers have reported that the superiority of auditory presentation is greater at faster presentation rates (Fell & Laughery, 1969; Laughery & Pinkus, 1966; Murdock & Walker, 1969; Sherman & Turvey, 1969). This finding was also observed in the present experiment for digits as well as for musical materials; for both materials, the auditory superiority was reduced when slower presentation rates were used. However, there appear to be differential effects of presentation rate for the musical and linguistic materials used in the present study. Slower presentation rates (Experiment 2) essentially eliminated auditory superiority at primacy list positions for both language and musical materials. But this manipulation eliminated modality differences at recency portions only for music. In this case, the overall auditory superiority observed in Experiment 1 was eliminated. For digits, the typical modality effect was retained at these slower rates. Thus, it appears that recall patterns differ for music and language, regardless of presentation rate. These differences are due to better recency recall for written music than for written language materials.
EXPERIMENTS 3 AND 4
The purpose of these experiments was to examine further memory for music, again using the methodologies developed in studies of language. In particular, suffix effects for musical materials were observed. For language materials, visual presentation results in only a slight, usually nonsignificant recency effect (in serial recall tasks). Thus, any effect of presenting a suffix has been shown to be either small or nonexistent (Hitch, 1975; Morton & Holloway, 1970). Because suffix experiments have usually been carried out to test PAS theories of the modality effect, most experiments looking at suffix effects have dealt with auditory presentation of list items. The most critical finding is that the suffix effect is greatest when the suffixed item shares physical features with the list items (Crowder, 1971; Crowder & Raeburn, 1970; Morton et al., 1971). Thus, the suffix effect was considered to be due to interference among precategorical traces. The influence of physical similarity between list and suffixed items has also been demonstrated for nonspeech sounds. Foreit (1976) observed that a tone suffix depressed recall of the last tone in a list. In recall for lists of environmental sounds or their verbal labels, Rowe and Rowe (1976) found that a suffix effect occurred only when the lists and suffixes were of the same type (e.g., both environmental sounds). Consequently, it appeared that the suffix effect was not limited to language, but rather was characteristic of auditory processing. However, recent findings show that auditory presentation may not be the critical variable in modality and suffix effects. Suffix effects of much greater magnitude than previously found for visually presented materials have been reported for lip-read (Campbell & Dodd, 1980), mouthed (Nairne & Walters, 1983), and signed (Shand & Klima, 1981) linguistic materials as well as for abstract pictures (e.g., Broadbent & Broadbent, 1981).
The present experiments examine suffix effects for both visual and auditory presentation of musical materials (melody and harmony). In Experiment 3, visually presented lists were followed by a written note or chord suffix and Experiment 4 was the auditory analog of Experiment 3. If physical similarity between the list items and the suffix is the critical variable, then it would be expected that, for both experiments, a single note or tone suffix will selectively impair the melodic lists and that a
chord suffix will have greater impairment on recall for chords. An alternative expectation is that, since chords consist of a number of notes, both suffixes will equally impair the recall of melodic materials. It is clear, however, that the importance of physical similarities between list and suffix items has been demonstrated only for auditory presentation. Although Shand and Klima (1981) observed a suffix effect when an ASL sign was suffixed to a list of ASL signs, for example, no studies have ascertained whether physical similarity is critical in the suffix effect of visually presented materials. Therefore it is important to compare the findings between Experiments 3 and 4 in the present study. In earlier experiments, it has been observed that recall patterns for musical materials are similar for the two modalities. This finding would lead one to expect that suffix effects will also be comparable for visual and auditory presentation of these patterns.
Method-Experiment 3
Subjects. Subjects were 16 music students from The Juilliard School in New York, 13 of whom were tested previously in Experiment 2. Subjects were paid $20 each for their participation in the experiment. Design. Each subject participated in six different conditions. Subjects saw random sequences of musical notes or chords presented in the center of a computer screen. Three of the conditions consisted of to-be-remembered melodic sequences, and the other three conditions consisted of to-be-remembered chord sequences. For each of these stimulus-set types, there was a control condition in which subjects serially recalled the 10-item lists in the absence of a suffix. In the other four conditions, both melodic sequences and chords were suffixed by a note (C above middle C) or by a chord (F major). This suffixed note or chord was to be considered as a cue to recall the 10-item lists which preceded the suffixed item. Stimuli. The visually presented stimuli were generated as in Experiment 2, except that, in this experiment, the melodic sequences were drawn from a five-note pentatonic scale with G above middle C as the tonic. The three-chord vocabulary in the harmonic conditions was the same as in Experiment 2 (C major, G major, E minor). For the two note-suffix conditions (melodic list-note suffix, harmonic list-note suffix), the redundant nonrecalled suffix was C above middle C. For the two chord-suffix conditions (melodic list-chord suffix, harmonic list-chord suffix), the suffixed item was always an F major chord (with F above middle C as the root and the F an octave higher as the highest note). Procedure. Subjects were tested in groups of one to four in a quiet room at The Juilliard School in New York. Each subject participated in all six conditions, ordered according to a Latin square design (five different orderings of the six conditions). Subjects were given a 5- to 10-min break after every two conditions.
Prior to the first presentation of melodic or harmonic list items, subjects were given a baseline task in which they wrote the names of the notes or chords as they were seen. There were two 10-item lists for each stimulus-set type, presented at a 1-item/3 s rate. This was to minimize any perceptual or recording problems that might occur during the ensuing memory tasks. Each condition consisted of 14 trials, the first two of which were practice. At the beginning of a trial, subjects heard a warning beep. After 2 s, a 10-item sequence was presented at a 1-item/s rate. For the four suffix conditions, each 10-item list was suffixed by a note or
chord, presented 1 s after the final list item. Subjects were told to keep their eyes on the screen until the redundant suffix was presented and to use this item as a cue to begin their recall of the previous 10 items. Subjects had 18 s in which to make their responses on the answer sheets provided. In the melodic conditions, they wrote the names of the notes (G,A,B,D,E) and in the harmonic conditions, they identified the chords by their root names (G,C,E). Instructions for serial recall were the same as Experiments 1 and 2.
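The timing of a suffix trial can be laid out as a simple event schedule. The sketch below is illustrative only: it assumes that all intervals are measured from onset to onset and that time zero is the warning beep, neither of which is specified in the procedure, and the function name `trial_schedule` and its parameters are hypothetical.

```python
def trial_schedule(n_items=10, lead_in=2.0, item_interval=1.0,
                   suffix_delay=1.0, with_suffix=True):
    """Return (onset_in_seconds, label) pairs for one trial.

    Assumes onset-to-onset timing with the warning signal at time zero.
    """
    events = [(0.0, "warning")]
    for i in range(n_items):
        events.append((lead_in + i * item_interval, f"item {i + 1}"))
    if with_suffix:
        last_onset = events[-1][0]
        # The suffix follows the final list item after the stated delay.
        events.append((last_onset + suffix_delay, "suffix"))
    return events

if __name__ == "__main__":
    for onset, label in trial_schedule():
        print(f"{onset:5.1f} s  {label}")
```

Under these assumptions the suffix arrives in the slot where an eleventh list item would have appeared, so it continues the rhythm of the list; the control conditions correspond to `with_suffix=False`.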
Results-Experiment 3
Similar to the results of Experiment 2, there was better recall for harmonic than for melodic materials [F(1,15) = 5.51, p = .033]. As shown in Fig. 4, where data are collapsed across conditions, there was better recall for harmonic materials at all list positions but the first two. There was a position × stimulus-set type interaction [F(9,135) = 7.31, p < .001]. Differences among the conditions were most evident at the last serial positions. The effect of position interacted with condition [F(18,270) = 4.66, p < .001]. These effects are discussed further in the sections describing each stimulus-set type separately. Figure 5 gives the serial recall patterns for each of the conditions for the melodic (Fig. 5a) and harmonic (Fig. 5b) stimulus sets. The solid lines represent the control conditions and the dashed lines represent the note (filled squares) and chord (filled triangles) suffix conditions.
Visually Presented Melodic Patterns
FIG. 4. Mean percentage correct at each serial position, collapsed across conditions (Experiment 3). [Curves labeled HARMONY and MELODY; horizontal axis: POSITION.]
FIG. 5. Serial recall patterns for the two stimulus-set types in Experiment 3. The control condition is represented by solid lines (unfilled circles) and the two suffix conditions are represented by dashed lines (filled triangles for the chord suffix condition and filled squares for the note suffix condition). [Panels: (a) MELODY, (b) HARMONY; horizontal axis: POSITION.]
In an analysis of the melodic conditions, differences among the conditions occurred only at recency positions. A position × condition interaction was not significant in the primacy part (first four positions) of the curve (p = .864) but was significant in the recency (last four) positions [F(6,90) = 2.72, p = .018].
Observation of the recency portion of the curves (Fig. 5a) shows that the control condition had enhanced recall relative to the two suffix conditions. The note-suffix and chord-suffix conditions had about equal interference effects at terminal positions. In order to determine whether there were significant recency differences in the three conditions, additional
post hoc analyses were carried out on the difference scores at recency positions. There were no significant differences between the conditions when difference scores between Positions 10 and 8 were examined. However, Scheffé tests demonstrated significant differences between the control and the note-suffix conditions (p < .05) and between the control and chord-suffix conditions (p < .05) when difference scores between Positions 10 and 9 were employed. In this case, there was no difference between the two suffix conditions. Together, these results suggest that the two suffix conditions had comparable interference effects.
Visually Presented Harmonic Materials
For harmonic materials, separate analyses of the first and last four serial positions revealed significant differences among the conditions at both primacy and recency portions of the curve, demonstrated by significant position × condition interactions [F(6,90) = 6.24, p < .001, and F(6,90) = 2.64, p = .021, respectively]. These effects are shown in Fig. 5b. In the primacy positions, recall for the control condition was best at Position 1 and worst at Position 4. Unfortunately, the meaning of this interaction is not clear. In the terminal positions, the control and note-suffix conditions show a positive slope in recall from Position 9 to 10 while the chord-suffix condition has a negative slope. Thus, whereas the note-suffix condition has a small effect on recency recall, the chord-suffix condition appears to have a dramatic effect on recall of the terminal list items. A test of the differential effects of the two suffixes involved post hoc comparisons (Scheffé test) of recency effects between the conditions. The dependent variable was the difference between Positions 10 and 8. There was no reliable difference between the two suffix conditions. However, a significant difference was found between the control and chord-suffix conditions (p < .05) but not between the control and note-suffix conditions, thus suggesting a trend in favor of selective interference from the chord suffix.
Discussion
In contrast to the small or nonexistent suffix effects for visual presentation in language studies, suffix effects appear to be fairly reliable for visually presented musical materials. For melodies, comparable suffix effects from both types of suffixes were apparent. Although post hoc analyses using differences between the last and third-to-last positions showed no reliable suffix effects (these differences were used in earlier analyses), suffix effects were reliable when differences between the final two serial positions were employed. For harmonic materials, there was a trend suggesting a selective impairment due to the chord-suffix condition. This
trend was implied from the observation that, although there were no reliable differences between the note- and chord-suffix conditions, the chord suffix differed significantly from the control condition whereas the note suffix did not. Furthermore, the chord-suffix condition resulted in a negative recency slope, whereas the control and note-suffix conditions had positive slopes at terminal positions. Apart from this apparent selective impairment of chord suffixes on harmonic lists, the chord suffix appeared to have a greater effect on memory for melodic materials than did the note suffix on recall of harmonic materials. This asymmetry suggests that a chord could impair recall of melodic passages because a chord consists of a series of notes. Moreover, each of these notes is physically similar to the list items. A note suffix may not impair harmonic lists because the single note is not sufficient to wipe out the chordal representation in memory. Even if one of the notes of the chord is masked or overwritten by this suffix, the remaining information in memory may be sufficient to retain the representation of a given chord. Or it may be that a chord is represented as more than simply the sum of its parts. Each chord may be perceived and encoded as a “gestalt,” each with its own unique perceptual identity. In this case, a note suffix will be perceived as being very different and will have very little interference. However, the former explanation carries more weight because, if a chord is perceived as a unique, single entity (a “gestalt”), then this representation would not be expected to interfere with melodic recall, since it would be perceived differently than the melodic list items (i.e., not composed simply as an item containing four individual notes). Unlike results for language, these patterns for written music cannot be reconciled with PAS theories since no external physical sound is present to be passively overwritten by a suffix.
Rather, these results are more supportive of visual sensory and short-term memory theories. From both standpoints, it would be expected that, should recently presented musical materials manage to gain access to the sensory or short-term store, then the physical similarity of list and suffix items will be critical. Since these terminal list items will have a tenuous representation in memory, any additional item that enters the sensory or short-term store will interfere with recall of these list items. The magnitude of the suffix effect should depend on the physical similarity of list and suffix items.
Method-Experiment 4
Subjects. Subjects were 15 music students from The Juilliard School in New York. Thirteen of these subjects participated previously in Experiment 3. Each subject was paid $20 for her/his participation. Design. The design was identical to that of Experiment 3 except that the stimuli were
auditorily presented. The same trial sequences were used for the comparable conditions in Experiments 3 and 4 (e.g., melody control). This manipulation was employed so that more reliable comparisons might be made between the auditory and visual modalities, for the 13 subjects who participated in both experiments. Subjects were tested in Experiment 4 three weeks after Experiment 3, and, when questioned at the end of this experiment, none of these 13 subjects who were in both experiments indicated that they recognized the sequences. Stimuli and procedure. The auditorily presented sequences were generated as in Experiment 2. For each condition, the trials were the same as Experiment 3, except that, for the harmonic conditions, the chords and their suffixes were presented an octave lower in pitch. The procedure was the same as Experiment 3.
Results-Experiment 4
Similar to the findings in Experiments 2 and 3, there was better recall for harmonic than melodic materials, as indicated by a significant main effect of stimulus-set type [F(1,14) = 5.93, p = .029]. Figure 6 shows the position × stimulus-set type interaction [F(9,126) = 8.75, p < .001], where the data are collapsed across conditions. As was found for visual presentation of these materials (see Fig. 4), harmonic materials were recalled better than melodic materials in the last seven positions and melodic materials were better recalled in the first two positions.
Auditorily Presented Melodic Patterns
FIG. 6. Mean percentage correct (collapsed across conditions) plotted as a function of serial position in Experiment 4. [Horizontal axis: POSITION.]
FIG. 7. Mean percentage correct at each serial position for the three different conditions for the melodic (a) and harmonic (b) stimulus sets in Experiment 4. The control condition is represented by solid lines (unfilled circles) and the two suffix conditions are represented by dashed lines (filled squares for the note-suffix condition and filled triangles for the chord-suffix condition). [Panels: (a) MELODY, (b) HARMONY; horizontal axis: POSITION.]
For melodic sequences, there were interactions between condition and position at both primacy and recency portions of the curve. An analysis of the first four serial positions indicated a significant position × condition interaction [F(6,84) = 2.98, p < .001]. As shown in Fig. 7a, overall recall in the first four serial positions is best for the note-suffix condition and worst for the control condition, although this discrepancy is attenuated at Position 4. An analysis of the final four serial positions revealed a significant position × condition interaction [F(6,84) = 5.53, p < .001]. Scheffé tests were carried out between conditions, using difference scores between Positions 10 and 8 as the dependent variable. There was a significant difference between the control and the chord-suffix conditions (p < .01) as well as between the control and note-suffix conditions (p < .05), but no reliable differences were observed between the two suffix conditions.
Auditorily Presented Harmonic Patterns
For harmonic materials, no differences were observed at primacy positions, but there were significant differences among the conditions at recency positions. An analysis of the first four positions showed only a significant effect of position [F(3,42) = 5.84, p = .002], whereas an analysis of the final four positions resulted in a significant position × condition interaction [F(6,84) = 2.92, p = .012]. As occurred for visual presentation of harmonic materials, there was a trend showing that a chord suffix selectively impaired auditorily presented harmonic materials. Scheffé tests were carried out using difference scores between Positions 10 and 8 as the dependent variable. Significant differences were observed between the control and chord-suffix conditions (p < .05) but not for the control vs note-suffix comparison nor for the two suffix conditions.
Discussion-Experiment 4
In general, harmonic materials were better recalled than melodic materials. This finding is in accord with results of Experiments 2 and 3. This overall harmonic superiority might result from the reduced vocabulary size for harmonic materials, which allows more correct guesses. This conclusion is substantiated by the observations that (1) melodic materials were recalled better at early list positions in both Experiments 3 and 4 (when a serial-recall task is employed, it would be expected that fewer guesses occur at earlier list positions because these items are the first ones to be recalled), and (2) fewer blanks were observed for the harmonic compared to the melodic conditions.
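The guessing account can be made concrete with a small calculation. The sketch below assumes a pure-guessing model in which a subject picks uniformly among the available options, with an optional variant in which the subject exploits the no-successive-repetition constraint by ruling out the preceding item; this model and the function name `chance_rate` are illustrative assumptions, not an analysis reported in the study.

```python
from fractions import Fraction

def chance_rate(vocab_size, exploit_no_repeat=False):
    """Probability that a uniform random guess is correct.

    If exploit_no_repeat is True, the guesser excludes the previous item,
    which cannot recur in the next position.
    """
    options = vocab_size - 1 if exploit_no_repeat else vocab_size
    return Fraction(1, options)

if __name__ == "__main__":
    for name, size in (("harmonic (3 chords)", 3), ("melodic (5 notes)", 5)):
        print(name, chance_rate(size), chance_rate(size, exploit_no_repeat=True))
```

Under either variant a blind guess succeeds more often with the three-chord vocabulary (1/3 or 1/2) than with the five-note vocabulary (1/5 or 1/4), which is the direction the guessing explanation requires.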
For melodic sequences, the tone and chord suffixes had comparable interference effects. In contrast, for harmonic sequences, there was a trend showing selective impairment by the chord suffix. As occurred in Experiment 3, the chord-suffix condition resulted in a negative recency slope. As with the results for visual presentation of these materials (Experiment 3), there appeared to be an asymmetry in which a chord suffix had a greater interference effect on melodic sequences than did a note suffix on harmonic sequences. As suggested earlier, a note suffix may not impair memory for chords because this single note is insufficient to mask the four-note representation in memory. But, since a chord contains a series
of notes, this stimulus would be expected to eliminate the melodic representation. A final point of interest is that, in both Experiments 3 and 4, the control condition had smaller recency effects than occurred in the first two experiments. Although differences may simply be due to different subject populations, it is most likely that subjects experienced interference in the control condition because they anticipated a suffix. That is, since five of the six conditions contained suffixes, they may have expected a suffix to occur in the remaining condition as well. Hence, an “anticipatory effect” may have elicited some interference in recall of the final few items. This explanation is supported by the observation that the subjects who were exposed to the control condition first had stronger recency effects than did subjects who had the control condition last.
Analysis of Experiments 3 and 4
To compare the results of Experiments 3 and 4 more directly, an analysis of variance was performed, using the data from the 13 subjects who participated in both experiments. As shown in Fig. 8, where the data are collapsed across conditions, recall was better for visual presentation in the first few serial positions and better for auditory presentation in the final few serial positions. There was a significant position × modality interaction [F(9,108) = 4.40, p < .001]. As shown in Fig. 9, this crossover effect between the modalities occurs in five of the six conditions. Although this result appears to be reliable, it is perhaps important to note that this position × modality interaction
[Figure 8 plot: percentage correct (y-axis) against serial POSITION 1-10 (x-axis).]
FIG. 8. Mean percentage correct (collapsed across conditions) plotted as a function of serial position for the 13 subjects who participated in Experiments 3 and 4.
[Figure 9 plot: six panels, including MELODY CONTROL and HARMONY CONTROL; percentage correct (y-axis) against serial POSITION 1-10 (x-axis).]
FIG. 9. Mean percentage correct at each serial position for the 13 subjects who participated in Experiment 3 and 4, for each of the six different conditions. Filled circles indicate results for auditory presentation and unfilled circles indicate results for visual presentation.
(due to a crossover somewhere in the middle of the list) is significant only for the melody-control condition [F(9,108) = 2.90, p = .004] and for the two chord-suffix conditions [F(9,108) = 3.19, p = .002 (for melody) and F(9,108) = 6.12, p < .001 (for harmony)].
Discussion-Experiments 3 and 4
This crossover between visual and auditory recall patterns for music has also been observed in Experiment 2 and in research reported by Roberts et al. (1984). This pattern has not been reported in studies of English language materials; in this case, the modality effect is due to differences occurring only in the recency portions of the curve. It is difficult to explain this pattern from a sensory position. Rather,
this result may be interpreted by assuming that representations for written musical materials entail both visual and auditory codes. The dual code for visual presentation of music may enhance recall at primacy portions of the curve relative to auditory presentation. Since a dual code would be likely to result in deeper processing (Craik & Lockhart, 1972) than would a single code (auditory presentation of music), primacy recall should be better in the former case. At final serial positions, there is an advantage for auditory presentation because the auditory image is stronger than occurs for visual presentation (i.e., auditory presentation of music has easier access to short-term memory than does visual presentation).
GENERAL DISCUSSION
Implications for Theories of Modality and Suffix Effects
Findings of the present study refute several theoretical accounts of modality and suffix effects. First, these results join a growing body of literature opposing PAS theory (Ayres et al., 1979; Broadbent & Broadbent, 1981; Campbell & Dodd, 1980, 1982; Engle, 1974; Hines, 1975; Hitch, 1975; Nairne & Walters, 1983; Phillips & Christie, 1977a, 1977b; Spoehr & Corin, 1978; Shand & Klima, 1981). In particular, since no sounds are involved for mouthing, lip-reading, signs, pictures, or written music, it is impossible for these recency advantages to be a by-product of the properties of a pre-short-term memory auditory store. Additionally, since written music is neither moving nor a primary language, results of this study refute suggestions that changing-state (vs static) information has privileged access to short-term memory (e.g., Campbell & Dodd, 1980) as well as explanations emphasizing privileged processing of primary language materials (Shand & Klima, 1981).
One position that might help to explain many recent findings in the literature, including the present study, incorporates the visual sensory theory of Broadbent and Broadbent (1981). According to this explanation, little recency occurs for written language since its rapid categorization minimizes the role of a visual sensory store. In contrast, features found in abstract pictures are more likely to be maintained in visual iconic memory. Thus, by extending PAS theory, the majority of recent findings could be explained within a general sensory framework. Perhaps written music is maintained in iconic memory and this information supplements that in short-term memory, thus resulting in a strong recency effect. The pictorial representation of music may explain why sensory representations are stronger for written music than for written words.
One problem for this sensory explanation is that it cannot explain recency findings for mouthed items; in this case, there is no visual information to be maintained in iconic memory. However, one might propose a tactile sensory store for these items.
An alternative explanation that can account for results of this study as well as other results in the literature relies on earlier short-term memory theories. Differential recall for written music and language might occur because written music has faster, more direct access to its corresponding auditory representation in short-term memory. There are two reasons why this might be the case. First, as suggested earlier, musicians normally hear the corresponding sounds as they are reading the notes, whereas we read language materials in the absence of sound. Good recency recall for mouthed and lip-read materials supports this view as well. These materials are also usually accompanied by sounds and may consequently involve more automatic recoding into their acoustic form in short-term memory. Second, it may be that written words involve a more abstract symbol system than does written music. For example, the letters of words do not embody the corresponding speech sounds. But, for written music, the notes move higher as the frequencies rise in pitch. Thus, the symbol system used in music may elicit auditory representations more easily than is the case for language.
There is converging evidence that written music entails an auditory representation. First, the present research has demonstrated that recall functions are similar for auditory and visual presentation of musical materials for both serial recall and suffix experiments. Thus, it seems reasonable that similar mechanisms are used for the two musical modalities. Second, for both language and musical materials the auditory superiority is reduced when slower presentation rates are used. This suggests that, at slower rates, the extra time may be used to effect the translation of these items to their auditory representations in short-term memory.
Third, findings of better recall for written music at primacy positions and for heard music at recency positions support this view as well, as is discussed later. And finally, it has been shown that both auditory tone and visual note suffixes interfere with recency recall of written melodic materials (Roberts et al., 1983). Since an auditory tone affected recall of written materials, it seems reasonable to assume that perceptions of tones and notes share a representation. The visual suffix, however, showed greater interference, perhaps because its representation had more in common with visual notes than did the auditory tones. This could be because notes are represented both visually and auditorily and the tone affected only the auditory component. If written notes produce auditory representations, the “timbre” of written list and written suffix items would be identical, whereas the auditory suffix might produce a different timbre from the auditory-imaged notes. Thus the visual suffix would interfere more than the auditory suffix in auditory memory.
However, just as sensory theories have difficulty explaining results for mouthed materials, the auditory-image hypothesis has difficulty explaining results for abstract pictures and signed items. There is no reason to expect that either of these materials would involve auditory representations in short-term memory. Additional caution should be exercised, since the present experimental procedure did not ensure auditory encoding of these written symbols. For example, although subjects were asked not to hum prior to recall of the materials, the experimenter was unable to control for inadvertent humming (by, for example, measuring exhalation or inhalation of air). Although, in the harmonic case, it would be impossible to hum three notes simultaneously, it may be that, in this case, the musicians generated some unusual listening strategies. A second problem is that, apart from the first experiment (in which subjects wrote the notes on a musical staff), subjects were asked to recall the musical items by their letter names. Although recall patterns did not differ for these two different response modes, it may be that, in the latter experiments, subjects recoded these materials into their letter names rather than their auditory musical representations. In order to rule out this possibility, it would be necessary to show that, for musicians, recall patterns differ for written letters relative to musical representations of these same letter names.
Implications for General Musical Issues
Apart from the relevance of the present results for these general issues in psychology, these experiments also relate to processing of musical materials per se. The finding of primacy and recency effects for both melodic and harmonic sequences suggests that harmony and melody are processed similarly and perhaps that all aspects of music have access to some special auditory representation for trained musicians. It would be informative to see if these patterns occur for rhythmic patterns as well. This might tell us whether pitch information is a necessary condition to access this auditory memory. Further interesting results for these musical materials were the differential effects of note and chord suffixes on recency recall. Whereas note and chord suffixes interfered about equally with recall of melodic sequences, there was a trend suggesting that a chord suffix selectively interferes with recall of harmonic sequences. This asymmetrical interference suggests that, although harmonic and melodic sequences appear to be processed similarly, they have different perceptual properties. Interference of a chord suffix on melodic processing might be expected since a chord contains a series of notes, each of which is physically similar to the melodic list items. The very small impact of a note suffix on recall for harmonic materials probably occurred because this single note is insufficient to erase or mask the four-note chordal representations in memory. In this case, although the note might overwrite one of the chordal components, the remaining information would likely be sufficient to achieve identification.
A final point of interest for musical memory pertains to the observed crossover of the recall patterns for the two modalities at middle list positions. One explanation for this result is that, unlike heard music, written music entails a dual representation (i.e., represented both visually and auditorily). The dual code for written music may enhance recall at early list positions, whereas the stronger auditory representation for heard music enhances recall at recency positions. Of course, this explanation relies on two assumptions. First, it is necessary to accept the view that earlier list items entail deeper processing (such as their transference to long-term memory) than occurs for recency items (e.g., Broadbent, Vines, & Broadbent, 1978). Thus, visually presented music is remembered better at primacy positions because of its dual representation; heard music is only encoded auditorily, so it has weaker representations at primacy positions. A second assumption is that there is something special about auditory imagery that enhances recall at recency portions, whether it is the result of an actual sound or due to a more central form of imagery. It may be that an actual musical sound leads to a stronger representation in short-term memory because it has more direct access to the store. This explanation has support in results reported by Tzeng and Wang (1983). They observed serial recall patterns for Chinese logographs and English words, by Chinese and English speakers, respectively.
Whereas the typical modality effect was observed for both groups of subjects (better recency recall for auditory than visual presentation), Chinese subjects recalled primacy and middle list items consistently better when they were presented visually rather than auditorily. For the native English speakers, visual and auditory presentation had virtually identical recall at primacy and middle positions. This crossover effect for Chinese subjects is similar to results for music except that, for music, both written and heard materials had strong recency effects. Tzeng and Wang concluded that this visual enhancement at early and middle positions occurred for Chinese script because Chinese logographs have a direct relationship to the morphemes themselves, rather than simply to their pronunciation (as occurs in English). Consequently, written Chinese logographs involve more visual memory than does processing of alphabetic scripts and this additional information (visual, in addition to semantic) aids recall for early positions. Different results at recency positions for music and Chinese script might occur because processing of written music involves both visual and auditory information. The auditory representation for
written music could then aid recency recall, whereas the dual representation aids recall at earlier list positions.
Summary
In conclusion, modality and suffix experiments for musical materials have revealed some striking and unique serial recall patterns. Although these results are similar to findings for language, there are some notable differences. These results are interpretable in terms of theories that emphasize either sensory processing or auditory encoding. Although these positions have more success than PAS theory in explaining some of the more recent findings in the literature, neither of these views accounts for all of the results.
REFERENCES
Ayres, T. J., Jonides, J., Reitman, S., Egan, J. C., & Howard, D. A. (1979). Differing suffix effects for the same physical suffix. Journal of Experimental Psychology: Human Learning and Memory, 5, 315-321.
Baddeley, A. D. (1968). How does acoustic similarity influence short-term memory? Quarterly Journal of Experimental Psychology, 20, 249-264.
Bever, T. G., & Chiarello, R. J. (1974). Cerebral dominance in musicians and nonmusicians. Science (Washington, D.C.), 185, 137-139.
Broadbent, D. E., Vines, R., & Broadbent, M. H. P. (1978). Recency effects in memory as a function of modality of intervening events. Psychological Research, 40, 5-14.
Broadbent, D. E., & Broadbent, M. H. P. (1981). Recency effects in visual memory. Quarterly Journal of Experimental Psychology, 33A, 1-15.
Brown, R. (1980). Music and language. Proceedings of the National Symposium on the Applications of Psychology to the Teaching and Learning of Music. Reston, VA: Music Educators National Conference.
Brown, R., & Herrnstein, R. J. (1981). Icons and images. In N. Block (Ed.), Imagery. Cambridge, MA: MIT Press.
Burns, E. M., & Ward, W. D. (1974). Categorical perception of musical intervals. Journal of the Acoustical Society of America, 55, 456(A).
Burns, E. M., & Ward, W. D. (1978). Categorical perception-Phenomenon or epiphenomenon: Evidence from experiments in the perception of melodic musical intervals. Journal of the Acoustical Society of America, 63, 456-468.
Campbell, R., & Dodd, B. (1980). Hearing by eye. Quarterly Journal of Experimental Psychology, 32, 85-99.
Campbell, R., & Dodd, B. (1982). Some suffix effects on lipread lists. Canadian Journal of Psychology, 36, 509-514.
Campbell, R., Dodd, B., & Brasher, J. (1983). The sources of visual recency: Movement and language in serial recall. Quarterly Journal of Experimental Psychology, 35A, 571-587.
Cole, R., Coltheart, M., & Allard, F. (1974). Memory of a speaker’s voice: Reaction time to same or different-voiced letters. Quarterly Journal of Experimental Psychology, 26, 1-7.
Conrad, R. (1964). Acoustic confusions in immediate memory. British Journal of Psychology, 55, 75-84.
Conrad, R., & Hull, A. J. (1968). Input modality and the serial position curve in short-term memory. Psychonomic Science, 10, 135-136.
Cooper, L. A., & Shepard, R. N. (1973). Chronometric studies of the rotation of mental images. In W. G. Chase (Ed.), Visual Information Processing. New York: Academic Press.
Craik, F. I. M. (1969). Modality effects in short-term storage. Journal of Verbal Learning and Verbal Behavior, 8, 658-664.
Craik, F. I. M., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11, 671-684.
Crowder, R. G. (1971). Waiting for the stimulus suffix: Decay, delay, rhythm, and readout in immediate memory. Quarterly Journal of Experimental Psychology, 23, 324-340.
Crowder, R. G., & Morton, J. (1969). Precategorical acoustic storage (PAS). Perception & Psychophysics, 5, 365-373.
Crowder, R. G., & Raeburn, V. P. (1970). The stimulus suffix effect with reversed speech. Journal of Verbal Learning and Verbal Behavior, 9, 342-345.
Darwin, C. J., Turvey, M. T., & Crowder, R. G. (1972). An auditory analogue of the Sperling partial report procedure: Evidence for brief auditory storage. Cognitive Psychology, 3, 255-267.
Delis, D., Fleer, J., & Kerr, N. H. (1978). Memory for music. Perception & Psychophysics, 23, 215-218.
Deutsch, D. (1981). The processing of structured and unstructured tonal sequences. Perception & Psychophysics, 28, 381-389.
Engle, R. W. (1974). The modality effect: Is precategorical acoustic storage responsible? Journal of Experimental Psychology, 102, 824-829.
Fell, J. C., & Laughery, K. R. (1969). Short-term memory: Mode of presentation for alphanumeric information. Human Factors, 11, 401-405.
Foreit, K. G. (1976). Short-lived auditory memory for pitch. Perception & Psychophysics, 19, 368-370.
Gregory, A. H. (1978). Perception of clicks in music. Perception & Psychophysics, 24, 171-174.
Halpern, A. R., & Bower, G. H. (1982). Musical expertise and melodic structure in memory for musical notation. American Journal of Psychology, 95, 31-50.
Henmon, V. A. C. (1912). The relationship between mode of presentation and retention. Psychological Review, 19, 79-96.
Hines, D. A. (1975). Immediate and delayed recognition of sequentially presented abstract shapes. Journal of Experimental Psychology: Human Learning and Memory, 1, 501-505.
Hitch, G. J. (1975). The role of attention in visual and auditory suffix effects. Memory & Cognition, 3, 501-505.
Jackendoff, R., & Lerdahl, F. (1980). A deep parallel between music and language. Unpublished manuscript.
Jahnke, J. C., & Perez, W. A. (1981). Semantic encoding and the stimulus prefix effect. Journal of Verbal Learning and Verbal Behavior, 20, 470-477.
Laughery, K. R., & Fell, J. C. (1969). Subject preferences and the nature of information stored in short-term memory. Journal of Experimental Psychology, 82, 192-197.
Laughery, K. R., & Pinkus, A. L. (1966). Short-term memory: Effects of acoustic similarity, presentation rate and presentation mode. Psychonomic Science, 6, 285-286.
Lerdahl, F., & Jackendoff, R. (1977). Toward a formal theory of tonal music. Journal of Music Theory, 21, 111-171.
Liberman, A. M., Harris, K. S., Hoffman, H. S., & Griffith, B. C. (1957). The discrimination of speech sounds within and across phoneme boundaries. Journal of Experimental Psychology, 54, 358-368.
Locke, S., & Keller, L. (1973). Categorical perception in a non-linguistic mode. Cortex, 9, 355-368.
Mathews, M. V. (1969). The technology of computer music. Cambridge, MA: MIT Press.
Morton, J., Crowder, R. G., & Prussin, H. A. (1971). Experiments with the stimulus suffix effect. Journal of Experimental Psychology, 91, 169-190.
Morton, J., & Holloway, C. M. (1970). Absence of a cross-modality “suffix effect” of short-term memory. Quarterly Journal of Experimental Psychology, 22, 167-176.
Murdock, B. B., Jr. (1966). Visual and auditory stores in short-term memory. Quarterly Journal of Experimental Psychology, 18, 206-211.
Murdock, B. B., Jr., & Walker, K. D. (1969). Modality effects in free recall. Journal of Verbal Learning and Verbal Behavior, 8, 665-676.
Murray, D. J. (1965). Vocalization-at-presentation and immediate recall with varying presentation rates. Quarterly Journal of Experimental Psychology, 17, 47-66.
Nairne, J. S., & Walters, V. L. (1983). Silent mouthing produces modality- and suffix-like effects. Journal of Verbal Learning and Verbal Behavior, 22, 475-483.
Norman, D. A. (1966). Acquisition and retention in short-term memory. Journal of Experimental Psychology, 72, 369-381.
Phillips, W. A., & Christie, D. F. M. (1977a). Components of visual memory. Quarterly Journal of Experimental Psychology, 29, 117-133.
Phillips, W. A., & Christie, D. F. M. (1977b). Interference with visualization. Quarterly Journal of Experimental Psychology, 29, 637-650.
Posner, M. I. (1967). Short-term memory systems in human information processing. Acta Psychologica, 27, 267-284.
Reineke, T. (1981). Simultaneous processing of music and speech. Psychomusicology, 1, 58-75.
Roberts, L. A., Millen, D. L., Palmer, C., & Tartter, V. C. (1983). Modality and suffix effects in memory for music. Journal of the Acoustical Society of America, 74, 522(A).
Rowe, E. J., & Rowe, W. G. (1976). Stimulus suffix effects with speech and nonspeech sounds. Memory & Cognition, 4, 128-131.
Salis, D. L. (1980). Laterality effects with visual perception of musical chords and dot patterns. Perception & Psychophysics, 28, 284-292.
Schiano, D. J., & Watkins, M. J. (1981). Speech-like coding of pictures in short-term memory. Memory & Cognition, 9, 110-114.
Shand, M. A., & Klima, E. S. (1981). Nonauditory suffix effects in congenitally deaf signers of American Sign Language. Journal of Experimental Psychology: Human Learning and Memory, 7, 464-474.
Shepard, R. N., & Metzler, J. (1971). Mental rotation of three-dimensional objects. Science (Washington, D.C.), 171, 701-703.
Sherman, M. F., & Turvey, M. T. (1969). Modality differences in short-term serial memory as a function of presentation rate. Journal of Experimental Psychology, 80, 335-338.
Siegel, J. A., & Siegel, W. (1977). Categorical perception of tonal intervals: Musicians can’t tell sharp from flat. Perception & Psychophysics, 21, 399-407.
Sloboda, J. A. (1976). The effects of item position on the likelihood of identification by inference in prose reading and music reading. Canadian Journal of Psychology, 30, 228-235.
Sloboda, J. A. (1977). Phrase units as determinants of visual processing in music reading. British Journal of Psychology, 68, 117-124.
Sperling, G., & Spielman, R. G. (1970). Acoustic similarity and auditory short-term memory. In D. A. Norman (Ed.), Models of human memory (pp. 152-202). New York: Academic Press.
Spoehr, K. T., & Corin, W. J. (1978). The stimulus suffix effect as a memory coding phenomenon. Memory & Cognition, 6, 583-589.
Tzeng, O., & Wang, W. (1983). The first two R’s. American Scientist, 71, 238-243.
Wong, R., & Blevings, G. (1966). Presentation modes and immediate recall in children. Psychonomic Science, 5, 381-383.
(Accepted August 5, 1985)