The sound of vowels and consonants in immediate memory


JOURNAL OF VERBAL LEARNING AND VERBAL BEHAVIOR, 10, 587-596 (1971)

The Sound of Vowels and Consonants in Immediate Memory¹

ROBERT G. CROWDER
Yale University, New Haven, Connecticut, 06510

When auditory and visual presentation are compared in tests of immediate ordered recall of natural-language stimuli such as digits or letters, the typical finding is better performance on the auditory than on the visual lists. This difference is located in late portions of the serial position curve, and the advantage can be partially removed by presenting a redundant stimulus suffix. In the present research this type of comparison was extended to lists of consonant-vowel (CV) syllables varying either in the initial consonant or in the terminal vowel. The finding was that memory for vowels closely matched the results with digits or letters whereas memory for consonants showed neither the basic modality effect nor the suffix effect. Thus, the special memory system associated with auditory presentation may be said to "contain" vowels but not consonants.

¹ This research was supported by NSF Grant GB 15157. Thanks are due Alvin M. Liberman for many suggestions concerning the present research strategy and for making available the facilities of Haskins Laboratories for preparation of stimuli. Additionally, Ruth S. Day and James E. Cutting helped indispensably in developing the stimulus syllables and James Antognini in testing Ss. © 1971 by Academic Press, Inc.

Although memory for such highly familiar symbols as digits is properly considered one of the "higher" mental processes, it is nonetheless influenced by such bodily functions as the sensory channel, visual or auditory, over which the materials to be remembered were received. Modality differences have been found with great regularity in situations where adults receive a rapidly presented list of items for immediate ordered recall. As Washburn (1916, p. 74) noted, auditory presentation of such lists is better than visual presentation in that it (auditory input) results in a stronger recency effect. Writers with more recent evidence at their disposal have found no reason to modify Washburn's conclusion (Corballis, 1966; Conrad & Hull, 1968; Crowder, 1970; Murdock, 1967; Murray, 1966).

Crowder and Morton (1969) proposed a theory of acoustic memory which is adequate to deal with the modality effect. The main point of their argument was that although visual and auditory input eventually lead to comparable forms of representation in a central short-term memory (STM), there are logically earlier, more peripheral, sensory memories, one for vision and one for audition, which carry information in prelinguistic form. Crowder and Morton called the peripheral auditory memory Precategorical Acoustic Storage (PAS) and proposed that it holds information for at least a few seconds, dramatically longer than the visual precategorical store is believed to persist. The PAS system is compromised by limited space capacity as well as limited time capacity, however, and this limitation on space has observable consequences for immediate ordered recall tasks. As a result of the space limitation, each item in a vocally presented list degrades the representation of previous items in PAS, presumably in a first-in, first-out manner. Since only the last few items in a series are followed by few or by no new inputs, the PAS effect (i.e., recency) is evident only for these items; that is, only list members which are free of retroactive displacement from their companion list members are expected to show the advantage of extra information in PAS storage. Thus, two closely related observations, the conspicuous recency effect with auditory presentation and the modality effect when auditory and visual presentation are compared, are compatible with the PAS mechanism.

Besides these two properties, another characteristic which distinguishes auditory from visual memory is the effect of a spoken "stimulus suffix" occurring at the end of the input list (Crowder & Morton, 1969). When the voice delivering the memory items adds a redundant word, such as "zero," following the last input-list position, there occurs a serious impairment in recall for the last few items in the series, as compared with a control condition where the extra event is not a spoken word but rather a tone or buzzer. According to Crowder and Morton the stimulus suffix effect reflects the displacement property of PAS. The advantage auditory presentation ordinarily receives over visual (long-lasting PAS information) is partially destroyed because the suffix acts to displace from PAS traces of the terminal items. A test of this theory is to enquire whether a vocal suffix will impair recall if attached to a visually presented stimulus list. If the suffix has its effect in a prelinguistic or peripheral store then it ought not to exert an effect on information coming from another transmission line. Morton and Holloway (1970) have shown this to be true. A spoken suffix has a selective disruptive effect on terminal positions following vocal presentation of the stimulus but it does not have such a selective effect following visual presentation of the stimulus. This difference constitutes another differential characteristic of auditory presentation in contrast to visual.

The experiments which have established the recency, modality, and suffix effects have without exception involved natural-language stimuli such as digits or letters.
The present research extends this literature to artificial stimuli which permit a more detailed analysis of what type of information contributes to these effects, that is, analysis of what type of information is held in PAS. The major finding of the studies reported here is that PAS does not hold a faithful representation of the complete acoustic speech signal but rather carries information about vowel sounds as opposed to the sounds of stop consonants. The first four studies show that the recency and suffix effects occur when vowels are being remembered but not when stop consonants are being remembered, and the last two studies show that the modality effect occurs with vowels but not consonants.

EXPERIMENT I

There are two main possibilities regarding the completeness of information in PAS. It could be that representation in a slow-decay PAS occurs for all information which is received through the ears, in which case the phenomenon summarized by Washburn would always be evident when auditory and visual inputs are compared. The other possibility is that PAS is a part of the system providing special handling to some but not all acoustic or phonetic features. Features such as stress, pitch, and duration, among the suprasegmental sources of information, and various phoneme classes, among segmental features, are those to which PAS might be differentially sensitive. Ordinary speech stimuli such as digits generally tend to vary in a correlated way on all of these features at once, making it difficult to decide whether their superior recall and vulnerability to suffix reflect only certain features or rather a nonselective advantage of all auditory stimuli. The digits provide several examples of this covariation. For example, vowels alone are sufficient to discriminate all but five and nine. The present strategy is to construct artificial stimulus vocabularies where the items differ from one another on only a single feature. Experiments where S must remember values of this feature should permit inference on the selectivity of PAS, according to whether or not results typical of auditory memory are obtained.

It should be noted that whereas the present discussion has been from the context of the PAS model, other attempts have been made to account for the modality effect. The question at hand (i.e., which features of acoustic stimuli are involved in the effect) is no less crucial to these other theories (e.g., Sperling & Speelman, 1970, p. 179). All existing models have so far remained completely silent on this issue.

Method

The to-be-remembered stimuli in Experiment I were 50 lists of seven syllables from a vocabulary of three, /ba, da, ga/. The selection and ordering of the seven syllables in each stimulus was completely random. The stimulus syllables were prepared on the Haskins Laboratories Parallel Resonance Synthesizer according to routines worked out by Mattingly (1968). Each syllable was exactly 300 msec long. The three stimulus items in the vocabulary differed from one another only in their initial consonant. All stimulus tapes were assembled automatically with the aid of the Haskins Laboratories Honeywell PDP digital computer. The use of synthetic speech and computer-generated stimulus tapes permitted precise control over timing relations in the experiment and absolute constancy in the acoustic properties of each stimulus syllable wherever it occurred.

Four additional syllables were prepared for use as stimulus suffixes in the various conditions of Experiment I: /goo/ and /li/ ("Goh" and "Lee") were the suffixes, each produced once in the same "voice" as the stimulus lists and each produced once also in a more highly pitched "voice." This variation in suffix identity owes to the original purpose of this experiment; since this original purpose is of secondary interest here, data will be reported only for the condition in which /goo/ was used as the suffix in the same voice as the stimulus. There is nothing in the discarded data to deny or qualify the conclusions reached here. The control condition was established by a fifth item, a 1000 Hz tone, which was presented in the same temporal location as the suffix syllables. All stimuli lasted 300 msec, except a synthesized "ready" signal which preceded each trial.

Each of 15 college-age volunteers received the same fixed-order list of 50 seven-syllable memory series, divided into five blocks of ten trials with rest periods between. The five blocks of trials corresponded to the five experimental conditions distinguished by the nature of the suffix event (viz., tone, /goo/, /goo/ in high voice, /li/, and /li/ in high voice). Subgroups of three subjects heard these five conditions in five orders as determined by a Latin square. Since the memory stimuli were given always in the same order, changing the order of conditions across subgroups accomplished balancing of conditions against both practice and individual stimuli.

On each trial, the list of seven syllables, followed by the suffix, was given at a rate of two/sec, that is, with 200 msec of silence separating adjacent memory items. The same interval separated the last memory item from the tone or syllable used as the suffix. Each trial began with the word "ready" followed, after 1000 msec, by the onset of the first to-be-remembered syllable. Fifteen seconds separated the onset of the last item (the suffix) from the signal initiating the next trial. During this 15-sec recall period Ss wrote their answers on sheets providing seven spaces opposite each trial number. Since the vowel sounds in the syllables were redundant, Ss were told to write the single consonant letters, b, d, and g, on their sheets rather than spelling out "bah," etc.

Prior to the experiment, there was an intelligibility check on the stimulus syllables. First, several examples of each were presented slowly; later, two blocks of 14 identification trials were run in which the three syllables occurred in random order, with 1 sec allowed for a written identification. On the second block of 14 identifications, 11 of the 15 Ss had perfect performance, 2 made one error, 1 made two errors, and 1 made four errors. Thus, although Ss found the seven-item stimuli difficult to understand, according to their own verbal reports and complaints, they nonetheless found the syllables easily recognizable individually.
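The list construction and counterbalancing described in this Method can be sketched in code. The following Python fragment is a minimal illustration, not the procedure actually used: the text does not say which Latin square was chosen, so a simple cyclic square is assumed, and the condition labels, function names, and random seed are ours.

```python
import random

# Condition labels are illustrative names for the five suffix events in the text.
CONDITIONS = ["tone", "go", "go_high", "li", "li_high"]
VOCABULARY = ["ba", "da", "ga"]


def cyclic_latin_square(items):
    """Return an n x n square in which each item occurs once per row and column.

    Assumption: a simple cyclic square; the paper does not specify which was used.
    """
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def make_memory_lists(n_lists=50, length=7, seed=0):
    """Draw each syllable independently and at random from the vocabulary."""
    rng = random.Random(seed)  # the seed is arbitrary, for reproducibility only
    return [[rng.choice(VOCABULARY) for _ in range(length)] for _ in range(n_lists)]


def assign_condition_orders(n_subjects=15, subgroup_size=3):
    """Give each subgroup of three subjects one row of the square as its block order."""
    square = cyclic_latin_square(CONDITIONS)
    return {subject: square[subject // subgroup_size] for subject in range(n_subjects)}


lists = make_memory_lists()
orders = assign_condition_orders()
```

Because the 50 lists are presented in the same fixed order to every subject, rotating the condition order across subgroups places every condition on every block of ten lists, which is the sense in which conditions are balanced against practice and against individual stimuli.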

Results

The result of Experiment I is shown in the upper left-hand panel of Fig. 1, where error probability is shown as a function of serial position separately for the (tone) control condition and for the /goo/ suffix condition. Two aspects of these data are noteworthy. First, there was no recency effect, and second, there was no selective effect of the suffix upon the late serial positions. In fact the suffix seems to have affected performance consistently on all positions except the last.

FIG. 1. The relation between error probability and serial position in Experiments I-IV. In each case the parameter is the type of redundant event, a tone or a verbal syllable, occurring immediately after the last to-be-remembered item. [Figure not reproduced. Four panels: Experiment I (n = 15), Experiment II (n = 10), Experiment III (n = 12), and Experiment IV (n = 12), each plotting error probability against serial position for the suffix and tone-control conditions.]

EXPERIMENTS II AND III

Both of the generalizations reached from Experiment I, that is, the absence of the recency and suffix effects, indicate that acoustically presented series from the vocabulary /ba, da, ga/ give results typical of visual stimuli rather than auditory stimuli. However, both of these generalizations also constitute affirmation of the null hypothesis. In view of the latter point unusual responsibility attaches to replication, which was the purpose of Experiments II and III.

Method

Experiment II was identical to Experiment I in all but the following respects: (a) ten new Ss were used; (b) order of conditions was not balanced against practice; and (c) the presentation rate was slowed down to one item/sec rather than two/sec. The major change was thus the slower presentation rate. The intent was to make the task easier, since Ss' difficulty in identifying the syllables was considered a possible source of the unusual results of Experiment I. Experiment III was exactly the same as Experiment II except that (a) 12 new Ss were used, (b) the rate of presentation was restored to two/sec, and (c) the length of each memory series was shortened from seven items to six. Using shorter series was intended as another check on whether the difficulty of the task in Experiment I accounted for the findings.
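The timing differences among Experiments I-III can be made explicit with a small calculation. This sketch assumes that the presentation rate refers to onset-to-onset spacing and that the suffix follows the last item at the same spacing, as stated for Experiment I; the 700-msec silent gap it yields for the one/sec rate is our inference, not a figure given in the text. The function name and parameters are hypothetical.

```python
def trial_schedule(n_items, items_per_sec, item_ms=300):
    """Return (item_onsets_ms, silent_gap_ms, suffix_onset_ms).

    Times are measured from the onset of the first memory item.
    Assumes an onset-to-onset rate and a suffix that keeps the same spacing.
    """
    soa_ms = round(1000 / items_per_sec)       # onset-to-onset interval
    gap_ms = soa_ms - item_ms                  # silence between adjacent events
    onsets = [k * soa_ms for k in range(n_items)]
    suffix_onset = n_items * soa_ms
    return onsets, gap_ms, suffix_onset


exp1 = trial_schedule(n_items=7, items_per_sec=2)   # Experiment I
exp2 = trial_schedule(n_items=7, items_per_sec=1)   # Experiment II
exp3 = trial_schedule(n_items=6, items_per_sec=2)   # Experiment III
```

Under these assumptions the last memory item is separated from the suffix by 200 msec of silence at the two/sec rate and by 700 msec at the one/sec rate, so the slower rate also lengthens the delay between the series and the suffix.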

Results

The results of Experiments II and III are shown in the upper right-hand and lower left-hand panels of Fig. 1, respectively. The data are somewhat noisier in Experiment II than in Experiment I; however, the results are substantially the same in showing neither a recency effect nor a selective effect of the suffix. The overall effect of the suffix was smaller in Experiment II than in Experiment I, Mann-Whitney U = 102, p < .05, either because there was a longer delay separating the memory series from the suffix (Crowder, 1969) or because conditions were confounded with practice in Experiment II. However, slower presentation did not affect performance in the control condition. For Experiments I and II combined, performance on the last two serial positions was statistically indistinguishable (i.e., no recency effect) and for the last three positions there was no statistically significant difference between the control and suffix conditions. As the lower left-hand panel of Fig. 1 shows, Experiment III gave comparable results.

It could be objected that although the intelligibility test yielded nearly perfect performance, the strong phonological similarity among the three stimulus syllables made identification during rapid stimulus input difficult, thus placing limits upon how much additional effect a suffix could have. Indeed, there were some subjects in Experiments I and II who turned in error rates of 70 and 80%. Therefore, the 13 best subjects (i.e., those with the lowest overall error rates on the control condition) from the two studies combined were examined separately. For the seven positions, the error probabilities in the control condition were .10, .13, .22, .19, .36, .38, and .34; for the suffix condition, .22, .22, .33, .34, .44, .45, and .41. Thus, these individuals, who found the task overall relatively easy, did not produce a suffix effect (on the last items) or a recency effect in the control condition (the drop from .38 to .34 was not significant).

EXPERIMENT IV

To this point it has been shown that when stimuli consist of syllables differing only in their initial stop consonants, results we have come to expect with perfect regularity from acoustic presentation of conventional stimuli do not occur. The tempting conclusion is that there is "something special" about these consonant letters which disqualifies them from the usual advantage associated with auditory input. This conclusion would be premature at present, however, because there are a great many other differences between Experiments I-III of the present report and those experiments in the literature which do show the recency and suffix effects. For example, the present research employs a vocabulary restricted to three items, rather than ten or twenty-six; the items are made from synthetic rather than real speech; the timing of input is absolutely rigid metrically, allowing none of the (perhaps involuntary) pausing which might be characteristic of naturally spoken stimuli; and there are no differential suprasegmental features in the synthetic stimuli, either within syllables or across series of syllables. The purpose of Experiment IV was to confirm or to eliminate the possibility that the results of Experiments I-III came from such confounding factors.

Method

The method of Experiment IV was identical to that used in Experiment I except that the three syllables used to make the series differed from one another only in their terminal vowel sound rather than in their initial consonant sound. The three syllables were /gæ, ga, gʌ/ (close to the vowel sounds in "gap, got, gut"). The suffix was either a tone (control) or the syllable /ba/. These two experimental conditions were balanced against practice across the 50 trials by alternating blocks of ten trials (tone-ba-tone-ba-tone). In writing their recall Ss were allowed to use any letters or symbols they wanted in order to represent the three syllables so long as they used a consistent system across the experiment. Time was allowed during the pretest for them to settle on such a system. Twelve new Ss from the same source were tested. Thus, Experiment IV was essentially a replica of the earlier studies except in one feature of interest: the nature of what S had to remember.

Results

The lower right-hand panel of Fig. 1 shows that the result of Experiment IV differed markedly from those of the earlier three studies. Basically, it appears that if vowels are the distinguishing feature of the stimulus vocabulary one might as well be using digits or letters, since the recency effect and suffix effect are restored. In the control condition there was a statistically significant improvement on the last serial position as opposed to the second-to-last (Wilcoxon T = 9.5, p < .05). Over the first four positions the suffix had a nonsignificant effect (U = 33.0); however, on the last three positions there was a statistically significant suffix effect, U = 7.5, p < .05.
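The contrast between this selective suffix effect and the consonant experiments can be checked by simple arithmetic on the error probabilities reported earlier for the 13 best subjects of Experiments I and II. The sketch below only differences those published values position by position; the variable names are ours, and no new data are introduced.

```python
# Error probabilities for serial positions 1-7, copied from the text
# (13 best subjects, Experiments I and II combined).
control = [0.10, 0.13, 0.22, 0.19, 0.36, 0.38, 0.34]
suffix = [0.22, 0.22, 0.33, 0.34, 0.44, 0.45, 0.41]

# Suffix cost at each position: suffix-condition errors minus control errors.
suffix_cost = [round(s - c, 2) for c, s in zip(control, suffix)]

early_cost = sum(suffix_cost[:4]) / 4   # mean cost over positions 1-4
late_cost = sum(suffix_cost[4:]) / 3    # mean cost over positions 5-7

# Recency in the control condition: change in errors from position 6 to 7.
recency_drop = round(control[5] - control[6], 2)

print(suffix_cost)
print(round(early_cost, 3), round(late_cost, 3), recency_drop)
```

The per-position costs come out as .12, .09, .11, .15, .08, .07, and .07, averaging about .12 over the first four positions and about .07 over the last three. With consonant-varied syllables the suffix therefore raised errors across the whole list and, if anything, less at the end, which is the opposite of the late-position pattern just reported for vowels.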

DISCUSSION OF EXPERIMENTS I-IV

From the evidence presented so far it appears that reception of stimulus information through the ears is not a sufficient condition for obtaining the recency effect and the suffix effect because these failed to occur when stop consonants were the memory items. The possibility that something other than the nature of the vocabulary used (its size, the method of presentation) was responsible for the results of the first three studies was eliminated by the last study, where vowels were being retained. It seems likely on the basis of these findings that results previously assigned to auditory presentation result from the normal presence of certain types of information in auditory stimuli, perhaps vowels. This inference would be strengthened considerably by a simple direct comparison of auditory and visual presentation showing the usual modality effect with vowel stimuli and none with consonant stimuli. Experiments V and VI were designed for this objective.

EXPERIMENTS V AND VI

In Experiment V Ss performed always under conditions of visual presentation and written recall. The experimental comparisons were based on a two-by-two scheme where one factor was whether S was asked to read the visual stimuli silently or aloud during presentation and where the other factor was whether, during written recall, S was asked to vocalize his output or to remain silent. This factorial combination of input and output vocalization was carried out on two separate groups of Ss, one of which saw series of seven syllables differing in consonants ("Bah, Dah, Gah") and the other of which saw identical stimuli using syllables differing only in vowels ("Boo, Bee, Bih"). In Experiment VI, included for comparison purposes, Ss saw letters of the alphabet under the same four vocalization conditions.

Murray (1966) performed a similar experiment using eight-letter stimuli. His main finding was that vocalized input led to considerably better performance than silent input and that this difference increased directly with serial position. The effect of recall vocalization was considerably less clear in Murray's study. By an ordered recall criterion there was no difference between silent and vocal recall following silent presentation; however, following vocal presentation there was a statistically significant advantage of silent over vocal recall. The locus of this output vocalization effect (following auditory input) was the last two serial positions. The original purpose of Experiment VI was to check for this interaction between input and output vocalization. Although the output vocalization findings will be reported in full, the main present interest is in verifying the earlier conclusion about vowels and consonants in PAS. The implication of that conclusion is that visual and auditory input (covert and overt vocalization) should differ for stimuli in which vowels must be remembered, just as with letters of the alphabet, but that there should be no difference when stop consonants are being remembered.
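The two-by-two scheme described above yields four vocalization conditions, which the Method that follows assigns to blocks by a balanced Latin square. As a hedged illustration, the sketch below enumerates the conditions and builds one standard balanced (Williams-type) square for four treatments. The paper does not report the actual square used, so this particular construction, like the labels and function names, is an assumption.

```python
from itertools import product

# The four conditions as (input mode, output mode) pairs. Labels are illustrative.
CONDITIONS = list(product(["silent input", "vocal input"],
                          ["silent recall", "vocal recall"]))


def balanced_latin_square(n):
    """Williams-type square for an even number of treatments.

    Each treatment occurs once per row and column, and each ordered pair of
    distinct treatments appears in adjacent positions exactly once.
    """
    if n % 2:
        raise ValueError("this construction is only for an even number of treatments")
    first, low, high = [0], 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    return [[(x + shift) % n for x in first] for shift in range(n)]


square = balanced_latin_square(len(CONDITIONS))
# One row per subgroup of subjects: the order of conditions over the four blocks.
block_orders = [[CONDITIONS[i] for i in row] for row in square]
```

A square of this kind balances not only the position of each condition but also which condition immediately precedes which, a reasonable property when carry-over between vocalization instructions is a concern.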

Method

Each of 32 Yale undergraduates recalled the same list of 60 seven-syllable stimuli, arranged in four blocks of 15. Half of the Ss saw these stimuli as seven-item series of the (written out) syllables "Bah, Dah, Gah" and half of the Ss saw these stimuli as series of the syllables "Bee, Boo, Bih." They were the same stimuli in the sense that trials and series had identical statistical properties whatever the vocabulary used to realize them. The stimuli were based on a table of random numbers, with no constraints on the selection. Thus, only over the stimulus set as a whole were frequencies, alternations, etc., balanced. The four blocks of 15 stimuli corresponded to the four conditions resulting from factorial combination of input vocalization and output vocalization. Subgroups of eight Ss received these four conditions in orders based on a balanced Latin square; changing the order of conditions while the actual stimuli remain fixed results in perfect balancing of conditions against order effects and against practice.

On every trial of the experiment the seven-item series was exposed on a screen for 2 sec, the items written in a stair-stepped manner from upper left to lower right. Also, on all trials S was instructed to write his recall for the series immediately after the slide went off. Answer sheets were provided giving seven spaces opposite each trial number. The sound of the slide change mechanism served as a ready signal, approximately 1/2 sec before display of the stimulus. Following stimulus offset S had 17 sec in which to recall the series before a new trial was initiated. Instructions emphasized both ordered recall (left-to-right with no backtracking) and attention to the appropriate vocalization instructions. Before each block of trials specific instructions were given covering the vocalization condition to be in effect for that block.

Experiment VI, using letters as stimuli, was similar in all aspects of design and execution except (a) 36 different Ss were tested, (b) each S received 48 trials, divided into blocks of 12, and (c) the stimulus vocabulary was C G H J K L M Q R S T X, from which eight letters were randomly (excluding runs and alphabetic sequences) selected for stimuli.

FIG. 2. The relation between error probability and serial position in Experiments V and VI. The three panels are distinguished by the stimulus vocabulary used. In all cases stimuli were presented visually and recall was written. The parameter for each graph is the combination of input and output vocalization which was added to this basic arrangement. [Figure not reproduced. Each panel plots error probability against serial position for the four combinations of vocal or silent presentation with vocal or silent recall.]

Results

Shown in Fig. 2 are the results of Experiments V and VI, with separate vocalization comparisons for each of the three vocabularies. Note that comparison of recall probability between the eight-letter stimuli (Experiment VI) and the seven-syllable stimuli (Experiment V) is not permissible, since stimulus length, vocabulary size, and vocabulary identity are all confounded. However, the patterns of vocalization effects can legitimately be compared as a function of vocabulary. The main result in Fig. 2 is that vocalization of both types affects letters and vowel-varied syllables in a quite similar manner but that performance on consonant-varied syllables is independent of input vocalization.

Statistical tests verified this conclusion. In Experiment VI, involving letters, a 2 × 2 × 8 analysis of variance was performed, based on input vocalization, output vocalization, and serial position. The results showed statistically significant (p at least less than .05) main effects of input vocalization, of output vocalization, and of serial position, with interactions between each of the vocalization factors and serial position but no other significant effects. Thus, for natural-language stimuli, performance is better if input is vocalized and better if output is silent, both effects increasing with serial position. The data from Experiment V were analyzed in a 2 × 2 × 2 analysis of variance, based on stimulus vocabulary (vowels vs. consonants), input vocalization, and output vocalization. The results showed statistically significant main effects of vocabulary, of input vocalization, and of output vocalization. The statistically significant interactions were between input vocalization and vocabulary, and the
triple interaction among the two types of vocalization and the vocabulary. The interactions involving vocabulary are of greatest interest since they bear on the original question of whether the modality effect would be evident with vocalized stop consonants. Figure 3 is a convenient representation of the net morality effect--the combined advan)F-...I

• LETTERS

this is of course the major finding of the present study. Figure 4 is analogous to Fig. 3 except that the net effect (in this case a negative effect on performance) of output vocalization is under consideration. With letters there was a sharp advantage of silent output which increased with serial position, though not so strongly as the advantage of vocal input. Vowel-varied

(CGHdKLMQRSTX)

0 VOWELS

,... I_

(1300, BEE, BIN)

• LETTERS

(CGHJKLMQRSTX)

B

CONSONANTS (BAH, DAH, GAH) •

0 VOWELS

~',,,

+20

4-

'~ z

N _.1

2O

_

(BOO,

BEE,

BIH)

o CONSONANTS (BAH, DAH, GAH)

0

Q. Z~--

I

1

I

I

I

I

I

I

1

2

5

4

5

6

7

8

SERIAL

POSITION

FIG. 3. The advantage of vocalized (over sdent) presentation as a function of serial position in Experiments V and VI, with stimulus vocabulary the parameter. Scores are the algebraic difference between the mean recall probabilities in the two conditions with vocalized inputs and the two conditions with silent inputs.

tage of vocalized over silent input disregarding the output mode--across serial positions with vocabulary the parameter. Even though the data for letters came from a different experiment and from longer series than the data for vowel-varied syllables, Fig. 3 shows that the modality effect was quite comparable numerically in the two cases. In other words, the well known advantage of auditory presentation was not altered by choosing a vocabulary of only three syllables so long as the small vocabulary required vowel discrimination. On the other hand, there was no advantage whatever of auditory presentation when the stimuli required discrimination of stop consonants. With relation to Experiments I-IV

~

I 1

I 2

I 5

SERIAL

I 4

I 5

I 6

I 7

i 8

POSITION

Fie. 4. The advantage of silent (over vocalized) recall as a function of serial position in Experiments V and VI, wlth stimulus vocabulary the parameter. Scores are the algebraic difference between the two conditions with vocalized outputs and the two conditions with silent outputs.

syllables produced a roughly comparable advantage for silent output. In this case the consonant-varied syllables produced a somewhat ambiguous outcome. For five of the seven positions consonants gave better performance with silent than vocal recall. To further examine the possibility of an outputvocalization effect with consonants two scores were computed for each of the 16 Ss who received consonant-varied syllables; one score was his mean error frequency on the two conditions where output was silent and the other score was his mean error frequency on the two conditions where output was vocal. There was a statistically significant difference (T = 28, p = .038) favoring silent recall. Thus, whereas input vocalization effects disappeared

VOWELSAND CONSONANTSIN IMMEDIATEMEMORY when stimuli were changed from varied-vowel to varied-consonant syllables the output vocalization effects persisted. This result was reflected in the nonsignificant interaction between output vocalization and vocabulary ( F = 1.3). Discussion The main result was that when visual and auditory presentation were compared directly under the conditions of Experiment V vowels as memory stimuli behaved just about like letters of the alphabet in Experiment VI whereas stop consonants (at least b, d, and g) showed no modality effect at all. This confirms and extends the earlier results in this report in as complete detail as one could wish. Vowels show all the properties of auditory presentation: (a) the recency effect, (b) the suffix effect (upon late positions), and (c) the advantage of auditory presentation over silent. Consonants show none of these. Two possible interpretations of this result will be presented below in the General Discussion. It is worth noting that certain inherent risks exist in procedures such as those of Experiment V, where the source of auditory input to PAS is S's own voice. He could, for example, easily subvert a nominal vocabulary of only consonant discriminations by making slightly different vowel sounds for each of the consonants. Or, similarly, he could pronounce certain CV syllables louder or with different stress than others, thus reintroducing the type of covariation among acoustic dimensions of stimuli which motivated using artificial vocabularies in the first place. For this reason Experiment V is considered a strong test of the vowel-consonant difference. The other results of Experiments V and VI, which were more specific to the issue of output vocalization than to the main theme of this article, were not so uniformly conclusive. There was a clear failure to replicate the findings of Murray (1966) concerning an interaction between input and output vocalization. This means that output vocalization


probably does not affect performance through a displacing action on the contents of PAS, as Crowder and Morton (1969) had thought, since in that case larger effects should have occurred following vocal input than silent input. Another basis for concluding that output vocalization effects do not derive from the same type of mechanism as input vocalization effects is the fact that a change in vocabulary (consonants to vowels) led to the complete disappearance of input effects but not of output effects. One possible mechanism for the output effects is response interference deriving from having a subsidiary vocalizing task added to the written recall task (Routh, 1971).

GENERAL DISCUSSION

The present experiments were intended to extend the Crowder-Morton model of PAS by determining whether information from speech is stored veridically, as in a tape recording, or whether only certain features are held. The results obtained permit rejection of the tape-recorder model. Furthermore, the data indicate that vowels receive some form of representation in PAS while voiced stop consonants receive none. Although stimuli varying only in their initial stop consonants were received aurally in the present experiments, the results showed they might as well have been presented visually.

Empirically, these outcomes permit prediction of the degree to which auditory and visual input will differ (or the size of a recency effect), provided one knows the degree to which the to-be-remembered items depend, for their distinctiveness, upon stop consonants. It is interesting to note that consonant letters which have been termed "acoustically confusable" on the basis of visual memory-confusion studies (e.g., Conrad, 1964) are also letters which, when pronounced, comprise consonant-vowel syllables differing only in their consonant sounds ("bee, see, dee, jee, etc."). Indeed, a recent experiment by


Smallwood and Tromater (1971) shows a dramatically smaller recency effect on that subgroup of letters than on letters which differ in terminal vowel sounds as well. R. A. Cole² has obtained a similar difference in the recency effect of consonant and vowel sounds in auditory immediate memory.

Two possibilities underlying the vowel-consonant difference come readily to mind. One is that registration of information in PAS simply requires some time for "burning in," just as some kinds of visual after-images result more reliably from a prolonged gaze than from a fleeting glimpse. The acoustic cues for distinguishing the stop consonants are established by frequency sweeps lasting only around 50 msec (Liberman, 1970), while the cues for vowels are steady-state and ordinarily (definitely in the present case) last much longer.

Perhaps more interesting is the possibility that the different auditory-memory results found with vowels as opposed to stop consonants are a manifestation of laterality differences in the perception of speech (Shankweiler & Studdert-Kennedy, 1970). Recent data on lateralized speech-perception effects indicate that whereas the dominant hemisphere (left) receives such "highly encoded" phonemic information as stop consonants best (i.e., a right-ear effect), other stimuli such as vowels and music are best received by the nondominant hemisphere (left-ear effect). The really tempting hypothesis is that PAS is a property of the nondominant (right) hemisphere. Careful choice of artificial stimuli in future research ought to permit inference on which of these two hypotheses is the more valuable.

² Personal communication, November, 1970.

REFERENCES

CONRAD, R. Acoustic confusions in immediate memory. British Journal of Psychology, 1964, 55, 75-84.

CONRAD, R., & HULL, A. J. Input modality and the serial position curve in short-term memory. Psychonomic Science, 1968, 10, 135-136.

CORBALLIS, M. C. Rehearsal and decay in immediate recall of visually and aurally presented items. Canadian Journal of Psychology, 1966, 20, 43-51.

CROWDER, R. G. Improved recall for digits with delayed recall cues. Journal of Experimental Psychology, 1969, 82, 258-262.

CROWDER, R. G. The role of one's own voice in immediate memory. Cognitive Psychology, 1970, 1, 157-178.

CROWDER, R. G., & MORTON, J. Precategorical acoustic storage (PAS). Perception and Psychophysics, 1969, 5, 365-373.

LIBERMAN, A. M. The grammars of speech and language. Cognitive Psychology, 1970, 1, 301-323.

MATTINGLY, I. Synthesis by rule of general American English. Supp. to Status report on speech research. Haskins Laboratories, 1968.

MORTON, J., & HOLLOWAY, C. M. Absence of a cross-modality "suffix" effect of short-term memory. Quarterly Journal of Experimental Psychology, 1970, 22, 167-176.

MURDOCK, B. B., JR. Auditory and visual stores in short-term memory. Acta Psychologica, 1967, 27, 316-324.

MURRAY, J. O. Vocalization-at-presentation and immediate recall, with varying recall methods. Quarterly Journal of Experimental Psychology, 1966, 18, 9-18.

ROUTH, D. A. Independence of the modality effect and amount of silent rehearsal in immediate serial recall. Journal of Verbal Learning and Verbal Behavior, 1971, 10, 213-218.

SHANKWEILER, D. P., & STUDDERT-KENNEDY, M. Hemispheric specialization for speech perception. Journal of the Acoustical Society of America, 1970, 48, 579-594.

SMALLWOOD, R. A., & TROMATER, L. J. Acoustic interference with redundant elements. Psychonomic Science, 1971, 22, 354-356.

SPERLING, G., & SPEELMAN, R. G. Acoustic similarity and auditory short-term memory: Experiments and a model. In D. A. Norman (Ed.), Models of human memory. New York: Academic Press, 1970. Pp. 151-202.

WASHBURN, M. F. Movement and mental imagery. Boston: Houghton Mifflin Company, 1916.

(Received May 14, 1971)