Journal of Memory and Language 40, 153–194 (1999) Article ID jmla.1998.2620, available online at http://www.idealibrary.com on
Prosodic Facilitation and Interference in the Resolution of Temporary Syntactic Closure Ambiguity

Margaret M. Kjelgaard
E. K. Shriver Center and University of Massachusetts-Boston

and

Shari R. Speer
University of Kansas

Subjects listened to sentences with early closure (e.g., When Roger leaves the house is dark) or late closure syntax (e.g., When Roger leaves the house it’s dark) and one of three prosodies: cooperating (coinciding prosodic and syntactic boundary), baseline (phonetically neutralized prosodic boundary), and conflicting (prosodic boundary at a misleading syntactic location). Prosodic manipulations were verified by phonetic measurements and listener judgments. Four experiments demonstrated facilitation in speeded phonosyntactic grammaticality judgment, end-of-sentence comprehension, and cross-modal naming tasks: Sentences with cooperating prosody were processed more quickly than those with baseline prosody. Three experiments showed interference: Sentences with conflicting prosody were processed more slowly than those with baseline prosody. All experiments demonstrated a processing advantage for late closure structures in the conflicting and baseline conditions, but no differences between syntactic types in the cooperating condition. Cross-modal naming results showed early syntactic effects due to both high-level and intermediate-level prosodic boundaries. We argue that the initial syntactic structure assigned to an utterance can be determined by its prosodic phonological representation. © 1999 Academic Press

Key Words: prosody; parsing; syntax; ambiguity; ToBI.
This research was supported in part by NIMH Grants R29 MH51768-01 and NIMH T32 MH19729 to Northeastern University and NIMH Grant R29 MH51768-02 to the University of Kansas. Portions of the results reported here were presented at the CUNY Sentence Processing conference in 1993. The authors of this paper have contributed equally to this work. The order of authorship was determined by the fact that Experiments 3 and 4 were conducted in partial fulfillment of the first author’s doctoral dissertation. Our thanks to Gary Dell, Susan Garnsey, Nancy Soja, Joanne Miller, Kate Dobroth, Rene Schmauder, Amy Schafer, Maria Slowiaczek, Wayne Murray, Keith Rayner, and an anonymous reviewer for helpful comments on earlier versions of this paper. Address correspondence to Margaret M. Kjelgaard, Center for Research on Developmental Disorders, Eunice Kennedy Shriver Center, 200 Trapelo Road, Waltham, MA 02452-6319. E-mail: [email protected].

In early psycholinguistic studies, researchers were at pains to show that listeners use syntactic phrase structure rather than acoustic prosodic phrasing to determine the constituency of a spoken sentence as it is understood (Fodor & Bever, 1965; Garrett, Bever, & Fodor, 1966; but see Wingfield & Klein, 1971). In contrast, contemporary models of sentence processing have begun to include a role for prosodic structure—a role beyond that of emphasizing a particular word or adding an affective connotation. For the purposes of this paper, prosody refers to stress, rhythm, and intonation in spoken sentences. Prosodic structure is formally described in autosegmental phonological theory and has measurable acoustic–phonetic correlates, including variation in fundamental frequency, spectral information, amplitude, and the relative duration of sound and silence. For many spoken sentences, prosodic structure is the only information available to resolve ambiguity at other levels of linguistic analysis (e.g., compare the spoken forms of the sentences “What’s that ahead in the road?” and “What’s that, a HEAD in the ROAD?”; examples attributed to K. Church).
In studies of listener judgments, prosodic information has repeatedly been shown to determine the final meaning assigned to many syntactic and semantic ambiguities (e.g., Lehiste, 1973; Streeter, 1978; Price, Ostendorf, Shattuck-Hufnagel, & Fong, 1991; Wales & Toner, 1979; but see Albritton, McKoon, & Ratcliff, 1996). For sentences with temporary syntactic ambiguity, prosody that is consistent with the correct syntactic parse has produced on-line processing advantages compared to prosody that is inconsistent (Slowiaczek, 1981; Speer, Kjelgaard, & Dobroth, 1996; Marslen-Wilson, Tyler, Warren, Grenier, & Lee, 1992; Warren, Grabe, & Nolan, 1995; but see Murray & Watt, 1995; Watt & Murray, 1996; Murray, Watt, & Kennedy, 1998). Researchers have suggested that units of prosodic structure act as processing units in human sentence comprehension (Slowiaczek, 1981; Carroll & Slowiaczek, 1987), that prosodic information contributes to the final structuring of an initial syntactically determined parse (Marcus & Hindle, 1990; Pynte & Prieur, 1996), and that prosodic and nonprosodic factors may enter a cue-trading relation in the process by which syntactic and semantic analyses are constructed (Beach, 1991; Stirling & Wales, 1996). Some computational models of natural language parsing also use prosody (“chunks,” intonational boundaries, or intonational phrasing) to inform parsing decisions (Abney, 1990; Marcus & Hindle, 1990; Steedman, 1990, 1991). We have argued that an abstract prosodic representation maintains spoken sentences in immediate memory during comprehension and that this phonological information is available to inform the syntactic parsing process (Speer, Shih, & Slowiaczek, 1989; Speer, Crowder, & Thomas, 1993; Kjelgaard, 1995; Speer et al., 1996). Consistent with this view, Schafer (1997) has claimed that the listener constructs a full prosodic representation, which in turn provides the initial domains for syntactic structuring and semantic analyses. This type of perspective assumes the listener’s mental representation of prosody during sentence comprehension is of the sort formally described in
autosegmental and metrical theory (e.g., Hayes, 1985; Liberman & Prince, 1977; Nespor & Vogel, 1986; Selkirk, 1984, 1986) and in intonation theories in this framework (e.g., Ladd, 1980; Liberman & Pierrehumbert, 1984; Pierrehumbert, 1980; Beckman & Pierrehumbert, 1986). Briefly, autosegmental and metrical theories represent the rhythm and timing of a sentence as a pattern of weak and strong beats (unstressed and stressed syllables in English), hierarchically arranged according to a tree or grid structure to form constituents, e.g., feet, prosodic words, phonological phrases, and intonation phrases. Autosegmental theories of intonation represent the tune of a sentence as a temporal series of tonal events. Some tones are directly associated with stressed syllables (these are pitch accents and may be composed of one or two tones; the inventory for English includes high (H) and low (L) tones and their combinations, H*, L*, L+H*, L*+H, and H+!H*). Other tones are aligned at the right edge of a phrase (these are the boundary tone, L% or H%, a single tone associated with the intonational phrase, and the phrase accent, L- or H-, a single tone associated with the phonological phrase and realized over the material between the final pitch accent in the phrase and the following boundary tone). The theory we use here specifies the correspondence between tones and phrasing for two levels of prosodic phrase, so that the phonological phrase (PPh)1 is delimited by a phrase accent, and the intonational phrase (IPh) is delimited by a boundary tone. Every utterance has at least one IPh, each IPh has at least one PPh, and each PPh has at least one pitch accent.

1 In phonological theory, there remains some question about the particular nature and number of phrasal levels between the prosodic word and the intonational phrase. However, the majority of theories posit at least one phrasal level between these two, called either “intermediate phrase (iph)” (see Beckman, 1996) or, as we have it here, “phonological phrase (PPh)” (see Selkirk, 1986, 1995). Although there are theoretical distinctions between these two terms, we do not mean to distinguish between them here.

During sentence comprehension, information about the metrical and intonational structure of an utterance is recovered by the listener as part
of the phonological input from the speaker. On our view, the prosodic representation is recognized at the very earliest stages of processing, with rhythmic and tonal constituency becoming available in parallel with the segmental phonological information that is being organized for word recognition (see Gordon, 1988, for a similar view of the recovery of coarse and fine-grained phonetic information during speech recognition). Although the prosodic representation is abstract and hierarchical like other linguistic structures, there are several reasons to believe that it could be available before them. Prosodic constituents are syllable-based and delimited by tones, so that prosodic structure can be identified without the fine-grained analysis necessary to distinguish among phonetic segments. Identification of prosodic structure does not rely on prior lexical access, as syntactic, thematic, and semantic analyses do, so that prosodic structure requires fewer levels of abstraction to be completed before a memory representation can be formed. Formal analyses of prosodic structure indicate that it is less complex than other linguistic structures: most theories of prosodic phonology restrict recursion, and most specify that prosodic constituents are “strictly layered,” so that constituents at each level of the hierarchy are exhaustively parsed into constituents at the immediately superordinate level (e.g., Nespor & Vogel, 1986; Selkirk, 1984, 1995; but see Ladd, 1986). These qualities imply a relatively limited number of attachment sites for incoming prosodic constituents and thus fewer local attachment ambiguities than occur in syntactic structure. Prosodic constituents correspond to constituents at other levels of linguistic analysis, so that in principle, information from the prosodic representation together with correspondence rules (e.g., for the mapping between prosodic phrasing and syntax) could substantially reduce the ambiguity of the incoming language signal. However, as many researchers have noted (e.g., Selkirk, 1984; Pierrehumbert, 1980; Price et al., 1991; Warren et al., 1995), the correspondence between prosody and other linguistic structures is complex. Thus a sentence with particular syntactic structure may be pronounced grammatically
with a variety of prosodies, just as a particular prosodic structure may be used to pronounce grammatically a variety of syntactic forms. A similar complexity concerns the mapping from the phonetic string to the prosodic representation, where a single pronunciation may be ambiguous between two prosodic structures and a single prosodic structure may have more than one phonetic implementation (Shattuck-Hufnagel & Turk, 1996; Beckman, 1996).2 This complexity has two implications that are particularly relevant here: (1) A well-formed prosodic rendition of a syntactically ambiguous sentence may or may not disambiguate it, and (2) an ambiguous phonetic sequence can provide a single pronunciation for two prosodic structures, each with a well-formed correspondence to a different syntactic structure. These two implications are used to develop the experimental conditions and materials below. On the one hand, the complexity of the correspondence between prosodic structure and syntactic structure has disadvantages for psycholinguistic experimentation. While some recent studies have shown prosodic disambiguation of syntactic ambiguity (e.g., Marslen-Wilson et al., 1992; Price et al., 1991), others using similar procedures and syntactic materials have produced only partial replication of the effects (e.g., Albritton et al., 1996; Watt & Murray, 1996). One likely reason for this lack of consistency is a difference in the prosodic structures used—the grammatical prosody in one study may be disambiguating, while a different, but still grammatical, prosody in another study is not. On the other hand, the complex correspondence between prosody and other structures can also work to our advantage: If we carefully specify the phonology and phonetics of our experimental materials (as we do their syntax), we can make use of the ambiguities to begin to develop a principled account of when and where particular prosodic entities influence the syntactic parsing process.

2 This state of affairs, where there is no single component at one level of linguistic analysis that corresponds uniquely to a single component at another level of linguistic analysis, has been called the “lack of invariance” problem, is typical of the speech signal, and is a long-standing topic of research on speech perception (e.g., Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967; Miller, 1990).
FIG. 1. Syntactic and intonational phrase structures for an early/late closure sentence pair in three prosodic conditions.
The following experiments use sentences with temporary ambiguities to examine the effects of prosodic structure on syntactic processing. We used early/late closure sentence pairs like When Roger leaves the house is/it’s dark. The syntactic structure of the sentences is shown in Fig. 1 and is temporarily ambiguous at the attachment of the noun phrase the house, which can serve either as the subject of the second clause the house is dark (an early closure analysis, Fig. 1, top) or as the direct object of the verb leaves (a late closure analysis, Fig. 1, bottom). Syntactic information that can resolve the ambiguity occurs immediately after it, with the word is or it’s. Studies of reading have repeatedly found a processing disadvantage for early closure sentences like these when they are compared to their late closure counterparts. The
relatively longer processing times for early closure sentences have been attributed to an initial misanalysis of the sentence as a late closure structure, followed by restructuring (Frazier & Rayner, 1982; Frazier & Clifton, 1996; Ferreira & Henderson, 1989) or to lexically associated frequency information favoring the verb’s transitive analysis (MacDonald, Pearlmutter, & Seidenberg, 1995; Tanenhaus & Carlson, 1989). The strong view that prosodic structure is available to inform syntactic parsing decisions implies three specific claims: (1) When prosodic and syntactic constituent boundaries coincide, syntactic processing should be facilitated, (2) when prosodic boundaries occur at misleading points in syntactic structure, syntactic processing should show interference effects, and (3) processing difficulties associated with dis-preferred syntactic analyses (the syntactic “garden pathing” effects found in reading
studies) should disappear for sentences spoken with felicitous prosody. Previous studies have examined prosodic effects on temporary syntactic closure ambiguity. Slowiaczek (Slowiaczek, 1981; Carroll & Slowiaczek, 1987) compared end-of-sentence comprehension times for cooperating and cross-spliced conflicting prosody versions of three types of early and late closure sentences. She found faster processing times for sentences with cooperating prosodic and syntactic boundaries than for those with conflicting prosody and syntax, and an overall early closure processing disadvantage, but did not statistically compare early and late syntactic structures within the cooperating prosody condition. Warren, Grabe, and Nolan (1995) examined the effects of cooperating and conflicting prosodic boundaries in the context of stress-shift and lexical syntactic category ambiguity. They used a cross-modal naming task and presented sentences such as: Whenever parliament discusses Hong Kong problems )L- )H% versus Whenever parliament discusses Hong Kong )L- )H% problems, followed by a visual naming word resolving the syntactic ambiguity toward early closure, such as ARISE. They too found faster processing times for conditions with coinciding prosodic and syntactic boundaries than for those with conflicting boundaries. While these results are clearly consistent with our view, they do not independently address each of the specific claims above. The demonstration of faster processing for felicitous than for disrupted prosody could indicate either that cooperating prosody speeds syntactic ambiguity resolution or that conflicting prosody interferes, but it does not distinguish between these explanations. In addition, this comparison leaves open the possibility that longer processing times in conflicting prosody conditions are due simply to an unnatural degradation of the language input. In addition, the previous work does not provide a direct comparison of early and late closure structures with coinciding prosodic boundaries. Such a comparison is necessary if we are to claim that prosodic structure can do away with the “garden path” effects found in reading. In order to test predictions of facilitation and
interference, we developed three prosodic conditions: cooperating, baseline, and conflicting. The phrasal phonology for the experimental conditions is shown in Fig. 1. In the cooperating prosody conditions, the final H* pitch accent of the first phrase fell on the clause-final word, and an IPh boundary with a low phrase accent (L-) and low boundary tone (L%) occurred at the syntactic constituent boundary (in the figure, following leaves for early closure and house for late closure). In the conflicting prosody conditions, the final H* pitch accent, L- phrase accent, and L% boundary tone occurred at a misleading location in relation to the syntactic structure—the one consistent with the alternate parse (following house for early closure and leaves for late closure). The baseline prosody conditions contained an L+H* pitch accent on the subject noun and a low phrase accent (L-) at the appropriate syntactic boundary (following leaves for early closure and house for late closure).3 Because L- “spreads” backward to the most recent pitch accent, and because Fø remains at its current level until the next tonal event, the Fø contours for these two prosodies are quite similar. If the sentences are spoken with minimal lengthening of the phrase-final word, the phonological distinction between the two baseline sentences can be phonetically neutralized, so that their tunes and rhythms are substantially the same.

3 This pronunciation is one that would be appropriate when the subject is narrowly focused or contrastively stressed and might be used when the speaker’s meaning is that the house is more likely to be locked when Roger leaves it than when someone else does.

In order for the baseline condition to serve as an appropriate comparison condition, its prosodic structure should be one that is grammatical according to listener judgments, but gives no prosodic information relevant to the location of the temporarily ambiguous syntactic boundary. Therefore, we set three criteria for our baseline pronunciation: (1) it should be a single ambiguous phonetic string that is phonologically appropriate to either the late closure or the early closure version of a sentence, (2) it should be equally acceptable to listeners when used to pronounce either syntactic structure, and (3) it
should be highly acceptable—as acceptable as the cooperating prosody—for each syntactic structure. We assessed these criteria by pretesting and conducting phonetic analyses, detailed below.

To test for effects of facilitation and inhibition, we used three experimental paradigms: speeded phonosyntactic grammaticality judgment, end-of-sentence comprehension, and cross-modal naming. The first two measures are relatively “off-line” tasks and are used to examine the effects of prosodic structure in the presence of disambiguating syntactic information that is typically available during normal sentence processing. If we find prosodic effects even when disambiguating syntactic information is available nearby, such evidence will weaken the argument that prosodic effects are minimal when other linguistic factors are available (Murray & Watt, 1996; Albritton et al., 1996; Sevald & Trueswell, 1997). In addition, these tasks allow us to present intact spoken sentences, undisrupted by truncation and uninterrupted by a concurrent task, and to compare speeded responses in tasks that do and do not include a metalinguistic component. The cross-modal naming task provides complementary advantages to the first two. While it relies on truncated sentence stimuli and a somewhat unnatural language processing task that requires integration of spoken language and text, it also provides a more “on-line” measure and allows us to examine processing at the point of syntactic disambiguation.

NORMATIVE STUDY, PHONETIC ANALYSES, AND PRETESTING

The sentence materials used in the first three experiments were selected from a larger group of sentences on the basis of a normative study, a series of phonetic analyses, and three pretests. The normative study assessed the transitivity bias of verbs used to construct early/late closure sentences. The phonetic analyses included measuring the duration of words and silences and examining fundamental frequency (Fø) contours. Spoken sentence items were pretested to equate pronunciation acceptability across the cooperating and baseline conditions, to guarantee
intelligibility, and to check that matched sentences in the baseline conditions were pronounced with the same phonetic pattern.

Normative Study

The verbs used in these sentences were chosen from a larger set of verbs tested in a sentence completion study. Twenty-nine Northeastern University undergraduate students participated for credit toward Introductory Psychology course requirements. They were given present tense, past tense, or -ing forms of 52 verbs preceded by subordinate clause fragments (e.g., When Frank performs) and asked to write a sentence completion for each fragment. For each verb eventually used in the main experiments, either transitive or intransitive responses accounted for the majority of fragment completions. A transitivity bias score was constructed for each verb by subtracting the percentage of intransitive completions from the percentage of transitive completions (a schematic version of this computation is sketched below). Transitivity bias scores for each item are presented in the Appendix.

Phonetic Analyses

Durational measures and fundamental frequency (Fø) analyses were completed for the temporarily ambiguous region of the sentences in order to confirm that they had been pronounced with the intended prosody. We matched early/late closure sentence pairs for length in words and syllables and for lexical-level stress pattern. We used the Tone and Break Indices (ToBI) system (Silverman, Beckman, Pitrelli, Ostendorf, Wightman, Price, Pierrehumbert, & Hirschberg, 1992; Beckman & Ayers, 1994), based on the intonational framework developed by Pierrehumbert (1980) and Beckman and Pierrehumbert (1986, 1988), to transcribe the phonological analyses of the prosodic structures in our materials. Intonation contours are transcribed on the basis of listening to the sound and viewing the speech waveform and a fundamental frequency plot. We used the set of high (H) and low (L) component tones on ToBI’s “tonal tier” and break indices 1–4 (where 4 is the strongest break) on the break indices tier (see examples in Fig. 2).
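The bias score is a simple difference of percentages; the sketch below (Python) makes the computation explicit. The verb names and completion counts are hypothetical illustrations, not the norming data.

```python
# Illustrative sketch of the transitivity bias score described above:
# bias = %transitive completions - %intransitive completions.
# The verbs and counts are hypothetical (29 completions per verb).

completions = {
    # verb: (transitive, intransitive, other/uncodable completions)
    "leaves": (18, 9, 2),
    "performs": (11, 16, 2),
}

def transitivity_bias(transitive, intransitive, other):
    total = transitive + intransitive + other
    pct_transitive = 100.0 * transitive / total
    pct_intransitive = 100.0 * intransitive / total
    return pct_transitive - pct_intransitive  # positive = transitive bias

for verb, counts in completions.items():
    print(f"{verb}: bias = {transitivity_bias(*counts):+.1f}")
```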
FIG. 2. Acoustic waveforms with word durations (ms), Fø (Hz), and ToBI transcription for an example early/late closure sentence pair with IPh-based cooperating and baseline prosody.
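For readers who prefer a compact summary of the annotations illustrated in Fig. 2 (and the structures in Fig. 1), the sketch below lists the intended pitch accent and boundary events for the example sentence in the cooperating and baseline conditions. It is a descriptive convenience only; the word-to-tone pairings follow the description in the text, and the data structure itself is ours.

```python
# ToBI-style summary of the intended prosodies for the example sentence
# "When Roger leaves the house is/it's dark" (cf. Figs. 1 and 2).
# A descriptive convenience for the reader, not part of the experimental software.

prosodies = {
    ("cooperating", "early closure"): {
        "pitch_accent": ("leaves", "H*"),      # final accent on the clause-final word
        "boundary": ("leaves", "L-L%", "4"),   # IPh boundary (break index 4) after the verb
    },
    ("cooperating", "late closure"): {
        "pitch_accent": ("house", "H*"),
        "boundary": ("house", "L-L%", "4"),    # IPh boundary after the ambiguous NP
    },
    ("baseline", "early closure"): {
        "pitch_accent": ("Roger", "L+H*"),     # accent on the subject; ambiguous region deaccented
        "boundary": ("leaves", "L-", "0/1"),   # phrase accent; break location phonetically ambiguous
    },
    ("baseline", "late closure"): {
        "pitch_accent": ("Roger", "L+H*"),
        "boundary": ("house", "L-", "0/1"),
    },
}

for (prosody, syntax), events in prosodies.items():
    print(prosody, syntax, events)
```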
Durations were measured using Sound Designer II software, and Fø contours with Signalyze software (Keller, 1994). Sentences were 16-bit digital sound files sampled at 22.1 kHz and recorded in a soundproofed room by a female speaker of American English trained in phonetics and phonology. The sentences were pronounced with two types of prosody, cooperating and baseline. For cooperating structures, the first syntactic clause boundary coincided with an intonational phrase boundary (IPh), including a low phrase accent, a low boundary tone, and a level 4 break. Phonetic indicators of the prosodic boundary included lengthening of the clause-final syllable, a clause-following silence, and a pitch discontinuity (or “reset”) between the end of the first clause and the beginning of the second. For baseline structures, neither potential syntactic boundary was marked by an IPh. The sentence was spoken with an L+H* pitch accent on the subject of the first clause and deaccentuation of the ambiguous region, such that the precise location of a low phrase accent (L-) with an associated level 1 break was phonetically ambiguous. In this region, the Fø was generally low and flat, and word durations were relatively short. Thus, early and late closure sentences in the baseline conditions had different underlying phonological representations, but very similar surface phonetic structures.4

4 Sentences in the baseline conditions had minimal lengthening and silence at phrase boundaries [we have transcribed the breaks throughout the ambiguous region as (0/1)]. In such cases, phonetic differences between L- and L-L% may be neutralized. Thus either transcription may be correct for these sentences in the temporarily ambiguous region (see discussion in Beckman, 1996). However, this does not change the indeterminacy of the boundary location for these sentences. For four items (numbers 1, 6, 8, and 10 in the Appendix), one or more of the content words in the temporarily ambiguous region carried an additional pitch accent followed by a phrase accent. These items’ paired baseline pronunciations nevertheless met our criteria for acceptability, intelligibility, and similarity.

Figure 2 shows early and late closure versions of the example sentence When Roger leaves the house it’s/is dark in the cooperating and baseline conditions. The figure includes the amplitude-by-time waveform, with durations marked for words and silences
in the syntactically ambiguous region, and the Fø contour with ToBI tone and break indices transcribed.

Conflicting structures were created by digitally cross-splicing the early and late closure cooperating sentence materials, so that a prosodic boundary occurred at a misleading location in syntactic structure. We chose to splice materials in order to preserve the information identifying the IPh boundaries across cooperating and conflicting conditions. Thus no separate phonetic analysis was necessary for these sentences.

For the sentences selected for use in the following experiments, analyses of duration and Fø showed minimal phonetic differences between early/late closure sentence pairs with baseline prosodies, but substantial differences between early/late closure pairs with cooperating prosodies. Phonetic measurements are summarized in Table 1. Duration measurements were compared for the main verb, the ambiguous noun phrase, the silence following the verb, the silence following the ambiguous noun phrase, and the total sentence. An analysis of variance showed significant effects of prosody, syntax, and measurement location, and all interactions were also significant (all Fs > 27, all ps < .0001). Planned comparisons were conducted for early versus late closure within cooperating and baseline conditions. Early versus late closure baseline prosody sentences showed no systematic differences in duration in the temporarily ambiguous region for any measure (all Fs < 1.1), with the exception of total sentence duration, where late closure sentences were longer than early closure sentences (F(1,15) = 10.8, p < .001), presumably due to the use of different disambiguating words in the two conditions (e.g., are versus they’re). In contrast, early versus late closure cooperating prosody sentences showed significant differences in duration for all measures (all Fs > 20, all ps < .0001) (see Kjelgaard, 1995, for additional detail). The average total durations of sentences in the cooperating and conflicting conditions were longer than those in the baseline conditions. (Sentences in the conflicting conditions were also longer in total duration than those in the baseline conditions: mean duration early closure conflicting = 2961 ms, mean duration late closure conflicting = 3077 ms.)
TABLE 1
Phonetic Measurements for Materials Used in Experiments 1, 2, and 3 (Truncated Versions Were Used in Experiment 3, See Text)

Mean durations (ms)

                              Verb          Silence 1     NP            Silence 2     Sentence
Cooperating prosody
  Early closure syntax        561 (27.3)    471 (25.5)    521 (24.9)    19 (6.9)      2967 (114.2)
  Late closure syntax         379 (26.0)    27 (8.0)      605 (24.4)    507 (37.7)    2908 (107.9)
Baseline prosody
  Early closure syntax        410 (22.2)    22 (6.5)      475 (26.9)    8 (3.6)       2301 (78.8)
  Late closure syntax         390 (24.8)    24 (7.3)      482 (25.8)    18 (5.5)      2346 (80.9)

Fø measures (Hz)

                              Mean minimum Fø                                     Fø range              Fø mean
                              V             NP            Disambiguating word    (ambiguous region)    (ambiguous region)
Cooperating prosody
  Early closure syntax        160.8 (2.6)   172.7 (2.6)   161.5 (2.3)            146–308               171.6 (3.6)
  Late closure syntax         181.5 (2.8)   158.4 (2.5)   174.9 (2.6)            146–300               173.9 (3.8)
Baseline prosody
  Early closure syntax        166.4 (3.0)   168.6 (3.7)   163.1 (2.9)            155–210               167.9 (3.8)
  Late closure syntax         174.4 (5.2)   168.4 (3.7)   165.3 (2.9)            154–222               168.1 (4.3)

Note. Standard error values are in parentheses.
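To make the durational pattern in Table 1 concrete, the sketch below uses the condition means (ms) from the table to compute the clause-final lengthening and the following silence at the intended boundary in the cooperating conditions, relative to baseline. The particular contrasts computed are our illustration, not analyses reported in the text.

```python
# Condition means (ms) from Table 1: verb, silence after the verb (sil1),
# ambiguous NP, silence after the NP (sil2). Standard errors omitted.
means = {
    ("cooperating", "early"): {"verb": 561, "sil1": 471, "np": 521, "sil2": 19},
    ("cooperating", "late"):  {"verb": 379, "sil1": 27,  "np": 605, "sil2": 507},
    ("baseline", "early"):    {"verb": 410, "sil1": 22,  "np": 475, "sil2": 8},
    ("baseline", "late"):     {"verb": 390, "sil1": 24,  "np": 482, "sil2": 18},
}

# Clause-final lengthening (cooperating minus baseline) and the pause at the
# intended boundary: after the verb for early closure, after the NP for late closure.
ec_lengthening = means[("cooperating", "early")]["verb"] - means[("baseline", "early")]["verb"]
ec_pause = means[("cooperating", "early")]["sil1"]
lc_lengthening = means[("cooperating", "late")]["np"] - means[("baseline", "late")]["np"]
lc_pause = means[("cooperating", "late")]["sil2"]

print(ec_lengthening, ec_pause)  # 151 471
print(lc_lengthening, lc_pause)  # 123 507
```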
Multiple Fø measurements were compared for these sentences, including (1) the absolute range of fundamental frequency across the temporarily ambiguous regions of all items in a condition (from the verb to the disambiguating word: e.g., leaves the house is/it’s), (2) the mean fundamental frequency in this region,5 and (3) the mean Fø minimum for the verb, the ambiguous NP, and the disambiguating word. Fø minima were chosen as descriptors of the low tones assigned in the temporarily ambiguous region for these sentences (see Kjelgaard, 1995, for additional detail).

5 Mean Fø was calculated using the FFT algorithm in Signalyze 3.12 software (Keller, 1994). Fø values were calculated for 10-sample windows across all periodic segments of the temporarily ambiguous regions.

For the first two measures, analysis of variance showed no significant differences between baseline early and late closure sentences (all Fs < 1.1). Comparison of the baseline to the cooperating conditions showed a lower mean Fø and a relatively restricted Fø range for the disambiguating region in the baseline sentences, consistent with deaccenting. The Fø range analysis also showed a lower range for low tones in the cooperating conditions than in the baseline conditions. This is consistent with the use of L% for clause-final positions in cooperating conditions, but L- for baseline conditions (Silverman et al., 1992; Beckman, 1996). Analysis of variance of Fø minima showed no significant differences between the two baseline conditions for the ambiguous NP and the
disambiguating word (both Fs < 1), but did show a slightly lower Fø minimum for the verb in the early closure baseline condition (F(1,17) = 7.63; p < .01).6 In contrast, Fø minima showed clear differences between early and late closure sentences in the cooperating condition. Clause-final words had significantly lower Fø minima than their nonfinal counterparts (for V, early closure lower than late, F(1,17) = 19.88, p < .0001; for NP, late closure lower than early, F(1,17) = 30.95, p < .0001). In addition, the Fø minima were significantly lower on the disambiguating word in the early closure condition, where it was clause-initial, than in the late closure condition, where it was clause-medial (F(1,17) = 15.45, p < .0001).

6 It is possible that the lower Fø minimum for this clause-final verb might be taken as a cue to early closure syntax. However, if this were so, it would predict easier processing for the early closure than for the late closure baseline conditions. Instead, our results will show significantly longer response times for early than for late closure baseline conditions. This indicates that if the slightly lower verb Fø did provide an indication of the early boundary, very little processing advantage accrued due to this cue.

PRETESTS

Intelligibility

Because the lexical differences between the two sentences were small, e.g., “is” versus “it’s,” “she’ll” versus “will,” sentences were pretested to be sure that critical words were clearly intelligible. Twenty subjects heard each sentence over loudspeakers and chose which of two phrases were contained in it. For the example sentences, they would have chosen either “door is locked” or “door it’s locked.” To be sure that subjects were not focusing on the critical words, experimental items were mixed with an equal number of controls where the choice was between other word types such as “book on the chair” or “book in the chair.” Sentences that produced more than two intelligibility errors (there were four such tokens) were rerecorded and retested to meet this criterion.

Baseline Uniformity

To insure that the same baseline prosodic structure was spoken for each sentence in the pair, two phonetically trained listeners agreed that the two pronunciations sounded the same. Sentences that did not pass this test were rerecorded and rejudged.

Pronunciation Acceptability

To show that the baseline and cooperating pronunciations were comparably acceptable for each syntactic structure and that the baseline pronunciations were highly acceptable, we collected listeners’ judgments of the appropriateness of the pronunciation of the sentences in the cooperating, conflicting, and baseline conditions. We also wanted to eliminate any sentences that could cause spurious processing difficulty, such as those with recording artifacts. In this task, subjects were given 3 s to read the sentence from a CRT screen and to pull a lever indicating whether they had comprehended it. We hoped that during reading comprehension, subjects would complete syntactic analysis of the sentence, thus reducing or removing any effects of syntactic complexity on their judgments of its pronunciation. Next, they listened to the sentence twice and were asked to judge the speaker’s pronunciation, answering either “error” or “okay.” Forty-eight subjects heard 48 experimental sentences and 52 additional sentences that varied in syntactic and prosodic structure and acceptability of pronunciation. No subject heard more than one pronunciation or more than one syntactic version of the same sentence.

Using the acceptability judgment data, we selected 18 sets of early and late closure sentences for which all three prosodic conditions met our criteria. Table 2 shows the acceptability ratings for the 18 sentences in the six conditions. All sentences in the early and late closure baseline and cooperating conditions were accepted on at least 85% of the trials; the average rate of acceptance was just over 92%. There were no statistical differences in pronunciation acceptability among these four conditions (all Fs < 1.1). In contrast, conflicting prosody sentences were accepted on at most 40% of trials; late closure versions were significantly more acceptable than their early closure counterparts (F(1,17) = 4.17, p < .05).
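As a simplified illustration of this acceptability screening, the sketch below flags an item set only when its early and late closure versions are highly acceptable in both the cooperating and baseline conditions. The 0.85 threshold follows the figure reported above, but the exact selection rule and the example proportions are our reconstruction, not a procedure stated in the text.

```python
# Hypothetical acceptance proportions for one item set, keyed by
# (prosody, syntax); the values echo the ranges reported in Table 2.
def passes_screen(acceptance, threshold=0.85):
    """Keep an item set only if both syntactic versions are highly acceptable
    with cooperating and with baseline prosody."""
    cells = [("cooperating", "early"), ("cooperating", "late"),
             ("baseline", "early"), ("baseline", "late")]
    return all(acceptance[cell] >= threshold for cell in cells)

example_item = {("cooperating", "early"): .93, ("cooperating", "late"): .94,
                ("baseline", "early"): .90, ("baseline", "late"): .91,
                ("conflicting", "early"): .12, ("conflicting", "late"): .20}

print(passes_screen(example_item))  # True
```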
TABLE 2
Mean Proportion of Acceptability Ratings for the 18 Early and Late Closure Sentences in Cooperating, Baseline, and Conflicting Conditions, Used in Experiments 1–3

                         Cooperating prosody    Baseline prosody    Conflicting prosody
Early closure syntax     .931 (.019)            .903 (.017)         .121 (.034)
Late closure syntax      .939 (.019)            .908 (.022)         .200 (.044)

Note. Standard error values are in parentheses.
EXPERIMENT 1

Experiment 1 examined early and late closure sentence pairs with cooperating, conflicting, and baseline prosodic structures in a speeded phonosyntactic grammaticality judgment task. This task required listeners to make a metalinguistic evaluation of each sentence and was otherwise comparable to the speeded grammaticality judgment task that others have used to examine the effect of syntactic structure on visual sentence processing (cf. Ferreira & Henderson, 1991; Warner & Glass, 1987). The task produced two dependent measures: proportion of “error” to “okay” judgments and speeded judgment time.

The logic behind our use of the task was as follows: We asked listeners to decide whether each sentence they heard was “okay” or contained an “error” and to base this decision on whether they thought that the sentence was “the one the speaker had intended to say.” We did not specifically direct their attention to either prosodic or syntactic structure. We assumed that each subject would set a criterion for deciding that a sentence contained an error and that both prosodic and syntactic factors would contribute to where the criterion was set. For example, a syntactically simple sentence with a well-formed correspondence between prosody and syntax should be easy to judge as “okay,” while a syntactically simple sentence with an ungrammatical prosody–syntax correspondence or one with a long hesitation pause should be easy to judge as an “error.” Other sentence types should be more difficult to judge, for example, an early closure sentence with a strongly transitive verb or a sentence with a prosody–syntax correspondence that is grammatical only under certain focus conditions. In
such cases, we assume listeners will take longer to decide whether a sentence is grammatical. Because the task involves time pressure, we assume that subjects will limit the amount of time they search for a grammatical analysis, and when their limit is reached before an analysis is found, they will make an “error” decision.

In addition to the early and late closure sentences, we included 20 unambiguous sentences for a companion duration control experiment. As noted above, additional segmental material, clause-final lengthening, and silence resulted in longer sentence durations, on average, in the cooperating and conflicting conditions than in the baseline conditions. We wanted to address the concern that, with the use of an end-of-sentence judgment task, these differences in total sentence duration might influence subjects’ response times. For example, longer sentence durations might allow additional processing time, resulting in faster processing. Conversely, shorter sentence duration might indicate less material to be processed, resulting in faster processing. If either of these situations were the case (and others are of course imaginable), comparison of the cooperating and conflicting conditions to the baseline conditions might be compromised. We constructed duration control sentences to contain either short or long pauses, with pause lengths determined by matching the duration differences to those between a pair of cooperating and baseline sentences.

Method

Subjects. Sixty-six Northeastern University undergraduates participated in return for partial credit toward a course requirement in an Introductory Psychology course. All participants
were native speakers of English with normal or corrected-to-normal vision and no reported hearing problems.

Materials. The 18 sextuples chosen in the pretests, 20 pairs of items for the companion duration experiment, and seven additional unambiguous filler sentences were used in the experiment. Across the 18 base sentence items used to create the experimental sentences, there was a slight bias toward transitive as opposed to intransitive completion following the verb. The mean transitivity bias score was 24 (bias scores for individual sentences are shown in the Appendix). Cooperating and baseline prosodic structures were unspliced natural speech, and conflicting prosodic structures were created by digital cross-splicing. Early closure conflicting sentences were composed of the beginnings of late closure cooperating sentences, through the end of the silence following the temporarily ambiguous NP, and the ends of early closure cooperating sentences, starting with the disambiguating word, e.g., is. Similarly, late closure conflicting sentences were composed of early closure cooperating sentence beginnings and the ends of late closure cooperating sentences, starting at the disambiguating word, e.g., it’s.

Short and long pause versions of the duration control items were created by deleting or inserting silence at a pause location in an unambiguous sentence. Sentences were recorded with pauses, including phrasal boundaries and hesitation pauses (the latter were used to provide additional items to which subjects might respond “error” in the overall experiment). These silences were then edited to create two versions of each sentence, one containing a long pause (mean duration 593 ms) and the other a short pause (mean duration 22 ms) at the same location. Silent durations were determined as follows: We chose 10 early closure cooperating/baseline pairs and 10 late closure cooperating/baseline pairs. Short pause silent durations were set equal to clause-final silent durations in the baseline item. Long pause silent durations were set equal to short pause durations plus the difference in total sentence duration between cooperating and baseline conditions for that item.
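The pause durations for the control items thus follow a simple rule; the sketch below restates it. The function name and the example values are ours; only the rule itself comes from the text.

```python
# short pause = clause-final silence in the baseline version of the matched item;
# long pause  = short pause + (total duration of the cooperating version
#               - total duration of the baseline version).
def control_pauses(baseline_clause_final_silence_ms,
                   cooperating_total_ms, baseline_total_ms):
    short_pause = baseline_clause_final_silence_ms
    long_pause = short_pause + (cooperating_total_ms - baseline_total_ms)
    return short_pause, long_pause

# Illustrative values (not a specific experimental item):
print(control_pauses(20, 2950, 2330))  # (20, 640)
```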
Individual duration control sentences and silent durations are presented in the Appendix. Inspection of the materials by two listeners indicated no audible splicing artifacts.

The six versions of each of the 18 experimental items were distributed among six materials sets, using a condition rotation determined by a Latin square design. The two versions of the duration control items were distributed according to a second Latin square. Each materials set contained three tokens from each of the six experimental conditions, 10 tokens from each of the duration control conditions, and seven unambiguous filler sentences. No subject heard the same item in more than one condition. A practice list of six items, similar to those used in the main experiment, was presented at the beginning of each materials set.

Procedure. Subjects participated in groups of up to eight. They were seated in a quiet room in individual booths equipped with a CRT screen, headphones, and three reaction-time levers. The experiment was controlled by MPL software (Neath, 1994) on a Macintosh IIci computer equipped with an AudioMedia I card for sound presentation and a Strawberry Tree ACM2IO card configured to have individual millisecond timers for each subject station. On each trial, subjects heard a tone and saw a simultaneous visual attention signal (the sequence **READY** on the screen). Following this, they listened to a sentence and judged whether the speaker had intended to produce the sentence as they heard it. Subjects’ attention was not specifically directed to the prosody or the syntax of the sentences. They were instructed to respond as quickly as possible by pulling one of two response levers, designated “Okay” and “Error” by labels that appeared at the bottom of the CRT screen during and immediately after the presentation of the sentence. Decisions and judgment times were collected. Judgment times were measured from the onset of the sentence sound to the subjects’ lever pull, with individual sentence durations subtracted from each subject’s times. Judgment times longer than 3500 ms (excluding sentence duration) were discarded; these times accounted for less than 4% of the data. As a check on comprehension, at 10 intermittent points during the experiment an alert signal sounded and subjects were given 45 s to write a paraphrase of the sentence they had just heard on a sheet of paper provided in the booth.
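The timing adjustment amounts to subtracting each sentence’s duration from the raw latency and discarding long outliers; a minimal sketch follows. The function name and the example latencies are ours.

```python
# Judgment time = latency from sentence onset to lever pull, minus that
# sentence's duration; adjusted times above 3500 ms are discarded.
def adjusted_judgment_times(raw_latencies_ms, sentence_durations_ms, cutoff_ms=3500):
    adjusted = [rt - dur for rt, dur in zip(raw_latencies_ms, sentence_durations_ms)]
    return [t for t in adjusted if t <= cutoff_ms]

# Example: the third trial exceeds the cutoff after adjustment and is dropped.
print(adjusted_judgment_times([3900, 4100, 7200], [2900, 2300, 2400]))  # [1000, 1800]
```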
FIG. 3. Percentage of error judgments and speeded phonosyntactic grammaticality judgment times (ms) for early/late closure sentences with cooperating, baseline, and conflicting prosodies, Experiment 1.
Results

Mean speeded phonosyntactic grammaticality judgment times and the percentage of “error” responses for Experiment 1 are shown in Fig. 3. For all presented results that follow, responses were analyzed with both subjects (F1) and items (F2) as random effects; all reported effects were significant at p < .01 unless otherwise noted, and unreported interactions were not statistically significant. In all figures, error bars for reaction time (RT) results indicate ±1 standard error (SE), calculated using a method recommended by Bakeman and McArthur (1996) to
estimate the variability in RTs within each individual condition after removing the overall differences in mean response time between subjects.

Phonosyntactic grammaticality judgments. Subjects’ speeded grammaticality judgments showed sensitivity to both prosodic and syntactic factors. The top portion of Fig. 3 shows the proportion of “error” judgments for the six conditions. An analysis of variance found a main effect of prosody (F1(2,65) = 190.82; F2(2,17) = 58.61), a main effect of syntax significant only with subjects as the random variable (F1(1,65) = 9.31, F2(1,17) = 3.23, p < .09), and an interaction of prosody and syntax significant only with subjects as the random variable (F1(2,130) = 7.89, F2(2,34) = 2.90, p < .07). Planned comparisons showed that listeners judged cooperating prosody sentences to have fewer errors than baseline prosody sentences, but this effect was significant only by subjects (F1(1,65) = 10.82; F2(1,17) = 2.27, p = .14). Conflicting prosody sentences were judged to have more errors than baseline prosody sentences (F1(1,65) = 225.97; F2(1,17) = 72.77).

We found no “garden path” effects of early closure syntax in either the cooperating or conflicting prosody conditions, where prosody provided information about the correct (or incorrect) syntactic parse. However, this effect was present in the baseline condition, where no prosodic boundary information was available to inform the parse. Early and late closure sentences produced comparable proportions of “error” responses within the cooperating and conflicting prosody conditions (.11 versus .11 and .71 versus .67, respectively). In contrast, in the baseline condition, early closure sentences produced more “error” responses than late closure sentences (.31 versus .10, F1(1,65) = 26.14; F2(1,17) = 9.60).

The duration control sentences with short pauses were judged as errors slightly less often than those with long pauses (.63 versus .67, respectively). The high rate of “error” responding for these sentences was due to a few items. We suspect that the edited pause durations for these items were interpreted as speaker
hesitations, thus provoking a higher percentage of “error” responses.

Judgment times. Subjects’ judgment times also showed sensitivity to both prosodic and syntactic factors. The bottom portion of Fig. 3 shows mean response times and standard errors for combined “error” and “okay” judgments in the six conditions. An analysis of variance found main effects of prosody (F1(2,65) = 19.12; F2(2,17) = 9.63) and syntax (F1(1,65) = 18.23, F2(1,17) = 5.85, p < .05) and an interaction of prosody and syntax significant only with subjects as the random variable (F1(2,130) = 5.05, F2(2,34) = 2.1, p < .15). Planned comparisons showed the predicted facilitation effect, that is, cooperating sentences were judged more quickly than baseline sentences (means were 1178.8 and 1387.8 ms, respectively; F1(1,65) = 25.25; F2(1,17) = 11.79). However, there was no interference effect, that is, judgment times for conflicting and baseline conditions did not differ significantly from each other (means were 1427.4 and 1387.8 ms, respectively; both Fs < 1). Also as predicted, there was no measurable garden path effect within the cooperating conditions. Instead, the early closure cooperating condition was numerically faster than its late closure counterpart (means were 1175.6 and 1181.9 ms), suggesting that cooperating prosodic information precluded parsing difficulty for the early closure syntactic structures. In contrast to this lack of effect, planned comparisons showed that early closure syntax did produce garden path processing difficulty in the baseline and conflicting conditions (early mean 1518.9 ms, late mean 1305.3 ms; F1(1,130) = 28.64, F2(1,34) = 11.88).

Verb transitivity and judgment times. We conducted a post hoc analysis of verb transitivity, using the transitivity scores from the normative study and treating the verb preference variable as continuous (Pearlmutter & MacDonald, 1992). We tested for a correlation between response times and verb transitivity scores within each of the six experimental conditions. Results showed only a very weak relationship, with transitivity difference scores significantly related to judgment times only in the
conflicting late closure condition (r = −.5, F(1,16) = 5.30, p < .05; all other Fs < 2, all other ps > .20). We note that, since we collected judgments at the end of the sentence, the majority of current sentence processing models would predict a relationship between verb transitivity and judgment times, such that transitively biased verbs would produce faster times for late closure sentences, and intransitively biased verbs would produce faster times for early closure sentences. Although the set of verbs we used included some tokens that were strongly transitive or intransitive, the majority of items were not strongly biased in either direction, which may have reduced the detectability of the influence of verb transitivity on response time.

Companion duration control experiment. A one-way analysis of variance on combined “error” and “okay” responses for the unambiguous control sentences showed no significant differences in processing time between long and short pause conditions (means were 1205 and 1249 ms, respectively, both Fs < 1).

Discussion

Results from the speeded phonosyntactic grammaticality judgment task showed that prosodic structure influenced subjects’ metalinguistic judgments, even in the presence of disambiguating syntactic information. We found significant effects of prosody and syntax for both dependent measures. Interestingly, this differs from previous work that used a speeded grammaticality judgment task with visual presentation of temporary syntactic ambiguities, where significant differences were found for judgments, but not judgment times (Warner & Glass, 1987; Ferreira & Henderson, 1991). The lack of effect for judgment times in the previous work may have been due to ceiling effects on time to judge long and syntactically complex garden path sentences. The sentences used in our experiment were short and syntactically relatively simple, and the addition of prosodic structure may have reduced the difficulty of the judgment task.

We predicted that information from the prosodic representation (here, the presence of an IPh boundary) would inform syntactic decision
making so that judgments in the cooperating prosody conditions would be facilitated when compared to baseline prosody conditions. Results confirmed this prediction. Cooperating sentences, with coincident prosodic and syntactic constituent boundaries, were processed faster than baseline sentences, where phonetic information about the location of the prosodic boundary was ambiguous. In addition, we predicted that syntactic garden pathing effects would be absent in the cooperating prosody condition, because syntactic decision making would be informed by prosodic information about the location of constituent boundaries. Consistent with this prediction, there were no differences between early and late closure cooperating structures in either acceptability ratings or response times, even though there were clear syntactic garden path effects in the baseline and conflicting conditions. Early closure sentences were judged more slowly than late closure sentences in the baseline and conflicting conditions, and early closure sentences produced more “error” responses in the baseline condition. These results support the claims that prosody informs the syntactic parsing process and that cooperating prosody facilitates the processing of temporary syntactic ambiguity. Our view also predicts that the presence of a prosodic boundary at a misleading location in the syntactic structure should interfere with the parsing process, forcing misanalysis of the temporary ambiguity. Thus conflicting prosody sentences should produce more “error” judgments and take longer to judge than baseline prosody sentences. This prediction was only partially confirmed: Although conflicting prosody sentences were significantly more likely to be judged as errors than baseline prosody sentences, there were no significant response time differences between the two conditions. One possible explanation for this pattern of effects is that judgment times in the cooperating conditions may reflect a different type of processing than those in the baseline and conflicting conditions. Judgments in both of the cooperating conditions were relatively fast and predominately “okay” responses, presumably based on an initial, successful analysis of the sentence.
This analysis was strongly influenced by prosodic structure and showed no sensitivity to the early/late closure syntactic difference. However, judgments in the baseline and conflicting conditions, where prosodic structure did not predict the correct syntactic analysis, took longer to complete and included more “error” responses. These judgments may reflect the operation of a syntax-sensitive reanalysis process. Even if this is so, why didn’t reanalysis take longer in the conflicting conditions, where prosodic structure was misleading, than in the baseline conditions, where it was merely uninformative? It is possible that the speeded metalinguistic judgment task enforced a ceiling on the time that subjects were willing to spend searching for an acceptable analysis before rejecting the conflicting prosody sentences. If this aspect of the task was responsible for a ceiling effect, we should find that interference effects appear when tasks that do not require such judgments are used. One concern for the comparison of response times in the cooperating and conflicting conditions to those for baseline conditions was that of total sentence duration, because the baseline sentences were, on average, shorter than the others, which contained clause-final word lengthening and silence at the prosodic boundaries. Both the results of the companion duration experiment and the results in the main experiment cast doubt on a duration-based explanation. First, when duration was manipulated directly for sentences of comparable length and structure, it produced no significant effects. Second, a duration-based explanation cannot successfully account for the pattern of effects across conditions unless additional influences (due to syntax and prosody) are added. For example, it could be argued that additional time was used to complete additional processing and that this, rather than prosodic structure, was the locus of the facilitation in the cooperating conditions. However, the same amount of additional time was available for processing in the conflicting conditions and did not produce a comparable facilitation effect. One might argue that the additional time did speed processing in the conflicting conditions as well, with the re-
sult that we did not find an interference effect. However, this claim incorporates an effect of prosody. In addition, a duration-based account cannot explain the syntactic differences across prosodic conditions—that is, early closure cooperating sentences have long durations and short response times, early closure baseline sentences have short durations and long response times, and early closure conflicting sentences have long durations and long response times. In sum, the companion experiment suggests that duration has little effect when it is directly manipulated, and the results of the main experiment must be explained by more factors than duration alone. A final concern that must be raised in interpreting the judgment times is the differential contribution of “okay” and “error” responses to the conditions being compared. This is because responses for the two types of trials may involve cognitive processes that are qualitatively different from one another. For example, “error” responses could involve additional nonlinguistic checking processes, resulting in longer response times. Since the number of “error” responses contributing to the conditions is larger for just those conditions that produce longer judgment times, we looked at the pattern of response times for the “okay” responses alone. These followed the same pattern as those for “error” and “okay” judgments combined (means: cooperating early closure 1088 ms, cooperating late closure 1131 ms, baseline early closure 1375 ms, baseline late closure 1286 ms, conflicting early closure 1402 ms, and conflicting late closure 1316 ms). However, the small number of observations, particularly in the baseline early and the two conflicting conditions, interferes with item counterbalancing and must seriously qualify any conclusions that can be drawn from these data. It is interesting to compare the pattern of effects here to other studies of prosody and syntax that have collected metalinguistic judgments together with response times. Several recent studies have used a version of the crossmodal naming task in which subjects are asked to listen to a syntactically ambiguous sentence fragment with cooperating or conflicting pro-
sodic structure, to name a visual word that syntactically disambiguates the fragment, and then to give a rating to indicate how well the visual word completes the spoken fragment. Some researchers using this task (Marslen-Wilson et al., 1992; Murray & Watt, 1996) have argued that different patterns of effects in judgments and judgment times reflect different aspects of processing measured by the two dependent variables. However, other work using the same tasks has shown the same pattern of effects common to both (Warren et al., 1995). The speeded phonosyntactic grammaticality judgment task we used brings together the rating of the stimulus and the reaction-time measurement. However, our results suggest that even when the timed task itself involves a metalinguistic judgment, the response time pattern need not mimic listeners’ “off-line” ratings of the stimuli.
The results of Experiment 1 were consistent with the view that cooperating prosodic constituent boundaries facilitate syntactic decision making and may preclude syntactic misanalyses. However, there was no clear interference effect on judgment times from conflicting prosodic information. We were concerned that metalinguistic processes inherent in the speeded grammaticality judgment task may have reduced its sensitivity to prosodic interference in syntactic processing and also concerned that the two different types of responses in the task may have differentially contributed to judgment times across conditions. In Experiment 2, we replicated Experiment 1 using an end-of-sentence comprehension task, one that did not require subjects to reflect on the grammaticality of the sentences they were processing, and one that would allow analysis of a set of reaction times from a single type of response.

EXPERIMENT 2

Experiment 2 used the materials from Experiment 1 in an end-of-sentence comprehension task. In this task, listeners were asked to simply listen to the sentences as they would for normal comprehension and to indicate as quickly as possible whether each sentence had been under-
stood. Although “did not comprehend” responses were possible, this task differs from that in Experiment 1 because we did not ask subjects for a qualitative judgment. Our predictions were the same: cooperating prosody should facilitate syntactic processing, and syntactic processing difficulties associated with nonpreferred analyses in reading studies should be precluded in cooperating prosody conditions. Conflicting prosody should mislead the parser, interfering with syntactic processing and producing longer comprehension times.

Method

Subjects. Sixty-six Northeastern University undergraduates participated in return for partial credit toward a course requirement in an Introductory Psychology course. All participants were native speakers of English with normal or corrected-to-normal vision and no reported hearing problems.
Materials. The materials were identical to those described above with the exception that they were redistributed across six materials sets using two new Latin square designs.
Procedure. Subjects participated in groups of up to eight, using the hardware and software described for Experiment 1. On each trial, subjects heard a tone and saw a simultaneous visual attention signal (the sequence **READY** on the screen). Following this, they listened to a sentence and pulled a lever to indicate whether they had comprehended it. They were instructed to respond as quickly as possible by pulling one of two response levers, designated “Understood” and “Didn’t” by labels that appeared at the bottom of the CRT screen during and immediately after the presentation of the sentence. Comprehension times were measured from the onset of the sentence sound to the subjects’ lever pull, with individual sentence durations subtracted from each subject’s times. Response times longer than 2500 ms (after subtraction) were discarded; these times accounted for less than 2% of the data. As a check on comprehension, at intermittent intervals an alert signal sounded and subjects answered a question that appeared in the center of the screen, by choosing one of two possible answers presented on either side and corresponding to the response levers (e.g., When Roger goes out, the lights are: ON OFF).
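To make the timing procedure concrete, the following is a minimal sketch of the subtraction and cutoff just described, assuming a tabular trial log; it is not the authors' software, and the column names (raw_rt_ms, sentence_duration_ms) are our own.

# Minimal sketch (not the authors' software) of the comprehension-time
# correction: subtract each sentence's duration, then drop long responses.
import pandas as pd

def correct_comprehension_times(trials: pd.DataFrame) -> pd.DataFrame:
    """trials: one row per trial, with raw_rt_ms (lever pull measured from
    sentence onset) and sentence_duration_ms (duration of the audio file)."""
    out = trials.copy()
    # Subtract each sentence's own duration from the raw latency.
    out["rt_ms"] = out["raw_rt_ms"] - out["sentence_duration_ms"]
    # Discard corrected times longer than 2500 ms (reported as <2% of trials).
    return out[out["rt_ms"] <= 2500]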
Results

Mean end-of-sentence comprehension times and standard errors for “Understood” responses are presented in Fig. 4.

FIG. 4. End-of-sentence comprehension times and standard errors (ms) for early/late closure sentences with cooperating, baseline, and conflicting prosodies, Experiment 2.

As in the speeded grammaticality judgment task, results showed sensitivity to both prosodic and syntactic factors. An analysis of variance found a main effect of prosody (F1(2,65) = 39.29; F2(2,17) = 13.32) and a main effect of syntax significant only with subjects as the random variable (F1(1,65) = 4.74, p < .05; F2(1,17) = 1.28, p < .28). Planned comparisons showed facilitation in the cooperating conditions, where sentences were comprehended more quickly than in the baseline conditions (means were 641.97 and 741.42 ms, respectively; F1(1,65) = 10.64; F2(1,17) = 6.01). There was also an interference effect in the conflicting conditions, where sentences were comprehended more slowly than in the baseline conditions (means were 951.80 and 741.42 ms, respectively; F1(1,65) = 47.62; F2(1,17) = 26.95). Additional planned comparisons showed garden path processing difficulty for early closure sentences in the baseline and conflicting conditions (early mean 887.8, late mean 805.39 ms; F1(1,130) = 7.31, F2(1,34) = 4.19, p < .05). As in Experiment 1, there was no measurable garden path effect within the cooperating conditions, with early closure cooperating numerically faster than its late closure counterpart (means were 638.8 and 645.1 ms, respectively), suggesting that cooperating prosodic information precluded parsing difficulty for the early closure syntactic structures.
Trials on which subjects did not comprehend the sentences accounted for 4.6% of the total trials. The number and proportion of these trials in each condition are presented in Table 3.

TABLE 3
Number and Proportion of Trials in Six Conditions on Which Subjects Did Not Comprehend the Sentence in Experiment 2

                        Cooperating prosody   Baseline prosody   Conflicting prosody
Early closure syntax    6 (.03)               7 (.04)            17 (.09)
Late closure syntax     9 (.04)               4 (.02)            10 (.05)

Subjects failed to comprehend the sentence most often in the conflicting prosody conditions.
Transitivity and comprehension time. We conducted a posthoc analysis of verb transitivity as we did for Experiment 1, testing for a correlation between response times and verb transitivity scores in the six conditions. As in Experiment 1, results showed only a very weak relationship, with transitivity difference scores significantly related to judgment times in only one condition, this time the baseline late closure condition (r = −.47, F(1,16) = 4.50, p < .05; all other Fs < 2.5, all other ps > .13).
Companion duration control experiment. A one-way analysis of variance comparing comprehension times for short and long pause duration control sentences showed no significant differences (short mean 1057, long mean 1104 ms, both Fs < 1.5).

Discussion

End-of-sentence comprehension times showed prosodic effects even in the presence of disambiguating syntactic structure. As predicted, the presence of a cooperating prosodic boundary facilitated syntactic decision making:
There was a processing time advantage for cooperating conditions over baseline conditions. Importantly, and in contrast to the speeded grammaticality judgment times in Experiment 1, the predicted processing disadvantage for conflicting prosody conditions compared to baseline prosody conditions was significant, indicating that a misleading prosodic phrasal boundary interfered with recovery of the proper syntactic analysis. These results are consistent with the strong version of the view that prosodic information determines syntactic phrase structure assignment. Such a view must predict both facilitation and interference, so that at a temporary syntactic ambiguity, prosody leads the processor toward a syntactic constituent structure that is consistent with prosodic constituent structure, regardless of whether that structure is consistent with upcoming syntactic information. End-of-sentence comprehension times also showed that cooperating prosody precluded processing difficulty for the early closure syntactic structures: In the cooperating condition, there were no differences between early and late closure sentences, although in the baseline and conflicting conditions, early closure structures were processed more slowly than late, showing a syntactic garden path effect. The finding of a late closure advantage in the baseline and conflicting conditions is consistent with effects found for temporarily ambiguous sentences in reading paradigms (e.g., Frazier & Rayner, 1982; Carlson & Tanenhaus, 1989; Ferreira & Henderson, 1991), while the lack of such an effect in the cooperating condition indicates that prosody is used to prevent syntactic misanalyses (or perhaps to reanalyze them quickly). We repeated the companion duration experiment with the end-of-sentence comprehension
task in Experiment 2. Again, our concern was that the comparison of response times in the cooperating and conflicting conditions to those for baseline conditions would be compromised by the difference in total sentence duration between the conditions. Our findings were the same as those for Experiment 1, that is, when sentence duration was manipulated directly, it produced no significant effects. In addition, the pattern of results across the six conditions of the main experiment would be difficult to explain on the basis of durational factors, without appeal to prosodic and syntactic factors as well. Our findings are consistent with those from previous studies examining prosodic effects on syntactic closure ambiguity (Slowiaczek, 1981; Carroll & Slowiaczek, 1987; Warren et al., 1995). All of these studies found faster processing times for sentences with cooperating prosodic and syntactic boundaries than for those with conflicting prosody and syntax. While such results are predicted if the parsing process is influenced by prosody, they cannot give strong support to the argument that prosodic boundary information determines parsing decisions at temporary syntactic ambiguities. We have argued that such a view implies that prosodic information can both facilitate and interfere with syntactic decision making and that coincident prosodic and syntactic boundaries should preclude syntactic garden pathing. The two novel contributions of our studies—the baseline comparison and the direct test for syntactic garden path effects in a cooperating prosody condition—were constructed to address these specific claims. Our results suggest that the previous findings were due to both facilitatory and interfering effects of prosodic structure and can exclude the explanation that longer processing times found for sentences with misleading prosodic boundaries were due to the presentation of a disrupted stimulus rather than to the use of prosodic information in the parsing process. If we consider Experiments 1 and 2 together, and look only at the cooperating and baseline conditions, the same pattern of effects appeared for judgments, judgment times, and comprehension times. We interpreted the pattern of data in
the cooperating conditions to suggest that prosodic information determines the syntactic structure assigned to a sentence. However, results from the two experiments diverged for the conflicting conditions: Listeners judged conflicting prosody sentences as “errors” significantly more often than baseline prosody sentences, and late closure conflicting items were judged errors as often as early closure ones. Judgment times showed no corresponding interference effect, but did show the syntactic difference. End-of-sentence comprehension times showed both the interference and the syntactic effects. To interpret these effects, we suggested that the judgment component of the speeded grammaticality judgment task left it insensitive to interference effects in the conflicting condition. Consistent with this explanation, results in the end-of-sentence comprehension task, which did not require a metalinguistic judgment, show a clear interference effect. This pattern, along with the presence of the early closure disadvantage in the response time data from both experiments, is consistent with a prosodically informed syntactic parsing process: In the baseline conditions, no information was available from prosodic structure to inform the parse, and so syntactic differences in processing time reflect the same pattern of effects found in reading studies. In contrast, in the conflicting conditions, prosodic information determined the initial structure assigned to the sentence, but immediately following morphosyntactic information was inconsistent with this analysis, necessitating reanalysis for both early closure and late closure sentences. One outstanding question, given this scenario, concerns the source of the unpredicted garden path effect in the conflicting conditions. That is, why did reanalysis take longer for early closure conflicting prosody sentences? Depth-first parsing models suggest one possibility: When the initial parsing decision, based on prosodic boundary location, came into conflict with morphosyntactic information, the resulting reanalysis process was based on verb bias (in these materials, a transitive bias) and other available contextual factors. Another related explanation is that because we measured at sentence end, our results must re-
flect reprocessing influenced not only by information that precedes the ambiguity but also by information that follows it. Thus verb bias and more general syntactic and semantic information from the sentences’ final clauses may have been biased toward a transitive interpretation, resulting in longer reprocessing times for conflicting early closure sentences.
Thus our results so far demonstrate that by sentence end, the presence of an IPh boundary has both facilitative and interfering effects on the resolution of temporary syntactic closure ambiguity. These results allow us to rule out a number of hypotheses that have been advanced about the relationship of prosody to parsing, including (1) that the prosodic structure perceived for spoken sentences is dependent on the underlying syntactic structure that has been recovered for the sentence (Lieberman, 1968), (2) that prosodic variables have little or no influence on syntactic aspects of the auditory parsing process (Watt & Murray, 1995), and (3) that prosodic structure, while it may assist with processing of syntactically unpreferred analyses, cannot produce processing difficulty for preferred syntactic analyses (Pritchett, 1988).
However, end-of-sentence measures leave open the possibility that momentary early effects of syntax went undetected. For example, the finding of no syntactic closure preference in the cooperating prosody conditions may simply reflect very fast reanalysis of the early closure sentence, rather than an initial early closure analysis on the basis of prosodic constituency. Therefore, on the basis of these data we cannot distinguish between models that posit prosodic influences from the very earliest stages of processing (Speer et al., 1989, 1996; Schafer, 1997) and those that locate prosodic influences at other stages of analysis (Marcus & Hindle, 1990; Steedman, 1991; Marslen-Wilson et al., 1992; Pynte & Prieur, 1996). On our view, because the prosodic representation organizes the phonological input to the syntactic parsing mechanism, it will influence the initial syntactic analysis that is considered, producing immediate facilitation and interference effects on the syntactic parsing process. Other researchers have suggested that prosody
has a secondary and/or delayed influence on parsing. For example, Marcus and Hindle’s (1990) parser considers boundary tones to be “unknown lexical items” (p. 495) in its input stream and uses them as local markers to initiate closure of the current constituent under construction. Constituents created by this process are combined during post-phrase-structure analyses that are sensitive to intermediate phrasing and pitch accent information. Steedman (1991) posits a “prosodic constituent condition” that blocks the combination of constituents determined by other aspects of a categorial grammar if the mapping between these two constituent structures conflicts. Pynte and Prieur (1996) propose a model where prosody does not affect initial syntactic analyses, but influences a second stage of parsing by determining the order in which alternative verb argument structures are considered. Marslen-Wilson, Tyler, Warren, Grenier, and Lee (1992) suggest that sentences are initially structured on the basis of prosodic and morphosyntactic cues, but that prosodic cues are given less weight and are easily overridden by conflicting morphosyntactic information. In Experiment 3, we used cross-modal naming, a more “on-line” task that allowed us to measure processing at the point of syntactic disambiguation. The task has been shown to be sensitive to prosodic and syntactic effects (Marslen-Wilson & Tyler, 1977; Marslen-Wilson et al., 1992; Warren et al., 1995; but see Watt & Murray, 1996). In one version of this task, subjects listened to ambiguous spoken sentence fragments immediately followed by a visual target word that continued the sentence and resolved the syntactic ambiguity (e.g., Auditory fragment: The workers considered that the last offer from the management . . . Visual target: WAS). On each trial, subjects named the visual target and then rated how well the word served to continue the fragment. Naming times were assumed to reflect the ease of integrating the visual target and the fragment together into a sentence. One concern we had about the use of this task was the potential contribution of the metalinguistic judgment component to naming times,
due to the concurrent rating task. Some previous studies have shown different patterns of effects in the postnaming rating data compared to naming times (Marslen-Wilson et al., 1992; Watt & Murray, 1996), suggesting a dissociation between the two aspects of the task. However, other studies using this task have shown the same pattern of effects across these two dependent variables (Warren et al., 1995), suggesting some contribution from the rating process to naming time. To the extent that metalinguistic reflection is not a part of ‘typical’ spoken language comprehension, we consider it a contaminant in measures of on-line processing. In addition, we noted in comparing Experiments 1 and 2 that the metalinguistic judgment task seemed to attenuate prosodic interference effects that were measurable with a similar task that did not require this judgment. Because our goal was to demonstrate both facilitation and interference effects on initial parsing decisions, we modified the task to remove the metalinguistic component, while maintaining the necessity for subjects to integrate across the two modalities. We predict the same pattern of effects found in the first two experiments. Specifically, when prosodic and syntactic boundaries coincide, processing will be facilitated, so that cooperating prosody conditions will produce faster naming times than baseline conditions. When prosodic constituent boundaries occur at a misleading point in syntactic structure, there will be an interference effect, so that conflicting prosody conditions will produce slower naming times than baseline conditions. Finally, any syntactic garden path processing disadvantage shown in the baseline or conflicting prosody conditions should disappear in the cooperating conditions.

EXPERIMENT 3

Experiment 3 used the same sentence materials as Experiments 1 and 2 in a modified version of the cross-modal naming task (Tyler & Marslen-Wilson, 1977; Marslen-Wilson et al., 1992; Warren et al., 1995). In our task, subjects heard a syntactically ambiguous fragment, named a visually presented disambiguat-
ing word, and then completed the sentence. Thus sentence completion, rather than a metalinguistic judgment of the relationship between the named word and the fragment, was used to encourage integration across the auditory and visual modalities.

Method

Subjects. Sixty undergraduate psychology students at Northeastern University participated in exchange for credit toward a course requirement. All subjects were native English speakers and reported having no hearing, speech, or language problems.
Materials. Sentence fragments were created by digitally truncating the early and late closure sentence stimuli from Experiments 1 and 2. Each fragment contained all of the acoustic information from the onset of the sentence through the end of the ambiguous region (e.g., When Roger leaves the house . . .). The fragments, with transitivity scores, phonological transcriptions, and the visual words, are presented in the Appendix. The experimental design contained three prosodic conditions: cooperating, baseline, and conflicting. In the cooperating conditions, each fragment contained an intonation phrase (IPh) boundary with an associated level 4 break that was coincident with one of the two possible syntactic clause boundaries. Visual target words resolved the syntactic ambiguity. For cooperating prosody conditions, when the IPh followed the verb, the visual target resolved the ambiguity toward early closure, and when the IPh followed the ambiguous NP, the visual target resolved toward late closure. Conflicting conditions were created using the same auditory fragments with the opposite visual targets, so that when the IPh followed the verb, the visual target resolved toward late closure, and when the IPh followed the NP, the visual target resolved toward early closure. For baseline conditions, no IPh boundary occurred in the ambiguous region. The subject of the first clause carried an L+H* pitch accent, and the ambiguous region was deaccented, such that the precise location of a low phrase accent (L-) with a level 1 break was phonetically ambiguous
within the temporarily ambiguous region. Table 4 shows an example sentence fragment and the corresponding visual target words in the six conditions.

TABLE 4
An Example Auditory Sentence Fragment with Visual Targets for Cross-Modal Naming from Experiment 3

Condition              Auditory fragment                                   Visual word
Cooperating prosody
  Early closure        H* ((When Roger leaves L-)PPh L%)IPh ((the house    is
  Late closure         H* ((When Roger leaves the house L-)PPh L%)IPh      it’s
Baseline prosody
  Early closure        L+H* ((When Roger leaves L-)PPh (the house          is
  Late closure         L+H* ((When Roger leaves the house L-)PPh           it’s
Conflicting prosody
  Early closure        H* ((When Roger leaves the house L-)PPh L%)IPh      is
  Late closure         H* ((When Roger leaves L-)PPh L%)IPh ((the house    it’s

The six versions of each of the 18 experimental items were distributed among six materials sets, using a condition rotation determined by a Latin square design. Each materials set contained three tokens from each of the six conditions and 22 filler fragments. These contained single clauses, adjunct phrases, or noun phrases. Five fragments contained a late prosodic boundary (e.g., At the zoo L-H% . . .), 7 contained prosodic boundaries at unpredictable locations (e.g., The young H-L% Californian . . .), and 10 contained no prosodic boundary (The softball pitch . . .).
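A condition rotation of this kind can be sketched in a few lines. The following is an illustrative Latin-square assignment consistent with the design described above (six versions of 18 items rotated across six materials sets), not the authors' materials script; the condition labels are invented.

# Illustrative Latin-square rotation: every materials set receives each item
# exactly once, and each condition appears on 18/6 = 3 items per set.
N_ITEMS, N_SETS = 18, 6
conditions = ["coop-early", "coop-late", "base-early",   # labels are
              "base-late", "conf-early", "conf-late"]    # assumptions

materials_sets = {s: [] for s in range(N_SETS)}
for item in range(N_ITEMS):
    for s in range(N_SETS):
        cond = conditions[(item + s) % len(conditions)]
        materials_sets[s].append((item, cond))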
Procedure. Subjects participated one at a time and were seated next to the experimenter in a quiet room equipped with a monochrome monitor, headphones, and a microphone. The experiment was controlled by MPL software (Neath, 1994) on a Macintosh IIcx computer equipped with an AudioMedia I card for sound presentation and a CMU button box with voice-operated relay for millisecond timing. On each trial, there was a warning signal, and then subjects listened over the headphones to a sentence fragment that was immediately followed by the presentation on the monitor of a visual target that could continue the sentence. They were asked to imagine they were having a conversation with a good friend and that the two of them knew each other “so well that they could complete each other’s sentences.” As they listened to the spoken fragments, subjects were told to imagine that their friend was speaking and that the word that appeared on the monitor was the one they knew their friend would say next. Finally, they were asked to name the visual word into the microphone as quickly as possible and then complete the sentence for their friend, beginning with the visual word. Visual targets appeared immediately at the offset of the auditory fragments and remained on the screen until the subject responded. Naming times appeared on the monitor after each response. The fragment completions and the accuracy of the naming response were recorded in writing by the experimenter. At the end of the experimental trials, subjects completed 15 additional naming trials, 3 for each of the six visual target words, which were presented following a neutral carrier phrase (The next word will be . . .). These times were collected to assess lexical effects on naming times, such as those due to word frequency, length, orthography, and word-initial phoneme
(e.g., Fowler, 1979). Individual reaction times were corrected for word-based variability by subtracting the difference between the grand mean of the neutral-context naming times and the mean of the three tokens from each item for each subject.
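The following sketch gives one reading of that correction, assuming trial-level data frames; it is our reconstruction, not the authors' code, and the column names and the sign convention (removing each word's subject-specific neutral baseline relative to the grand mean) are assumptions.

# Sketch of the word-based naming-time correction: corrected RT =
# raw RT - (subject's mean neutral time for that word - grand neutral mean).
import pandas as pd

def correct_naming_times(exp: pd.DataFrame, neutral: pd.DataFrame) -> pd.DataFrame:
    """exp: experimental trials (subject, word, raw_rt_ms).
    neutral: the post-experiment trials named after a neutral carrier
    phrase, three per target word per subject (same columns)."""
    grand_mean = neutral["raw_rt_ms"].mean()
    per_word = (neutral.groupby(["subject", "word"], as_index=False)["raw_rt_ms"]
                       .mean()
                       .rename(columns={"raw_rt_ms": "neutral_mean"}))
    out = exp.merge(per_word, on=["subject", "word"])
    out["rt_ms"] = out["raw_rt_ms"] - (out["neutral_mean"] - grand_mean)
    return out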
Results

Trials that produced incorrect naming responses (including wrong words, false starts, and voice key failures) and/or ungrammatical fragment completions were excluded from the analyses. Trials longer than 2 s (before the correction for word-based variability) were also excluded. Missing data accounted for 8% of the total experimental responses and were replaced using the average of the experiment-wise individual subject and item means (Winer, 1971). Table 5 displays the distribution of missing responses.

TABLE 5
Proportion of Missing Data in Six Conditions for Experiment 3

Condition        Cooperating   Baseline   Conflicting
Early closure    .05           .15        .13
Late closure     .03           .10        .03

The majority of errors occurred in the baseline prosody conditions and in the conflicting early closure condition. Thus, the conservative data replacement technique slightly underestimated the duration of response times in the baseline and early closure conflicting conditions. Because these were conditions predicted to have slower reaction times, any underestimation worked against the hypotheses.
Mean naming times and standard errors for Experiment 3 are shown in Fig. 5.

FIG. 5. Cross-modal naming times and standard errors (ms) for early/late closure sentences with IPh-based cooperating, baseline, and conflicting prosodies, Experiment 3.

Results replicated the overall pattern of those found in Experiments 1 and 2. An analysis of variance found a main effect of prosody (F1(2,59) = 48.30; F2(2,17) = 11.75), a main effect of syntax (F1(1,59) = 22.72; F2(1,17) = 11.75), and a significant interaction of prosody and syntax (F1(2,118) = 17.41; F2(2,34) = 8.28).
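For readers who want the structure of the subjects (F1) and items (F2) analyses spelled out, the sketch below approximates them as two repeated-measures ANOVAs over cell means. It is a generic illustration using statsmodels, not the authors' analysis software, and the data layout is assumed.

# Generic sketch of by-subjects (F1) and by-items (F2) repeated-measures
# ANOVAs on corrected naming times; column names are assumptions.
import pandas as pd
from statsmodels.stats.anova import AnovaRM

def f1_f2_anovas(trials: pd.DataFrame):
    """trials: one row per trial with columns subject, item,
    prosody (cooperating/baseline/conflicting), syntax (early/late), rt_ms."""
    # F1: collapse over items within each subject x condition cell.
    by_subj = (trials.groupby(["subject", "prosody", "syntax"], as_index=False)
                     ["rt_ms"].mean())
    f1 = AnovaRM(by_subj, depvar="rt_ms", subject="subject",
                 within=["prosody", "syntax"]).fit()
    # F2: collapse over subjects within each item x condition cell.
    by_item = (trials.groupby(["item", "prosody", "syntax"], as_index=False)
                     ["rt_ms"].mean())
    f2 = AnovaRM(by_item, depvar="rt_ms", subject="item",
                 within=["prosody", "syntax"]).fit()
    return f1, f2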
Planned comparisons showed facilitation in the cooperating conditions, where targets were named more quickly than in the baseline conditions (means were 656 and 753 ms, respectively; F1(1,59) = 27.6; F2(1,17) = 12.25). However, this effect did not hold within the late closure conditions. That is, the cooperating and baseline late closure conditions were not statistically different (means were 679 and 691 ms, respectively; both Fs < 1.3). There was also an interference effect in the conflicting conditions, where targets were named more slowly than in the baseline conditions (means were 834 and 753 ms, respectively; F1(1,59) = 18.7; F2(1,17) = 9.41). It is important to note that the disruptive effect of misleading prosody was not carried by the early closure condition alone. Conflicting late closure sentences showed significantly longer naming times than baseline late closure sentences, though only in the subjects’ analysis (means were 754 and 691 ms, respectively; F1(1,59) = 5.7, p < .02; F2(1,17) = 2.54, p = .12). Additional planned comparisons showed syntactic garden path processing difficulty for early closure as compared to late closure sentences in the baseline conditions (early mean = 817 ms, late mean = 691 ms; F1(1,59) = 22.72; F2(1,17) = 10.10) and in the conflicting conditions (early mean = 914.60 ms, late mean = 754 ms; F1(1,59) = 37.33; F2(1,17) = 18.75). There was no measurable garden path effect within the cooperating conditions, with the early closure cooperating prosody condition numerically faster than its late closure counterpart (means were 634 and 679 ms, respectively), suggesting that cooperating prosodic information precluded parsing difficulty for the early closure syntactic structures.
Verb transitivity and naming times. We conducted a posthoc analysis of verb transitivity, using the transitivity scores from the normative study described above, and treating the verb preference variable as continuous (Pearlmutter & MacDonald, 1992). We tested for a correlation between naming times and transitivity scores within each of the six experimental conditions. Results showed a weak relationship, with transitivity difference scores significantly related to naming times in only two conditions, the cooperating late closure and the cooperating baseline conditions (r = −.5, F(1,16) = 5.67, p < .05, and r = .51, F(1,16) = 5.97, p < .05; all other Fs < 2, all other ps > .1). This finding is consistent with those from Experiments 1 and 2, where we found only weak relationships between reaction time at sentence end and transitivity score.
Control analyses. Because our version of the cross-modal naming task required participants to complete the spoken fragment after naming the visual target, we were concerned that naming times might reflect production processes in addition to comprehension processes, thus qualifying the interpretation of our results. To investigate this possibility, we examined subjects’ productions in the fragment completion portion of the task. In general, completions were homogeneous in their syntactic and semantic complexity, but more variable in length. Ferreira (1991, 1993) has shown planning effects on production initiation times, such that speakers take longer to initiate longer and more complex utterances. We reasoned that if our naming results were contaminated by production planning, the amount of time it took to name a target should be systematically related to the length of the sentence production that followed it. More specifically, early closure productions should be longer than late closure productions in the baseline and conflicting conditions. The number of
syllables and words contained in subjects’ productions were analyzed to test the relationship between naming time and production length. Correlations between naming time and number of syllables and naming time and number of words revealed no reliable relationship (rs = .08; rw = .07). Planned contrasts showed that subjects produced significantly more syllables for late closure structures than early closure structures (F1(1,59) = 5.2, p < .05; F2(1,17) = 5.9, p < .05), but no other differences were found. Note that the conditions where subjects were generally faster were the conditions with longer productions; this pattern is opposite to what one would expect if naming times reflected production initiation times.
We ran an additional control experiment in which subjects performed the cross-modal naming task, but were not asked to complete the fragment. This study was inconclusive; naming times were generally fast and did not show any clear prosodic or syntactic effects. One possibility is that without the impetus from the completion task, subjects failed to integrate the linguistic material across the two modalities. Other researchers have also failed to find parsing effects in a cross-modal naming task with no integration component (Murray & Watt, 1995; but see Mazuka, 1997).

Discussion

The results of Experiment 3 demonstrated prosodic facilitation and interference effects consistent with those found in Experiments 1 and 2, but at a point much earlier in processing. The cross-modal naming task allowed us to measure at the point of syntactic disambiguation for the temporary ambiguity. We found facilitation when prosodic and syntactic constituent boundaries coincided and interference when prosodic phrasal boundaries were located at misleading points in syntactic structure. Although there were clear syntactic garden path effects in the baseline and conflicting prosody conditions, there were no such effects when cooperating prosodic boundaries were present, suggesting that information from the prosodic representation was available to preclude syntactic parsing difficulty in the early closure condi-
tions. The overall pattern of these results is consistent with the strong version of the view that prosodic information can determine syntactic phrase structure assignment.
Early versus late closure syntactic structures showed distinctly different patterns of effects across the three prosodic conditions. When an IPh boundary followed the verb (e.g., When Roger leaves H-L% the house . . .), we found faster naming times for the early closure target (e.g., is) and slower times for the late closure target (e.g., it’s), compared to the baseline conditions, where information about the location of the prosodic boundary was unavailable. This pattern suggests that the parser closed the syntactic phrase after the verb, at the location of the IPh boundary. When the IPh boundary followed the noun phrase (e.g., When Roger leaves the house H-L%), it again influenced the operation of the parser. We found slower naming times for the early closure target (e.g., is) in the conflicting condition compared to the baseline. However, we did not find faster processing times for the late closure target (e.g., it’s) when we compared cooperating to baseline late closure conditions. Because this pattern of results is so similar to that found in Experiments 1 and 2, we discuss their interpretation together in the next section.

DISCUSSION OF COMBINED RESULTS, EXPERIMENTS 1–3

Taken together, the results of the first three experiments are consistent with the predictions of a model where information from a prosodic representation of the kind specified in phonological theory can influence the operation of the syntactic parsing mechanism. In the cooperating and conflicting sentence materials for these experiments, the right edge of an intonational phrase coincided with a potential syntactic constituent boundary location. This correspondence consistently produced facilitation of temporary syntactic ambiguity resolution for the early closure sentences in cooperating as compared to baseline conditions, in sentence judgments, judgment times, comprehension times, and naming times. In addition, results from these four measures showed that a cooperating pro-
sodic boundary at a syntactic choice point consistently did away with the late closure processing advantage, even though this advantage was present in the baseline and conflicting conditions. There are at least three possible interpretations for this pattern of effects. First, it may be that the late closure structure functioned as a syntactic default, consistent with the predictions of a depth-first view of parsing (Frazier, 1987a, b; Frazier & Clifton, 1996). According to this interpretation, the IPh boundary after the NP was consistent with the ongoing late closure syntactic structuring process, so that naming times for the late closure sentences in the cooperating and baseline prosody conditions reflected the same parsing process. Second, it is possible that since the overall lexical biases in the fragments favored slightly the transitive interpretation of the utterance, this reduced response times in the baseline late closure condition. Such an explanation is consistent with a constraint-based parsing perspective (Tanenhaus & Carlson, 1989; MacDonald et al., 1995). A prediction from this view is that if these transitional probabilities were manipulated to favor the intransitive interpretation, processing time in the baseline late closure condition should become longer, resulting in facilitation for the cooperating condition. The third possibility is that the lack of facilitation in the late closure condition was due to floor effects. Some doubt is cast on this explanation by the numerically faster cross-modal naming times measured in the cooperating early closure condition, but since these times were not statistically different from those in the cooperating late closure condition, the possibility that prosodic structure might facilitate even the preferred syntactic analysis must be left open. If prosodic constituency can determine the syntactic phrase structure selected by the parser, prosodic boundaries should also be capable of misleading the parser, causing interference regardless of whether they are consistent with the syntactically preferred analysis. We demonstrated such effects in Experiments 2 and 3, where comprehension times and naming times were longer for conflicting than for baseline
conditions for both early and late closure comparisons. Although we did not find interference for sentence judgment times in Experiment 1, we speculate that task-related variables, such as time constraints on the metalinguistic aspects of the response, may account for this difference. We predicted interference effects due to the influence of prosodic structure on the initial syntactic decision-making process. From this perspective, the location of the IPh determined the initially assigned syntactic boundary location, and when conflicting morphosyntactic information was encountered, reanalysis ensued. As discussed above, this description alone does not predict the finding of significantly longer processing times for early as compared to late closure conflicting conditions; That is, all else being equal, comparable reprocessing is expected in the two conflicting conditions. Earlier, we suggested that end-of-sentence comprehension times were influenced by the overall weak transitive bias of the verb set or by lexical or semantic biases toward late closure that occurred during the processing of the second clause of the experimental sentences. However, such explanations seem less viable for the crossmodal naming results, where sentence fragments ended after the postverbal NP. Aside from a weak transitive bias that may have guided reanalysis, why would processing have been more difficult for the early than the late closure conflicting conditions? One possibility is that use of the IPh boundary information was delayed for the early closure conflicting conditions, because the truncation used to create the fragments resulted in less determinate phonetic information for that condition. While both early and late closure conflicting fragments contained phrase-final lengthening, boundary tones, and silence at the IPh, the late closure conflicting fragments also contained additional sentence material following the silence. This information may have been available to confirm phonetically the end of the material preceding the prosodic constituent boundary, by indicating the beginning of the next constituent. Such information would have been removed in the truncation of the early closure conflicting frag-
ments, where IPh information was followed immediately by the visual target. We note that an account based on phonetic differences is not available to explain the same pattern of effects for either the end-of-sentence judgment or comprehension time results, where materials were full sentences.
There is one final possibility, which would apply to both sentence and fragment materials: Perhaps reanalysis was more difficult for the early closure than the late closure conflicting conditions. More specifically, reanalysis in the conflicting early closure sentences would require detaching an argument NP from within its verb phrase and reassigning it to subject position in the next clause, a difficult process. However, in the late closure conflicting sentences, the prosodic boundary following the verb was consistent with an initial analysis of the verb as intransitive and the NP as the subject of an upcoming sentence. Although reanalysis to a direct object reading would require a similarly difficult restructuring of the closed VP, there is another less frequent syntactic analysis potentially available. If the second clause were reanalyzed as containing a topicalized NP (the door, it’s locked), reanalysis would not require restructuring of the verb phrase and would result in a low-likelihood syntactic analysis, but one more consistent with the prosodic boundaries after the verb and NP. If the topicalized reanalysis was sometimes chosen instead of the transitive, this could account for the faster processing times we found in the late closure conflicting conditions. Such an explanation is not available to account for garden path effects in the baseline conditions, where the location of prosodic phrasal boundaries was ambiguous.
The results of Experiments 1 through 3, then, provide evidence that prosodic structure can influence the resolution of temporary syntactic ambiguities very early in the parsing process. The evidence is consistent with the view that information available from a prosodic representation of the spoken input determined the assignment of syntactic structure. When prosody was consistent with syntax, these results suggest it precluded processing difficulty for the less-preferred syntactic analysis. When prosody con-
flicted with syntax, it created processing difficulty for both the preferred and less-preferred syntactic analyses.

EXPERIMENT 4

We have argued that phonological information recognized during the sentence comprehension process includes a prosodic representation of the sort described in phonological theory and that the correspondence between that representation and other levels of linguistic analysis is the locus of effects like those demonstrated above. So far in our experiments, we have manipulated intonational phrase boundaries, which are associated with substantial phrase-final lengthening and silence and delimited by boundary tones (H% or L%). However, in linguistic theories of the prosody–syntax correspondence, the level of prosodic phrasing that is most closely associated with syntactic phrasing is not the IPh, but the PPh (e.g., Selkirk, 1986, 1995; Nespor & Vogel, 1986). The PPh is also delimited by a tone, the phrase accent (H- or L-). Interestingly, because each prosodic constituent is exhaustively parsed into constituents at the next lowest level of the hierarchy (“strict layering,” see Selkirk, 1995; Nespor & Vogel, 1986), whenever the right edge of an IPh occurs, it is immediately preceded by the right edge of a PPh. Thus, whenever we have associated a syntactic choice point with an IPh boundary and boundary tone, we have also associated it with a PPh boundary and phrase accent. In Experiment 4, we were interested to find out whether the prosodic facilitation and interference effects on syntactic parsing that have been demonstrated with IPh boundaries could be shown with PPh boundaries alone. We predicted that if the prosody–syntax correspondence between PPh constituency and syntactic constituency is used by the language comprehension system, PPh boundaries would determine syntactic parsing decisions, producing the same pattern of effects shown in the previous experiments. Such effects would strengthen the argument that the phonological input to the syntactic parsing mechanism includes a prosodic representation.
Experiment 4 was also motivated by two
alternative explanations that have been suggested to account for the effects of prosodic boundaries on syntactic disambiguation. First, it has been argued that prosodic effects on syntactic ambiguity can be demonstrated only when unusual contours or extreme pitch excursions are used (Allbritton, McKoon, & Ratcliff, 1996; Murray & Watt, 1995). The prosodic structures created for Experiment 4 contained PPh boundaries that were quite subtle when compared to those with IPh boundaries. Second, it could be argued that the silent durations at IPh boundaries were responsible for syntactic ambiguity resolution. From this perspective, the parsing mechanism was able to make use of the additional processing time available in the ambiguous region, in which there was no acoustic information, to process the ambiguity more effectively. This explanation implies that an artifact of the IPh boundary, rather than the mental representation of prosodic structure, was driving the prosodic effects in the previous experiments. In order to address this contention, the PPh boundaries in these stimuli contained no phrase-final silence. So, in Experiment 4, we again used a cross-modal naming task, this time to examine the effect of PPh boundaries on the resolution of temporary syntactic ambiguity.

Normative Study, Phonetic Analyses, and Pretest

The 18 sentences used in Experiment 4 were selected on the basis of a normative study, a series of phonetic analyses, and a pronunciation acceptability pretest. A large set of early and late closure sentences (44 items) was created using verbs from the normative study. They were pronounced in a sound-attenuated room by the same speaker used for Experiments 1–3 and judged to reflect the intended prosodic structures by two listeners trained in phonetics and phonology. Early and late closure sentence versions were matched for the number of words and syllables and for lexical-level stress pattern.
Normative study. The set of verbs used in the sentences for Experiment 4 was chosen according to the same procedure used for the materials for Experiments 1–3, described in the
discussion preceding Experiment 1. The mean transitivity score (18.37) shows a slight bias toward transitive completion. Transitivity bias scores for each item are presented in the Appendix.
Phonetic analyses. We used the ToBI system (Silverman et al., 1992) to transcribe phonological analyses of the sentences. Intonation contours were transcribed in the same manner as for the materials used for Experiments 1–3 (see examples in Fig. 6). Again, durational and fundamental frequency (F0) analyses were completed for the temporarily ambiguous region of the sentence fragments in order to confirm that they had been pronounced with the intended prosodic structure. Durations were measured using Sound Designer II software, and F0 contours with Signalyze software (Keller, 1994). The sentences were pronounced with either cooperating or baseline prosody. In the cooperating conditions, the syntactic boundary coincided with a phonological phrase (PPh) boundary, including a high phrase accent (H-) and a level 2 or level 3 break (see Beckman & Ayers, 1993).7 In the baseline conditions, neither potential syntactic boundary was clearly marked with a prosodic boundary. Baseline sentences were spoken with a pitch accent (L+H*) on the subject of the first clause and deaccentuation of the temporarily syntactically ambiguous region, such that the precise location of a low phrase accent (L-) and the associated level 1 break was phonetically ambiguous. The F0 contour in this region was generally low and flat. Thus, early and late closure baseline sentences differed in their underlying phonological representations, but not in their surface phonetic structures.

7 To distinguish the H- accents from an H-L% sequence (see discussion in Beckman, 1996), we used coarticulation of segments between the first and second clauses of the sentence and minimal lengthening of the phrase-final word. In addition, the filler sentences used in the study contained a variety of sentence-medial IPh boundaries, all of which had substantial lengthening and following silence, as a contrast set. To distinguish the H- from the presence of no PPh boundary, we avoided pronunciations with a series of down-stepped H* accents that would include the H- tone.
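Comparable duration and F0 measurements can be sketched with current, freely available tools. The example below uses the Praat-based parselmouth library rather than the Sound Designer II and Signalyze software the authors used, and it assumes that word boundary times come from a hand-labeled annotation; it is an illustration, not a description of the original measurement procedure.

# Illustrative reanalysis sketch using parselmouth (a Python interface to
# Praat), not the original Sound Designer II / Signalyze procedure.
import numpy as np
import parselmouth

def word_measures(wav_path: str, t_start: float, t_end: float):
    """Duration (ms), mean F0, and max F0 (Hz) for one hand-labeled word
    spanning t_start to t_end seconds in the file."""
    snd = parselmouth.Sound(wav_path)
    pitch = snd.to_pitch()
    f0 = pitch.selected_array["frequency"]   # 0 where Praat finds no voicing
    times = pitch.xs()
    voiced = (times >= t_start) & (times <= t_end) & (f0 > 0)
    duration_ms = (t_end - t_start) * 1000.0
    return duration_ms, float(np.mean(f0[voiced])), float(np.max(f0[voiced]))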
Figure 6 shows early and late closure versions of the example sentence When Roger leaves the house is / it’s dark in the cooperating and baseline conditions. The figure includes the amplitude-by-time waveform with durations marked for words in the syntactically ambiguous region (there were no measurable silences for this item), and the F0 contour with ToBI tone and break indices transcribed.

FIG. 6. Acoustic waveforms with word durations (ms), F0 (Hz), and ToBI transcription for an example early/late closure sentence pair with PPh-based cooperating and baseline prosody from Experiment 4.

For the 18 sentences selected for use in Experiment 4, analyses of duration and F0 showed minimal phonetic differences between early/late closure sentence pairs with baseline prosodies, but significant differences between early/late closure pairs with cooperating prosodies. Phonetic measurements are summarized in Table 6.

TABLE 6
Phonetic Measurements for Materials with Phonological Phrase Boundaries Used in Experiment 4

Mean Durations (ms)

                          Verb           Silence 1   NP             Silence 2   Fragment
Cooperating prosody
  Early closure syntax    533.9 (27.3)   .8 (.5)     478.2 (18.1)   0 (0)       1688.7 (62.7)
  Late closure syntax     382.9 (19.0)   .4 (.4)     498.3 (20.6)   0 (0)       1576.3 (55.8)
Baseline prosody
  Early closure syntax    361.8 (23.7)   0 (0)       400.8 (16.1)   0 (0)       1602.6 (45.7)
  Late closure syntax     361.5 (22.0)   0 (0)       410.9 (17.5)   0 (0)       1594.7 (54.1)

F0 Measures (Hz)

                          Mean Max F0 (V)   Mean Max F0 (NP)   F0 Range (ambiguous region)   F0 Mean (ambiguous region)
Cooperating prosody
  Early closure syntax    225.0 (2.3)       189.9 (3.4)        146–258                       178.5 (31.8)
  Late closure syntax     175.3 (4.4)       213.6 (2.9)        141–230                       170.2 (33.7)
Baseline prosody
  Early closure syntax    173.7 (9.8)       186.2 (6.0)        139–216                       156.7 (20.7)
  Late closure syntax     170.7 (6.4)       174.1 (6.3)        139–212                       160.3 (21.1)

Note. Standard error values are in parentheses.

Duration measurements were compared for the main verb, the ambiguous noun phrase, the silence following the verb, the silence following the ambiguous noun phrase, and the sentence fragment, from sound onset to the truncation point. As in the phonetic analyses of the IPh materials for the first three experiments, we found significant effects of prosody and measurement location, and all interactions were significant (all Fs > 22, all ps < .0001). However, the main effect of syntax was not significant (F(1,17) = 3.69, p < .07). Planned comparisons were conducted for early versus late closure within baseline and cooperating conditions. Early versus late closure baseline prosody sentences showed no systematic differences in duration in the temporarily ambiguous region for any measure (all Fs < 1). At potential pause locations following the V and NP, there was no measurable silence in any item in the baseline conditions. In contrast, early versus late closure cooperating prosody sentences showed significant differences in duration for both the V and the NP in the ambiguous region. Consistent with the presence of a clause-final PPh boundary, the V was longer in cooperating early closure sentences than in cooperating late (means were 534 and 383 ms, respectively; F(1,17) = 437.0, p < .001), while the NP was longer in cooperating late closure sentences than in cooperating early (means were 498 and 478 ms, respectively; F(1,17) = 7.7, p < .01). In both early and late closure conditions, three items had very brief measurable silence following the V, but the presence of this silence was nonsystematic. There was no measurable silence fol-
lowing the NP in any cooperating item. There were no significant differences in fragment duration across the four conditions (F < 1).
Multiple F0 measurements were compared for the ambiguous region (e.g., leaves the house), including (1) the absolute range of fundamental frequency values across all items, (2) the mean fundamental frequency, and (3) the mean F0 maxima for the verb and the ambiguous NP. F0 maxima were chosen as indicators of the presence of a high phrase accent tone in the temporarily ambiguous region in these sentences (see Kjelgaard, 1995, for additional detail). The maximum F0 was chosen as the best indicator of tonal phenomena due to the use of the H- phrase accent (Beckman & Ayers, 1993). An analysis of variance showed significant effects of prosody and measurement location and significant interactions of syntax and location, and syntax, prosody, and location (all Fs > 16, all ps < .0001). For the first two measures, analysis of variance showed no significant differences between baseline early and late closure sentences (all Fs < 1.2). Comparison of the baseline to the cooperating conditions showed a lower mean F0 and a relatively restricted F0 range for the disambiguating region in the baseline sentences, consistent with deaccenting. The F0 range analysis also showed a higher range in the cooperating conditions than in the baseline conditions, consistent with the use of H- for clause-final positions in cooperating conditions, but L- for baseline conditions (Silverman et al., 1992; Beckman, 1996). An analysis of variance showed no significant differences in F0 maxima between the two baseline conditions for the V (F < 1). On the ambiguous NP, mean F0 maxima were somewhat higher in the early closure condition, but this difference was not statistically reliable
(F(1,17) = 3.5, p = .08). In contrast, F0 maxima showed clear differences between early and late closure sentences in the cooperating condition. Clause-final words had significantly higher F0 maxima than their nonfinal counterparts (for V, early closure higher than late, F(1,17) = 58.69, p = .0001; for NP, late closure higher than early, F(1,17) = 12.8, p = .001).
Pretest for pronunciation acceptability. To show that the baseline and cooperating pronunciations were comparably acceptable for both early and late closure syntactic structures, and that the baseline pronunciations were highly acceptable, we collected listeners’ judgments of the baseline and cooperating pronunciations of the full sentences (rather than truncated versions, which would not have included a syntactic disambiguation). The judgment task was substantially the same as that used to assess compatibility for materials used in Experiments 1–3. Thirty-three subjects heard 44 experimental sentences and 40 additional sentences that varied in syntactic and prosodic structure and acceptability of pronunciation. Using the acceptability judgment data, we selected 18 sets of early and late closure sentences for which all three prosodic conditions met our criteria. Experimental sentences are shown in the Appendix. Table 7 shows the acceptability ratings for the 18 sentences in the four conditions.

TABLE 7
Mean Proportion of Acceptability Ratings for the 18 Early and Late Closure Sentences with Phonological Phrase Boundaries Used in Experiment 4

Condition        Cooperating prosody   Baseline prosody
Early closure    .904 (.019)           .833 (.022)
Late closure     .874 (.034)           .900 (.021)

Note. Standard error values are in parentheses.

The average rate of acceptance for the baseline and cooperating early and late closure sentences was 88%. Planned contrasts of percent acceptability for these 18 sentences showed no statistical differences between cooperating early and late closure (F < 1), no difference between baseline early and late closure sentences (F(1,17) = 2.7, p < .1), and no difference
between cooperating and baseline sentences (F , 1). Method Subjects. Forty-eight undergraduate psychology students at Northeastern University participated in exchange for credit toward a course requirement. All subjects were native English speakers and reported having no hearing, speech, or language problems. Materials. The 18 sentence sets, chosen after pretesting and phonetic analyses, were digitally truncated for presentation in the cross-modal naming task. Each fragment contained all of the acoustic information from the onset of the sentence through end of the ambiguous region (e.g., When Roger leaves the house . . .). The experimental design contained the same six conditions as in the previous experiments: Early and late closure syntactic structure was factorially crossed with cooperating, baseline, and conflicting prosody. In the cooperating conditions, a PPh boundary coincided with one of the two possible syntactic clause boundaries in the auditory fragment. Visual target words resolved the syntactic ambiguity so that when the PPh followed the verb, the visual target resolved the ambiguity toward early closure, and when the PPh followed the ambiguous NP, the visual target resolved toward late closure. Conflicting conditions were created using the auditory fragments from the cooperating condition with the opposite visual targets. For baseline fragments, the ambiguous region was pronounced so that the precise location of a low phrase accent (L2) and level 1 break was phonetically ambiguous within the syntactically disambiguating region. Table 8 shows the spoken sentence fragments and the corresponding visual target words. The six versions of each of the 18 experimental items were distributed among six materials sets, using a condition rotation determined by a Latin square design. Each materials set contained three tokens from each of the six conditions and 20 filler fragments. The filler fragments were a subset of those used in Experiment 3. Procedure. The cross-modal naming task was
TABLE 8
An Example Auditory Sentence Fragment with Visual Targets for Cross-Modal Naming from Experiment 4

Condition | Auditory fragment | Visual word
Cooperating prosody, early closure | H* ((When Roger leaves H-)PPh (the house | is
Cooperating prosody, late closure | H* ((When Roger leaves the house H-)PPh) | it's
Baseline prosody, early closure | L+H* ((When Roger leaves L-)PPh (the house | is
Baseline prosody, late closure | L+H* ((When Roger leaves the house L-)PPh | it's
Conflicting prosody, early closure | H* ((When Roger leaves the house H-)PPh) | is
Conflicting prosody, late closure | H* ((When Roger leaves H-)PPh (the house | it's

Results

Incorrect responses were excluded from analysis using the same criteria as those for Experiment 3. Missing data accounted for 9% of the total experimental responses and were replaced using the average of the experiment-wise individual subject and item means (Winer, 1971). Table 9 shows the distribution of missing responses. The majority of errors again occurred in the baseline and conflicting early closure conditions, so that the data replacement slightly underestimated the duration of response times in these conditions. Because these were conditions predicted to have slower reaction times, any underestimation worked against the hypotheses.
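The replacement step can be sketched as follows, under one common reading of the Winer (1971) procedure in which a missing cell is filled with the average of the corresponding subject mean and item mean. The column names and toy values are our own assumptions, not the original data.

```python
import numpy as np
import pandas as pd

def replace_missing(rt: pd.DataFrame) -> pd.DataFrame:
    """Fill missing naming times with the average of the corresponding
    subject mean and item mean (one reading of the Winer, 1971 procedure)."""
    out = rt.copy()
    subj_means = out.groupby("subject")["rt"].transform("mean")
    item_means = out.groupby("item")["rt"].transform("mean")
    fill = (subj_means + item_means) / 2
    out["rt"] = out["rt"].fillna(fill)
    return out

# Hypothetical data: NaN marks an excluded (incorrect or missing) response.
data = pd.DataFrame({
    "subject": [1, 1, 2, 2],
    "item":    [1, 2, 1, 2],
    "rt":      [650.0, np.nan, 700.0, 720.0],
})
print(replace_missing(data))
```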
TABLE 9
Proportion of Missing Data in Six Conditions for Experiment 4

Condition | Cooperating | Baseline | Conflicting
Early closure | .06 | .12 | .18
Late closure | .06 | .04 | .05

FIG. 7. Cross-modal naming times and standard errors (ms) for early/late closure sentences with PPh-based cooperating, baseline, and conflicting prosodies, Experiment 4.

Mean corrected naming times and standard errors for Experiment 4 are shown in Fig. 7. Results showed the same pattern of effects found in Experiment 3. An analysis of variance found a main effect of prosody (F1(2,47) = 31.49, F2(2,17) = 16.54), a main effect of syntax (F1(1,47) = 35.27, F2(1,17) = 30.39), and a significant interaction of prosody and syntax (F1(2,94) = 14.44, F2(2,34) = 6.82).
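To make the by-subjects (F1) and by-items (F2) logic concrete, the sketch below aggregates hypothetical trial-level naming times into subject and item condition means and then computes a one-degree-of-freedom planned contrast; for such a contrast tested against its own error term, the repeated-measures F equals the square of a paired t. The miniature design, column names, and values are assumptions for illustration only and do not reproduce the analyses reported here.

```python
import pandas as pd
from scipy.stats import ttest_rel

prosodies = ["cooperating", "baseline", "conflicting"]
base_rt = {"cooperating": 645, "baseline": 708, "conflicting": 813}

# Hypothetical trials: 6 subjects x 3 items, with prosody rotated over items
# (a miniature Latin square) and a little deterministic variation added.
rows = []
for s in range(6):
    for i in range(3):
        cond = prosodies[(s + i) % 3]
        rows.append({"subject": s, "item": i,
                     "prosody": cond, "rt": base_rt[cond] + 5 * s - 3 * i})
trials = pd.DataFrame(rows)

# F1: one mean per subject per condition (collapsing over items).
by_subj = trials.pivot_table(index="subject", columns="prosody", values="rt")
# F2: one mean per item per condition (collapsing over subjects).
by_item = trials.pivot_table(index="item", columns="prosody", values="rt")

# One-df planned contrast (cooperating vs. baseline): F = t squared.
t1 = ttest_rel(by_subj["cooperating"], by_subj["baseline"])
t2 = ttest_rel(by_item["cooperating"], by_item["baseline"])
print(f"F1(1,{len(by_subj) - 1}) = {t1.statistic ** 2:.2f}")
print(f"F2(1,{len(by_item) - 1}) = {t2.statistic ** 2:.2f}")
```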
Planned comparisons showed facilitation in the cooperating conditions, where targets were named more quickly than in the baseline conditions (means were 645 and 708 ms, respectively; F1(1,47) = 9.0; F2(1,17) = 4.3, p < .05). As in the previous experiments, this effect did not hold within the late closure conditions, where baseline naming times were numerically faster than cooperating times (means were 627 and 648 ms, respectively; F < 1). There was an interference effect in the conflicting conditions, where targets were named more slowly than in the baseline conditions (means were 813 and 708 ms, respectively; F1(1,47) = 25.76; F2(1,17) = 12.2). As in Experiment 3, the disruptive effect of misleading prosody was not carried by the early closure condition alone. Conflicting late closure sentences were slower than baseline late closure sentences, but this difference was significant only by subjects (means were 709 and 627 ms, respectively; F1(1,47) = 7.75; F2(1,17) = 3.64, p = .06).

Additional planned comparisons showed syntactic garden path processing difficulty in the baseline conditions (early closure mean = 788 ms and late closure mean = 627 ms; F1(1,47) = 29.72; F2(1,17) = 14) and in the conflicting conditions (early closure mean = 917 ms and late closure mean = 709 ms; F1(1,47) = 49.9; F2(1,17) = 23.6). As in the previous experiments, there was no evidence of garden pathing in the cooperating conditions, where PPh boundaries eliminated the effect of syntax. Naming times for late closure targets were numerically longer than those for early closure targets (means were 648 and 643 ms, respectively; both Fs < 1).

Verb transitivity and naming times. As in Experiment 3, we conducted a post hoc analysis of verb transitivity, using the transitivity scores from the normative study. We tested for a correlation between naming times and transitivity scores within each of the six experimental conditions. Results this time showed no significant correlations between transitivity difference scores and naming times (all Fs < 2, all ps > .20). The lack of effects here is consistent with the findings for the previous experiments.
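The transitivity analysis can be illustrated with a short sketch that correlates item-level naming times with transitivity bias scores within a condition. The naming times and the condition label below are invented for illustration; the transitivity values are simply the first six bias scores listed in the Appendix.

```python
import pandas as pd
from scipy.stats import pearsonr

# Hypothetical item-level data: mean naming time per item in one condition,
# alongside that item's verb transitivity bias score from the norming study.
items = pd.DataFrame({
    "condition":    ["baseline-early"] * 6,
    "transitivity": [-51.72, 17.24, 100.0, -31.03, -55.17, -86.21],
    "naming_ms":    [790, 805, 760, 820, 798, 815],
})

# Test the naming-time/transitivity relationship separately in each condition.
for cond, sub in items.groupby("condition"):
    r, p = pearsonr(sub["transitivity"], sub["naming_ms"])
    print(f"{cond}: r = {r:.2f}, p = {p:.3f}")
```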
Across the four experiments, data from end-of-sentence measures and data from the naming measure at the point of syntactic disambiguation showed a weak relationship between response time and transitivity score. We hypothesize that the absence of a consistent pattern of correlation with lexical preference information is due to the restricted number of items that contained a strong bias toward either transitive or intransitive use. Thus our particular set of items may have reduced the detectability of the influence of verb transitivity on response time.

Discussion

The pattern of cross-modal naming times found with PPh constituents in Experiment 4 is remarkably similar to that found with IPh constituents in Experiment 3. In both experiments, the use of the cross-modal naming task allowed us to measure processing difficulty very near the prosodic boundary and just after the point where the syntactically disambiguating word was encountered. The results of Experiment 4 showed that cooperating PPh boundaries facilitated syntactic decision making, so that the processing disadvantage associated with a dispreferred syntactic analysis was overcome by a felicitous correspondence between PPh structure and syntactic structure. Conflicting PPh boundaries produced interference for both early and late closure syntactic structures, demonstrating that prosody can mislead the parser regardless of whether it is consistent with the preferred syntactic analysis.

The results are consistent with the operation of a sentence comprehension mechanism that is sensitive to the correspondence between PPh constituency and syntactic constituency. They suggest that prosodic boundaries need not involve large pitch excursions, extensive phrase-final lengthening, or substantial silent durations to be effective in the resolution of temporary syntactic ambiguity. The similarity between the results of Experiments 3 and 4 rules out the possibility that only the silence available at IPh boundaries enabled syntactic ambiguity resolution. That is, the results do not support an account in which the parser simply took advantage of extra processing time available in the ambiguous region to conduct reanalysis. Experiment 4 showed that the PPh boundary, which did not provide a substantial
silence, did indeed resolve the closure ambiguity. Finally, Experiment 4 showed the same pattern of interaction between prosodic and syntactic factors that was found in Experiment 3. We demonstrated both facilitation and interference for sentences with early closure syntax: Compared to the early closure baseline prosody, a cooperating PPh boundary after the verb produced faster naming times, while a conflicting PPh boundary after the NP produced slower times. We found only interference effects for sentences with late closure syntax: Compared to the late closure baseline, a cooperating PPh boundary after the ambiguous NP did not produce significantly faster naming times, while a conflicting PPh boundary after the verb significantly interfered with processing.

We again found an unpredicted late closure advantage in the conflicting conditions. In the previous discussion, we suggested several possible explanations for this pattern of results, all of which may be extended to Experiment 4. One additional possibility focuses on the phonetic ambiguity of the baseline sentences. There has been little discussion of the process by which prosody itself is parsed from the phonetic input (see Beckman, 1996). However, Schafer (1997) has suggested that phonological processes may operate to hold PPh constituents open until the occurrence of evidence to the contrary. This would predict that in our baseline pronunciations, the L- accent would not be assigned in the phonological representation until after the ambiguous NP, consistent with a late closure syntactic analysis. A disadvantage for conflicting conditions as compared to baseline is predicted here as well. For late closure, at the end of the fragment in the baseline condition the phonology is ambiguous, and the L- has yet to be assigned. In contrast, at the verb in the late closure conflicting condition, the H- accent assignment is unambiguous and is consistent with the (erroneous) early closure syntactic analysis. For early closure baseline, the phonology is again ambiguous as the fragment ends, while in the early conflicting condition the H- accent is unambiguously on the
final NP, inconsistent with the upcoming early closure target.

GENERAL DISCUSSION

The combined results from the experiments presented here form a consistent picture that we believe can be accounted for most elegantly if the operation of the syntactic parsing mechanism is sensitive to information from a phonological prosodic representation. Across all four experiments, results showed no evidence of syntactic garden path effects when prosodic and syntactic constituent boundaries coincided. We consistently demonstrated a processing disadvantage for early closure syntactic structures in the baseline and conflicting conditions, but that syntactic difference disappeared when the sentences were presented with cooperating prosody. When processing times were measured at the syntactically disambiguating word, we demonstrated facilitation for early closure syntactic structures and interference for both late and early closure structures. When processing times were measured at the end of the sentence, we demonstrated a general pattern of facilitation for cooperating prosody conditions. Interference effects for conflicting prosody conditions were present in phonosyntactic grammaticality judgments and in sentence comprehension times, but not in speeded judgment times.

Our naming results demonstrate that prosodic information has its influence on syntactic structuring very early in the parsing process, consistent with previous findings of prosodic effects at the point of syntactic disambiguation (Marslen-Wilson et al., 1992; Warren et al., 1995). A novel contribution of these studies is the finding of both facilitation and interference at this early point in processing. In addition, because the temporarily ambiguous regions in our sentences were very short (three to five syllables), the measurement taken at the syntactically disambiguating word was very near the relevant prosodic boundaries. The finding of facilitation and interference effects in both naming and end-of-sentence measures conflicts with predictions that could be derived from models positing a delay in the use of prosodic information (Pynte & Prieur, 1996; Marcus & Hindle, 1990). For
example, if prosodic information contributes only to the reanalysis of previous syntactic commitments (Pynte & Prieur, 1996), one might expect that facilitation due to cooperating prosody would appear at sentence end, but not in on-line measures. One might also have expected some evidence of longer naming times for the syntactically dispreferred analysis compared to the preferred one in the cooperating prosody conditions. A model that posits early use of boundary tones, but the delayed use of intermediate levels of phrasing (Marcus & Hindle, 1990), should predict a difference in naming times between the IPh and PPh studies. Instead, we found an identical pattern of interactions between syntactic and prosodic factors, both for salient IPh and for more subtle PPh boundaries.

The interference effects, demonstrated by longer times in the conflicting than in the baseline conditions, are also relevant for the view that prosodic effects are easily overridden by conflicting morphosyntactic information (Marslen-Wilson et al., 1992). We found interference not only in the naming studies, but also with the end-of-sentence measures, where there was more time for morphosyntactic information to have a revisionary effect.

Our findings of facilitation and interference depend on comparison to the baseline condition. We do not claim that the baseline prosody is a "neutral" prosody in some absolute sense. Rather, it is neutral in the empirically determined sense defined by our pretesting procedures and assumptions. Subjects judging the spoken baseline sentences' acceptability first read them and acknowledged comprehension (we assume they completed a syntactic analysis during this process). To the extent that these syntactically informed judgments of prosodic well-formedness reflect the prosody–syntax correspondence for our materials, our baseline can serve as a neutral comparison condition.

The combined results of the experiments presented here dismiss many methodologically based objections to the claim that prosody can determine the resolution of syntactic ambiguity. The cross-modal naming task can be criticized because it is a relatively unnatural, contrived situation for language processing that involves
reading, listening, comprehension, and production. The stimuli are interrupted, incomplete sentences. Although we reduced the metalinguistic component of the task with our "good friend" manipulation, subjects still needed to integrate material across modalities in order to complete the task. The two main advantages of cross-modal naming are that it provides a relatively immediate measure of processing and that it involves no possible cross-splicing artifact in the materials in the conflicting conditions.

In contrast, the end-of-sentence tasks have the advantage of being relatively natural language tasks, involving simple comprehension or a judgment of the speaker's pronunciation. They are performed on language materials that are presented in a single modality, and the stimuli are full sentences, so that the listener has access to the disambiguating information that occurs just after the syntactic ambiguity. The disadvantages of these tasks are that they measure relatively late in processing and that some of the processing difficulty shown for conflicting condition sentence materials may be attributable to artifacts associated with digital cross-splicing.

A comparison of the speeded phonosyntactic grammaticality judgment task and the end-of-sentence comprehension task allowed us to assess the contribution of a metalinguistic judgment component that was present in one task but not the other. A comparison of the end-of-sentence results and those from naming allowed us to demonstrate similar patterns of prosodic effects using tasks with complementary strengths and weaknesses.

The experiments presented here also begin to reduce the confusion apparent in the literature concerning the type of sound-based difference that is important for the resolution of syntactic ambiguity. Many recent psycholinguistic studies have demonstrated that when spoken materials are used, a wide range of syntactic effects that have been established in studies of reading are reduced or removed. Such effects have been shown for NP vs. S complement ambiguities (Warren, 1985; Marslen-Wilson et al., 1992; Beach, 1991; Stirling & Wales, 1996; Nagel, Shapiro, Tuller & Nawy, 1996), sentences with empty categories (Nagel, Shapiro, & Nawy, 1994), PP attachment ambiguities
(Pynte & Prieur, 1996; Schafer, 1997), early–late closure ambiguities (Slowiaczek, 1981; Speer, Kjelgaard, & Dobroth, 1996; Speer & Dobroth, submitted for publication; Warren, 1985; Kjelgaard, 1995; Warren et al., 1995), and ambiguous coordination structures (Grabe, Warren & Nolan, submitted for publication). However, other recent studies using similar syntactic structures have failed to show clear and consistent prosodic effects (Murray et al., 1996; Watt & Murray, 1996; Albritton et al., 1996; Nicol & Pickering, 1993).

Looking across the studies, the methods used to manipulate and describe prosodic structure vary widely. Some studies have used "untrained" or "naive" speakers, who produce a felicitous prosodic contour by speaking while holding in mind one of two meanings for a syntactic ambiguity or by reading the sentence in a disambiguating context. Other studies have used punctuated text, speakers trained as actors or radio announcers, and/or instructions to disambiguate. Still others used speakers trained in phonetics or phonology who instantiate particular prosodic structures, or speech synthesizers set to produce a particular set of durations and tones. Once the prosodies are produced, some researchers do not describe their sound characteristics at all, while others give a brief impressionistic description. Many of the more recent studies provide phonetic measurements of duration and fundamental frequency, some for the materials used in the comprehension study, but others for a separate set of similar materials. A few studies provide phonological transcriptions with supporting phonetic measurements for the particular materials used to demonstrate prosodic effects on comprehension.

Explicit specification of both the phonetics and the phonology of experimental materials is necessary if we are to replicate and extend experimental findings and develop a principled account of the use of prosody in sentence understanding. Consistent with Warren and colleagues (Grabe et al., submitted for publication; Warren, 1997), we would like to encourage this latter approach. They have argued that the most appropriate characterization of prosody in the study of language processing is as a phonological system, rather than as a set of measurements taken from the speech waveform. Aggregate
phonetic measurements, while they can to some extent describe a speech sound manipulation, may also obscure important phonological differences among materials in a set. The similarity in the results from the two naming experiments presented here supports the claim that phonological entities, rather than their phonetic implementations, are the important influence on syntactic parsing decisions. The large L-L% pitch excursions, substantial phrase-final lengthening, and silence associated with the IPh boundaries in Experiment 1 were no more effective than the subtle lengthening and H- pitch rise associated with the PPh boundaries in Experiment 4.

The results presented here can be most easily explained, we feel, by the very early use of information from a phonological prosodic representation during the syntactic parsing process. However, many interesting questions remain concerning how prosodic and syntactic factors interleave as spoken sentence information becomes available to the processor. For example, how pervasive are prosodic effects during parsing? If prosodic and segmental phonology are simultaneously available to inform lexical and syntactic processing, can information about a word's location in prosodic constituency influence the recovery of lexical syntactic category information? When preceding discourse context is available to determine the resolution of a temporary ambiguity, are prosodic effects still locally influential? What are the effects of strong transitional probabilities, such as subject–verb agreement and verb argument biases, on the use of prosodic phrasal information? Future research will allow us to more completely and precisely specify the impact of the prosodic representation during spoken sentence comprehension.

APPENDIX

Experimental Materials

For each item, we present tones and break indices (in bold typeface) as they occurred in the temporarily ambiguous region for the cooperating/conflicting and baseline prosodies. Verb transitivity bias scores (see text) are shown in parentheses.
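To make the annotation scheme concrete, the sketch below encodes one fragment from the Appendix as a list of word records, each carrying a pitch accent, a phrase accent (with any boundary tone), and a ToBI break index. The Word record type and its field names are our own illustrative assumptions; only the tone values and break indices come from the materials.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Word:
    text: str
    pitch_accent: Optional[str] = None   # e.g., "H*" or "L+H*"
    phrase_tone: Optional[str] = None    # phrase accent plus any boundary tone, e.g., "L-", "H-", "L-L%"
    break_index: Optional[int] = None    # ToBI break index, 1-4

# Item 7, cooperating early closure prosody:
# "When Roger leaves H* L-L% 4 the house ..."
cooperating_early = [
    Word("When"),
    Word("Roger"),
    Word("leaves", pitch_accent="H*", phrase_tone="L-L%", break_index=4),
    Word("the"),
    Word("house"),
]

# Baseline prosody for the same item:
# "When Roger L+H* leaves L- 1 the house ..."
baseline = [
    Word("When"),
    Word("Roger", pitch_accent="L+H*"),
    Word("leaves", phrase_tone="L-", break_index=1),
    Word("the"),
    Word("house"),
]

for w in cooperating_early:
    print(w)
```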
Experiments 1 and 2

Experimental items. For each item, the sentences are in the order: Cooperating Prosody Early Closure, Cooperating Late Closure, Baseline Prosody Early Closure, Baseline Prosody Late Closure.

1. (-51.72) Because John studiedH*L-L%4 the material is clearer now. Because John studied the materialH*L-L%4 it's clearer now. Because JohnL+H* studiedL-1 the materialH*L-1 is clearer now. Because JohnL+H* studiedL-1 the materialH*L-1 it's clearer now.
2. (17.24) When Whitesnake playsH*L-L%4 the music is loud. When Whitesnake plays the musicH*L-L%4 it's loud. When WhitesnakeL+H* playsL-1 the music is loud. When WhitesnakeL+H* plays the musicL-1 it's loud.
3. (100) When Tim is presentingH*L-L%4 the lectures are interesting. When Tim is presenting the lecturesH*L-L%4 they're interesting. When TimL+H* is presentingL-1 the lectures are interesting. When TimL+H* is presenting the lecturesL-1 they're interesting.
4. (-31.03) When the original cast performsH*L-L%4 the plays are funny. When the original cast performs the playsH*L-L%4 they're funny. When the original castL+H* performsL-1 the plays are funny. When the original castL+H* performs the playsL-1 they're funny.
5. (-55.17) When Madonna singsH*L-L%4 the song is a hit. When Madonna sings the songH*L-L%4 it's a hit. When MadonnaL+H* singsL-1 the song is a hit. When MadonnaL+H* sings the songL-1 it's a hit.
6. (-86.21) Whenever John swimsH*L-L%4 the channel is choppy. Whenever John swims the channelH*L-L%4 it's choppy. Whenever JohnL+H* swimsL-1 the channelH*L-1 is choppy. Whenever JohnL+H* swimsL-1 the channelH*L-1 it's choppy.
7. (6.91) When Roger leavesH*L-L%4 the house is dark. When Roger leaves the houseH*L-L%4 it's dark. When RogerL+H* leavesL-1 the house is dark. When RogerL+H* leaves the houseL-1 it's dark.
8. (34.48) Whenever Frank performsH*L-L%4 the show is fantastic. Whenever Frank performs the showH*L-L%4 it's fantastic. Whenever FrankL+H* performsL-1 the showH*L-1 is fantastic. Whenever FrankL+H* performsL-1 the showH*L-1 it's fantastic.
9. (65.52) Because Mike phonedH*L-L%4 his mother is relieved. Because Mike phoned his motherH*L-L%4 she's relieved. Because MikeL+H* phonedL-1 his mother is relieved. Because MikeL+H* phoned his motherL-1 she's relieved.
10. (44.44) When the clock strikesH*L-L%4 the hour is midnight. When the clock strikes the hourH*L-L%4 it's midnight. When the clockL+H* strikesH*L-1 the hourL-1 is midnight. When the clockL+H* strikesH*L-1 the hourL-1 it's midnight.
11. (20.01) If Joe startsH*L-L%4 the meeting is boring. If Joe starts the meetingH*L-L%4 it's boring. If JoeL+H* startsL-1 the meeting is boring. If JoeL+H* starts the meetingL-1 it's boring.
12. (89.65) If Josh buysH*L-L%4 the beer is cheap. If Josh buys the beerH*L-L%4 it's cheap. If JoshL+H* buysL-1 the beer is cheap. If JoshL+H* buys the beerL-1 it's cheap.
13. (31.03) Whenever the guard checksH*L-L%4 the door is locked. Whenever the guard checks the doorH*L-L%4 it's locked. Whenever the guardL+H* checksL-1 the door is locked. Whenever the guardL+H* checks the doorL-1 it's locked.
14. (100) If Laura is foldingH*L-L%4 the towels are neat. If Laura is folding the towelsH*L-L%4 they're neat. If LauraL+H* is foldingL-1 the towels are neat. If LauraL+H* is folding the towelsL-1 they're neat.
15. (86.21) If George is programmingH*L-L%4 the computer is sure to crash. If George is programming the computerH*L-L%4 it's sure to crash. If GeorgeL+H* is programmingL-1 the computer is sure to crash. If GeorgeL+H* is programming the computerL-1 it's sure to crash.
16. (-10.34) If Charles is babysittingH*L-L%4 the children are happy. If Charles is babysitting the childrenH*L-L%4 they're happy. If CharlesL+H* is babysittingL-1 the children are happy. If CharlesL+H* is babysitting the childrenL-1 they're happy.
17. (37.94) When the maid cleansH*L-L%4 the rooms are immaculate. When the maid cleans the roomsH*L-L%4 they're immaculate. When the maidL+H* cleansL-1 the rooms are immaculate. When the maidL+H* cleans the roomsL-1 they're immaculate.
18. (51.72) Before Jack dealsH*L-L%4 the cards are shuffled. Before Jack deals the cardsH*L-L%4 they're shuffled. Before JackL+H* dealsL-1 the cards are shuffled. Before JackL+H* deals the cardsL-1 they're shuffled.

Duration controls. Short and long pause durations (ms) are shown in parentheses at the location where they occurred.

1. After Bob (0/658) ordered those tires he got a flat.
2. Russian caviar (2/703) is eaten cold.
3. When John was finally (3/1101) persuaded by the argument he sighed.
4. Later the boyscout helped the (0/379) old man onto the bus.
5. The nightguard always (22/791) watches the monitor.
6. As soon as she finds those (41/693) flowers she'll cry.
7. When there is a (77/550) protest on campus it's huge.
8. When her date (84/1014) kissed her on the cheek she smiled.
9. The soldiers don't desert (0/320) because they're very loyal.
10. The coffee is too (55/597) strong when mother brews it.
11. Because it's warm (8/851) Jan didn't bring a coat.
12. When the professor asked (0/333) no one knew the answer.
13. Oranges are more (11/699) expensive than apples this winter.
14. Downhill skiers often fall and are (6/435) injured.
15. The congressman voted against this (0/525) controversial bill.
16. Triangles have three (0/251) lines and three angles.
17. The Irish (27/275) setter puppy wagged its tail when it saw us.
18. The young (62/562) Californian lost his surfboard at the beach.
19. The cards the (31/562) psychic reads are ominous.
20. The softball pitch by Beth (14/532) was really fast.
Experiments 3 and 4

Tones shown in parentheses occurred in one of the two conditions, and tones not in parentheses occurred in both conditions. The associated visual targets for early and late closure syntax completions of the fragments are also shown.
Experiment 3

Auditory fragments (cooperating/conflicting pronunciation; baseline pronunciation) and visual targets:

1. (-51.72) Because John studied (H*L-L%4) the material (H*L-L%4); Because JohnL+H* studiedL-1 the materialH*L-1. Visual target: is/it's.
2. (17.24) When Whitesnake plays (H*L-L%4) the music (H*L-L%4); When WhitesnakeL+H* plays (L-1) the music (L-1). Visual target: is/it's.
3. (100) When Tim is presenting (H*L-L%4) the lectures (H*L-L%4); When TimL+H* is presenting (L-1) the lectures (L-1). Visual target: are/they're.
4. (-31.03) When the original cast performs (H*L-L%4) the plays (H*L-L%4); When the original castL+H* performs (L-1) the plays (L-1). Visual target: are/they're.
5. (-55.17) When Madonna sings (H*L-L%4) the song (H*L-L%4); When MadonnaL+H* sings (L-1) the song (L-1). Visual target: is/it's.
6. (-86.21) Whenever John swims (H*L-L%4) the channel (H*L-L%4); Whenever JohnL+H* swimsH*L-1 the channelH*L-1. Visual target: is/it's.
7. (6.91) When Roger leaves (H*L-L%4) the house (H*L-L%4); When RogerL+H* leaves (L-1) the house (L-1). Visual target: is/it's.
8. (34.48) Whenever Frank performs (H*L-L%4) the show (H*L-L%4); Whenever FrankL+H* performsL-1 the showH*L-1. Visual target: is/it's.
9. (65.52) Because Mike phoned (H*L-L%4) his mother (H*L-L%4); Because MikeL+H* phoned (L-1) his mother (L-1). Visual target: is/she's.
10. (44.44) When the clock strikes (H*L-L%4) the hour (H*L-L%4); When the clockL+H* strikesH*L-1 the hourL-1. Visual target: is/it's.
11. (20.01) If Joe starts (H*L-L%4) the meeting (H*L-L%4); If JoeL+H* starts (L-1) the meeting (L-1). Visual target: is/it's.
12. (89.65) If Josh buys (H*L-L%4) the beer (H*L-L%4); If JoshL+H* buys (L-1) the beer (L-1). Visual target: is/it's.
13. (31.03) Whenever the guard checks (H*L-L%4) the door (H*L-L%4); Whenever the guardL+H* checks (L-1) the door (L-1). Visual target: is/it's.
14. (100) If Laura is folding (H*L-L%4) the towels (H*L-L%4); If LauraL+H* is folding (L-1) the towels (L-1). Visual target: are/they're.
15. (86.21) If George is programming (H*L-L%4) the computer (H*L-L%4); If George is programming (L-1) the computer (L-1). Visual target: is/it's.
16. (-10.34) If Charles is baby-sitting (H*L-L%4) the children (H*L-L%4); If CharlesL+H* is baby-sitting (L-1) the children (L-1). Visual target: are/they're.
17. (37.94) When the maid cleans (H*L-L%4) the rooms (H*L-L%4); When the maidL+H* cleans (L-1) the rooms (L-1). Visual target: are/they're.
18. (51.72) Before Jack deals (H*L-L%4) the cards (H*L-L%4); Before JackL+H* deals (L-1) the cards (L-1). Visual target: are/they're.

Experiment 4
Auditory fragments (cooperating/conflicting pronunciation; baseline pronunciation) and visual words:

1. (-55.17) When Madonna sings (H*H-2) the song (H*H-2); When MadonnaL+H* sings (L-1) the song (L-1). Visual word: is/it's.
2. (-86.21) Whenever John swims (H*H-2) the channel (H*H-2); Whenever JohnL+H* swims (L-1) the channel (L-1). Visual word: is/it's.
3. (6.91) When Roger leaves (H*H-2) the house (H*H-2); When RogerL+H* leaves (L-1) the house (L-1). Visual word: is/it's.
4. (34.48) Whenever Frank performs (H*H-3) the show (H*H-3); Whenever FrankL+H* performs (L-1) the show (L-1). Visual word: is/it's.
5. (44.44) When the clock strikes (H*H-2) the hour (H*H-2); When the clockL+H* strikes (L-1) the hour (L-1). Visual word: is/it's.
6. (100) If Laura is folding (H*H-2) the towels (H*H-2); If LauraL+H* is folding (L-1) the towels (L-1). Visual word: are/they're.
7. (-10.34) If Charles is baby-sitting (H*H-3) the children (H*H-3); If CharlesL+H* is baby-sitting (L-1) the children (L-1). Visual word: are/they're.
8. (37.94) When the maid cleans (H*H-2) the rooms (H*H-2); When the maidL+H* cleans (L-1) the rooms (L-1). Visual word: are/they're.
9. (51.72) Before Jack deals (H*H-3) the cards (H*H-3); Before JackL+H* deals (L-1) the cards (L-1). Visual word: are/they're.
10. (79.31) Because Wapner is judging (H*H-2) the trial (H*H-2); Because WapnerL+H* is judging (L-1) the trial (L-1). Visual word: is/it's.
11. (44.83) When Suzie visits (H*H-2) her grandpa (H*H-2); When SuzieL+H* visits (L-1) her grandpa (L-1). Visual word: is/he's.
12. (55.16) When Gino delivers (H*H-2) the pizza (H*H-2); When GinoL+H* delivers (L-1) the pizza (L-1). Visual word: is/it's.
13. (58.62) After Jane dusts (H*H-3) the furniture (H*H-3); After JaneL+H* dusts (L-1) the furniture (L-1). Visual word: is/it's.
14. (34.48) Because Victor is playing (H*H-3) the music (H*H-3); Because VictorL+H* is playing (L-1) the music (L-1). Visual word: is/it's.
15. (-10.35) When a man cheats (H*H-3) his friends (H*H-3); When a manL+H* cheats (L-1) his friends (L-1). Visual word: are/they're.
16. (0.02) When the guerrillas fight (H*H-2) the battle (H*H-2); When the guerrillasL+H* fight (L-1) the battle (L-1). Visual word: is/it's.
17. (37.93) If Ian doesn't notice (H*H-2) Beth (H*H-2); If IanL+H* doesn't notice (L-1) Beth (L-1). Visual word: will/she'll.
18. (-44.83) If the baby surrenders (H*H-2) the bottle (H*H-2); If the babyL+H* surrenders (L-1) the bottle (L-1). Visual word: is/it's.
REFERENCES

Abney, S. (1990). Parsing by chunks. In C. Tenny (Ed.), The MIT parsing volume, 1988–1989. Center for Cognitive Science, MIT.
Albritton, D. W., McKoon, G., & Ratcliff, R. (1996). Reliability of prosodic cues for resolving syntactic ambiguity. Journal of Experimental Psychology: Learning, Memory, & Cognition, 22, 714–735.
Bakeman, R., & McArthur, D. (1996). Picturing repeated measures: Comments on Loftus, Morrison, and others. Behavior Research Methods, Instruments, and Computers, 28, 584–589.
Beach, C. (1991). The interpretation of prosodic patterns at points of syntactic structure ambiguity: Evidence for cue trading relations. Journal of Memory and Language, 30, 627–643.
Beckman, M. (1996). The parsing of prosody. Language and Cognitive Processes, 11, 17–67.
Beckman, M., & Ayers, G. M. (1994). Guidelines for ToBI labeling, ver. 2.0. Unpublished manuscript, Ohio State Univ. [materials available by writing to [email protected]]
Beckman, M., & Pierrehumbert, J. (1986). Intonational structure in Japanese and English. Phonology Yearbook, 3, 266–309.
Carlson, G. N., & Tanenhaus, M. K. (1988). Thematic roles and language comprehension. In W. Wilkins (Ed.), Syntax and semantics: Thematic relations, 21. New York: Academic Press.
Carroll, P. J., & Slowiaczek, M. L. (1987). Models and modules: Multiple pathways to the language processor. In J. Garfield (Ed.), Modularity in knowledge representation and natural language understanding (pp. 221–247). New York: Academic Press.
Ferreira, F. (1993). Creation of prosody during sentence production. Psychological Review, 100, 233–253.
Ferreira, F. (1991). Effects of length and syntactic complexity on initiation times for prepared utterances. Journal of Memory and Language, 30, 210–233.
Ferreira, F., & Henderson, J. (1989). The use of verb information in syntactic parsing: Evidence from eye movements and word-by-word self-paced reading. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 555–568.
Fodor, J., & Bever, T. (1965). The psychological reality of linguistic segments. Journal of Verbal Learning and Verbal Behavior, 4, 414–420.
Francis, W. N., & Kucera, H. (1982). Frequency analysis of English usage. Boston: Houghton Mifflin.
Frazier, L. (1987a). Sentence processing: A tutorial review. In M. Coltheart (Ed.), Attention and performance XII (pp. 559–586). Hillsdale, NJ: Erlbaum.
Frazier, L. (1987b). Structure in auditory word recognition. In L. K. Tyler and U. Frauenfelder (Eds.), Spoken word recognition (pp. 559–586). Hillsdale, NJ: Erlbaum.
Frazier, L., & Clifton, C. E., Jr. (1996). Construal. Cambridge, MA: MIT Press.
Frazier, L., & Rayner, K. (1982). Making and correcting errors during sentence comprehension: Eye movements in the analysis of structurally ambiguous sentences. Cognitive Psychology, 14, 178–210.
Garrett, M., Bever, T., & Fodor, J. (1966). The active use of grammar in speech perception. Perception and Psychophysics, 1, 30–32.
Gordon, P. (1988). Induction of rate-dependent processing by coarse-grained aspects of speech. Perception and Psychophysics, 43, 137–146.
Grabe, E., Warren, P., & Nolan, F. (1998). Prosodic disambiguation of coordination structures. Unpublished manuscript.
Hayes, B. (1985). A metrical theory of stress rules. New York: Garland.
Keller, E. (1994). Signalyze, v. 3.14. Charlestown, MA: InfoSignal.
Kjelgaard, M. M. (1995). The role of prosodic structure in the resolution of phrase-level and lexical syntactic ambiguity. [unpublished doctoral dissertation, Boston, MA: Northeastern Univ.]
Ladd, D. R. (1980). The structure of intonational meaning: Evidence from English. Bloomington, IN: Indiana Univ. Press.
Ladd, D. R. (1986). Intonational phrasing: The case for recursive prosodic structure. Phonology Yearbook, 3, 311–340.
Lehiste, I. (1973). Phonetic disambiguation of syntactic ambiguity. Glossa, 7, 102–122.
Liberman, A. M., Cooper, F. S., Shankweiler, D. P., & Studdert-Kennedy, M. (1967). Perception of the speech code. Psychological Review, 74, 431–461.
Liberman, M., & Pierrehumbert, J. (1984). Intonational invariance under changes in pitch range and length. In M. Aronoff & R. J. Oehrle (Eds.), Language sound structure (pp. 157–233).
Liberman, M., & Prince, A. (1977). On stress and linguistic rhythm. Linguistic Inquiry, 8, 249–336.
MacDonald, M. C., Perlmutter, N. J., & Seidenberg, M. S. (1995). The lexical nature of syntactic ambiguity resolution. Psychological Review, 101, 676–703.
Marcus, M., & Hindle, D. (1990). Description theory and intonation boundaries. In G. Altmann (Ed.), Cognitive models of speech processing: Computational and psycholinguistic perspectives. Cambridge, MA: MIT Press.
Marslen-Wilson, W. D., Tyler, L. K., Warren, P., Grenier, P., & Lee, C. S. (1992). Prosodic effects in minimal attachment. Quarterly Journal of Experimental Psychology, 45A, 73–87.
Mazuka, R., & Lewis, M. A. (1997, March). An interaction between prosody and syntactic structure in attachment ambiguity resolution. Poster presented at the annual meeting of the CUNY Conference on Sentence Processing, Santa Monica, CA.
Miller, J. L. (1990). Speech perception. In D. N. Osherson & H. Lasnik (Eds.), Language: An invitation to cognitive science (pp. 69–93). Cambridge, MA: MIT Press.
Murray, W., & Watt, S. (1995, March). Prosodic form and parsing commitments. Paper presented at the annual CUNY Sentence Processing Conference, Los Angeles, CA.
Murray, W. S., Watt, S., & Kennedy, A. (1998). Parsing ambiguities: Modality, processing options and the garden path. [unpublished manuscript]
Nagel, H. N., Shapiro, L., & Nawy, R. (1994). Prosody and processing filler-gap sentences. Journal of Psycholinguistic Research, 23, 473–485.
Nagel, H. N., Shapiro, L., Tuller, B., & Nawy, R. (1995). Prosodic influences on the processing of attachment ambiguities. Journal of Psycholinguistic Research, 24.
Neath, I. (1994). Macintosh psychology laboratory, ver. 2.6. [unpublished manual, Purdue Univ., West Lafayette, IN]
Nespor, M. A., & Vogel, I. (1986). Prosodic phonology. Boston: Kluwer.
Nicol, J. L., & Pickering, M. J. (1993). Processing syntactically ambiguous sentences: Evidence from semantic priming. Journal of Psycholinguistic Research, 22, 207–237.
Perlmutter, N., & MacDonald, M. C. (1992). Plausibility and syntactic ambiguity resolution. In Proceedings of the Fourteenth Annual Conference of the Cognitive Science Society. Hillsdale, NJ: Erlbaum.
Pierrehumbert, J. B. (1980). The phonology and phonetics of English intonation. [unpublished doctoral dissertation, Cambridge, MA: MIT]
Pierrehumbert, J. B., & Beckman, M. E. (1988). Japanese tone structure. Cambridge, MA: MIT Press.
Price, P., Ostendorf, M., Shattuck-Hufnagel, S., & Fong, C. (1991). The use of prosody in syntactic disambiguation. Journal of the Acoustical Society of America, 90, 2956–2970.
Pritchett, B. L. (1988). Garden path phenomena and the grammatical basis of language processing. Language, 64, 539–576.
Pynte, J., & Prieur, B. (1996). Prosodic breaks and attachment decisions in sentence processing. Language and Cognitive Processes, 11, 165–192.
Schafer, A. (1997). Prosodic parsing: The role of prosody in sentence comprehension. [unpublished doctoral dissertation, Amherst, MA: Univ. of MA]
Selkirk, E. O. (1984). Phonology and syntax: The relation between sound and structure. Cambridge, MA: MIT Press.
Selkirk, E. O. (1986). On derived domains in sentence phonology. Phonology Yearbook, 3, 371–405.
Selkirk, E. O. (1995). Sentence prosody: Intonation, stress and phrasing. In Handbook of phonological theory. Oxford: Blackwell Sci.
Sevald, C., & Trueswell, J. C. (1997, March). Speakers cooperate with listeners: Prosody to sidestep the garden path. Poster presented at the annual meeting of the CUNY Conference on Sentence Processing, Santa Monica, CA.
Shattuck-Hufnagel, S., & Turk, A. (1996). Prosodic cues: An introduction. Journal of Psycholinguistic Research, 25.
Silverman, K., Beckman, M., Pitrelli, J., Ostendorf, M., Wightman, C., Price, P., Pierrehumbert, J., & Hirschberg, J. (1992). ToBI: A standard for labeling English prosody. Proceedings of the International Conference on Spoken Language Processing (ICSLP) (Vol. 2, pp. 867–870).
Slowiaczek, M. (1981). Prosodic units as processing units. [unpublished doctoral dissertation, Amherst, MA: Univ. of MA]
Speer, S. R., Crowder, R. G., & Thomas, L. (1993). Prosodic structure and sentence recognition. Journal of Memory and Language, 32, 336–358.
Speer, S. R., & Dobroth, K. (1997). Junctural and temporal aspects of prosody: Ambiguity resolution in auditory and visual sentence processing. [unpublished manuscript]
Speer, S. R., Kjelgaard, M. M., & Dobroth, K. M. (1996). The influence of prosodic structure on the resolution of temporary syntactic closure ambiguities. Journal of Psycholinguistic Research, 25, 247–268.
Speer, S. R., Shih, C.-L., & Slowiaczek, M. L. (1989). Prosodic structure in language comprehension: Evidence from tone sandhi in Mandarin. Language and Speech, 32, 337–354.
Steedman, M. (1990). Syntax and intonational structure in a combinatory grammar. In G. Altmann (Ed.), Cognitive models of speech processing: Computational and psycholinguistic perspectives. Cambridge, MA: MIT Press.
Steedman, M. (1991). Structure and intonation. Language, 67, 260–296.
Stirling, L., & Wales, R. (1996). Does prosody support or direct sentence processing? Language and Cognitive Processes, 11, 193–212.
Streeter, L. (1978). Acoustic determinants of phrase boundary perception. Journal of the Acoustical Society of America, 64, 1582–1592.
Tanenhaus, M. K., & Carlson, G. (1989). Lexical structure and language comprehension. In W. M. Marslen-Wilson (Ed.), Lexical representation and process. Cambridge, MA: MIT Press.
Wales, R., & Toner, H. (1979). Intonation and ambiguity. In W. E. Cooper and E. C. T. Walker (Eds.), Sentence processing: Psycholinguistic studies presented to Merrill Garrett. Hillsdale, NJ: Erlbaum.
Warner, J., & Glass, A. L. (1987). Context and distance-to-disambiguation effects in ambiguity resolution: Evidence from grammaticality judgments of garden path sentences. Journal of Memory and Language, 27, 597–632.
Warren, P. (1985). The temporal organization and perception of speech. [unpublished doctoral dissertation, Cambridge, UK: Univ. of Cambridge]
Warren, P., Grabe, E., & Nolan, F. (1995). Prosody, phonology, and closure ambiguities. Language and Cognitive Processes, 10, 457–486.
Watt, S., & Murray, W. (1996). Prosodic form and parsing commitments. Journal of Psycholinguistic Research, 25.
Winer, B. J. (1971). Statistical principles in experimental design. New York: McGraw–Hill.
Wingfield, A., & Klein (1971). Syntactic structure and acoustic pattern in speech perception. Perception and Psychophysics, 9, 23–25.

(Received August 25, 1997)
(Revision received October 15, 1998)