Available online at www.sciencedirect.com
Lingua 122 (2012) 1380--1394 www.elsevier.com/locate/lingua
The Persian pitch accent and its retention after the focus
Vahideh Abolhasanizadeh a,b, Mahmood Bijankhan c, Carlos Gussenhoven b,d,*
a Department of English Language and Literature, Shahid Bahonar University of Kerman, Iran
b Department of Linguistics, Radboud University Nijmegen, The Netherlands
c Department of Linguistics, University of Tehran, Iran
d School of Languages, Linguistics and Film, Queen Mary University of London, UK
Received 3 February 2012; received in revised form 4 June 2012; accepted 5 June 2012
Available online 6 July 2012
Abstract Persian words have prominence on the last syllable. Right-edge clitics fall outside this word domain, and segmentally identical words and word-plus-clitic combinations therefore contrast for the location of the prominence. Two experiments were conducted to answer two questions. A production experiment addressed the question whether any phonetic cues other than f0 signal this prominence contrast. We found small phonetic differences between members of minimal pairs outside the more evident f0 differences, but attribute these to side effects of pitch accent placement. The second question was whether post-focal words undergo deaccentuation, as evidenced by neutralization of the contrast between post-focal words and word-plus-clitic combinations. Both the production experiment and a perception experiment showed that there is Post Focus Compression, since pitch excursions in the post-focal speech were considerably reduced, both in interrogative and in declarative utterances, as compared to other positions in the sentence. However, no neutralization occurred. We tentatively conclude that Persian word prominences are pitch accents and that words are not deaccented when the pitch range is reduced after the focus. © 2012 Elsevier B.V. All rights reserved. Keywords: Clitic group; Phonological word; Prosodic hierarchy; Focus; Pitch range
* Corresponding author at: Afdeling Taalwetenschap, Radboud Universiteit Nijmegen, Postbus 9103, 6500 HD Nijmegen, The Netherlands. Tel.: +31 0243612839/237240; fax: +31 0627205464. E-mail address: [email protected] (C. Gussenhoven).
0024-3841/$ -- see front matter © 2012 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.lingua.2012.06.002
1. Introduction
Persian sentence prosody has been described as involving accentual phrases which have a single intonational pitch accent on a stressed syllable (Mahjani, 2003). After the focus constituent, deaccentuation has been claimed to occur (Sadat Tehrani, 2007). In this contribution, we address two issues in the word and sentence prosody of Persian. The first is the phonological and phonetic status of the Persian word prominence. The question here is whether the prominence is typologically like West Germanic or Catalan stress, with multiple phonetic parameters conspiring to create it, or a pitch accent that is signaled only through fundamental frequency (f0). Second, we are interested in knowing whether the word prominence disappears after the focus constituent, to the extent that minimal ‘stress’ pairs become homophonous.
1.1. Persian ‘stress’
Persian word prominence has generally been described as the assignment of stress to the final syllables of nouns, adjectives, most adverbs and unprefixed verbs (Ferguson, 1957; Lazard, 1957; Same’i, 1996). Prefixed verbs take stress
on the prefix. Kahnemuyipour (2003) argued that the uniformity in stress placement in nouns and its variability in verbs follows from a morphological difference between these word types and the resulting difference in the way they map onto prosodic structures. Specifically, prefixes are separate phonological words in his analysis, and a phrase-level stress rule puts the stress on the final syllable of the initial phonological word in a phonological phrase. While the assignment of stress thus follows transparently from the morphological (or prosodic) structure, the issue addressed here is the interpretation of the term ‘stress’ in these and other descriptions of Persian word prosody. In general, word-level prominent syllables have variably been characterized as having ‘accent’ or ‘stress’. The distinction between these has been sought in the extent to which the prominent syllable is phonetically cued by pitch features alone or, alternatively, whether other phonetic cues like duration and spectral properties are consistently present. Beckman (1986) termed these prominences ‘non-stress accent’ and ‘stress accent’, respectively. A different perspective is obtained when distributional and phonological properties are taken into account. Stress has been characterized as being obligatory, meaning that, not counting function words that cannot be citation utterances, every word has a stressed syllable, and culminative, meaning that there is one most prominent syllable in the word (Hyman, 2006). By contrast, an accented syllable need not be present on every word, allowing the existence of unaccented words, either due to deaccentuation of lexically accented words or to the fact that words can be lexically unaccented and not acquire accents in the sentence prosody. Hyman (2006) discusses cases of obligatoriness and culminativity where there is no evidence of phonetic stress, i.e. when the prominence is not a ‘stress accent’, in Beckman's (1986) sense. 
Some of these are ruled out as stress systems on the basis of the location of the accent. If that is a mora, as in Somali, the prominence is not stress, since stress is a property of syllables (Hayes, 1995; Hyman, 2006). Nubi represents a case of a culminative and obligatory system where the prominent element is the syllable and prominent syllables are not systematically differentiated by durational or spectral properties from non-prominent syllables (Hyman, 2006; Gussenhoven, 2006). The historical explanation here is that Nubi is a creolized form of Arabic in which the Arabic stress locations have been interpreted as H-toned syllables by speakers of East African tone languages (Wellens, 2005). However, there are likely to be more cases of phonological stress that is signaled not by phonetic stress, but by f0 only. Levi (2005) presents phonetic data on Turkish which make her conclude that this language has a pitch accent, not (phonetic) stress. In line with the recent emphasis on language diversity, we present evidence that the word prominence of Persian is both obligatory and culminative in the sense of Hyman (2006), while also being a ‘non-stress accent’ in the sense of Beckman (1986). In current usage, it will be argued to be ‘pitch accent’, a concept and term that was introduced by Bolinger (1958) in reference to the tonal component in accented syllables in English. In autosegmental phonology, it is the term for any tonal melody that is associated with an accented syllable, whether that syllable is stressed, as it is in English (Bolinger, 1958; Pierrehumbert, 1980) and Jordanian Arabic (De Jong and Zawaydeh, 1999), or lexically determined, as it is in Japanese (Pierrehumbert and Beckman, 1988; Kubozono, 1993) and which is not analyzable as a boundary tone.1 This point we will try to make on the basis of Experiment I.
1.2. Post-Focus Compression
The second issue addressed by our investigation concerns the phonological status of Post-Focus Compression (PFC) in Persian. Xu et al. (2012) suggest that the reduction of the pitch range after the focus constituent, as found for instance in Germanic languages, may in fact be an areal feature covering Europe as well as a northern and central swathe of Asia. Thus, Beijing Mandarin is a PFC language (as are Japanese, Bengali and Mongolian), but Taiwanese and Taiwanese Mandarin are not. We will show that Persian is a PFC language, in line with Xu et al.’s hypothesis. The question at issue in our investigation, however, one that is not considered by Xu et al., is whether PFC involves the removal of the tonal structure in the post-focal words. This we believe is the case in English. While the noun phrase a Spànish téacher is distinct from the compound a Spánish teacher in isolation, in a sentence like I’ve already HEARD that story about the Spanish teacher, it is no longer possible to tell which structure is used, because after focal heard no pitch accents occur,
1 We use the term ‘pitch accent’ in this meaning only. In particular, we do not mean to refer to any distributional or other criterion that might be assumed to allow a meaningful classification of a ‘pitch accent language’ (Hyman, 2009). In the meaning we use the term, that of tones that are systematically present in some syllable or mora and which cannot be analyzed as boundary tones, English, Japanese, Jordanian Arabic, Nubi, Turkish and Somali all have pitch accents. While making clear which meaning we intend, we use the term ‘stress’ both in the sense of ‘phonetic stress’, i.e. phonetically enhanced duration and spectral measures as occurring in, e.g. English, and in the sense of culminative obligatory word prominence as occurring in English, Nubi, Turkish and, as we will argue, Persian. An issue that is not always given the credit it deserves is whether an accentual analysis is to be preferred over an analysis with underlyingly linked tones, which will depend on the existence of generalizations about the location of the word prominence that abstract away from the tones that are found there (Goldsmith, 1975; Gussenhoven, 2004:37). As Hyman (2006) stresses, a tonal analysis can in principle always replace a word prosodic accentual analysis, but a tonal analysis can be cumbersome when there are many generalizations about their permitted locations and the pitch accent consists of more than one tone, as in Japanese, or when there are more options for the pitch accent, as in Barasana (Gomez-Imbert and Kisseberth, 2000).
leaving Spanish and teacher with unaccented stressed syllables in both sentences (James Sledd, cited from Hill [1962:36] by Schmerling, 1976:27). By contrast, in Beijing Mandarin, PFC reduces the pitch range without deleting the tones of the words. Tonal minimal pairs like ma1 ‘mother’ and ma3 ‘horse’ thus remain phonologically distinct if they are used postfocally (Chen and Guion-Anderson, 2012). Similarly, Bizkaian Basque retains the distinction between accented and unaccented words under PFC (Elordieta and Hualde, 2003). If Persian has pitch accents without phonetic stress, the issue arises whether these accents are retained under PFC in a phonetically reduced form or whether instead they are deleted. If they are deleted, contrasts that rely on a difference in the location of the pitch accent will be neutralized. Our Experiment II was run to address this issue from a perceptual viewpoint. The results converge so as to allow the conclusion that Persian does not deaccent after the focus, but retains phonetically reduced pitch accents in post-focal speech that allow accentual minimal pairs to be disambiguated to a certain extent. 1.3. Intonation Persian has been described as having three levels of prosodic hierarchy that are relevant to the intonational structure, the accentual phrase, the intermediate phrase and the intonational phrase (Mahjani, 2003; Sadat Tehrani, 2007:36). The word-final syllable has been claimed to be associated with a pitch accent (Eslami and Bijankhan, 2002), but there are conflicting analyses of its tonal structure. Eslami (2000) posits four pitch accents, H*, L*, L*+H and L+H*, in addition to two tones marking intermediate phrases, L- and H-, as well as two boundary tones of the intonational phrase, L% and H%. The meanings of the tonal morphemes given by Eslami (2000), inspired by Hirschberg and Pierrehumbert (1986) and Pierrehumbert and Hirschberg (1990), are reproduced in (1). (1)
H*      new information
L*      given information
L+H*    contrast
L*+H    doubt
H-      incompleteness
L-      completeness
L%      statement
H%      question
In contrast to (1), Sadat Tehrani (2007) posits a single pitch accent, L+H*, which has two morpheme alternants, L+H* in polysyllabic accentual phrases and H* in monosyllabic ones. Another claim by Sadat Tehrani (2007) is that post-focal words are deaccented, while any internal boundary tones are deleted after the focus. We will evaluate some of the claims in the literature in section 5.
1.4. The clitic group
Our investigation relies on a contrast between plain words and cliticized words. Combinations of words and clitics have been described as ‘clitic groups’. The exclusion of right-edge clitics from stress or accent assignment was noted by Lazard (1957:48) and Shaqaqi (1993:46). Bijankhan and Nourbakhsh (2009) make ‘stress’ the main defining feature of the phonological word, pointing out that since clitics remain unstressed, they must lie outside the domain of the phonological word. As an alternative, stress assignment can be described as being morphologically determined. This would mean that a pitch accent is assigned to the last syllable of lexical category words, and that cliticized words and non-cliticized words are both phonological words.2 Because the surface segmental structures of words and word-clitic combinations are not systematically different, many examples of minimal pairs can be given, like gol ‘flower’, which gives [góli] ‘one flower’, with a clitic [i], and [golí] ‘proper name’, which has a suffix. We illustrate the systematic nature of accent assignment in (2a, b, c, d), where (2a) provides two isolated words, (2b) two suffixed words, (2c) two words with a clitic, and (2d) a compound. As these data show, words and suffixed words have final accented syllables, compounds fail to have an accent on their first constituent, while clitics are not assigned accent, causing the accent in cliticized words to be on the final syllable of the host. This latter generalization remains true if a word has two clitics, as in [ketɒ́b-i-je] ‘of one book’.
For convenience, we will refer to word+clitic combinations as ‘clitic groups’, without committing ourselves to the inclusion of this constituent in the prosodic hierarchy of Persian.
2 The fact that the assignment of a pitch accent to final syllables of words skips right-edge clitics does not form the sole motivation for assuming the existence of a clitic group for Bijankhan and Nourbakhsh (2009). A second motivation is provided by syncope, the deletion of a word-final vowel before a clitic-initial vowel.
(2)
a.  ketɒ́b      ‘book’        xuné       ‘house’
b.  ketɒb-hɒ́   ‘books’       xune-hɒ́    ‘houses’
c.  ketɒ́b-i    ‘one book’    xuné-j-i   ‘one house’
d.  ketɒbxuné  ‘library’
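The accent generalization illustrated in (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' formalization: the function names, the `word_syllable_count` parameter, the ASCII stand-ins for the transcriptions, and the apostrophe used to mark the accented syllable are all hypothetical choices made for this sketch.

```python
def accented_syllable(syllables, word_syllable_count):
    """Return the index of the accented syllable in a surface form.

    syllables: surface syllables of the whole form, clitics included.
    word_syllable_count: number of syllables in the lexical word (stem plus
        suffixes, or a whole compound) before any clitic is attached.
        This parameter is an assumption of the sketch.
    """
    if not 1 <= word_syllable_count <= len(syllables):
        raise ValueError("word_syllable_count must be between 1 and len(syllables)")
    # The accent goes to the final syllable of the word; clitics are skipped.
    return word_syllable_count - 1


def mark_accent(syllables, word_syllable_count):
    """Join syllables with dots, prefixing the accented one with an apostrophe."""
    accented = accented_syllable(syllables, word_syllable_count)
    return ".".join(
        "'" + syl if idx == accented else syl for idx, syl in enumerate(syllables)
    )


# ASCII stand-ins for the forms in (2); 'a' replaces the low back vowel.
print(mark_accent(["ke", "tab"], 2))              # plain word
print(mark_accent(["ke", "tab", "ha"], 3))        # suffixed word
print(mark_accent(["ke", "ta", "bi"], 2))         # word plus one clitic
print(mark_accent(["ke", "tab", "xu", "ne"], 4))  # compound
print(mark_accent(["ke", "ta", "bi", "je"], 2))   # word plus two clitics
```

Passing the syllable count of the word separately from the surface syllables keeps the sketch independent of resyllabification, since the final consonant of the host surfaces as the onset of the clitic syllable.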
1.5. Addressing the research questions We here report the results of two experiments. Experiment I was a production experiment, the first acoustic investigation of Persian word prominence, which was undertaken to answer two questions. First, we wanted to determine whether the word prominent syllable of Persian has phonetic stress in addition to being pitch accented. Second, we wanted to establish whether Persian has Post-Focus Compression in the words after the focus constituent. Experiment II was a perception experiment. It was undertaken to investigate the question whether PFC involves the neutralization of the difference between plain words and cliticized words. 2. Experiment I The aim of Experiment I was to collect detailed phonetic information about the realization of words and word+clitic combinations that is representative of the speech of Tehran, so as to enable us to establish the phonetic differences between them. Since the presence of a H-tone or a L-tone may be accompanied by small and partly systematic phonetic differences as compared to a toneless syllable (Beckman, 1986; Levi, 2005), we decided to place the investigation in a wider perspective. Specifically, we expected small and partly systematic phonetic differences that accompany other structural differences, like segmental distinctions or focus differences. We would like to be able to evaluate the status of any differences between our ‘stressed’ and ‘unstressed’ syllables either as side effects of other structural options, in this case the presence of a pitch accent, or as intrinsically due to differences in the location of phonological stress. For this purpose, in addition to the difference in the location of the word prominence (PW vs CG), we included a segmental difference in the intervocalic consonant separating the two potential accent positions ([p] vs. [b]), the focus condition of the target words, and sentence mode (declarative vs. interrogative). 
The phonetic measures that are potentially affected by these structural differences include f0, duration, intensity and spectral properties. All of these were included in our investigation.
2.1. Materials
We composed a corpus of sentences featuring two minimal pairs contrasting a noun (henceforth the ‘word’ or ‘PW’ condition) and a noun+clitic combination (henceforth the ‘clitic group’ or ‘CG’ condition). These two minimal pairs contrasted only in the voicing of the obstruent in the onset of the second syllable, which in the CG was the last consonant of the lexical word. These materials were part of a larger corpus testing more conditions. Since no obvious quadruplets were available in the segmental condition we report here, one of the four target words was a nonsense word. The target words were [tɒbéʃ] ‘light’ vs. [tɒ́b-eʃ] ‘swing+his/her’ and [tɒpéʃ] ‘(nonsense word)’ vs. [tɒ́p-eʃ] ‘tank-top+his/her’. They were embedded in declarative and interrogative carrier sentences which varied across three focus conditions, referred to as neutral (3a), post-focal (3b) and focal (3c). In (3), we show the ‘voiced’ minimal pair in its declarative embedding sentences. The total number of sentences was thus 3 (focus conditions) × 2 (word structures) × 2 (voicing conditions) × 2 (sentence modes) = 24. For the neutral and post-focal carrier sentence we used Un X-e ‘That is X’, where -e is a clitic. This makes all target words part of trisyllabic ‘clitic groups’ that contrast in having the H* on the antepenultimate syllable (the CG condition) or on the penultimate syllable (the PW condition). By having an accentual phrase-final unaccented syllable in all cases, we abstract away from local phrase-finality effects on the duration and f0 of the two target syllables. Condition (3c) differs from (3a,b) in having un ‘that’ in final position, which allows X to be focused and X-e to be in first position in the sentence, the focus position.
(3)
a.  un tɒbéʃ-e                 un tɒ́b-eʃ-e
    that light-is              that swing-his/her-is
    ‘That is light’            ‘That is his/her swing’
b.  un tɒbéʃ-e                 un tɒ́b-eʃ-e
    ‘THAT is light’            ‘THAT is his/her swing’
c.  tɒbéʃ-e un                 tɒ́b-eʃ-e un
    ‘That is LIGHT’            ‘That is his/her SWING’
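The 3 × 2 × 2 × 2 design described in section 2.1 can be enumerated mechanically. The sketch below is an illustration only: the condition labels follow the text, but the function names, the dictionary keys and the ASCII placeholder `X` for the target word are assumptions made here.

```python
from itertools import product

FOCUS = ("neutral", "post-focal", "focal")
WORD_STRUCTURE = ("PW", "CG")
VOICING = ("voiced", "voiceless")
SENTENCE_MODE = ("declarative", "interrogative")


def build_design():
    """Return one dict per sentence type in the fully crossed design."""
    return [
        {"focus": f, "word_structure": w, "voicing": v, "sentence_mode": m}
        for f, w, v, m in product(FOCUS, WORD_STRUCTURE, VOICING, SENTENCE_MODE)
    ]


def carrier(condition, target="X"):
    """Build the carrier sentence frame for a condition.

    Focal targets are sentence-initial ('X-e un'); neutral and post-focal
    targets follow 'un' ('un X-e'). The final punctuation encodes sentence mode.
    """
    frame = f"{target}-e un" if condition["focus"] == "focal" else f"un {target}-e"
    mark = "?" if condition["sentence_mode"] == "interrogative" else "."
    return frame + mark


design = build_design()
print(len(design))  # number of distinct sentence types
```

Enumerating the design in this way makes the confound noted later in the paper visible: only the focal condition places the target word in sentence-initial position.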
The sentences were presented to subjects in standard Persian orthography, which uses Arabic letters. Conditions (3a) and (3b) were distinguished by having bold print for the target word in (3a) and bold print for un in (3b), reproduced here in the transcription. These twelve sentences were given twice, once with a question mark (؟) and once with a full stop (.) at the end, in order to elicit both declarative and interrogative intonation contours. Subjects read each sentence twice in a professional recording studio at the University of Tehran.
2.2. Speakers and recordings
Twelve speakers took part in the experiment, six male and six female. Their ages ranged from 26 to 37 and they were all educated native speakers recruited from students and staff in the Linguistics Department of the University of Tehran. Speakers were freely allowed to repeat themselves if they thought they hadn’t read a sentence correctly. The two best versions were selected from the utterances of each sentence by each speaker. In the majority of cases, these were the only readings produced for the sentence. After inspecting these 576 utterances, we decided to discard 31 of them because of disfluencies or technical problems, which left us with 545 utterances for analysis. We supplied the means over speakers for the 5.4% missing utterances.
2.3. Procedure
Utterances were segmented with the help of Praat (Boersma and Weenink, 1992--2009). Instead of establishing only the start of the closure duration and the end of the stop burst of plosives, the boundary between closure and burst was included as a segmental boundary, for both voiced and voiceless plosives. In the case of voiced plosives, this meant that we had burst intervals of zero duration in a number of cases. Initial plosives were only measured for their bursts, since no reliable indication of the beginning of the closure is available. An example of a TextGrid with wave form is shown in Fig. 1.
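The utterance counts reported in section 2.2 follow from the design, and the replacement of missing values by means over speakers can be sketched as below. The function names and the use of `None` to flag a discarded utterance are assumptions of this sketch; the paper does not describe how the substitution was implemented.

```python
from statistics import mean


def utterance_counts(speakers=12, sentences=24, repetitions=2, discarded=31):
    """Return (recorded, kept, percent_missing) for the recording sessions."""
    recorded = speakers * sentences * repetitions
    kept = recorded - discarded
    percent_missing = round(100 * discarded / recorded, 1)
    return recorded, kept, percent_missing


def fill_with_speaker_mean(values):
    """Replace missing values (None) in one design cell by the mean of the
    values observed for the other speakers in that cell."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to average")
    cell_mean = mean(observed)
    return [cell_mean if v is None else v for v in values]


print(utterance_counts())
print(fill_with_speaker_mean([100.0, None, 110.0]))
```

Because every substituted value sits exactly on the cell mean, this kind of imputation leaves the mean unchanged while shrinking the spread of the data.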
We included separate tiers for segments, words and clitic [e]. Subsequently, we averaged all values over the repetitions. Because of the way we supplied averaged values for the missing data, we have potentially reduced the variation. We adopted a 1% significance level for all analyses, but include results at 5%, which may be seen as trends.
3. Experiment I: results
We report the results for duration, intensity, spectral measures and f0. For duration, we first present the results of overall analyses of variance in which SEGMENT is included as a 7-level variable in order to identify interactions between segment durations and any of the four experimental variables. The same procedure is followed for intensity and the spectral formants (F1, F2 and F3) for the two vowels in the potentially accented syllables, as well as for Centre of Gravity (COG), with three levels for segment ([t]-burst, [p/b]-burst and [ʃ]). The COG is a measure indicating the mean spectral frequency over some time span. The measure is particularly useful for segments without well-defined formant structure, like those with voiceless friction (van Son and Pols, 1999).
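As a rough illustration of the Centre of Gravity measure, the sketch below computes a weighted mean frequency from a discrete spectrum. It assumes that the spectrum is already available as paired lists of frequencies and magnitudes and that magnitudes are weighted by their square (a power weighting); the function name and the `power` parameter are choices made for this sketch, and the paper does not state which settings were used.

```python
def centre_of_gravity(frequencies_hz, magnitudes, power=2.0):
    """Weighted mean frequency of a spectrum.

    Each frequency bin is weighted by |magnitude| ** power, so that
    power=2.0 weights by spectral power (an assumption of this sketch).
    """
    if len(frequencies_hz) != len(magnitudes):
        raise ValueError("frequencies_hz and magnitudes must have equal length")
    weights = [abs(m) ** power for m in magnitudes]
    total = sum(weights)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(f * w for f, w in zip(frequencies_hz, weights)) / total


# Two bins with equal energy: the COG lies midway between them.
print(centre_of_gravity([1000.0, 3000.0], [1.0, 1.0]))
# More energy in the upper bin pulls the COG upwards.
print(centre_of_gravity([1000.0, 3000.0], [1.0, 2.0]))
```

A burst or fricative with energy concentrated at higher frequencies thus yields a higher COG, which is why the measure suits segments that lack clear formants.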
Fig. 1. Praat TextGrid for a declarative neutral utterance of [un tɒbéʃ-e] ‘That is light’. Time axis in seconds (0 to 2.112).
Table 1
Effects of Voicing of labial plosive, Focus condition, Sentence mode and Word structure on durations of seven phonetic segments in the target words [tɒpéʃ-e], [tɒ́p-eʃ-e], [tɒbéʃ-e], [tɒ́b-eʃ-e].

Segment         Voicing (df 1,11)   Focus (df 2,22)   Sentence mode (df 1,11)   Word structure (df 1,11)
[t]-burst       ns                  F = 5.646 *       ns                        F = 20.446 **
[ɒ]             F = 188.34 **       ns                F = 8.71 *                ns
[p/b]-closure   F = 20.92 **        F = 4.12 *        ns                        F = 15.156 **
[p/b]-burst     F = 170.19 **       ns                F = 6.81 *                F = 6.13 *
[e]             F = 27.071 **       F = 15.491 **     ns                        F = 5.189 *
[ʃ]             ns                  F = 61.506 **     F = 14.872 **             F = 7.74 *
[e] (clitic)    ns                  F = 51.225 **     F = 117.3 **              ns

* p < 0.05. ** p < 0.01.
3.1. Duration
An analysis of variance (repeated measures) was performed on the durations of the segmented sections of the target words, with SEGMENT ([t]-burst, [ɒ], [p/b]-closure, [p/b]-burst, [e], [ʃ], clitic [e]), WORD STRUCTURE (PW vs CG), SENTENCE MODE (declarative vs interrogative), FOCUS (neutral, post-focal, focal) and VOICE (voiced vs voiceless) as factors. Mauchly's test for sphericity was significant only for SEGMENT; we adopted the Greenhouse-Geisser correction in all cases. There were interactions between SEGMENT and WORD STRUCTURE (F[6,66] = 6.755; p < 0.001), SEGMENT and FOCUS (F[12,132] = 72.543; p < 0.001), SEGMENT and SENTENCE MODE (F[6,66] = 100.667; p < 0.001) and SEGMENT and VOICING (F[6,66] = 56.165; p < 0.001) as well as main effects for SEGMENT (F[6,22] = 123.31; p < 0.001), FOCUS (F[2,22] = 51.01; p < 0.001) and SENTENCE MODE (F[1,11] = 35.76; p < 0.001). This means that, unsurprisingly, segments have unequal durations, but more importantly that some or all of our seven segment durations vary systematically with the word type of the target word, with the focus condition, with the sentence mode and with whether [p] or [b] occurs in the target words. To establish which segments vary under which condition we carried out repeated measures analyses of variance for each of the segmental durations separately. The results are presented in Table 1. Table 1 shows that the voicing of the labial closure affects the duration of the closure and the burst. The closure phase of [p] is 12 ms longer than that of [b], and the burst is 39 ms longer (see Fig. 2). The segment [p] is 105 ms, [b] 54 ms in total. The preceding vowel is 27 ms longer before [b] (149 ms) than before [p] (122 ms). This result follows widespread tendencies for voiceless plosives to be longer and preceding vowels to be shorter compared to the situation for voiced plosives (Luce and Charles-Luce, 1985; Kluender et al., 1988).
Unexpectedly, the effect of the voicing of the plosive was also found on the following vowel, [e], which is 11 ms longer after [b] (97 ms) than after [p] (84 ms). The effect of the focus condition is due to two quite different factors. Focus condition is partly confounded with position in the sentence, because the focal target words are sentence-initial rather than sentence-final, as in the other two conditions. The effects on [e], [ʃ] and the final clitic [e] are due to final lengthening in the neutral and post-focal conditions. A post hoc test (Sidak) shows that in all three cases, the focal condition differs from the other two conditions ( p < 0.01),
Fig. 2. Mean segment durations (ms) for the target words pooled over 12 speakers for voiced (---) and voiceless (- - -) labial plosives separately.
Fig. 3. Mean segment durations (ms) for the target words pooled over 12 speakers for neutral focus (---), post-focal (- - -) and focal ( ) pronunciations separately.
Fig. 4. Mean segment durations (ms) for the target words pooled over 12 speakers for declarative (---) and interrogative (- - -) sentences separately.
which however do not differ between themselves. In the focal condition, these three segments are respectively 8 ms, 10 ms and 16 ms shorter than in the other two conditions (see Fig. 3). The second explanatory factor is only present as a trend, and concerns the lesser articulatory care taken over the post-focal target words. However, the effects here are very small, as seen in Fig. 3, with [p/b] being 4 ms shorter and the [t]-burst 10 ms shorter in the post-focal condition than in the focal condition. Third, the effect of sentence mode is located in the final syllable, as indicated in Table 1 and depicted in Fig. 4. The onset [ʃ] is 7 ms longer and the final [e] is 95 ms longer in the interrogative condition than in the declarative condition. Increased final lengthening in questions would appear to be a general tendency (e.g. Smith, 2002), which has been phonologized in varieties of West Greenlandic (Rischel, 1974:79; see also Fortescue, 1984:4). By contrast, the effect on [ɒ] is a reduction of 8 ms in the interrogative condition. In fact, overall, non-final syllables tend to be longer in declaratives than in interrogatives, suggesting that the lengthening of the final syllable is heralded by an accelerando in the pre-final syllables.3 van Heuven and van Zanten (2005) in fact propose faster speech rate as a near-universal characteristic of questions.
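The Sidak post hoc comparisons reported above for the three focus conditions rest on a simple adjustment of the significance level for multiple comparisons. The sketch below shows that arithmetic only; the function names are hypothetical, and treating the three pairwise contrasts among focus conditions as one family at the 1% level is an assumption made here rather than a detail given in the paper.

```python
def sidak_alpha(family_alpha, n_comparisons):
    """Per-comparison significance level that keeps the family-wise
    error rate at family_alpha across n_comparisons independent tests."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return 1 - (1 - family_alpha) ** (1 / n_comparisons)


def sidak_adjust(p_value, n_comparisons):
    """Sidak-adjusted p-value for a single comparison within the family."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return 1 - (1 - p_value) ** n_comparisons


# Three focus conditions give three pairwise comparisons.
print(sidak_alpha(0.01, 3))
print(sidak_adjust(0.002, 3))
```

With three comparisons, each individual contrast has to reach a stricter threshold than the nominal 1% for the family as a whole to stay at that level.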
3 A pattern of shorter non-final syllables and a longer final syllable in interrogatives compared to declaratives was earlier reported by Stoel (2007) for the East Timorese language Fataluku.
Fig. 5. Mean segment durations (ms) for the target words pooled over 12 speakers for the CG (---) and PW (- - -) word structures separately.
Finally, is there evidence that the location of the accent is accompanied by inherent differences in duration of the syllable rime? The answer must be negative, even though we did find interpretable effects of word structure. In the CG-condition, in which [tɒ] has the pitch accent, the [t]-burst is 9 ms longer than in the PW-condition (see Fig. 5; the 6 ms longer [ɒ] just failed to reach significance (F = 4.735; p = 0.052)). Conversely, in the PW-condition, in which [be] has the accent, the labial closure is 7 ms and the [e] 6 ms longer than in the CG-condition. The following [ʃ] compensates partly for this lengthening by being 3 ms shorter in the PW-condition.
3.2. Spectral measures
Spectral measures have been used to detect differences in articulator shape or position. We report Centre of Gravity measurements and formant measurements. Centre of Gravity measures for the [t]-burst, [p/b]-burst and [ʃ] were subjected to a repeated measures analysis of variance with SEGMENT ([t]-burst, [p/b]-burst, [ʃ]), VOICE (voiced vs voiceless), FOCUS (neutral, post-focal, focal), SENTENCE MODE (declarative vs interrogative) and WORD STRUCTURE (PW vs CG) as factors. Apart from the obvious effect of SEGMENT, we found an interaction between FOCUS and SEGMENT (F[2,22] = 6.851; p < 0.01), which appeared to be due to a 330 Hz lower COG for [t]-burst in the focal condition. Since the focal condition has the target word in sentence-initial position, this effect must be due to the occurrence of [t] at the beginning of the utterance. The same procedure was followed for F1, F2, and F3, but with [ɒ] and [e] as the levels for SEGMENT. (We excluded the final [e], as it was never accented.) In the case of F1, there was a main effect for VOICE (F = 5.339, p < 0.05) and a significant interaction between SEGMENT and WORD STRUCTURE (F[1,11] = 15.904; p < 0.01).
In the case of F2, there was a main effect for VOICE (F[1,11] = 13.811, p < 0.01) and significant interactions between SEGMENT and VOICE (F[1,11] = 14.531; p < 0.01) and between SEGMENT and FOCUS (F[2,22] = 6.268; p < 0.01). In the case of F3, there was a main effect for FOCUS only (F[2,22] = 6.148; p < 0.01). The results of the separate analyses of variance of the three formants for the individual vowels are given in Table 2. The effect of the voicing of the labial plosive is confined to [e], whose F2 is 92 Hz higher and whose F3 is 33 Hz higher after the voiceless consonant than after the voiced consonant. This means that the vowel is slightly more centralized after [b] than after [p].

Table 2
Effects of voicing of labial plosive, focus condition, sentence mode and word structure on the F1, F2 and F3 of [ɒ] and [e].

Segment   Dependent variable   Voicing (df 1,11)   Focus (df 2,22)   Sentence mode (df 1,11)   Word structure (df 1,11)
[ɒ]       F1                   ns                  F = 3.650 *       ns                        F = 7.078 *
[ɒ]       F2                   ns                  F = 4.854 *       ns                        ns
[ɒ]       F3                   ns                  F = 4.277 *       ns                        ns
[e]       F1                   ns                  ns                ns                        F = 10.917 **
[e]       F2                   F = 25.361 **       F = 3.802 *       ns                        ns
[e]       F3                   F = 6.34 *          F = 4.077 *       ns                        ns

* p < 0.05. ** p < 0.01.

Table 3
Effects of voicing of labial plosive, focus condition, sentence mode and word structure on the intensity of [ɒ] and [e].

Segment   Voicing (df 1,11)   Focus (df 2,22)   Sentence mode (df 1,11)   Word structure (df 1,11)
[ɒ]       ns                  F = 42.46 **      ns                        F = 5.89 *
[e]       F = 15.92 **        F = 42.47 **      F = 7.20 *                F = 5.04 **

* p < 0.05. ** p < 0.01.

As for the focus condition, we found that the F1 of focal [ɒ] is marginally higher than that of post-focal [ɒ]
(25 Hz) and the neutral [ɒ] (14 Hz). The F2 of [ɒ] is higher in the post-focal condition than in the neutral condition (34 Hz) and the focal condition (52 Hz), while its F3 is 60 Hz lower in the post-focal condition than in the neutral condition and 46 Hz lower than in the focal condition. That is, [ɒ] is slightly more centralized in the post-focal condition than in the neutral and focal conditions. The F2 of [e] was 48 Hz lower in the post-focal condition than in the neutral condition, and F3 was 46 Hz lower in the post-focal condition than in the neutral condition, which, again, means that in the post-focal condition [e] was marginally more central. Finally, the effects of word type are summarized by observing that when [tɒ] has the pitch accent (CG), it has a marginally higher F1 (18 Hz) than when it has not (PW). Conversely, accented [e] (PW) has a marginally higher F1 (19 Hz) than unaccented [e] (CG). That is, vowels in accented syllables are fractionally, and negligibly, opener than in the unaccented case. 3.3. Intensity
[(Fig._6)TD$IG]
We report the results for intensity (dB) of the separate analyses of variance for the two target vowels separately in Table 3.
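As a reading aid for the Focus effects, which are tested with 2 and 22 degrees of freedom, the following minimal sketch computes the upper-tail probability of an F statistic. It relies on the closed form that holds only when the numerator has exactly 2 degrees of freedom, P(F > x) = (1 + 2x/d)^(-d/2) with d the error degrees of freedom. This is not part of the reported analysis: the function names are illustrative, and the sketch assumes uncorrected degrees of freedom (no sphericity correction).

```python
def p_value_f_df2(f_stat, df_error):
    """Upper-tail probability of an F(2, df_error) statistic.

    Uses the closed form (1 + 2F/d)^(-d/2), which is exact only when the
    numerator has 2 degrees of freedom (as for the three-level FOCUS factor).
    """
    if f_stat < 0 or df_error <= 0:
        raise ValueError("f_stat must be >= 0 and df_error must be > 0")
    return (1.0 + 2.0 * f_stat / df_error) ** (-df_error / 2.0)


def significance_label(p):
    """Map a p-value to the labels used in the tables: **, * or ns."""
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"


# F values for FOCUS effects quoted in the text and in Table 2 (df 2,22).
focus_f_values = {
    "SEGMENT x FOCUS, F2": 6.268,
    "FOCUS, F3": 6.148,
    "[ɒ] F2": 4.854,
    "[ɒ] F3": 4.277,
    "[e] F2": 3.802,
    "[e] F3": 4.077,
}

for label, f_stat in focus_f_values.items():
    p = p_value_f_df2(f_stat, 22)
    print(f"{label}: F = {f_stat}, p = {p:.4f} {significance_label(p)}")
```

The effects tested with 1 and 11 degrees of freedom have no equally simple closed form and would need a statistics library.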
[Figure 6: three rows of panels; vertical axis F0 (Hz), 50 to 350.]
Fig. 6. Mean declarative F0 contours for un and [tɒ[b/p]eʃe] on normalized time scale for PW (---) and CG (- - -) word structures separately, with target words in a neutral focus sentence (top), in post-focal position (middle) and focus position (bottom). Pooled over 4 speakers.
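Averaging contours on a normalized time scale requires that utterances of different durations be brought to a common number of samples first. The sketch below shows one simple way this could be done, by linear interpolation to a fixed number of equally spaced points followed by pointwise averaging. It is an illustration only, not the procedure used for the figure: the function names and the sample values are hypothetical, and it assumes each contour is a gap-free list of f0 values in Hz (unvoiced stretches would need separate handling).

```python
def resample_contour(contour, n_points):
    """Linearly interpolate a list of f0 samples (Hz) to n_points samples
    equally spaced over the normalized time interval [0, 1]."""
    if len(contour) < 2 or n_points < 2:
        raise ValueError("need at least two input samples and two output points")
    last = len(contour) - 1
    resampled = []
    for i in range(n_points):
        pos = i * last / (n_points - 1)   # fractional index into the input
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        resampled.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return resampled


def mean_contour(contours, n_points=10):
    """Average several contours of different lengths on a normalized time scale."""
    resampled = [resample_contour(c, n_points) for c in contours]
    return [sum(column) / len(column) for column in zip(*resampled)]


# Hypothetical f0 tracks (Hz) of two tokens with different durations.
token_a = [100.0, 200.0]
token_b = [200.0, 300.0, 400.0]
print(mean_contour([token_a, token_b], n_points=3))
```

In practice the normalization would more likely be applied per syllable or per segment, so that landmarks such as syllable boundaries line up across tokens before averaging.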
The voiced labial plosive causes the intensity of the following [e] to be 1.47 dB higher than the voiceless consonant does. In interrogatives, it is 2 dB higher than in declaratives, a statistical trend. We have no interpretation of these effects. As for FOCUS, [ɒ] is 3.03 dB higher in the neutral condition than in the post-focal condition, and 1.26 dB higher in the focal condition than in the neutral condition. Similarly, the intensity of [e] is 1.36 dB higher in the focal condition than in the neutral condition and 3.98 dB higher in the neutral condition than in the post-focal condition. This result matches the communicative nature of these conditions for both vowels, with more intense pronunciations in more ‘emphatic’ conditions. As for the effect of word structure, we found that accented [ɒ] is 2.06 dB higher than unaccented [ɒ], and accented [e] is 1.96 dB higher than unaccented [e]. Again, this result is in the expected direction for both vowels, but the effects are statistical trends.

3.4. Fundamental frequency

We report mean f0 for the PW and CG target words in the neutral, post-focal and focal conditions with declarative and interrogative intonation separately. Fig. 6 shows averaged contours on normalized time scales for the declarative condition, while Fig. 7 does the same for the interrogative condition. In comparison to the duration, intensity and spectral measurements, the f0 measurements show substantial differences between the two word types. In the top panels of Figs. 6 and 7, which show the neutral condition in declaratives and interrogatives respectively, accented [tɒ] is approximately 50 Hz (declarative) and 70 Hz (interrogative) higher than its unaccented counterpart. In the bottom panels, which give the focal condition, comparable differences are observed both for [tɒ] and [be].

To turn to the post-focal condition, a comparison of the neutral contrasts (Fig. 6, top panels) with the post-focal contrasts (middle panels) between the PW and CG pronunciations suggests that post-focal forms are not deaccented. With neutral focus (top), the first syllable [tɒ] of the CG (solid line) has high pitch and the following clitic has low pitch, and this pattern is reversed for the PW (dashed line). In the post-focal condition, there is evident Post-Focus Compression, and the differences between the two word types are reduced considerably as a result, but there are indications that the general pattern may be preserved. In the CG condition, there is a regular downtrend across the last four syllables, but in the PW condition, there is no lowering from [tɒ] to [be], which is consistent with an assumption that [be] has a range-compressed H-tone target. The interrogative contours (Fig. 7) confirm this conclusion. A comparison of the contrasts in neutral and post-focal positions shows that the post-focal pronunciations of the target words (middle panels) are reduced versions of the contrast in neutral position (top panels). A further indication that post-focal words are not deaccented is the relatively high f0 of un in the focal condition, where un is post-focal (bottom panels in Figs. 6 and 7). In the CG condition in particular (solid line), the third syllable in the target words has lower pitch than the following syllable un, which suggests there is an H-tone on un in both the declarative and the interrogative. Since the declarative ends in L%, that H-tone must be H*. A final observation concerns the utterance-final syllables in the interrogative contours. All of these remain quite level till the end. This is different from what is seen in many other languages, where a final boundary H% causes a local rise in pitch.

[Figure 7: three rows of panels; vertical axis F0 (Hz), 50 to 350.]
Fig. 7. Mean interrogative F0 contours for un and [tɒ[b/p]eʃe] on normalized time scale for PW (---) and CG (- - -) word structures separately, with target words in a neutral focus sentence (top), in post-focal position (middle) and focus position (bottom). Pooled over 4 speakers.

4. Experiment I: discussion

Experiment I was run to answer the question of whether Persian has phonetic stress, in the sense of features other than f0 that mark prominent syllables, or alternatively only tone, to be described as a pitch accent.
It was addressed by means of a detailed investigation of the phonetic differences between nouns and segmentally identical, but prosodically different noun+clitic combinations in a variety of conditions. The choice of these conditions was motivated by two considerations. The first was to spread the word accent contrast exemplified by the two structures across an array of contexts that might have an impact on the realization of the contrast. The second was to create a baseline for gauging the effect size of any phonetic differences we might find between the two word structures, so as to be able to assign them to the existence of stress, as opposed to regarding them as side effects of the existence of a pitch accent, i.e. of tone. The reasoning here is that phonological contrasts rarely confine their effect to a single, primary phonetic parameter, with tone only having an effect on f0 or [voice] only having an effect on the state of the glottis. Side effects are ubiquitous, and are often conventionalized in the phonetic implementation (Stevens and Keyser, 1989). The results showed that a number of structural contrasts are accompanied by differences in phonetic parameters that are not the primary phonetic exponents of these structural contrasts. The largest of these occurred as a function of sentence mode. Excessive final lengthening and some non-final shortening occurred in utterances with interrogative intonation as compared with the same utterances with the (tonally different) declarative intonation (102 ms for the final syllable). Next in importance were the durational effects of the value for [voice] of the intervocalic plosive on its closure and burst durations, which are longer for [p] than for [b] (by 51 ms), and on the preceding and following vowels, which are shorter in the case of [p] (by 27 ms and 11 ms, respectively).
While the specific finding that vowels are longer after onset [b] than they are after [p] is new, as far as we are aware, both the interrogative durational pattern and the durational effects of the laryngeal specification of plosives are in line with many earlier findings (e.g. Rischel, 1974; Ryalls et al., 1994; Smith, 2002; van Heuven and van Zanten, 2005; Stoel, 2007; Luce and Charles-Luce, 1985; Kluender et al., 1988). If we ignore the effect of position in the sentence, an inevitable confound of focus in our data, the durational effect of focus is very small, with a 10 ms longer [t]-burst duration in post-focal pronunciation relative to neutral pronunciation. Similarly small effects are found for the structural difference at issue, that of the prosodic difference between first and second syllable accentuation in nouns and noun+clitic combinations, respectively. Adding significant as well as near-significant effects over the consonant and vowel in each syllable, we found a 15 ms longer duration of the first accented syllable and a 13 ms longer duration of the second accented syllable than in their unaccented counterparts. The findings for intensity and the spectral measures lead to the same conclusion. The intensity of the vowels responded most clearly to the variation in focus, with vowels in post-focal words having lower intensity than under neutral focus and having less intensity under neutral focus than under focus. A similar effect was found for the difference in the position of the word prominence, but it was statistically less robust. The spectral differences between accented and unaccented versions of the two vowels are extremely small, and smaller than those that resulted from the different focus conditions. In the case of [e], we found a difference in F2 of the vowel after the labial plosive which is comparable in size to the difference in F1 we found between the accented and unaccented versions of this vowel.
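The baseline reasoning for duration can be made concrete by placing the effect sizes quoted in this discussion side by side. The following minimal sketch only tabulates and ranks those figures; the labels and the function name are illustrative, and no values are introduced beyond the ones stated in the text.

```python
# Durational effect sizes (ms) as quoted in the discussion of Experiment I.
# Each entry: (structural contrast, measured span, effect in ms).
DURATIONAL_EFFECTS_MS = [
    ("sentence mode", "final syllable, interrogative vs declarative", 102),
    ("voicing", "closure and burst, [p] vs [b]", 51),
    ("voicing", "preceding vowel, [b] vs [p]", 27),
    ("voicing", "following vowel, [b] vs [p]", 11),
    ("focus", "[t]-burst, post-focal vs neutral", 10),
    ("word structure", "first syllable, accented vs unaccented", 15),
    ("word structure", "second syllable, accented vs unaccented", 13),
]


def rank_effects(effects):
    """Return the effects sorted from largest to smallest."""
    return sorted(effects, key=lambda entry: entry[2], reverse=True)


for contrast, span, size_ms in rank_effects(DURATIONAL_EFFECTS_MS):
    print(f"{size_ms:>4} ms  {contrast}: {span}")
```

Read this way, the prominence-related differences (word structure) sit well below the durational side effects of sentence mode and of consonant voicing on the plosive itself, which are contrasts that nobody would attribute to stress.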
In short, the durational and spectral differences between accented and unaccented vowels stay well below the baseline for a phonological status of stress.

5. Experiment II

The results of Experiment I appeared to indicate that, while there is Post-Focus Compression in the declarative and interrogative data, the tonal distinctions between the two word types remain intact after the focus. Experiment II was conducted to see whether Post-Focus Compression merely compresses the pitch range or alternatively causes the deletion of the tones of the pitch accent. Lack of deaccentuation under Post-Focus Compression predicts that the salience of the contrast between the PW and CG conditions may well be reduced, because the distinction between the high pitch of the prominent syllables and the low pitch of the non-prominent syllables is reduced, but that it is nevertheless categorically present. We used a word identification task to test this prediction.

5.1. Experiment II: materials

Twelve utterances were selected from the recordings by each of four of the twelve speakers who contributed to the corpus used for Experiment I, two randomly chosen female speakers and two randomly chosen male speakers. The utterances contained equal numbers of noun (PW) and noun+clitic (CG) versions of the same segmental strings. In order to see if interruption of f0 might aggravate the difficulty of perceiving the post-focal contrast, half of the utterances had the target words with the voiceless plosive and half those with the voiced plosive. The focus condition was represented by including the three sentence frames representing the neutral, post-focal and focal conditions. This yielded 4 (speakers) × 2 (plosives) × 2 (word structures) × 3 (focus conditions), or 48 stimuli. We only used the declarative sentences in this experiment, as there seemed to be a tendency to preserve the contrast better in the interrogative sentences (cf. the middle panels in Figs. 6 and 7). Inclusion of interrogative sentences might have caused a positive bias in the results, something we wanted to avoid.

5.2. Experiment II: procedure

Twenty subjects, 8 female and 12 male, were recruited from the student population of the University of Tehran. They were tested individually in the phonetics laboratory of the University of Tehran with the help of a Praat Multiple Forced Choice experiment run on a laptop (Boersma and Weenink, 1992--2009).
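The stimulus set described in Section 5.1 is a full crossing of four factors. As a check on the count, the following minimal sketch enumerates that design. The speaker labels and the function name are hypothetical placeholders; only the factor levels come from the text.

```python
from itertools import product

# Factor levels as described in Section 5.1. Speaker labels are hypothetical.
SPEAKERS = ("female_1", "female_2", "male_1", "male_2")
PLOSIVES = ("b", "p")
WORD_STRUCTURES = ("PW", "CG")
FOCUS_CONDITIONS = ("neutral", "post-focal", "focal")


def build_stimulus_list():
    """Return one dict per stimulus for the fully crossed design."""
    return [
        {"speaker": speaker, "plosive": plosive,
         "structure": structure, "focus": focus}
        for speaker, plosive, structure, focus
        in product(SPEAKERS, PLOSIVES, WORD_STRUCTURES, FOCUS_CONDITIONS)
    ]


stimuli = build_stimulus_list()
per_speaker = sum(1 for s in stimuli if s["speaker"] == SPEAKERS[0])
print(len(stimuli), "stimuli in total,", per_speaker, "per speaker")
```

The per-speaker count corresponds to the twelve utterances selected from each speaker, and each focus condition contributes a third of the set.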
They were instructed to listen to each stimulus and to select one of four structures displayed on the screen, where the words [tɒbéʃ] ‘light’, [tɒ́b-eʃ] ‘swing+his/her’, [tɒpéʃ] (nonsense word) and [tɒ́p-eʃ] ‘tank-top+his/her’ appeared in Arabic spelling in four clickable buttons, in this order. The order of the stimuli was randomized per listener. Before the test proper, subjects did six practice items to familiarize themselves with the task. They could listen to each stimulus as often as they wished, but once they made their choice, the next screen appeared automatically.

5.3. Experiment II: results

Correct scores were pooled over the stimuli spoken by the different speakers, and an analysis of variance (repeated measures) was performed on them with WORD STRUCTURE (CG vs. PW), VOICE ([b], [p]) and FOCUS (neutral, post-focal, focal) as factors. There was a main effect for FOCUS (F[2] = 54.125, p < 0.001). A post hoc test (Sidak) showed that the post-focal condition was significantly different from the neutral and focus conditions (p < 0.001). The lower recognition scores in the post-focal condition are due to Post-Focus Compression, which as we have seen reduces the phonetic difference between the F0 of accented and unaccented syllables. Inspection of the errors showed that there were no confusions between the voiced and voiceless target words. Thus, while the chance level in this four-choice task is technically 25%, in practice it is 50% for the difference between noun and noun+clitic combinations, given that there is no variation in the scores for the voicing distinction. The score of 73% is clearly considerably better than a chance level of 50% (see Fig. 8).

Fig. 8. Correct scores in a word identification task for noun (PW) vs noun+clitic combinations (CG) as obtained in the neutral condition, the post-focal condition and the focal condition.

6. Conclusion

There were two questions we intended to address in our investigation. One concerned the presence of phonetic differences other than f0 in word prominent syllables and the other was whether Post-Focus Compression involved the deletion of the tonal structure that is responsible for the word prominence. The results of Experiment I showed extremely small durational, intensity and spectral differences between prominent and non-prominent syllables. They were smaller than, or at least as small as, those found between different focus conditions, between vowels before and after voiced and voiceless labial plosives and between declarative and interrogative sentences. While the differences were in the direction to be expected from a difference in phonetic stress, with slightly longer, slightly more intense and slightly more open mid vowels in the prominent condition, their effect size fell well below what would be expected from a difference in stress. By contrast, the f0 differences were substantial. The Persian word prosodic prominence contrast, therefore, is that between the presence vs. absence of a pitch accent. The literature on Persian (Sadat Tehrani, 2007) as well as our data suggest that this pitch accent is (L+)H*, with the H-tone going to the last syllable of the lexical word domain, whereby L is overt in polysyllabic words.
Because clitics fall outside this domain, but are syllabified with it, minimal pairs that differ in the location of the prominent syllable arise whenever a clitic has the same segmental composition as the last segments of a word. Specifically, the cliticized words (also ‘clitic group’ or CG) have non-final prominence where the word (also ‘phonological word’ or PW) has final prominence. Our experiment did not aim to elucidate the prosodic status of these constituents. The data we collected are consistent with an interpretation of all these structures as phonological words and with predictable prominence assignment taking place in the lexicon. Experiment II confirmed an impression that could be gained from the production data in Experiment I. The phonetic difference between CG [tɒ́[b/p]eʃe] and PW [tɒ[b/p]éʃe] we observed in the neutral focus condition appeared to be preserved after the focus, where the pitch range was compressed. That is, a higher first syllable in the CG condition than in the PW condition was observable in the post-focal condition, even if the F0 difference was less than in the other conditions. A word identification task showed that the contrast, which reached a 96% correct score in the data without Post-Focus Compression, still reached a 73% correct score in the post-focal condition, where Post-Focus Compression applies. This means that there was no neutralization between CG and PW, and that the prosodic difference between them is intact. Persian thus differs from English in two respects. First, there is no comparable difference between stressed and unstressed syllables independently of the presence of the pitch accent, and second, unlike the pitch accents of English, the Persian pitch accent is not deleted after the focus. While Persian does have Post-Focus Compression (Xu et al., 2012), the reduction in pitch range is phonetic and leaves the tonal structure intact. These features make Persian more like Northern Bizkaian Basque (e.g.
Hualde et al., 2007) and Tokyo Japanese (Pierrehumbert and Beckman, 1988; Kubozono, 1993) than English (e.g. Beckman, 1986), Dutch (Rietveld et al., 2004; van Heuven and de Jonge, 2010), Spanish (Ortega-Llebaria and Prieto, 2010) or Catalan (Ortega-Llebaria et al., 2010), where stressed and unstressed syllables differ in duration and often also in vowel quality. However, unlike Japanese and Northern Bizkaian Basque, Persian has obligatory accent, making it similar to Nubi (Gussenhoven, 2006) and Turkish (Levi, 2005). A preliminary conclusion therefore is that this kind of system, which is both culminative and obligatory and as such counts as a ‘stress system’ in the sense of Hyman (2006), may be more common than is suggested by its relative sparseness in the typological literature. We have not addressed all the relevant issues. One of these concerns the question whether there is only a single pitch accent, (L+)H*, or more, as in English. Neither have we investigated the question whether deaccentuation of the word-based pitch accent might be systematic in other contexts. If the pitch accent is routinely deleted in other contexts, that would evidently compromise its culminative status. Observe that there is no post-lexical process in English that affects the location or presence of stressed syllables (Gussenhoven, 2011), and thus all stress-changing processes take place during word derivation (satan -- satanic, explain -- explanation, etc.). That is, culminativity in English is absolute. It remains to be seen whether the same is true for Persian.4
4 Nima Sadat-Tehrani ran an informal small-scale replication of the perception experiment and found that responses in the post-focal condition were ‘in the vicinity of chance level’. To the third author, who doesn’t speak Persian, the post-focal stimuli in Experiment II sound deaccented and neutralized. A formal replication of the experiment would be welcome. Instead of a reading task, as used in Experiment I and which yielded the stimuli we used, a more realistic elicitation task would be desirable.
Our data are too limited to argue for any particular tonal analysis of Persian. However, some aspects of the data would seem to conflict with claims in the literature. Interrogatives end in level pitch, rather than a final rise. Since a sequence of H* H% (or H* H-H%, as in Hirschberg and Pierrehumbert, 1986) has generally been used to describe upstepped contours, like the high rise of English (e.g. Gussenhoven, 2004:302) or the ‘rise to high’ of French (Post, 2000), the Persian contour may need to be analyzed with the absence of a boundary tone (cf. Grabe, 1998:49). This would mean that Persian contrasts a declarative L% with the absence of a boundary tone (Ø) for interrogatives. H% might be reserved for non-final IPs, as suggested by the examples in Sadat Tehrani (2007).

Acknowledgements

Experiment I was conducted by the first author under the supervision of the second author. The data have been reanalyzed and interpreted in collaboration with the third author. We thank the participants of Experiment I and Experiment II at the University of Tehran and Joop Kerkhoff for technical assistance. We are grateful for the comments by Hamed Rahmani, Nima Sadat-Tehrani and three anonymous reviewers, which have helped greatly to improve the final text. The first author acknowledges the ITRC Grant awarded by the Iranian Ministry of Information and Communication Technology, which enabled her to carry out research at Radboud University Nijmegen. The Ministry has in no way influenced the contents of this report.

References

Beckman, M.E., 1986. Stress and Non-stress Accent. Foris, Dordrecht.
Bijankhan, M., Nourbakhsh, M., 2009. Voice onset time in Persian initial and intervocalic stop production. Journal of the International Phonetic Association 39, 335--364.
Boersma, P., Weenink, D., 1992--2009. Praat: Doing Phonetics by Computer. Version 5.1.04. www.praat.org.
Bolinger, D., 1958. A theory of pitch accent in English. Word 14, 109--149.
Chen, Y., Guion-Anderson, S., 2012. Prosodic realization of focus in Mandarin by advanced American learners of Chinese. Journal of the Acoustical Society of America 131, 3200--3234.
De Jong, K., Zawaydeh, B.A., 1999. Stress, duration, and intonation in Arabic word-level prosody. Journal of Phonetics 27, 3--22.
Elordieta, G., Hualde, J.I., 2003. Tonal and durational correlates of accent in contexts of downstep in Lekeitio Basque. Journal of the International Phonetic Association 33, 195--209.
Eslami, M., 2000. Šenaxt-e nævay-e goftar-e zæban-e farsi væ karbord-e an dar bazsazi væ bazšenas-ye rayan’i-ye goftar [The prosody of the Persian language and its application in computer-aided speech recognition]. Ph.D. Dissertation, University of Tehran.
Eslami, M., Bijankhan, M., 2002. Nezam-e ahæng-e zæban-e Farsi [Persian intonation system]. Iranian Journal of Linguistics 34, 36--61.
Ferguson, C., 1957. Word stress in Persian. Language 33, 123--135.
Fortescue, M., 1984. West Greenlandic. Croom Helm, London.
Goldsmith, J., 1975. Autosegmental phonology. Ph.D. Dissertation, MIT.
Gomez-Imbert, E., Kisseberth, M., 2000. Barasana tone and accent. International Journal of American Linguistics 66, 419--463.
Grabe, E., 1998. Comparative intonational phonology: English and German. Ph.D. Dissertation, Radboud University Nijmegen. Published in MPI Series in Psycholinguistics.
Gussenhoven, C., 2004. The Phonology of Tone and Intonation. Cambridge University Press, Cambridge, UK.
Gussenhoven, C., 2006. The word prosody of Nubi: between stress and tone. Phonology 23, 193--223.
Gussenhoven, C., 2011. Sentential prominence in English. In: van Oostendorp, M., Ewen, C.J., Hume, E., Rice, K. (Eds.), The Blackwell Companion to Phonology, vol. 5. Wiley-Blackwell, Malden, MA/Oxford, pp. 2780--2806.
Hayes, B., 1995. Metrical Stress Theory: Principles and Case Studies. Chicago University Press, Chicago.
Hill, A.A., 1962. First Texas Conference on Problems of Linguistic Analysis. University of Texas, Austin.
Hirschberg, J., Pierrehumbert, J., 1986. Intonational structuring of discourse. In: Proceedings of the 24th Meeting of the Association for Computational Linguistics. pp. 136--144.
Hualde, J., Elordieta, G., Gaminde, I., Smiljanić, R., 2007. From pitch accent to stress accent in Basque. In: Gussenhoven, C., Warner, N. (Eds.), Laboratory Phonology, vol. 7. Mouton de Gruyter, Berlin/New York, pp. 547--584.
Hyman, L.M., 2006. Word-prosodic typology. Phonology 23, 225--257.
Hyman, L.M., 2009. How (not) to do phonological typology: the case of pitch-accent. Language Sciences 31, 213--238.
Kahnemuyipour, A., 2003. Syntactic categories and Persian stress. Natural Language and Linguistic Theory 21, 333--379.
Kluender, K.R., Diehl, R.L., Wright, B.A., 1988. Vowel-length differences before voiced and voiceless consonants: an auditory explanation. Journal of Phonetics 16, 153--169.
Kubozono, H., 1993. The organization of Japanese prosody. Studies in Japanese Linguistics, vol. 2. Kurosio Publishers, Tokyo.
Lazard, G., 1957. Grammaire du Persan Contemporain. Klincksieck, Paris. New Edition published by Peeters, Paris, 2006.
Levi, S., 2005. Acoustic correlates of lexical accent in Turkish. Journal of the International Phonetic Association 35, 73--97.
Luce, P.A., Charles-Luce, J., 1985. Contextual effects on vowel duration, closure duration and the vowel consonant ratio in speech production. Journal of the Acoustical Society of America 78, 1949--1957.
Mahjani, B., 2003. An instrumental study of prosodic features and intonation in modern Farsi (Persian). MA Thesis, University of Edinburgh.
Ortega-Llebaria, M., Prieto, P., 2010. Acoustic correlates of stress in Central Catalan and Castilian Spanish. Language and Speech 54, 73--97.
Ortega-Llebaria, M., Vanrell, M.M., Prieto, P., 2010. Catalan speakers’ perception of word stress in unaccented contexts. Journal of the Acoustical Society of America 127, 462--471.
Pierrehumbert, J., 1980. The phonology and phonetics of English intonation. Ph.D. Dissertation, MIT. Distributed 1988, Indiana University Linguistics Club.
Pierrehumbert, J., Beckman, M.E., 1988. Japanese Tone Structure. MIT Press, Cambridge, MA.
Pierrehumbert, J., Hirschberg, J., 1990. The meaning of intonational contours in the interpretation of discourse. In: Cohen, P., Morgan, J., Pollack, M. (Eds.), Intentions in Communication. MIT Press, Cambridge, MA, pp. 271--311.
Post, B., 2000. Tonal and Phrasal Structures in French Intonation. LOT Publications, Utrecht.
Rietveld, T., Kerkhoff, J., Gussenhoven, C., 2004. Word prosodic structure and vowel duration in Dutch. Journal of Phonetics 32, 349--371.
Rischel, J., 1974. Topics in West Greenlandic Phonology. Akademisk, Copenhagen.
Ryalls, J., Le Dorze, G., Lever, N., Ouellet, L., Larfeuil, C., 1994. The effects of age and sex on speech intonation and duration for matched statements and questions in French. Journal of the Acoustical Society of America 95, 2274--2276.
Sadat Tehrani, N., 2007. The intonational grammar of Persian. Ph.D. Dissertation, University of Manitoba.
Same’i, H., 1996. Tekye-ye fe’l dær zæban-e farsi: Yek bæresi-ye mojædæd [Verb stress in Persian: a re-examination]. Nameye Færhængestan 1, 6--21.
Schmerling, S., 1976. Aspects of English Sentence Stress. University of Texas Press, Austin.
Shaqaqi, V., 1993. Clitics in Persian. Ph.D. Dissertation, University of Tehran.
Smith, C., 2002. Prosodic finality and sentence type in French. Language and Speech 45, 141--178.
Stevens, K.N., Keyser, S.J., 1989. Primary features and their enhancement in consonants. Language 65, 81--106.
Stoel, R., 2007. Question intonation in Fataluku. Presented at the Fifth East Nusantara Conference, Kupang, Indonesia. www.fataluku.com.
van Heuven, V.J.J.P., de Jonge, M., 2010. Spectral and temporal reduction as stress cues in Dutch. Phonetica 68, 120--132.
van Heuven, V., van Zanten, E., 2005. Speech rate as a secondary prosodic characteristic of polarity questions in three languages. Speech Communication 47, 87--99.
van Son, R.J.J.H., Pols, L.C.W., 1999. An acoustic description of consonant reduction. Speech Communication 28, 125.
Wellens, I., 2005. The Nubi Language of Uganda: An Arabic Creole in Africa. Brill, Leiden.
Xu, Y., Chen, S.-w., Wang, B., 2012. Prosodic focus with and without post-focus compression (PFC): a typological divide within the same language family? The Linguistic Review 29, 131--147.