The role of metrical information in apraxia of speech. Perceptual and acoustic analyses of word stress




Author’s Accepted Manuscript The role of metrical information in apraxia of speech. Perceptual and Acoustic Analyses of Word Stress Ingrid Aichert, Mona Späth, Wolfram Ziegler www.elsevier.com/locate/neuropsychologia

PII: S0028-3932(16)30009-4
DOI: http://dx.doi.org/10.1016/j.neuropsychologia.2016.01.009
Reference: NSY5851

To appear in: Neuropsychologia Received date: 31 August 2015 Revised date: 8 December 2015 Accepted date: 10 January 2016 Cite this article as: Ingrid Aichert, Mona Späth and Wolfram Ziegler, The role of metrical information in apraxia of speech. Perceptual and Acoustic Analyses of Word Stress, Neuropsychologia, http://dx.doi.org/10.1016/j.neuropsychologia.2016.01.009 This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting galley proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

The role of metrical information in apraxia of speech. Perceptual and Acoustic Analyses of Word Stress. Ingrid Aichert, Mona Späth & Wolfram Ziegler EKN – Clinical Neuropsychology Research Group Institute of Phonetics and Speech Processing, Ludwig-Maximilians-Universität München Corresponding author: Ingrid Aichert EKN – Clinical Neuropsychology Research Group Ludwig-Maximilians-Universität München Institute of Phonetics and Speech Processing Schellingstr. 3 80799 München Phone: 0049-89-3068-5811, Fax: 0049-89-3068-7561 Email: [email protected]


Abstract Several factors are known to influence speech accuracy in patients with apraxia of speech (AOS), e.g., syllable structure or word length. However, the impact of word stress has largely been neglected so far. More generally, the role of prosodic information at the phonetic encoding stage of speech production often remains unconsidered in models of speech production. This study aimed to investigate the influence of word stress on error production in AOS. Two-syllabic words with stress on the first (trochees) vs. the second syllable (iambs) were compared in 14 patients with AOS, three of them exhibiting pure AOS, and in a control group of six normal speakers. The patients produced significantly more errors on iambic than on trochaic words. A most prominent metrical effect was obtained for segmental errors. Acoustic analyses of word durations revealed a disproportionate advantage of the trochaic meter in the patients relative to the healthy controls. The results indicate that German apraxic speakers are sensitive to metrical information. It is assumed that metrical patterns function as prosodic frames for articulation planning, and that the regular metrical pattern in German, the trochaic form, has a facilitating effect on word production in patients with AOS. Key Words: Apraxia of speech; word stress; trochee; iamb; prosody


1. Introduction Adult speakers rely on acquired implicit knowledge about how to move their articulators for the production of speech sounds. This knowledge is considered to be embodied in “phonetic plans” which provide the basis of fluent, automated speech. Apraxia of speech (AOS), an acquired neurogenic motor speech disorder, is conventionally defined as resulting from a deconstruction of such information (e.g., Code, 1998; for a review see Ziegler et al., 2012). It may therefore provide a window into the internal make-up of learned speech motor behaviour. This article deals with the architecture of speech motor plans as inferred from speech errors of patients with AOS. More specifically, it addresses the question of how articulation planning for vowels and consonants interacts with the rhythmical structure of speech. This question is particularly controversial in psycholinguistic and phonetic theories of speech production. The speech production model developed by Levelt and coworkers (Levelt et al., 1999; Cholin et al., 2004), for instance, takes a perspective in which the segments of an utterance, i.e., its vowels and consonants, and the metrical structure of the utterance are represented on different tiers. The metrical frame of an intended word specifies the number of syllables and, in cases of irregular stress, the main stress position (Roelofs and Meyer, 1998). As soon as a word’s segmental and metrical properties have been retrieved, syllabification rules are applied to attach the segments, from left to right, to the metrical template. The output of this association process, a syllabified phonological word, constitutes the input to the phonetic encoding stage of word production. The phonetic encoding component of Levelt et al.’s (1999) speech production model dispenses with any prosodic mechanisms. In this theory, the frequent phonological syllables of syllabified words activate corresponding phonetic syllables stored in a mental syllabary. 
The syllabary is considered as a repository of holistic, overlearned motor plans containing the gestural specifications for the lower-level speech motor system (Cholin et al., 2006; Cholin, 2008). Metrical structure plays no role at this stage. As a consequence, if the phonetic encoder embodies overlearned, automated aspects of speech motor behavior, the motor patterns implementing, for instance, a trochaic vs. an iambic rhythm are, in Levelt’s theory, not a part of this skill. Hence, there is also no interaction between segmental and prosodic information within the phonetic encoding component of this model. In contrast, phonetic studies provide abundant evidence for an interplay of segmental with metrical information. Generally, the realization of sounds is considered to be strongly influenced by the prosodic framework of an utterance. To mention only a few examples: In several studies stress-related “strengthening effects”, i.e., more extensive and prolonged articulations in stressed vs. unstressed syllables, were described (e.g., De Jong, 1995). Another prosodic effect related to the strengthening concept is “domain-final lengthening”: A study by Wightman and colleagues (1992), for instance, reported a more extreme lengthening at the end of higher prosodic domains (e.g., intonational phrases) as compared to lower prosodic domains (e.g., syllables). Similar effects have also been described domain-initially (e.g., Fujimura, 1990). Several electropalatographic studies have, for instance, reported that consonants are produced with greater articulatory magnitude in domain-initial as compared to domain-medial positions at different levels (e.g., Fougeron and Keating, 1997; Fougeron, 2001). These investigations are predominantly concerned with surface properties of speech sounds and speech movements and refrain from drawing inferences about the underlying phonetic encoding mechanisms.
However, it can plausibly be assumed that at least some of these surface phenomena are mediated by highly automated interlinks between segmental and metrical properties at the phonetic encoding level and that they reflect learned aspects of the motor organization of speech. This assumption receives support from more recent investigations. In a tongue twister experiment with unimpaired speakers (Croot et al., 2010) it was shown that sentence-level prominence, i.e., accent, protects against errors. Croot and colleagues inferred from their results that segmental content and prosodic structure of an utterance are integrated during the phonetic encoding stage. In reading experiments, Sulpizio

and colleagues (2015) reported an influence of word stress on reading latency and pronunciation accuracy and concluded that stress assignment affects the reading process during articulatory planning. As regards the characteristics of the speech impairment in patients with AOS, there is general agreement that segmental aspects of speech motor programming are affected in these individuals. The cardinal symptoms of the disorder, i.e., phonemic errors and phonetic distortions, are considered segmental in nature because they affect the vowels and consonants of spoken utterances (for an overview see McNeil et al., 2009). Moreover, a wide range of factors related to the segmental structure of words are known to influence the error pattern of patients with AOS, including, for instance, sound class or syllable complexity (e.g., Canter et al., 1985; Odell et al., 1990; Romani and Galluzzi, 2005; Aichert and Ziegler, 2004a; Laganaro, 2008; Staiger and Ziegler, 2008; Galluzzi et al., 2015). In contrast, influences of prosodic properties (e.g., word stress) on accuracy in apraxic speakers have been neglected so far, even though dysfluency and prosodic impairment are regularly mentioned among its clinical signs. Kent and Rosenbek (1982, 1983) emphasized that syllable segregation and a lengthening of steady-state segments and transitions (articulatory prolongation) constitute characteristic symptoms of AOS (for a similar pattern in cases of Primary Progressive Aphasia with AOS see Ballard et al., 2014). Furthermore, a flattening of the intensity contours across syllable chains and of stress contrasts was described (Kent and Rosenbek, 1982, 1983; Itoh and Sasanuma, 1984; Marquardt et al., 1995). In the “Treatment guidelines for acquired apraxia speech” of the Academy of Neurologic Communication Disorders and Sciences (Wambaugh et al., 2006a) two out of five primary clinical characteristics are related to the suprasegmental characteristics of the disorder (p. xvii). 
However, the nature of the prosodic symptoms is controversial: some authors assume that they result from a genuinely prosodic impairment (e.g., Kent and Rosenbek, 1982, 1983; Boutsen and Christman, 2002), but others suggest that they reflect a compensation of a foremost segmental deficit (e.g., Lebrun, 1990; Marquardt et al., 1995). Remarkably, the interplay between articulation and prosody has been an issue in AOS therapy more than in basic research. A number of rhythm-based treatment methods have been proposed and considered successful in AOS therapy (for an overview see Wambaugh et al., 2006b; Ballard et al., 2015). Some of these techniques rely on internal rhythmical cues, such as finger tapping (Wambaugh and Martinez, 2000), while others use external cues, e.g. tactile (Rubow et al., 1982) or acoustic (Shane and Darley, 1978; Dworkin et al., 1988; Wambaugh and Martinez, 2000; Brendel and Ziegler, 2008). Furthermore, rhythmic speech and singing are basic elements of Melodic Intonation Therapy applied in the treatment of severely impaired non-fluent aphasics (e.g., Helm-Estabrooks et al., 1989; Sparks and Deck, 1994). Stahl and colleagues (2011) compared the contribution of singing vs. rhythm in patients with non-fluent aphasia, most of whom were diagnosed with AOS, and suggested that rhythm rather than melody was crucial in the recovery of speech in these patients. In sum, the results from treatment studies demonstrate that the symptoms of AOS can be modulated positively by interventions focused on the rhythmical aspects of speaking. Notably, the effects of these approaches are apparently not confined to the suprasegmental symptoms of apraxic speech (e.g. fluency), but they also encroach on articulatory accuracy at a segmental level. What is still lacking, however, is an experimental approach towards understanding how prosodic structure impacts speech motor planning in patients with AOS. 
Based on the premise that AOS provides a window into the internal make-up of learned speech motor behavior, we here seek evidence for an interaction of segmental and prosodic information in speech planning by analyzing the factors influencing the occurrence of speech errors in apraxic speakers. In a recent investigation including 33 patients with AOS we applied a nonlinear regression model to predict apraxic speech errors on words and nonwords as a function of gestural decomposition, syllable structure, syllable number, and metrical

rhythm (Ziegler and Aichert, 2015). In this study we had discovered, among several other things, that two-syllabic trochaic words, which constitute the regular metrical pattern in German (Domahs et al., 2008; Wiese 2000; e.g., ‘puma, engl. puma), were produced with fewer errors than iambic words (e.g., me’nü, engl. menu). However, since the materials used in this experiment were not devised to examine metrical influences on speech in a very specific way, this finding was obtained as a side effect which deserves closer investigation. The present study was therefore designed to directly explore the influence of word stress in a new sample of apraxic speakers by using highly specific and tightly controlled speech materials. 2. Method 2.1. Participants Twenty-one patients with a diagnosis of apraxia of speech (AOS) according to clinical records were recruited from twelve clinical centres in Germany. Before inclusion they were screened for participation by a thorough assessment of their spontaneous speech and a word repetition test administered by the first author. Inclusion criteria were (1) a confirmed diagnosis of AOS, (2) no or only very mild dysarthria, (3) no or only minor aphasic impairment in word repetition, (4) intact auditory discrimination, and (5) ability to participate in an auditory repetition test involving two-syllabic words. Confirmation of the clinical diagnosis of AOS was grounded on the following criteria (e.g., Ziegler, 2008): (i) inconsistent occurrence of phonetic distortions and presence of perceived phonemic errors, (ii) suprasegmental disturbances such as intrasyllabic pauses, syllable segregation, or lengthening of speech sounds and sound transitions, and (iii) visible/audible groping, self-corrections, and effortful speaking. From the initial patient sample of 21, seven participants failed to match the inclusion criteria. 
Four patients suffered from a notable dysarthric or aphasic impairment, and in three patients speech apraxia was too mild given the low demands of the single word repetition task administered here. Therefore, 14 patients were finally included. All participants were right-handed native speakers of German; most of them had suffered an ischemic infarct. There was no clinically apparent difference between the 11 patients who had ischemic infarcts and the three patients with other aetiologies (for an overview of demographic and clinical characteristics of the patient sample see Table 1).

Table 1: Demographic and clinical characteristics of the patient sample (n = 14)

| Characteristic | Value |
|---|---|
| Age in years: median (range) | 64 (26-87) |
| Months post onset: median (range) | 8 (1-144) |
| Aetiology | ischemic (11), hemorrhagic (2), traumatic (1) |
| Severity of aphasia¹ | no (3), mild (5), moderate (5), severe (1) |
| Severity of AOS² | 4 mild, 9 moderate, 1 severe |
| Auditory word discrimination (LeMo³, test 2) | 14 unimpaired |
| Auditory word-picture matching (LeMo³, test 23) | 12 unimpaired, 2 at threshold |
| Verbal naming⁴ (LeMo³, test 30) | 6 unimpaired, 2 at threshold, 6 moderate-severe |

¹ Aachener Aphasie Test; Huber et al., 1983
² based on the percentage of segmentally correct items in the experimental materials (mild: >70% / moderate: 30–70% / severe: <30% correct responses).
³ LeMo; De Bleser et al., 2004
⁴ responses with errors on only one phoneme (e.g. [] (engl. dress) → []) were scored as correct.
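The severity bands in footnote 2 of Table 1 can be written down as a small function. This is only an illustrative sketch in Python, not the authors' procedure: the function name `classify_aos_severity` is hypothetical, and assigning the boundary values 30% and 70% to the "moderate" band is an assumption read off the ranges given in the footnote.

```python
def classify_aos_severity(percent_correct: float) -> str:
    """Map the percentage of segmentally correct items to a severity band.

    Bands follow footnote 2 of Table 1: mild > 70 %, moderate 30-70 %,
    severe < 30 % correct. Treating exactly 30 % and 70 % as "moderate"
    is an assumption of this sketch.
    """
    if not 0.0 <= percent_correct <= 100.0:
        raise ValueError("percent_correct must lie between 0 and 100")
    if percent_correct > 70.0:
        return "mild"
    if percent_correct >= 30.0:
        return "moderate"
    return "severe"
```

Expressed in error rates, the same bands read: mild below 30% errors, moderate 30–70% errors, severe above 70% errors.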

According to the Aachener Aphasie Test (AAT; Huber et al., 1983), three patients had no aphasia and were hence classified as pure AOS. Only one patient was diagnosed with a severe aphasia. Most patients had undisturbed auditory single-word processing abilities as revealed by subtests 2 (auditory word discrimination) and 23 (auditory word-picture matching) from the LeMo battery (Lexikon modellorientiert, De Bleser et al., 2004). Hence, auditory impairment could be ruled out as a factor influencing repetition accuracy. An oral picture naming test (LeMo, test 30) revealed intact word retrieval in six patients. Two patients were mildly disturbed, and six patients displayed moderate to severe word-finding difficulties (Table 1). Regarding the error types in the naming task, the patients with naming deficits produced mainly null reactions (50%) or semantic errors (35%). Of note, these two error types were not observed in the experimental word repetition task. Based on these neurolinguistic data, a core assumption of this study is that the patients’ performance on the experimental task used here, i.e., word repetition, was primarily affected by their apraxic impairment, with a negligible aphasic or dysarthric contribution. Furthermore, there was no relationship between the severity of AOS and aphasia severity. More specifically, the patient with the most severe AOS was only mildly aphasic, and the patients with moderate AOS ranged across all degrees of aphasia severity, from no aphasia to severe aphasia. Severity of AOS was ranked on the basis of rates of overall segmental errors made in the present repetition experiment. In four cases the apraxic impairment was classified as mild (<30% errors), in nine cases as moderate (30–70% errors), and in one case as severe (>70% errors; see Table 1). Six neurologically healthy control speakers (four female, two male; age, mean: 51, range: 36–73) were also examined. 2.2.
Materials and Procedure The materials consisted of 48 two-syllabic nouns with stress on the first (trochees) or second syllable (iambs), respectively. The word list was grouped into six sublists of 8 words each, including words with simple CV and CVC structures and with complex syllables in the stressed word position. All words were monomorphemic nouns of low lexical frequency (< 10 spoken/written per million; CELEX database, Baayen et al., 1995). Trochees and iambs were matched for phoneme density (mean phoneme number per word in trochees: 5.25; in iambs: 5.17). The organization of the materials and examples are shown in Table 2.

Table 2: Organization of the word list, with an example for each word type.

| Stressed syllable | Trochees (N=24) | Iambs (N=24) |
|---|---|---|
| CV (N=16) | ‘Puma, engl. puma | Ko‘pie, engl. copy |
| CVC (N=16) | ‘Muskel, engl. muscle | Kos‘tüm, engl. costume |
| CCVC / CVCC (N=16) | ‘Plastik, engl. plastic | Kom‘post, engl. compost |

Test words were also controlled for syllable frequency. Syllable frequency counts were taken from a German syllable frequency database (Aichert et al., 2005) based on the CELEX corpus (Baayen et al., 1995). T-tests failed to reveal significant differences in syllable frequency between the stressed syllables of the trochees and the iambs (p > .05) and the unstressed syllables of the two item groups (p > .05). A position-wise analysis revealed that frequencies of the unstressed syllables in both positions were significantly higher than those of the stressed syllables in the corresponding position of the respective metrical cognates (position 1: t = 2.72; p < .01; position 2: t = 4.63; p < .001). The study was conducted in a single 60-minute session during which patients were administered the word list as well as the subtests from the LeMo battery (De Bleser et al., 2004, see above). The list was administered twice in two separate runs of an auditory

repetition task, resulting in a total of 96 items per patient and 1344 items overall. Items were presented in a pseudo-randomized order. 2.3. Data Analysis Patients’ responses were recorded on video- and audiotape (video camera: Panasonic NVGS180, external microphone: beyerdynamic TG-X-58). For auditory evaluation, each word was transcribed phonetically by the first author. Analyses were conducted using both the auditory and the visual (mouth movement) information. Symptoms which were not captured by the core IPA symbols (e.g., phonetic distortions or prosodic deviations such as phoneme lengthening, intra- and intersyllabic pauses) were marked with diacritics. Additionally, acoustic measures of word duration were obtained using wideband spectrograms of the recorded wav-files (Praat, 5.1.05; Boersma & Weenink, 2009). 2.3.1. Response accuracy The patients’ responses were first analysed with respect to correct / incorrect productions. Whole words were considered as error units, i.e., multiple errors on a word were counted as a single word error. In case of multiple attempts, the first response was transcribed. Errors were classified as segmental and prosodic, respectively. Segmental errors included phonetic distortions and perceived phonemic errors (i.e., substitutions, additions or omissions). Prosodic errors comprised phoneme lengthenings, schwa-insertions and intersyllabic or intersegmental pauses. Reactions that lacked any phonological relationship to the target word (i.e., fragmentary responses) were marked as unclassified and were excluded from the statistical analysis. Furthermore, we also documented the occurrence of visible or audible groping behaviour for each response. In a second step, segmental errors were analysed on a syllable-by-syllable basis, i.e., errors on the first and second syllable of each word were counted separately. 2.3.2.
Word duration Word durations were measured by determining the onset of the initial and the offset of the final phoneme in the acoustic signal. Whereas phonetically distorted reactions were included in this analysis, items whose phonemic content was altered (i.e., by single or multiple phonemic errors or by fragmentary responses) were not considered. Word duration measurements were also performed for the six control speakers. 2.3.3. Statistics All statistics were performed using R (R Core Team, 2015). Generalized Linear Mixed Models were calculated using lme4 (Bates et al., 2015).
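The word-based scoring rules of Section 2.3.1 (whole words as error units, unclassified responses excluded) can be illustrated with a short Python sketch. It is not the authors' analysis code, which was written in R; the `Response` record, its field names, and the function `word_error_rates` are hypothetical, and the sketch assumes that a single word may count towards both the segmental and the prosodic error rate.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """One transcribed first response (hypothetical record layout)."""
    meter: str                 # "trochee" or "iamb"
    segmental_errors: int      # distortions plus phonemic errors on the word
    prosodic_errors: int       # lengthenings, schwa insertions, pauses
    fragmentary: bool = False  # no phonological relation to the target


def word_error_rates(responses):
    """Return {meter: (segmental_rate, prosodic_rate)}.

    Each word counts at most once per error type, however many errors
    it carries; fragmentary (unclassified) responses are left out.
    """
    totals = {}
    for r in responses:
        if r.fragmentary:
            continue
        n, seg, pros = totals.get(r.meter, (0, 0, 0))
        totals[r.meter] = (
            n + 1,
            seg + (r.segmental_errors > 0),
            pros + (r.prosodic_errors > 0),
        )
    return {m: (seg / n, pros / n) for m, (n, seg, pros) in totals.items()}
```

A word transcribed with three segmental errors therefore contributes exactly one word error to the segmental rate, as described above.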

2.4. Reliability 2.4.1. Reliability of diagnosis To validate the diagnosis of AOS, a second experienced speech therapist who had no prior information about the patients and who was not aware of the research question was presented with stretches of spontaneous speech and extracts from a word repetition task of all 14 patients. Additionally, four patients with aphasic-phonological symptoms were included in the classification rating. In a first step, the second rater was asked to classify each patient as either speech apraxic or aphasic-phonological. In a second step she was asked to evaluate how certain she was in her classification (certain, slightly uncertain, very uncertain). Among the 14 patients included in this study, the independent rater was in accordance with the original diagnosis of AOS in 13 cases (certain: 10, slightly uncertain: 3). In one patient who had pure AOS with only mild symptoms the rater went for a classification as aphasic-phonological, qualifying her decision as “very uncertain”. In all four patients with aphasic-phonological disorder her classification was in accordance with the prior diagnosis (certain: 2, slightly uncertain: 2).

2.4.2. Reliability of accuracy scores In order to determine the reliability of the accuracy scores, the first 22 words of both runs (N = 44 words) from four randomly selected patients were analysed by the same speech therapist who also did the classification rating. There was substantial agreement (Kappa statistics, Landis & Koch, 1977) between the two transcribers for the segmental (κ = .785, p < .001) and the suprasegmental whole-word errors (κ = .646, p < .001). Furthermore, there was almost perfect agreement for the evaluation of the groping behaviour (κ = .950, p < .001). Regarding the syllable-wise analysis of the segmental errors we also found a significant agreement between the ratings for the first syllable (κ = .768, p < .001 / substantial agreement) and the second syllable (κ = .864, p < .001 / almost perfect agreement). Regarding acoustic measurements, 28 words per patient which were suitable for word duration analysis (i.e., phonemically correct productions) from the four patients were re-analysed (N = 112 items). A high correlation between the values measured by the two examiners was found (Pearson correlation, r = .998, p < .001). 3. Results 3.1. Response accuracy: Word-based errors Since the healthy controls made no errors in the repetition task they were not included in the error analyses. In the patient group, 25 out of 1344 responses (9 trochees, 16 iambs; 1.9%) were excluded as unclassified. The unclassified reactions were exclusively due to fragmentary responses following articulatory groping. There were no null reactions and no semantic errors in the repetition task. Figure 1 depicts mean error rates by stress type (trochee, iamb) and syllable structure (CV, CVC, complex). Word-based segmental and prosodic errors and groping behaviours are plotted in the left, middle, and right panels, respectively. Overall, the number of segmental errors significantly exceeded the number of prosodic errors and groping behaviours, respectively.
This pattern also held for each individual subject.
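For readers who want to retrace the agreement statistics of Section 2.4.2, the following Python sketch computes a two-rater kappa and attaches the verbal labels of Landis & Koch (1977). It assumes that the reported values are Cohen's kappa for two raters, which the text does not state explicitly; the function names are hypothetical and the sketch is not the authors' code.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters judging the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from the marginal category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


def landis_koch_label(kappa):
    """Verbal label for a kappa value after Landis & Koch (1977)."""
    if kappa < 0.0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"
```

With these bands, the reported values of .785, .646 and .768 fall into the "substantial" range and .950 and .864 into the "almost perfect" range, in line with the labels used in Section 2.4.2.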

Figure 1. Mean error rates (word-based) as a function of stress and syllable structure. Left: segmental errors; middle: prosodic errors; right: groping behaviour. The stressed syllables had a CV-, CVC-, or a complex structure with a consonant cluster (CC). As a general tendency, higher error rates occurred on the iambic as compared to the trochaic items. Average segmental error rates across the three syllable types were .33 in the trochees vs. .55 in the iambs. Prosodic error rates were .10 vs. .14, groping rates .22 vs. .24. There was also an expected increase of error rates with increasing syllable complexity. Generalized linear mixed models (logit-link) were computed for each error type separately, with the fixed effects factors METER and (syllable) STRUCTURE and the random effects factors PARTICIPANT, ITEM and RUN. In order to account for the possibility that participants and items might differ in their sensitivity to METER, random slopes were also specified for by-PARTICIPANT and by-ITEM effects of METER. Goodness-of-fit was estimated using the Hosmer and Lemeshow test (HL test) for logistic models (Hosmer & Lemeshow, 1980). Table 3. Generalized linear mixed logit models of the influence of metrical pattern (trochaic vs. iambic) on segmental and prosodic word level errors and on the occurrence of groping. The structure of the stressed syllable (CV, CVC, complex) was modelled as a further fixed factor. Participants, items and runs (first vs. second) were included as random effects factors.

HL-test: Hosmer & Lemeshow (p-values > .05 indicate that the model should not be rejected). Significance levels: *** p<.001; ** p<.01; * p<.05; n.s. not significant.

| | Segmental errors | Prosodic errors¹ | Groping |
|---|---|---|---|
| HL-test | χ²(8) = 9.8; p = .28 | χ²(8) = 10.5; p = .23 | χ²(8) = 15.2; p = .06 |
| METER | Wald z = 4.39; df = 1; *** | Wald z = 2.47; df = 1; * | Wald z = 1.70; df = 1; n.s. |
| STRUCTURE | Wald z = 4.12; df = 2; *** | Wald z = 5.21; df = 2; *** | Wald z = 3.31; df = 2; *** |
| METER x STRUCTURE | Wald z = -2.18; df = 2; * | Wald z = -2.42; df = 2; * | Wald z = -2.00; df = 2; * |

¹ Since 8 patients made no prosodic errors in one or several conditions, the model was calculated for only the six remaining participants.

The results are shown in Table 3. For each model, significant deviation from the empirical data could be rejected by the HL test. Goodness-of-fit was poorest for the groping data. There were significant main effects of STRUCTURE across all error types, and a significant influence of METER for the segmental and the prosodic errors. There was also a significant interaction of METER x STRUCTURE in all three error variables. Plots of the random slopes of the by-PARTICIPANT effects of METER revealed that the advantage of the trochaic pattern regarding segmental and prosodic errors was homogeneous across the apraxic participants, since coefficients were all positive and varied within a very small range (segmental: 1.2 - 1.8; prosodic: 1.4 - 2.3). Closer inspection of Figure 1 reveals that the advantage of trochees was much smaller for the words containing consonant clusters than for the CV- and the CVC-words. A straightforward explanation of this result is that in the trochees the clusters occurred in the word onset (e.g., ‘plastic), whereas in the complex iambs the cluster was in the onset or the coda of the second syllable (e.g., so’pran, com’post), an imbalance which was due to the fact that iambs with complex word onsets are rare in German. Hence, the results plotted in Figure 1 suggest that the benefit of the trochaic pattern was partly outweighed by onset complexity, especially in the case of the prosodic and the groping errors. To compensate for this obvious influence of onset position, a further series of GLM logit models was conducted in which only words with CV- and CVC-syllables were included. Table 4 shows the results of these analyses, demonstrating that METER again had a significant effect on segmental and prosodic errors.
In the absence of the consonant cluster words, an influence of syllable structure was only seen on the prosodic errors, and the interaction of METER with STRUCTURE disappeared. Again, the random slope coefficients of the by-PARTICIPANT effects of METER were uniformly positive and ranged within a small interval in the two analyses. Although the influence of the word stress pattern turned out to be homogeneous across the apraxic participants in almost all analyses, whole-word errors were also analysed for each single participant. In these analyses, words with complex clusters were excluded and the CV- and CVC-conditions were collapsed. Pearson’s chi-squared tests were performed, with an alpha-level of .05. Regarding segmental errors, 9/14 patients were significantly less accurate on iambic than on trochaic words; among them were two of the three patients with pure AOS. Three of them also produced significantly more prosodic errors on the iambic as compared to the trochaic words. Additionally, one further patient (the third patient with pure AOS) also showed a facilitating trochee-effect in his groping behaviour. Reverse effects were not observed in any of the 14 participants on any of the three error variables.
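The single-participant comparison described above amounts to a Pearson chi-squared test on a 2 x 2 table (meter by correct/incorrect). The Python sketch below shows one way to compute it; it is not the authors' code, the function name is hypothetical, the counts in the usage note are invented for illustration, and omitting a continuity correction is an assumption, since the text does not say whether one was applied.

```python
import math


def pearson_chi2_2x2(troch_err, troch_ok, iamb_err, iamb_ok):
    """Pearson chi-squared test (df = 1) for a 2 x 2 table of counts.

    Rows are trochees and iambs, columns are erroneous and correct
    responses. No continuity correction is applied (an assumption of
    this sketch). Returns (chi2, p_value).
    """
    n = troch_err + troch_ok + iamb_err + iamb_ok
    row1, row2 = troch_err + troch_ok, iamb_err + iamb_ok
    col1, col2 = troch_err + iamb_err, troch_ok + iamb_ok
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one count")
    chi2 = (n * (troch_err * iamb_ok - troch_ok * iamb_err) ** 2
            / (row1 * row2 * col1 * col2))
    # With one degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

With the CV and CVC conditions collapsed, each participant contributes up to 32 trochaic and 32 iambic responses (16 words per meter in two runs). For an invented participant with 8 errors on trochees and 20 on iambs, `pearson_chi2_2x2(8, 24, 20, 12)` gives a chi-squared value of about 9.14, which lies below the alpha level of .05 used here.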


Table 4. Generalized linear logit models as in Table 3 above, with the difference that syllables with consonant clusters were excluded, i.e., the factor STRUCTURE includes only two levels (CV, CVC). Significance levels: *** p<.001; ** p<.01; * p<.05; n.s. not significant.

| | Segmental errors | Prosodic errors¹ | Groping |
|---|---|---|---|
| HL-test | χ²(8) = 13.6; p = .09 | χ²(8) = 8.0; p = .43 | χ²(8) = 24.8; p < .01 |
| METER | Wald z = 4.72; df = 1; *** | Wald z = 2.85; df = 1; ** | Wald z = 1.37; df = 1; n.s. |
| STRUCTURE | Wald z = 1.85; df = 1; n.s. | Wald z = 2.83; df = 1; ** | Wald z = 1.46; df = 1; n.s. |
| METER x STRUCTURE | Wald z = -.77; df = 1; n.s. | Wald z = -1.84; df = 1; n.s. | Wald z = -.88; df = 1; n.s. |

¹ Since 8 patients made no prosodic errors in one or several conditions, the model was calculated for only the six remaining participants.

3.2. Response accuracy: Syllable-based errors In order to investigate whether the metrical effects reported above were ascribable to syllable position or syllable stress, syllable-wise error counts were analyzed. Figure 2 depicts mean error rates as a function of syllable position (first vs. second) and stress (stressed vs. unstressed).

Figure 2: Segmental errors per syllable (in %), broken down by syllable position (first vs. second) and syllable stress (stressed: filled squares; unstressed: open circles). Labels indicate how the four syllable types combine in the two metrical patterns. On average there was an overall tendency towards more errors on the first relative to the second syllable (28% vs. 20%) and also a slightly greater vulnerability of stressed vs. unstressed syllables (26% vs. 23%). Interestingly, however, the stressed first syllable of the trochees was produced more accurately than the unstressed first syllable of the iambs, demonstrating that the metrical effect outweighed the syllable-stress effect by far. Moreover, the final stressed syllables of the iambic words were more error-prone than the initial stressed syllables of the trochees, indicating that the overall benefit of the final vs. the initial syllable was reversed in the stressed condition, i.e., the metrical effect outplayed the disadvantage of the initial position.

A generalized linear mixed logit model with two fixed-effects factors (STRESS, POSITION) and the random-effects factors PARTICIPANT and ITEM described the data adequately (HL-test: χ²(8) = 10.56, p = .23). Both the POSITION effect (Wald z(1) = -6.2, p<.001) and the STRESS effect (Wald z(1) = -3.1, p<.01) were significant, and there was also a significant STRESS x POSITION interaction (Wald z(1) = 6.7, p<.001).

3.3. Word duration

In the control group, all word productions were phonemically correct and could therefore be included in the acoustic analyses. In the patient group, only 65% of all items (n = 875) could be analysed for word duration. There were more trochaic (n = 505) than iambic words (n = 370) remaining in the analysis, which reflects the higher vulnerability of iambic words. Yet, the distribution of analysable items across the three syllable structure types was similar in the trochaic and the iambic words, with at least 110 items per syllable structure remaining in each word stress category. Figure 3 depicts average word duration as a function of metrical pattern and syllable complexity for the healthy participants (left panel) and the AOS patients (right panel). As can be seen in Figure 3, iambic words tended to have longer durations than trochaic words (mean difference across groups and syllable structures: 99 ms). Words with complex syllables were produced with the longest durations, and this effect was more pronounced in the trochaic than in the iambic words.

Figure 3. Word duration as a function of metrical and syllabic structure. Left: healthy controls; right: AOS patients.

A linear mixed model was calculated with the fixed-effects factors METER (trochees vs. iambs), STRUCTURE (CV vs. CVC) and GROUP (controls vs. AOS patients) and the random-effects factors PARTICIPANT, ITEM, and RUN. Due to the position asymmetry of consonant clusters in trochees vs. iambs (see above), the analysis was confined to the CV and CVC stimuli. Significant main effects of METER (F(1, 35.1) = 22.5, p < .001) and of STRUCTURE (F(1, 35.1) = 4.3, p < .05) were obtained. There was also a significant main effect of GROUP (F(1, 14.9) = 5.5, p < .05), i.e., patients with AOS had significantly longer word durations than the controls, with an average difference of 136 ms. Most importantly, a significant GROUP x METER interaction was found (F(1, 690.0) = 13.0, p < .001), indicating that the disadvantage of the iambic pattern was stronger in the patients than in the healthy participants.

Taken together, these acoustic data provide additional and objective evidence for a metrical influence specifically on the prosodic aspects of apraxic speech. The fact that the metrical effect persisted after exclusion of the more vulnerable items (i.e., items with phonemic errors) argues for the robustness of the effect.
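For readers who prefer a formula, the linear mixed model for word duration described above can be sketched as follows. This is only an illustration under assumptions that the text does not state: a full factorial fixed-effects structure, random intercepts only for participant p, item i, and run r, and Gaussian residuals. All symbols (y, M, S, G, β, u, v, w, ε) are introduced here for illustration, with M, S, and G denoting the coded factors METER, STRUCTURE, and GROUP.

```latex
\begin{aligned}
y_{pir} ={}& \beta_0 + \beta_1 M_i + \beta_2 S_i + \beta_3 G_p \\
           & + \beta_4 (G_p M_i) + \beta_5 (G_p S_i) + \beta_6 (M_i S_i) + \beta_7 (G_p M_i S_i) \\
           & + u_p + v_i + w_r + \varepsilon_{pir}, \\
u_p \sim{}& \mathcal{N}(0, \sigma_u^2), \quad
v_i \sim \mathcal{N}(0, \sigma_v^2), \quad
w_r \sim \mathcal{N}(0, \sigma_w^2), \quad
\varepsilon_{pir} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

In this reading, the coefficient β4 corresponds to the GROUP x METER interaction, i.e., to the extent to which the lengthening of iambic relative to trochaic words differs between patients and controls.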

4. Discussion

This is the first systematic investigation of how word stress affects error production in apraxia of speech. A group of 14 patients with left hemisphere strokes whose single word production was predominantly influenced by their apraxic speech impairment was examined by a word repetition task involving disyllabic words with a trochaic (i.e., stressed – unstressed) and an iambic (i.e., unstressed – stressed) metrical pattern. The words were systematically varied for the structure of stressed syllables and were matched for lexical frequency, syllable frequency, and phoneme number. As a general result, trochees (which constitute the predominant metrical pattern in German) were less error-prone than iambs. The benefit of the regular (= trochaic) pattern was particularly pronounced with regard to rates of segmental errors, but was also found in prosodic error counts (prolonged sounds and sound transitions, within-word pauses). Furthermore, an impact of metrical patterning was also observed on word durations. Durations turned out to be longer for iambic than for trochaic words in both the healthy control speakers and the apraxic patients, but the effect was significantly stronger in the patients. As regards groping behaviour, there was no significant influence of word stress at the group level (though individual patients did show a metrical effect). Instead, groping for articulations appeared to be more related to word onset complexity.

Syllable-wise error counts revealed that the impact of a word’s metrical pattern on segmental accuracy was not ascribable to syllable-related factors like position, stress, or frequency. It is widely accepted that apraxic speakers are susceptible to failure at word onsets and less vulnerable at the end of a word (e.g., Aichert and Ziegler, 2013a; Canter et al., 1985; Croot, 2002). However, in this study the first syllable of the trochees was produced more accurately than both syllables of the iambs. Therefore, the metrical effect proved to be stronger than the position effect. To give an example: the stressed syllable [ju:] in the trochee ‘judo (engl. judo) was produced with a considerably lower error rate (45%) than the unstressed syllable [ju:] in the iamb ju’wel (engl. jewel, error rate: 64%), despite the fact that the two syllables have the same segments. Furthermore, we found that stressed syllables were not per se more error-prone than unstressed syllables, as would have been predicted by Odell et al. (1991). On the contrary, the first, unstressed syllables of the iambic words were even produced with the highest error rates. This finding deserves particular mention because the first syllables of the iambic words had higher syllable frequencies than the first syllables of the trochaic words. Obviously, the influence of the regular metrical pattern even outweighed the syllable frequency effect that has been described in several earlier studies (Aichert and Ziegler, 2004a; Laganaro, 2008; Staiger and Ziegler, 2008).

Notably, however, the trochee effect did not swamp all segmental influences, especially not the consonant cluster effect observed earlier (e.g., Aichert and Ziegler, 2004a). In the experiment described here, the presence of consonant clusters in the stressed syllables attenuated the trochaic advantage, presumably because the clusters occurred in the word onsets of the trochees, but word-medially or -finally in the iambs. Onset clusters were particularly vulnerable to errors, and especially for the prosodic and groping errors they nullified or even reversed the metrical effect. Overall, the results presented here show that segmental and prosodic factors interact closely in apraxia of speech and should therefore not be considered independent in their influence on speech motor planning.
Hence, our study provides new evidence for a nonlinear architecture of speech motor representations, in which the articulatory primitives of speech motor planning, i.e., phonetic gestures for the articulation of vowels and consonants, are dominated by the rhythmical patterning of words (cf. Ziegler, 2005, 2009; Ziegler and Aichert, 2015). This perspective is at variance with theories suggesting that the phonetic encoding mechanism operates on linear strings of phonemes or syllables. In the influential model proposed by Levelt et al. (1999), for instance, metrical properties of utterances are accessed independently from the segmental content during word form (phonological) encoding and play no role at the subsequent phonetic encoding stage. There is also some neurolinguistic evidence supporting such a dualist view of segmental vs. prosodic encoding, as for instance data from an earlier study in which we reported on two aphasic patients with a mutually dissociating pattern of segmental and prosodic errors (Aichert & Ziegler, 2004b), or other reports of patients whose impairment was purely prosodic since they mis-stressed words with an irregular metrical pattern (Cappa et al., 1997; Laganaro et al., 2002). A weak interaction of segmental with prosodic information was demonstrated in aphasic patients who showed a tendency to omit unstressed syllables and to make more segmental errors on unstressed as compared to stressed syllables (Nickels & Howard, 1999). In plain contrast, the AOS patients of the present study exhibited no bias for unstressed syllables to be universally more vulnerable, but rather showed that segmental accuracy is a function of the overall metrical organization of a word. Furthermore, there was no case where a word was mis-stressed, i.e., an iambic target word realized with a trochaic stress pattern or vice versa. This contrasts with reports of inappropriate word stress patterns in related aetiologies, such as childhood apraxia of speech (e.g., Shriberg et al., 2003) and foreign accent syndrome (e.g., Miller et al., 2006). Our result may be explained by assuming that the assignment of stress is already completed when the phonological word form is fed into the phonetic encoding process.
The facilitating effect of the trochaic (i.e., the predominant, regular, unmarked) stress pattern in our patients with AOS is consistent with studies of normal speakers which also pointed to a close link between segmental and prosodic aspects and an advantage of the trochee in speech motor planning (e.g., Croot et al., 2010; Sulpizio et al., 2015). We assume that, at least for German, the strong-weak stress pattern constitutes a particularly stable architecture, in which the lower-level gestural and syllabic motor routines for words are organized more coherently than in words with a weak-strong pattern. From a clinical perspective, the results of this study suggest that the facilitating effect of the regular word stress might also be helpful in the treatment of apraxic speakers. Choosing words with a regular stress pattern may enhance speech motor learning in patients with AOS. In a recent learning experiment in which we investigated the effectiveness of syllable-based learning in AOS (Aichert and Ziegler, 2013b) we found that the learning of single syllables led to improvements on (untrained) disyllabic trochaic words. Obviously, the transfer from single syllables to trochees required only little extra motor planning for these patients. In a treatment report of a German patient with chronic, pure AOS (Aichert and Ziegler, 2010) we used metrically structured phrases composed of trochaic words (e.g., ‘eine ‘laute ‘Pauke, engl. a loud kettledrum) and poems with an underlying regular meter (e.g., Goethe's "The Sorcerer's Apprentice"). Anecdotally, after the first exercises with these materials the patient was struck by experiencing an immediate increase in fluency. After 20 treatment sessions we observed a clearly increased speaking rate. Taken together, these data show that the regular metrical pattern in German, the trochaic form, facilitates articulatory accuracy and articulatory fluency in patients with AOS.
We assume that trochees are particularly stable phonetic patterns in German. Furthermore, the results suggest that segmental and prosodic aspects of speech motor planning are intertwined, which fits well with earlier suggestions of a hierarchically nested organization of phonetic plans (Ziegler and Aichert, 2015). Finally, the results presented here may also contribute to discussions about the units to be chosen as treatment targets in AOS therapy.

Acknowledgments

The first author of this study was supported by a grant from the German Research Council (Deutsche Forschungsgemeinschaft [DFG], Grant Zi 469/14–1). We are grateful to the patients and to the healthy control speakers for their participation. Furthermore, we thank Bernadette Vögele for contributing to the reliability analyses.

References

Aichert, I., & Ziegler, W. (2004a). Syllable frequency and syllable structure in apraxia of speech. Brain and Language, 88, 148-159.
Aichert, I., & Ziegler, W. (2004b). Segmental and metrical encoding in aphasia: Two case reports. Aphasiology, 18, 1201-1211.
Aichert, I., & Ziegler, W. (2010). Therapie bei chronischer Sprechapraxie: Vorgehensweise am Beispiel eines Patienten mit reiner Sprechapraxie. Forum Logopädie, 3, 6-13.
Aichert, I., & Ziegler, W. (2013a). Word position effects in apraxia of speech: Group data and individual variation. Journal of Medical Speech-Language Pathology, 20, 7-11.
Aichert, I., & Ziegler, W. (2013b). Segments and syllables in the treatment of apraxia of speech: An investigation of learning and transfer effects. Aphasiology, 27, 1180-1199.
Aichert, I., Marquardt, C., & Ziegler, W. (2005). Frequenzen sublexikalischer Einheiten des Deutschen: CELEX-basierte Datenbanken. Neurolinguistik, 19, 55-81.
Baayen, R. H., Piepenbrock, R., & Gulikers, L. (1995). The CELEX lexical database (CD-ROM). Philadelphia, PA: Linguistic Data Consortium, University of Pennsylvania.
Ballard, K.J., Savage, S., Leyton, C.E., Vogel, A.P., Hornberger, M., & Hodges, J.H. (2014). Logopenic and nonfluent variants of primary progressive aphasia are differentiated by acoustic measures of speech production. PloS One, 9, 1-14.
Ballard, K.J., Wambaugh, J.L., Duffy, J.R., Layfield, C., Maas, E., Mauszycki, S., & McNeil, M.R. (2015). Treatment for acquired apraxia of speech: A systematic review of intervention research between 2004 and 2012. American Journal of Speech-Language Pathology, 24, 316-337.
Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). lme4: Linear mixed-effects models using Eigen and S4. R package version 1.1-8, URL: http://CRAN.R-project.org/package=lme4.
Boersma, P., & Weenink, D. (2009). Praat: Doing phonetics by computer (Version 5.1.05) [Computer software]. Retrieved from www.praat.org.
Boutsen, F. R., & Christman, S. S. (2002). Prosody in apraxia of speech. Seminars in Speech and Language, 23, 245-255.
Brendel, B., & Ziegler, W. (2008). Effectiveness of metrical pacing in the treatment of apraxia of speech. Aphasiology, 22, 77-102.
Canter, G. J., Trost, J. E., & Burns, M. S. (1985). Contrasting speech patterns in apraxia of speech and phonemic paraphasia. Brain and Language, 24, 204-222.
Cappa, S. F., Nespor, M., Ielasi, W., & Miozzo, A. (1997). The representation of stress: evidence from an aphasic patient. Cognition, 65, 1-13.
Cholin, J., Schiller, N. O., & Levelt, W. J. M. (2004). The preparation of syllables in speech production. Journal of Memory and Language, 50, 47-61.
Cholin, J., Levelt, W. J. M., & Schiller, N. O. (2006). Effects of syllable frequency in speech production. Cognition, 205-235.
Cholin, J. (2008). The mental syllabary in speech production: An integration of different approaches and domains. Aphasiology, 22, 1127-1141.
Code, C. (1998). Major review: models, theories and heuristics in apraxia of speech. Clinical Linguistics and Phonetics, 12, 47-65.
Croot, K. (2002). Diagnosis of AOS: definition and criteria. Seminars in Speech and Language, 23, 267-280.
Croot, K., Au, C., & Harper, A. (2010). Prosodic structure and tongue twister errors. In C. Fougeron, B. Kuehnert, M. d'Imperio, & N. Vallée (Eds.), Papers in laboratory phonology 10: Variation, phonetic detail and phonological representation (pp. 433-459). Berlin: De Gruyter Mouton.
De Bleser, R., Cholewa, J., Stadie, N., & Tabatabaie, S. (2004). LeMo - Lexikon modellorientiert. Einzelfalldiagnostik bei Aphasie, Dyslexie und Dysgraphie. München: Elsevier.
De Jong, K. J. (1995). The supraglottal articulation of prominence in English: Linguistic stress as localized hyperarticulation. The Journal of the Acoustical Society of America, 97, 491-504.
Domahs, U., Wiese, R., Bornkessel-Schlesewsky, I., & Schlesewsky, M. (2008). The processing of German word stress: evidence for the prosodic hierarchy. Phonology, 25, 1-36.
Dworkin, J. P., Abkarian, G. G., & Johns, D. F. (1988). Apraxia of speech: the effectiveness of a treatment regimen. Journal of Speech and Hearing Disorders, 53, 280-294.
Fougeron, C., & Keating, P. A. (1997). Articulatory strengthening at edges of prosodic domains. Journal of the Acoustical Society of America, 101, 3728-3740.
Fougeron, C. (2001). Articulatory properties of initial segments in several prosodic domains. Journal of Phonetics, 29, 109-135.
Fujimura, O. (1990). Methods and goals of speech production research. Language and Speech, 33, 195-258.
Galluzzi, C., Bureca, I., Guariglia, C., & Romani, C. (2015). Phonological simplifications, apraxia of speech and the interaction between phonological and phonetic processing. Neuropsychologia, 71, 64-83.
Helm-Estabrooks, N., Nicholas, M., & Morgan, A. (1989). Melodic intonation therapy. Manual. Austin, TX: Pro-Ed.
Hosmer, D.W., & Lemeshow, S. (1980). A goodness-of-fit test for the multiple logistic regression model. Communications in Statistics, A10, 1043-1069.
Huber, W., Poeck, K., Weniger, D., & Willmes, K. (1983). Aachener Aphasie Test (AAT). Göttingen: Hogrefe.
Itoh, M., & Sasanuma, S. (1984). Articulatory movements in apraxia of speech. In J.C. Rosenbek, M. R. McNeil, & A. E. Aronson (Eds.), Apraxia of Speech: Physiology, Acoustics, Linguistics, Management (pp. 135-165). San Diego: College-Hill Press.
Kent, R. D., & Rosenbek, J. C. (1982). Prosodic disturbance and neurologic lesion. Brain and Language, 15, 259-291.
Kent, R. D., & Rosenbek, J. C. (1983). Acoustic patterns of apraxia of speech. Journal of Speech and Hearing Research, 26, 231-249.
Laganaro, M., Vacheresse, F., & Frauenfelder, U. H. (2002). Selective impairment of lexical stress assignment in an Italian-speaking aphasic patient. Brain and Language, 81, 601-609.
Laganaro, M. (2008). Is there a syllable frequency effect in aphasia or in apraxia of speech or both? Aphasiology, 22, 1191-1200.
Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159-174.
Lebrun, Y. (1990). Apraxia of speech: a critical review. Journal of Neurolinguistics, 5, 379-406.
Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22, 1-38.
Marquardt, T. P., Duffy, G., & Cannito, M. P. (1995). Acoustic analysis of accurate word stress patterning in patients with apraxia of speech and Broca's aphasia. American Journal of Speech Language Pathology, 4, 180-185.
McNeil, M.R., Robin, D.A., & Schmidt, R.A. (2009). Apraxia of speech: definition and differential diagnosis. In M.R. McNeil (Ed.), Clinical management of sensorimotor speech disorders (2nd Ed.) (pp. 249-268). New York, NY: Thieme.

Miller, N., Lowit, A., & O'Sullivan, H. (2006). What makes acquired foreign accent syndrome foreign? Journal of Neurolinguistics, 19, 385-409.
Nickels, L., & Howard, D. (1999). Effects of lexical stress on aphasic word production. Clinical Linguistics and Phonetics, 13, 269-294.
Odell, K., McNeil, M., Rosenbek, J. C., & Hunter, L. (1990). Perceptual characteristics of consonant production by apraxic speakers. Journal of Speech and Hearing Disorders, 55, 345-359.
Odell, K., McNeil, M. R., Rosenbek, J. C., & Hunter, L. (1991). Perceptual characteristics of vowel and prosody production in apraxic, aphasic, and dysarthric speakers. Journal of Speech and Hearing Research, 34, 67-80.
R Core Team (2015). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
Roelofs, A., & Meyer, A. S. (1998). Metrical structure in planning the production of spoken words. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 922-939.
Romani, C., & Galluzzi, C. (2005). Effects of syllabic complexity in predicting accuracy of repetition and direction of errors in patients with articulatory and phonological difficulties. Cognitive Neuropsychology, 22, 817-850.
Rubow, R. T., Rosenbek, J. C., Collins, M., & Longstreth, D. (1982). Vibrotactile stimulation for intersystemic reorganization in the treatment of apraxia of speech. Archives of Physical Medicine and Rehabilitation, 63, 150-153.
Shane, H. C., & Darley, F. L. (1978). The effect of auditory rhythmic stimulation on articulatory accuracy in apraxia of speech. Cortex, 14, 444-450.
Shriberg, L.D., Campbell, T.F., Karlsson, H.B., Brown, R.L., McSweeny, J.L., & Nadler, C.J. (2003). A diagnostic marker for childhood apraxia of speech: The lexical stress ratio. Clinical Linguistics and Phonetics, 17, 549-574.
Sparks, R., & Deck, J. (1994). Melodic intonation therapy. In R. Chapey (Ed.), Language intervention strategies in adult aphasia (pp. 368-379). Baltimore, MD: Williams & Wilkins.
Stahl, B., Kotz, S. A., Henseler, I., Turner, R., & Geyer, S. (2011). Rhythm in disguise: why singing may not hold the key to recovery from aphasia. Brain, 134, 3083-3093.
Staiger, A., & Ziegler, W. (2008). Syllable frequency and syllable structure in the spontaneous speech production of patients with apraxia of speech. Aphasiology, 22, 1201-1215.
Sulpizio, S., Spinelli, G., & Burani, C. (2015). Stress affects articulatory planning in reading aloud. Journal of Experimental Psychology: Human Perception and Performance, 41, 453-461.
Wambaugh, J. L., & Martinez, A. L. (2000). Effects of rate and rhythm control treatment on consonant production accuracy in apraxia of speech. Aphasiology, 14, 851-871.
Wambaugh, J. L., Duffy, J. R., McNeil, M. R., Robin, D. A., & Rogers, M. A. (2006a). Treatment guidelines for acquired apraxia of speech: A synthesis and evaluation of the evidence. Journal of Medical Speech Language Pathology, 14, xv-xxxiii.
Wambaugh, J. L., Duffy, J. R., McNeil, M. R., Robin, D. A., & Rogers, M. A. (2006b). Treatment guidelines for acquired apraxia of speech: Treatment descriptions and recommendations. Journal of Medical Speech Language Pathology, 14, xxxv-lxvii.
Wiese, R. (2000). The Phonology of German. Oxford: Oxford University Press.
Wightman, C. W., Shattuck-Hufnagel, S., Ostendorf, M., & Price, P. J. (1992). Segmental durations in the vicinity of prosodic phrase boundaries. Journal of the Acoustical Society of America, 91, 1707-1717.
Ziegler, W. (2005). A nonlinear model of word length effects in apraxia of speech. Cognitive Neuropsychology, 22, 603-623.
Ziegler, W. (2008). Apraxia of speech. In G. Goldenberg, & B. Miller (Eds.), Handbook of clinical neurology (Vol. 88, pp. 269-285). London: Elsevier.

Ziegler, W. (2009). Modelling the architecture of phonetic plans: evidence from apraxia of speech. Language and Cognitive Processes, 24, 631-661.
Ziegler, W., Aichert, I., & Staiger, A. (2012). Apraxia of speech: concepts and controversies. Journal of Speech, Language, and Hearing Research, 55, 1485-1501.
Ziegler, W., & Aichert, I. (2015). How much is a word? Predicting ease of articulation planning from apraxic speech error patterns. Cortex, 69, 24-39.

Highlights
- patients with AOS are sensitive to metrical information
- metrical feet function as prosodic frames for articulation planning
- the regular metrical pattern in German, the trochaic form, facilitates apraxic speech
