Research in Autism Spectrum Disorders 17 (2015) 116–125
Nasal voice in boys with high-functioning autism spectrum disorder
Audrey M. Smerbeck *
Rochester Institute of Technology, 2388 Eastman Hall, 18 Lomb Memorial Drive, Rochester, NY 14623, United States
ARTICLE INFO
Article history: Received 8 January 2015; received in revised form 10 June 2015; accepted 23 June 2015; available online 7 July 2015
Keywords: Autism spectrum disorder; HFASD; Asperger’s disorder; Nasality; Voice; Resonance

ABSTRACT
This study compared speech samples of 29 boys aged 6–13 with high-functioning autism spectrum disorder (HFASD) to those of 29 typically developing (TD) boys matched on age and ethnicity. Ten listeners blind to speakers’ diagnoses rated speech samples for nasality and reported their perceptions of the speaker on a 6-point Likert-type scale. Results indicated significantly greater listener-perceived nasality in the HFASD than the TD group. Listeners rated the HFASD group significantly higher than the TD group on negative socially relevant adjectives, a finding which was mediated by nasality. In addition, compared to TD speakers, speakers with HFASD were rated lower on dominance and perceived age, as well as higher on perceived disability.
© 2015 Elsevier Ltd. All rights reserved.
1. Introduction

Autism spectrum disorder (ASD) is primarily characterized by two groups of symptoms: social-communicative impairment and a pattern of restricted, repetitive, and stereotyped behaviors, activities and interests (American Psychiatric Association [APA], 2013). While some individuals with ASD have intellectual disability and/or severe deficits in verbal communication, others display normal intelligence and fluent speech. The latter group generally encompasses cases previously diagnosed as having Asperger’s disorder and is often described as “high-functioning” (e.g., see Mayes et al., 2009; Volker et al., 2010).

Although the two core domains of impairment define high-functioning autism spectrum disorder (HFASD), additional associated features have been frequently observed. For example, children with HFASD suffer high rates of bullying and ostracism by age mates, which are in turn associated with mood and anxiety disorders (Asperger, 1944; Green, Gilchrist, Burton, & Cox, 2000; Lopata et al., 2010). Similarly, fine and gross motor impairments have frequently been observed (e.g., see Dowell, Mahone, & Mostofsky, 2009; Ghaziuddin, Butler, Tsai, & Ghaziuddin, 1994; Green et al., 2009; Ming, Brimacombe, & Wagner, 2007; Staples & Reid, 2010).

This report proposes an additional associated feature: nasal voice. Unspecified or varied abnormalities in voice have previously been described in this population. In fact, one of Asperger’s original case studies, Ernst, was noted to have a voice which was “high, slightly nasal and drawn out” (Asperger, 1944). Since then, the research literature has occasionally mentioned voice abnormalities in individuals with HFASD – sometimes with specific mention of, but never with a focus on, nasality (Baltaxe, Simmons, & Zee, 1984; Fine, Bartolucci, Ginsberg, & Szatmari, 1991; Paccia & Curcio, 1982; Provonost, Wakstein, & Wakstein, 1966; Shriberg et al., 2001).
* Tel.: +1 585 475 3953. E-mail address: [email protected]
http://dx.doi.org/10.1016/j.rasd.2015.06.009
1750-9467/© 2015 Elsevier Ltd. All rights reserved.
1.1. Mechanics of speech production

Human speech is produced by using the organs of the vocal tract, such as larynx, tongue, and lips, to shape and control the flow of air as it exits the lungs (Ladefoged & Maddieson, 1996). During non-speech exhalation, air flows freely up the throat. It passes the pharynges and the velum, which are relaxed and lie separated, and exits via the nasal cavity. During production of many speech sounds, however, the velum and pharynges move toward one another, sealing off the nasal cavity and diverting airflow to the oral cavity, where the tongue and lips can create the majority of English phonemes by varying position and pressure. From a developmental perspective, it is important to note that the “path of least resistance” for exhaled air is through the nasal cavity; newborns produce only nasal exhalation (for both breathing and vocalization) until approximately three months of age, when some capacity to close the velopharyngeal apparatus is achieved (Netsell, 1981).

The majority of English consonants are produced with little or no nasal resonance, requiring the speaker to close the velopharyngeal apparatus (Ladefoged & Maddieson, 1996). All vowels include a mixture of oral and nasal resonance, but in American English nasal resonance is minimal except when the speaker is using prosodic features to convey displeasure (i.e., whining tone; Kummer & Lee, 1994). An exception occurs when vowels are produced immediately before or after a nasal consonant [/m/, /n/, and /ŋ/]. These vowels assimilate to the features of the nasal consonant and have increased nasal resonance in normal speakers (Ladefoged & Maddieson, 1996).

It is common to refer to voice and prosody as if they were interchangeable, but in actuality they refer to separate, though highly interrelated, facets of communication.
Prosody refers to the manipulation of stress, pitch, volume, rate, duration, and timbre to emphasize or modify the grammatical, pragmatic, or affective message conveyed by a spoken utterance (Paul, Augustyn, Klin, & Volkmar, 2005). Voice is the vehicle through which prosody is conveyed in spoken languages, but it also has its own independent, baseline features. Nasality is a feature of voice, generally produced consistently by an individual, independent of communicative intent (Kummer & Lee, 1994). However, nasality can be prosodically manipulated within an utterance to add communicative features, specifically to convey a sense of dissatisfaction. Thus, while all speakers will increase nasality for prosodic purposes, some may have chronically high or low levels of nasal voice.

While the ratio of nasal to oral exhalation can be measured (a variable known as nasalance), this value is only moderately correlated with listener perceptions of nasality, just as amplitude is only moderately correlated with listener perceptions of loudness (see Sweeney & Sell, 2008 for a more detailed treatment). The impairment associated with nasal voice is the result of listener judgments, rather than the specific path of air movement. Thus, listener-perceived nasality, rather than mechanically measured nasalance, is considered the ‘gold standard’ for research which examines the social consequences of nasal speech (Kuehn & Moller, 2000).

1.2. Social consequences of nasal voice

The psychosocial consequences of persistently nasal speech are significant and well established. Early pediatric research demonstrated that children expressed increased willingness to avoid or exclude peers with highly nasal speech (Blood & Hyman, 1977). Further research with adults demonstrated that listeners rated highly nasal speakers as less pleasant, less kind, and even less physically attractive (Blood, Mahan, & Hyman, 1979).
Later studies replicated these findings with child and adolescent speakers and also reported a tendency for listeners to rate nasal speakers as less intelligent, less confident, and less honest (Lass, Ruscello, Harkins-Bradshaw, & Blankenship, 1991; Lass, Ruscello, Stout, & Hoffman, 1991; Ruscello, Lass, & Podbesek, 1988). A study of Australian adults found vocal nasality to be negatively correlated with persuasiveness, social status, and solidarity (Pittam, 1990). Proposing a possible mediator for negative listener ratings, McKinnon, Hess, and Landry (1986) found that listeners reported increased anxiety when listening to nasal speech.

A more recent investigation sought listener ratings on a variety of semantic differential items (e.g., confident/unsure, beautiful/ugly, graceful/awkward, etc.) and found that highly nasal speakers were not only rated significantly more negatively than normal control speakers, but also more negatively than speakers with other voice disorders (Lallh & Rochet, 2000). In addition, these researchers provided a subgroup of listeners with information about voice disorders before the listeners began the rating task, but found no significant differences between the opinions of raters who received stigma-reducing education and those who received neutral control information. This is consistent with McKinnon et al.’s finding that listeners’ negative reaction to nasal voice may be affectively, rather than cognitively, mediated.

Negative listener reaction to nasality has even been found in listeners’ reactions to infants’ pre-speech vocalizations. For example, both Canadian and Japanese mothers responded less frequently to nasal infant vocalizations, perceiving them as less communicative than non-nasal infant vocalizations (Masataka & Bloom, 1994).
When undergraduates rated infants on a variety of social favorability criteria based on their vocalizations, nasality was associated with significantly lower ratings for boys, though not for girls (Bloom, Moore-Schoenmakers, & Masataka, 1999). Bloom, Zajac, and Titus (1999) published results that further elaborated on the relationship between nasality and gender. They asked listeners to rate male and female speakers varying in nasality on adjectives reflecting positive female stereotypes (e.g., sensitive, helpful, etc.), negative female stereotypes (e.g., whiny, weak, etc.), positive male stereotypes (e.g., assertive, competent, etc.), and negative male stereotypes (e.g., boastful, hostile, etc.). Bloom et al. found that a highly nasal voice was most strongly associated with an increase in ratings for negative stereotypically female traits across speakers of both sexes, and a decrease in ratings for positive male traits. The latter finding yielded a significant nasality by sex interaction, indicating that highly nasal male speakers were penalized more harshly than highly nasal female speakers on ratings of positive masculine traits.
1.3. Nasal voice in HFASD

While many studies have examined prosodic impairment in individuals with HFASD, none have examined voice characteristics independent of communicative intent. Using audio recordings of the clinical interview portion of the Autism Diagnostic Observation Schedule (ADOS; Lord, Rutter, DiLavore, & Risi, 1999), Shriberg et al. (2001) compared the voice and prosody characteristics of 30 young men with HFASD to those of typically-developing control speakers. They found speakers with HFASD to have significantly higher rates of inappropriate nasal resonance than controls, with about one-third of HFASD participants displaying excessive nasal voice. An additional study examining the same groups of subjects found a significant correlation (r = .35) between the absence of resonance errors and performance on the Vineland Adaptive Behavior Scales – Socialization Domain (Paul et al., 2005).

However, these studies must be interpreted cautiously, due to the circumstances under which the speech samples were elicited. The ADOS module used with verbally fluent adults involves structured questioning about friendship, romantic relationships, and employment (Lord et al., 1999). Nasality can be a prosodic cue to convey unhappiness or dissatisfaction (Kummer & Lee, 1994). Studies of adults with HFASD have found frequent unemployment/underemployment and only rare instances of successfully maintained long-term romantic relationships (Barnhill, 2007; Cederlund, Hagberg, Billstedt, Gillberg, & Gillberg, 2008). Thus, it is possible that the increased rates of nasality observed in the HFASD speakers were present because the speakers were accurately using a prosodic cue to express their unhappiness with their relationships or life circumstances, rather than displaying an underlying and relatively constant voice feature. Therefore, though this study suggests the possibility of more generalized excessive nasality in HFASD, more research is necessary.
An extensive literature base shows that highly nasal speakers suffer negative listener judgments and there is good reason to expect that children with HFASD will be overrepresented among highly nasal speakers. This study compares children with HFASD to typically developing children on perceived nasality of speech and examines its effects on listener reactions.

2. Material and methods

2.1. Participants

Study participants were males between the ages of 6 and 13 years who were either typically-developing controls (TD) or were diagnosed with HFASD. All participants spoke American English as their first and primary language. Exclusionary criteria included any current or previous physiological abnormality of the vocal tract as well as any current condition which might alter or occlude airflow, such as an upper respiratory infection.

Participants with HFASD were recruited from a cognitive-behavioral summer treatment program. To gain entry into the program and therefore to be eligible for recruitment into the present study, children had to have an independent diagnosis of ASD, confirmed by a records review performed by two licensed psychologists. To be considered ‘high-functioning’, the child had to be verbally fluent and to achieve a full scale IQ score greater than 70 on the Wechsler Intelligence Scale for Children, Fourth Edition (WISC-IV; Wechsler, 2003) with a Verbal Comprehension Index or Perceptual Reasoning Index score greater than 85. The TD group was recruited via word-of-mouth and fliers distributed through local agencies. Exclusionary criteria for the TD group included any current or past developmental delay, emotional or behavioral disturbance, or diagnosis of an ASD in a first-degree relative.

Between-groups analyses compared demographically matched samples of TD children and those with HFASD (see Table 1). The groups were identical in age and race/ethnicity distribution.
Table 1
Demographic information for matched pairs.

                                      HFASD (n = 29)                  Typically Dev. (n = 29)
Age – mean (SD); range                9.45 (1.78); 6–13               9.45 (1.78); 6–13
Parent education – mean (SD); range   15.40 (2.05); 12.0–19.0         16.42 (2.01); 11.5–19.0
IQ – mean (SD); range                 103.39 (15.44); 73.93–134.74    114.44 (13.72); 86.00–140.54
Race/ethnicity – n (%)                Caucasian – 25 (86%)            Caucasian – 25 (86%)
                                      African Amer. – 1 (3%)          African Amer. – 1 (3%)
                                      Latino – 2 (7%)                 Latino – 2 (7%)
                                      Asian Amer. – 1 (3%)            Asian Amer. – 1 (3%)

It was not possible to match the groups perfectly on socioeconomic status, as assessed by years of parent education. However, further analysis found that parent education did not correlate significantly with nasality (r[56] = .19, p = .144).

The use of IQ as a covariate when examining a developmentally disabled population remains a complex issue. The obtained IQ score reflects not only underlying intellectual ability, but also a host of behavioral and neuropsychological features such as attention, compliance, and task persistence which are likely symptoms of the disability. Consequently, when an individual with an HFASD is tested, his score reflects both his intelligence and the degree to which his test-taking behaviors have been affected by ASD. Thus, when analyzing the performance of both disabled and non-disabled individuals, diagnostic status and general intellectual ability are to some degree conflated. In this sample, an IQ difference between groups was found and did achieve statistical significance (t[57] = 2.88, p = .006). Examining all participants, IQ was significantly correlated with nasality ratings (r[56] = .36, p = .005). However, when examining only healthy controls, none
of IQ’s correlations with listener ratings (including nasality and attribute ratings) ever achieved a p < .05 standard of significance. This suggests that IQ does not typically predict listener ratings; it may only have predictive value to the degree to which it can capture some of the variance represented by diagnostic status. As this is an exploratory study with emphasis on identifying possible avenues for further research, all analyses were run with and without IQ as a covariate.

2.2. Measures

2.2.1. Short-form IQ

To assess and control for any differences in general intelligence between the TD and HFASD groups, participants were administered a two-subtest short form (vocabulary and matrix reasoning) of the WISC-IV (Wechsler, 2003). Using the method described by Tellegen and Briggs (1967), scaled scores from these two subtests were summed and converted to a deviation quotient score with a mean of 100 and a standard deviation of 15. The short form has an estimated internal consistency reliability coefficient of .93 and is estimated to correlate highly (r = .86) with the Full Scale IQ derived from all 10 subtests (psychometric properties estimated using formulae supplied by Tellegen and Briggs, 1967).

2.2.2. Speech sampling

As discussed previously, nasality can be used prosodically to express unhappiness or dissatisfaction. In connected speech, particularly regarding affectively charged content such as the interview data elicited during administration of the ADOS analyzed by Shriberg et al. (2001), a whiny prosodic tone would be very noticeable and could easily influence listeners’ perceptions. To help ensure that listeners’ ratings were based on the participants’ overall voice characteristics, not specific prosodic inflections, speech samples in the present study were collected using single word stimuli. The twenty words used included numbers 1–10, basic colors (red, yellow, orange, green, blue, purple, brown), and shapes (circle, triangle, square).
These words were selected because they lack significant affective content, contain a mixture of nasal and non-nasal consonants (see below), and could easily be elicited from even the youngest children in the sample.

Participants were taken to a quiet room and introduced to the recording equipment. Few participants displayed reticence regarding the recording. Those that did were reassured that their recorded voice would not be made public and this appeared sufficient to reduce worry. In addition, before beginning the official procedure, all participants were given the opportunity to hear their own voices “sound silly” via the application of audio filters. Thus, participants were not likely to be expressing negative affect during the recording.

Once comfort with the procedure was established, participants were then shown cards one at a time which spelled out and illustrated each word (e.g., the word ‘blue’ and a patch of blue ink). When shown each card, they were encouraged to speak the word aloud. In the rare case in which a child did not spontaneously say the target word, the examiner prompted the correct response without saying the target word (e.g., “What shape is this?”). Word order was kept constant across all participants. The examiner controlled the rate of speech by presenting the cards one at a time, allowing each word to be spoken as a separate token.

2.2.3. Perceived nasality and attribute ratings

The 10 raters consisted of seven undergraduate and three graduate students who volunteered to assist the primary investigator in order to obtain research experience. All were native speakers of American English and reported American English to be their current dominant language. None had previous experience working with individuals with HFASD. Prior to rating the voice samples, raters completed a 30–45 min training session with the primary investigator individually or in small groups.
The trainer first introduced raters to the idea of nasal voice by providing them with samples of more and less nasal utterances. Raters were then required to take a 10-item test to demonstrate that they were able to correctly identify nasal voice and to complete the test with 90% accuracy. Raters were all given identical pairs of Sennheiser HD-202 headphones for listening to the sound files. All raters were kept blind to the hypotheses under investigation as well as to participant diagnoses and demographics until their ratings were complete.

Each individual word was rated for nasality using a 6-point Likert-type scale (ranging from 1 = ‘not nasal at all’ to 6 = ‘extremely nasal’). To generate overall nasality scores for analysis, an average was taken of the ratings produced by each of 10 individual listeners for all 20 words spoken by a given child. Thus, each child’s nasality score was created by averaging 200 ratings (each of 20 spoken words rated by 10 listeners). Participants with HFASD were expected to obtain higher nasality scores than controls.

Then, each rater listened to a file containing an individual child’s 20-word sample. Based on the 20-word sample, listeners were asked to use a 6-point Likert-type rating scale to indicate the degree to which they felt each attribute applied to the speaker (ranging from 1 = ‘does not describe this child at all’ to 6 = ‘describes this child very well’). The specific attributes and associated categories are listed in Table 2. The first twelve items were grouped into four sets of three adjectives each: positive socially relevant adjectives (PSRA), negative socially relevant adjectives (NSRA), positive internal state adjectives (PISA), and negative internal state adjectives (NISA). An overall mean of all three words in a set as rated by all 10 listeners was calculated for each of the four sets. Thus, each child had an overall mean score for PSRA, NSRA, PISA, and NISA.
Given previous research on listener ratings of nasal speech, it was expected that children with HFASD, as compared to controls, would be rated higher on NSRA and lower on PSRA. The evidence for internal state adjectives was weaker and it was expected that group differences on PISA and NISA would be smaller than differences on socially relevant adjectives, though it was predicted that the general pattern of high negative and low positive ratings for the HFASD group would be maintained.
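As a rough illustration of the scoring described above, the following Python sketch averages one child's ratings into a nasality score and into a three-adjective composite. The function names, the array layout (raters in rows), and the use of NumPy are assumptions made for this example; the study does not describe the software used, and the example ratings are hypothetical, not study data.

```python
import numpy as np

N_RATERS = 10                 # listeners, per the Methods
N_WORDS = 20                  # single-word stimuli per child
SCALE_MIN, SCALE_MAX = 1, 6   # 6-point Likert-type scale


def _validated(ratings, expected_shape):
    """Return ratings as a float array after checking shape and scale range."""
    arr = np.asarray(ratings, dtype=float)
    if arr.shape != expected_shape:
        raise ValueError(f"expected shape {expected_shape}, got {arr.shape}")
    if arr.min() < SCALE_MIN or arr.max() > SCALE_MAX:
        raise ValueError("ratings must lie on the 1-6 scale")
    return arr


def nasality_score(word_ratings):
    """Mean of all 200 ratings (10 raters x 20 words) for one child."""
    return float(_validated(word_ratings, (N_RATERS, N_WORDS)).mean())


def composite_score(adjective_ratings):
    """Mean of a three-adjective set (e.g., NSRA) across all 10 raters."""
    return float(_validated(adjective_ratings, (N_RATERS, 3)).mean())


# Hypothetical ratings, for illustration only (not study data).
example_words = np.full((N_RATERS, N_WORDS), 3)
example_words[:, :8] = 4      # pretend eight words were rated one point higher
example_nsra = np.full((N_RATERS, 3), 2)

print(nasality_score(example_words))
print(composite_score(example_nsra))
```

Averaging over raters and items in a single step gives every rating equal weight, which matches the description of each score as a simple mean of all ratings for that child.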
Table 2
Item content of listener ratings.

Composite                               Adjectives
Positive socially relevant adjectives   Cooperative, Friendly, Agreeable
Negative socially relevant adjectives   Whiny, Irritating, Annoying
Positive internal state adjectives      Content, Happy, Comfortable
Negative internal state adjectives      Frustrated, Angry, Unsatisfied
Masculinity                             Masculine, Feminineᵃ
Dominance                               Dominant, Weakᵃ
Judgment of age
Perceived disability

ᵃ Indicates that adjective was reverse scored.
Four additional attribute ratings probed for perceptions pertaining to gender stereotypes. Feminine was reverse-scored and averaged with masculine to form a composite called Masculinity. Similarly, weak was reverse-scored and averaged with dominant to form a composite called Dominance. It was predicted that speakers with HFASD would be rated lower than TD speakers on both Masculinity and Dominance.

In addition, the raters were asked to guess each speaker’s age (they were not given a range of ages for the participants in the study) and to guess whether or not the speaker was typically developing. Each child’s Judgment of Age score was calculated by taking the mean of all raters’ estimation of age. Perceived disability ratings were arbitrarily coded such that a guess of typical development was assigned a ‘0’ and a guess of disability was assigned a ‘1’. Each child’s Judgment of Disability score was calculated by averaging across all 10 raters. Speakers with HFASD were expected to be assigned younger perceived ages and higher perceived disability.

2.2.4. Reliability and validity of listener ratings

When combining individual nasality ratings of the 20 words into an overall composite, the internal consistency (Cronbach’s alpha) values for each individual rater ranged from .67 to .94 (Mdn = .91), while the alpha estimate for the mean rating of all 10 raters combined was .94. Listeners re-rated a subset of fourteen speech samples to assess the degree to which they would assign the same rating when presented with the stimulus a second time. Test-retest correlation coefficients for individual raters ranged from .73 to .94; when examining the overall scores obtained by averaging across all raters, a correlation coefficient of .91 was obtained.

Interrater reliability for nasality ratings has long been recognized as problematic.
The intent in this investigation was not to predict how any given individual would perceive the child’s voice, but rather how the child’s voice is generally perceived in his social milieu. Thus, the analysis focused on the level of the overall mean score across raters, effectively encompassing and averaging input from all 10 raters. The intraclass correlation coefficient (ICC) is the most appropriate measure of the reliability of an overall score combining the input of multiple raters (Shrout & Fleiss, 1979), with a consistency-based calculation most appropriate for a perceptual ranking scale such as the one employed in this study. As a measure of reliability, ICC coefficients may be interpreted according to the same general standards as internal consistency or test-retest reliability. Thus, values ≥ .70 are considered adequate for research use (Sattler, 2008). The ICC coefficient for nasality ratings was .84. ICC coefficients for the adjective ratings ranged from .72 to .95 (Mdn = .82).

Most English vowels are produced with little nasal resonance, but when they immediately border a nasal consonant (i.e., /m/, /n/, and /ŋ/), they are produced with increased nasal resonance, through a process known as assimilation (Ladefoged & Maddieson, 1996). Of the 20 words elicited in the speech sample, eight include one or more vowels immediately preceded or followed by a nasal consonant (e.g., green). Thus, for words containing nasal consonants, all English speakers would be expected to speak more nasally. This natural variation can be used to assess the validity of listener nasality ratings – that is, whether they were rating nasality or some other aspect of speech. For each speaker, a mean was calculated of all listeners’ ratings of the eight nasal assimilation words and was compared to the mean of all listeners’ ratings of the remaining twelve words.
The nasal assimilation words yielded significantly higher nasality ratings than the twelve words which contained no nasal consonants (t[64] = 8.42, p < .001, d = 0.61). This clear difference between known nasal and non-nasal words in the expected direction provided evidence that the listeners were in fact validly rating nasality.
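To make the reliability reasoning above concrete, the sketch below computes a consistency-type, average-measures ICC from a speakers-by-raters matrix and compares it with Cronbach's alpha computed on the same matrix with raters treated as items. The exact ICC variant used in the study is not stated beyond "consistency-based", so the two-way, average-measures form here is an assumption, as are the function names and the use of NumPy.

```python
import numpy as np


def icc_consistency_average(scores):
    """Two-way consistency ICC for the mean of k raters.

    scores: array of shape (n_speakers, k_raters).
    Computed as (MS_rows - MS_error) / MS_rows from a two-way ANOVA
    decomposition without replication.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()   # between speakers
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()   # between raters
    ss_error = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / ms_rows


def cronbach_alpha(scores):
    """Cronbach's alpha with raters (columns) treated as items."""
    x = np.asarray(scores, dtype=float)
    k = x.shape[1]
    item_variances = x.var(axis=0, ddof=1).sum()
    total_variance = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)
```

Under these assumptions the two functions return the same value for any complete ratings matrix, which is one way to see why the text says ICC coefficients can be read against the same standards as internal consistency. Because the calculation is consistency-based, a rater who is uniformly stricter or more lenient than the others does not lower the coefficient.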
3. Results

All major hypotheses were directional. Thus, except where noted, one-tailed p-values are reported. As these results are investigational, no correction has been made for multiple comparisons.

The primary hypothesis proposed that boys with HFASD demonstrate significantly greater nasality than TD boys. A regression analysis found diagnostic status to be a significant predictor of nasality (F(1, 56) = 9.21, p = .002, R² = .141). This effect was maintained even after controlling for differences in IQ (FΔ(1, 55) = 4.75, p = .017, R²Δ = .069). The overall mean nasality rating for the HFASD group was 2.87 (SD = 0.48), while the mean for the TD group was 2.51 (SD = 0.43), yielding a Cohen’s d effect size of 0.80, consistent with a large effect (Cohen, 1988). When a nasality cutoff of 1.0 SD above the TD mean (i.e., 2.94) was used to dichotomize participants into ‘highly nasal’ and ‘typical’ groups, analysis found 5 out of 29 controls judged highly nasal, while 14 out of 29 HFASD participants were judged highly nasal.

Replicating the findings from previous research, nasality ratings yielded a statistically significant correlation with NSRA (r[56] = .67, p < .001), though they did not correlate with listener rating of PSRA (r[56] = .21, p = .056), PISA (r[56] = .17, p = .100), or NISA (r[56] = .14, p = .155). Increased nasality was associated with lower ratings of masculinity (r[56] = −.37, p = .002) and dominance (r[56] = −.56, p < .001). Increased nasality was also associated with lower estimates of the speaker’s age (r[56] = −.47, p < .001) and an increased likelihood that the speaker would be perceived as disabled (r[56] = .58, p < .001).

Table 3 displays the relationship between diagnostic status and Listener Attribute Ratings. Compared to TD speakers, speakers with HFASD were rated significantly higher on NSRA, yielding an effect size d of 0.90 (large; Cohen, 1988). Groups differed significantly on NSRA even after controlling for IQ.
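The nasality effect size, cutoff, and regression F reported above can be approximately recomputed from the published summary values. The sketch below does this in Python; the pooled-standard-deviation form of Cohen's d and the helper names are assumptions for illustration, since the article does not state which formula was used. Because the inputs are rounded to two or three decimals, the results should be close to, but not exactly equal to, the reported figures.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d using the pooled standard deviation of two groups."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


def f_from_r_squared(r_squared, df_model, df_error):
    """F statistic implied by R^2 for a regression with the given df."""
    return (r_squared / df_model) / ((1 - r_squared) / df_error)


# Summary values as reported in the Results (HFASD vs. TD nasality).
d_nasality = cohens_d(2.87, 0.48, 29, 2.51, 0.43, 29)

# 'Highly nasal' cutoff: 1.0 SD above the TD mean.
cutoff = 2.51 + 1.0 * 0.43

# F implied by R^2 = .141 with 1 and 56 degrees of freedom.
f_group = f_from_r_squared(0.141, 1, 56)

# Share of each group above the cutoff, from the reported counts.
pct_hfasd = 100 * 14 / 29
pct_td = 100 * 5 / 29

print(round(d_nasality, 2), round(cutoff, 2), round(f_group, 2))
print(round(pct_hfasd), round(pct_td))
```

A check of this kind only confirms that the reported statistics are mutually consistent; it does not reproduce the analysis, which would require the individual ratings.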
The relationship between diagnostic status and NSRA is markedly diminished when nasality is controlled for, suggesting that nasality mediates the relationship between group status and NSRA ratings. When not controlling for IQ, however, the relationship between group status and NSRA ratings remains significant even after controlling for nasality, suggesting that other characteristics of HFASD speakers’ voices contributed partially to the elevated NSRA ratings. In contrast, groups did not significantly differ on PSRA, PISA, or NISA, though group means did trend in the expected direction.

No effect was found on the Masculinity variable, but groups did differ significantly on Dominance, Judgment of Age, and Judgment of Disability. Group differences on the Dominance and Judgment of Age variables became nonsignificant when controlling for nasality, again suggesting nasality’s role as a mediating variable. Neither Dominance nor Judgment of Age remained significant, however, when controlling for IQ. Like NSRA, Judgment of Disability remained statistically significant but substantially diminished when controlling for nasality, suggesting that nasality partially mediates the relationship between group status and listener perceptions of disability, but that other speaker characteristics likely play a significant role. Even when controlling for IQ, groups differed significantly on the Judgment of Disability variable, a relationship which was suppressed when the analysis controlled for nasality as well. TD participants achieved a mean disability score of 0.38 (SD = 0.23), whereas HFASD participants received a mean score of 0.57 (SD = 0.24), yielding a large effect size (d = 0.85).

4. Discussion

A broad trend in the results was a clear relationship between nasality and other listener ratings that weakened or became undetectable when HFASD status was substituted for nasality. Two clear potential explanations present themselves.
First, substituting diagnostic status for nasality effectively dichotomizes a continuous variable. Though there may have been a distinct trend between the ratings of an individual who displays moderately increased nasality and one who displays markedly increased nasality, that variance is not captured in a dichotomous system, thus decreasing the likelihood that subtle associations will be identified. Secondly, and perhaps more importantly, just under one-half of participants with HFASD were identified as highly nasal in this study. When the HFASD group was treated as a block, the normal-nasality participants likely contributed enough random error variance that a significant result could no longer be achieved. Essentially, nasality strongly predicted other listener ratings. Because high nasality was more strongly concentrated in the HFASD group than the TD group, examination by group status provided a faint reflection of the nasality predictions. This is most clearly illustrated by contrasting a simple t-test analysis comparing HFASD vs. TD participants with one comparing participants with normal nasality to those with high nasality (Table 4). The same broad pattern is produced in both comparisons, but the effect size estimates are smaller and fewer comparisons are significant when HFASD vs. TD is examined instead of high nasality vs. normal nasality groups.

From a clinical perspective, however, it is important to remember that 48% of the speakers with HFASD in this sample were classified as highly nasal and the latter set of results would apply to them. In other words, an individual with HFASD who is highly nasal would expect the full severity of negative listener reactions associated with excessive nasality. Thus, for this substantive subset of the HFASD population, it appears that listeners may at least initially judge them 1.5 standard deviations higher on NSRA (irritating, whiny, and annoying) based solely on the characteristics of their voices.
This is a serious setback for a population which is already defined by its social difficulties.

Prior research has consistently demonstrated a relationship between nasal voice and negative listener judgments, a relationship clearly reproduced in this study. Ratings of the HFASD group’s speech were elevated for a composite of the three negative socially relevant adjectives: annoying, whiny, and irritating. Interestingly, however, group status failed to demonstrate a statistically
Table 3
Relationship between group status (HFASD or typically developing) and listener attribute ratings.

| Attribute | Without controlling for IQ: group status predicting listener ratings | Without controlling for IQ: group status predicting listener ratings, controlling for nasality | Controlling for IQ: group status predicting listener ratings, controlling for IQ | Controlling for IQ: group status predicting listener ratings, controlling for IQ and nasality |
| --- | --- | --- | --- | --- |
| Positive socially relevant adjectives | FΔ(1, 56) = 1.92, p = .086, R²Δ = .033 | FΔ(1, 55) = 0.71, p = .201, R²Δ = .012 | FΔ(1, 55) = 1.50, p = .113, R²Δ = .026 | FΔ(1, 54) = 0.71, p = .200, R²Δ = .013 |
| Negative socially relevant adjectives | FΔ(1, 56) = 11.79, p = .001**, R²Δ = .174 | FΔ(1, 55) = 3.38, p = .036*, R²Δ = .032 | FΔ(1, 55) = 4.30, p = .022*, R²Δ = .046 | FΔ(1, 54) = 0.92, p = .172, R²Δ = .007 |
| Positive internal state adjectives | FΔ(1, 56) = 0.50, p = .242, R²Δ = .009 | FΔ(1, 55) = 0.06, p = .406, R²Δ = .001 | FΔ(1, 55) = 0.32, p = .286, R²Δ = .006 | FΔ(1, 54) = 0.60, p = .404, R²Δ = .001 |
| Negative internal state adjectives | FΔ(1, 56) = 0.34, p = .282, R²Δ = .006 | FΔ(1, 55) = 0.05, p = .416, R²Δ = .001 | FΔ(1, 55) = 0.19, p = .334, R²Δ = .003 | FΔ(1, 55) = 0.04, p = .427, R²Δ = .001 |
| Masculinity | FΔ(1, 56) = 0.47, p = .247, R²Δ = .008 | FΔ(1, 55) = 0.18, p = .339, R²Δ = .003 | FΔ(1, 55) = 0.002, p = .485, R²Δ < .001 | FΔ(1, 54) = 0.46, p = .251, R²Δ = .007 |
| Dominance | FΔ(1, 56) = 5.54, p = .011*, R²Δ = .090 | FΔ(1, 55) = 0.77, p = .193, R²Δ = .009 | FΔ(1, 55) = 1.61, p = .105, R²Δ = .023 | FΔ(1, 54) = 0.09, p = .384, R²Δ = .001 |
| Judgment of age | FΔ(1, 56) = 4.28, p = .022*, R²Δ = .071 | FΔ(1, 55) = 0.65, p = .211, R²Δ = .009 | FΔ(1, 55) = 1.59, p = .107, R²Δ = .025 | FΔ(1, 54) = 0.21, p = .327, R²Δ = .003 |
| Judgment of disability | FΔ(1, 56) = 11.01, p = .001**, R²Δ = .164 | FΔ(1, 55) = 3.59, p = .032*, R²Δ = .040 | FΔ(1, 55) = 4.67, p = .018*, R²Δ = .059 | FΔ(1, 54) = 1.56, p = .109, R²Δ = .016 |

FΔ is the change in F-value attributable to variation in group membership above and beyond any covariates; the associated p-value reports whether variation due to group membership significantly predicts Attribute Ratings above and beyond the effects of any covariates. R²Δ is the proportion of variance in listener Attribute Ratings due to group membership once variance due to covariates has been removed.
* p < .05. ** p < .01.
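The FΔ and R²Δ statistics in Table 3 are linked by the standard hierarchical-regression identity FΔ = (R²Δ / df1) / ((1 − R²full) / df2). The following is a minimal Python sketch of that identity, not the author's analysis code. It is applied only to the no-covariate column, where the reduced model has R² = 0 and the full-model R² therefore equals R²Δ. The function name `f_change` is illustrative; the covariate-adjusted columns cannot be checked this way because the full-model R² is not reported in the table.

```python
def f_change(r2_reduced: float, r2_full: float, df_num: int, df_den: int) -> float:
    """F statistic for the increase in R^2 when predictors are added to a model.

    r2_reduced: R^2 of the model without the added predictor(s)
    r2_full:    R^2 of the model with the added predictor(s)
    df_num:     number of added predictors
    df_den:     residual degrees of freedom of the full model
    """
    r2_delta = r2_full - r2_reduced
    return (r2_delta / df_num) / ((1.0 - r2_full) / df_den)


# No-covariate column of Table 3: (reported R^2 change, reported F change).
# With no covariates the reduced model explains nothing, so r2_reduced = 0.
reported = {
    "Negative socially relevant adjectives": (0.174, 11.79),
    "Dominance": (0.090, 5.54),
    "Judgment of age": (0.071, 4.28),
}

for attribute, (r2_delta, reported_f) in reported.items():
    computed_f = f_change(0.0, r2_delta, df_num=1, df_den=56)
    print(f"{attribute}: computed F = {computed_f:.2f}, reported F = {reported_f}")
```

Small differences between computed and tabled values are to be expected, because R²Δ is reported to only three decimal places.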
Table 4
A contrast of the mean differences between TD and HFASD groups with the mean differences between normal nasality and highly nasal groups.

| Variable | TD (n = 29) M (SD) | HFASD (n = 29) M (SD) | t-value (TD vs. HFASD) | p-value (TD vs. HFASD) | Cohen’s d (TD vs. HFASD) | Normal nasality (n = 39) M (SD) | Highly nasal (n = 19) M (SD) | t-value (nasality) | p-value (nasality) | Cohen’s d (nasality) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PSRA | 4.13 (0.49) | 3.95 (0.50) | 1.38 | .086 | 0.36 | 4.12 (0.44) | 3.88 (0.59) | 1.73 | .045* | 0.49 |
| NSRA | 1.92 (0.55) | 2.44 (0.60) | 3.43 | .001** | 0.90 | 1.93 (0.49) | 2.70 (0.56) | 5.38 | <.001*** | 1.50 |
| PISA | 3.79 (0.61) | 3.68 (0.65) | 0.71 | .242 | 0.17 | 3.81 (0.56) | 3.58 (0.75) | 1.29 | .101 | 0.37 |
| NISA | 1.97 (0.47) | 2.05 (0.50) | 0.58 | .282 | 0.16 | 1.98 (0.48) | 2.08 (0.51) | 0.73 | .235 | 0.20 |
| Masculinity | 4.30 (1.08) | 4.09 (1.25) | 0.69 | .247 | 0.62 | 4.46 (1.07) | 3.65 (1.17) | 2.63 | .006** | 0.73 |
| Dominance | 4.03 (0.66) | 3.60 (0.72) | 2.35 | .011* | 0.54 | 4.08 (0.65) | 3.26 (0.51) | 4.82 | <.001*** | 1.35 |
| Judgment of Age | 9.83 (1.70) | 8.90 (1.72) | 2.07 | .022* | 0.54 | 9.85 (1.47) | 8.37 (1.92) | 3.26 | .001** | 0.91 |
| Judgment of Disability | 0.37 (0.23) | 0.57 (0.24) | 3.32 | .001** | 0.85 | 0.38 (0.24) | 0.65 (0.19) | 4.36 | <.001*** | 1.20 |

* p < .05. ** p < .01. *** p < .001.
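Several of the Cohen’s d values in Table 4 can be approximately reproduced from the tabled means, standard deviations, and group sizes. The sketch below assumes the conventional pooled-standard-deviation form of Cohen’s d; the text does not state which variant was used, so this is an illustration rather than the author's computation. The function name `cohens_d` is hypothetical, and the inputs are the rounded values from Table 4.

```python
import math


def cohens_d(m1: float, sd1: float, n1: int, m2: float, sd2: float, n2: int) -> float:
    """Absolute standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return abs(m1 - m2) / math.sqrt(pooled_var)


# NSRA, TD vs. HFASD (Table 4 reports d = 0.90)
d_nsra_group = cohens_d(1.92, 0.55, 29, 2.44, 0.60, 29)

# NSRA, normal nasality vs. highly nasal (Table 4 reports d = 1.50)
d_nsra_nasality = cohens_d(1.93, 0.49, 39, 2.70, 0.56, 19)

# Judgment of Disability, TD vs. HFASD (Table 4 reports d = 0.85)
d_disability_group = cohens_d(0.37, 0.23, 29, 0.57, 0.24, 29)

print(f"NSRA by group:       d = {d_nsra_group:.2f}")
print(f"NSRA by nasality:    d = {d_nsra_nasality:.2f}")
print(f"Disability by group: d = {d_disability_group:.2f}")
```

The NSRA pair illustrates the pattern described in the Discussion: the same outcome yields a noticeably larger standardized difference when speakers are grouped by nasality than when they are grouped by diagnosis.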
significant effect on ratings of PSRA, and nasality itself proved only a marginal predictor. It is unclear why nasality had a much stronger effect on negative than positive social perceptions.

Neither group status nor nasality was associated with the raters’ estimation of the speakers’ internal states. Listeners perceived an unpleasant tone in the child’s voice and rated the child as displeasing to others, but did not infer that the child was displeased. These findings are consistent with those of McKinnon et al. (1986), who demonstrated that listeners’ anxiety increased when exposed to highly nasal speech. Thus, it seems most likely that listeners in this study heard highly nasal speech and felt annoyed. They thus rated the speaker as annoying, but had no reason to infer that he was unhappy.

Based only on their voices, speakers with HFASD were substantially more likely to be judged as disabled than their TD peers. They were also judged younger and less dominant. These findings may help to explain why children with HFASD are so frequently the targets of peer victimization. If their voices signal to peers that they are weak or immature, their classmates are likely to perceive them as “easy targets.” Future research should determine whether these results can be replicated with peers as raters, particularly peers with a history of bullying behavior.

5. Conclusions

A sampling decision was made to include only male speakers, as few female HFASD participants could be located. This is both a strength and a weakness of the study. By including only boys, the results provide a clear demonstration of the effects of nasality on males with HFASD. The findings pertaining to masculinity and dominance would have become more ambiguous had the sample included girls. At the same time, these findings cannot be assumed to extend to girls with HFASD.
An analogous decision was made to focus on high-functioning individuals with ASD, not only because the study focused on the effects of nasality in fluent speakers, but also because lower-functioning individuals would be much more likely to have undergone intensive direct instruction to initiate communicative vocalization, which could introduce additional unknown variance into the sample. Thus, a stronger statement can be made about the HFASD population, but the findings cannot be assumed to apply to lower-functioning individuals with ASD. Future research should seek to replicate the finding with female and lower-functioning individuals with ASD.

A strength of the study was moderate ethnic diversity (approximately 14% non-Caucasian) and the use of ethnicity matching between the TD and HFASD groups.

A limitation of this study was its use of a speech sample that could be considered less than maximally sensitive to nasality. A very short elicitation procedure was selected, in part, as a practical matter, to avoid overburdening children who were participating in multiple studies. Longer speech samples not only increase reliability by providing more tokens for analysis (Sattler, 2008) but also tax the velopharyngeal apparatus, making subtle dysfunction more readily detectable (Kummer & Lee, 1994). With a more sensitive elicitation task, more subtle abnormalities could be detected. However, this does not discount the current findings, which were achieved despite use of a less than optimal task; findings under such conditions suggest robust effects.

The greatest strength of the study was the use of extensive procedures designed to differentiate voice abnormality from prosodic abnormality.
By collecting low-affect single-word tokens and presenting them to raters in a different order than the one in which they were elicited, this study ensured that any differences between the HFASD and the TD group were due to abnormalities of voice, not variations in the use of voice to convey meaning.

An additional strength of the study was the use of an unusually large number of raters (N = 10) to analyze the data. This significantly increased the reliability of the overall listener judgment scores. However, it should be noted that all perceptual ratings were performed by young adult undergraduate and graduate students who had no prior contact with the speakers. Children and older adult listeners cannot be assumed to produce the same ratings, nor can listeners who have regular direct contact with highly nasal speakers. It may be particularly interesting for researchers to examine the attitudes of parents and teachers who have frequent interactions with a highly nasal speaker. Nonetheless, the methodology employed herein was extremely useful in ensuring that observed
variation in ratings was due to the children’s voices, not to prosody or overall social-communicative behavior.

Ultimately, this study tentatively demonstrated that a significant subset of boys with HFASD produces highly nasal speech and that this is associated with negative listener perceptions, a significant interpersonal barrier in a population already defined by impaired social functioning. Future research should explore potential causes for this phenomenon, which will in turn inform treatments.
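As a supplementary illustration of the point made in the Conclusions that averaging over ten raters increases the reliability of listener judgment scores, the Spearman-Brown prophecy formula gives the expected reliability of a mean of k raters from the reliability of a single rater. This is a minimal sketch under the assumption that raters are interchangeable; the single-rater reliability of 0.30 is a hypothetical value chosen for illustration and is not a result from this study, and the function name `spearman_brown` is likewise illustrative.

```python
def spearman_brown(single_rater_reliability: float, n_raters: int) -> float:
    """Expected reliability of the mean rating across n_raters interchangeable raters."""
    r = single_rater_reliability
    return n_raters * r / (1.0 + (n_raters - 1) * r)


# Hypothetical single-rater reliability; NOT a value reported in the study.
assumed_single_rater_reliability = 0.30

for k in (1, 2, 5, 10):
    projected = spearman_brown(assumed_single_rater_reliability, k)
    print(f"{k:2d} rater(s): projected reliability of the mean = {projected:.3f}")
```

Under this assumption, reliability rises steeply over the first few raters and more slowly thereafter, which is why a panel of ten can yield dependable composite scores even when individual raters agree only modestly.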
References

American Psychiatric Association (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Washington, DC: Author.
Asperger, H. (1944). Die “Autistichen Psychopathen” im Kindesalter. Archiv für Psychiatrie und Nervenkrankheiten, 117, 76–136.
Baltaxe, C., Simmons, J., & Zee, E. (1984). Intonation patterns in normal, autistic and aphasic children. In A. Cohen & M. van de Broecke (Eds.), Proceedings of the Tenth International Congress of Phonetic Sciences (pp. 713–718). Dordrecht, The Netherlands: Foris Publications.
Barnhill, G. P. (2007). Outcomes in adults with Asperger syndrome. Focus on Autism and Developmental Disabilities, 22(2), 116–126. http://dx.doi.org/10.1177/10883576070220020301
Blood, G. W., & Hyman, M. (1977). Children’s perceptions of nasal resonance. Journal of Speech and Hearing Disorders, 42, 446–448.
Blood, G. W., Mahan, B. W., & Hyman, M. (1979). Judging personality and appearance from voice disorders. Journal of Communication Disorders, 12, 63–68. http://dx.doi.org/10.1016/0021-9924(79)90022-4
Bloom, K., Moore-Schoenmakers, K., & Masataka, N. (1999). Nasality of infant vocalizations determines gender bias in adult favorability ratings. Journal of Nonverbal Behavior, 23(3), 219–236. http://dx.doi.org/10.1023/A:1021317310745
Bloom, K., Zajac, D., & Titus, J. (1999). The influence of nasality of voice on sex-stereotyped perceptions. Journal of Nonverbal Behavior, 23(4), 271–281. http://dx.doi.org/10.1023/A:1021650809431
Cederlund, M., Hagberg, B., Billstedt, A., Gillberg, I. C., & Gillberg, C. (2008). Asperger’s syndrome and autism: A comparative longitudinal follow-up study more than 5 years after the initial diagnosis. Journal of Autism and Developmental Disorders, 38, 72–85. http://dx.doi.org/10.1007/s10803-007-0364-6
Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). New York, NY: Routledge Academic.
Dowell, L., Mahone, E., & Mostofsky, S. (2009). Associations of postural knowledge and basic motor skill with dyspraxia in autism: Implication for abnormalities in distributed connectivity and motor learning. Neuropsychology, 23(5), 563–570. http://dx.doi.org/10.1037/a0015640
Fine, J., Bartolucci, G., Ginsberg, G., & Szatmari, P. (1991). The use of intonation to communicate in pervasive developmental disorders. Journal of Child Psychology and Psychiatry, 32, 771–782. http://dx.doi.org/10.1111/j.1469-7610.1991.tb01901.x
Ghaziuddin, M., Butler, E., Tsai, L., & Ghaziuddin, N. (1994). Is clumsiness a marker for Asperger syndrome? Journal of Intellectual Disability Research, 38(5), 519–527. http://dx.doi.org/10.1111/j.1365-2788.1994.tb00440.x
Green, D., Charman, T., Pickles, A., Chandler, S., Loucas, T., Simonoff, E., et al. (2009). Impairment in movement skills of children with autistic spectrum disorders. Developmental Medicine & Child Neurology, 51(4), 311–316. http://dx.doi.org/10.1111/j.1469-8749.2008.03242.x
Green, J., Gilchrist, A., Burton, D., & Cox, A. (2000). Social and psychiatric functioning in adolescents with Asperger syndrome compared with conduct disorder. Journal of Autism and Developmental Disabilities, 30, 279–293. http://dx.doi.org/10.1023/A:1005523232106
Kuehn, D., & Moller, K. T. (2000). Speech and language issues in the cleft palate population: The state of the art. Cleft Palate – Craniofacial Journal, 37, 341–348. http://dx.doi.org/10.1597/1545-1569(2000)037<0348:SALIIT>2.3.CO;2
Kummer, A. W., & Lee, L. (1994). Evaluation and treatment of resonance disorders. Language, Speech, and Hearing Services in Schools, 27, 271–281.
Ladefoged, P., & Maddieson, I. (1996). The Sounds of the World’s Languages. Cambridge, MA: Blackwell Publishers Inc.
Lallh, A. K., & Rochet, A. P. (2000). The effect of information of listeners’ attitudes toward speakers with voice or resonance disorders. Journal of Speech, Language, and Hearing Research, 43, 782–794.
Lass, N. J., Ruscello, D. M., Harkins-Bradshaw, K., & Blankenship, B. L. (1991). Adolescents’ perceptions of normal and voice-disordered children. Journal of Communication Disorders, 24, 267–274. http://dx.doi.org/10.1016/0021-9924(91)90002-Z
Lass, N. J., Ruscello, D. M., Stout, L. L., & Hoffman, F. M. (1991). Peer perceptions of normal and voice disordered children. Folia Phoniatrica, 43, 485–489. http://dx.doi.org/10.1159/000266098
Lopata, C., Toomey, J. A., Fox, J. D., Volker, M. A., Chow, S. Y., Thomeer, M. L., et al. (2010). Anxiety and depression in children with HFASD: Symptom levels and source differences. Journal of Abnormal Child Psychology, 38, 765–776. http://dx.doi.org/10.1007/s10802-010-9406-1
Lord, C., Rutter, M. L., DiLavore, P. C., & Risi, S. (1999). In WPS (Ed.), Autism Diagnostic Observation Schedule – WPS. Los Angeles: Western Psychological Services.
Masataka, N., & Bloom, K. (1994). Acoustic properties that determine adults’ preferences for 3-month-old infant vocalizations. Infant Behavior & Development, 17(4), 461–464. http://dx.doi.org/10.1016/0163-6383(94)90038-8
Mayes, S., Calhoun, S., Murray, M., Morrow, J., Yurich, K., Mahr, F., et al. (2009). Comparison of scores on the checklist for autism spectrum disorder, childhood autism rating scale, and Gilliam Asperger’s disorder scale for children with low functioning autism, high functioning autism, Asperger’s disorder, ADHD, and typical development. Journal of Autism and Developmental Disorders, 39(12), 1682–1693. http://dx.doi.org/10.1007/s10803-009-0812-6
McKinnon, S. L., Hess, C. W., & Landry, R. G. (1986). Reactions of college students to speech disorders. Journal of Communication Disorders, 19, 75–82. http://dx.doi.org/10.1016/0021-9924(86)90005-5
Ming, X., Brimacombe, M., & Wagner, G. (2007). Prevalence of motor impairment in autism spectrum disorders. Brain & Development, 29(9), 565–570. http://dx.doi.org/10.1016/j.braindev.2007.03.002
Netsell, R. (1981). The acquisition of speech motor control: A perspective with directions for research. In R. E. Stark (Ed.), Language behavior in infancy and early childhood. Amsterdam: Elsevier North-Holland.
Paccia, J., & Curcio, F. (1982). Language processing and forms of immediate echolalia in autistic children. Journal of Speech and Hearing Research, 25, 42–47.
Paul, R., Augustyn, A., Klin, A., & Volkmar, F. R. (2005). Perception and production of prosody by speakers with autism spectrum disorders. Journal of Autism and Developmental Disorders, 35(2), 205–220. http://dx.doi.org/10.1007/s10803-004-1999-1
Paul, R., Shriberg, L. D., McSweeny, J., Cicchetti, D., Klin, A., & Volkmar, F. (2005). Brief report: Relations between prosodic performance and communication and socialization ratings in high functioning speakers with autism spectrum disorders. Journal of Autism and Developmental Disorders, 35(6), 861–869. http://dx.doi.org/10.1007/s10803-005-0031-8
Pittam, J. (1990). The relationship between perceived persuasiveness of nasality and source characteristics for Australian and American listeners. The Journal of Social Psychology, 130(1), 81–87. http://dx.doi.org/10.1080/00224545.1990.9922937
Provonost, W., Wakstein, M., & Wakstein, D. (1966). A longitudinal study of speech behavior and language comprehension in fourteen children diagnosed as atypical autistic. Exceptional Children, 33(1), 19–26.
Ruscello, D. M., Lass, N. J., & Podbesek, J. (1988). Listeners’ perceptions of normal and voice disordered children. Folia Phoniatrica, 40, 290–296. http://dx.doi.org/10.1159/000265922
Shriberg, L. D., Paul, R., McSweeny, J. L., Klin, A., Cohen, D. J., & Volkmar, F. R. (2001). Speech and prosody characteristics of adolescents and adults with high-functioning autism and Asperger syndrome. Journal of Speech, Language, and Hearing Research, 44(5), 1097–1115. http://dx.doi.org/10.1044/1092-4388(2001/087)
Sattler, J. (2008). Assessment of children: Cognitive foundations. La Mesa, CA: Jerome M Sattler Publishers.
Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86, 420–426. http://dx.doi.org/10.1037/0033-2909.86.2.420
Staples, K., & Reid, G. (2010). Fundamental movement skills and autism spectrum disorders. Journal of Autism and Developmental Disorders, 40(2), 209–217. http://dx.doi.org/10.1007/s10803-009-0854-9
Sweeney, T., & Sell, D. (2008). Relationship between perceptual ratings of nasality and nasometry in children/adolescents with cleft palate and/or velopharyngeal dysfunction. International Journal of Language and Communication Disorders, 43(3), 265–282. http://dx.doi.org/10.1080/13682820701438177
Tellegen, A., & Briggs, P. F. (1967). Old wine in new skins: Grouping Wechsler subtests into new scales. Journal of Consulting Psychology, 31(5), 499–506. http://dx.doi.org/10.1037/h0024963
Volker, M. A., Lopata, C., Smerbeck, A. M., Knoll, V., Toomey, J. A., Rodgers, J. D., et al. (2010). BASC2 PRS profiles for students with high functioning autism spectrum disorders. Journal of Autism and Developmental Disorders, 40, 188–199. http://dx.doi.org/10.1007/s10803-009-0849-6
Wechsler, D. (2003). Wechsler intelligence scale for children (4th ed.). San Antonio, TX: Pearson Inc.