Validation and Standardization of the Pediatric Voice Symptom Questionnaire: A Double-Form Questionnaire for Dysphonic Children and Their Parents

*,†Verduyckt Ingrid, ‡Morsomme Dominique, and †Remacle Marc, *Louvain-la-Neuve, †Bruxelles, and ‡Liege, Belgium

Summary: The aim of our study was to validate a Pediatric Voice Symptom Questionnaire (PVSQ) with parallel forms for children and their parents. The items of the questionnaire were developed from the results of structured interviews with dysphonic children (DP) and normophonic children (NP) and their mothers and were tested for feasibility in a pilot study involving 42 normophonic children aged 5–13 years. The items were then administered in a test-retest mode to 333 children and their parents (154 boys and 179 girls with a mean age of 9 years, standard deviation: 1.8): 45 consulting DP, 34 nonconsulting dysphonics (ncDP), 163 NP, and 91 others. Classical statistical analyses and an item response modeling approach were used to analyze the results. High internal consistency and good test-retest stability were found. Significant differences between the total scores of the NP, DP, and ncDP groups were observed for both the children and the parents, and also between parental and child scores for the NP and ncDP groups (P < 0.001 to P = 0.014). Correlations between child and parental scores were found only in the DP group (r = 0.478; P < 0.001). Based on our results, the PVSQ is a valid and reliable instrument for the self-evaluation of dysphonia in the child population.

Key Words: Dysphonia–Children–Parents–PROM–Quality-of-life questionnaire.
INTRODUCTION
Children represent up to 10% of the patients seeking clinical help for voice disorders1,2 and the prevalence of childhood dysphonia varies between 0.12% and 24%.3–7 The disparity of the results can in part be explained by the definition of dysphonia (based on perceptual or acoustic data or on laryngeal findings) and the design used in each study (eg, recruitment of children by school teachers or by therapists). Although child voice has been considered a neglected area,8 there has been a resurgence of studies on the subject in recent years9–12 and the literature offers normative data for child voice regarding most of the objective measures used and recommended for vocal assessment.13–16 However, there is still a shortage when it comes to the child’s subjective evaluation of his or her voice. The information from patients’ self-evaluation is independent of the objective information provided by acoustic or aerodynamic measures17 and is valuable both for setting therapy goals and as a therapy outcome measure.17,18 Patient-reported outcome measures (PROMs) are nowadays considered a cornerstone of voice assessment,19 and three questionnaires, available in English, exist for the subjective evaluation of vocal symptoms in children. All three tools consist of items that were adapted from adult voice questionnaires and designed in the form of parental proxies.20–22
Accepted for publication August 1, 2011. From the *Faculte de psychologie, Universite de Louvain, Louvain-la-Neuve, Belgium; †Centre d’Audiophonologie, Cliniques Universitaires de Saint-Luc, Universite de Louvain, Bruxelles, Belgium; and the ‡Departement de Psychologie: cognition et comportement/Logopedie des troubles de la voix, Universite de Liege, Liege, Belgium. Address correspondence and reprint requests to Verduyckt Ingrid, Centre d’Audiophonologie, Cliniques Universitaires de Saint-Luc, Universite de Louvain, Bruxelles, Belgium. E-mail: [email protected]
Journal of Voice, Vol. 26, No. 4, pp. e129-e139
0892-1997/$36.00
© 2012 The Voice Foundation
doi:10.1016/j.jvoice.2011.08.001
In the pediatric literature on health-related quality of life (HRQoL), the use of parental proxies has been debated for more than a decade.23–29 When interest in pediatric HRQoL measures was first raised, researchers mainly advocated reports by adult proxies, claiming that children lacked the cognitive and linguistic skills required to understand and respond autonomously to the surveys. Children’s limited social experience and lack of a long-term view of events were also raised as reasons to rely on adult proxies. However, children’s limited social experience is the reality they are evolving in and referring to, and it could equally be argued that adults are not reliable respondents for children precisely because of this discrepancy in social experience and long-term view of events. Adult and child parallel forms of HRQoL questionnaires have been developed over the last 10 years for children from the age of 5 years.30 Although the purpose of the parallel forms is not always clearly stated, two different reasons can be identified: either the parental reports are viewed as complementary to the child report, giving supplementary information that will be useful for treatment decisions, or the parallel forms are developed to examine whether one of the respondents can be substituted for the other. In the first case, good concordance between adult and child respondents is not expected, whereas in the second case, good concordance is a sine qua non condition for the tool to be valid. The fact that the only tools available in the vocal field address the parents and not the children could reflect a tacit belief that children are either not aware of their vocal symptoms or not capable of accounting for them.
However, two recent studies in which focus group interviews were conducted with dysphonic children and their parents show that children are both conscious of and affected by their vocal symptoms and that they are capable of describing them accurately without adult help from as early as the age of 5 years.31,32 The studies also found evidence that the parental evaluation of the child’s difficulties is not in full concordance with the child’s own evaluation. The validity of the parental proxies in the three existing voice-related child questionnaires has not been studied. We do not know if the adult responses can be substituted for the child’s in the voice-related domain. In fact, we know little about how well parents and children agree on HRQoL in general; results from available studies are sometimes contradictory and cannot be generalized because of differences in methodology. Poor-to-good agreement between child and adult proxies has been found and the values differ according to the domains measured; in some studies, domains referred to as observable, such as physical health, have yielded better concordance, whereas other studies have found better agreement in psychosocial functioning. The treatment of pediatric dysphonia demands active involvement from the parents and the child, and discordances between parental and child perception of the impact of dysphonia could have implications for treatment adherence and treatment outcome. Although a careful evaluation of the child’s and his/her parents’ perception of the dysphonia is part of the clinical procedure before the treatment decision, there is no validated tool available to formally assess this self-evaluation. A specific questionnaire permitting the parallel subjective evaluation of dysphonia in children both by the child and his/her parents would help in providing comparable data across studies. The present study aims to develop and validate a questionnaire permitting the parallel subjective evaluation of vocal symptoms in children by the child and by the parents.
MATERIALS AND METHODS
Approval for this study was obtained from the Saint-Luc Hospital’s ethical committee according to Belgian regulation and a signed informed consent form was obtained from all participating children’s parents before testing. First, a double-form questionnaire of 31 items (for each form) was constructed from the different voice-related symptoms or complaints that were identified in our previous study.32 In the child form, the items address the child himself or herself (for example: “Do you have to strain when you speak?”); in the parent form, the items duplicate those used in the child form but address the parents (for example: “Does your child strain when he/she speaks?”). During the interviews conducted in the first study, it was noticed that most children had to contextualize their voice use by recalling concrete situations. As an example, a child could start by saying she had no vocal symptoms and later in the interview, while speaking about her choir rehearsal, report that she mimed some of the songs because her voice broke or because she did not feel confident with it. Thus, we chose to contextualize some of the items to situations that the child reports as familiar to him/her. This item mode was inspired by the Paediatric Asthma Quality of Life Questionnaire.33 An ordered response scale was chosen as the answer mode. For the child form, answer options were presented both verbally (orally and in written form) and as symbols of small to large circles. That choice was made according to Rebok et al34 who found
it to be one of the most reliable evaluation modalities for young children. Second, a study on the feasibility of the questionnaire was undertaken to evaluate the children’s comprehension of the items and of the test situation as well as their concentration throughout the test. The questionnaire was administered orally and individually to each child. Two clinicians were present: one, the examiner, administered the questionnaire, whereas the other took notes regarding the children’s comprehension difficulties, the examiner’s additional explanations, and any sign of distraction (not paying attention to what is said, not maintaining eye contact, making inappropriate comments, and so forth). The items were randomly distributed in four different forms to avoid a length effect on certain items. The children were invited to say when they did not understand an item, and the examiner then tried to explain it in another way. All comprehension difficulties were noted, as were the terms that aided comprehension. Every sign of distraction in the children was noted as well. The answers were analyzed both qualitatively and quantitatively: a “comprehension score” was computed for each item based on the percentage of children who understood it without reformulation from the examiner. No statistical analyses were used at this point of the study. Finally, the questionnaire obtained after the pilot study was administered to a group of 333 children and their parents. Normophonic children (NP) were recruited through primary schools in Brussels and the surrounding area and dysphonic children (DP) were recruited through the ENT service at the AudioPhonological Centre of the University Hospital Saint-Luc in Brussels. Before the administration of the questionnaire, the introductory text on voice was read to the children, three questions about voice use were asked, and then the child form of the Pediatric Voice Symptom Questionnaire (PVSQ) was administered orally.
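The comprehension score used in the feasibility study is a simple per-item percentage. The sketch below shows one way it could be computed; the function name `comprehension_scores` and the small data set are hypothetical and only illustrate the definition given in the text (share of children who understood an item without reformulation).

```python
def comprehension_scores(understood):
    """Return one percentage per item.

    understood[c][i] is True when child c understood item i
    without any reformulation from the examiner.
    """
    n_children = len(understood)
    n_items = len(understood[0])
    scores = []
    for item in range(n_items):
        count = sum(1 for child in understood if child[item])
        scores.append(100.0 * count / n_children)
    return scores


# Hypothetical pilot data: 4 children (rows) by 3 items (columns).
pilot = [
    [True, True, False],
    [True, False, False],
    [True, True, True],
    [True, True, False],
]
scores = comprehension_scores(pilot)
print(scores)
```

With such scores in hand, items can be sorted into the bands reported in the pilot results (kept, revised, or withdrawn).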
In parallel, one of the child’s parents was asked to fill in the parent form of the PVSQ at home. The PVSQ was administered orally in a test-retest mode to the children and in a written test-retest mode to the parents. Additionally, the children were recorded on a sustained /a/ and on three sentences;35 these voice samples were later used for a perceptual rating of voice quality. Maximum phonation time and vital capacity were assessed and the phonation quotient was computed. Parents also answered an anamnestic questionnaire. The administration of the PVSQ to the DP group was done by two experienced speech language pathologists (8–15 years of experience), specialized in voice and working at the ENT service. The administration to the NP group was done by either an experienced speech language pathologist or a final-year speech pathology student specializing in voice who had received special training in the protocol of the study under the supervision of an experienced speech language pathologist.
Inclusion criteria
Children in both the NP and DP groups had to have French as their mother tongue and no associated speech or language disorders; this
was checked by means of the anamnestic questionnaire addressed to the parents; no formal assessment was undertaken by the examiners. Children in the NP group must never have consulted for voice problems and their voice had to be rated as normophonic on a perceptual evaluation of voice quality. Children in the DP group had to have consulted for voice problems at a clinic, have a diagnosis of dysphonia, and not be receiving speech therapy at the time of the test and the retest. The test was undertaken during their first visit to the clinic or during the first week after that visit. The retest was done 2 weeks later. No speech therapy was undertaken during that time and the parents and the children were asked not to talk about the questionnaire until after the retest. The retest was followed by a debriefing involving the child, the parents, and the speech therapist conducting the study. Treatment, if recommended, was discussed and started after the retest. A third group of children was eventually created, as several children reported as normophonic by their parents were perceptually identified as dysphonic; this group was labeled nonconsulting dysphonics (ncDP). The schools were notified about the identified dysphonic children, with a recommendation for them to consult a voice clinic, but the follow-up of these children could not be ensured by the present study.
Perceptive judgments
All voice samples were perceptually evaluated by three experienced speech pathologists (3–20 years of experience in the field). Perceptive evaluation of voice quality is a well-researched field and it is commonly accepted that, although the ear remains a very precise evaluation tool, the human brain is a source of subjectivity and is sensitive to biases linked to context, scale type, and parameters evaluated.36,37 The objective of our perceptual rating was not to precisely grade the voice samples but to exclude possible DP children, unaware of their dysphonia, from the NP group.
We thus chose a binary classification task: the three judges had to rate each voice sample as either normophonic or dysphonic. If all three speech pathologists rated both of a child’s speech samples (vowel and sentences) as normophonic, the child was included in the NP group. If all three speech pathologists rated both speech samples as dysphonic, the child was included in the ncDP group. If the speech pathologists did not agree on the rating of one or both samples, or if the vowel was judged differently from the sentences, the child’s voice status was considered uncertain and the child was excluded from all subgroups. The children who could not be recorded or who had defective recordings, and school children whose parents reported that they had suffered from dysphonia or were under treatment for dysphonia during the time of the study, were also excluded from the subgroups. Those children were, however, included in the total sample for test-retest reliability testing.
Statistics
The raw scores obtained on the questionnaire were analyzed for internal consistency with Cronbach’s alpha. No specific target alpha value was set; items were removed until alpha was optimized. An item response modeling (IRM) approach38 was used to analyze the response patterns of our subjects on the questionnaires.
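As an illustration of the internal consistency analysis, the sketch below computes Cronbach’s alpha and the "alpha if item deleted" values that can guide item removal. It is a minimal sketch, not the procedure actually run for the study: the function names and the tiny data set are hypothetical, and the article does not state which software or removal rule was used.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents (one list of item scores each)."""
    k = len(rows[0])
    item_variances = [pvariance([row[i] for row in rows]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(rows):
    """Alpha recomputed with each item left out in turn."""
    k = len(rows[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(row) if j != i] for row in rows])
        for i in range(k)
    ]


# Hypothetical data: items 0 and 1 agree perfectly, item 2 behaves erratically.
rows = [
    [0, 0, 3],
    [1, 1, 0],
    [2, 2, 2],
    [3, 3, 1],
]
alphas = alpha_if_item_deleted(rows)
worst_item = max(range(len(alphas)), key=lambda i: alphas[i])
print(cronbach_alpha(rows), alphas, worst_item)
```

Here `worst_item` is the item whose removal raises alpha the most; repeating this step until alpha stops improving is one possible reading of "items were removed until alpha was optimized".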
IRM theories assume that subjects having more numerous and more severe grades of, in this case, vocal symptoms also have a higher probability of endorsing any given questionnaire item than subjects with less numerous and less severe grades of vocal symptoms. Also, the probability of endorsing a less severe item or item grade is higher than that of endorsing a severe item or item grade. Based on these assumptions, an ability score is computed for each subject and a difficulty score for each item. If the items and answer alternatives of our questionnaire are valid, the more severe answer alternatives (that is, symptoms occurring more frequently) should obtain a higher difficulty grade than less severe answer alternatives (symptoms occurring less frequently or never). Also, the DP subjects should obtain higher ability grades than NP subjects, as we suppose they experience more numerous and more severe voice symptoms. The IRM theory relies on the assumption that all items measure the same construct; therefore, an item’s fit to the model is tested before difficulty or ability scores are computed and nonfitting items are analyzed separately. We used the freeware “Construct Map”39 to carry out the IRM analysis. Item fit mean square statistics were computed for each item according to a partial credit model. Mean square values of 0.75 and 1.33 and t values of −2 and 2 were set as model fit boundaries (Adams and Khoo, 1996 in Ref.38). Finally, traditional statistics were used to explore group differences (Mann-Whitney U test and the Wilcoxon test), parent and child correlations, and test-retest correlations (Spearman’s rho). The significance threshold was set at P = 0.01 for all analyses.
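To make the partial credit model and the fit boundaries more concrete, the sketch below computes category probabilities under the standard partial credit formulation and checks an item against the fit boundaries quoted above. It is a sketch under stated assumptions: the function names and the threshold values are hypothetical, the internal parameterization of the software used in the study is not described in the article, and reading the t boundary as symmetric around zero is an assumption.

```python
import math


def pcm_probabilities(theta, thresholds):
    """Category probabilities for one item under a partial credit model.

    theta      : subject ability (logits)
    thresholds : step difficulties delta_1..delta_m (logits)
    Returns probabilities for answer grades 0..m.
    """
    # Cumulative sums of (theta - delta_j); grade 0 has an empty sum.
    logits = [0.0]
    for delta in thresholds:
        logits.append(logits[-1] + (theta - delta))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


def within_fit_bounds(infit_msq, t_value, msq_bounds=(0.75, 1.33), t_limit=2.0):
    """True when both the infit mean square and the t value are acceptable."""
    low, high = msq_bounds
    return low <= infit_msq <= high and abs(t_value) <= t_limit


# Hypothetical item with three steps (four answer grades, 0-3).
probs = pcm_probabilities(theta=0.0, thresholds=[-1.0, 0.0, 1.0])
print(probs)

# Fit values reported in the article for the two removed items.
print(within_fit_bounds(0.67, 2.2))  # parental item 20
print(within_fit_bounds(1.32, 3.1))  # child item 21
```

A more able subject (larger `theta`) shifts probability mass toward the higher grades, which is the ordering assumption the text describes.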
RESULTS
Pilot study
The population consisted of 42 NP children (boys = 21 and girls = 21) aged 5–13 years (mean, 9 years), attending the last kindergarten year to the last primary school year. Distraction was only observed in two 5-year-old boys; the rest of the children had no difficulty concentrating throughout the test. The test situation seemed to be well understood by children aged 6 years and older; however, the younger children had difficulty responding adequately during the administration: they tried to find a “correct” answer and had difficulty using the answer modalities. They preferred to answer “yes” or “no” although they were reminded to use the whole range of answer options. The pictorial answer mode proved helpful, especially to children younger than 9 years, who preferentially pointed to the symbol corresponding to their answer, whereas older children often read aloud the written sentence; we thus chose to keep both the symbols and the text on the answer options. Twenty-four of the 31 items obtained a comprehension score between 75% and 100%, eight items obtained a comprehension score between 59% and 72%, and two items obtained a comprehension score of 46% and 26%, respectively. Comprehension difficulties were linked either to specific terms that were not understood or to the fact that voice was confused with articulation or language.
The items obtaining a comprehension score between 59% and 75% were revised according to the modifications that the examiner had made during the administration of the test and that had proved useful, and four of them were contextualized to activities specific to the child. The two items obtaining the lowest comprehension scores were withdrawn.

FIGURE 1. Infit mean squares value for the 19 items of the child form.

FIGURE 2. Infit mean squares value for the 19 items of the parental form.
FIGURE 3. Subject ability scores and item ability scores for the child form. Each X represents three subjects.
Following the various mix-ups of voice with articulation and/or speech problems, an illustrative text on voice was composed to be read to each child before the administration of the questionnaire. The questionnaire used for the subsequent validation study was composed of 29 items with four answer alternatives for each item.
Validation procedure
Descriptive statistics. We administered the questionnaire to 333 children in total (154 boys and 179 girls with a mean age of 9 years; standard deviation [SD]: 1.8). We lost 25 children at the retest. Parents of 269 children filled in the questionnaire at the test and 219 filled in the retest. Mean test-retest delay was 19.3 days for the children (SD: 7.6) and 24.3 days for the parents (SD: 9.6). Fifty-eight children could not be recorded or had defective recordings (saturation or background noise) and seven children were reported as being under treatment for dysphonia or having suffered from dysphonia. One hundred sixty-three children were identified as normophonic and were assigned to the NP group. Thirty-four children
were unanimously judged as dysphonic although their parents reported no voice problems and had never consulted for voice problems; those children were assigned to the ncDP group. The voice samples from 26 children were judged differently by different judges and were not included in any subgroup. The consulting dysphonic children numbered 45; 24 of them could participate in the test; all other children participated at the retest. Mean time delay between the test and the retest was 19.3 days (SD: 7.6) for the children and 24.3 days (SD: 9.6) for the parents. Thirty fathers participated in the test and 25 in the retest; the rest of the parent respondents were mothers. Mean age was 9 years (SD: 1.8; minimum: 5.3; maximum: 13.3); there were no significant group differences in age as shown by a t test analysis (NP/ncDP: P > 0.5; ncDP/DP: P > 0.1; DP/NP: P > 0.1). There were seven missing values on the child questionnaire (three at the test, distributed across three subjects and two items; and four at the retest, distributed across two subjects and three items), accounting for 0.05% of the expected data; they were replaced by the group’s item mean. There were 46 missing values on the parental questionnaire (36 at the test, distributed across 24 subjects and 15 items; and 10 at the retest,
distributed across 11 subjects and six items) accounting for 0.44% of the expected data. One item, “Is your child sad because of his/her voice,” accounted for 30% of the missing data, whereas the rest of the missing values were evenly distributed across the remaining 15 items. Two of the parents accounted for 27.8% of the missing data at the test and were removed from further analysis; the remaining missing values were evenly distributed across the other 33 subjects and were replaced by the group’s item mean.

FIGURE 4. Subject ability scores and item ability scores for the parental form. Each X represents six subjects.
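The replacement of missing values by the group’s item mean, as described for both questionnaires, can be sketched as follows. The function name and the example data are hypothetical; the sketch assumes that each item has at least one observed answer within the group and that missing answers are marked with `None`.

```python
def impute_group_item_means(responses):
    """Replace missing answers (None) with the item mean of the same group.

    responses: list of rows, one per respondent of a single group
               (for example NP, ncDP, or DP), each row a list of item scores.
    """
    n_items = len(responses[0])
    item_means = []
    for i in range(n_items):
        observed = [row[i] for row in responses if row[i] is not None]
        item_means.append(sum(observed) / len(observed))
    return [
        [item_means[i] if value is None else value for i, value in enumerate(row)]
        for row in responses
    ]


# Hypothetical group of three respondents and two items; one answer is missing.
group = [
    [0, 1],
    [2, None],
    [1, 3],
]
filled = impute_group_item_means(group)
print(filled)
```

The imputation is done separately per group so that a missing answer from, say, a DP parent is not pulled toward the much lower NP means.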
Internal consistency. The initial Cronbach’s alpha on the 29-item child questionnaire was 0.845. Eight items were eventually removed to yield an optimized alpha of 0.879. The initial Cronbach’s alpha value on the parental questionnaire was 0.824. We chose to use the child questionnaire as a baseline for the modifications on the parental questionnaire and removed the same eight items; the resulting Cronbach’s alpha was 0.893 and could not be further optimized. Furthermore, the retests were analyzed without the eight items and yielded Cronbach’s alpha values of 0.876 and 0.757 for the child and the parental questionnaire, respectively.
IRM analyses. Item 20, “Does your child have a problem with his/her voice?,” in the parental questionnaire did not fit the model (infit mean square value: 0.67, t value: 2.2) and item 21, “Is your voice damaged?,” in the child questionnaire had an infit mean square value close to the acceptable boundaries and a high t value (infit mean square: 1.32, t value: 3.1). We removed both items from each questionnaire and reran the analysis on the 19-item questionnaires. This time, item estimate analysis resulted in acceptable infit mean square values for all items on both the parental and the child questionnaire (Figures 1 and 2). All 19 items can thus be assumed to measure the same underlying construct. Difficulty scores for
TABLE 1. Item Score Computing as Done Originally and According to the IRM Analyses

Answer                 Alternative A   Alternative B
Never/Not at all       0               0
Sometimes/A little     1               1
Often/Quite            2               2
Always/Very much       3               2
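Table 1 contrasts the original scoring (alternative A) with the scoring derived from the IRM analyses (alternative B), in which the most severe answer receives the same value as the second most severe one. The sketch below shows how a raw total could be computed under each alternative; the dictionary names, the function name, and the answer vector are hypothetical.

```python
# Answer grades are coded 0-3 (0 = Never/Not at all ... 3 = Always/Very much).
SCORING_A = {0: 0, 1: 1, 2: 2, 3: 3}  # original four-grade scoring
SCORING_B = {0: 0, 1: 1, 2: 2, 3: 2}  # two most severe grades collapsed


def raw_total(answers, scoring):
    """Sum the item scores of one respondent under a given scoring alternative."""
    return sum(scoring[answer] for answer in answers)


# Hypothetical answers of one respondent to five items.
answers = [0, 3, 2, 3, 1]
total_a = raw_total(answers, SCORING_A)
total_b = raw_total(answers, SCORING_B)
print(total_a, total_b)
```

With 19 items, alternative B bounds the raw total between 0 and 38, whereas alternative A allows values up to 57.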
TABLE 2. Descriptive Statistics of Raw Total Score at the Test for NP, ncDP, and DP groups and Total Correlations

Group   Type     Mean    N     SD     Minimum–maximum
NP      Child    7.63    136   4.31   0–23
NP      Parent   0.39    103   0.37   0–3
ncDP    Child    11.06   34    6.93   2–27
ncDP    Parent   1.96    26    2.57   0–8
DP      Child    11.80   44    5.90   1–26
DP      Parent   10.40   42    5.74   4–29
Total   Child    8.91    304   5.86   0–28
Total   Parent   2.69    244   4.58   0–29

Abbreviations: N, number of subjects; SD, standard deviation. Missing values (0.25%) were replaced by group means.
each item and ability scores for each subject were computed; both scores are mapped on the same measurement scale and graphically represented on Wright Maps (Figures 3 and 4). The left-side histogram shows subjects and the right side shows item abbreviations and grades (recall that each item had four grade alternatives for the answers: 0–3). The subjects are ranked from the most able at the top to the least able at the bottom and the items are ranked according to difficulty, with the most difficult at the top and the least difficult at the bottom, on a scale ranging from −6 to 6. Subjects and items facing each other on the Wright Map are comparable in ability and difficulty, which means that the subject facing a specific item and item grade has a theoretical 50% probability of achieving that item and item grade. Item difficulty follows the expected distribution, that is, the more severe grades are the most difficult to obtain and the less severe are the easiest to obtain: consider for example item 3 (Figure 1), “Do you have to push in order to bring out your voice”; the answer “sometimes” (3.1) is achieved more easily than the answer “often” (3.2). Most items’ fourth grade has a difficulty score that exceeds the ability of both parent and child subjects. The most severe answer alternative is not used at all by the parents on several items. Thus, the probability of endorsing that score is very low even for the most able subjects; consequently, little information is provided by that grade, which suggests that it could be collapsed with the second most severe grade. We recomputed the raw score on the basis of a three grade alternative (alternative B in Table 1) and analyzed its correlation with the subjects’ ability score by means of Spearman’s rho. The results
show very high correlations (r: 0.992 for the children and r: 0.977 for the parents, P < 0.001). When we ran the same correlation analysis on the raw score based on the four grade alternatives (alternative A in Table 1), the correlations were lower (r: 0.884 and r: 0.811, P < 0.001 for children and parents, respectively). The collapsed grade alternative can be considered highly representative of the ability scores and a clinical alternative that avoids raw score transformation. The subsequent statistical analyses are based on the raw score computed from the three grade alternative; the score on the formerly removed item 20, “Do you/your child have problems with your/his/her voice,” was used in its four grade alternative for correlation analysis with the raw score on the 19-item questionnaire.
Comparison of mean raw score among groups. Descriptive statistics according to groups are presented in Tables 2 and 3 and in Figure 5. One NP dyad was removed from the analysis because of an aberrant raw score value on the retest for the parent. There is a significant difference between the NP and the DP group for both the child raw total score and the parental raw total score (NP < DP: P < 0.001) and between the NP and the ncDP group for both the children and the parents (NP < ncDP: P = 0.012 and 0.014, respectively). The difference between the ncDP and the DP group is significant for the parents (ncDP < DP: P < 0.001) but not for the children (ncDP = DP: P = 0.411).
Child and parent comparison. The correlations between child and parental raw total scores are presented in Table 4. Correlations are good and significant for the DP group and there is no
TABLE 3. Descriptive Statistics of Raw Total Score at the Retest for NP, ncDP, and DP groups and Total Correlations

Group   Type     Mean    N     SD     Minimum–maximum
NP      Child    6.02    135   4.48   0–25
NP      Parent   0.75    93    1.69   0–14
ncDP    Child    9.56    34    6.76   1–24
ncDP    Parent   1.55    22    2.15   0–8
DP      Child    9.0     25    5.50   0–20
DP      Parent   10.06   18    5.16   3–23
Total   Child    8.91    304   5.86   0–28
Total   Parent   2.69    244   4.58   0–29

Abbreviations: N, number of subjects; SD, standard deviation. Missing values (0.25%) were replaced by group means.
significant difference between raw total scores (P = 0.187), whereas the correlation is low and nonsignificant for the NP and the ncDP groups and the difference between raw scores is significant at P < 0.001 (child > parent).
Between group comparison of mean score at item 20. The descriptive statistics of item 20 are presented in Table 5. There is a significant difference between NP and DP groups for both the children and the parents (NP < DP: P < 0.001), and between ncDP and DP groups for both children and parents (ncDP < DP: P < 0.001). The difference between the NP and ncDP groups is significant only for the parents (NP < ncDP: P = 0.003) and nonsignificant for the children (NP = ncDP: nonsignificant).
Test-retest comparison. There is an overall good correlation between raw total score at the test and the retest for both children and parents (Table 6). Correlation analysis by groups shows medium correlations for the parents in the NP group and high correlations for the ncDP and DP parents. All child groups have high test-retest correlations. To understand the lower test-retest correlation obtained in the NP parents, we cross-tabulated their raw scores at the test and the retest. The range of the raw score is 0–3 at the test and 0–5 at the retest. The results reveal that the scores at the test and the retest are identical for 59.4% of the subjects, that they differ by one point for 25%, by two points for 2.1%, and by three points for 1% of the subjects. There is no significant difference between the raw total score at the test and at the retest in the DP or the ncDP group, for either the parents or the children, whereas the raw total score differs significantly between the test and the retest for the NP children (test > retest: P < 0.001) but not for the NP parents.
Correlation between raw total score and item 20. Correlation of item 20 with the total score at the test is good and significant for the children and high and significant for the parents (Table 7). Analysis by group shows high correlations for both children and parents of the DP group and good correlations for the children and parents of the ncDP group and NP group, respectively.

FIGURE 5. Distributions of the raw total score according to groups.
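The cross-tabulation of the NP parents’ raw scores at the test and the retest amounts to counting how often the two scores are identical or differ by one, two, or more points. A minimal sketch of that tabulation is given below; the function name and the score vectors are hypothetical and do not reproduce the study data.

```python
from collections import Counter


def difference_distribution(test_scores, retest_scores):
    """Percentage of respondents per absolute test-retest score difference."""
    differences = Counter(abs(t - r) for t, r in zip(test_scores, retest_scores))
    n = len(test_scores)
    return {diff: 100.0 * count / n for diff, count in sorted(differences.items())}


# Hypothetical raw total scores of eight parents at the test and the retest.
test_scores = [0, 1, 0, 2, 3, 0, 1, 0]
retest_scores = [0, 1, 1, 2, 1, 0, 3, 0]
distribution = difference_distribution(test_scores, retest_scores)
print(distribution)
```

When scores are confined to a very narrow range, small absolute differences like these can still lower a rank correlation noticeably, which is why the tabulation is a useful complement to Spearman’s rho.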
TABLE 4. Spearman’s Rho Results for the Parent-Child Correlations for the Mean Raw Score at the Test for the NP, ncDP, and DP groups and Total Correlations

Group   NP      ncDP    DP
Rho     0.055   0.032   0.478*

* Correlations are significant at P < 0.001, other correlations are nonsignificant.
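For readers who want to see what lies behind the rho values in Table 4, the sketch below computes Spearman’s rho as the Pearson correlation of average ranks, which handles tied scores. It is a plain-Python illustration with hypothetical function names and data, not the statistical package used for the study, and it does not compute P values. It assumes neither score list is constant.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return covariance / (spread_x * spread_y)


# Hypothetical child and parent raw total scores for five dyads.
child = [4, 9, 12, 15, 20]
parent = [1, 3, 2, 8, 11]
rho = spearman_rho(child, parent)
print(rho)
```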
DISCUSSION
The PVSQ was created to provide a tool permitting the full assessment of child voice as advocated by current guidelines, in which PROMs are strongly recommended.19 In fact, the objective evaluation of dysphonia by aerodynamic or acoustic measures is not, or only poorly, correlated with patients’ self-report of its impact on their daily life.17,40 Patients are affected differently depending on, for example, the occupational demands on their voices, their personality, and their coping styles.40,41 By using parallel child and parent forms, we wanted to study whether the parental reports about their child’s vocal symptoms were interchangeable with the children’s reports. Specific guidelines for the development of PROMs have been delineated to ensure reliable and valid instruments30,42 and the PVSQ has been developed according to those. Item construction was based on a literature search, expert opinion, and focus group interviews that have been reported in a previous publication.32 The 31-item questionnaire that was constructed following this first study was subjected to a pilot study on a group of 42 non-DP children. The aim of the pilot study was to assess the children’s comprehension of the items and of the test situation and their capacity to concentrate throughout the administration of the questionnaire. No feasibility study was done on the parental questionnaire. The results of the pilot study suggested that children from the age of 6 years are capable of understanding the test situation and most of the items and are able to concentrate throughout the questionnaire. The acceptability and the face and content validity of the PVSQ are supported by the first part of this study. The acceptability is further reinforced by the low proportion of missing data (less than 0.5%). The construct validity of the PVSQ was confirmed by a high Cronbach’s alpha on both the parental and child versions after removal of eight items from both versions of the questionnaire.
The IRM analyses further suggested the removal of two additional items; the analysis of this new set of items confirmed good construct validity for all items in both the parental and the child form. Following the item estimate analysis, the most severe answer score proved to be superfluous, and we thus chose to assign it the same value as the second most severe answer score. Raw scores computed in this way proved to be better correlated with the subjects’ proficiency scores than the raw scores computed on
TABLE 5. Descriptive Statistics of Raw Score at Item 20 for the NP, ncDP, and DP Groups and Total
Group | Type | Mean | N | SD | Minimum–maximum
NP | Child | 0.206 | 136 | 0.489 | 0–3
NP | Parent | 0.019 | 103 | 0.139 | 0–1
ncDP | Child | 0.324 | 34 | 0.684 | 0–3
ncDP | Parent | 0.160 | 25 | 0.374 | 0–1
DP | Child | 1.046 | 44 | 0.914 | 0–3
DP | Parent | 1.047 | 43 | 0.844 | 0–3
Total | Child | 0.375 | 304 | 0.716 | 0–3
Total | Parent | 0.216 | 245 | 0.549 | 0–3
Abbreviations: N, Number of subjects; SD, standard deviation.
the three-point model. This scoring can be considered representative of subjects’ proficiency and removes the need for a conversion formula to correctly interpret the raw scores. The test-retest reliability is high for both children and parents at the total group level. When the analysis is split by group, correlations are higher for the DP and ncDP parents than for the NP parents. However, the distribution of the NP parents’ raw total score is restricted both at the test and at the retest (0–3 at the test and 0–5 at the retest), and a cross-tabulation analysis shows that the total raw score was identical at the test and at the retest for nearly 60% of the NP parents. An additional 37% had a difference of only one or two points, which corresponds, for instance, to a change of one point on one or two of the 19 items or of two points on a single item. The remaining 3% of the parents had a maximum discrepancy of four points. We thus believe that the obtained correlation coefficient cannot be considered as reflecting a lack of stability in the NP parents’ answers on the PVSQ. The PVSQ proved able to discriminate between pathological and nonpathological groups at both the parental and the child level: the DP and NcDP children and their parents obtained significantly higher scores than the NP groups, and the IRT analyses also showed that these subjects endorse more severe items and item scores than the NP subjects. These findings support good construct validity of the PVSQ. However, although the PVSQ discriminates between normophonic (NP) and dysphonic (DP and ncDP) children at the group level, scores of the DP and NP children overlap at the individual level. The PVSQ likewise differentiates NP and DP parents at the group level, but at the individual level the scores of the ncDP and NP parents overlap. This is not the case for the DP parents versus the NP parents, as no DP parent scored under four and no NP parent scored over three.
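The scoring rule and the test-retest cross-tabulation described above can be sketched in a few lines of Python. This is a hedged illustration, not the procedure actually run for the study: it assumes that each item is coded 0–3 with 3 as the most severe answer, and the helper names (`collapse_item_score`, `total_raw_score`, `retest_discrepancy_shares`) and the toy totals are hypothetical.

```python
from collections import Counter


def collapse_item_score(score):
    """Give the most severe answer the same value as the second most severe.

    Assumes items are coded 0-3, so the mapping is 0->0, 1->1, 2->2, 3->2.
    """
    if score not in (0, 1, 2, 3):
        raise ValueError("item scores are assumed to lie in 0..3")
    return min(score, 2)


def total_raw_score(item_scores):
    """Total raw score of one respondent after collapsing each item."""
    return sum(collapse_item_score(s) for s in item_scores)


def retest_discrepancy_shares(test_totals, retest_totals):
    """Share of respondents at each absolute test-retest difference."""
    if len(test_totals) != len(retest_totals) or not test_totals:
        raise ValueError("need two non-empty lists of equal length")
    differences = Counter(
        abs(t - r) for t, r in zip(test_totals, retest_totals)
    )
    n = len(test_totals)
    return {diff: count / n for diff, count in sorted(differences.items())}


if __name__ == "__main__":
    # Hypothetical answers of one respondent to four items.
    print(total_raw_score([0, 1, 2, 3]))  # prints 5
    # Hypothetical total scores of four parents at test and retest.
    print(retest_discrepancy_shares([0, 1, 2, 3], [0, 1, 3, 1]))
    # prints {0: 0.5, 1: 0.25, 2: 0.25}
```

Tabulating the share of identical and near-identical totals in this way complements a correlation coefficient when, as for the NP parents, the score range is too narrow for the coefficient alone to be informative.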
Our results show that the answers obtained from the parental proxies on the PVSQ are representative, at the group level, of the children’s answers in a consulting population but not in a nonconsulting population, whether the children are dysphonic or normophonic. Because voice treatment in children demands active involvement of both the child and his/her parents, we think that the parallel form is useful in motivating both children and parents. As was noticed during the debriefing sessions after the completion of the retest, parents and children were sometimes surprised by each other’s answers. The parallel forms supplied a basis for a structured discussion with the child and his/her parents about the actual voice issues, from which an adapted therapy plan could be elaborated, actively involving child and parents from the start. Furthermore, the parallel forms of the PVSQ make it possible for future studies to analyze and better understand the discrepancies in the perception of voice symptoms by parents and children. Although this was not the main aim of this study, some interesting information could already be drawn from our results. We observed that the NcDP parent group displays scores that are significantly higher than those of the NP parental group but significantly lower than those of the DP parental group, whereas their children’s mean raw scores, significantly higher than the NP children’s, are comparable to those of the DP child group. Thus, it seems that the triggering factor for seeking medical care for a voice problem is a high parental awareness of the children’s vocal symptoms. On the other hand, high awareness of vocal symptoms in the child that is not acknowledged by the parents (as in the ncDP group) will not lead to a medical consultation. It also seems that high awareness of vocal symptoms in the children is not a consequence of high parental awareness because the ncDP children report similar levels of
TABLE 6. Spearman’s Rho Results for the Test-Retest Correlations for the Mean Raw Score for the NP, ncDP, and DP Groups and Total
Group | Type | Correlation (rho)
NP | Child | 0.590
NP | Parent | 0.439
ncDP | Child | 0.744
ncDP | Parent | 0.849
DP | Child | 0.662
DP | Parent | 0.704
Total | Child | 0.650
Total | Parent | 0.707
All correlations are significant at P < 0.001.
Journal of Voice, Vol. 26, No. 4, 2012
TABLE 7. Spearman’s Rho Results for the Correlations Between the Mean Raw Score and Item 20 for the NP, ncDP, and DP Groups and Total
Group | Type | Correlation (rho)
NP | Child | 0.512
NP | Parent | 0.584
ncDP | Child | 0.595
ncDP | Parent | 0.540
DP | Child | 0.682
DP | Parent | 0.653
Total | Child | 0.499
Total | Parent | 0.620
All correlations are significant at P < 0.001.
vocal symptoms as the DP children but have parents who score significantly lower than the DP parents. The reason for this discrepancy could not be explored in this study, but it is interesting to note that the ncDP children have a low mean score on item 20, “Do you have a problem with your voice?”. Thus, although the level of vocal symptoms experienced by NcDP and DP children is equivalent, their interpretation of those symptoms as a problem is different. The parental groups behave differently from the child groups on this item: not surprisingly, the DP group has significantly higher scores on this item than both the NcDP and NP groups; but interestingly, the NcDP group has significantly higher scores than the NP group. This could reflect the NcDP parents’ awareness of their children’s voice status. It might be hypothesized that the NcDP parents do not consult because their children do not express a vocal problem, despite a high level of voice symptoms. The DP parents, in contrast, might be consulting with their children because the children regard their vocal symptoms as a voice problem. However, our results do not permit us to draw any conclusions on this matter, and it cannot be excluded that the DP children’s score on this item is increased because their parents regard the vocal symptoms as problematic or by the sole fact that they are taken to a doctor, an action that probably highlights the voice symptoms as an actual problem. Few HRQoL instruments exist that make use of parallel child and parent reports, making it difficult to understand the nature and the source of discrepancies between parental and child reports. However, differences in opinion, the child’s inability to evaluate subjective domains, and differences in response style are suspected to contribute.28 The PVSQ appears to be a valid tool to investigate and contribute to the understanding of the discrepancies between child and parental perception of dysphonia.
Future studies should investigate whether the scores on the PVSQ have a predictive value for treatment adherence and treatment outcome. It could be hypothesized that high PVSQ scores in both adult and child would favor treatment success, whereas divergent scores, in either direction, or low scores in both child and adult would be unfavorable. The PVSQ’s sensitivity to change will also have to be investigated in future studies, as well as the upper age limit for its use in children.
CONCLUSION The PVSQ is a parallel-form self-evaluation tool adapted to dysphonic children and their parents’ subjective evaluation of the child’s vocal symptoms. The instrument has been developed according to PROM development guidelines and the final version is a valid and reliable questionnaire permitting both child self-report from the age of 6 years and a parallel parental proxy evaluation of the same symptoms. It is compact and score computing is easy, which is an advantage in clinical settings. The responsiveness of the PVSQ to treatment is currently under evaluation.
REFERENCES
1. Van Houtte E, Van Lierde K, D’Haeseleer E, Claeys S. The prevalence of laryngeal pathology in a treatment-seeking population with dysphonia. Laryngoscope. 2010;120:306–312.
2. Coyle SM, Weinrich BD, Stemple JC. Shifts in relative prevalence of laryngeal pathology in a treatment-seeking population. J Voice. 2001;15:424–440.
3. McKinnon DH, McLeod S, Reilly S. The prevalence of stuttering, voice, and speech-sound disorders in primary school students in Australia. Lang Speech Hear Serv Sch. 2007;38:5–15.
4. Carding PN, Roulstone S, Northstone K, ALSPAC Study Team. The prevalence of childhood dysphonia: a cross-sectional study. J Voice. 2006;20:623–630.
5. Duff MC, Proctor A, Yairi E. Prevalence of voice disorders in African American and European American preschoolers. J Voice. 2004;18:348–353.
6. Milutinovic Z. Social environment and incidence of voice disturbances in children. Folia Phoniatr Logop. 1994;46:135–138.
7. Kiliç M, Okur E, Yildirim I, Güzelsoy S. The prevalence of vocal fold nodules in school age children. Int J Pediatr Otorhinolaryngol. 2004;68:409–412.
8. ESF/SCSS Exploratory Workshop. Voice Development, Assessment, Education and Care in Childhood and Adolescence. University of London: United Kingdom; May 2002.
9. Signorelli ME, Madill CJ, McCabe P. The management of vocal fold nodules in children: a national survey of speech-language pathologists. Int J Speech Lang Pathol. 2011; [Epub ahead of print].
10. Nienkerke-Springer A, McAllister A, Sundberg J. Effects of Family Therapy on Children’s Voices. J Voice. 2005;19:103–113.
11. Lee E, Son Y. Muscle tension dysphonia in children: voice characteristics and outcome of voice therapy. Int J Pediatr Otorhinolaryngol. 2005;69:911–991.
12. Trani M, Ghidini A, Bergamini G, Presutti L. Voice therapy in pediatric functional dysphonia: a prospective study. Int J Pediatr Otorhinolaryngol. 2007;71:379–384.
13. Meredith Morgan L, Theis Shannon M, McMurray J Scott, Zhang Yu, Jiang Jack J. Describing pediatric dysphonia with nonlinear dynamic parameters. Int J Pediatr Otorhinolaryngol. 2008;72:1829–1836.
14. Wuyts F, Heylen L, Mertens F, De Bodt M, Van de Heyning P. Normative voice range profiles of untrained boys and girls. J Voice. 2002;16:460–465.
15. Schneider B, Zumtobel M, Prettenhofer W, Aichstill B, Jocher W. Normative Voice Range Profiles in Vocally Trained and Untrained Children Aged Between 7 and 10 Years. J Voice. 2010;24:153–160.
16. Weinrich B, Salz B, Hughes M. Aerodynamic measurements: normative data for children ages 6:0 to 10:11 years. J Voice. 2005;19:326–339.
17. Woisard V, Bodin S, Yardeni E, Puech M. The Voice Handicap Index: correlation between subjective patient response and quantitative assessment of voice. J Voice. 2006;21:623–631.
18. Jacobson BH, Johnson A, Grywalski C, Silbergleit A, Jacobson G, Benninger MS, Newman CW. The Voice Handicap Index (VHI) development and validation. Am J Speech Lang Pathol. 1997;6:66–70.
19. Dejonckere PH, Bradley P, Clement P, et al. A basic protocol for functional assessment of voice pathology, especially for investigating the efficacy of (phonosurgical) treatments and evaluating new assessment techniques. Guideline elaborated by the Committee on Phoniatrics of European Laryngological Society (ELS). Eur Arch Otorhinolaryngol. 2001;258:77–82.
20. Hartnick CJ. Validation of a pediatric voice quality-of-life instrument: the pediatric voice outcome survey. Arch Otolaryngol Head Neck Surg. 2002;128:919–922.
21. Boseley ME, Cunningham MJ, Volk MS, Hartnick CJ. Validation of the pediatric voice-related quality-of-life survey. Arch Otolaryngol Head Neck Surg. 2006;132:717–720.
22. Zur KB, Cotton S, Kelchner L, Baker S, Weinrich B, Lee L. Pediatric Voice Handicap Index (pVHI): a new tool for evaluating pediatric dysphonia. Int J Pediatr Otorhinolaryngol. 2007;71:77–82.
23. Theunissen NCM, Vogels TGS, Koopman HM, Verrips GHW, Zwinderman KAH, Verloove-Vanhorick SP. The proxy problem: child report versus parent report in health-related quality of life research. Qual Life Res. 1998;7:387–397.
24. le Coq EM, Boeke AJP, Bezemer PD, Colland VT, van Eijk JT. Which source should we use to measure quality of life in children with asthma: The children themselves or their parents? Qual Life Res. 2000;9:625–636.
25. Eiser C, Morse R. Can parents rate their child’s health-related quality of life? Results of a systematic review. Qual Life Res. 2001;10:347–357.
26. Jokovic A, Locker D, Guyatt G. How well do parents know their children? Implications for proxy reporting of child health-related quality of life. Qual Life Res. 2004;13:1297–1307.
27. Davis E, Waters E, Mackinnon A, Reddihough D, Graham K, Lehmet-Radji O, Boyd R. Paediatric quality of life instruments: a review of the impact of the conceptual framework on outcomes. Dev Med Child Neurol. 2006;48:311–318.
28. Davis E, Nicolas C, Waters E, Cook K, Gibbs L, Gosh A, Ravens-Sieberer U. Parent-proxy and child self-reported health-related quality of life: using qualitative methods to explain the discordance. Qual Life Res. 2007;16:863–871.
29. Upton P, Lawford J, Eiser C. Parent-child agreement across child health-related quality of life instruments: a review of the literature. Qual Life Res. 2008;17:895–913.
30. National Centre for Health Outcomes Development, University of Oxford Patient-reported Health Instruments Group (formerly the Patient-Assessed Health Outcomes Programme) Report to the UK Department of Health July 2001. Instruments for Children and Adolescents: A Review. http://phi.uhce.ox.ac.uk/pdf/phig_children_report.pdf. Last accessed May 5, 2011.
31. Connor N, Cohen S, Theis S, Thibeault S, Heatley D, Bless D. Attitudes of children with dysphonia. J Voice. 2008;22:197–209.
32. Verduyckt I, Remacle M, Jamart J, Benderitter C, Morsomme D. Voice-related complaints in the pediatric population. J Voice. 2011;25:373–380.
33. Reichenberg K, Broberg AG. Quality of life in childhood asthma: use of the Paediatric Asthma Quality of Life Questionnaire in Swedish sample of children 7 to 9 years old. Acta Paediatr. 2000;89:989–995.
34. Rebok G, Riley A, Forrest C, Starfield B, Green B, Robertson J, Tambor E. Elementary school-age children’s reports of their health: a cognitive interviewing study. Qual Life Res. 2001;10:59–70.
35. Combescure P. 20 lists of phonetically balanced sentences. Revue d’Acoustique. 1981;56:34–38.
36. Shrivastav R. The use of an auditory model in predicting perceptual ratings of breathy voice quality. J Voice. 2003;17:502–512.
37. Gerratt B, Kreiman J. Theoretical and methodological development in the study of pathological voice quality. J Phon. 2000;28:335–342.
38. Wilson M. Constructing Measures: An Item Response Modeling Approach. Mahwah, NJ: Lawrence Erlbaum Associates; 2005.
39. BEAR Center, University of California, Berkeley. Construct Map. Available at: http://bearcenter.berkeley.edu/. Last accessed May 5, 2011.
40. Wheeler K, Collins S, Sapienza C. The relationship between VHI scores and specific acoustic measures of mildly disordered voice production. J Voice. 2006;20:308–317.
41. Yiu E, Ho E, Ma E, Verdolini Abbott K, Branski R, Richardson K, Li N. Possible cross-cultural differences in the perception of impact of voice disorders. J Voice. 2011;25:348–353.
42. Assessing health status and quality of life instruments: attributes and review criteria. Qual Life Res. 2002;11:193–205.