Tinnitus Functional Index: Development, validation, outcomes research, and clinical application


Hearing Research xxx (2015) 1–7

Contents lists available at ScienceDirect

Hearing Research journal homepage: www.elsevier.com/locate/heares

Research paper

James A. Henry a, b, *, Susan Griest a, b, Emily Thielman a, Garnett McMillan a, Christine Kaelin a, Kathleen F. Carlson c, d

a VA RR&D National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Portland, OR, USA
b Department of Otolaryngology, Oregon Health & Science University, Portland, OR, USA
c VA HSR&D Center of Innovation, VA Portland Health Care System, Portland, OR, USA
d Department of Public Health and Preventive Medicine, Oregon Health & Science University, Portland, OR, USA

Article info

Article history: Received 1 March 2015; received in revised form 27 May 2015; accepted 2 June 2015; available online xxx

Abstract

The Tinnitus Research Consortium (TRC) issued a Request for Proposals in 2003 to develop a new tinnitus outcome measure that would: (1) be highly sensitive to treatment effects (validated for “responsiveness”); (2) address all major dimensions of tinnitus impact; and (3) be validated for scaling the negative impact of tinnitus. A grant was received by M. Meikle to conduct the study. In that observational study, all of the TRC objectives were met, with the final 25-item Tinnitus Functional Index (TFI) containing eight subscales. The study was published in 2012, and since then the TFI has received increasing international use and is being translated into at least 14 languages. The present study utilized data from a randomized controlled trial (RCT) that tested the efficacy of “telephone tinnitus education” as an intervention for bothersome tinnitus. These data were used to confirm results from the original TFI study. Overall, the TFI performed well in the RCT, with Cohen's d being 1.23. There were large differences between the eight subscales, ranging from a mean 13.2-point reduction (for the Auditory subscale) to a mean 26.7-point reduction (for the Relaxation subscale). TFI performance was compared with that of the Tinnitus Handicap Inventory. All of the results confirmed the sensitivity of the TFI and its subscales. This article is part of a Special Issue entitled .

Keywords: Randomized controlled trial; Tinnitus assessment; Tinnitus severity; Tinnitus distress; Tinnitus dimensions

Published by Elsevier B.V.

Abbreviations: BPVAMC, Bay Pines VA Medical Center; CBT, cognitive-behavioral therapy; CC, Cleveland Clinic; HSI, Hearing and Speech Institute; IC, immediate care; JAHVA, James A. Haley VA Hospital; OHSU, Oregon Health & Science University; POQ-VA, Pain Outcomes Questionnaire-VA; PTM, Progressive Tinnitus Management; RCT, randomized controlled trial; TFI, Tinnitus Functional Index; THI, Tinnitus Handicap Inventory; TTE, telephone tinnitus education; TRC, Tinnitus Research Consortium; VA, Veterans Affairs; WLC, wait-list control

* Corresponding author. VA Portland Health Care System, Mail Code: NCRAR, PO Box 1034, Portland, OR, 97207, USA. Tel.: +1 503 220 8262x57466 (office/voicemail); fax: +1 503 402 2955. E-mail address: [email protected] (J.A. Henry).

http://dx.doi.org/10.1016/j.heares.2015.06.004
0378-5955/Published by Elsevier B.V.

Please cite this article in press as: Henry, J.A., et al., Tinnitus Functional Index: Development, validation, outcomes research, and clinical application, Hearing Research (2015), http://dx.doi.org/10.1016/j.heares.2015.06.004

1. Introduction

Based on epidemiology studies, 10–15% of adults experience chronic, persistent tinnitus (Heller, 2003; Hoffman and Reed, 2004; Shargorodsky et al., 2010). For about 20% of these individuals, the tinnitus is “bothersome,” disrupting sleep and concentration, and causing negative emotional reactions (Jastreboff and Hazell, 1998; Davis and Refaie, 2000; Krog et al., 2010; Cima et al., 2011). Although a cure for tinnitus is actively being sought, currently there is no proven means of eliminating tinnitus, or even of reducing its loudness (Henry et al., 2014). Patients with bothersome tinnitus must learn management techniques, and numerous behavioral methods exist for this purpose (Cima et al., 2014; Hoare et al., 2014). The method with perhaps the strongest evidence, based on randomized controlled trials (RCTs), is cognitive-behavioral therapy (CBT) (Martinez-Devesa et al., 2010; Tunkel et al., 2014). Research is ongoing to evaluate the effectiveness of existing behavioral methods and to develop new methods. These methods are not designed to reduce or change the perception of tinnitus, but are intended to reduce reactions to tinnitus and thereby improve quality of life. A separate line of research focuses on “treatments” for tinnitus that are intended primarily to reduce the loudness (intensity or magnitude) of tinnitus (Schmidt et al., 2014). Such treatments include pharmaceutical drugs, specialized acoustic protocols, and various alternative methods such as electrical and magnetic stimulation (Folmer et al., 2014). By reducing the loudness of tinnitus, it is expected that reactions to tinnitus would also be reduced (Schmidt et al., 2014).

Prior to the year 2000, there were at least nine well-known questionnaires, each of which was statistically validated for intake assessment (Meikle et al., 2012). None, however, was validated for assessing outcomes, which would have required being prospectively designed and tested to maximize responsiveness to change in outcomes related to intervention. Further, these questionnaires did not cover all dimensions of tinnitus functional impact, and each differed with respect to formatting, scaling, and wording of individual items. These differences made it difficult to compare outcomes between clinics and between clinical trials, thus resulting in a lack of available systematic reviews, which are important to determine the clinical effectiveness of different interventions (Kamalski et al., 2010).

Some explanation is needed as to why none of these questionnaires was validated for responsiveness. The importance of measurement sensitivity and responsiveness was not fully recognized by researchers until the 1980s and 1990s. Lipsey (1990) provided guidance for selecting measures that would be sensitive to change in intervention studies. Lipsey and Cordray (2000) reported “the characteristics that make a measure sensitive to individual differences on a construct of interest are not necessarily the ones that make them sensitive to change on that construct over time” (p. 355). These concepts were not familiar to the tinnitus researchers who developed these original questionnaires. Since then, responsiveness and measurement sensitivity for intervention studies have been extensively researched.
To address this gap in current tinnitus questionnaires, the Tinnitus Research Consortium (TRC) in 2003 issued a Request for Proposals for a study to develop a new self-report questionnaire they pre-named the Tinnitus Functional Index (TFI). The TRC stipulated criteria for developing the TFI, most importantly that the new questionnaire be validated both for use in intake assessment and for sensitivity (responsiveness) to intervention-related changes in the functional effects of tinnitus. Additionally, the TRC specified that the TFI: (1) employ 10 specific domains of negative tinnitus impact; (2) avoid overly-negative items (i.e., items that “catastrophize”); (3) not use items that refer only to hearing loss (and not tinnitus) or that pertain to more than one domain; (4) use only items having high construct validity for scaling of tinnitus severity; (5) use Likert-type response scales to provide high resolution of responses; and (6) use unambiguous wording that also addresses low health literacy.

In response to the Request for Proposals, Dr. Mary Meikle, a tinnitus researcher at Oregon Health & Science University (OHSU), submitted a grant proposal to develop the TFI. Her application (with JAH as Co-Principal Investigator) was approved, and the study was funded in 2004.

The approach of the TFI study was based on the model used by Clark et al. (2003) to develop the Pain Outcomes Questionnaire-VA (POQ-VA). That study addressed the need for Veterans Affairs (VA) hospitals to have available a uniform method of measuring the effectiveness of treatment for chronic pain. The treatment of chronic pain was conceptualized as a complex phenomenon involving multiple domains (behavioral, perceptual, physical, psychosocial) of patient functioning, thus requiring the targeting of treatment to the different domains.
The assessment of change within each major domain was preferred over a single outcome score because individuals exhibit different patterns of function/dysfunction across domains. With a single summary score, treatment-related changes in specific outcomes domains would be obscured. At the time the POQ-VA was developed, no pain outcomes instrument was capable of measuring treatment effectiveness within the different domains considered important to a person with chronic pain. As a consequence, pain practitioners used instruments that were developed and validated as clinical pain assessment tools to assess outcomes of treatment for pain. The authors conducted a 5-year study to develop the POQ-VA and to establish its reliability and validity for evaluating the effectiveness of treatment for chronic pain.

The study to develop the POQ-VA was conducted at six VA pain centers to ensure a sufficient number of subjects for valid statistical evaluation (Clark et al., 2003). A total of 957 subjects completed the POQ-VA using a two-stage, iterative process of data collection and analysis to refine the pool of potential items for the final instrument. Treatment was conducted at each center as per usual standard of care. This approach also resulted in more diverse, generalizable data.

The project to develop and validate the TFI was similarly conducted at multiple sites, which included: Bay Pines VA Medical Center (BPVAMC), Bay Pines, FL; Cleveland Clinic (CC) Tinnitus Management Clinic, Cleveland, OH; Hearing and Speech Institute (HSI), Portland, OR; James A. Haley Veterans' Hospital (JAHVA), Tampa, FL; and OHSU Tinnitus Clinic, Portland, OR. The CC Tinnitus Management Clinic and OHSU Tinnitus Clinic were “destination clinics” that were sought out by patients with severe reactions to tinnitus. To evaluate the ability of the TFI to assess the full range of patients with respect to differential reactions to tinnitus, sites were included whose patients would typically have less severe tinnitus: BPVAMC, HSI, and JAHVA. A necessary tradeoff was that the VA sites had mostly male patients.

There were three stages of TFI development: (1) item selection and design (construct Prototype 1); (2) test Prototype 1 to derive Prototype 2; (3) test Prototype 2 to derive final TFI.
This collaborative effort required 4 years, and resulted in a publication describing details of TFI development and testing (Meikle et al., 2012). A condensed description of the three stages of work is presented herein, followed by TFI data obtained from an RCT and suggestions for clinical and research application of the TFI.

2. Stage 1: construct prototype 1

Design criteria for constructing the initial prototype included (1) responsiveness (include only items expected to have high sensitivity to treatment-related change); (2) high construct validity (each item should contribute to overall effectiveness in scaling of tinnitus severity); (3) comprehensive coverage (to address the outcomes domains most important to patients); (4) brevity (without compromising comprehensive coverage, limit questionnaire to 25 or fewer items); (5) good resolution for responsiveness, with a Likert-type 0–10 response scale preferred (Nunnally, 1978); (6) clarity of items (minimal reading difficulty); (7) simple scoring of items and of overall questionnaire; and (8) avoidance of overly negative thoughts in questionnaire items.

Three steps were involved in creating TFI Prototype 1: (1) consultation with measurement experts; (2) selection of items; and (3) construction of Prototype 1. The nine existing tinnitus questionnaires provided the initial pool of 175 items (questions) that were identified as addressing important topics. Selection of items to maximize construct validity followed published recommendations to use multiple expert judges and formalized scaling procedures to quantify their judgments (Haynes et al., 1995). Seventeen tinnitus experts agreed to serve on the Item Selection Panel, of which eight had previously been involved in developing tinnitus questionnaires. Each Panel member reviewed all 175 items, using a website that was created for this purpose. Items had their own rating pages, which could be viewed in any order (with correction of previous


responses allowed). For each item, Panel members selected one or more domains they felt the item represented from the list of 10 domains that were recommended by the TRC. They rated each item's sensitivity (low, moderate, high) to treatment-related change in response to the following instructions: “Individual questionnaire items may differ in responsiveness (ability to reflect tinnitus alterations following treatment). Please check one answer below, indicating how responsive you think this questionnaire item would be for measuring treatment-related changes in tinnitus.” If the item addressed a domain that was not on the list, that domain could be added.

Tallying all Panel members' responses, 13 domains of negative tinnitus impact were identified: (1) emotional distress; (2) disturbance of sleep and rest; (3) cognitive interference; (4) intrusiveness; (5) persistence; (6) social distress or impact; (7) work interference; (8) leisure interference; (9) disturbance of relaxation; (10) reduced sense of control; (11) auditory perceptual problems ascribed to tinnitus; (12) somatic and physical complaints due to tinnitus; and (13) impaired quality of life.

Of the 175 items, 70 were judged by the Panel to be responsive to treatment effects. Each of these items also addressed at least one of the 13 specified domains. Questions were then eliminated that were redundant, ambiguous, overly negative, or that referred only to hearing loss or to multiple subtopics within a domain. Elimination of items was also based on item effect sizes that had been obtained from a clinical trial that used four of the nine preexisting questionnaires (Henry et al., 2006). As a result of this elimination process the number of items was reduced from 70 to 35. For reliability purposes, measurement specialists recommend a minimum of 3–4 items for a single domain, with each item addressing one unique subtopic (Fabrigar et al., 1999; Moran et al., 2001).
To have at least three items for each of the 13 domains, eight new items were added, resulting in a total of 43 items for Prototype 1. (Three of the eight new items were adapted from non-tinnitus questionnaires and five were composed.) The 43 items were formatted using a 0- to 10-point Likert-type response scale with item-specific anchors at the scale extremes. A 0–10 response scale is recommended by measurement specialists to provide good resolution for responsiveness (Lipsey, 1990; Turk and Burwinkle, 2005), to optimize reliability (Nunnally, 1978), and because it is a familiar and preferred response format (Castle and Engberg, 2004).

An important formatting issue for questionnaires is the recall interval for each question (U.S. Department of Health and Human Services, 2006). Consultation with measurement experts determined that each block of 3–6 items should use the lead-in phrase “Over the past week …” This relatively brief recall interval was expected to minimize recall errors and to minimize response variability for individuals with a tinnitus condition that varies over time.

Completion of Stage 1 resulted in the 43-item TFI Prototype 1. The 43 items were judged most relevant for addressing the 13 content domains and most likely to be responsive to intervention-related change. Preliminary testing of TFI Prototype 1 was conducted with 10 patients (five at each of two participating sites) reporting bothersome tinnitus, all of whom were asked if they noted any particular difficulties completing the questionnaire. None reported any problems with its administration.

3. Stage 2: test prototype 1 to derive prototype 2

For Stage 2, the 43-item Prototype 1 was administered to patients at the five participating sites. The goal was to identify the best-functioning items from Prototype 1 (with respect to underlying domains [internal structure], responsiveness, and ability to scale tinnitus severity) to use with Prototype 2.
This goal was accomplished by acquiring three classes of data: (1) Baseline (to


assess comprehensiveness and validity for scaling the degree of tinnitus impact). TFI Prototype 1 was mailed to patients complaining of tinnitus prior to their clinic appointment. They were also mailed a brief tinnitus history questionnaire, the Tinnitus Handicap Inventory (THI) (Newman et al., 1996), and the Beck Depression Inventory-Primary Care (Beck et al., 1997). These patients were asked to complete the forms at home and bring them to their appointment, at which time they could choose whether or not to participate in the study (their questionnaire data were not used for the study if they declined). (2) Retest (to assess test-retest reliability; only done at OHSU and HSI). If patients completed the questionnaires at home within 7–30 days prior to their appointment, and if they agreed to participate, they were asked to complete Prototype 1 a second time at the clinic. (3) Follow-up (to assess responsiveness/sensitivity to change resulting from treatment; done only at BPVAMC, CC, and JAHVA). Follow-up questionnaires were mailed at 3, 6, and 9 months after the initial appointment. However, 9-month data were excluded because there were too few responses for a valid statistical analysis of responsiveness at 9 months (which was also the case for Prototype 2).

The focus of Stage 2 was to evaluate the responsiveness of Prototype 1, regardless of type of intervention and treatment efficacy. Some sites provided intervention that would be considered “more intensive,” typically involving counseling, tabletop sound generators, ear-level devices (“maskers” and combination instruments), and medications for comorbid insomnia, anxiety, and depression. Intervention considered “less intensive” involved hearing aids, tinnitus educational materials, and brief counseling.

Across all five study sites, 327 subjects (male = 81%) were enrolled in Stage 2. Each subject was a patient reporting persistent tinnitus who had an upcoming clinical appointment to address the tinnitus.
Of these, 326 completed the Baseline questionnaires, 65 completed the 3-month Follow-up questionnaires, and 43 completed the 6-month Follow-up questionnaires. Subjects at BPVAMC and CC completing Follow-up questionnaires were paid $10.

Prior to analyzing data obtained from Prototype 1, items with floor and ceiling effects, and those that were often left unanswered (presumably due to ambiguity), were identified. Effect sizes (Cohen's d, a standardized, scale-free measure of the relative size of the effect of an intervention expressed in standard deviation units) were calculated for the TFI index score, subscales, and for individual items (Cohen, 1988). The observational nature of the outcome data precluded the computation of effect sizes to compare treatment and control groups. An alternative approach, as recommended by Lipsey (1990), was to compute effect sizes for “criterion groups.” This approach creates cohorts that would differ from one another to the degree that a treatment and control group would differ.

Criterion groups were based on subjects' 3- and 6-month responses to the “Global Perception of Change” item, which asked: “Since the last time you filled out our questionnaire, how would you describe your overall tinnitus status?” Responses could range from 1 to 9 (1 = very much improved; 5 = no change; 9 = very much worse). Because of the small number of Follow-ups, the nine response categories were collapsed to create three criterion groups: (1) 1–4 = “Improved,” (2) 5 = “Unchanged,” and (3) 6–9 = “Worse.” Collapsing in this manner created sample sizes that were minimally adequate to estimate effect sizes (Improved: n = 11; Unchanged: n = 45; Worse: n = 9). Effect sizes for each of the criterion groups were calculated as the difference between the mean Baseline score and the mean Follow-up score, divided by the SD (pooled across scores). According to Cohen (1988), effect sizes (Cohen's d) are rated as “small” (<0.5), “medium” (≥0.5 and <0.8), and “large” (≥0.8).
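The criterion-group procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example scores are hypothetical, the text does not say exactly how the SD was pooled (the sketch pools the Baseline and Follow-up sample SDs weighted by degrees of freedom), and rating an effect by its absolute magnitude is a choice made here, not one described in the study.

```python
from math import sqrt
from statistics import mean, stdev


def criterion_group(gpc_rating):
    """Collapse a 1-9 Global Perception of Change rating into three groups."""
    if not 1 <= gpc_rating <= 9:
        raise ValueError("rating must be between 1 and 9")
    if gpc_rating <= 4:
        return "Improved"
    if gpc_rating == 5:
        return "Unchanged"
    return "Worse"


def pooled_sd(scores_a, scores_b):
    """Pool two sample SDs, weighting by degrees of freedom (an assumption)."""
    var_a, var_b = stdev(scores_a) ** 2, stdev(scores_b) ** 2
    df_a, df_b = len(scores_a) - 1, len(scores_b) - 1
    return sqrt((df_a * var_a + df_b * var_b) / (df_a + df_b))


def effect_size(baseline, follow_up):
    """Mean Baseline score minus mean Follow-up score, divided by pooled SD."""
    return (mean(baseline) - mean(follow_up)) / pooled_sd(baseline, follow_up)


def rate_effect_size(d):
    """Apply the thresholds quoted in the text to the magnitude of d."""
    magnitude = abs(d)
    if magnitude < 0.5:
        return "small"
    if magnitude < 0.8:
        return "medium"
    return "large"


# Hypothetical scores for four subjects in one criterion group.
baseline_scores = [60, 50, 70, 40]
follow_up_scores = [40, 30, 50, 20]
d = effect_size(baseline_scores, follow_up_scores)
print(criterion_group(3), round(d, 2), rate_effect_size(d))
```

With this sign convention a positive effect size corresponds to a reduction in score between Baseline and Follow-up.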
Calculation of effect sizes for each of the three criterion groups revealed effect-size magnitudes that would be reasonably


expected: 0.79 for Improved, and near-zero for Unchanged and Worse. Effect sizes were also computed for each of the 43 items from Prototype 1: seven items had “small” effect sizes; 20 had “moderate” effect sizes, and 14 had “large” effect sizes. Two of the items had negative effect sizes.

To identify key domains of tinnitus impact, exploratory factor analysis was conducted on the Baseline scores for the 43 items, including Principal Axis Factoring and Principal Components Analysis. Each subject was asked the question “How much of a problem is your tinnitus?” Response choices were Not a Problem, Small, Moderate, Big, and Very Big. To obtain the clearest factor solutions, “Not a Problem” responses were omitted, leaving 285 subjects who described their tinnitus problem as at least “Small.” Eight factors were identified that accounted for 80% of the variance: (1) sleep disturbance; (2) emotional effects; (3) intrusiveness of tinnitus; (4) interference with thinking; (5) interference with relaxing; (6) reduced sense of control; (7) reduced quality of life; and (8) interference with hearing.

Briefly, results of Prototype 1 testing revealed: (1) high test-retest reliability (r = 0.92, p < 0.005); (2) high internal consistency reliability (coefficient alpha = 0.99); (3) item–total correlations (0.56–0.91, with 37 correlations ≥ 0.70); (4) criterion-related validity (r = 0.91 with THI; r = 0.73 with Visual Analog Scale of tinnitus severity); (5) clear factorial structure in agreement with expert clinical judgment, accounting for over 80% of variance among 43 items; and (6) high responsiveness to treatment-related changes in tinnitus impact (0.79 effect size for overall TFI score). These results provided the necessary data to develop TFI Prototype 2. Thirty items were selected that together encompassed all eight tinnitus domains and had maximal effect sizes.

4. Stage 3: test prototype 2 to derive final TFI

For Stage 3, a new patient sample was used to evaluate the 30-item Prototype 2 in terms of responsiveness, key domains (internal structure), and scaling of tinnitus impact. Subjects included 347 patients (82% male; average age = 60 years) at four of the participating sites (the HSI discontinued participation). As for the Stage 2 sample, patients with an upcoming clinical appointment to address persistent tinnitus were recruited.

Stage 3 testing procedures were essentially the same as for Stage 2. Also, as for Stage 2, factor analytic techniques were used to confirm the eight domains, and effect sizes were calculated to evaluate sensitivity of items and domains. The best-functioning Prototype 2 items would be used to create the final TFI. To increase compliance, subjects with more problematic tinnitus were recruited, and payment to retest and follow-up subjects was raised from $10 to $20.

Tinnitus severity levels were higher for the Prototype 2 sample than for the Prototype 1 sample. Follow-up data were provided by 155 subjects at 3 months and 85 subjects at 6 months. Retest data were collected only at OHSU (n = 37). Even though the number of items was reduced from 43 to 30, Prototype 2 testing revealed good test-retest reliability, high internal consistency reliability, consistent factor structure, and strong construct validity for scaling tinnitus impact. Responsiveness was moderate at 3 months, and high at 6 months.

These data were satisfactory to develop the final version of the TFI, which would retain at least three items per subscale. The best-functioning items were selected, resulting in the removal of five items: (1) discomfort caused by tinnitus; (2) interference of tinnitus with participation in social events; (3) interference of tinnitus with leisure activities; (4) fatigue caused by tinnitus; and (5) amount of time that overall quality of life was reduced by tinnitus.
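Internal consistency reliability is reported for both prototypes as coefficient alpha. As a rough illustration of what that statistic measures, the sketch below computes coefficient (Cronbach's) alpha from a respondents-by-items table using the standard formula. The function name and the response data are hypothetical, and nothing here reproduces the study's actual analysis or software.

```python
from statistics import variance


def coefficient_alpha(responses):
    """Cronbach's alpha for rows of item scores (one row per respondent).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of row totals)
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    item_columns = list(zip(*responses))  # transpose rows into item columns
    item_variance_sum = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical 0-10 responses from five respondents on a three-item subscale.
example_responses = [
    [8, 7, 9],
    [3, 4, 2],
    [6, 6, 5],
    [1, 2, 2],
    [9, 8, 9],
]
print(round(coefficient_alpha(example_responses), 2))
```

Alpha approaches 1 when respondents who score high on one item also score high on the others, which is why very high values are expected when many items track the same underlying severity.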
The final TFI included eight subscales: Intrusive, Sense of Control, Cognitive, Sleep, Auditory, Relaxation, Quality of Life, and Emotional. Four items were included in the Quality of Life subscale, and there were three items each for the remaining seven subscales. All analyses used for evaluating Prototype 2 were repeated for the 25-item final TFI, using data obtained with the Prototype 2 sample.

5. TFI performance in a randomized controlled study

Data from the original TFI study were obtained using an observational study design. The change scores obtained in that study would be considered preliminary, and results of RCTs are needed to refine the original estimates. We are completing an RCT that has data available to address this issue (ClinicalTrials.gov Identifier: NCT01129141). More specifically, the RCT is testing the efficacy of “telephone tinnitus education” (TTE) as an intervention for bothersome tinnitus. The current RCT follows a pilot study (Henry et al., 2012) that resulted in positive outcome of the intervention, justifying funding of a major RCT to obtain more definitive results.

TTE utilizes the educational counseling that was developed for Progressive Tinnitus Management (PTM). With PTM, the basic tinnitus intervention is termed Level 3 Skills Education and is delivered within the context of the stepped-care program that is defined for PTM (Henry et al., 2009). PTM involves audiologic testing and a brief tinnitus assessment (Level 2) to determine if a patient's tinnitus condition is sufficiently bothersome to warrant Level 3 Skills Education (Henry et al., 2008).

The Level 3 intervention is delivered as five sessions, with two led by an audiologist and three by a behavioral health provider (psychologist, counselor, social worker, or other provider trained to offer CBT). During each session, patients are taught skills to self-manage their reactions to tinnitus. In the audiologist-led sessions, different ways sound can be used for tinnitus management are explained.
Patients identify situations when their tinnitus is most bothersome (for example, falling asleep at night) and are taught how to make a “sound plan” to address each tinnitus problem-situation. Similarly, the behavioral health provider teaches coping skills from CBT and helps patients develop plans for using these skills to self-manage their tinnitus in the most bothersome situations. While these sessions are ideally conducted as group workshops, patients can obtain the same education by attending one-on-one sessions with an audiologist and a behavioral health provider.

Because intervention with PTM involves self-help education, the intervention is adaptable to delivery from a remote location. So far, efficacy evaluation of a remote intervention delivery has been limited to the use of telephone. (The use of audio plus video in the future is envisioned.) For our TTE pilot study, and for the present RCT, subjects were recruited nation-wide, mostly by posting recruitment flyers at different VA hospitals (Henry et al., 2012). Candidates were screened to determine that their tinnitus was a problem to such a degree that intervention was warranted. To qualify, an audiologic evaluation had to have been performed recently and hearing aids must be worn if indicated by the evaluation (to ensure that subjects would not have confounding hearing problems).

For the current RCT, qualified candidates were randomized to either “immediate care” (IC) or wait-list control (WLC) groups. The IC group received telephone appointments right away with an audiologist and a psychologist; a total of six appointments were scheduled over a 6-month period. The telephone education was adapted from PTM Level 3 Skills Education, and covered the same material. The WLC group waited 6 months before starting the intervention. Both groups completed the TFI and the Tinnitus Handicap Inventory (THI) (Newman et al., 1996) at Baseline and at 6 months.
Although this study is not yet complete, sufficient data are available to provide further insight into the functioning of the TFI.


Table 1. Summary statistics using the Tinnitus Handicap Inventory (THI) and Tinnitus Functional Index (TFI) outcome measures in the Telephone Tinnitus Education (TTE) study.

                                 Immediate care                      Wait list
Instrument  Scale                N     6 Month − Baseline   SD       N     6 Month − Baseline   SD
TFI         Intrusive            77    17.14                18.93    90    1.52                 16.32
TFI         Control              77    25.11                24.91    90    1.48                 20.89
TFI         Cognitive            77    23.55                21.82    90    0.74                 18.95
TFI         Sleep                77    22.6                 25.27    90    2.37                 22.06
TFI         Auditory             77    13.18                22.91    90    3.04                 19.92
TFI         Relaxation           77    26.67                25.5     90    4.63                 18.69
TFI         Quality              76    17.14                24.43    90    0.17                 18.5
TFI         Emotional            76    25.83                28.07    90    2.59                 22.75
TFI         TFI Index Score      77    21.21                18.19    90    1.21                 14.27
THI         Functional           77    15.56                19.67    90    1.4                  13.3
THI         Emotional            77    15.77                20.37    90    1.58                 15.34
THI         Catastrophic         78a   20.48                23.74    90    1.83                 17.87
THI         THI Index Score      77    16.7                 18.67    90    0.32                 12.86

a The Catastrophic subscale has n = 78 instead of 77 because one subject missed enough questions that the Functional and Emotional subscales (as well as the overall score) could not be calculated, but did answer enough of the Catastrophic questions that this one subscale could be calculated.
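The effect-size and sample-size calculations described in Section 5.2 can be approximated from the index-score rows of Table 1. The sketch below is illustrative only and rests on several assumptions: the function names are hypothetical; the change scores are entered as negative values (reductions), which is an inference from the text's statements about improvement; the pooled SD is weighted by degrees of freedom; and the per-group sample size uses the usual normal-approximation formula for a two-sample comparison, which may differ from the method the authors used.

```python
from math import ceil, sqrt
from statistics import NormalDist


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference in mean change scores (a minus b) over the pooled SD."""
    pooled_variance = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / sqrt(pooled_variance)


def n_per_group(d, alpha=0.05, power=0.80):
    """Normal-approximation sample size per arm for a two-sided test."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


# Index-score rows of Table 1, computed as WLC minus IC.
# Signs are an assumption: both groups' mean changes are treated as reductions.
tfi_d = cohens_d(-1.21, 14.27, 90, -21.21, 18.19, 77)
thi_d = cohens_d(-0.32, 12.86, 90, -16.70, 18.67, 77)

for name, d in (("TFI", tfi_d), ("THI", thi_d)):
    print(name, round(d, 3), n_per_group(d))
```

Under these assumptions the computed values land close to the effect sizes quoted in the text (d = 1.23 for the TFI and d = 1.04 for the THI), and the larger TFI effect size translates into a smaller required sample per group.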

5.1. Outcomes Two change scores were computed for this analysis: 6 Month TFI e Baseline TFI and 6 Month THI e Baseline THI. Additional change score outcomes were similarly computed for each subscale in each instrument. 5.2. Methods Effect size for each questionnaire was compared using Cohen's d statistic, which was calculated based on the ratio of the difference in mean change scores between IC and WLC and the pooled standard deviation of the difference (Cohen, 1988). Related to Cohen's d statistic, the sample size requirement for a subsequent randomized trial using each change score outcome assuming 80% power and a two-sided level 0.05 test is reported. The latter metric is a measure of the benefit that investigators receive from using either the THI or the TFI in subsequent study designs.

5.3. Results

Table 1 shows the mean change scores for each instrument and subscale for n = 77 (depending on the scale) IC subjects and n = 90 WL subjects providing valid Baseline and 6 month Follow-up data. Negative values indicate average improvement in reaction to tinnitus at 6 months follow-up. Note that across instruments and subscales the IC group had considerably greater improvement (larger negative scores) than the WL group, suggesting some benefit to the TTE telephone-based intervention. The TFI and THI change scores are strongly, linearly related (Pearson's correlation = 0.69, p < 0.0001). Results are plotted in Fig. 1. The solid dots are the IC subjects and the circles show the WL control subjects. This figure emphasizes the similarity between the TFI and THI as outcome instruments.

Table 2 compares the IC and WLC groups across measures. The mean and standard deviation of the difference in the change scores (WLC − IC) are shown in the first two columns. Positive values in the first column indicate greater improvement in the IC group. Cohen's d statistic is shown in the third column. The number of subjects per treatment group that one would need to recruit for an 80% chance of detecting a significant difference in treatments is shown in the final column (‘N per Group’).

Fig. 1. Scatterplot of the Tinnitus Handicap Inventory (THI) and Tinnitus Functional Index (TFI) outcome measure change scores observed in the Telephone Tinnitus Education (TTE) study.

Table 2
Contrasts and design benefits of the Tinnitus Handicap Inventory (THI) and Tinnitus Functional Index (TFI) outcome measures in the Telephone Tinnitus Education (TTE) study.

Instrument  Scale             IC over WL  SD      Cohen's d  N per group
TFI         Intrusive         15.62       17.57   0.89       21
            Control           23.63       22.83   1.03       16
            Cognitive         22.81       20.32   1.12       14
            Sleep             20.23       23.59   0.86       23
            Auditory          16.22       21.35   0.76       29
            Relaxation        22.04       22.09   1.00       17
            Quality           17.30       21.41   0.81       26
            Emotional         23.24       25.32   0.92       20
            TFI Index Score   20.00       16.19   1.23       12
THI         Functional        16.96       16.54   1.03       16
            Emotional         14.19       17.83   0.80       26
            Catastrophic      18.65       20.80   0.90       21
            THI Index Score   16.38       15.81   1.04       16

5.4. Comments

It should first be emphasized that these results are specific to the treatment used (TTE), in a population of mostly Veterans with bothersome tinnitus, using a two parallel arm wait-list control design. Overall, both THI and TFI performed well, with Cohen's d being somewhat larger for the TFI (d = 1.23) than for the THI (d = 1.04). If powering a similar study, fewer subjects would be required with the TFI than with the THI, which confirms results obtained from the original TFI study (Meikle et al., 2012).

The present study provides additional data with respect to the subscales and how they could determine changes in individual domains when change is observed in the overall index score. As Table 1 shows, there were substantial differences between subscales. For the TFI, the Auditory subscale had the smallest mean change (13.2-point reduction), while the Relaxation subscale displayed the largest mean change (26.7-point reduction). These results could be interpreted to mean that the intervention had the least effect on the Auditory domain (which would be expected) and the strongest effect on the Relaxation domain.

Whereas the TFI contains eight subscales (domains), of which seven contain 3 items and one contains 4 items, the THI contains only three subscales: Catastrophic (5 items), Emotional (9 items), and Functional (11 items). As noted above, “catastrophic” items were purposely omitted from the TFI, as mandated by the grant funding agency. In the original TFI study, the Catastrophic subscale of the THI showed the largest effect sizes (Meikle et al., 2012). For the present study, the Catastrophic subscale of the THI had the largest mean reduction of the three subscales, but the Functional subscale had the largest effect size (Table 2). The Catastrophic subscale warrants further study. If it works particularly well as an outcome measure, it could be utilized in some fashion in the future to complement the TFI. This would of course negate the original intention of leaving overly negative items out of the TFI, which was done to avoid suggesting to patients with less severe tinnitus that their eventual fate was to experience such catastrophic effects. A possible solution to this dilemma is to take the 5-item THI Catastrophic subscale and use a 0–10 response scale anchored on the low end with “no problem,” which might be less suggestive of inevitable catastrophic effects.

In spite of the TFI not containing catastrophic items, it nevertheless showed the greatest sensitivity to treatment effects, in both the original and present studies. It might also be noted that the TFI Auditory subscale was somewhat less sensitive to change than the other seven subscales. Further, all but one of the TFI subscales are 3-item scales, whereas the THI subscales contain 5 to 11 items. Typically, scales with more items tend to be more sensitive. This was not the case in the current analysis. More work is clearly needed to better interpret patients’ responses on different subscales. The TFI subscales were carefully chosen as part of the systematic study to develop the TFI (Meikle et al., 2012).
That study involved the identification of 10 initial subscales, which expanded to 13 and were then reduced to the current eight based on their sensitivity to treatment-related change. A better understanding of how patients respond to each subscale, and their patterns of subscale responses, should be helpful in designing treatments that more appropriately address individual needs.

6. Use of TFI in the clinic and clinical research

Following publication of the original TFI study (Meikle et al., 2012), a special website was set up at OHSU to enable access to the copyrighted TFI. As of 11/10/2014, 531 downloads had been made from that website by clinicians and researchers around the world. The TFI has achieved considerable international attention in a short period of time, as also attested by the numerous requests to translate the TFI into at least 14 different languages. Already, a Dutch version of the TFI has been translated and validated as having psychometric properties consistent with the original version (Rabau et al., 2014). These researchers utilized a translation-back translation procedure and then administered the Dutch version to 263 clinical patients complaining of tinnitus. Factor analysis was performed to compare results of the eight-factor structure with the original TFI. Internal consistency and convergent validity were both assessed. All analyses led to the conclusion that the Dutch version has psychometric properties consistent with the original TFI.

The Dutch researchers who translated the TFI noted the difficulty of comparing results between tinnitus-outcome studies due to the many different questionnaires that are used (Rabau et al., 2014). They also noted that the TFI has been validated for assessing treatment-related change in outcomes, which provided the primary impetus to translate the TFI into Dutch. Their translation efforts appear to have been conducted appropriately, which warrants some comment. Tinnitus clinical management and research is an international concern, and the TFI has the potential to bridge different languages and cultures. It is, however, critical that any translations of the TFI be performed using rigorous translation procedures in order to generalize and apply results beyond those obtained with the original English version.

Translation of survey instruments is a field that recognizes the importance of valid translations and the difficulties encountered in translating successfully across cultures. The relevant literature notes that valid translation of a questionnaire involves much more than just translating from one language to another, even if done by professionally trained translators (Su and Parham, 2002).
When translating from a source language to a target language, some of the problems encountered involve: (1) lack of equivalent words; (2) differences in grammar and syntax; (3) idiomatic expressions that lose their meaning with a literal translation; and (4) divergent cultural norms (a concept relevant in one culture may not be relevant in another culture). To overcome these difficulties in achieving equivalence of meaning, back-translation processes are considered essential. Conducting the back-translation process properly requires at least two translators who are thoroughly familiar with both languages. One translator makes the initial translation. A second translator is blinded to the original survey and conducts the back translation, which is then compared to the original. In some cases the original meaning of a word or phrase cannot be translated, which requires revising the original word/phrase to maintain the original meaning while offering a different option for translation. This process is repeated iteratively until the two versions achieve equivalence of meaning. At this point it is important to evaluate the translated version with persons representative of the target population and obtain feedback regarding item meanings, a process known as “pretesting.” The iterative translation-back translation process followed by pretesting is the minimum requirement for translating questionnaires (Bullinger et al., 1993). Any future translations of the TFI should follow these minimum recommendations.

The TFI is being utilized in numerous clinical trials, including in the United Kingdom (Fackrell et al., 2013), New Zealand (Chandra, 2013), and the United States (Scherer et al., 2014; Henry et al., 2015). Because of this high level of interest, it is important to provide general guidelines for administering the TFI and for interpreting results.
Based on data collected during development of the TFI, mean scores for evaluating tinnitus impact at intake can be stratified into five levels:

1. Not a problem: M = 14 (range: 0–17)
2. Small problem: M = 21 (range: 18–31)
3. Moderate problem: M = 42 (range: 32–53)
4. Big problem: M = 65 (range: 54–72)
5. Very big problem: M = 78 (range: 73–100)

Preliminary data support an additional way to interpret TFI scores:

• <25 = relatively mild tinnitus (little or no need for intervention)
• 25–50 = significant problems with tinnitus (possible need for intervention)
• >50 = tinnitus severe enough to qualify for more aggressive intervention

The question of how much change in questionnaire index scores is required to be “clinically significant” is debated among measurement experts. Confounding this debate is the fact that individual patients vary considerably regarding what each would consider a “meaningful change.” Consequently, any statistical demonstration of differences between treatment groups would not necessarily indicate change that all patients would consider important or meaningful. With respect to the TFI, the criterion groups approach (described above) revealed mean change scores that progressed in an orderly fashion from Much or Moderately improved through Unchanged to Moderately or Much worse. These data were interpreted to suggest that a reduction in TFI scores of ~13 points would in general be meaningful to patients (taking into account individual differences).

7. Conclusion

The RCT reported herein employs the TFI as its primary outcome measure. Although the RCT is not yet complete, sufficient data are available to provide an analysis of TFI outcome data. This analysis compares the performance of the TFI to data derived from the original TFI-development study (Meikle et al., 2012). The Tinnitus Handicap Inventory (THI) was used in both studies, and the current analysis also compared TFI results to THI results, with respect to index scores and subscales. All analyses reveal the TFI is performing as intended, lending credibility to the original findings.
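The intake levels, the three interpretation bands, and the ~13-point change criterion given in Section 6 can be collected into a small lookup. The sketch below is illustrative only and is not part of the TFI or its scoring instructions: the function names are hypothetical, fractional scores that fall between two listed integer ranges (for example 17.5) are assigned to the higher level, and the approximate 13-point figure is treated as a fixed cut-off. Both of those choices are assumptions.

```python
# Inclusive upper bound of each intake level, as listed in the text.
INTAKE_LEVELS = (
    (17, "Not a problem"),
    (31, "Small problem"),
    (53, "Moderate problem"),
    (72, "Big problem"),
    (100, "Very big problem"),
)

# "~13 points" in the text; using it as a hard threshold is an assumption.
MEANINGFUL_REDUCTION = 13


def _check_score(score: float) -> None:
    if not 0 <= score <= 100:
        raise ValueError("TFI index score must be between 0 and 100")


def intake_level(score: float) -> str:
    """Map a TFI index score to one of the five intake levels."""
    _check_score(score)
    for upper, label in INTAKE_LEVELS:
        if score <= upper:
            return label
    raise AssertionError("unreachable for scores in 0-100")


def intervention_band(score: float) -> str:
    """Map a TFI index score to the three-band interpretation (<25, 25-50, >50)."""
    _check_score(score)
    if score < 25:
        return "relatively mild (little or no need for intervention)"
    if score <= 50:
        return "significant problems (possible need for intervention)"
    return "severe enough to qualify for more aggressive intervention"


def is_meaningful_improvement(baseline: float, follow_up: float) -> bool:
    """True when the score dropped by at least MEANINGFUL_REDUCTION points."""
    return (baseline - follow_up) >= MEANINGFUL_REDUCTION


print(intake_level(42), "|", intervention_band(42))
print(is_meaningful_improvement(baseline=60, follow_up=47))
```

A lookup of this kind only restates the published ranges. As the text notes, patients differ in what they regard as a meaningful change, so the output should be read as a rough guide and not as a clinical decision rule.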
Because of its demonstrated responsiveness to treatment-related change, comprehensive coverage of the domains of tinnitus impact, and other psychometric properties, the TFI is showing promise as a standard instrument for both clinical and research settings, which was the overall intent of the Tinnitus Research Consortium when it announced its Request for Proposals in 2003 to develop a new tinnitus outcome questionnaire.

Acknowledgments

We thank the Tinnitus Research Consortium for providing the principal funding support for the original research (PP0009; MIRB1385), which was granted to Dr. Mary Meikle (dec.), the primary developer of the TFI. The randomized controlled trial reported in the present article was funded by VA Rehabilitation Research and Development (RR&D) Service (C7452I). We also wish to acknowledge important contributions from Dr. Barbara Stewart.

References

Beck, A.T., Guth, D., et al., 1997. Screening for major depression disorders in medical inpatients with the Beck Depression Inventory for Primary Care. Behav. Res. Ther. 35 (8), 785–791.
Bullinger, M., Anderson, R., et al., 1993. Developing and evaluating cross-cultural instruments from minimum requirements to optimal models. Qual. Life Res. 2 (6), 451–459.
Castle, N.G., Engberg, J., 2004. Response formats and satisfaction surveys for elders. Gerontologist 44 (3), 358–367.


Chandra, N., 2013. New Zealand validation of the Tinnitus Functional Index. Unpublished Dissertation. Bachelor of Health Sciences (Hons). The University of Auckland.
Cima, R.F., Vlaeyen, J.W., et al., 2011. Tinnitus interferes with daily life activities: a psychometric examination of the Tinnitus Disability Index. Ear Hear 32 (5), 623–633.
Cima, R.F., Andersson, G., et al., 2014. Cognitive-behavioral treatments for tinnitus: a review of the literature. J. Am. Acad. Audiol. 25 (1), 29–61.
Clark, M.E., Gironda, R.J., et al., 2003. Development and validation of the pain outcomes questionnaire-VA. J. Rehabil. Res. Dev. 40 (5), 381–395.
Cohen, J., 1988. Statistical Power Analysis for the Behavioral Sciences, second ed. Lawrence Erlbaum Associates, Inc, Hillsdale, NJ.
Davis, A., Refaie, A.E., 2000. Epidemiology of tinnitus. In: Tyler, R. (Ed.), Tinnitus Handbook. Singular Publishing Group, San Diego, pp. 1–23.
Fabrigar, L.R., Wegener, D.T., et al., 1999. Evaluating the use of exploratory factor analysis in psychological research. Psychol. Methods 4 (3), 272–299.
Fackrell, K., Hall, D.A., et al., 2013. UK validation of the Tinnitus Functional Index (TFI): convergent and discriminant validity. In: 7th International Tinnitus Research Initiative Conference. Valencia, Spain.
Folmer, R.L., Theodoroff, S.M., et al., 2014. Experimental, controversial, and futuristic treatments for chronic tinnitus. J. Am. Acad. Audiol. 25 (1), 106–125.
Haynes, S.N., Richard, D.C., et al., 1995. Content validity in psychological assessment: a functional approach to concepts and methods. Psychol. Assess. 7, 238–247.
Heller, A.J., 2003. Classification and epidemiology of tinnitus. Otolaryngol. Clin. N. Am. 36 (2), 239–248.
Henry, J.A., Schechter, M.A., et al., 2006. Outcomes of clinical trial: tinnitus masking vs. tinnitus retraining therapy. J. Am. Acad. Audiol. 17, 104–132.
Henry, J.A., Zaugg, T.L., et al., 2008. The role of audiologic evaluation in progressive audiologic tinnitus management. Trends Amplif. 12 (3), 169–184.
Henry, J.A., Zaugg, T.L., et al., 2009. Principles and application of counseling used in progressive audiologic tinnitus management. Noise Health 11 (42), 33–48.
Henry, J.A., Zaugg, T.L., et al., 2012. Pilot study to develop telehealth tinnitus management for persons with and without traumatic brain injury. J. Rehabil. Res. Dev. 49 (7), 1025–1042.
Henry, J.A., Roberts, L.E., et al., 2014. Underlying mechanisms of tinnitus: review and clinical implications. J. Am. Acad. Audiol. 25 (1), 5–22; quiz 126.
Henry, J.A., Frederick, M., et al., 2015. Validation of a novel combination hearing aid and tinnitus therapy device. Ear Hear 36 (1), 42–52.
Hoare, D.J., Searchfield, G.D., et al., 2014. Sound therapy for tinnitus management: practicable options. J. Am. Acad. Audiol. 25 (1), 62–75.
Hoffman, H.J., Reed, G.W., 2004. Epidemiology of tinnitus. In: Snow, J.B. (Ed.), Tinnitus: Theory and Management. BC Decker Inc., Lewiston, NY, pp. 16–41.
Jastreboff, P.J., Hazell, J.W.P., 1998. Treatment of tinnitus based on a neurophysiological model. In: Vernon, J.A. (Ed.), Tinnitus Treatment and Relief. Allyn & Bacon, Needham Heights, pp. 201–217.
Kamalski, D.M., Hoekstra, C.E., et al., 2010. Measuring disease-specific health-related quality of life to evaluate treatment outcomes in tinnitus patients: a systematic review. Otolaryngol. Head. Neck Surg. 143 (2), 181–185.
Krog, N.H., Engdahl, B., et al., 2010. The association between tinnitus and mental health in a general population sample: results from the HUNT Study. J. Psychosom. Res. 69 (3), 289–298.
Lipsey, M.W., 1990. Design Sensitivity: Statistical Power for Experimental Research. Sage, Newbury Park, CA.
Lipsey, M.W., Cordray, D.S., 2000. Evaluation methods for social intervention. Annu. Rev. Psychol. 51, 345–375.
Martinez-Devesa, P., Perera, R., et al., 2010. Cognitive behavioural therapy for tinnitus. Cochrane Database Syst. Rev. (9), CD005233.
Meikle, M.B., Henry, J.A., et al., 2012. The tinnitus functional index: development of a new clinical measure for chronic, intrusive tinnitus. Ear Hear 33 (2), 153–176.
Moran, L.A., Guyatt, G.H., et al., 2001. Establishing the minimal number of items for a responsive, valid, health-related quality of life instrument. J. Clin. Epidemiol. 54 (6), 571–579.
Newman, C.W., Jacobson, G.P., et al., 1996. Development of the tinnitus handicap inventory. Archives Otolaryngol.–Head Neck Surg. 122, 143–148.
Nunnally, J.C., 1978. Validity. In: Psychometric Theory. McGraw-Hill, New York, pp. 86–113.
Rabau, S., Wouters, K., et al., 2014. Validation and translation of the Dutch tinnitus functional index. B-ENT 10 (4), 251–258.
Scherer, R.W., Formby, C., et al., 2014. The Tinnitus Retraining Therapy Trial (TRTT): study protocol for a randomized controlled trial. Trials 15, 396.
Schmidt, C.J., Kerns, R.D., et al., 2014. Toward development of a tinnitus magnitude index. Ear Hear 35 (4), 476–484.
Shargorodsky, J., Curhan, G.C., et al., 2010. Prevalence and characteristics of tinnitus among US adults. Am. J. Med. 123 (8), 711–718.
Su, C.T., Parham, L.D., 2002. Generating a valid questionnaire translation for cross-cultural use. Am. J. Occup. Ther. 56 (5), 581–585.
Tunkel, D.E., Bauer, C.A., et al., 2014. Clinical practice guideline: tinnitus. Otolaryngol. Head. Neck Surg. 151 (2 Suppl), S1–S40.
Turk, D.C., Burwinkle, T.M., 2005. Assessment of chronic pain in rehabilitation: outcomes measures in clinical trials and clinical practice. Rehabil. Psychol. 50 (1), 56–64.
U.S. Department of Health and Human Services, 2006. Guidance for Industry. Patient-reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. Food and Drug Administration, Rockville, MD.
