Pain, 68 (1996) 349–361 © 1996 International Association for the Study of Pain. 0304-3959/96/$15.00
PAIN 3217
Treatment helpfulness questionnaire: a measure of patient satisfaction with treatment modalities provided in chronic pain management programs
Stanley L. Chapman a,*, Robert N. Jamison c and Steven H. Sanders b
a Department of Anesthesiology and b Department of Rehabilitation Medicine, Emory University School of Medicine, Atlanta, GA 30322 (USA) and c Departments of Anesthesia and Psychiatry, Harvard Medical School and Brigham and Women’s Hospital, Boston, MA 02115 (USA)
(Received 5 January 1996, revised version received 14 June 1996, accepted 1 July 1996)
Summary
The Treatment Helpfulness Questionnaire (THQ) is presented as a reliable and valid measure for assessing patient perceptions of the helpfulness of treatment modalities offered at multidisciplinary pain centers. It is easy to administer and score and shows good interscorer and test-retest reliability without order effects and with good internal consistency. Patients give diverse responses to items that fall into four factors, three of which represent identifiable components of multidisciplinary treatment for chronic pain. Findings that similar THQ items are positively correlated and that many items show positive correlations with treatment outcome support the validity of the instrument. The latter finding also suggests the potential of patient satisfaction measurement for improving treatment outcomes at pain centers.
Key words: Patient satisfaction; Factor analysis; Reliability; Validity; Multidisciplinary pain treatment; Outcome assessment
Introduction
Measures of patient satisfaction have become increasingly important in health care. They correlate directly with treatment compliance and outcome (Kincey et al. 1975; Fitzpatrick et al. 1983; Fitzpatrick et al. 1987) and can be vital to the economic success of a clinic (Luecke et al. 1991). Their importance is suggested by the proliferation of client satisfaction surveys by many businesses that serve the public (Brown et al. 1993). The Commission on Accreditation of Rehabilitation Facilities in its Standards Manual requires that measures of patient satisfaction be part of an evaluation system for all of the rehabilitation programs it surveys, including chronic pain management programs (Commission on Accreditation of Rehabilitation Facilities 1995). Although evaluation of patient satisfaction is one component of a total quality assurance program on pain management recommended by the American Pain Society (American Pain Society 1991), the issue of measurement of patient satisfaction has received scant attention in the chronic pain literature, and there is no standardized way by which this variable has been measured. While a controlled study of carefully defined patients exposed selectively to different combinations of treatments would provide the best data, such a study is difficult and often impractical in the average clinical setting. In the absence of evidence for the optimal combinations of treatment that produce a desired outcome, measures of patient satisfaction with treatments that correlate directly with outcomes could be helpful in the design of effective interdisciplinary chronic pain management programs. The purpose of this article is to introduce the Treatment Helpfulness Questionnaire (THQ), a simple way of measuring the perceived helpfulness of treatment modalities offered at comprehensive pain management programs, and to present its reliability and validity data.
* Corresponding author: Dr. Stanley Chapman, The Center for Pain Medicine, Central Clinic-Suite A1227, 1365 Clifton Road, N.E., Atlanta, GA 30322, USA. Tel.: (404) 778-3952; Fax: (404) 778-4087. PII S0304-3959(96)03217-4
Methods
Questionnaire development
The concept for the THQ originated in work on visual analog scaling to measure subjective pain intensity (e.g. Scott and Huskisson 1976). The authors believed that patients could rate the degree of helpfulness of each type of treatment they received by placing a mark along a 10 cm line. They considered other types of satisfaction questionnaires, including one in which patients were given a choice of verbal responses, but thought a continuous scale might provide more specificity of response and, depending on the distribution of answers, greater statistical power. Many patient satisfaction questionnaires use ‘not helpful’ as the most negative rating, but written and verbal comments indicate that some patients view certain treatments as harmful or counterproductive. For example, some patients report that invasive treatments, withdrawal of certain medications, or physical therapy worsened their pain, or that prescription medications produced bothersome side effects. Treatments vary in their potential for being seen as harmful. Chapman (1985) found that when given a forced choice of rating a treatment as ‘harmful,’ ‘not helpful,’ ‘somewhat helpful,’ or ‘very helpful,’ seven of 222 patients in a chronic pain management program rated sympathetic nerve blocks as ‘harmful,’ whereas none of 234 patients rated group counseling as ‘harmful.’ Thus, the authors constructed the questionnaire along a continuum, with end points of ‘extremely harmful’ and ‘extremely helpful’ and ‘neutral’ in the middle. The pilot version of this questionnaire listed treatments in the left column and spaced the five headings (‘extremely harmful,’ ‘harmful,’ ‘neutral,’ ‘helpful,’ and ‘extremely helpful’) across the top of the page.
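A mark on such a 10 cm line can be turned into a signed number. The sketch below assumes the THQ scoring convention of -5 for ‘extremely harmful’, 0 for ‘neutral’, and +5 for ‘extremely helpful’, reported to the nearest tenth of a point. The function name, the millimetre input, and the error handling are illustrative assumptions, not part of the published instrument.

```python
def thq_score(mark_mm: float, line_length_mm: float = 100.0) -> float:
    """Convert a mark's distance from the left end of the line to a THQ score.

    Assumes the left end is 'extremely harmful' (-5), the midpoint is
    'neutral' (0), and the right end is 'extremely helpful' (+5).
    """
    if not 0.0 <= mark_mm <= line_length_mm:
        raise ValueError("mark must lie on the line")
    # Map 0..line_length onto -5..+5, then round to the nearest tenth.
    return round(mark_mm / line_length_mm * 10.0 - 5.0, 1)


if __name__ == "__main__":
    for mm in (0, 50, 73, 100):
        print(mm, "mm ->", thq_score(mm))
```

Measuring from the left end keeps the conversion a single linear map, so a blank item (a treatment the patient did not receive) can simply be left unscored rather than coded as 0.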
Some of the more than 100 pilot subjects rated treatments progressively lower as they moved down the page. When interviewed, they explained that their ratings were distorted because they held the paper at an angle and had difficulty aligning their responses to the headings on top of the page. To correct for this problem, small vertical guidelines one centimeter apart were added (with longer vertical lines at the end points and at ‘neutral’) to reorient subjects continuously as they moved down the page. As discussed later, this revised version showed no order effects.
Fig. 1 shows the final format of the THQ. The treatment modalities listed in the figure are those rated at one particular pain center; modalities can vary depending on the nature of the program and the goals of evaluation. The treatment modalities can be broadly defined, as exemplified by a rating of the helpfulness of an entire program or, narrowly defined, as by a rating of transcutaneous electrical nerve stimulation (TENS). Responses are scored along a 10-point scale ranging from –5 for ‘extremely harmful,’ to +5 for ‘extremely helpful.’ ‘Neutral’ yields a score of 0. Scores are calculated to the nearest tenth of a point. This format resulted in adequate within-subject variance in responses. Of 199 patients who rated at least three of a possible 19 treatment modalities, 93% showed a ‘discrimination’ response, defined as a difference of > 0.3 points between the highest and lowest rated treatments; 88% showed a score on at least one item that was not within 0.1 points of the center of any of the five headings, and 89% showed at least one response that was not within 0.1 points of any of the 11 vertical guidelines.

Fig. 1. The final format of the THQ.
Form title: Treatment Helpfulness Questionnaire. Fields: Posttreatment / Follow-up; Name; Date.
Instructions: Any treatment a person receives can be rated on a scale ranging from extremely harmful to extremely helpful with neutral (not helpful or harmful) falling in the middle. Below is a list of treatments offered at this pain center. Please rate each treatment you had here by making a vertical mark (not a slanted line or checkmark) along the line to show how helpful (or harmful) the treatment has been for you. Leave blank any treatment you did not receive at this pain center.
Scale headings: Extremely harmful; Harmful; Neutral; Helpful; Extremely helpful.
Items shown: WHOLE PROGRAM; MEDICAL ASSESSMENT AND TREATMENT; PSYCHOLOGY ASSESSMENT AND TREATMENT; PHYSICAL THERAPY ASSESSMENT AND TREATMENT; Office Visits with Physician; Individual Psychological Therapy; Medical Diagnostic Tests (Thermography, EMG); Medical Work Abilities Testing (functional capacity, impairment); Patient Education Groups; Group Counseling.

Subjects
To test the reliability and validity of this questionnaire, a total of 216 subjects completed the THQ at one of two interdisciplinary pain management programs. Posttreatment helpfulness was assessed at Center A for 139 subjects, of whom 107 provided 3-to-6 month follow-up data. At Center B, 77 subjects were tested; posttreatment data were collected for 65 and 3-to-6 month follow-up data for 67. (In some cases at Center B, subjects were tested at follow-up who had not completed the questionnaire at posttreatment.) Subjects were typical of those seen at interdisciplinary chronic pain centers. All subjects had experienced chronic pain of at least 3 months’ duration, and almost all had failed to obtain significant pain relief or normal function despite multiple previous treatments in health care settings. Demographic and medical history data are presented in Table I.

Treatment programs and THQ items
The programs at Centers A and B had goals similar to those of many chronic pain management programs (Commission on Accreditation of Rehabilitation Facilities 1987): increased physical function, with return to work if possible; elimination of excessive use of opioids, tranquilizers, and barbiturates; reduced subjective pain intensity; improved psychological and emotional status; and improved ability to manage pain independently, with reduction in posttreatment medical needs.

Center A
Treatment at Center A was individualized and averaged 6–10 visits following assessment, with each visit comprising about 4–5 h of treatment. Patients initially saw a physician for a medical interview and examination. Depending on this physician’s judgment, some were referred for interdisciplinary treatment and included in this study. During each subsequent visit to the center, these patients saw the physician, who generally answered questions, evaluated medications, and reinforced progress toward rehabilitation goals. Medication changes generally involved withdrawal or reduction of opioids, tranquilizers, barbiturates and sedatives, and/or prescriptions of anti-inflammatory agents and/or tricyclic antidepressant medication. If the patient was assessed as having sympathetically maintained pain or pain related to the presence of trigger points, the physician also frequently administered sympathetic nerve blocks or trigger point injections, respectively, provided that he/she assessed these injections as being supportive of the patient’s efforts toward improved coping and function. For patients at Center A, the THQ included the following items relating to medical care: overall medical assessment and treatment, office visits with the physician, drug withdrawal, drug prescription, trigger point injections, and sympathetic nerve blocks.
Almost all patients also were scheduled for group and individual treatments with their psychologist or psychological counselor. Besides the comprehensive diagnostic interview, psychology staff also conducted group and individual therapy sessions designed to support patients and their families in their efforts to cope more effectively with pain and related problems. All patients received individual and group instruction in relaxation methods (progressive muscle relaxation, imagery, and deep breathing) with tapes for home practice. A minority of patients (primarily those with pain related to muscle tightness) received electromyographic biofeedback as an adjunct to relaxation. Psychologists led general counseling groups, which invited patient discussion of pain and stress management issues and reinforced functional improvement. Psychologists alternated with physicians in presenting educational lectures to groups of patients about different aspects of pain and its management. Patients at Center A rated the overall helpfulness of psychological assessment and treatment on the THQ, as well as the helpfulness of individual psychological therapy, relaxation training, patient education, group counseling, and biofeedback therapy.
Physical therapy, consisting primarily of exercises designed to increase strength and/or range of motion, with instruction in proper body mechanics and in selection and level of participation in activities of daily living, was provided in both group and individual settings and included home assignments of exercises. When considered likely to be beneficial, TENS was applied for pain relief. Patients rated the helpfulness of physical therapy assessment and treatment and TENS. Instruction to gradually increase activities of daily living within assigned medical restrictions and with proper pacing was given to a majority of patients by the entire staff, and it was rated for helpfulness as ‘activity increase’ on the THQ.
Center B
At Center B, most patients came in 3 days/week for 5 weeks for structured treatments similar to those provided at Center A. When medically indicated, epidural steroid or trigger point injections or sympathetic nerve blocks were offered. Physical therapy included a daily group aerobics program led by an exercise physiologist and individual exercises, all designed to increase physical function and stamina. As at Center A, patients at Center B saw psychologists for group and individual therapy, for educational groups, and, when indicated, for biofeedback. They also attended group sessions with their families, led jointly by the psychologist and social worker, that addressed family issues related to chronic pain and its management. Unlike those at Center A, patients at Center B were not asked to rate the helpfulness of TENS, activity increase, or drug withdrawal (though these treatments were provided as needed), but they did rate the helpfulness of intensive day treatment (group exercise and group and individual psychological intervention three days per week), epidural steroid injections, group and individual exercise, and family sessions. Three THQ items employed throughout the study at Center B, including ratings of the whole program, of overall medical and of overall psychological assessment and treatment, were added to the THQ at Center A midway through the study. Other THQ items were identical at the two centers. Patients at both centers were seen for follow-up as dictated by their individual needs. A monthly visit to the physician and/or psychologist typically was recommended.
TABLE I
SUBJECT DEMOGRAPHICS AND MEDICAL DATA
                                           Center A (n = 139)   Center B (n = 77)
Mean age (years)                           43.1                 42.0
Mean years of education                    12.0                 13.3
Mean years since pain onset                6.5                  4.6
Female (%)                                 46.4                 56.2
Married (%)                                77.7                 55.0
Receiving disability at pretreatment (%)   59.5                 73.6
Working at pretreatment (%)                21.6                 13.5
Primary pain location
  Low back (%)                             41.0                 58.3
  Cervical back (%)                        14.4                 15.0
  Lower extremity (%)                      11.5                 1.7
  Upper extremity (%)                      10.8                 16.7
  Headache/facial (%)                      9.4                  1.3
  Generalized (%)                          2.2                  5.0
  Other (%)                                10.7                 2.0

Procedure
Demographic information and medical histories were gathered at both centers before treatment. To measure subjective pain intensity, subjects at Center A also completed a visual analog scale consisting of a 10 cm line with end points labeled ‘no pain’ and ‘pain as bad as it could be’ (Scott and Huskisson 1976). At Center B, the minimum pain level was also defined as ‘no pain’ and the maximum as ‘excruciating, incapacitating, worst pain possible.’ Subjects at both centers were instructed to record the average level of subjective pain intensity for the week preceding treatment, and their responses were converted to a 0-to-100 scale. Subjects at Center A also completed an activity diary and a medication diary for one week before beginning treatment. The activity diary estimated daily time spent standing or walking; the daily average served as the measure of activity level. Medications were converted to a score on the Medication Quantification Scale (MQS; Steedman et al. 1992), designed to rate the detriment potential of medication use. Before beginning treatment, many subjects at Center A also completed the short form of the Beck Depression Inventory (BDI; Beck and Beck 1972) and the Sickness Impact Profile (SIP; Bergner et al. 1981).
To reduce the potential for biased response, posttreatment data were gathered at both centers by an individual who had not provided treatment, and it was made clear to subjects that results would not become part of their medical record. Subjects at both centers filled out the THQ on the last day of treatment. At Center A, subjects again completed the short form of the BDI; 25 subjects also completed the SIP and 28 the posttreatment questionnaire, both of which were introduced as measures midway through the study. The latter asked them to rate whether their ability to cope with pain and related problems and their sleep (if sleep had been impaired at pretreatment) were ‘significantly worse,’ ‘slightly worse,’ ‘about the same,’ ‘slightly better,’ or ‘significantly better’ than at pretreatment.
Their answers were converted to a scale of 1 (significantly worse) to 5 (significantly better). Similarly, subjects rated their activity level compared with pretreatment as ‘significantly decreased,’ ‘slightly decreased,’ ‘about the same,’ ‘slightly increased,’ or ‘significantly increased,’ and their responses were scored in this order from 1 to 5. Subjects at both centers completed the same visual analog scale of subjective pain intensity as at pretreatment.
Follow-up data were gathered 3–6 months after the end of treatment at both centers through an identical packet of questionnaires mailed to subjects. The same precautions were taken to reduce bias as at posttreatment. Subjects who did not return questionnaires within a month were telephoned and asked to return them. A duplicate packet was mailed if needed. The return rate for questionnaires was approximately 80% at Center A and 85% at Center B. The questionnaires included a cover letter, the aforementioned activity and medication diaries to be filled out for the next 7 days, and the aforementioned scales to measure average subjective pain intensity during the week preceding treatment. In addition, subjects rated their current and pretreatment abilities to cope with pain and related problems (as ‘very poor,’ ‘poor,’ ‘fair,’ ‘good,’ or ‘excellent’), and their improvement in sleep if they had had sleeping problems at pretreatment (measured the same as at posttreatment). Subjects also were asked whether they: (1) had seen any professionals related to their pain problem aside from routine follow-up visits; (2) had been hospitalized for their presenting pain problem; (3) had surgery for this problem; and (4) were working for pay or involved daily in a vocational rehabilitation program. Responses related to coping were converted to a scale of 1 (very poor) to 5 (excellent). The follow-up MQS score was computed from medication diaries for subjects at Center A.
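The conversions described in this section (ordered verbal responses to a 1-to-5 score, and a mark on the 10 cm pain line to a 0-to-100 score) can be sketched as follows. The function and constant names are hypothetical, and the case-insensitive matching is an added convenience rather than something the study describes.

```python
# Ordered response labels as listed in the text; position determines the score.
CHANGE_LABELS = [
    "significantly worse",
    "slightly worse",
    "about the same",
    "slightly better",
    "significantly better",
]
COPING_LABELS = ["very poor", "poor", "fair", "good", "excellent"]


def code_ordinal(response: str, ordered_labels: list) -> int:
    """Return the 1-based position of a verbal response in its ordered label list."""
    return ordered_labels.index(response.strip().lower()) + 1


def vas_to_100(mark_cm: float, line_cm: float = 10.0) -> float:
    """Rescale a visual analog mark (distance from 'no pain') to 0..100."""
    if not 0.0 <= mark_cm <= line_cm:
        raise ValueError("mark must lie on the line")
    return mark_cm / line_cm * 100.0


if __name__ == "__main__":
    print(code_ordinal("slightly better", CHANGE_LABELS))
    print(code_ordinal("Fair", COPING_LABELS))
    print(vas_to_100(6.4))
```

Keeping each label list in its printed order means the score is just the list position, which avoids a separate lookup table per question.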
Reliability and validity measures
We employed several tests to assess the reliability and the validity of the THQ. To assess interscorer reliability, two individuals independently scored responses on the same 20 questionnaires, and a Pearson product-moment correlation was calculated. Test-retest reliability was assessed by giving 19 subjects the identical questionnaire to fill out 1–5 h apart, with no intervening treatment. All subjects returned the questionnaires. A third test of reliability was conducted to ensure that the placement of an item on the questionnaire did not affect the score of that item. Nine subjects each filled out two versions of the questionnaire 1–5 h apart. Ten different treatments were rated, which represented the maximum that would fit easily on a page. This second version of the test differed from the first in that the treatments were listed in reverse order. It was hypothesized that the Pearson correlation coefficient for this test would be similar to that calculated for test-retest reliability when the two versions of the test were identical, which would show an absence of order effects. Two other tests of reliability relating to within-subject consistency were employed. Cronbach’s Alpha (Novick and Lewis 1967) measured average covariance of items on the scale, while Guttman split-half reliability (Winer 1962) arbitrarily divided the questionnaire in half and analyzed the degree of concordance between the two halves. It was hypothesized that coefficients from these analyses would indicate consistent rather than random subject responses and thus would support the reliability of the test.
Three tests of validity were employed. Two of them, intercorrelations among items and factor analysis, examined the internal relationships among items. We hypothesized that many significant correlations would be found among treatment modalities of the THQ that were similar in nature or in goals and that the magnitude of these correlations generally would increase with increasing similarity among treatments. We further hypothesized that significant correlations would be found between items when one item was a subset or component of another. Thus, we expected ratings of the whole program to correlate significantly with those of many treatments and ratings of overall medical, psychological, and physical therapy assessment and treatment to correlate significantly with specific medical, psychological, or physical therapy treatments, respectively. We also assessed the internal structure of the THQ through exploratory factor analysis, using an orthogonal rotation among THQ items for which there were an adequate number of subjects.

TABLE II
RELIABILITY OF THQ ITEMS AT POSTTREATMENT AND FOLLOW-UP
                Subjects (n)   Items (n)   Cronbach alpha coefficient   Guttman split-half coefficient
Posttreatment   82             10          0.89                         0.88
Follow-up       48             9           0.88                         0.84
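The two internal-consistency coefficients named in this section can be sketched in a few lines. The article does not print its formulas, so the standard textbook definitions are assumed here: Cronbach's alpha as k/(k-1) times one minus the ratio of summed item variances to total-score variance, and Guttman's split-half coefficient as 2 times one minus the ratio of summed half-score variances to total-score variance. Function names and the example scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha. `items` is a list of columns: one list of scores per
    item, aligned so that position j in every column is the same subject."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def guttman_split_half(items, first_half):
    """Guttman split-half coefficient for one chosen split.
    `first_half` is the set of item indexes placed in the first half."""
    half_a = [col for i, col in enumerate(items) if i in first_half]
    half_b = [col for i, col in enumerate(items) if i not in first_half]
    a = [sum(scores) for scores in zip(*half_a)]
    b = [sum(scores) for scores in zip(*half_b)]
    total = [x + y for x, y in zip(a, b)]
    return 2 * (1 - (variance(a) + variance(b)) / variance(total))


if __name__ == "__main__":
    # Made-up THQ-style ratings for four items from six subjects.
    ratings = [
        [2.1, 3.4, -1.0, 4.2, 0.5, 2.8],
        [1.8, 3.9, -0.5, 4.0, 1.1, 2.2],
        [2.5, 2.9, 0.2, 4.6, 0.0, 3.1],
        [1.2, 3.1, -1.4, 3.8, 0.9, 2.0],
    ]
    print(round(cronbach_alpha(ratings), 2))
    print(round(guttman_split_half(ratings, {0, 1}), 2))
```

Both functions need complete data for every subject, so in practice the analysis would be restricted to subjects who rated all of the items entered, which is consistent with the smaller subject counts in Table II than in the full sample.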
Factors were identified until the eigen-value fell to less than 1.0, which signified minimal variance remaining after factors were extracted. We hypothesized that factor loadings of different treatments would be higher when these treatments were similar in their nature and goals, and that dissimilar treatments would load on different factors.
A third type of validity study consisted of tests of the relationships of THQ items with measures of treatment outcome. Specifically, Pearson product-moment correlations were calculated between posttreatment scores of THQ items and the following outcome measures taken at posttreatment: (1) subjects’ ratings of improvement in ability to cope with pain and related problems; (2) their perceived increase in activity level; (3) their subjective pain intensity level; (4) SIP subscale (Physical, Psychosocial, and Other) and total scores, and (5) BDI total score. Items also were correlated with difference scores representing change from pretreatment to posttreatment on subjective pain intensity, SIP scores, and BDI score. At follow-up, Pearson product-moment correlations were calculated between THQ items and the following follow-up measures: (1) perceived ability to cope with pain and related problems; (2) activity level; (3) total MQS score; (4) subjective pain intensity, and (5) rating of perceived improvement in sleep. Follow-up THQ scores also were correlated with difference scores representing changes from pretreatment to follow-up on perceived ability to cope with pain and related problems, subjective pain intensity, and the total MQS score. The point-biserial correlation coefficient was employed to evaluate relationships between THQ items and dichotomous variables reflecting whether or not patients had seen health care professionals other than for routine follow-up visits,
had been hospitalized or had surgery related to their pain, and were working or were involved in a full-time vocational rehabilitation program. We hypothesized that ratings of the helpfulness of the whole program would correlate significantly with a wide range of outcome measures, while other THQ items would correlate most highly with levels or improvements in those outcome measures addressed by the specific item. For example, we expected significant correlations between: (1) THQ score for physical therapy and activity level; (2) THQ score for psychological therapies and coping, BDI score, and the psychosocial scale of the SIP; and (3) THQ scores for TENS, drug prescriptions, and injections and subjective pain intensity.

Correction for multiple analyses
The large number of data points generated from intercorrelations among THQ items and from correlations of THQ items and treatment outcome measures enhances the possibility of false findings of statistical significance. We expected the magnitude of these correlations to vary greatly along a broad continuum, making it difficult to hypothesize in advance how many would be statistically significant. We thus decided to correct for the likelihood of false positive findings by applying the more rigorous 0.01 alpha level to test statistical significance rather than the 0.05 alpha level.

Results
Reliability
Interscorer reliability for 163 pairs of data points was excellent, yielding a Pearson correlation of 0.98. Test-retest reliability for 152 pairs of ratings was 0.86 when the same version of the THQ was given twice to the same subjects. The Pearson test-retest reliability was even higher (r = 0.92) for the 75 pairs of data points when items were listed in reverse order on the second test. The lack of a systematic order effect was underscored by analysis of the 40 pairs of data points for which the score for a given item on the THQ differed during consecutive administrations of the test; there were 18 instances in which the item was rated higher when it was listed on the top half of the page, versus 22 instances of a higher rating when listed on the bottom half of the page, a statistically non-significant difference. Results from the Cronbach Alpha and Guttman split-half reliability tests are depicted in Table II. The range of coefficients from both analyses at posttreatment and at follow-up of 0.84 to 0.89 indicates consistent rather than random subject responses and supports the reliability of the test.

Validity
Table III summarizes mean scores, SD, and numbers of subjects on items of the THQ from both centers combined at posttreatment and at follow-up. The number of subjects for different items varies significantly because some items were added to the questionnaire midway through the study and because some subjects were not exposed to a particular treatment represented on the THQ. Tables IV and V present intercorrelations among THQ items at posttreatment and at 3-to-6 month follow-up, respectively. To avoid presentation of results based on inadequate sample size, we omitted results based on fewer than 15 subjects. Significance levels are based on one-tailed tests and on hypotheses of positive correlations among items. At both time points, ratings of the helpfulness of the whole program were significantly and positively correlated
TABLE III
NUMBER OF SUBJECTS, MEAN SCORES, AND STANDARD DEVIATIONS OF THQ ITEMS AT POSTTREATMENT AND AT FOLLOW-UP
                                   Posttreatment          Follow-up
Item                               n     Mean    SD       n     Mean    SD
Whole program                      94    2.84    1.52     107   2.57    1.62
Overall medical                    93    1.90    1.95     106   2.07    1.81
Overall psychological              90    2.95    1.73     107   2.66    1.91
Overall physical therapy           157   2.42    1.96     143   1.71    2.28
Intensive day treatment            66    3.61    1.31     55    3.40    1.27
MD office visits                   87    2.07    1.84     99    2.26    1.82
Drug prescriptions                 140   2.28    2.06     127   2.08    2.24
Drug withdrawal                    37    1.94    2.52     29    0.70    2.59
Trigger point injections           101   1.99    1.99     101   1.51    2.25
Sympathetic nerve blocks           43    0.92    2.39     51    0.85    2.65
Epidural steroid injections        30    0.10    1.69     33    -0.58   1.94
Patient education groups           153   2.39    1.65     133   2.21    1.91
Group counseling                   139   2.70    1.62     127   2.36    1.86
Relaxation therapy                 171   2.53    1.82     144   2.06    1.71
Individual psychological therapy   153   2.99    1.70     150   2.61    1.90
Biofeedback therapy                49    1.72    1.56     56    1.85    2.07
Family conferences                 38    2.32    1.42     33    2.36    1.60
Group exercise                     57    2.37    2.05     52    2.04    2.39
Individual exercise                54    2.17    1.98     55    2.00    2.20
Activity increase                  75    1.80    1.98     76    1.01    2.35
TENS                               41    1.53    2.04     52    1.08    2.57
TABLE IV
INTERCORRELATIONS AMONG THQ ITEMS AT POSTTREATMENT a
THQ item (rows): Whole program; Overall medical; Overall psychological; Overall physical therapy; Intensive day treatment; MD office visits; Drug prescriptions; Drug withdrawal; Trigger point injections; Sympathetic nerve blocks; Epidural steroid injections; Patient education groups; Group counseling; Relaxation therapy; Individual psychological therapy; Biofeedback therapy; Family conferences; Group exercise; Individual exercise; Activity increase
Columns b: A to T
A
0.60* 0.38*
B D
0.68* 0.06 0.55* 0.17
C
0.46* 0.35 0.26 0.20 0.44* 0.43* 0.29* 0.12 0.53* 0.15
F
0.37* 0.68* 0.37* 0.36
E
H
0.39* 0.18 0.34 0.28 0.31 0.12 0.11 0.26 0.43* 0.38*
G
0.23 0.31 0.53* 0.32 0.21 0.05 0.36 0.35 0.27
I
0.12 -0.17 0.19 0.02 -0.10 0.16 -0.06 0.39
J
0.40* 0.28* 0.04 0.47* -0.12 0.52* -0.01 0.49* 0.20
K
0.44* 0.45* 0.11 0.40* 0.40 0.29 0.28* 0.68* -0.13 -0.24 0.45 0.09 0.50* 0.30 0.28 0.32 -0.01 0.32 0.19 0.46 0.48 0.05 0.53* 0.16 0.16 0.59*
Q R S T
0.50* 0.19 0.22 0.35 0.43 0.20 0.21 0.39 0.03 0.38 0.02 0.13 0.61 0.45 0.50 0.44* 0.70* 0.28 0.46* 0.26 0.48* 0.32* 0.47* 0.43* 0.35 0.12 0.00 0.16 0.52* 0.53* 0.50* 0.02 0.18 0.34 -0.06 -0.17 0.17 0.42 -0.22 0.18 0.24 0.53 0.63 0.24 0.39 0.66* 0.43* 0.76* 0.33* 0.42 0.45* 0.28* 0.27 0.18
P
O
N
0.40* 0.13 0.24 0.40 0.19 0.58* 0.20 0.41* 0.34*
M
0.09 0.40* 0.40* 0.10 0.54* 0.00 0.49* 0.00 0.49* 0.26
L
0.56* 0.55 0.24 0.33 0.26 0.17 0.61* 0.34 0.03 0.30 -0.03 0.46 0.22 0.64* 0.22 0.34 0.35 0.28 0.18 0.56* 0.43 0.42*
a Correlation omitted if < 15 subjects (30 of 210 data points).
b A, overall medical; B, overall psychological; C, overall physical therapy; D, intensive day treatment; E, MD office visits; F, drug prescriptions; G, drug withdrawal; H, trigger point injections; I, sympathetic nerve blocks; J, epidural steroid injections; K, patient education groups; L, group counseling; M, relaxation therapy; N, individual psychological therapy; O, biofeedback therapy; P, family conferences; Q, group exercise; R, individual exercise; S, activity increase; T, TENS.
*P < 0.01, one-tailed test.
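The omission rule used for the intercorrelation tables (no coefficient reported when fewer than 15 subjects are available) can be sketched as a pairwise Pearson correlation with a minimum sample size. This is an illustration under assumptions: missing ratings are represented as `None`, the function names are hypothetical, and significance testing is left out.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_correlation(ratings_a, ratings_b, min_n=15):
    """Correlate two THQ items over the subjects who rated both.

    Returns None when fewer than `min_n` subjects rated both items,
    mirroring the omission of cells based on small samples.
    """
    pairs = [(a, b) for a, b in zip(ratings_a, ratings_b)
             if a is not None and b is not None]
    if len(pairs) < min_n:
        return None
    xs, ys = zip(*pairs)
    return pearson_r(xs, ys)


if __name__ == "__main__":
    item_a = [float(i % 7) - 3.0 for i in range(20)]   # made-up ratings
    item_b = [0.5 * v + 1.0 for v in item_a]           # perfectly related item
    print(item_correlation(item_a, item_b))
    print(item_correlation(item_a, item_b[:10] + [None] * 10))
```

Filtering to subjects who rated both items before counting is what makes the sample size differ from cell to cell, as it does when patients leave blank the treatments they did not receive.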
TABLE VI
EXPLORATORY FACTOR ANALYSIS OF THQ ITEMS AT POSTTREATMENT USING AN ORTHOGONAL ROTATION a (N = 294)
Item                               Factor 1   Factor 2   Factor 3   Factor 4
Group counseling                   0.78       -0.07      0.15       0.03
Patient education groups           0.75       -0.06      0.21       -0.01
Individual psychological therapy   0.62       0.41       -0.07      0.18
Relaxation therapy                 0.59       0.03       0.13       0.35
Overall psychological              0.53       0.59       -0.07      -0.10
Overall medical                    -0.03      0.85       0.11       0.12
MD office visits                   -0.05      0.82       0.16       0.20
Group exercise                     0.09       0.08       0.83       -0.03
Individual exercise                0.14       0.07       0.76       0.71
Trigger point injections           -0.05      -0.07      -0.02      0.71
Drug prescriptions                 0.08       0.27       -0.10      0.63
Overall physical therapy           0.20       0.18       0.22       0.55
a Principal components analysis; four factors with eigen-value > 1.0; sampling adequacy = 0.72.
with those of a large number of treatments. Consistent with expectations, these correlations were particularly high between ratings of the whole program and of modalities that supported the overall emphases of increased activity, improved understanding of pain, and improved psychological coping. As hypothesized, overall psychological treatment correlated significantly at posttreatment and at follow-up with ratings of individual psychological therapy, relaxation therapy, and group interventions, while overall medical and physical therapy helpfulness ratings were highly correlated with helpfulness ratings of office visits and of group exercise, respectively. The hypothesis that similar types of treatments would receive highly correlated ratings of helpfulness was also supported by several significant correlations, such as those between patient education and group counseling, and between individual and group exercise. The lower correlations generally found among more dissimilar treatments, such as injections and psychological interventions, were also consistent with expectations. Tables VI and VII present the factor loadings of THQ items from the exploratory factor analysis. The sampling
adequacies of 0.72 and 0.77 at posttreatment and at follow-up, respectively, indicate adequate distribution of scores to perform factor analysis. Four factors were found at both time intervals, with fairly similar loadings. Factor 1 at posttreatment and Factor 2 at follow-up both loaded highly on psychological and educational treatments; posttreatment Factor 2 and follow-up Factor 1 consisted largely of medical visits, while Factor 3 in both instances included primarily exercise regimens. At both time intervals, a diversity of treatment modalities comprised the fourth factor. Data regarding the relationship of THQ items to outcome measures at posttreatment and at follow-up are summarized in Tables VIII and IX, respectively. Correlations again were omitted for small numbers of subjects (< 15), and one-tailed significance testing was based on the hypothesis that more positive treatment outcomes would correlate with higher THQ ratings. With the exception of variables related to subjective pain intensity, data from posttreatment came exclusively from Center A, because they were collected only at that center. All follow-up outcome measures except MQS scores came from both centers. At
TABLE VII
EXPLORATORY FACTOR ANALYSIS OF THQ ITEMS AT FOLLOW-UP USING AN ORTHOGONAL ROTATION^a (N = 241)

Item                              Factor 1  Factor 2  Factor 3  Factor 4
MD office visits                      0.86      0.16      0.11      0.03
Overall medical                       0.88      0.15      0.09     -0.01
Drug prescriptions                    0.73      0.23     -0.04      0.18
Group counseling                      0.20      0.86      0.10     -0.05
Patient education groups              0.19      0.85      0.07      0.08
Relaxation therapy                    0.15      0.60      0.11      0.35
Overall psychological                 0.55      0.54      0.13     -0.05
Group exercise                       -0.02      0.09      0.89      0.10
Individual exercise                  -0.09      0.12      0.88      0.13
Trigger point injections              0.02     -0.03      0.00      0.88
Overall physical therapy              0.29      0.30      0.30      0.52
Individual psychological therapy      0.12      0.34      0.43     -0.09

^a Principal components analysis; four factors with eigenvalue > 1.0; sampling adequacy = 0.77.
TABLE VIII
CORRELATIONS BETWEEN THQ ITEMS AND POSTTREATMENT OUTCOME MEASURES^a

THQ items (rows): Whole program; Overall medical; Overall psychological; Overall physical therapy; MD office visits; Drug prescriptions; Drug withdrawal; Trigger point injections; Sympathetic nerve blocks; Patient education; Group counseling; Relaxation therapy; Individual psychological therapy; Biofeedback therapy; Activity increase; TENS.

Outcome measures (columns)                              n     Mean      SD
Measures at posttreatment
  SPI^b                                               152    61.87   20.18
  ADL^c                                                28     3.29    0.71
  Coping^d                                             28     3.75    0.65
  BDI^e                                               103     7.61    5.38
  SIP Phys^f                                           25    16.24   14.10
  SIP Psych^f                                          25    29.67   21.17
  SIP Other^f                                          25    28.88   11.85
  SIP Total^f                                          25    24.64   14.66
Difference scores: posttreatment minus pretreatment
  SPI^b                                               124    -8.45   18.85
  BDI^e                                                88    -1.77    4.35
  SIP Phys^f                                           22    -1.49    7.58
  SIP Psych^f                                          22    -0.12   15.60
  SIP Other^f                                          22    -0.69   10.16
  SIP Total^f                                          22    -0.78    9.37

(The item-by-measure correlation coefficients are not recoverable from the extracted table.)

^a Correlation omitted if < 15 subjects (52 of 224 data points).
^b Subjective pain intensity (range from 0-100; Centers A and B).
^c Subjects' rated change in activity level from pretreatment (range from 1 = significantly decreased to 5 = significantly increased; Center A).
^d Subjects' rated ability in coping from pretreatment level (range from 1 = significantly worse to 5 = significantly better; Center A).
^e Beck Depression Inventory score (short form; Center A).
^f Sickness Impact Profile scale scores (Phys = Physical; Psych = Psychosocial), Center A.
* P < 0.01, one-tailed test.
TABLE IX
CORRELATIONS BETWEEN THQ ITEMS AND FOLLOW-UP OUTCOME MEASURES^a

THQ items (rows): Whole program; Overall medical; Overall psychological; Overall physical therapy; Intensive day treatment; MD office visits; Drug prescriptions; Drug withdrawal; Trigger point injections; Sympathetic nerve blocks; Epidural steroid injections; Patient education; Group counseling; Relaxation therapy; Individual psychological therapy; Biofeedback therapy; Family conferences; Group exercise; Individual exercise; Activity increase; TENS.

Outcome measures (columns)                              n     Mean      SD
Measures at follow-up, Pearsonian correlations
  SPI^b                                               152    61.47   21.55
  ADL^c                                               119   337.80  200.00
  Coping^d                                            115     2.63    0.80
  Sleep^f                                             102     3.72    0.93
  MQS^e                                               101     5.69    7.29
Difference scores: follow-up minus pretreatment, Pearsonian correlations
  SPI^b                                               126    -8.87   21.20
  Coping^d                                            115    +1.31    0.98
  MQS^e                                               101    -4.93   12.42

Measures at follow-up, point-biserial correlations      n  Yes (n)  No (n)
  Hosp^g                                              136       13     123
  Prof^h                                              156       70      86
  Work^i                                              157       47     110
  Surgery^j                                           134        9     125

(The item-by-measure correlation coefficients are not recoverable from the extracted table.)

^a Correlation omitted if < 15 subjects (10 of 252 data points).
^b Subjective pain intensity (0-100 range).
^c Number of min/day standing or walking.
^d Ability to cope with pain and related problems (range from 1 = very poor to 5 = excellent).
^e Total score = Medication Quantification Scale.
^f Ability to sleep since pretreatment (range from 1 = significantly worse to 5 = significantly better).
^g Hospitalized for pain since end of treatment.
^h Saw professionals for pain aside from routine follow-up visits.
^i Working for pay or involved in a daily vocational rehabilitation program.
^j Had surgery for pain since end of treatment.
* P < 0.01, one-tailed test.
posttreatment, the helpfulness rating of the whole program correlated significantly with many outcome variables representing changes in function from pretreatment to posttreatment, including subjects’ ratings of improved coping and difference scores on measures of subjective pain intensity and on all three subscales and the total score of the SIP. Consistent with expectations, THQ scores on two highly emphasized program elements showed significant associations with outcome: activity increase correlated with seven of 10 outcome measures and relaxation therapy correlated significantly with improvement from pretreatment to posttreatment on all subscales of the SIP. However, contrary to hypotheses, most correlations were low and non-significant between THQ scores of psychological treatment modalities and scores on the BDI, and between THQ
scores of group and educational therapies and rated improvements in coping.
As depicted in Table IX, many correlations between
THQ items and outcome measures at follow-up were significant, though the magnitude of these correlations rarely exceeded r = 0.50. As was the case with posttreatment data, the THQ score for the whole program was correlated significantly with many outcome measures (six of 12 assessed). THQ scores for physical therapy and activity increase correlated significantly with eight outcome measures, while relaxation therapy and group counseling each correlated with five. Many THQ items showed significant associations with levels and improvements in pain and in coping and with improvement in sleep. On the other hand, generally non-significant correlations were found between THQ items versus follow-up work status, hospitalization or surgery after treatment, and level and change in MQS score. The hypothesis of stronger relationships between THQ scores of specific treatments and outcomes addressed by those treatments was partially supported. For example, activity level at follow-up correlated most closely with THQ scores for activity increase and overall physical therapy, and THQ scores for injections and for individual psychological therapy generally correlated most highly with levels and changes in subjective pain intensity and in coping, respectively; however, several hypothesized relationships were not found, such as between THQ score for drug prescriptions and subjective pain intensity, and between THQ score for drug withdrawal and results on the MQS.
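To help gauge how large a correlation must be to earn an asterisk in Tables VIII and IX, the following minimal sketch approximates the smallest r that reaches P < 0.01 in a one-tailed test for a given sample size. It uses the Fisher z approximation rather than an exact t test; that choice is an assumption made here for simplicity and is not necessarily the procedure the authors used. The function name `critical_r` and the example sample sizes are illustrative only.

```python
from math import sqrt, tanh
from statistics import NormalDist


def critical_r(n: int, alpha: float = 0.01) -> float:
    """Approximate smallest positive r significant at `alpha` (one-tailed).

    Fisher z approximation: atanh(r) * sqrt(n - 3) is treated as standard
    normal under the null hypothesis of zero correlation.
    """
    if n <= 3:
        raise ValueError("n must exceed 3 for the Fisher z approximation")
    z_crit = NormalDist().inv_cdf(1.0 - alpha)  # upper-tail critical value
    return tanh(z_crit / sqrt(n - 3))


# Illustrative sample sizes: the minimum reported (15) and two of the
# order seen for single-center and pooled measures (assumed examples).
for n in (15, 25, 152):
    print(n, round(critical_r(n), 2))
```

Under this approximation the threshold is about 0.19 for n = 152 but about 0.46 for n = 25, which may help explain why moderately large coefficients based on small single-center samples can fail to reach significance.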
Discussion

The THQ appears to be a simple, reliable, and valid measure of patient satisfaction with treatment modalities employed in chronic pain management programs. It is convenient, is easily understood by patients, and takes very little time to administer and score. The test has good face validity, and patients with limited educational levels have had little difficulty in understanding its format. Furthermore, the format allows multiple treatment modalities to be
rated quickly and without significant order effects. The finding that a given patient often gives varying ratings to different treatment modalities is important; this variability allows administrators and providers to assess responses to specific treatments offered. A format that allows patients to rate treatments as harmful also appears to be important, especially given findings suggesting significant impact of dissatisfied ‘clients’ on the success of a business (e.g. Luecke et al. 1991).

Data regarding means and standard deviations on treatment modalities represented by THQ items reveal a fairly limited range of mean scores, especially given the possible range of 10 points on the THQ; however, the significant variability of scores on THQ items reveals that different patients rate the helpfulness of different treatment modalities very differently. This variability will lend itself well to future investigations of the types of subjects and conditions that lead to different levels of satisfaction with different types of treatment listed on the THQ.

The ease of scoring the THQ is suggested by its high interscorer reliability. Indeed, the reliability would have been even higher if a few subjects had not used checkmarks or highly slanted lines in rating treatment modalities. (Instructions now explicitly request subjects not to make such checkmarks or slanted lines.) The test-retest reliability is quite adequate, and the absence of order effects allows comparability of items regardless of placement on the page.

Factor-analytic data also support the validity of the THQ. Data were collected on a large number of subjects at posttreatment (n = 294) and at follow-up (n = 241), and results were similar at these two time points. At both, the first three factors loaded clearly on three major components of pain management programs: exercise, psychological-educational interventions, and medical interventions and visits.
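The factor interpretation above can be retraced from the printed loadings. The sketch below groups the follow-up items of Table VII by the factors on which they load saliently. The cutoff of 0.50 is an assumption chosen for illustration (the paper does not state a cutoff), and the helper name `salient_items` is hypothetical.

```python
# Follow-up loadings as printed in Table VII: (Factor 1, 2, 3, 4) per item.
LOADINGS = {
    "MD office visits": (0.86, 0.16, 0.11, 0.03),
    "Overall medical": (0.88, 0.15, 0.09, -0.01),
    "Drug prescriptions": (0.73, 0.23, -0.04, 0.18),
    "Group counseling": (0.20, 0.86, 0.10, -0.05),
    "Patient education groups": (0.19, 0.85, 0.07, 0.08),
    "Relaxation therapy": (0.15, 0.60, 0.11, 0.35),
    "Overall psychological": (0.55, 0.54, 0.13, -0.05),
    "Group exercise": (-0.02, 0.09, 0.89, 0.10),
    "Individual exercise": (-0.09, 0.12, 0.88, 0.13),
    "Trigger point injections": (0.02, -0.03, 0.00, 0.88),
    "Overall physical therapy": (0.29, 0.30, 0.30, 0.52),
    "Individual psychological therapy": (0.12, 0.34, 0.43, -0.09),
}


def salient_items(loadings, cutoff=0.50):
    """Map each factor label to the items whose |loading| meets the cutoff."""
    n_factors = len(next(iter(loadings.values())))
    groups = {f"Factor {k + 1}": [] for k in range(n_factors)}
    for item, row in loadings.items():
        for k, value in enumerate(row):
            if abs(value) >= cutoff:
                groups[f"Factor {k + 1}"].append(item)
    return groups


for factor, items in salient_items(LOADINGS).items():
    print(factor, "->", ", ".join(items))
```

With this cutoff, Factor 1 gathers the medical items, Factor 2 the psychological and educational items, and Factor 3 the two exercise items, in line with the description of the follow-up solution; "Overall psychological" appears under both Factor 1 and Factor 2 because it cross-loads.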
This study provided a massive number of correlations among THQ items and between such items and a variety of treatment outcome measures. Interpretation of individual correlations must be made with caution, as one would expect 1% of these correlations to be significant by chance alone; however, the finding of such a large number of significant correlations could not logically have occurred by chance alone, and was expected given the significant overlap in the nature and underlying philosophies of many treatments listed on the THQ. Furthermore, the finding that correlations were higher between more similar types of treatments, or when one treatment was a component of another, supports the validity of the THQ. Divergent validity was demonstrated by the lack of significant correlations among more dissimilar items, such as between injections (which played only an ancillary role at both centers) and ratings of either the whole program or psychological and educational interventions. Data regarding relationships between THQ items and treatment outcome measures also need to be interpreted with caution, as outcome reports are based on self-report
data rather than on direct behavioral observation. As with all correlational data, causation cannot be inferred, so it is not possible to ascertain whether satisfaction with treatment creates reports of better outcome or vice versa. Though correlations generally were modest in magnitude, the large number that were significant at the 0.01 level supports the validity of the THQ. Consistent with expectations, correlations tended to be highest when global helpfulness ratings were compared with general outcomes. After all, the programs at both centers were structured so that multiple types of treatment would contribute to the major goals of improved function and coping with pain.

The finding of many significant negative correlations between pain levels and THQ scores at both posttreatment and follow-up was not surprising. Many subjects gave reduced pain level as their major goal for treatment; thus their success in reaching that goal likely influenced their perceptions of the helpfulness of many treatment modalities. Similarly, the finding of many significant correlations between the THQ score for activity increase and so many outcome variables reflects the fundamental role of increased activity in the two pain management programs studied.

It is difficult to interpret the large number of low correlations of THQ items versus posttreatment BDI score, follow-up MQS score and work status, and hospitalization and surgery after treatment. Because relatively few subjects were hospitalized or had surgery related to their pain, the results from one or two subjects could have been very influential. The relatively brief follow-up interval of 3-6 months may have been insufficient to evaluate return to work. Furthermore, this outcome likely depended on factors besides those addressed in the program, including legal and disability issues and local availability of employment.
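The chance argument made earlier in the Discussion can be put in rough numbers. The sketch below computes how many correlations would be expected to reach the 0.01 level by chance, using the counts of reported coefficients given in the table footnotes (224 minus 52 omitted for Table VIII, 252 minus 10 omitted for Table IX), and the binomial probability of seeing at least some number of significant results. It assumes the tests are independent, which they are not here because many items and outcomes overlap, so the tail probability is only a loose guide. The observed count of 20 is a hypothetical value for illustration, and the function names are not from the paper.

```python
from math import comb


def expected_by_chance(n_tests: int, alpha: float = 0.01) -> float:
    """Expected number of significant results if every null hypothesis holds."""
    return n_tests * alpha


def prob_at_least(k: int, n_tests: int, alpha: float = 0.01) -> float:
    """P(X >= k) for X ~ Binomial(n_tests, alpha), assuming independent tests."""
    return sum(
        comb(n_tests, i) * alpha**i * (1.0 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )


reported = {"Table VIII": 224 - 52, "Table IX": 252 - 10}
hypothetical_significant = 20  # illustrative count, not taken from the tables

for table, n_tests in reported.items():
    print(
        table,
        "expected by chance:", round(expected_by_chance(n_tests), 2),
        "P(at least 20):", prob_at_least(hypothetical_significant, n_tests),
    )
```

Roughly two significant coefficients per table would be expected by chance, so a count in the tens would be very unlikely under independence; the positive dependence among the tests means the true probability is larger than this calculation suggests.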
The low mean THQ score at follow-up for drug withdrawal (0.70) may help to explain low correlations between MQS scores and THQ items. Low correlations between posttreatment THQ and BDI scores may relate in part to the finding that the latter only changed by a mean of 1.77 points during the relatively brief period of treatment. One wonders whether correlations would have been higher had the BDI been administered at follow-up, when more time might have allowed better evaluation of the impact of the program on depression level.

As noted, expectations that ratings of treatment modalities would be specifically correlated with the outcomes those treatments were designed to produce were met only partially. In interdisciplinary treatment programs with consistent philosophies of care, many treatments and their interactions may help to produce outcomes, thus making it difficult to isolate the effects of a given modality.

Despite some inconsistent results, the overall findings of many positive and significant correlations between THQ items and outcomes suggest that the questionnaire may be helpful for administrative and program personnel to evaluate their programs treatment by treatment, especially if they can make meaningful comparisons of THQ scores with
those of other programs. Such comparisons would require similar patient populations and definitions of THQ items as well as comparable procedures for measurement. One possible avenue of improved outcome in chronic pain management programs is modeling of treatment modalities after those that yield high THQ scores in comparable programs.

Some limitations of the THQ and caveats regarding its use must be emphasized. Though the format of the test might lend itself to measurement of helpfulness in multidisciplinary programs for problems other than pain, interpretation of data from this study must be confined to pain centers, as all reliability and validity studies were performed at such centers. Furthermore, the THQ addresses only one aspect of patient satisfaction, and that is with the particular treatment modalities that served as items on the questionnaire. It does not address the reasons a given patient found one modality helpful and another not. Other questions regarding satisfaction and opportunities for patient comments may help us obtain a fuller picture of patient perceptions of treatments.

Because different pain centers provide different treatment modalities and label and define them differently, we purposely did not attempt to define what modalities should be listed on the THQ. Depending on their purpose, scientists and clinicians may want to measure the perceived helpfulness levels of many types of treatment modalities. Obviously, the validity of any THQ item necessitates its being well defined; patients completing the questionnaire (and those interpreting the results) need to know exactly what is meant by each item. To ensure clarity, items must be labeled with care, and some additional definition may be needed. For example, at Center A, more patients initially rated ‘biofeedback therapy’ than actually had experienced it. Some patients had assumed from talking with other patients that biofeedback meant the same as relaxation.
Adding the words ‘using the biofeedback machine’ eliminated the problem. Obviously, interpretation of THQ results also necessitates clear instruction to patients as to whether they are to rate treatment modalities only as they experienced them in a given program, or more generally.
Acknowledgements

The authors wish to thank Fran Chewning for her work in preparing tables for this manuscript, a task which demanded complex integration of computer software programs. Also, special thanks to Jaylyn Olivo who reviewed this manuscript.
References

American Pain Society Commission on Quality Assurance Standards, American Pain Society quality assurance standards for relief of acute pain and cancer pain. In: M.R. Bond, J.E. Charlton and C.J. Woolf (Eds.), Proc. VIth World Congress on Pain, Elsevier, Amsterdam, 1991, pp. 185-189.
Beck, A.T. and Beck, R.W., Screening depressed patients in family practice: a rapid technic, Postgrad. Med., 32 (1972) 81-85.
Bergner, M., Bobbit, R.A., Carter, W.B. and Gilson, B.S., The sickness impact profile: development and final version of a health status measure, Med. Care, 19 (1981) 787-805.
Brown, S.W., Nelson, A.M., Bronkesh, S.J. and Wood, S.D., Patient Satisfaction Pays: Quality Service for Practice Success, Aspen, Gaithersburg, MD, 1993.
Chapman, S.L., Behavior modification. In: S.F. Brena and S.L. Chapman (Eds.), Management of Patients with Chronic Pain, Spectrum, New York, 1983, pp. 145-159.
Commission on Accreditation of Rehabilitation Facilities, Program Evaluation in Chronic Pain Programs, Author, Tucson, 1987.
Commission on Accreditation of Rehabilitation Facilities, 1995 Standards Manual for Organizations Serving People with Disabilities, Author, Tucson, 1995.
Fitzpatrick, R., Bury, M., Frank, A. and Donnelly, T., Problems in the assessment of outcome in a back pain clinic, Int. Rehabil. Stud., 9 (1987) 161-165.
Fitzpatrick, R., Hopkin, A. and Harvard-Watts, O., Social dimensions of healing: a longitudinal study of outcomes of medical management of headaches, Soc. Sci. Med., 17 (1983) 501-510.
Kincey, J., Bradshaw, P. and Ley, P., Patients' satisfaction and reported acceptance of advice in general practice, J. R. Coll. Gen. Pract., 25 (1975) 558-566.
Luecke, R.W., Rosselli, V.R. and Moss, J.M., The economic ramifications of 'client' dissatisfaction, Group Pract. J., May/June (1991) 8-18.
Novick, M.R. and Lewis, C., Coefficient alpha and the reliability of composite measurements, Psychometrika, (1967) 1-13.
Scott, J. and Huskisson, E.L., Graphic representation of pain, Pain, 2 (1976) 175-184.
Steedman, S.M., Middaugh, S.J., Kee, W.G., Carson, D.S., Harden, R.N. and Miller, M.C., Chronic pain medications: equivalence levels and method of quantifying usage, Clin. J. Pain, 8 (1992) 204-214.
Winer, B.J., Statistical Principles in Experimental Design, McGraw-Hill, New York, 1962.