Houghton Scale of Prosthetic Use in People With Lower-Extremity Amputations: Reliability, Validity, and Responsiveness to Change

Michael Devlin, MD, FRCPC, Tim Pauley, MSc, Kris Head, BPE, BScPT, MSc, Susan Garfinkel, MSc

ABSTRACT. Devlin M, Pauley T, Head K, Garfinkel S. Houghton Scale of prosthetic use in people with lower-extremity amputations: reliability, validity, and responsiveness to change. Arch Phys Med Rehabil 2004;85:1339-44.

Objective: To evaluate the responsiveness to change and the floor and ceiling effects of the Houghton Scale.

Design: One-week and 3-month test-retest to evaluate reliability, validity, and responsiveness to change.

Setting: Amputee rehabilitation program.

Participants: Persons (N=125) with unilateral or bilateral lower-extremity amputation who were wearing a prosthesis: 1 group (n=49) for the reliability component and another group (n=76) for the responsiveness and validity component.

Interventions: Not applicable.

Main Outcome Measures: Responsiveness to change, ceiling and floor effects, and reliability and convergent validity.

Results: Evaluation of responsiveness to change (n=76) showed that the total score increased from a mean ± standard deviation of 6.14±2.40 at discharge to 7.70±2.62 (P<.001) at follow-up 3 months later. Floor and ceiling effects were not detected for the overall score but were noted for the individual subscales. The internal consistency was moderate at discharge (Cronbach α=.71) and follow-up (Cronbach α=.70). The Houghton Scale correlated significantly, although moderately, with the physical composite score of the Medical Outcomes Study 36-Item Short-Form Health Survey (r=.393, P<.01) and the 2-minute walk test at admission (r=.620, P<.01) and discharge (r=.653, P<.01). The reliability (intraclass correlation coefficient=.96) of the Houghton Scale was high (n=49).
Conclusions: The Houghton Scale is appropriately responsive to change in prosthetic use in individuals with lower-limb amputation after rehabilitation.

Key Words: Amputation; Amputees; Prostheses and implants; Rehabilitation; Treatment outcomes.

© 2004 by the American Congress of Rehabilitation Medicine and the American Academy of Physical Medicine and Rehabilitation

THERE IS A NEED for a quantitative tool that will quickly and reliably measure prosthetic use by people with amputations. In a busy clinical practice, such a tool must be one that
From the Division of Physiatry, Department of Medicine, University of Toronto, Toronto, ON (Devlin); Clinical Evaluation and Research Unit (Pauley), West Park Healthcare Centre, Toronto, ON (Devlin); Village Square Sport Physiotherapy, Calgary, AB (Head); and Institute for Clinical Evaluative Sciences, Toronto, ON (Garfinkel), Canada. No commercial party having a direct financial interest in the results of the research supporting this article has or will confer a benefit upon the author(s) or upon any organization with which the author(s) is/are associated. Correspondence to Michael Devlin, MD, West Park Healthcare Centre, 82 Buttonwood Ave, Toronto, ON M6M 2J5, Canada, e-mail:
[email protected]. Reprints are not available from the authors. 0003-9993/04/8508-8404$30.00/0 doi:10.1016/j.apmr.2003.09.025
can be administered in a short period of time or it will not be functionally useful. Such a tool would be used for routine clinical follow-up, program evaluation, and research in which prosthetic use is a variable of interest.

The Houghton Scale1,2 is an instrument that looks solely at prosthetic use in people with lower-extremity amputations; it reflects a person’s perception of prosthetic use, rather than a health care provider’s viewpoint, and it consists of 4 questions (appendix 1). It is quickly administered and easy to score. The Houghton Scale has been compared with the Prosthesis Evaluation Questionnaire (PEQ) and the Locomotor Capabilities Index of the Prosthetic Profile of the Amputee (PPA).3 In that comparison, the Houghton Scale had modest internal consistency and good test-retest reliability, and it was the only scale tested that could discriminate between people with transfemoral and transtibial amputations. It was believed to have appropriate face validity. As yet, however, no data exist about its responsiveness to change.

Other tools that measure prosthetic use have been developed that could be used to assess rehabilitation outcome in this population. The PPA questionnaire has been validated as a reliable tool.4 This questionnaire has 44 closed and open-ended questions, which makes it a lengthy instrument to administer. This tool asks not only about prosthetic use, such as wearing time, walking aids used, and ability to don the prosthesis, but also about factors that influence prosthetic use, which the authors termed “enabling” factors. These factors include cosmetic acceptance by the person with the amputation as well as by his/her family, ease of access to a facility that fits prosthetics, living arrangements, and vocational status.
The PEQ is another tool that has had its psychometric properties explored.5 This instrument has 83 questions and combines issues of prosthetic use with comfort, cosmesis, satisfaction with the prosthesis, as well as health-related quality of life (HRQOL) components. Therefore, it does not purely answer questions about prosthetic use. It is lengthy to administer and score because most answers are ranked on a linear analog scale.

There have been several other methods described in the literature that attempt to address prosthetic use; in general they consist of a variety of quantitative questions. None has had its psychometric properties investigated, none is in common use, and many include aspects other than prosthetic use.

Considering the limitations of the various prosthetic outcome scales that have been used, we concluded that the Houghton Scale has the potential to be the most useful for clinical follow-up and program evaluation, and therefore needed to be assessed for its responsiveness to change in patient status. This prospective, single-center study had the following objectives: to replicate the test-retest reliability and convergent validity findings of Miller et al,3 to evaluate the scale’s responsiveness to change in patient status, and to evaluate its floor and ceiling effects.

Arch Phys Med Rehabil Vol 85, August 2004
HOUGHTON SCALE RESPONSIVENESS TO CHANGE, Devlin

Table 1: Subject Demographics for Samples 1 and 2

Characteristic               Sample 1 (n=49)   Sample 2 (n=76)
Mean age ± SD (y)            60.9±16.8         65.5±13.6
Men                          26 (53%)          58 (76.3%)
Unilateral amputation        43 (87.7%)        27 (35.5%)
  Transfemoral               10                8
  Transtibial                33                19
Bilateral amputation         4 (8.2%)          46 (60.5%)
  Bilateral transtibial      4                 26
Partial foot                 2 (4.1%)          3 (4.0%)
Reason for amputation
  Diabetes and PVD           26 (54%)          62 (82%)

Abbreviations: PVD, peripheral vascular disease; SD, standard deviation.
METHODS

Participants

A convenience sample of participants was recruited from our regional amputee rehabilitation program. Study participation was open to English-speaking patients who were attending our clinic for routine follow-up, who had either a unilateral or bilateral amputation (transtibial or transfemoral), and who had been fitted with a prosthesis. (The purpose of the prosthetic fitting could have been for ambulation or facilitating transfers.) To be eligible for the study, participants had to be medically stable. In addition, they had to be free of complaints, or to be reporting complaints for which no intervention was offered, with respect to their prosthesis.

Two participant samples were used in the study. Data from the first sample were used to establish the test-retest reliability of the Houghton Scale, whereas data from the second sample were used to evaluate its responsiveness to change, internal consistency, and validity. Sample 1 consisted of 49 participants (26 men) with a mean age at evaluation of 60.9±16.8 years. Sample 2 consisted of 76 participants (58 men) with a mean age of 65.5±13.6 years at evaluation. People with bilateral amputations were evaluated after their most recent amputation. Patient demographics for both samples are presented in table 1. Sample size calculations were not performed because there were no available data at the inception of this study with which to do such a calculation. The research ethics review committee of our institution approved the research protocol.

Procedures

After obtaining informed consent, basic demographic data were collected. People enrolled in sample 1 were attending a routine follow-up clinic. Later that same day, a research assistant contacted participants at their homes by telephone to administer the Houghton Scale questionnaire. One week later, the same research assistant contacted them at home and readministered the questionnaire.
On both occasions the research assistant followed a scripted dialogue so as to maintain consistency in questionnaire administration.

Subjects assigned to sample 2 were recruited at the time of discharge from their initial rehabilitation program after amputation. Data, including the Houghton Scale, the 2-minute walk test6 (2MWT), and the Medical Outcomes Study 36-Item Short-Form Health Survey7 (SF-36), were collected in this group at discharge and at 3-month follow-up.

Evaluation

Houghton Scale. Participants were evaluated for prosthetic use and functional capacity. Prosthetic use was ascertained with the Houghton Scale questionnaire.1,2 This is a self-administered tool consisting of 4 items. The first 3 items are scored on a 4-point scale and attempt to capture prosthetic-wearing habits; the fourth question has 3 dichotomous (yes/no) items that assess a patient’s comfort level when negotiating different outdoor surfaces. Results are reported as a total score out of 12, with higher scores indicating greater performance and greater comfort.

The 2MWT. To provide comparative data with which to investigate the validity of the Houghton Scale questionnaire, sample 2 subjects also completed the 2MWT. This test is designed to assess mobility and requires the subject to walk along a long, straight corridor for 2 minutes. The distance covered in the allotted time is taken as the measurement of performance. This test has been shown to be responsive to change and to correlate moderately with the Houghton Scale (r=.493, P<.05) and with the physical function component of the SF-36 (r=.479, P<.001).

The SF-36. The SF-36 is a self-administered questionnaire that measures HRQOL.7 It has 36 questions concerning physical functioning, physical role, bodily pain, general health, emotional role, social function, mental health, and vitality. The first 4 domains comprise the physical composite score (PCS), whereas the latter 4 make up the mental composite score (MCS). Although the SF-36 is widely used in rehabilitation settings, it has not been validated in the amputee population.7

Analysis

Analyses were conducted to determine the responsiveness to change, floor and ceiling effects, test-retest reliability, internal consistency, and convergent validity of the Houghton Scale. Descriptive statistics were calculated for demographic data and for individual and total item scores on the scale.

Responsiveness to change.
To assess responsiveness to change, the Wilcoxon matched-pairs signed-rank test was used for ordinal level data for items 1, 2, 3, and 4 (total) to detect any significant changes in scores from discharge to follow-up 3 months later. Chi-square analysis was used to compare discharge and follow-up scores for nominal level data for items 4a, 4b, and 4c. A paired samples t test was used to compare total score and 2MWT results at both time points.

Floor and ceiling effects. Floor and ceiling effects were calculated as a percentage of responses at the bottom or top, respectively, of the range of the scale for each item. Scales ranged from 0 to 3 for items 1 through 3, from 0 to 1 for items 4a, 4b, and 4c, and from 0 to 12 for the total score.

Reliability. The Cronbach α was used to determine internal consistency, whereas an intraclass correlation coefficient (ICC) was calculated to examine test-retest reliability for total score. The Kendall τ-b statistic was used to assess test-retest agreement for items 1 through 3, whereas the Cohen κ statistic8 was used to determine agreement for items 4a, 4b, and 4c.

Validity. Results of the Houghton Scale were correlated with data from the PCS and MCS of the SF-36 as well as from the 2MWT to investigate convergent validity. The strength of correlations was interpreted as follows: correlations ranging from 0.0 to .20 were considered to be negligible, .21 to .35 to be weak, .36 to .50 to be moderate, and those exceeding .50 were considered strong.9 Discriminant validity was evaluated by comparing Houghton scores of transfemoral versus transtibial and unilateral versus bilateral participants.

All analyses were conducted by using SPSS, version 10.0,a for Windows.
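The interpretation bands for correlation strength given in the Validity paragraph can be written as a small lookup. The sketch below is illustrative only: the function name `correlation_strength` is hypothetical, and applying the bands to the absolute value of r is an assumption, since the text lists only positive ranges.

```python
def correlation_strength(r: float) -> str:
    """Label a correlation using the cut-offs quoted in the Validity paragraph.

    Assumption: the bands are applied to |r|; the paper lists only
    positive ranges (0.0-.20, .21-.35, .36-.50, above .50).
    """
    magnitude = abs(r)
    if magnitude <= 0.20:
        return "negligible"
    if magnitude <= 0.35:
        return "weak"
    if magnitude <= 0.50:
        return "moderate"
    return "strong"


# Two coefficients quoted in the paper, used here purely as inputs.
for label, r in [("Houghton vs PCS", 0.393), ("Houghton vs MCS", 0.235)]:
    print(f"{label}: r={r:.3f} -> {correlation_strength(r)}")
```

The bands classify magnitude only; statistical significance is a separate question that depends on sample size.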
Table 2: Houghton Scale Discharge and Follow-Up Scores (n=76)

                   Discharge                              Follow-Up
Item      Range    Central     Floor       Ceiling       Central     Floor       Ceiling
                   Tendency*   Effect,     Effect,       Tendency*   Effect,     Effect,
                               n (%)       n (%)                     n (%)       n (%)
1         0–3      2           3 (3.9)     21 (27.6)     3           9 (11.8)    43 (56.6)
2         0–3      2           2 (2.6)     17 (22.4)     3           2 (2.6)     59 (77.6)
3         0–3      1           22 (28.9)   1 (1.3)       1           14 (18.4)   5 (6.6)
4a        0–1      1           8 (9.0)     81 (91.0)     1           13 (14.6)   76 (85.4)
4b        0–1      0           42 (47.2)   47 (52.8)     0           55 (61.8)   34 (38.2)
4c        0–1      0           60 (67.4)   29 (32.6)     0           57 (64.0)   32 (36.0)
4 Total   0–3      2           7 (9.2)     22 (28.9)     1           8 (10.5)    18 (23.7)
Total     0–12     6           0           1 (1.3)       8           1 (1.3)     1 (1.3)

*Median score presented for items 1, 2, 3, 4 total, and total. Mode presented for items 4a, 4b, and 4c.
RESULTS

Demographics

The group 1 participants were long-standing prosthetic users, with a nearly equal mix of men and women. Half had their amputation for diabetes mellitus and peripheral vascular disease (PVD), and most were unilateral transtibial amputees. Group 2 participants were slightly older, most were men, and their amputations resulted from complications of diabetes and PVD. Bilateral transtibial amputation was the most common level in this group (table 1).

Responsiveness to Change

In considering a given scale’s sensitivity to detect significant change in function as an indicator of underlying change in clinical status, it is necessary to consider both the appropriateness and the responsiveness of the scale.10

Appropriateness. Appropriateness refers to the extent to which the range of disabilities covered by the scale approximates that of the study sample. In assessing the appropriateness of a measurement scale, it is desirable for the measure of central tendency to approach the midpoint of the distribution. The range should show the variability in the study sample, indicating a greater capacity for discrimination between subjects. Minimal floor and ceiling effects are preferred; floor and ceiling effects exceeding 20% are interpreted as an inherent inability to discriminate subjects at an acceptable level.11

Floor and ceiling effects of individual items of the Houghton Scale must be interpreted with caution. Three of 4 items have a range of 0 to 3, whereas the fourth item consists of 3 dichotomous subcomponents. By chance alone, the ceiling and floor effects of the first 3 items will each be 25%, whereas those for the individual components of the fourth item will each be 50%. However, taken as a whole, the Houghton Scale has a range of 0 to 12, allowing an appropriate assessment of floor and ceiling effects.
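The floor and ceiling calculation described above, together with the chance-level argument, can be sketched in a few lines. This is an illustrative sketch, not the authors' SPSS procedure: the function names and the example scores are hypothetical, and "chance level" here assumes responses spread uniformly over the answer categories, which is how the 25% and 50% figures in the text arise.

```python
from typing import Sequence, Tuple


def floor_ceiling(scores: Sequence[int], lowest: int, highest: int) -> Tuple[float, float]:
    """Return (floor %, ceiling %): share of responses at the bottom and top of the range."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor_pct = 100.0 * sum(1 for s in scores if s == lowest) / n
    ceiling_pct = 100.0 * sum(1 for s in scores if s == highest) / n
    return floor_pct, ceiling_pct


def chance_level(n_categories: int) -> float:
    """Expected % at either extreme if answers were uniform over the categories."""
    return 100.0 / n_categories


def exceeds_threshold(pct: float, threshold: float = 20.0) -> bool:
    """Flag an effect larger than the 20% criterion cited in the text."""
    return pct > threshold


# Hypothetical responses to one 0-3 item (not study data).
hypothetical_item_scores = [0, 0, 1, 2, 3, 3, 3, 1]
floor_pct, ceiling_pct = floor_ceiling(hypothetical_item_scores, lowest=0, highest=3)
print(floor_pct, ceiling_pct, exceeds_threshold(floor_pct))
print(chance_level(4), chance_level(2))  # 4-point item vs dichotomous item
```

Because a 4-category item already sits at 25% per extreme under uniform responding, the 20% criterion is only informative for the 0 to 12 total score, which is the point the paragraph above makes.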
Table 2 shows the median scores for items 1, 2, 3, 4 total, and total score, as well as the modal scores for items 4a, 4b, and 4c taken at discharge and follow-up. Floor and ceiling effects on the individual items were notable (as expected), although nearly absent for the overall score.

Responsiveness. Responsiveness refers to the scale’s ability to detect clinically relevant change. From discharge to follow-up, the mean Houghton Scale score for sample 2 increased from 6.14±2.40 to 7.70±2.62 (t75=2.14, P<.001), resulting in a mean score change of 1.55±2.75. The effect size calculated for this change was .60, indicating a moderate difference.8
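For readers who want to see how an effect size for change is typically derived from summary statistics, the sketch below divides the reported mean change by a standard deviation. The paper does not state which denominator was used for its reported value of .60, so both common conventions are shown as assumptions: the baseline (discharge) SD, and the SD of the change scores (often called the standardized response mean). The function name is hypothetical.

```python
def standardized_change(mean_change: float, sd: float) -> float:
    """Mean change divided by a chosen standard deviation."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return mean_change / sd


# Summary statistics reported for sample 2 (all items).
mean_change = 1.55   # mean score change, discharge to follow-up
sd_discharge = 2.40  # SD of discharge scores
sd_change = 2.75     # SD of the change scores

# Assumption 1: denominator is the baseline SD.
es_baseline = standardized_change(mean_change, sd_discharge)
# Assumption 2: denominator is the SD of change (standardized response mean).
srm = standardized_change(mean_change, sd_change)

print(round(es_baseline, 2), round(srm, 2))
```

Under these two assumptions the results fall on either side of the reported .60; the exact published figure may reflect unrounded inputs or a different denominator.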
Table 3 shows the distribution of positive and negative ranks and ties for items 1, 2, 3, and 4 total. A positive rank indicates an increase in score from discharge to follow-up. With scales where improvement is indicated by an increase in score, it is desirable to have a greater proportion of positive over negative ranks in a Wilcoxon signed-rank test table. In addition, a P value is included to indicate level of significance for each change. From discharge to follow-up, participants demonstrated a significant improvement on items 1, 2, and 3, whereas item 4 total showed no change.

Responsiveness to change of items 4a, 4b, and 4c was assessed with the chi-square statistic. A statistically significant difference was not detected from discharge to follow-up for any of these items.

A post hoc analysis was conducted to further examine the apparent nonresponsiveness of item 4 and its subcomponents. To this end, 1-week follow-up responses from sample 1 were compared with follow-up responses from sample 2. Chi-square analysis revealed a significantly higher proportion of “yes” responses (indicating a sense of instability) among sample 1 patients than among sample 2 patients for item 4a (χ²(1, n=125)=6.86, P<.01). This was similar for item 4c, with sample 1 patients reporting a significantly higher proportion of yes responses (χ²(1, n=125)=5.02, P<.05). There was no significant difference between the 2 samples for item 4b. A Mann-Whitney U test for 2 independent samples revealed a significant difference between item 4 total scores, with a mean rank score of 55.2 for sample 1 and 68.1 for sample 2 (P<.05). Mean scores for item 4 were 1.2 for sample 1 and 1.6 for sample 2. These results suggest that sample 2 patients perceived fewer episodes of instability when walking outside.
Table 3: Wilcoxon Analysis Comparing Discharge With 3-Month Follow-Up Results for the 4 Components of the Houghton Scale (n=76)

Item       Positive Ranks*   Negative Ranks†   Ties‡      P Value
1          33 (43%)          16 (21%)          27 (36%)   <.05
2          48 (63%)          5 (7%)            23 (30%)   <.001
3          40 (53%)          12 (16%)          24 (32%)   <.001
4 total    21 (28%)          26 (34%)          29 (38%)   NS

Abbreviation: NS, not significant.
*Follow-up score greater than discharge score.
†Follow-up score less than discharge score.
‡Follow-up score equal to discharge score.
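The counts in Table 3 can be checked for internal consistency, and the direction of change can be roughly examined, with nothing more than the tabulated numbers. The sketch below recomputes the percentages and runs an exact two-sided sign test on the non-tied pairs. Note the substitution: this is a sign test, not the Wilcoxon signed-rank test the authors ran, because the paired raw scores needed for rank-based testing are not published. The sign test ignores the size of each change, so its P values will not match the table exactly. Names such as `TABLE_3` and `sign_test_p` are hypothetical.

```python
from math import comb

# (positive ranks, negative ranks, ties) copied from Table 3, n = 76.
TABLE_3 = {
    "1": (33, 16, 27),
    "2": (48, 5, 23),
    "3": (40, 12, 24),
    "4 total": (21, 26, 29),
}


def sign_test_p(positive: int, negative: int) -> float:
    """Exact two-sided sign test on non-tied pairs (ties are dropped)."""
    n = positive + negative
    if n == 0:
        return 1.0
    k = max(positive, negative)
    # P(X >= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


for item, (pos, neg, ties) in TABLE_3.items():
    total = pos + neg + ties
    shares = [round(100 * count / total) for count in (pos, neg, ties)]
    print(f"item {item}: n={total}, % pos/neg/ties={shares}, "
          f"sign-test P={sign_test_p(pos, neg):.4f}")
```

Under this cruder test the pattern is the same as in the table: items 1 through 3 shift toward higher scores, whereas item 4 total shows no clear direction.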
Table 4: Comparison of Results With and Without Item 4 Included in Analysis

Index                               Variable                      All Houghton Items Included     Items 1, 2, and 3 Only Included
Appropriateness (discharge)         Median                        6                               4
                                    Floor effect                  n=0                             n=0
                                    Ceiling effect                n=1 (1.3%)                      n=1 (1.3%)
Appropriateness (follow-up)         Median                        8                               7
                                    Floor effect                  n=1 (1.3%)                      n=2 (2.6%)
                                    Ceiling effect                n=1 (1.3%)                      n=3 (3.9%)
Responsiveness                      Discharge score               6.14±2.40                       4.42±1.78
                                    Follow-up score               7.70±2.62                       6.09±2.10
                                    Change                        1.55±2.75‡                      1.67±2.13‡
                                    Effect size                   .60                             .79
Test-retest reliability             ICC                           r=.96                           r=.95
Internal consistency                Discharge                     α=.71                           α=.70
                                    Follow-up                     α=.70                           α=.71
Convergent validity (discharge)     Houghton and PCS              r=.393†                         r=.397†
                                    Houghton and MCS              r=.235 (NS)                     r=.273*
                                    Houghton and 2MWT             r=.620†                         r=.572†
Convergent validity (follow-up)     Houghton and 2MWT             r=.653†                         r=.603†
Discriminant validity (discharge)   Transfemoral vs transtibial   4.88±1.46 vs 6.84±2.59*         3.38±1.30 vs 5.00±1.67*
Discriminant validity (follow-up)   Transfemoral vs transtibial   6.75±3.45 vs 8.05±2.37 (NS)     4.88±2.47 vs 6.42±2.06 (NS)
Discriminant validity (discharge)   Unilateral vs bilateral       6.26±2.46 vs 5.91±2.35 (NS)     4.52±1.72 vs 4.22±1.72 (NS)
Discriminant validity (follow-up)   Unilateral vs bilateral       7.67±2.37 vs 7.61±2.63 (NS)     5.96±2.26 vs 6.09±2.04 (NS)
Discriminant validity (discharge)   Age                           r=−.024 (NS)                    r=.060 (NS)
Discriminant validity (follow-up)   Age                           r=.150 (NS)                     r=.129 (NS)

NOTE. Values are mean ± SD.
*P<.05.
†P<.01.
‡P<.001.
Test-Retest Reliability

Forty-nine participants (sample 1) were recruited for this part of the study. Test-retest reliability estimates (Kendall τ-b) for items 1, 2, and 3 were .743, .688, and 1.000, respectively. All were statistically significant (P<.01). Agreement was high for items 4a, 4b, and 4c, with κ values of .712, .824, and .453. All were statistically significant (P<.001). The ICC for total score test-retest was .96 (95% confidence interval, .92–.97).

Internal Consistency

An evaluation of internal consistency revealed Cronbach α values similar to those found previously.3 In this study, α was .71 at discharge and .70 at follow-up.

Convergent Validity

Of the 76 participants recruited to assess internal consistency (sample 2), SF-36 data were available for 55 subjects at discharge. 2MWT data were available for 61 at discharge and 56 at 3-month follow-up. There were insufficient SF-36 data available at 3-month follow-up for an analysis of validity.

The Houghton total score correlated moderately well with the PCS of the SF-36 and the 2MWT, but poorly with the MCS. The correlation between Houghton score and the PCS was .393 (n=55) at discharge (2-tailed, P<.01). The correlation between the Houghton score and the MCS of the SF-36 was .235 at discharge (not significant). The correlation between Houghton score and the 2MWT was .620 (n=61) at discharge (P<.01) and .653 (n=56) at 3-month follow-up (P<.01).

Of the 61 participants at discharge and the 56 at follow-up for whom we had 2MWT data, 50 had data collected at both points in time. A Wilcoxon signed-rank test revealed a significant increase in distance walked. Positive and negative ranks
and ties accounted for 92%, 6%, and 2%, respectively, of changes from discharge to follow-up (z=−5.751, P<.001).

Discriminant Validity

The Houghton Scale successfully discriminated between transfemoral and transtibial participants; however, there was no difference between unilateral and bilateral transtibial participants, nor was there any difference in Houghton score between age groups. At discharge, Houghton scores for transfemoral and transtibial participants were 4.88±1.46 and 6.84±2.59, respectively (t25=−2.502, P<.05). At follow-up, Houghton scores did not differ between these 2 groups. There were no differences in the scores between unilateral and bilateral subjects at discharge or at follow-up. There was no significant correlation between age and Houghton score, even when participants were partitioned into 3 discrete age categories (<60y, 60–75y, >75y).

Exclusion of Item 4 From Data Analysis

Given the lack of responsiveness shown by item 4 and its inconsistency with items 1, 2, 3, and total score, we conducted an additional post hoc analysis to evaluate the psychometric properties of the first 3 items of the Houghton Scale, exclusive of item 4. Table 4 compares the previously described analyses with the results of the analysis repeated after item 4 was removed. There was little change in these indices, save for effect size. When item 4 was dropped from the analysis, change in total score from discharge to follow-up increased from 1.55±2.75 to 1.67±2.13. The resultant effect size increased from .60 to .79. Median total scores at discharge and follow-up decreased, but this is simply an effect of reducing the number of items comprising the total score. There was a slight increase
in the correlation between the total Houghton score and MCS of the SF-36. This correlation failed to achieve statistical significance when item 4 was included in the total score, but it was significant (P<.05) when item 4 was excluded.

DISCUSSION

This study indicates that the Houghton Scale is appropriately responsive to change in prosthetic use by individuals with lower-limb amputation because it changes in the same direction as the 2MWT and the PCS score.

When individual items of the Houghton Scale were investigated for responsiveness to change from discharge to follow-up, Wilcoxon analysis revealed that items 1, 2, and 3 changed significantly over time. The change in total score for item 4 did not reach statistical significance. Chi-square analysis was unable to detect any change over time in the proportion of respondents indicating a feeling of instability in negotiating obstacles on items 4a, 4b, and 4c individually.

The apparent inconsistency between items 1, 2, and 3 and the individual components of item 4 may be explained by an inherent difference in the underlying construct measured by these sets of questions. According to Miller et al,3 the Houghton Scale appears to measure performance or “did do” issues. Although this is true for items 1 through 3, item 4 can best be characterized as a question of perception of ability. It may be that, as a result of improvement in performance (as attested to by an increase in scores on items 1 through 3), subjects are more mobile and hence have had a greater opportunity to be confronted by situations that have caused them concern (eg, walking on a slope). Therefore, the lack of change on item 4 may not truly reflect a lack of underlying change in level of confidence but may simply indicate that subjects have been exposed to more potentially challenging situations than they had experienced at discharge.
However, the data indicate that 81% of sample 1 subjects always wore their prostheses outside, and 53% used a single cane or no gait aid, whereas 75% of sample 2 participants wore their prostheses outside and 47% used a cane or no upper aid. These similar rates would argue against the concept that those who had their prosthesis longer (group 1) had been exposed to more challenging situations. The finding that sample 2 patients, who were assessed early in their rehabilitation and had more people with bilateral amputation (n=46) as compared with sample 1 (n=4), reported fewer issues with stability argues against the face validity of item 4.

When the analysis was rerun with item 4 excluded, neither measures of reliability nor validity changed appreciably. However, the effect size associated with change over time increased substantially, thereby improving the capacity of the Houghton Scale to detect clinically significant change.

Item 4 of the Houghton Scale appears to measure a different underlying construct than the first 3 questions, does not change appreciably over time, and does not have good face validity. Moreover, inclusion of item 4 appears to reduce the capacity of the Houghton Scale to detect change over time. We would therefore recommend that this question be eliminated from the Houghton Scale.

The analysis of the floor and ceiling effects of the Houghton Scale revealed that the total score is relatively unaffected by an excess of scores at either the bottom or top of the scale. At each point in time, the mean of the distributions approximated the midpoint between the 2 extremes, allowing the majority of scores to lie well away from the confines of the floor or ceiling. This is a particularly important feature of a measurement tool such as this. For a tool to have the capability to discriminate
between poorly functioning subjects and subjects who perform well, it is essential that the distribution of scores span a point on the scale in which variation can proceed in either the negative or positive direction. As would be anticipated, there were notable floor and ceiling effects for the individual items within the Houghton score and therefore these should not be used in isolation to assess performance.

The Houghton Scale showed good test-retest reliability over a 1-week span. The ICC for the total score was very good at .96, higher than the value of .85 found by Miller.3 The discrepancy between these 2 values may be explained by the time interval between the points when the scales were administered. In our study, patients completed the Houghton Scale on 2 occasions, 1 week apart. In Miller,3 patients completed the Houghton Scale subsequent to their regularly scheduled follow-up appointment and again 4 weeks later, when there would be a greater chance for some interval change to occur. Those authors do not, however, report the change in Houghton score from time 1 to time 2. It is possible that the shorter interval between administrations in our study explains the higher ICC.

An important property of a scale of this nature is whether it correlates well with conceptually similar measures. In this study, data from the Houghton Scale were compared with 2MWT data and the SF-36. At both discharge and follow-up, the Houghton Scale correlated significantly with the 2MWT. Similar results were found when compared with the PCS at discharge. At discharge, however, there was no correlation with the MCS. Correlation with the 2MWT and PCS argues for the convergent validity of the Houghton Scale in that the components of the latter are reflective of individual physical functioning. The lack of correlation with the MCS argues for the discriminant validity of the Houghton Scale in that each attempts to evaluate very different underlying constructs.
Other authors12 have also reported that the MCS score in traumatic amputees is no different from that in nonimpaired subjects.

Discriminant validity was further bolstered by the finding that the transfemoral group scored significantly lower than their transtibial counterparts at discharge. However, this difference was no longer apparent at follow-up. Despite the fact that Houghton scores improved significantly from discharge to follow-up in the entire group, the scores of the transfemoral and transtibial groups converged slightly. The Houghton score did not discriminate between subjects with unilateral versus bilateral transtibial amputations, nor did it show any relation to age.

The Houghton Scale does not offer information about HRQOL, fulfilment of personal roles, or other prosthetically related issues (ie, satisfaction); however, other instruments can be used if these outcomes are of interest. The use of more generic tools to measure these types of outcomes permits comparison of results from one diagnostically related group to another, whereas the use of more all-inclusive, diagnosis-specific tools limits comparison to other groups.

There are several limitations to this study. Two different methods of having the Houghton Scale completed were used: by telephone (sample 1) and by paper and pencil (sample 2), and it is not known whether these 2 methods are equivalent. There was an unexpected preponderance of participants with bilateral transtibial amputation in sample 2, which may affect the generalizability of the results. This study did not address whether the Houghton Scale actually does reflect prosthetic use by people with lower-extremity amputation; this would have to be addressed in future studies.
CONCLUSIONS

In view of these findings, we conclude that it is reasonable to use the Houghton Scale as a measure of prosthetic use, where the need is to have a simple tool that is easy and quick to administer and score. We would recommend that item 4 be dropped from the scale.

Acknowledgment: We thank Janet Parsons for her assistance with data collection and entry.
APPENDIX 1. HOUGHTON SCORE QUESTIONS

1. Do you wear your prosthesis:
   0-Less than 25% of waking hours (1–3 hours)
   1-Between 25% and 50% of waking hours (4–8 hours)
   2-More than 50% of waking hours (more than 8 hours)
   3-All waking hours (12–16 hours)

2. Do you use your prosthesis to walk:
   0-Just when visiting the doctor or limb-fitting center
   1-At home but not to go outside
   2-Outside the home on occasion
   3-Inside and outside all the time

3. When going outside wearing your prosthesis, do you:
   0-Use a wheelchair
   1-Use two crutches, two canes, or a walker
   2-Use one cane
   3-Use nothing

4. When walking with your prosthesis outside, do you feel unstable when:
   4a. Walking on a flat surface
       0-Yes
       1-No
   4b. Walking on slopes
       0-Yes
       1-No
   4c. Walking on rough ground
       0-Yes
       1-No
From: Houghton et al.1, © British Journal of Surgery Society Ltd. Reproduced with permission. Permission is granted by John Wiley & Sons Ltd on behalf of the BJSS Ltd.
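The scoring rule implied by the appendix (items 1 through 3 scored 0 to 3, items 4a through 4c scored 0 for "yes" and 1 for "no", summed to a total out of 12) can be sketched as follows. The function name, argument names, and the `include_item4` switch are hypothetical conveniences; the switch simply mirrors the paper's comparison of the full score with a score based on items 1 through 3 only.

```python
def houghton_total(item1: int, item2: int, item3: int,
                   item4a: int, item4b: int, item4c: int,
                   include_item4: bool = True) -> int:
    """Sum the Houghton Scale items as laid out in Appendix 1.

    Items 1-3 take values 0-3. Items 4a-4c take 0 (yes, feels unstable)
    or 1 (no). The full total ranges 0-12; without item 4 it ranges 0-9.
    """
    for name, value in (("item1", item1), ("item2", item2), ("item3", item3)):
        if value not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be 0, 1, 2, or 3")
    for name, value in (("item4a", item4a), ("item4b", item4b), ("item4c", item4c)):
        if value not in (0, 1):
            raise ValueError(f"{name} must be 0 (yes) or 1 (no)")
    total = item1 + item2 + item3
    if include_item4:
        total += item4a + item4b + item4c
    return total


# Hypothetical respondent: wears the prosthesis more than 50% of waking hours,
# walks outside on occasion, uses two canes, and feels stable only on flat ground.
print(houghton_total(2, 2, 1, 1, 0, 0))
print(houghton_total(2, 2, 1, 1, 0, 0, include_item4=False))
```

Validating each answer before summing keeps an out-of-range entry from silently inflating the total.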
References

1. Houghton AD, Taylor PR, Thurlow S, Rootes E, McColl I. Success rates for rehabilitation of vascular amputees: implications for preoperative assessment and amputation level. Br J Surg 1992;79:753-5.
2. Houghton A, Allen A, Luff R, McColl I. Rehabilitation after lower limb amputation: a comparative study of above-knee, through-knee and Gritti-Stokes amputations. Br J Surg 1989;76:622-4.
3. Miller WC, Deathe AB, Speechley M. Lower extremity prosthetic mobility: a comparison of 3 self-report scales. Arch Phys Med Rehabil 2001;82:1432-40.
4. Gauthier-Gagnon C, Grise MC. Prosthetic profile of the amputee questionnaire: validity and reliability. Arch Phys Med Rehabil 1994;75:1309-14.
5. Legro MW, Reiber GD, Smith DG, del Aguila M, Larsen J, Boone D. Prosthesis evaluation questionnaire for persons with lower limb amputations: assessing prosthesis-related quality of life. Arch Phys Med Rehabil 1998;79:931-8.
6. Brooks D, Parsons J, Hunter JP, Devlin M, Walker J. The 2-minute walk test as a measure of functional improvement in persons with lower limb amputation. Arch Phys Med Rehabil 2001;82:1478-83.
7. Ware JE Jr, Sherbourne CD. The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Med Care 1992;30:473-83.
8. Cohen J. Statistical power analysis for the behavioural sciences. 2nd ed. Hillsdale: Erlbaum Associates; 1988.
9. Lacasse Y, Wong E, Guyatt G. A systematic overview of the measurement properties of the Chronic Respiratory Disease Questionnaire. Can Respir J 1997;4:131-9.
10. van der Putten JJ, Hobart JC, Freeman JA, Thompson AJ. Measuring change in disability after inpatient rehabilitation: comparison of the responsiveness of the Barthel index and the Functional Independence Measure. J Neurol Neurosurg Psychiatry 1999;66:480-4.
11. Holmes WC, Shea JA. Performance of a new, HIV/AIDS-targeted quality of life (HAT-QoL) instrument in asymptomatic seropositive individuals. Qual Life Res 1997;6:561-71.
12. Pezzin LE, Dillingham TR, MacKenzie EJ. Rehabilitation and the long-term outcomes of persons with trauma-related amputations. Arch Phys Med Rehabil 2000;81:292-300.

Supplier

a. SPSS Inc, 233 S Wacker Dr, 11th Fl, Chicago, IL 60606.