J Shoulder Elbow Surg (2009) 18, 424-428
www.elsevier.com/locate/ymse
Normal shoulder outcome score values in the young, active adult
LCDR Michael G. Clarke, MD, MC, USN (a), LCDR Christopher B. Dewing, MD, MC, USN (a), LCDR David T. Schroder, MD, MC, USNR (b), CDR Daniel J. Solomon, MD, MC, USN (a,*), LCDR Matthew T. Provencher, MD, MC, USNR (a)
(a) Department of Orthopaedic Surgery, Naval Medical Center San Diego, San Diego, CA
(b) Department of Orthopaedic Surgery, US Naval Hospital, Yokosuka, Japan
Background: Our objective was to determine baseline, normative values for multiple shoulder outcome scores in a young, active population without shoulder symptoms.
Methods: One hundred ninety-two volunteers completed the Single Assessment Numeric Evaluation, modified American Shoulder and Elbow Surgeons score, Western Ontario Shoulder Instability index, Simple Shoulder Test, and Disabilities of the Arm, Shoulder and Hand score. Their mean age was 28.8 years (range, 17-50 years).
Results: Of the participants, 59 (31%) scored no deficiencies on any of the outcome instruments, whereas 133 (69%) demonstrated some abnormal shoulder score. The mean scores were as follows: Single Assessment Numeric Evaluation, 97.7 (SD, 5.2); modified American Shoulder and Elbow Surgeons score, 98.9 (SD, 3.3); Western Ontario Shoulder Instability index, 82.7 of 2100 (SD, 153.5); Simple Shoulder Test, 11.79 (SD, 0.60); and Disabilities of the Arm, Shoulder and Hand score, 1.85 (SD, 5.99).
Conclusion: Our results show that the best possible shoulder score in an asymptomatic population may not be equivalent to a perfect score on the outcome scale.
© 2009 Journal of Shoulder and Elbow Surgery Board of Trustees.
Keywords: Shoulder outcome scores; ASES Score; Western Ontario Shoulder Instability Index; Simple Shoulder Test; DASH score.
Functional outcome scores have served an increasingly important role in patient postoperative assessment, in assessing treatment efficacy, and as research tools in orthopaedic surgery. The paradigm has recently shifted from objective, physician-directed assessment to patient-based self-assessment. The ideal outcome instrument is simple to administer, reliable, and valid. Multiple validated outcome scores (Single Assessment Numeric Evaluation [SANE],17 American Shoulder and Elbow Surgeons [ASES],11 Western Ontario Shoulder Instability [WOSI],8 Simple Shoulder Test [SST],9 and Disabilities of the Arm, Shoulder and Hand [DASH]7 scores) are available to evaluate shoulder conditions. Despite the availability of numerous shoulder scores to assess pain, function, symptoms, and activity level for a variety of shoulder conditions, most of the shoulder scores have not been tested in normal patients. Previous studies have made the assumption that a perfect score equaled a normal score on the outcome instrument.13 However, more recent studies using certain shoulder outcome scores have shown less than perfect scores for normal individuals.4,6,9,13,14
*Reprint requests: CDR Daniel J. Solomon, MD, MC, USN, Department of Orthopaedic Surgery, Naval Medical Center San Diego, 34800 Bob Wilson Dr, Suite 112, San Diego, CA 92134-1112. E-mail address: [email protected] (CDR D.J. Solomon).
1058-2746/2009/$36.00 - see front matter © 2009 Journal of Shoulder and Elbow Surgery Board of Trustees. doi:10.1016/j.jse.2008.10.009
Normative data scores are important for both researchers and clinicians to facilitate interpretation of outcomes across a wide range of shoulder pathology, especially because some of the shoulder outcome instruments include general patient-derived health and function assessments. Validated outcome scores allow accurate comparison between groups and studies. However, current outcome-based scoring surveys have previously described normalized values for older patients.6,9,13 With increased emphasis in orthopaedics on both physician- and patient-derived metrics, it is important to determine accepted normative scores in younger, healthier populations. The purpose of this study is to determine baseline, normative values for multiple shoulder outcome scores in a young, active population without known shoulder pathology or symptoms and determine the correlation among multiple shoulder scores.
Table I
Patient demographics
Demographic                           No.
Total participants                    206
  Excluded                            14
  Included                            192
Age (y)
  Mean (SD)                           28.79 (7.42)
  Median (95% confidence interval)    27 (27-29)
Gender
  Male                                148 (77%)
  Female                              38 (20%)
  Undeclared                          6 (3%)
Hand dominance
  Right                               164 (85%)
  Left                                24 (13%)
  Undeclared                          4 (2%)
Materials and methods
Over an 18-month period (June 2006 to December 2007), 206 volunteers without any prior shoulder symptoms, injury, or pathology completed an anonymous battery of shoulder outcome scores for this institutional review board-approved study. Patients who were being seen for conditions unrelated to their shoulders were recruited from our orthopaedic sports medicine clinic. Inclusion criteria consisted of age less than 50 years and asymptomatic shoulders without any history of injury, surgery, or pathology. Exclusion criteria were any history of shoulder pathology, prior shoulder surgery, or shoulder pain, or failure to complete more than 2 outcome metrics in the survey. We excluded 14 patients (age, 1; failure to complete >2 outcome metrics, 7; shoulder pain, 4; and shoulder pathology, 2). The mean age was 28.8 years (range, 17-50 years; SD, 7.4 years). The median age was 27 years (95% confidence interval, 27-29 years). There were 148 men (77%), 38 women (20%), and 6 respondents (3%) who did not identify their gender on the questionnaire (undeclared), all of whom were active-duty military personnel. Hand dominance consisted of 164 right, 24 left, and 4 undeclared (Table I). All participants who met the inclusion criteria completed an anonymous questionnaire consisting of a screening cover sheet, demographic data, and outcome scores. The cover sheet contained detailed written instructions for completion of the questionnaire and secondary screening questions for any history of shoulder or upper extremity pain, pathology, or surgery. Demographic data included age, gender, occupation, hand dominance, sports activity, and medical history. The outcome measures consisted of the SANE,17 modified ASES score,11 WOSI index,8 SST,9 and DASH score.7 All participants were instructed to complete the questionnaire based on their dominant hand/extremity. The lead author calculated and tabulated all scores.
The SANE is a single question that asks the patient to rate his or her shoulder on a scale of 0 to 100 (100 is a perfect score). The modified ASES score is a sum of functional and pain subscores, with a maximum of 100 points.11 The functional subscore equals the sum of 10 functional questions (responses graded 0-3 points), multiplied by a factor of 5/3, for a maximum of 50 points. The pain subscore equals the visual analog score (0-10) subtracted from 10 and multiplied by 5, for a maximum of 50 points. The WOSI is a sum of the scores for 21 questions (graded 0-100), with a maximum total of 2100 points (0 is a perfect score), that is often converted to a percentage of normal function to facilitate interpretation.8 The WOSI consists of 4 subcategories: WOSI-A (physical symptoms, 10 questions), WOSI-B (sports/recreation/work, 4 questions), WOSI-C (lifestyle, 4 questions), and WOSI-D (emotions, 3 questions). The SST represents the total number of yes responses to 12 shoulder function questions.9 The DASH is determined as follows: [(sum of n responses [graded 1-5] divided by n) − 1] × 25, with a maximum of 100 points (0 is a perfect score).7
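The scoring rules described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the scoring procedure used by the authors: the function names are hypothetical, the percentage conversion for the WOSI is assumed to be 100 × (1 − total/2100), and DASH responses are assumed to run from 1 to 5 so that a symptom-free respondent scores 0.

```python
from typing import Sequence


def ases_modified(function_items: Sequence[int], pain_vas: float) -> float:
    """Modified ASES total: functional subscore plus pain subscore (100 is perfect)."""
    if len(function_items) != 10 or any(not 0 <= x <= 3 for x in function_items):
        raise ValueError("expected 10 functional responses graded 0-3")
    if not 0 <= pain_vas <= 10:
        raise ValueError("pain visual analog score must lie in 0-10")
    function_subscore = sum(function_items) * 5 / 3  # maximum of 50 points
    pain_subscore = (10 - pain_vas) * 5              # maximum of 50 points
    return function_subscore + pain_subscore


def wosi_total(items: Sequence[float]) -> float:
    """WOSI total: sum of 21 items graded 0-100 (0 is perfect, 2100 is worst)."""
    if len(items) != 21 or any(not 0 <= x <= 100 for x in items):
        raise ValueError("expected 21 responses graded 0-100")
    return float(sum(items))


def wosi_percent_normal(items: Sequence[float]) -> float:
    """Assumed conversion of the WOSI total to a percentage of normal function."""
    return 100 * (1 - wosi_total(items) / 2100)


def sst(yes_answers: Sequence[bool]) -> int:
    """SST: number of yes responses among 12 questions (12 is perfect)."""
    if len(yes_answers) != 12:
        raise ValueError("expected 12 yes/no responses")
    return sum(1 for answer in yes_answers if answer)


def dash(responses: Sequence[int]) -> float:
    """DASH: [(sum of n responses / n) - 1] * 25 (0 is perfect, 100 is worst)."""
    if not responses or any(not 1 <= x <= 5 for x in responses):
        raise ValueError("expected at least one response graded 1-5")
    n = len(responses)
    return (sum(responses) / n - 1) * 25
```

For a hypothetical respondent with no complaints, `ases_modified([3] * 10, 0)` gives the 100-point maximum and `dash([1] * 30)` gives 0, which matches the perfect values stated for each scale.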
Statistical methods
Mean values with SD were calculated for each outcome score. Median values with 95% confidence intervals were also determined. Logistic regression was used to determine whether age, gender, or hand dominance had a significant effect on the outcome scores. The 2-sample Wilcoxon rank sum test was also used to determine the significance of gender and hand dominance on the outcome scores. Statistical significance was accepted at P < .05. Pearson correlation coefficients were used to compare all scores, as well as subscores, with a correlation greater than 0.70 being considered significant.
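As a minimal sketch of the correlation step, the Pearson coefficient between two lists of scores can be computed directly from its definition and compared with the 0.70 threshold. The function names are hypothetical, and applying the threshold to the absolute value of the coefficient is an assumption made here, not something the text specifies.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations about the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


def exceeds_threshold(r: float, threshold: float = 0.70) -> bool:
    """True when the magnitude of r is above the threshold (assumed: absolute value)."""
    return abs(r) > threshold
```

One practical note: a sample in which every participant scores perfectly on an instrument has zero variance, so the sketch raises an error rather than returning a coefficient for that instrument.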
Results
All 192 participants completed the questionnaire; however, some omissions were noted. With regard to demographic data, 2 participants omitted age, 6 omitted gender, and 4 omitted hand dominance. Ten failed to complete the SANE score. In total, 177 (92%) completed the questionnaire without omissions. Of the participants, 59 (31%) scored no deficiencies on any of the outcome instruments (a deficiency of 0 indicates 100% function). However, most of the participants (133 [69%]) demonstrated some abnormality in their shoulder scores.
Table II
Outcome measures: Normative data
Outcome measure                           Total performed  Percent from perfect  Mean    SD       Range     Median  95% confidence interval
SANE                                      182              2.3%                  97.74   5.21     60-100    100     100-100
ASES pain subscale (visual analog score)  192              1.5%                  49.27   2.84     30-50     50      50-50
ASES functional subscale                  192              0.75%                 49.62   1.41     38-50     50      50-50
ASES total                                192              1.1%                  98.90   3.27     80-100    100     100-100
WOSI                                      192              3.9%                  82.70   153.55   0-1035    25      15-40
WOSI-A                                    192              4.4%                  43.80   86.12    0-600     2.5     0-16.28
WOSI-B                                    192              2.3%                  9.16    27.47    0-200     0       0-0
WOSI-C                                    192              4.3%                  17.29   33.74    0-200     0       0-0
WOSI-D                                    192              4.2%                  12.45   33.00    0-254     0       0-0
SST                                       192              1.7%                  11.79   0.60     9-12      12      12-12
DASH                                      192              1.8%                  1.85    5.99     0-50.8    0       0-0
Range and 95% confidence interval are given as minimum-maximum.
One hundred eighty-two participants (95%) provided a mean SANE score of 97.74 points (range, 60-100 points; SD, 5.21 points). The modified ASES score was completed by 192 participants (100%). The ASES scale was analyzed for total score, pain subscore, and function subscore. The mean scores with SD were 98.90 points (SD, 3.27 points; range, 80-100 points), 49.27 points (SD, 2.84 points; range, 30-50 points), and 49.62 points (SD, 1.41 points; range, 38-50 points), respectively. All participants completed the WOSI score without deficiency. The mean WOSI score was 82.70 points (range, 0-1035 points; SD, 153.55 points). The corresponding median value was 25 points (95% confidence interval, 15-40 points). The scores for the WOSI subcategories were as follows: WOSI-A, 43.80 points (range, 0-600 points; SD, 86.12 points); WOSI-B, 9.16 points (range, 0-200 points; SD, 27.47 points); WOSI-C, 17.29 points (range, 0-200 points; SD, 33.74 points); and WOSI-D, 12.45 points (range, 0-254 points; SD, 33.00 points). Because each subcategory consists of a varying number of questions, conversion of these scores to a percentage of normal values facilitates interpretation and comparison. The converted values for the mean scores were as follows: WOSI, 96.1%; WOSI-A, 95.6%; WOSI-B, 97.7%; WOSI-C, 95.7%; and WOSI-D, 95.8%. Every participant completed the SST. The mean score was 11.79 points out of a 12-point maximum (range, 9-12 points; SD, 0.60 points). One hundred ninety-two participants (100%) provided a mean DASH score of 1.85 points (range, 0-50.8 points; SD, 5.99 points). Except for the WOSI score, the median score for all other outcome measures was equivalent to a perfect score for that outcome metric. The results are summarized in Table II.
Pearson correlation coefficients were used to compare all scores as well as subscores. Strong correlation was noted between ASES total score and ASES pain subscore, among WOSI overall score and WOSI subcategories A/B/C, and between WOSI-A and WOSI-B. The remaining scoring deficiencies were poorly correlated (Table III). Logistic regression and 2-sample Wilcoxon rank sum test were used to determine the significance of age, gender, and hand dominance on the outcome measure scores. None of the outcome scores was significantly associated with gender (P = .117) or hand dominance (P = .748).
Discussion
With increased emphasis on both physician- and patient-derived functional metrics, determining accepted normative scores in a patient population is important. Previous studies have made the assumption that a perfect score equaled a normal score on the outcome instrument.13 Our results determined normal values for several shoulder outcome scores commonly used in clinical assessment and research. The mean scores for our young, active patients were as follows: SANE, 97.7 (SD, 5.2); ASES, 98.9 (SD, 3.3); WOSI, 82.7 of 2100 (SD, 153.5); SST, 11.79 (SD, 0.60); and DASH, 1.85 (SD, 5.99). The relative scores from a perfect result in descending order were the WOSI (3.9%), SANE (2.3%), DASH (1.8%), SST (1.7%), and ASES (1.1%). Our results illustrate that outcome scores deviate from a perfect score even in a young, active population with no shoulder symptoms or pathology. Of interest, the median values for the SANE, ASES, SST, and DASH were equivalent to a perfect score. Consequently, the majority of our young, healthy, active population scored perfectly on these 4 outcome measures. However, the majority (69%) showed some level of abnormal shoulder score on the questionnaire. We found no significant correlations among the individual outcome scores. This finding supports the conclusion of Brinker et al4 that there is not equivalent translation between normal values of the various outcome scores.
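The ordering of instruments by relative distance from a perfect result can be checked with a short sketch. It assumes that "percent from perfect" means the distance between the mean and the perfect score, divided by the full width of the scale; the dictionary and function names are hypothetical, and the means are the values reported in this study.

```python
# (reported mean, perfect score, worst possible score) for each instrument
REPORTED = {
    "SANE": (97.74, 100, 0),
    "ASES": (98.90, 100, 0),
    "WOSI": (82.70, 0, 2100),
    "SST": (11.79, 12, 0),
    "DASH": (1.85, 0, 100),
}


def percent_from_perfect(mean: float, perfect: float, worst: float) -> float:
    """Distance of the mean from a perfect score, as a percentage of the scale width."""
    return 100 * abs(perfect - mean) / abs(perfect - worst)


# Instruments sorted from largest to smallest relative deviation.
ranking = sorted(
    REPORTED,
    key=lambda name: percent_from_perfect(*REPORTED[name]),
    reverse=True,
)

for name in ranking:
    print(f"{name}: {percent_from_perfect(*REPORTED[name]):.2f}% from perfect")
```

Under that assumption the sorted order is WOSI, SANE, DASH, SST, ASES, the same descending order given in the text. The printed percentages carry two decimals, so they may differ in the last digit from the rounded values in the paper.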
Table III
Correlation coefficients
            Age        SANE       ASES pain  ASES func. ASES total WOSI-A     WOSI-B     WOSI-C     WOSI-D     WOSI       SST        DASH
Age         1.0
SANE        0.0424     1.0
ASES pain   0.1263     0.2554     1.0
ASES func.  0.1377     0.2321     0.1775     1.0
ASES total  0.0892     0.3163     0.9345*    0.4174     1.0
WOSI-A      0.0060     0.3164     0.2929     0.5072     0.3420     1.0
WOSI-B      0.0529     0.2325     0.2836     0.4145     0.3654     0.7718*    1.0
WOSI-C      0.0631     0.2197     0.3998     0.2780     0.4303     0.5888     0.6459     1.0
WOSI-D      0.0703     0.3573     0.3573     0.3395     0.2158     0.5253     0.6173     0.3600     1.0
WOSI        0.0351     0.3407     0.3407     0.4927     0.3981     0.9412*    0.8864*    0.7429*    0.6991     1.0
SST         0.1304     0.2566     0.1850     0.5107     0.0595     0.4153     0.2980     0.3783     0.2424     0.4215     1.0
DASH        0.1152     0.2045     0.2526     0.2278     0.3341     0.2300     0.2121     0.2786     0.2596     0.2840     0.4118     1.0
ASES pain, ASES pain subscale (visual analog score); ASES func., ASES functional subscale.
*Correlation greater than 0.70.
Rising health care costs, emphasis on cost-effectiveness, and determination of treatment effectiveness have stimulated recent interest in outcome metrics. Patient factors, such as comorbidities, work status/compensation status, and expectations, influence the patient’s perception of overall outcome.12,15,16 Thus, recent emphasis has been placed on the development of outcome measures from the patient’s perspective.5 Outcome measures are important tools for quantifying, standardizing, and assessing the results of treatment and research.10 An ideal outcome measure should have the following qualities: universality, practicality, reliability, effectiveness, inclusiveness, and reproducibility.9 The ASES score describes the ideal measure having these attributes: ease of use, assessment of activities of daily living, and inclusion of patient self-evaluation.7 Outcome measures should be consistent (reliable) and valid (measure what they are intended to measure).5 Validated outcome measures can be further divided by scope: general health, region specific, joint specific, and disease specific.5 Many region-specific, joint-specific, and disease-specific outcome measures assess the shoulder.2,7 The SANE score is a generalized tool that is used to evaluate overall health/condition. Richards et al11 first developed the ASES score in 1994 to assess shoulder function. Beaton and Richards1 showed that the ASES score is reproducible, valid, and responsive to the patient’s condition. Kirkley et al8 published the WOSI index in 1998 as the first disease-specific outcome measure for the shoulder. Validity has been assessed via construct validation.7 Compared with other outcome measures (ASES, DASH, University of California, Los
Angeles [UCLA], and Rowe), the WOSI was more responsive for shoulder instability.8 Lippitt et al9 developed the SST in 1992 to document postoperative functional improvement and characterize shoulder impairment. Because of the dichotomy of the responses, the SST is less sensitive to small but clinically relevant changes in function and less likely to differentiate between patients with varying severities of the same condition.5,7 The American Academy of Orthopaedic Surgeons (Rosemont, IL) and the Institute for Work & Health (Toronto, Ontario, Canada) developed the DASH questionnaire for patients with any condition of the upper extremity (region-specific measure).7 Validity, responsiveness, and reliability have been previously established; however, the DASH has also been shown to be less responsive than other joint-specific and disease-specific instruments for shoulder conditions.3,7,8 The broader scope of this measure makes it more attractive in the clinical setting but less attractive for research.7 The normal values for shoulder outcome measures have been established in several different populations. Hunsaker et al6 determined a normal value of 10 points for the DASH in the general population. Lippitt et al9 reported a normal value of greater than 95% yes answers on the SST in a population of healthy patients aged 60 to 70 years. Sallay and Reed13 found a mean ASES score of 92.2 points in 343 healthy patients (mean age, 42.8 years; range, 6-87 years). No normal values in a young population for the WOSI or SANE have been identified. Our results for normal values in our population compare favorably with those in the literature. As described by Kirkley et al,8 the WOSI is the most responsive of the scores
and would predictably deviate the most from a perfect score in the normal patient, as shown by our data. Our mean ASES value correlates very closely to the mean obtained by Brinker et al4 in a very comparable population. Our normal values for the DASH and SST are closer to a perfect score than established normal values in the literature. A significant age effect has been reported in outcome measures requiring rating of physical activities. Brinker et al4 also wanted to determine whether instrument bias occurred in outcome measures with regard to gender, activity level, age, and hand dominance. In their study, only the Constant-Murley score showed gender and age bias. The remaining scores, including the ASES, had negligible bias for gender, activity level, age, and hand dominance. Our results also showed no significant effect of gender or hand dominance on our selected outcome measures. Soldatis et al14 performed outcome measures (Rowe, ASES, UCLA, Constant-Murley, and SST) on 190 healthy collegiate athletes to determine the presence and severity of shoulder symptoms in a normal population similar to an active-duty military population. They showed that significant shoulder symptoms exist in this normal population and increase with shoulder dominance and history of prior injury. Although Soldatis et al did not determine a mean ASES score, they discovered that 46% of all shoulders showed some degree of pain, equating to a less than perfect score. Brinker et al4 also performed outcome measures (Constant-Murley, UCLA, ASES, Shoulder Pain and Disability Index, and Oxford) on 120 healthy collegiate or recreational athletes (mean age, 28.8 years; range, 17-81 years). They showed a mean cumulative ASES score of 98.5 points (range, 88.5-100 points) in their population. They also noted that normalized total and subscale scores for these outcome measures were not equivalent and recommended that researchers should select a measure that matches their population and purpose.
Some limitations of our study are evident. Our normal values can only apply to a similar population of young, active adults and not to the broader general population. Our volunteers’ age and activity level were commensurate with our overall patient population. However, the relative paucity of female participants (20%) may have affected our analysis and made the findings less applicable to women. Another limitation is the lack of objective testing (shoulder examination or imaging studies). Some participants may have had physical or structural abnormalities but were still included in the study. Zarins18 suggests that subjective outcome measures need to be correlated with objective measures to provide a complete postoperative assessment. These outcome measures were intended for use in injured shoulders in the general populace; they might not be valid for a young, healthy subgroup. Finally, not all outcome measures consist of similar subscales, and, therefore, they may not be valid for all shoulder conditions. Our study provides normative values for multiple shoulder outcome scores commonly used in clinical shoulder
assessment and research. Our results show, even in a completely asymptomatic population, that the best possible shoulder score may not be equivalent to a perfect score on the outcome scale. The individual outcome scores do not significantly correlate with one another. Clinicians and researchers should be aware that normal outcome scores in a young population may be in the 96% to 98% range. Surgeons and researchers can now use these values to assess outcomes accurately in a young, active adult population.
References
1. Beaton D, Richards RR. Assessing the reliability and responsiveness of 5 shoulder questionnaires. J Shoulder Elbow Surg 1998;7:565-72.
2. Beaton D, Schemitsch E. Measures of health-related quality of life and physical function. Clin Orthop Relat Res 2003:90-105.
3. Beaton DE, Katz JN, Fossel AH, et al. Measuring the whole or the parts: validity, reliability, and responsiveness of the DASH outcome measure in different regions of the upper extremity. J Hand Ther 2001;14:128-46.
4. Brinker MR, Cuomo JS, Popham GJ, et al. An examination of bias in shoulder scoring instruments among healthy collegiate and recreational athletes. J Shoulder Elbow Surg 2002;11:463-9.
5. Daum WJ, Brinker MR, Nash DB. Quality and outcome determination in health care and orthopaedics: evolution and current structure. J Am Acad Orthop Surg 2000;8:133-9.
6. Hunsaker FG, Cioffi DA, Amadio PC, et al. The American Academy of Orthopaedic Surgeons outcomes instruments: normative values from the general population. J Bone Joint Surg Am 2002;84:208-15.
7. Kirkley A, Griffin S, Dainty K. Scoring systems for the functional assessment of the shoulder. Arthroscopy 2003;19:1109-20.
8. Kirkley A, Griffin S, McLintock H, et al. The development and evaluation of a disease-specific quality of life measurement tool for shoulder instability: the Western Ontario Shoulder Instability index (WOSI). Am J Sports Med 1998;26:764-72.
9. Lippitt SB, Harryman DT, Matsen FA. A practical tool for evaluating shoulder function: the Simple Shoulder Test. In: Matsen FA III, Fu FH, Hawkins RJ, editors. The shoulder: a balance of mobility and stability. Rosemont (IL): American Academy of Orthopaedic Surgeons; 1993. p. 501-18.
10. Placzek JD, Lukens SC, Badalanmenti S, et al. Shoulder outcome measures: a comparison of 6 functional tests. Am J Sports Med 2004;32:1270-7.
11. Richards RR, An KN, Bigliani LU, et al. A standardized method for the assessment of shoulder function. J Shoulder Elbow Surg 1994;3:347-52.
12. Rozencwaig R, van Noort A, Moskal MJ, et al. The correlation of comorbidity with function of the shoulder and health status of patients who have glenohumeral degenerative joint disease. J Bone Joint Surg Am 1998;80:1146-53.
13. Sallay PI, Reed L. The measurement of normative American Shoulder and Elbow Surgeons scores. J Shoulder Elbow Surg 2003;12:622-7.
14. Soldatis JJ, Moseley JB, Etminan M. Shoulder symptoms in healthy athletes: a comparison of outcome scoring systems. J Shoulder Elbow Surg 1997;6:265-71.
15. Viola RW, Boatright KC, Smith KL, et al. Do shoulder patients insured by workers’ compensation present with worse self-assessed function and health status? J Shoulder Elbow Surg 2000;9:368-72.
16. Weinstein JN, Deyo RA. Clinical research: issues in data collection. Spine 2000;25:3104-9.
17. Williams GN, Gangel TJ, Arciero RA, et al. Comparison of the single assessment numeric evaluation method and two shoulder rating scales. Am J Sports Med 1999;27:214-21.
18. Zarins B. Are validated questionnaires valid? J Bone Joint Surg Am 2005;87:1671-2.