J Shoulder Elbow Surg (2014) -, 1-10
www.elsevier.com/locate/ymse
Reliability of patient self-assessment of shoulder range of motion and strength after shoulder arthroplasty

Justin S. Yang, MD (a), Jay D. Keener, MD (a), Ken Yamaguchi, MD (a), Jiajing Chen, MPH (b), Georgia Stobbs-Cucchi, RN, CCRP (a), Rebecca Patton, MA, CCRP (a), Leesa M. Galatz, MD (a,*)

(a) Department of Orthopaedic Surgery, Washington University School of Medicine, St. Louis, MO, USA
(b) Division of Biostatistics, Department of Orthopaedics, Washington University School of Medicine, St. Louis, MO, USA

Background: Patient-derived self-assessment potentially minimizes loss of valuable outcomes data, conserves medical resources, and benefits patients by saving valuable time out of work and travel expenses. The purpose of this study was to determine the physician-patient correlation of a patient-derived outcomes questionnaire that assesses range of motion (ROM) and strength after shoulder arthroplasty.

Methods: One hundred twenty consecutive patients completed a home-based questionnaire before their 1-year postoperative visit after shoulder arthroplasty. The questionnaire contained demographic information such as age, gender, type of surgery, education level, and income. Diagram-based questions, in which patients were asked to identify the image representing their own active shoulder ROM in various planes, were included. Patients were asked to perform a strength examination using premeasured zip-lock bags filled with water that correspond to predetermined weights up to 2.72 kg. The κ statistics were used to assess the degree of agreement between the patient’s self-assessment and the clinician’s measures.

Results: The κ statistics indicated moderate clinician-patient agreement (0.50-0.59) on items related to ROM and substantial to almost perfect agreement (0.62-0.92) on items related to strength (forward flexion and abduction). A majority of patients (>88%) correctly estimated their ROM within 1 grade of the clinician’s measurement. Patients tended to err toward overestimating their ROM.

Conclusions: This patient-derived questionnaire provides a moderate to high level of agreement with clinician assessment. This assessment questionnaire may be an important tool in facilitating both clinical and research follow-up of patient outcomes after shoulder arthroplasty.

Level of evidence: Level I, Diagnostic Study.

© 2014 Journal of Shoulder and Elbow Surgery Board of Trustees.

Keywords: Shoulder replacement; shoulder arthroplasty; outcomes; self-assessment; arthritis and glenohumeral joint
Institutional Review Board approval was obtained from the Human Research Protection Office, The Washington University in St. Louis: IRB ID#201105158. The grant in this paper was used to fund statistical support. *Reprint requests: Leesa M. Galatz, MD, Department of Orthopaedic Surgery, Washington University, 660 S Euclid Ave, Campus Box 8233, St. Louis, MO 63110, USA. E-mail address:
[email protected] (L.M. Galatz).
1058-2746/$ - see front matter © 2014 Journal of Shoulder and Elbow Surgery Board of Trustees. http://dx.doi.org/10.1016/j.jse.2014.08.025

Tracking patient outcomes and satisfaction is an increasingly important component of medical care. These parameters may strongly influence allocation of resources in the future; however, patient follow-up is challenging because travel time and expense often preclude long-term assessment. In particular, patients doing well after surgical procedures may not overcome these challenges just to visit a physician’s office for outcome assessment alone. In addition, the requirement of valuable clinic and personnel time places a burden on the system. A validated, home-based, patient-derived shoulder questionnaire potentially offers a valuable tool for gathering and retaining patient information that may otherwise be lost.11 From a treatment perspective, these assessments also could be used to identify patients performing more poorly than expected, who may benefit from physician re-evaluation. We hypothesized that patient-derived measures would compare favorably with physician-based assessment.

Various standardized questionnaires, such as the American Shoulder and Elbow Surgeons (ASES) questionnaire and the Simple Shoulder Test (SST), have been widely used to evaluate patient outcome after shoulder surgery.1,9 Whereas these questionnaires have been shown to be both reproducible and reliable, the strength and range of motion (ROM) elements could be more precise and objective. They are patient based and give limited information as to objective functional outcome. Other measurement tools, such as the Constant score, are physician directed and require a clinician’s evaluation to obtain physical examination data. Both subjective and objective measures should be used to evaluate patient satisfaction and outcome, selected on the basis of the clinical or research question.

A few studies have objectively examined the association between the patient’s self-assessed function and the clinician’s physical examination, with an emphasis on ROM.3,11 Currently, no validated patient self-assessment tool exists to objectively assess both shoulder strength and ROM. The purpose of this study was to determine the physician-patient correlation of a patient-derived outcomes questionnaire that assesses ROM and strength after shoulder arthroplasty.
Materials and methods

Study subjects

From April 2012 to March 2013, 158 patients who had undergone shoulder arthroplasty were recruited for the study. These subjects were identified from a consecutive list of patients scheduled for a 1-year postsurgery follow-up visit as a prospective cohort. The procedures were performed by 1 of 3 fellowship-trained shoulder and elbow specialists in an academic practice setting. After consenting to participate by phone contact, patients were mailed a home-based questionnaire 2 weeks before the 1-year postoperative visit. Fifteen patients declined to participate in the study after receiving the questionnaire. Eleven patients subsequently canceled their postoperative appointment and were not able to complete the study, given our time constraints. Seven patients had incomplete participation in the strength assessment using the water bags and were excluded. Five patients had cognitive limitations that excluded them from the study. These were the only patients who were unable to complete the evaluation for cognitive reasons.

One hundred twenty patients completed all portions of the evaluation and composed the study group. Eight patients had bilateral arthroplasties. Primary total shoulder arthroplasty (TSA) had been performed in 55 patients, reverse shoulder arthroplasty in 51, hemiarthroplasty in 2, and revision arthroplasty in 12. The diagnosis included primary osteoarthritis in 55 patients, loosening of previous arthroplasty in 10, massive rotator cuff tear or rotator cuff arthropathy in 43, fractures in 5, rheumatoid arthritis in 4, and osteonecrosis in 3. The goal of the study was to develop an assessment tool that could be used for both TSA and reverse TSA; thus, consecutive arthroplasty patients were included regardless of implant or diagnosis as long as they met the inclusion criteria.
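The enrollment arithmetic above can be checked with a short script. This is only a bookkeeping sketch: every count is copied from the text, and the dictionary keys are shorthand labels introduced here for illustration.

```python
# Participant flow and cohort composition, using only counts reported in the text.
recruited = 158
excluded = {
    "declined after receiving the questionnaire": 15,
    "canceled the postoperative appointment": 11,
    "incomplete strength assessment": 7,
    "cognitive limitations": 5,
}
completed = recruited - sum(excluded.values())

procedures = {
    "primary total shoulder arthroplasty": 55,
    "reverse shoulder arthroplasty": 51,
    "hemiarthroplasty": 2,
    "revision arthroplasty": 12,
}
diagnoses = {
    "primary osteoarthritis": 55,
    "loosening of previous arthroplasty": 10,
    "massive rotator cuff tear or cuff arthropathy": 43,
    "fracture": 5,
    "rheumatoid arthritis": 4,
    "osteonecrosis": 3,
}

print("completed:", completed)
print("procedures sum:", sum(procedures.values()))
print("diagnoses sum:", sum(diagnoses.values()))
print(f"completion rate: {100 * completed / recruited:.1f}%")
```

Both breakdowns (by procedure and by diagnosis) should add up to the number of patients who completed the evaluation.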
Office visit protocol

The patients were initially contacted 4 weeks before their 1-year follow-up appointment. They were mailed the questionnaire along with a detailed instruction sheet 2 weeks before the appointment. They brought the completed questionnaire to the clinic. The questionnaire was collected by an independent office staff member and checked for completeness. Patients were given an opportunity to complete unfinished portions of the assessment. They were evaluated by 1 of 2 independent orthopedic nurses trained in a standardized physical examination technique for the shoulder.4,6,7 The nurses were trained in physical examination and evaluation of the shoulder and elbow by the physicians on the service. Their evaluations have been validated and monitored. They followed the ASES standards of evaluation,10 which require the use of a goniometer and measure the angle between the arm and the trunk or thorax. The nurses were blinded to the patient’s questionnaire. Their standardized examinations were previously shown to have both excellent intraobserver and interobserver reliability.7 The nurse first assessed ROM by the same ordinal categories as in the questionnaire and subsequently with the use of a goniometer. Shoulder strength in the forward flexion and abduction planes was measured with a portable Isobex (Medical Device Solutions, Oberburg, Switzerland) dynamometer by previously described techniques.7
Questionnaire

The questionnaire contained demographic information such as age, gender, height, weight, type of surgery, education level, and income (Fig. 1). An ASES questionnaire and the SST were included. For the ROM assessment, photographs were taken of a volunteer with a normal shoulder and spine. A goniometer was used to measure 0°, 30°, 60°, 90°, 120°, 150°, and 180° of abduction and forward flexion. The difference between the model’s spine and the vertical plane was negligible. The volunteer was measured and asked to hold position briefly for the photograph. Similar methodology was used for external rotation at the side and in 90° of abduction. Internal rotation behind the back was photographed relative to anatomic landmarks. The photographs were placed on the questionnaire, and patients were asked to identify the image most closely representing their own active shoulder ROM (Fig. 2).

Figure 1 Demographics and socioeconomic portion of the questionnaire.

Figure 2 Diagram-based range of motion questions for various planes of motion. (A) Abduction. (B) Forward flexion. (C) External rotation at 90° abduction. (D) External rotation with elbows at the side. (E) Internal rotation.

For the strength assessment, quart-sized (0.90-kg) Hefty One Zip (Reynolds Consumer Products, Lincolnshire, IL, USA) bags were filled with water until they weighed 0.90 kg. The fill line was then drawn on the bag with a waterproof marker. The water was discarded and the level of the fill line was measured from the bottom of the bag, and this was used to produce subsequent fill lines on fresh bags. We validated the ability of patients to reliably perform the task of filling the bags before formal initiation of the study (Fig. 3). Thirty-seven randomly selected patients were asked to fill a bag to the fill line, and the bags were weighed. On average, the measured weight was 0.99 kg (95% confidence interval, 2.14-2.27 lb, approximately 0.97-1.03 kg). That is, with 95% certainty, the estimate of the water bag weight is between 0.06 and 0.12 kg more than the goal weight of 0.90 kg.

As part of the self-assessment, patients were asked to lift up to 3 bags of water corresponding to 0.90, 1.81, and 2.72 kg. The planes of motion assessed were forward flexion and abduction. With use of a diagram-based question, the patients were asked to identify the picture representing how high they could lift the bags and how many bags they were able to lift. Being able to lift the bags to 90° was classified as able, and all others were considered unable. Time to completion and difficulty of the questionnaire (scale of 1-10, 10 being the most difficult) were recorded by the patient.
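The bag weights and the validation interval mix units, so a small conversion sketch may help. It rests on two assumptions that the text does not state outright: that the three bag loads are nominal 2, 4, and 6 lb, and that the reported interval bounds of 2.14 and 2.27 are in pounds. The helper names are hypothetical.

```python
LB_TO_KG = 0.45359237       # kilograms per pound (exact by definition)
STANDARD_GRAVITY = 9.80665  # standard gravity in m/s^2


def lb_to_kg(pounds):
    """Convert a mass in pounds to kilograms."""
    return pounds * LB_TO_KG


def kg_to_newtons(kilograms):
    """Convert a mass in kilograms to its weight in newtons."""
    return kilograms * STANDARD_GRAVITY


# Assumption: the 1-, 2-, and 3-bag loads are nominal 2, 4, and 6 lb.
nominal_lb = [2, 4, 6]
bag_table = [(lb, lb_to_kg(lb), kg_to_newtons(lb_to_kg(lb))) for lb in nominal_lb]

# Assumption: the reported confidence bounds (2.14, 2.27) are in pounds.
ci_lb = (2.14, 2.27)
ci_kg = tuple(lb_to_kg(x) for x in ci_lb)
overfill_kg = tuple(lb_to_kg(x - 2) for x in ci_lb)

for lb, kg, newtons in bag_table:
    print(f"{lb} lb = {kg:.2f} kg = {newtons:.1f} N")
print("CI in kg:", [round(x, 2) for x in ci_kg])
print("overfill in kg:", [round(x, 2) for x in overfill_kg])
```

Under these assumptions the forces come out close to the 8.9 N, 17.8 N, and 26.7 N used in the statistical analysis, and the overfill bounds land near the 0.06 to 0.12 kg quoted in the text.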
Statistical analysis

An a priori power analysis was performed to determine that a sample size of 120 was needed for 80% statistical power to detect patient-clinician differences in ROM and strength with an effect size of 0.26 or greater at the .05 α level. The effect size was based on a previous work on the projected standard deviation of the difference between clinician and patient.11 The variables we used from the previous work were ‘‘strength,’’ ‘‘how high can you raise your arm,’’ ‘‘how far inward,’’ and ‘‘how far outward.’’11 We calculated the effect size on the basis of the overall minimum detectable effect size with these variables.

Because significance testing was not performed on the reliability coefficients, statistical power is described in terms of the precision of the κ estimates. The κ coefficient accounts for chance agreement in categorical responses by comparing the observed agreement beyond chance with the maximum possible agreement beyond chance. With the sample size of 120, the 95% confidence bounds on our estimates achieved acceptable precision. Specifically, with 95% certainty, our κ estimates of patient-clinician reliability are within 0.12 unit of the true agreement.

For 8 patients who had bilateral measures, one side was randomly selected for inclusion in the analyses. Because ROM measures were recorded in ordinal categories, weighted κ statistics were computed to assess the agreement between self-assessments and clinician measures. Unlike the simple κ, the weighted κ takes into account the degree of disagreement between ordered categories. For strength items, patient self-assessment was defined as the number of bags the patient could lift converted to newtons, where 1, 2, and 3 bags converted to 8.9 N, 17.8 N, and 26.7 N, respectively. The clinician’s Isobex measures were computed as the mean of up to 3 trials. Both the patient’s self-assessment and the clinician’s Isobex measures were categorized into ‘‘able’’ and ‘‘unable’’ on the basis of the criteria defined before. Simple κ statistics were computed to assess the agreement between self-assessments and clinician Isobex measures. We used the benchmarks for agreement measures for categorical data as described by Landis and Koch,8 where 0.00 to 0.20, 0.21 to 0.40, 0.41 to 0.60, 0.61 to 0.80, and 0.81 to 1.00 indicate poor, fair, moderate, substantial, and almost perfect agreement, respectively.

Analyses were performed to determine if the magnitude of patient-clinician disagreement was associated with demographic or clinical characteristics. Because of multiple testing, statistical significance for these analyses was defined as P < .01. For ROM measures, the absolute magnitude of disagreement between the patient’s and the clinician’s measurements ranges from 0 to 4 categories. Spearman correlations (r) were performed between the absolute magnitude of disagreement for each ROM measure and race (where white was coded 1 and African American was coded 2), age (coded in years), education (coded from 1 to 6 from least to most, where elementary school = 1, high school = 2, associate degree = 3, college = 4, master degree = 5, and doctoral degree = 6), income (coded from 1 to 5 from least to most, where <$25,000 = 1, $25,000-$49,999 = 2, $50,000-$74,999 = 3, $75,000-$100,000 = 4, and >$100,000 = 5), household size (coded from 1 to 4 persons), surgery type (where reverse = 1, TSA = 2), pain score, and total ASES score. The Kruskal-Wallis test was used to determine if the magnitude of disagreement for ROM measures was similar across the 4 employment categories.

For strength measures, the absolute magnitude of patient-clinician disagreement is dichotomous, that is, the patient and clinician agree or disagree. Fisher exact test was performed to determine if the proportion of disagreements for each strength test was associated with race, employment category, and surgery type. Wilcoxon rank sum test was used to determine if strength test disagreement was associated with age, education, income, household size, pain score, and total ASES score. Because of the small number of patient-clinician disagreements for strength testing at 8.9-N abduction (4 disagreements) and forward flexion (1 disagreement), these two measures were excluded from this analysis. In addition, the 2 patients who underwent hemiarthroplasty were excluded from the analysis of surgery type because of inadequate sample size.
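To make the distinction between simple and weighted κ concrete, here is a minimal sketch of Cohen's κ written from the standard definition. It is not the authors' code: the function name, the `weights` argument, and the toy ratings are hypothetical, and the text does not say whether linear or quadratic weights were used, so both are offered.

```python
def cohen_kappa(rater_a, rater_b, categories, weights=None):
    """Cohen's kappa for two raters over ordered categories.

    weights=None gives the simple kappa (every disagreement counts fully);
    weights="linear" or "quadratic" penalizes disagreements by their distance.
    """
    n = len(rater_a)
    k = len(categories)
    index = {category: i for i, category in enumerate(categories)}

    # Joint proportions: observed[i][j] = share of cases rated i by A and j by B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        if weights is None:
            return 0.0 if i == j else 1.0
        distance = abs(i - j) / (k - 1)
        return distance if weights == "linear" else distance ** 2

    observed_dis = sum(disagreement(i, j) * observed[i][j]
                       for i in range(k) for j in range(k))
    expected_dis = sum(disagreement(i, j) * row[i] * col[j]
                       for i in range(k) for j in range(k))
    return 1.0 - observed_dis / expected_dis


# Toy data (hypothetical, not from the study): 3 ordered ROM categories.
patient = [0, 1, 2, 2]
clinician = [0, 1, 2, 1]
simple = cohen_kappa(patient, clinician, [0, 1, 2])
linear = cohen_kappa(patient, clinician, [0, 1, 2], weights="linear")
print(round(simple, 4), round(linear, 4))
```

In the toy data the single disagreement is off by one category, so the linearly weighted κ is higher than the simple κ; that is the sense in which the weighted statistic "takes into account the degree of disagreement."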
Figure 3 Diagram-based strength questions using standardized 0.90-kg zip-lock bags filled with water to a premeasured line.
Results
Range of motion
The questionnaire took an average of 17 minutes to complete, with an average difficulty of 1.8 out of 10. The average age of patients at follow-up was 67 years (range, 46-86 years). There were 54 men and 66 women. The majority of the patients were white (92%) and had a high-school or higher education (97%). Approximately two thirds (66%) of the patients were retired, and 56% made less than $50,000 per year. More than a quarter of the patients (28%) lived alone (Table I). The average ASES score was 79.1 ± 17.4 (range, 32-100). The average SST score was 8 ± 3.1 (range, 0-12).
The clinician and patient responses to the 5 questions related to ROM were in exact agreement 51% to 65% of the time, and they were in approximate agreement (within 1 category) at least 88% of the time (Table II). Only one patient had more than 3 categories of disagreement with the clinician. When clinician and patient responses did not agree, patients tended to err toward overestimating their ROM. This was especially apparent in abduction and forward flexion. For forward flexion, 42% of the patients overestimated their ROM, whereas only 3% underestimated their ROM; 55% had perfect agreement. For abduction, 39% of the patients overestimated their ROM, whereas only 9% underestimated their ROM; 52% had perfect agreement. This pattern of patient overestimation continued for external rotation and internal rotation and can be seen in Table II. In addition, we found that patients with clinician-assessed forward flexion or abduction ROM >120° were more likely to overestimate their ROM.

The κ statistics indicated an overall moderate clinician-patient agreement (0.50-0.59) on items related to ROM (Table III). There was better agreement on questions for external and internal rotation ROM than for forward flexion and abduction ROM. The κ values were similar for the 2 methods of assessing external rotation ROM (either in 90° of abduction or with the arm at the side).
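The percentages quoted above can be recomputed from the frequencies in Table II. The sketch below is illustrative only: the counts are transcribed as read from Table II, cells reported as NA are entered as `None`, and the `summarize` helper is a hypothetical name.

```python
# Offsets run from "patient overestimates by 4 categories" (+4)
# to "patient underestimates by 4 categories" (-4).
OFFSETS = [4, 3, 2, 1, 0, -1, -2, -3, -4]

# Frequencies as read from Table II; None marks cells reported as NA.
TABLE_II = {
    "abduction": [0, 1, 10, 36, 62, 8, 2, 1, 0],
    "forward flexion": [0, 0, 10, 40, 65, 3, 1, 0, 0],
    "external rotation at 90 degrees": [None, 0, 0, 17, 78, 22, 1, 1, None],
    "external rotation at side": [1, 0, 7, 27, 71, 13, 1, 0, 0],
    "internal rotation": [0, 2, 9, 34, 67, 8, 0, 0, 0],
}


def summarize(counts):
    """Return n and the percentages over, exact, under, and within 1 category."""
    pairs = [(o, c) for o, c in zip(OFFSETS, counts) if c is not None]
    total = sum(c for _, c in pairs)

    def pct(keep):
        return 100.0 * sum(c for o, c in pairs if keep(o)) / total

    return {
        "n": total,
        "over": pct(lambda o: o > 0),
        "exact": pct(lambda o: o == 0),
        "under": pct(lambda o: o < 0),
        "within_one": pct(lambda o: abs(o) <= 1),
    }


summary = {plane: summarize(counts) for plane, counts in TABLE_II.items()}
for plane, s in summary.items():
    print(f"{plane}: n={s['n']}, over={s['over']:.1f}%, exact={s['exact']:.1f}%, "
          f"under={s['under']:.1f}%, within 1 category={s['within_one']:.1f}%")
```

Printing the per-plane summaries side by side makes it easy to compare the direction of disagreement across the five ROM questions.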
Strength

The clinician and patient responses to the questions related to strength were in exact agreement 83% to 99% of the time (Table IV). Patients tended to err toward underestimating their strength. For the questions using 3 zip-lock bags, 12% of the patients thought they could not lift the bags; however, the clinician-assessed strength with the dynamometer showed that they in fact could. There was substantial to almost perfect agreement (0.62-0.92) on items related to strength (Table III). The ability of patients to assess abduction strength was similar to that of forward flexion strength. In general, there was better agreement for questions involving lighter bags.
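As a reading aid for Table III, the following sketch maps each reported κ to the agreement benchmarks quoted in the statistical analysis and computes the half-width of each confidence interval. The values are transcribed from Table III; the `landis_koch` helper and the dictionary keys are hypothetical names, and the cutoffs follow the paper's wording (0.00 to 0.20 labeled "poor").

```python
def landis_koch(kappa):
    """Label a kappa value using the benchmarks quoted in this paper."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# (kappa, lower 95% bound, upper 95% bound) as read from Table III.
ROM = {
    "abduction": (0.51, 0.40, 0.61),
    "forward flexion": (0.50, 0.39, 0.62),
    "external rotation at 90 degrees": (0.56, 0.44, 0.69),
    "external rotation at side": (0.57, 0.46, 0.68),
    "internal rotation": (0.59, 0.50, 0.69),
}
STRENGTH = {
    "abduction, 1 bag": (0.81, 0.64, 0.99),
    "abduction, 2 bags": (0.70, 0.55, 0.85),
    "abduction, 3 bags": (0.67, 0.53, 0.80),
    "forward flexion, 1 bag": (0.92, 0.76, 1.00),
    "forward flexion, 2 bags": (0.62, 0.44, 0.80),
    "forward flexion, 3 bags": (0.78, 0.67, 0.89),
}

rom_labels = {item: landis_koch(k) for item, (k, _, _) in ROM.items()}
strength_labels = {item: landis_koch(k) for item, (k, _, _) in STRENGTH.items()}

for group in (ROM, STRENGTH):
    for item, (k, low, high) in group.items():
        half_width = (high - low) / 2
        print(f"{item}: kappa={k:.2f} ({landis_koch(k)}), CI half-width={half_width:.3f}")
```

The half-widths give a rough sense of how precisely each κ is estimated; the strength items at the lightest load have the widest intervals, which fits the small number of disagreements observed there.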
Table I Demographics

Demographics              Frequency   Percentage
Gender
  Female                  66          55%
  Male                    54          45%
Race
  African American        9           7.5%
  White                   111         92.5%
Education level
  Elementary school       4           3.45%
  High school             66          56.9%
  Associate degree        18          15.5%
  College                 14          12.1%
  Master                  11          9.48%
  Doctoral                3           2.59%
Employment status
  Employed                24          20.7%
  Housework               11          9.5%
  Disabled/retired        77          66.4%
  Unemployed              4           3.4%
Annual income
  <$25,000                16          15.5%
  $25,000-$49,999         42          40.8%
  $50,000-$74,999         17          16.5%
  $75,000-$100,000        18          17.5%
  >$100,000               10          9.7%
Household size
  1                       28          28%
  2                       63          63%
  3                       6           6%
  4                       3           3%
Associations

Two significant associations between surgery type and clinician-patient disagreements were found. In the analysis of abduction ROM, patients who underwent TSA had a larger magnitude of disagreement with the clinician than did patients who underwent reverse arthroplasty (r = 0.25; P = .008). In addition, the proportion of disagreements for strength at 17.8 N (2 bags) of forward flexion was significantly associated with surgical procedure; 21% of the patients who underwent a reverse shoulder arthroplasty vs. 3% of the patients who underwent a TSA procedure disagreed with the clinician (P = .004). All other associations of demographics and clinical data (age, race, education, employment, income, household size, and ASES score) with clinician-patient disagreements were nonsignificant (P > .01).

Because several of the primary analyses were not significant, we determined the magnitude of effect that the sample size could detect, if indeed that effect was observed. The sample size has 80% statistical power to detect a minimum correlation coefficient between patient-clinician disagreement and patient characteristics of 0.25 at the .05 α level or 0.30 at the .01 α level. In addition, the sample size has 80% statistical power to detect a minimum 0.21 difference in the proportion of disagreements by surgical procedure at the .05 α level or a minimum 0.27 difference at the .01 α level.
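The minimum detectable correlations quoted above can be approximated with the Fisher z transformation. This is a sketch under assumptions, not the authors' calculation: it uses the large-sample approximation for a Pearson-type correlation with a two-sided test, n = 120, and a hypothetical helper name, so small differences from the reported values are to be expected.

```python
from math import sqrt, tanh
from statistics import NormalDist


def min_detectable_r(n, alpha, power=0.80):
    """Smallest correlation detectable with the given power (Fisher z approximation)."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = standard_normal.inv_cdf(power)
    return tanh((z_alpha + z_beta) / sqrt(n - 3))


r_05 = min_detectable_r(120, alpha=0.05)
r_01 = min_detectable_r(120, alpha=0.01)
print(f"alpha .05: r = {r_05:.3f}")
print(f"alpha .01: r = {r_01:.3f}")
```

Under this approximation both values land close to the 0.25 and 0.30 reported in the text, which suggests the stated detectable effects are in line with a sample of 120.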
Table II Amount of categorical disagreement for range of motion test between patient and clinician

Values are frequency (percentage). NA, not applicable.

Categories of disagreement   Abduction    Forward flexion   External rotation at 90°   External rotation at side   Internal rotation
Patient’s overestimate by
  4 categories               0 (0)        0 (0)             NA                         1 (0.8)                     0 (0)
  3 categories               1 (0.8)      0 (0)             0 (0)                      0 (0)                       2 (1.7)
  2 categories               10 (8.3)     10 (8.4)          0 (0)                      7 (5.8)                     9 (7.5)
  1 category                 36 (30.0)    40 (33.6)         17 (14.3)                  27 (22.5)                   34 (28.3)
Perfect agreement            62 (51.7)    65 (54.6)         78 (65.6)                  71 (59.2)                   67 (55.8)
Patient’s underestimate by
  1 category                 8 (6.7)      3 (2.5)           22 (18.5)                  13 (10.8)                   8 (6.7)
  2 categories               2 (1.7)      1 (0.8)           1 (0.8)                    1 (0.8)                     0 (0)
  3 categories               1 (0.8)      0 (0)             1 (0.8)                    0 (0)                       0 (0)
  4 categories               0 (0)        0 (0)             NA                         0 (0)                       0 (0)

Table III Reliability statistics for self-assessment and clinician measurements

Questionnaire items                    κ*      95% Confidence interval for κ
Range of motion
  Abduction                            0.51    0.40, 0.61
  Forward flexion                      0.50    0.39, 0.62
  External rotation at 90° abduction   0.56    0.44, 0.69
  External rotation at the side        0.57    0.46, 0.68
  Internal rotation                    0.59    0.50, 0.69
Strength
  Abduction
    Able to lift 1 bag (8.9 N)         0.81    0.64, 0.99
    Able to lift 2 bags (17.8 N)       0.70    0.55, 0.85
    Able to lift 3 bags (26.7 N)       0.67    0.53, 0.80
  Forward flexion
    Able to lift 1 bag (8.9 N)         0.92    0.76, 1
    Able to lift 2 bags (17.8 N)       0.62    0.44, 0.80
    Able to lift 3 bags (26.7 N)       0.78    0.67, 0.89

* Weighted κ statistics are reported for ROM measures. Simple κ statistics are reported for strength measures.

Discussion

A patient self-assessment questionnaire can provide a moderate to high level of agreement with clinician assessments of strength and motion after shoulder arthroplasty. The goal of this study was to develop a home-based questionnaire to objectively assess both shoulder strength and motion and to correlate the patient’s self-assessed measures of strength and motion to the clinician’s measurements of the same. This method of assessment could provide a valuable tool in tracking patient outcomes in a cost-saving and efficient fashion. We demonstrated a high level of compliance and ease of use.

Smith and Cofield published a patient-derived shoulder functional outcome questionnaire in 2006.11 They examined 67 consecutive patients after shoulder arthroplasty and administered a questionnaire that contained a clinical and functional outcome scale. Similar to previous validations of subjective questionnaires like ASES and SST, the subjective portion of their questionnaire had a high level of agreement, with κ ranging from 0.66 to 0.89. For the
functional and objective portion of their questionnaire, the primary measure they examined was ROM with use of diagram-based questions. They found almost perfect agreement between physicians and patients for forward flexion (κ = 0.89) and moderate agreement for external and internal rotation (κ = 0.40-0.49). Our study found moderate agreement (κ = 0.50-0.59) between clinicians and patients for all ROM questions. These results should be interpreted carefully. They reported the κ values based on ‘‘approximate agreement’’ that was within 1 or 2 grades of ROM, which was 20°. Our κ values were based on exact agreement. Although we do not know their exact agreement κ value, their ‘‘exact agreement’’ was actually similar to ours, 40% to 61% of the time. For forward elevation, exact agreement occurred 31% of the time and approximate agreement 75% of the time. They had 19 data points possible, and patients also overestimated their ROM, and actual measurements were within 20°. Therefore, this likely represents a real phenomenon with regard to patient perception of good outcomes. The difference in forward flexion agreement could be due to the differences in our diagram-based questions; our questionnaire used human models, whereas their questionnaire used pictorial drawings. More important, the nurses measured ROM with the goniometer relative to the thoracic spine, whereas the photographs illustrate normal ROM relative to the floor in a younger person. Our study also included reverse arthroplasty patients, whereas theirs did not; however, in our subgroup analysis, type of surgery did not have any impact on agreement for forward flexion.

Our diagram-based ROM questions are similar to those published by Carter et al3 using human models, although our questionnaire also had more categories from which to choose for ROM questions. Carter et al administered a functional outcome questionnaire examining primarily ROM to a group of 100 patients with a variety of shoulder-related complaints. Although they did not report intraclass reliability, they noted agreement between the physician’s and patient’s assessment of motion 85% of the time. They also found that older individuals were more likely to make errors and a statistical trend that less educated people were more likely to make errors. We found no association of demographics and clinician-patient measurement differences.

Interestingly, patients erred toward overestimating their ROM. This is consistent with previous literature.11 This was especially true for patients with forward flexion or abduction ROM >120°. In other words, the better ROM a patient has, the more the patient tends to overestimate ROM. This would explain why total shoulder patients had a larger magnitude of disagreement than reverse patients did, as the total shoulder population tends to have better ROM.2 Although the overestimation was primarily within 1 category of measurement, the clinician should take this pattern into account when evaluating these questionnaires. Further work may elucidate a measurement tool with greater physician-patient correlation for ROM.

Table IV Amount of categorical disagreement for strength test between patient and clinician

Values are frequency (percentage).

Strength test                     Perfect agreement   Patient able/clinician unable   Patient unable/clinician able
8.9 N abduction (1 bag)           115 (96.6)          3 (2.5)                         1 (0.8)
8.9 N forward flexion (1 bag)     118 (99.2)          1 (0.8)                         0 (0)
17.8 N abduction (2 bags)         106 (89.1)          10 (8.4)                        3 (2.5)
17.8 N forward flexion (2 bags)   105 (88.2)          8 (6.7)                         6 (5.0)
26.7 N abduction (3 bags)         99 (83.2)           5 (4.2)                         15 (12.6)
26.7 N forward flexion (3 bags)   106 (89.1)          3 (2.5)                         10 (8.4)

The strength portion of the questionnaire had the highest level of agreement. This is encouraging as strength is an important factor to consider in outcomes evaluation. Strength contributes significantly to patient satisfaction and ability to perform activities of daily living.2 Our initial validation study on the patient’s ability to fill the zip-lock bags accurately with water showed that the patients tended to err toward slight overfill (0.99 kg for a 0.90-kg bag). This may indicate that the agreement of the strength portions of the questionnaire may be even higher than reported as the questionnaire may have underestimated the patient’s self-assessed strength.

The limitations of this study include that it examined only the shoulder arthroplasty population. Further study may be needed to have this questionnaire generalizable to other shoulder complaints, although similar studies have shown it is possible.3 Our study population had a high percentage of white patients and patients with at least a high-school education; further study may also be needed to generalize this questionnaire outside of this population. Seven patients were unable to complete the strength portion of the questionnaire. Most of these individuals were elderly with very poor function. This self-assessment method may not capture data on the patients with very low function; however, an inability to perform the test should alert clinicians that these patients may need to be assessed more closely than patients in whom very limited goals are not anticipated. Five patients did not have adequate cognitive ability to perform the assessment. This is a potential concern in very elderly patients with early dementia or analphabetic patients, highlighting populations that may require more specialized follow-up in certain settings.

One primary concern relates to the clinical applicability of our findings as it is unknown how much agreement is needed before the validity of the ROM assessment should be challenged. This may be particularly relevant, for example, if self-assessed motion measures are subsequently used to generate a Constant score. Further studies are needed to determine the effects of the variability of self-assessed motion compared with clinician-assessed motion on the calculation of objective shoulder functional scores. We caution that objective questionnaires such as this should not be used alone to determine patient outcome and satisfaction as subjective measures have been shown to correlate highly with patient satisfaction.5 The questionnaires could potentially preclude the clinician’s access to in-office radiographs that may provide important information as well. However, current technology allows patients to locally obtain studies that can be transmitted by mail or electronic means, conserving resources and potentially even increasing compliance with follow-up. Last, patients were aware that they were participating in a study and were motivated to participate. This could affect the outcome.
Conclusion

We found a high level of agreement with clinician assessments of strength and motion after shoulder surgery. This self-assessment tool took relatively little time, was reproducible, and was easy to use with a high level of patient participation. It represents an opportunity to use a cost-efficient method of observing and assessing patients after shoulder arthroplasty. Future research should focus on evaluating this questionnaire in other populations of shoulder patients as well as assessing the responsiveness of this tool to patient changes over time. The use of this specific assessment tool may be helpful in facilitating both clinical and research follow-up of patient outcomes after treatment.
Disclaimer

Ken Yamaguchi receives royalties from Tornier and Zimmer. All the other authors, their immediate families, and any research foundation with which they are affiliated have not received any financial payments or other benefits from any commercial entity related to the subject of this article.
References

1. Beaton D, Richards RR. Assessing the reliability and responsiveness of 5 shoulder questionnaires. J Shoulder Elbow Surg 1998;7:565-72.
2. Bryant D, Litchfield R, Sandow M, Gartsman GM, Guyatt G, Kirkley A. A comparison of pain, strength, range of motion, and functional outcomes after hemiarthroplasty and total shoulder arthroplasty in patients with osteoarthritis of the shoulder. A systematic review and meta-analysis. J Bone Joint Surg Am 2005;87:1947-56. http://dx.doi.org/10.2106/JBJS.D.02854
3. Carter CW, Levine WN, Kleweno CP, Bigliani LU, Ahmad CS. Assessment of shoulder range of motion: introduction of a novel patient self-assessment tool. Arthroscopy 2008;24:712-7. http://dx.doi.org/10.1016/j.arthro.2008.01.020
4. Galatz LM, Ball CM, Teefey SA, Middleton WD, Yamaguchi K. The outcome and repair integrity of completely arthroscopically repaired large and massive rotator cuff tears. J Bone Joint Surg Am 2004;86:219-24.
5. Harreld K, Clark R, Downes K, Virani N, Frankle M. Correlation of subjective and objective measures before and after shoulder arthroplasty. Orthopedics 2013;36:808-14. http://dx.doi.org/10.3928/01477447-20130523-29
6. Keener JD, Steger-May K, Stobbs G, Yamaguchi K. Asymptomatic rotator cuff tears: patient demographics and baseline shoulder function. J Shoulder Elbow Surg 2010;19:1191-8. http://dx.doi.org/10.1016/j.jse.2010.07.017
7. Kim HM, Teefey SA, Zelig A, Galatz LM, Keener JD, Yamaguchi K. Shoulder strength in asymptomatic individuals with intact compared with torn rotator cuffs. J Bone Joint Surg Am 2009;91:289-96. http://dx.doi.org/10.2106/JBJS.H.00219
8. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159-74.
9. Michener LA, McClure PW, Sennett BJ. American Shoulder and Elbow Surgeons Standardized Shoulder Assessment Form, patient self-report section: reliability, validity, and responsiveness. J Shoulder Elbow Surg 2002;11:587-94. http://dx.doi.org/10.1067/mse.2002.127096
10. Richards RR, An KN, Bigliani LU, Friedman RJ, Gartsman GM, Gristina AG, et al. A standardized method for the assessment of shoulder function. J Shoulder Elbow Surg 1994;3:347-52.
11. Smith AM, Barnes SA, Sperling JW, Farrell CM, Cummings JD, Cofield RH. Patient and physician-assessed shoulder function after arthroplasty. J Bone Joint Surg Am 2006;88:508-13. http://dx.doi.org/10.2106/JBJS.E.00132