Osteoarthritis and Cartilage (2001) 9, 746–750 © 2001 OsteoArthritis Research Society International doi:10.1053/joca.2001.0471, available online at http://www.idealibrary.com
Cross-cultural adaptation and validation of Korean Western Ontario and McMaster Universities (WOMAC) and Lequesne Osteoarthritis Indices for Clinical Research

S.-C. Bae, H.-S. Lee, H. R. Yun, T.-H. Kim, D.-H. Yoo and S. Y. Kim
The Department of Internal Medicine, Division of Rheumatology, Hanyang University College of Medicine and The Hospital for Rheumatic Diseases, Seoul, Korea

Summary

Objective: Our aims were to translate the WOMAC and Lequesne osteoarthritis (OA) indices into Korean (KWOMAC, KLequesne) and to confirm their reliability, validity, and responsiveness.

Design: The WOMAC and Lequesne indices were translated into Korean by three translators and translated back into English by three different translators. Fifty consecutive patients with OA were asked to rate the comprehensibility of the questions on a 4-point scale. The comprehensibility (the proportion responding ‘good’ or ‘very good’) ranged from 78% to 99%. Test–retest reliability was assessed in another 47 patients with knee OA. A final group of 53 patients with knee OA, enrolled in a 4-week clinical trial of two non-steroidal antiinflammatory drugs, was studied to assess the internal consistency, construct validity, and responsiveness of the Korean versions.

Results: The test–retest reliability of the three KWOMAC subscales and the KLequesne yielded intraclass correlation coefficients of 0.79–0.89 and 0.87, respectively. The Cronbach standardized alphas were 0.81–0.96 and 0.75, respectively. For construct validity, the correlation coefficients of both the KWOMAC subscales and the KLequesne with patient pain assessment and patient global assessment were between 0.30 and 0.70, and the KWOMAC subscales correlated with the KLequesne (0.41–0.55). For responsiveness, the KWOMAC and KLequesne scores had improved significantly by 4 weeks post-treatment compared with pre-treatment; effect size values were between 0.41 and 0.69 for the KWOMAC subscales and 0.70 for the KLequesne; and the relative efficiency values of the KWOMAC subscales vs the KLequesne were between 0.87 and 0.90.

Conclusions: The reliability, validity, and responsiveness of the KWOMAC and the KLequesne are confirmed.

© 2001 OsteoArthritis Research Society International

Key words: WOMAC, Lequesne, Osteoarthritis, Korean.
Introduction

Assessment of physical function and pain is essential to evaluate the course of disease and assess treatment effects in patients with musculoskeletal diseases. A number of instruments have been developed to measure functional disability and pain, ranging from physical assessments by trained assessors to self-administered questionnaires1,2. Among these instruments, the Western Ontario and McMaster Universities (WOMAC) and Lequesne osteoarthritis (OA) indices were developed specifically to measure the outcome of patients with lower extremity arthritis3. They have been validated, and are probably the two most widely used instruments for assessing outcomes in patients with OA of the hip and/or knee4–8. Although translated versions of the WOMAC and Lequesne indices are available in several other languages6,9,10, no Korean version exists so far. We adapted the WOMAC and Lequesne indices for the Korean language and studied their psychometric properties for research and clinical application.
Received 27 December 2000; revision requested 1 March 2001; revision received 12 March 2001; accepted 14 June 2001.
Supported by a grant of the Korea Health 21 R & D Project, Ministry of Health & Welfare, Republic of Korea (01-PJI-PG1-01 CH10-UOCM).
Address correspondence to: Sang-Cheol Bae, MD, PhD, MPH, The Hospital for Rheumatic Diseases, Hanyang University Medical Center, Seoul 133-792, Korea. Tel: 82-2-2290-9203; Fax: 82-22298-8231, E-mail: [email protected]
Osteoarthritis and Cartilage Vol. 9, No. 8

Materials and methods

SUBJECTS

Three groups of patients with OA were studied: a first group of 50 consecutive subjects with knee and/or hip OA for comprehensibility, a second group of 47 consecutive subjects with knee OA for test–retest reliability, and a third group of 53 consecutive subjects with knee OA for internal consistency, construct validity, and responsiveness (Table I). The study of the third group was performed within the context of a 4-week randomized controlled parallel trial of two non-steroidal antiinflammatory drugs (NSAIDs) (diclofenac and celecoxib); a total of 53 consecutive patients (49 women and four men) with knee OA were selected from the clinical trial and entered in this study. The mean age of these patients was 59.2±7.2 (43–79) years. The mean duration of disease was 8.3±6.6 (1–30) years. The mean total KWOMAC and KLequesne scores were 53.1±17.8 (6–86) (possible score range: 0–96) and 11.9±2.3 (7–17) (possible score range: 0–24), respectively (Table I). The mean pain VAS score was 67.8±15.8 (37–100) mm. The psychometric analyses were done by combining data from the two treatment groups. During the clinical trial, patients were assessed at four points: screening, pre-treatment, 2 weeks post-treatment, and 4 weeks post-treatment. All patients met the American College of Rheumatology criteria for knee and/or hip OA11,12, were attending an outpatient clinic at the Hospital for Rheumatic Diseases, Hanyang University, Seoul, Korea, and gave informed consent.

Table I Characteristics of three groups of patients

| Group | Female:Male | Age (years) | Disease duration (years) | KWOMAC score | KLequesne score |
| --- | --- | --- | --- | --- | --- |
| 1st (N=50): comprehensibility | 43:7 | 60.1±8.9 | 9.0±8.2 | — | — |
| 2nd (N=47): test–retest reliability | 43:4 | 55.2±6.9 | 7.2±7.1 | 49.2±16.3 | 10.5±2.2 |
| 3rd (N=53): internal consistency and construct validity | 49:4 | 59.2±7.2 | 8.3±6.6 | 53.1±17.8 | 11.9±2.3 |

TRANSLATION OF WOMAC AND LEQUESNE OA INDICES
The translation of the WOMAC and Lequesne OA indices into Korean (KWOMAC, KLequesne) followed the guidelines proposed by Guillemin et al.13: three translators produced the Korean versions, three different translators independently back-translated them into English, and a committee reviewed the results. For the translation and back-translation of the Lequesne index we used the English version instead of the original French version. Two questions of the WOMAC were modified for Korean culture on the basis of our previous experience: ‘Rising from bed’ and ‘Lying in bed’ were changed to ‘Rising from a mattress’ and ‘Lying on a mattress’, since beds are not popular in Korea14.

The WOMAC takes the form of a self-administered questionnaire and contains 24 questions categorized in three subscales (Pain, Stiffness, Physical function)4,5. Each question is rated on either a Likert scale or a visual analog scale ranging from ‘none’ to ‘extreme’. In this study we used the 5-point Likert scale format: none (0), mild (1), moderate (2), severe (3), and extreme (4)5. Likert scales were used because they perform similarly to the visual analog scale, but lack scaling problems during reproduction and are easier to score15. The score for each subscale was determined by summing its component item scores (possible score range, Pain: 0–20, Stiffness: 0–8, Physical function: 0–68). The total aggregated score (possible score range: 0–96) was determined by summing the three subscale scores, although the relative clinical importance of the subscales has yet to be determined.

The Lequesne index was developed in an interview format and contains 11 questions6; we used this interview format. It directly aggregates symptoms and function (possible score range: 0–24), which are not graded separately, but can be categorized in three subscales (Pain, Maximum distance walked, Activities of daily living).

COMPREHENSIBILITY
Fifty patients with OA were asked to rate each question on a 4-point scale (1: poor; 2: moderate; 3: good; 4: very good) according to whether they understood it and were familiar with the task it described as a reflection of their own function (comprehensibility). We regarded a question as comprehensible to a patient when the patient gave it a rating of 3 or above.
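The comprehensibility criterion above amounts to a simple proportion per question. The following is a minimal sketch, assuming each question's ratings are available as a list of integers from 1 to 4; the function name `comprehensible_share` and the example ratings are hypothetical and are not taken from the study data.

```python
def comprehensible_share(ratings, threshold=3):
    """Return the percentage of ratings at or above `threshold`.

    `ratings` holds one question's ratings on the 4-point scale
    (1: poor, 2: moderate, 3: good, 4: very good).
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    n_ok = sum(1 for r in ratings if r >= threshold)
    return 100.0 * n_ok / len(ratings)


# Hypothetical ratings from ten patients for a single question.
example_ratings = [4, 3, 2, 4, 1, 3, 3, 4, 4, 3]
print(comprehensible_share(example_ratings))  # 8 of 10 ratings are >= 3 -> 80.0
```

Applying such a function to every question yields one percentage per question, from which a range and median can be reported.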
RELIABILITY
Test–retest reliability was assessed in an additional group of patients with knee OA, since retesting was not possible in the third group owing to restrictions of the clinical trial protocol. In this second group of 47 subjects, the first questionnaire was completed by the patients themselves while visiting the clinic and the second was completed by mail one week after the visit. Internal consistency was calculated from the pre-treatment results in the third group of 53 subjects.
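To illustrate the internal-consistency calculation, the sketch below computes the raw (unstandardized) Cronbach's alpha from a respondents-by-items table of scores. It is an assumption-laden illustration, not the SAS analysis used in the study: the function name `cronbach_alpha` and the example scores are hypothetical, and the standardized alpha, which is based on the mean inter-item correlation, can differ slightly from the raw value.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Raw Cronbach's alpha for a list of respondents' item scores.

    `rows` is a list of equal-length lists: one inner list per patient,
    one entry per questionnaire item.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(r) != k for r in rows):
        raise ValueError("all respondents must answer the same items")
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 0-4 Likert scores: five patients, three items.
example_scores = [
    [2, 3, 3],
    [1, 1, 2],
    [4, 4, 3],
    [0, 1, 1],
    [3, 2, 3],
]
print(round(cronbach_alpha(example_scores), 3))
```

Alpha approaches 1 when items rise and fall together across patients, which is why values above 0.70 are commonly read as acceptable internal consistency.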
CONSTRUCT VALIDITY
Construct validity was assessed in the third group by comparing the pre-treatment responses on either the KWOMAC or the KLequesne with three measures of disease severity and activity: (1) patient pain assessment using a 100-mm horizontal visual analog scale (VAS) from 0 (no pain) to 100 (severe pain); (2) patient global assessment using a 5-point Likert scale; and (3) physician global assessment using a 5-point Likert scale.
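The comparisons described above rest on rank correlations between paired scores. A minimal sketch of a Spearman rank correlation follows, computed as the Pearson correlation of average ranks so that tied scores (common with Likert items) are handled. The helper names `average_ranks` and `spearman_rho` and the paired values are hypothetical; this is not the study's SAS code.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical paired data: pain VAS (mm) and a pain subscale score (0-20).
vas_pain = [37, 55, 62, 70, 88, 100]
pain_subscale = [6, 9, 8, 12, 15, 18]
print(round(spearman_rho(vas_pain, pain_subscale), 2))
```

The function is undefined when either variable is constant, because the rank variance in the denominator is then zero.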
RESPONSIVENESS
Responsiveness of the KWOMAC and the KLequesne was assessed in three ways in the third group. Firstly, KWOMAC and KLequesne scores at pre-treatment and at 4 weeks post-treatment were compared. Secondly, the effect size (ES) of both instruments was calculated16. Finally, relative efficiency (RE) values of the KWOMAC vs the KLequesne between pre-treatment and 4 weeks post-treatment were obtained17.
STATISTICS
Test–retest reliability was tested using intraclass correlation coefficients18. Internal consistency was measured by Cronbach’s alpha19. Spearman rank correlation coefficients were calculated between either the KWOMAC or the KLequesne scores and the three disease severity variables. Responsiveness was tested using Wilcoxon’s nonparametric test20 (for the significance of the difference in means), ES, and RE. The ES is the mean change in score divided by the standard deviation of the baseline score. A large ES indicates high sensitivity to change (0.2: small effect, 0.5: moderate effect, 0.8: large effect)16,21. The RE is calculated as the square of the ratio of two appropriate t-statistics17, i.e. RE (KWOMAC vs KLequesne) = $(t_{\mathrm{KWOMAC}} / t_{\mathrm{KLequesne}})^2$. If an RE is >1, the instrument in the numerator can be inferred to be a more responsive and efficient tool for measuring change than the instrument in the denominator. Conversely, if an RE is <1, the instrument in the denominator can be inferred to be more responsive. Data were analysed using the SAS statistical software for personal computers (SAS 6.12)22.

Results

COMPREHENSIBILITY

Patients’ ratings of each question were high, with 82–99% (median: 93%) of the patients in the KWOMAC and 78–98% (median: 91%) in the KLequesne rating each question as ‘good’ or ‘very good’. Among all questions, the least comprehensible was ‘Maximum distance walked’ of the KLequesne (78%).

RELIABILITY: TEST–RETEST RELIABILITY

Test–retest reliability yielded intraclass correlation coefficients of 0.79–0.89 for the KWOMAC subscales and 0.87 for the total KLequesne (Table II).

RELIABILITY: INTERNAL CONSISTENCY

Cronbach’s alphas showed strong reliability, with standardized alphas of 0.97 for the total KWOMAC and 0.81–0.96 for its subscales, and 0.75 for the total KLequesne and 0.71–0.85 for its subscales (Table II). The Spearman rank correlation coefficients between one item and the total and each subscale score (excluding that item) ranged from 0.66 to 0.86 in the KWOMAC and from 0.15 to 0.80 in the KLequesne (Table II).

Table II Reliability of the Korean WOMAC and Lequesne indices

| Scale | R* | Alpha† | Correlation‡ |
| --- | --- | --- | --- |
| Korean WOMAC (Total) | 0.95 | 0.97 | 0.66–0.86 |
| Pain | 0.85 | 0.89 | 0.67–0.81 |
| Stiffness | 0.79 | 0.81 | 0.68 |
| Physical function | 0.89 | 0.96 | 0.67–0.84 |
| Korean Lequesne (Total) | 0.87 | 0.75 | 0.15–0.57 |
| Pain | 0.76 | 0.71 | 0.31–0.61 |
| Maximum distance walked | 0.71 | 0.25 | 0.15 |
| Daily activity of living | 0.80 | 0.85 | 0.60–0.80 |

*Test–retest reliability (intraclass correlation coefficient). †Internal consistency (Cronbach’s alpha). ‡Correlations between one item and the total and each subscale score excluding that item using Spearman rank correlation coefficient.

CONSTRUCT VALIDITY

The Spearman rank correlation coefficients of both the KWOMAC subscales and the KLequesne with patient pain assessment and patient global assessment were between 0.30 and 0.70, but those with physician global assessment were between 0.17 and 0.26. The correlation coefficients of the KWOMAC subscales with the KLequesne were between 0.41 and 0.55 (Table III).

Table III Construct validity of the Korean WOMAC and Lequesne indices

| Category | KWOMAC Total | KWOMAC Pain | KWOMAC Stiffness | KWOMAC Function | KLequesne Total | KLequesne Pain | KLequesne Distance | KLequesne Activity | VAS | PtGA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| KLequesne Total | 0.59** | 0.53** | 0.41** | 0.55** | | | | | | |
| KLequesne Pain | 0.56** | 0.63** | 0.44** | 0.48** | | | | | | |
| KLequesne Distance | 0.10 | −0.04 | 0.02 | 0.14 | | | | | | |
| KLequesne Activity | 0.44** | 0.38** | 0.31* | 0.41** | | | | | | |
| VAS | 0.59** | 0.70** | 0.51** | 0.49** | 0.42** | 0.53** | −0.11 | 0.34* | | |
| PtGA | 0.46** | 0.50** | 0.36** | 0.40** | 0.30* | 0.24 | −0.06 | 0.36** | 0.53** | |
| PhGA | 0.26 | 0.42** | 0.36** | 0.17 | 0.25 | 0.33* | −0.21 | 0.32* | 0.48** | 0.54** |

Numbers represent Spearman rank correlation coefficients. *P<0.05; **P<0.01. VAS: patient pain assessment using visual analog scale; PtGA: patient global assessment by Likert scale; PhGA: physician global assessment by Likert scale; Distance: maximum distance walked; Activity: daily activity of living.

RESPONSIVENESS

The total and three subscale scores of the KWOMAC had improved significantly by 4 weeks post-treatment compared with pre-treatment. For the KLequesne, the total and ‘Pain’ subscale scores improved significantly, but the ‘Maximum distance walked’ and ‘Daily activity of living’ subscale scores did not (Table IV). The ES values were 0.63 for the total KWOMAC and 0.70 for the total KLequesne; those of the KLequesne’s ‘Maximum distance walked’ and ‘Daily activity of living’ subscales were 0.05 and 0.08, respectively (Table IV). The RE values for the total and three subscales of the KWOMAC vs the total and three subscales of the KLequesne were between 0.87 and 2.36; the RE value for the ‘Pain’ subscale of the KWOMAC vs the ‘Pain’ subscale of the KLequesne was 1.0, and the RE values for the ‘Physical function’ subscale of the KWOMAC vs the ‘Maximum distance walked’ and ‘Daily activity of living’ subscales of the KLequesne were 2.23 and 1.14, respectively (Table V).

Table IV Responsiveness of the Korean WOMAC and Lequesne indices

| Scale | Difference in mean between before and after treatment | Effect size |
| --- | --- | --- |
| Korean WOMAC (Total) | −11.2* | 0.63 |
| Pain | −2.6* | 0.69 |
| Stiffness | −0.8* | 0.41 |
| Physical function | −7.8* | 0.56 |
| Korean Lequesne (Total) | −1.7* | 0.70 |
| Pain | −1.3* | 1.05 |
| Maximum distance walked | −0.1 | 0.05 |
| Daily activity of living | −0.3 | 0.08 |

*P<0.05.

Table V Comparison of relative efficiency of the Korean WOMAC vs Lequesne indices

| KWOMAC / KLequesne | Total | Pain | Maximum distance walked | Daily activity of living |
| --- | --- | --- | --- | --- |
| Total | 0.96 | 1.10 | 2.36 | 1.31 |
| Pain | 0.87 | 1.00 | 2.14 | 1.19 |
| Stiffness | 0.88 | 1.01 | 2.17 | 1.21 |
| Physical function | 0.90 | 1.04 | 2.23 | 1.24 |

Relative efficiency = $(t_{\mathrm{KWOMAC}} / t_{\mathrm{KLequesne}})^2$.
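The two responsiveness statistics reported above can be sketched in a few lines. The example assumes that the effect size uses the magnitude of the mean change (the tabulated differences are negative because scores fell after treatment) and that the baseline standard deviation for the total KWOMAC is the 17.8 reported for the third group. The function names are hypothetical, and the t-statistics passed to `relative_efficiency` are invented for illustration because the paper does not report them.

```python
def effect_size(mean_change, sd_baseline):
    """Effect size: magnitude of the mean change divided by the baseline SD."""
    if sd_baseline <= 0:
        raise ValueError("sd_baseline must be positive")
    return abs(mean_change) / sd_baseline


def relative_efficiency(t_numerator, t_denominator):
    """Relative efficiency: squared ratio of two t-statistics.

    Values above 1 favour the instrument in the numerator.
    """
    if t_denominator == 0:
        raise ValueError("t_denominator must be non-zero")
    return (t_numerator / t_denominator) ** 2


# Total KWOMAC: mean change -11.2 (Table IV), baseline SD 17.8 (Table I).
print(round(effect_size(-11.2, 17.8), 2))  # 0.63

# Hypothetical t-statistics, for illustration only.
print(relative_efficiency(3.0, 2.0))  # 2.25
```

Because the relative efficiency squares the ratio, modest differences between the t-statistics of two instruments translate into larger differences in RE.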
Discussion

We have developed cross-culturally adapted Korean-language versions of the WOMAC and the Lequesne indices and presented evidence of their comprehensibility, reliability, validity, and responsiveness. Their comprehensibility was reasonably high. The least comprehensible question was ‘Maximum distance walked’. This may be attributable to unfamiliarity with the concept of ‘distance in metres’ amongst elderly Koreans, many of whom have had little formal education. Although kneeling and squatting are very common activities among Koreans, we did not consider them in this study since the original questionnaires do not include such activities. The next research task will be to identify appropriate questions for the Korean context and to establish their validity.

Test–retest correlations of the KWOMAC subscales (0.79–0.89) and the KLequesne (0.87) were high. The KWOMAC’s test–retest correlations were better than those of the original validation study, but similar to those of the German version4,10. Cronbach’s alphas of the KWOMAC and the KLequesne were 0.97 and 0.75, respectively, for all items and >0.70 for their subscales, which were similar to previous work4,10. The internal consistency of the KWOMAC was better than that of the KLequesne, which was similar to the findings of Stucki et al.10, but all are in the acceptable range for internal consistency (>0.70)23.

For the construct validity, we used only subjective measurements such as patient and physician global assessments, and not objective measurements such as range of motion (flexion or extension)5,6 or Kellgren and Lawrence’s radiologic method10. The KWOMAC correlated significantly with the KLequesne in the total and subscale scores except ‘Maximum distance walked’18. Both instruments correlated significantly with patient pain assessment and patient global assessment, but not with physician global assessment. These findings are consistent with the OMERACT III conclusions, in which physician global assessment was considered less valid than patient global assessment25.

The effect size of the total KWOMAC was slightly lower than that of the total KLequesne, and the RE value of the total KWOMAC vs the total KLequesne was 0.96. However, the effect size values of the three individual KWOMAC subscale scores were between 0.41 and 0.69, while the KLequesne ‘Pain’ subscale effect size value was 1.05 and the ‘Maximum distance walked’ and ‘Activity of daily living’ subscale effect size values were very small (0.05 and 0.08). The RE values of the three KWOMAC subscales vs the ‘Maximum distance walked’ subscale were over 2. The trend of RE values between the KWOMAC and the KLequesne was similar to that reported by Bellamy et al.4.

Some limitations require consideration. Firstly, the English version of the Lequesne index was used instead of the original French version for the cross-cultural adaptation into Korean, since it is quite difficult to find French translators and back-translators in Korea who are qualified according to the proposed guidelines13. However, this is unlikely to have influenced our results because the English Lequesne index has been validated and is widely used. Secondly, our study groups were relatively small and might not represent the whole spectrum of disease severity in Korea. Finally, the KWOMAC and the KLequesne were not compared in terms of the time required, i.e. how long the KLequesne interview takes and how long patients need to complete the self-administered KWOMAC properly.

In conclusion, our study suggests that both Korean versions of the WOMAC and the Lequesne are comprehensible, reliable, valid, and responsive instruments to measure outcome in patients with knee OA in Korea, and their psychometric properties are comparable with those of the original versions.
Acknowledgments

We gratefully acknowledge Searle and Pfizer Co., Korea for the use of some data from the clinical trial of celecoxib and diclofenac. We thank Dr Dong-Soo Lee, Eun Yang Kim, Mun-Gap Kim, and Eun-Ju Kwak for technical support and statistical analyses, and Dr Seung-Won Lee for his thoughtful review of the manuscript. We also thank three anonymous reviewers for their insightful comments.
References

1. Karlson EW, Katz JN, Liang MH. Chronic rheumatic disorders. In: Spilker B, Ed. Quality of Life and Pharmacoeconomics in Clinical Trials. Philadelphia: Lippincott-Raven 1996:1029–37.
2. Liang MH, Jette AM. Measuring functional ability in chronic arthritis: a critical review. Arthritis Rheum 1981;24:80–6.
3. Wright JG. Quality of life in orthopedics. In: Spilker B, Ed. Quality of Life and Pharmacoeconomics in Clinical Trials. Philadelphia: Lippincott-Raven 1996:1039–44.
4. Bellamy N, Buchanan WW, Goldsmith CH, Campbell J, Stitt LW. Validation study of WOMAC: a health status instrument for measuring clinically important patient relevant outcomes to antirheumatic drug therapy in patients with osteoarthritis of the hip or knee. J Rheumatol 1988;15:1833–40.
5. Bellamy N, Kean WF, Buchanan W, Gerecz-Simon E, Campbell J. Double blind randomized controlled trial of sodium meclofenamate (Meclomen) and diclofenac sodium (Voltaren): Post validation reapplication of the WOMAC osteoarthritis index. J Rheumatol 1992;19:153–9.
6. Lequesne MG, Mery C, Samson M, Gerard P. Indexes of severity for osteoarthritis of the hip and knee. Scand J Rheumatol 1987;65(suppl):85–9.
7. Lequesne MG. Indices of severity and disease activity for osteoarthritis. Semin Arthritis Rheum 1991;20(suppl 2):48–54.
8. Lequesne MG. The algofunctional indices for hip and knee osteoarthritis. J Rheumatol 1997;24:779–81.
9. Bellamy N, Campbell J, Stevens J, Pilch L, Stewart C, Mahmood Z. Validation study of a computerized version of the Western Ontario and McMaster Universities VA 3.0 osteoarthritis index. J Rheumatol 1997;24:2413–15.
10. Stucki G, Sangha O, Stucki S, Michel BA, Tyndall A, Dick W, et al. Comparison of the WOMAC (Western Ontario and McMaster Universities) osteoarthritis index and a self-report format of the self-administered Lequesne-Algofunctional index in patients with knee and hip osteoarthritis. Osteoarthritis Cartilage 1998;6:79–86.
11. Altman R, Asch E, Bloch D, Bole G, Borenstein D, Brandt K, et al. Development of criteria for the classification and reporting of osteoarthritis: classification of osteoarthritis of the knee. Arthritis Rheum 1986;29:1039–49.
12. Altman R, Alarcon G, Appelrouth D, Bloch D, Borenstein D, Brandt K, et al. The American College of Rheumatology criteria for the classification and reporting of osteoarthritis of the hip. Arthritis Rheum 1991;34:505–14.
13. Guillemin F, Bombardier C, Beaton D. Cross-cultural adaptation of health related quality of life measures: literature review and proposed guidelines. J Clin Epidemiol 1993;46:1417–32.
14. Bae SC, Cook EF, Kim SY. Psychometric evaluation of a Korean Health Assessment Questionnaire for clinical research. J Rheumatol 1998;25:1975–9.
15. Jensen MP, Karoly P, Braver S. The measurement of clinical pain intensity: a comparison of six methods. Pain 1986;27:117–26.
16. Liang MH. Evaluating measurement responsiveness. J Rheumatol 1995;22:1191–2.
17. Liang MH, Larson MG, Cullen KE, Schwarz JA. Comparative measurement efficacy and sensitivity of five health status instruments for arthritis research. Arthritis Rheum 1985;28:542–7.
18. Fleiss JL, Cohen J. The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educ Psychol Meas 1973;33:613–19.
19. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika 1951;16:297–334.
20. Armitage P. Distribution free methods. In: Statistical Methods in Medical Research. New York: John Wiley 1977:394–407.
21. Kazis LE, Anderson JJ, Meenan RF. Effect sizes for interpreting changes in health status. Med Care 1989;27:S178–89.
22. SAS/STAT User’s guide. Version 6. 4th edn. Cary, NC: SAS Institute Inc 1990.
23. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159–74.
24. Streiner DL, Norman GR (Eds). Health Measurement Scales: a practical guide to their development and use, 2nd edn. Oxford: Oxford University Press 1989.
25. Bellamy N, Kirwan J, Boers M, Brooks P, Strand V, Tugwell P, et al. Recommendations for a core set of outcome measures for future phase II trials in knee, hip, and hand osteoarthritis. Consensus development at OMERACT III. J Rheumatol 1997;24:799–802.