Improving agreement in assessment of synovitis in rheumatoid arthritis


Joint Bone Spine 80 (2013) 155–159

Available online at

www.sciencedirect.com

Original article

Peter P. Cheung a,b,∗, Maxime Dougados a, Vincent Andre c, Nathalie Balandraud d, Gérard Chales f, Isabelle Chary-Valckenaere g, Emmanuel Chatelus h, Emmanuelle Dernis c, Ghislaine Gill g, Mélanie Gilson i, Sandrine Guis d,e, Gael Mouterde j, Stephan Pavy k, François Pouyol l, Thierry Marhadour m, Pascal Richette n,o, Adeline Ruyssen-Witrand p, Martin Soubrier q, Laure Gossec a

a Medicine faculty, Paris-Descartes university, Cochin hospital, UPRES-EA 4058, AP–HP, Rheumatology B, Paris, France
b Division of Rheumatology, National University Health System, 1E, Kent-Ridge road, Tower Block Level 10, 119228, Singapore
c Service de rhumatologie, Centre hospitalier, Le Mans, France
d AP–HM, rhumatologie I, hôpital Sainte-Marguerite, 13009 Marseille, France
e Aix-Marseille université, CRMBM UMR CNRS7339, 13385 Marseille, France
f INSERM UMR 991, hôpital Sud, CHU de Rennes, université de Rennes 1, France
g CHU de Nancy, Nancy, France
h Hôpital de Hautepierre, Strasbourg, France
i Hôpital Sud, CHU de Grenoble, Échirolles, France
j Rheumatology Department, Montpellier 1 university, Lapeyronie hospital, UMR 5535, Montpellier, France
k Hôpitaux universitaires Paris Sud, CHU de Bicêtre, 94275 Le Kremlin-Bicêtre, France
l Hôpital Roger Salengro, CHU Lille, Lille, France
m Hôpital de la Cavale Blanche, Brest, France
n Pôle appareil locomoteur, Fédération de Rhumatologie, Hôpital Lariboisière, AP–HP, 75010 Paris, France
o Université Paris Diderot, Sorbonne Paris Cité, 75205 Paris, France
p CHU de Toulouse, Toulouse, France
q CHU de Clermont-Ferrand, service de Rhumatologie, 63003 Clermont-Ferrand, France

Article info

Article history: Accepted 23 July 2012. Available online 19 September 2012.

Keywords: Synovitis; Rheumatoid arthritis; Swollen joints; Agreement; Clinical examination

Abstract

Objective: Synovitis assessment through evaluation of swollen joints is integral in steering treatment decisions in rheumatoid arthritis (RA). However, there is high inter-observer variation. The objective was to assess whether a short collegiate consensus would improve swollen joint agreement between rheumatologists and whether this was affected by experience.

Methods: Eighteen rheumatologists from French university rheumatology units participated in three 30-minute rounds over a half-day meeting, evaluating joint counts of RA patients in small groups, followed by short consensus discussions. Agreement was evaluated at the end of each round as follows: (i) global agreement of swollen joints; (ii) swollen joint agreement according to the level of experience of the rheumatologist; (iii) swollen joint count; and (iv) agreement of disease activity state according to the Disease Activity Score (DAS28). Agreement was calculated using percentage agreement and kappa.

Results: Global agreement of swollen joints at the joint level failed to improve (kappa 0.50 to 0.52). Agreement between seniors did not improve, but agreement between newly qualified rheumatologists and their senior peers, which was initially poor (kappa 0.28), improved significantly (to 0.54) by the end of the consensus exercises. Concordance of DAS28 activity states improved from 71% to 87%.

Conclusion: Consensus exercises for swollen joint assessment are worthwhile and may improve agreement between clinicians on clinical synovitis and disease activity state. The benefit was mostly observed in newly qualified rheumatologists.

© 2012 Société française de rhumatologie. Published by Elsevier Masson SAS. All rights reserved.

∗ Corresponding author. E-mail address: peter [email protected] (P.P. Cheung).

1. Introduction

Rheumatoid arthritis (RA) is a chronic inflammatory disease with joint inflammation typically presenting as synovitis. Synovitis is the main factor that drives joint destruction and, if untreated, may lead to permanent joint damage and functional disability [1,2]. Traditionally, synovitis is assessed clinically through detection of swollen joints, which is integral to the assessment of RA disease activity [3]. This helps steer treatment decisions to achieve clinical remission as part of the overarching principles of “treating to target” [4]. Data indicate that patients with residual joint swelling continue to progress radiographically [5]. Therefore

1297-319X/$ – see front matter © 2012 Société française de rhumatologie. Published by Elsevier Masson SAS. All rights reserved. doi:10.1016/j.jbspin.2012.07.014


physician detection and agreement on what is clinically swollen are of crucial importance.

One major problem with swollen joint detection is the wide inter-observer variation between clinicians, both for the global joint count and at the joint level [6–12]. Previous studies have explored ways to reduce inter-observer variation of joint counts through standardization. However, very few studies have evaluated improvement in swollen joint counts (SJC), and their results are conflicting [6,7,12,13]. There also appears to be a high level of heterogeneity in the method of training, the standardization technique and participant characteristics. Some of the methods used may not be applicable to clinical practice [12,13]. Therefore, a practical and feasible method of improving consensus in synovitis detection among rheumatologists at the joint and patient level is required. It is also unclear whether the level of experience of the observer (rheumatologist) influences the success of standardization or consensus.

The aim of the present study was therefore to assess whether a short collegiate small-group consensus would improve swollen joint agreement between rheumatologists, and whether the level of experience of the rheumatologist affected this.

2. Methods

2.1. Patient and rheumatologist selection

A cross-sectional study was conducted in Paris in 2011 over half a day. Nine patients with RA from Cochin Hospital, covering a range of disease activity, were invited after giving oral and written consent. Eighteen rheumatologists from 14 French university rheumatology units participated in the consensus exercises. These rheumatologists also participated concurrently in a training program to teach nurses how to evaluate synovitis, which was an initial part of a randomized controlled national study on patient education and self-assessment of the DAS28 in France (COMorbidities and Education in Rheumatoid Arthritis [COMEDRA], NCT01315652).

2.2. Materials

Information on clinical examination was prepared based on the EULAR handbook of disease assessments in RA [14] and distributed to the rheumatologists in booklet form and on video. Rheumatologists had the opportunity to go through the material prior to the consensus exercises.

2.3. Allocation of rheumatologists and patients

Rheumatologists underwent three rounds of clinical examination exercises. Prior to commencement, they were randomised into small groups of 3–4. In each round, they individually performed a tender and swollen joint assessment of the 28 joints (on each side: the wrist, five metacarpophalangeal joints, five proximal interphalangeal joints, the elbow, the shoulder and the knee) on one RA patient. Results were recorded and not discussed until everyone in the group had finished their examination. Results, in particular for swollen joints, were then discussed in the respective groups. Joints with discordant results were re-examined with the objective of reaching a consensus. Each round lasted for at least 30 minutes. The third round was purely an examination round without further discussion of findings. At the end of the consensus session, each rheumatologist had examined three RA patients. The exact format of the sessions and rotations is included in Table 1, Supplementary data. Between consensus rounds, rheumatologists participated in practical sessions teaching nurses how to evaluate synovitis.
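As an illustration of the set-up described in this section, the following Python sketch lists the 28 joints of the count and randomises 18 raters into small groups. It is not the allocation procedure used in the study: the joint labels, the function names, the choice of five groups and the seed are all assumptions made for the example. The article states only that the groups contained 3–4 rheumatologists.

```python
import random


def das28_joints():
    """Return labels for the 28 joints: 14 per side (labels are illustrative)."""
    joints = []
    for side in ("left", "right"):
        joints += [f"{side} shoulder", f"{side} elbow", f"{side} wrist", f"{side} knee"]
        # Five MCP and five PIP joints per hand; "PIP 1" stands for the thumb
        # interphalangeal joint in the 28-joint count.
        joints += [f"{side} MCP {i}" for i in range(1, 6)]
        joints += [f"{side} PIP {i}" for i in range(1, 6)]
    return joints


def allocate_raters(raters, n_groups, seed=None):
    """Shuffle the raters and deal them into n_groups of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(raters)
    rng.shuffle(shuffled)
    # Dealing every n_groups-th rater keeps group sizes within one of each other.
    return [shuffled[i::n_groups] for i in range(n_groups)]


if __name__ == "__main__":
    raters = [f"R{i}" for i in range(1, 19)]  # 18 hypothetical rater IDs
    for number, group in enumerate(allocate_raters(raters, n_groups=5, seed=1), 1):
        print(f"Group {number}: {group}")
    print(len(das28_joints()), "joints per patient")
```

With 18 raters and five groups, this deals three groups of four and two groups of three, which is one way of obtaining groups of 3–4.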

2.4. Global agreement of swollen joints

Agreement of swollen joints at the “joint” level between each pair of raters was recorded, pooled together in each round and expressed as a global kappa, a proportion of positive agreement (synovitis) and a proportion of negative agreement (no synovitis).

2.5. Agreement levels according to experience of rheumatologist

To evaluate whether participant experience was associated with agreement, raters were stratified by tertiles of years of experience: recently qualified rheumatologist (< 5 years), experienced (5–10 years) and very experienced (> 10 years). Two-by-two tables were constructed for the following situations: recently qualified rheumatologist versus very experienced rheumatologist, experienced versus very experienced rheumatologist, and recently qualified rheumatologist versus experienced rheumatologist. Agreement was calculated as percentage agreement and kappa. The strength of agreement was interpreted as follows: kappa less than 0 was poor, 0–0.2 was slight, 0.21–0.4 was fair, 0.41–0.6 was moderate, 0.61–0.8 was substantial and 0.81–1 was excellent [15].

2.6. Agreement of joint count and disease activity state according to DAS28

Agreement at the “patient” level in terms of SJC and agreement of DAS28 disease activity states were compared between the rheumatologists through Wilcoxon’s rank statistic and total percentage of concordance, respectively. An analysis of the effect of disease activity on agreement of SJC and DAS28 was performed as well.

3. Statistical analysis

Statistical analysis was carried out using SPSS version 19 (SPSS, Chicago, IL). Descriptive data were summarized using medians and inter-quartile ranges for continuous variables and absolute numbers with proportions in percentages for dichotomous variables. Kappas with 95% confidence intervals were computed.

3.1. Agreement for individual joints for synovitis

Agreement at the joint level for swollen joints was assessed at each round by grouping all the possible paired observations in each group, using two-by-two cross tabulation with calculation of the exact percent agreement of paired observations. For example, a group with four rheumatologists would have six paired observations: Rater 1 vs 2, Rater 2 vs 3, Rater 3 vs 4, Rater 4 vs 1, Rater 2 vs 4 and Rater 1 vs 3. Pooled percentage agreement and kappa in each round were then calculated by grouping all the paired results of the raters in all the groups for each round, resulting in a global kappa with 95% confidence intervals and a global percentage agreement. As the number of joints with no synovitis was high, we addressed the paradox of the kappa by also calculating separate indexes of agreement: the positive proportion of agreement and the negative proportion of agreement [16]. These analyses were repeated according to the level of experience of the rheumatologists.

3.2. Agreement for joint count and disease activity state (DAS28)

To assess agreement on the SJC, Wilcoxon’s signed rank statistic was used to compare the number of ties and the number of disagreements of the respective paired results. Rater 1 was more junior than Rater 2 in each paired comparison. Discordant results were


Table 1
Agreement regarding swollen joints among the 18 rheumatologists, globally and categorized according to level of experience.

                      Global              Recently qualified    Recently qualified      Experienced
                                          vs experienced        vs very experienced     vs very experienced
Session 1
  Kappa (95% CI)      0.50 (0.41, 0.59)   0.33 (0.15, 0.51)     0.28 (0.05, 0.51)       0.74 (0.61, 0.87)
  Positive agreement  12%                 8%                    7%                      21%
  Negative agreement  73%                 73%                   71%                     69%
  Disagreement        15%                 19%                   22%                     10%
Session 2
  Kappa (95% CI)      0.53 (0.46, 0.60)   0.49 (0.36, 0.63)     0.53 (0.37, 0.68)       0.42 (0.24, 0.59)
  Positive agreement  15%                 17%                   15%                     13%
  Negative agreement  65%                 64%                   67%                     67%
  Disagreement        20%                 20%                   17%                     20%
Session 3
  Kappa (95% CI)      0.52 (0.44, 0.60)   0.47 (0.31, 0.64)     0.54 (0.38, 0.70)       0.42 (0.24, 0.61)
  Positive agreement  11%                 10%                   14%                     11%
  Negative agreement  73%                 75%                   71%                     71%
  Disagreement        16%                 15%                   15%                     18%

then analysed using medians and interquartile ranges over the three sessions to evaluate the change in SJC difference between the pairs. Agreement for disease activity state through DAS28 was analysed between raters descriptively over the three rounds as total percentage concordance.
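The joint-level indices described in the statistical analysis can be sketched in Python as follows. This is a minimal illustration and not the authors' SPSS procedure: the function names and the toy ratings are hypothetical, and confidence intervals are omitted. The positive and negative proportions of agreement are written as 2a/(2a+b+c) and 2d/(2d+b+c), which is assumed here to match the indexes of reference [16]. Because the rows of Table 1 sum to about 100% within each column, the sketch also returns the simple shares of all paired observations.

```python
from itertools import combinations


def pooled_table(ratings_by_rater):
    """Pool every rater pair of a group into one 2x2 table (a, b, c, d).

    ratings_by_rater maps a rater ID to a list of 0/1 swollen-joint calls,
    with joints in the same order for every rater.
    """
    a = b = c = d = 0
    for first, second in combinations(sorted(ratings_by_rater), 2):
        for x, y in zip(ratings_by_rater[first], ratings_by_rater[second]):
            if x and y:
                a += 1  # both call the joint swollen
            elif x:
                b += 1  # only the first rater of the pair does
            elif y:
                c += 1  # only the second rater of the pair does
            else:
                d += 1  # both call the joint normal
    return a, b, c, d


def _ratio(numerator, denominator):
    """Divide, returning NaN when the index is undefined."""
    return numerator / denominator if denominator else float("nan")


def agreement_indices(a, b, c, d):
    """Cohen's kappa plus the agreement indices for a pooled 2x2 table."""
    n = a + b + c + d
    if n == 0:
        raise ValueError("the table contains no paired observations")
    observed = (a + d) / n
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return {
        "kappa": _ratio(observed - expected, 1 - expected),
        "percent_agreement": 100 * observed,
        "positive_agreement": _ratio(2 * a, 2 * a + b + c),
        "negative_agreement": _ratio(2 * d, 2 * d + b + c),
        "share_positive": 100 * a / n,
        "share_negative": 100 * d / n,
        "share_disagreement": 100 * (b + c) / n,
    }


if __name__ == "__main__":
    # Toy data: three raters, four joints each (1 = swollen, 0 = not swollen).
    group = {"A": [1, 0, 1, 0], "B": [1, 0, 0, 0], "C": [1, 1, 0, 0]}
    print(agreement_indices(*pooled_table(group)))
```

A group of four raters yields six pairs through `combinations`, matching the example given in section 3.1. Pooling the tables of all groups in a round before calling `agreement_indices` would give a global value of the kind described there.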

3.3. Effect of disease activity on agreement between rheumatologists

A visual approach using cumulative boxplots was used to see whether the level of disease activity affected the agreement levels of DAS28 and the SJC.

4. Results

4.1. Rheumatologist characteristics

There were ten male and eight female rheumatologists, with a median of 9.5 years (Q1:Q3, 2.8:14.3) of experience since qualification as a rheumatologist. Seven had less than 5 years, six had 5 to 10 years and five had more than 10 years of experience. A majority of rheumatologists (83%, n = 15) saw 10 to 50 RA patients per week and 50% (n = 9) calculated the DAS28 during every consultation. Only 33% (n = 6) of rheumatologists had participated in some form of joint count consensus exercise in the past.

4.2. Patients

The nine RA patients (five female, four male) had a median age of 63.7 years and a long median disease duration of 18.4 years, with a median tender joint count (TJC) of 3 (Q1:Q3 0,6) and a median SJC of 6 (Q1:Q3 4,7). The median DAS28 was 3.7 (Q1:Q3 2.9,4.5).

4.3. Agreement for individual joints for synovitis

A total of 1484 joints were examined. Overall, rheumatologists agreed on 83% of the joints and disagreed on 17% of the joints. Of the joints that were in agreement, 16% were classified as having synovitis (positive agreement) and 84% as normal (negative agreement). The global kappas for the rheumatologists over the three sessions (Table 1) indicated that agreement was moderate throughout the process, with no improvement (kappa, 0.50–0.52). The global proportions of positive agreement and negative agreement also showed no improvement (Table 1).

4.4. Agreement of synovitis according to level of experience

The kappa levels for the three groups according to level of experience (Table 1) showed that agreement between recently qualified rheumatologists and their more senior peers was initially very low, with a kappa of 0.28 (95%CI 0.05,0.51) to 0.33 (95%CI 0.15,0.51). The initial kappa for an experienced rheumatologist versus a very experienced rheumatologist was much higher (kappa: 0.74). Over the three sessions, only the kappa of the recently qualified rheumatologists compared with their more senior peers improved, reaching up to 0.54 (Fig. 1). The kappa between experienced and very experienced rheumatologists did not improve over the three sessions (0.74 in session 1 and 0.42 in sessions 2 and 3, Table 1). The improvement in the agreement of recently qualified rheumatologists with their more senior peers was largely due to the increase in the proportion of positive agreement (synovitis).

4.5. Agreement of joint counts

Out of 70 paired results, 16 (23%) had agreement in the SJC. Of the 54 discordant results, rank results showed that the more experienced rheumatologist frequently reported a higher SJC than the less experienced of the pair (Table 2, Supplementary data). The SJC difference improved (i.e., decreased) from a median of three joints initially to two joints in the last session, with narrowing of the inter-quartile range over the three sessions (2.5 to 1.2).

4.6. Agreement for disease activity state through DAS28

Overall, there was good agreement regarding classification of disease activity states according to DAS28 by each rheumatologist. The majority of the disagreements were between low disease activity (DAS28 ≥ 2.6 and ≤ 3.2) and DAS remission (DAS < 2.6), while the rest were between low disease activity and moderate disease activity (DAS > 3.2 and < 5.1) and were observed only in the first round. By the end of the consensus meeting, concordance in disease activity states between rheumatologists had improved from 71% to 87%.
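To make the link between joint counts and activity states concrete, the sketch below computes a DAS28 from a joint count, maps it to an activity state and reports pairwise concordance. Several parts are assumptions rather than details taken from the article: the ESR-based DAS28 formula is used because the article does not state which variant was calculated, the moderate category is taken to include 5.1 with "high" above it, concordance is computed over all rater pairs, and the example values (ESR, global health, joint counts) are invented.

```python
import math
from itertools import combinations


def das28_esr(tjc28, sjc28, esr, global_health):
    """DAS28 based on ESR (mm/h) and patient global health (0-100 mm VAS)."""
    return (0.56 * math.sqrt(tjc28)
            + 0.28 * math.sqrt(sjc28)
            + 0.70 * math.log(esr)
            + 0.014 * global_health)


def activity_state(das28):
    """Map a DAS28 value to a disease activity state."""
    if das28 < 2.6:
        return "remission"
    if das28 <= 3.2:
        return "low"
    if das28 <= 5.1:  # assumption: 5.1 itself counts as moderate
        return "moderate"
    return "high"


def percent_concordance(das28_scores):
    """Percentage of rater pairs assigning the same state to one patient."""
    states = [activity_state(score) for score in das28_scores]
    pairs = list(combinations(states, 2))
    same = sum(1 for first, second in pairs if first == second)
    return 100.0 * same / len(pairs)


if __name__ == "__main__":
    # Hypothetical patient: raters differ only in the swollen joint count.
    scores = [das28_esr(tjc28=3, sjc28=sjc, esr=8, global_health=20)
              for sjc in (1, 4, 6)]
    for score in scores:
        print(round(score, 2), activity_state(score))
    print("concordance (%):", round(percent_concordance(scores), 1))
```

Because the swollen joint count enters through a square root with a weight of 0.28, a difference of a few joints shifts the score only modestly, yet it can still move a patient across a threshold when the score lies close to 2.6 or 3.2.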

[Figure 1: three panels (R1vR2, R1vR3, R2vR3), each plotting kappa (y-axis, 0 to 1) against round (x-axis, 0 to 3).]

Fig. 1. The trend in kappa over the three rounds according to the level of experience of rheumatologists. R1: recently qualified rheumatologists; R2: experienced rheumatologists; R3: very experienced rheumatologists.

4.7. Effect of disease activity on agreement between rheumatologists

For exploratory purposes, the influence of the level of disease activity on agreement of SJC and DAS28 was evaluated, but results were unchanged (Supplementary data, Fig. 1 and 2). The interquartile ranges for SJC and also DAS28 for each patient did not show an incremental increase as the median values increased.
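The exploratory check described above, whether the spread of ratings widens as the median rises, can be sketched as follows. This is only an illustration under stated assumptions: the article inspected boxplots visually, the patient labels and counts below are invented, and the quartiles use the default "exclusive" method of Python's `statistics.quantiles`, which may differ from the SPSS definition.

```python
from statistics import median, quantiles


def median_and_iqr(values):
    """Return (median, interquartile range) of one patient's ratings."""
    q1, _, q3 = quantiles(values, n=4)  # default method: "exclusive"
    return median(values), q3 - q1


def spread_by_patient(counts_by_patient):
    """List (patient, median, IQR) rows ordered by increasing median."""
    rows = [(patient, *median_and_iqr(counts))
            for patient, counts in counts_by_patient.items()]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    # Hypothetical swollen joint counts from six raters for three patients.
    sjc = {
        "P1": [2, 3, 3, 4, 5, 3],
        "P2": [6, 7, 5, 8, 6, 7],
        "P3": [0, 1, 1, 2, 0, 1],
    }
    for patient, med, iqr in spread_by_patient(sjc):
        print(patient, "median:", med, "IQR:", iqr)
```

Reading the printed rows from top to bottom shows whether the IQR grows with the median. The same function can be applied to DAS28 values instead of joint counts.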

5. Discussion

This study has demonstrated that agreement on synovitis at the joint level may improve after short collegiate discussions, particularly between newly qualified rheumatologists and their more senior peers, although their agreement started off very low. Despite a lack of global improvement, there was a reduction in the median SJC difference between clinicians and, more importantly, increased concordance in DAS-defined disease activity states after the three sessions.

This study has strengths and limitations. As the study took place at the beginning of the COMEDRA trial, where the primary objective of participants was to teach nurses synovitis assessment, rheumatologists were encouraged but not obliged to reach absolute agreement during the consensus exercises. Although there was no didactic teaching before the consensus exercises, information on standardized joint assessment was provided. It was assumed that rheumatologists were proficient at joint assessment, and the aim was to closely simulate a real-life clinical setting. In addition, patients examined during the consensus exercises mainly had moderate disease activity, so results may not be representative of patients with higher disease activity. It was also difficult to quantify the level of rheumatologist experience, but the number of years since qualification as a rheumatologist was considered sufficient, as all of the participating rheumatologists were full-time practicing clinicians in tertiary rheumatology departments. Lastly, tender joints were not evaluated in the consensus exercises, mainly because the main interest was reaching agreement in swollen joints, which are more liable to inter-observer variation. Repeated examination of patients may have falsely increased the variation of tender joints.

A major strength of this study was that this method of consensus could be applied to daily clinical practice. The consensus exercise was limited to a small group discussion of no more than 4 rheumatologists with a time limit of 30 minutes, which can be conducted during department or inter-department meetings. In addition, this study was the first to evaluate and demonstrate the change in agreement in rheumatologists with varying years of experience. Initial agreement of the less-experienced rheumatologists with their more senior peers was very low, possibly because they were evaluated against seniors from other centres, with whom they had not trained. It appeared that less-experienced rheumatologists were more amenable to change, especially in agreeing on which joints had clinical synovitis. In addition, the improvement in agreement on disease activity states among the rheumatologists as a whole makes this form of consensus exercise clinically useful in daily practice.

Unlike other measurements in medicine such as blood pressure, swollen joint assessment is subject to many factors affecting inter-observer variation that are unique to it. These factors may include the level of experience of assessors, whom they were trained by, disease activity, or the degree of joint deformity. Previous studies explored standardization in reducing inter-observer variation. In particular, studies on swollen joints have been conflicting [6,12,13]. Bellamy et al. [12] looked at the change in physician SJC reliability through a stringent standardization process; however, the changes were minimal, partly due to very good inter-observer reliability between the participating observers prior to the standardization. Scott et al. [6] looked at training and education lasting 60 minutes following material in the EULAR handbook for joint evaluation [14] but found there was still considerable variation in the SJC. Grunke et al. [13] indicated that the variance in SJC was reduced with standardization and training, but the results reported were on training sessions held as part of large investigator meetings for clinical trials, with participants who were not necessarily rheumatologists. Similarly, studies looking at the benefits of training and standardization on TJC have also been conflicting, with some positive studies [11,13] but no improvement in a third study [6]. However, like the studies on SJC, these studies were clinically heterogeneous. Despite this, regular standardization and training together with colleagues, addressing the components of disease activity such as joint assessments, has been recommended [17].

One question that remains unanswered is whether the improved agreement by the less-experienced rheumatologists would continue through time. A longitudinal study with regular


consensus would answer this question. Although this study was not designed to assess the effects of underlying joint destruction on the level of agreement, there were indications that groups had some difficulty in reaching a consensus, especially in joints with deformities such as underlying subluxation.

In conclusion, this study illustrates the importance of multicentre training in joint count evaluation for synovitis, using a method that can be applied in daily practice. Although swollen joints are liable to inter-observer variation, a half-day consensus exercise is beneficial, particularly for newly qualified rheumatologists, and could perhaps be incorporated into the formal training program of students and rheumatology fellows in the future. It would be important to examine the benefits longitudinally and whether repeated consensus exercise sessions may improve the global swollen joint agreement.

Disclosure of interest

The authors declare that they have no conflicts of interest concerning this article.

Acknowledgements

We thank Roche-Chugai France for financial support of the consensus meeting and joint count didactic materials. We also thank Professor Lyn March for her input and advice.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.jbspin.2012.07.014.

References

[1] Drossaers-Bakker KW, de Buck M, van Zeben D, et al. Long term course and outcome of functional capacity in rheumatoid arthritis: the effect of disease activity and radiologic damage over time. Arthritis Rheum 1999;42:1854–60.


[2] Finckh A, Liang MH, Mugica van Herckenrode C, et al. Long term impact of early treatment on radiographic progression in rheumatoid arthritis: a meta-analysis. Arthritis Care Res (Hoboken) 2006;55(6):864–72.
[3] Scott DL, Antoni C, Choy EH, et al. Joint counts in routine practice. Rheumatology (Oxford) 2003;42:919–23.
[4] Smolen JS, Aletaha D, Bijlsma JW, et al. Treating rheumatoid arthritis to target: recommendations of an international task force. Ann Rheum Dis 2010;69:631–7.
[5] Aletaha D, Smolen JS. Joint damage in rheumatoid arthritis progresses in remission according to the Disease Activity Score in 28 joints and is driven by residual swollen joints. Arthritis Rheum 2011;63:3702–11.
[6] Scott DL, Choy EH, Greeves A, et al. Standardising joint assessment in rheumatoid arthritis. Clin Rheumatol 1996;15:579–82.
[7] Walsh CA, Mullan RH, Minnock PB, et al. Consistency in assessing the disease activity score 28 in routine clinical practice. Ann Rheum Dis 2008;67:135–6.
[8] Cheung PP, Ruyssen-Witrand A, Gossec L, et al. Reliability of patient self-evaluation of swollen and tender joints in rheumatoid arthritis: a comparison study with ultrasonography, physician and nurse assessments. Arthritis Care Res (Hoboken) 2010;62:1112–9.
[9] Lassere M, van der Heijde D, Johnson KR, et al. Reliability of measures of disease activity and disease damage in rheumatoid arthritis: implications for smallest detectable difference, minimal clinically important difference, and analysis of treatment effects in randomised controlled trials. J Rheumatol 2001;28:892–903.
[10] Marhadour T, Jousse-Joulin S, Chales G, et al. Reproducibility of joint swelling assessments in long-lasting rheumatoid arthritis: influence on disease activity score-28 values (SEA-Repro Study Part 1). J Rheumatol 2010;37:932–7.
[11] Klinkhoff A, Bellamy N, Bombardier C, et al. An experiment in reducing interobserver variability of the examination for joint tenderness. J Rheumatol 1988;15:492–4.
[12] Bellamy N, Anastassiades TP, Buchanan WW, et al. Rheumatoid arthritis antirheumatic drug trials. Effects of standardization procedures on observer dependent outcome measures. J Rheumatol 1991;18:1893–900.
[13] Grunke M, Antoni CE, Kavanaugh A, et al. Standardisation of joint examination technique leads to a significant decrease in variability among different examiners. J Rheumatol 2010;37:860–4.
[14] Van Riel PLCM, Scott DL. EULAR handbook of clinical assessment in rheumatoid arthritis. Alphen Aan Den Rijn, The Netherlands: Van Zuiden Communications; 2000.
[15] Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159–74.
[16] Cicchetti DV, Feinstein AR. High agreement but low kappa: II. Resolving the paradoxes. J Clin Epidemiol 1990;43:551–8.
[17] Porter D, Gadsby K, Thompson P, et al. DAS28 and rheumatoid arthritis: the need for standardization. Musculoskelet Care 2011;9:222–7.