Seminars in Arthritis and Rheumatism ] (2018) ]]]–]]]
Seminars in Arthritis and Rheumatism journal homepage: www.elsevier.com/locate/semarthrit
Measurement properties of patient reported outcome measures for spondyloarthritis: A systematic review
Kelly Png, BSc (Pharm) (Hons) a,1; Yu Heng Kwan, BSc (Pharm) (Hons) b,*,1; Ying Ying Leung, MBChB, MD c,f; Jie Kie Phang, BSc (Hons) b,c; Jia Qi Lau, BSc (Pharm) (Hons) a; Ka Keat Lim b; Eng Hui Chew, BSc (Pharm) (Hons), PhD a; Lian Leng Low, MBBS d; Chuen Seng Tan, BSc (Hons), MSc, PhD e; Julian Thumboo, MBBS b,f; Warren Fong, MBBS c,f; Truls Østbye, MD, MPH, PhD, FFPH b
a Department of Pharmacy, National University of Singapore, Singapore, Singapore
b Programme in Health Services and Systems Research, Duke-NUS Medical School, 8 College Rd, Singapore 169857, Singapore
c Department of Rheumatology and Immunology, Singapore General Hospital, Singapore, Singapore
d Department of Family Medicine and Continuing Care, Singapore General Hospital, Singapore, Singapore
e Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore
f Duke-NUS Medical School, Singapore, Singapore
Article info
Keywords: Systematic review; Measurement properties; Patient-reported outcomes measures; Spondyloarthritis
Abstract
Objectives: This systematic review aimed to identify studies investigating the measurement properties of patient-reported outcome measures (PROMs) for spondyloarthritis (SpA), and to evaluate the methodological quality of these studies and the level of evidence for the measurement properties of the PROMs.
Methods: This systematic review was guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. Articles published before 30 June 2017 were retrieved from PubMed®, Embase®, and PsycINFO® (Ovid). Methodological quality and level of evidence were evaluated according to recommendations from the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN).
Results: We identified 60 unique PROMs from 125 studies in 39 countries. Twenty-one PROMs were validated for two or more SpA subtypes. Hypothesis testing was the most frequently examined measurement property (82.4% of studies), followed by reliability (60.0%). Of the studies that assessed PROMs for hypothesis testing and reliability, 77.7% and 42.7%, respectively, had “fair” or better methodological quality. Among the PROMs identified, 41.7% were studied in ankylosing spondylitis (AS) only and 23.3% were studied in psoriatic arthritis (PsA) only. The more extensively assessed PROMs included the Ankylosing Spondylitis Quality of Life (ASQoL) questionnaire and the Bath Ankylosing Spondylitis Functional Index (BASFI) for AS, and the psoriatic arthritis quality of life questionnaire (VITACORA-19) for PsA.
Conclusion: This study identified 60 unique PROMs through a systematic review and synthesized evidence on their measurement properties. There is a lack of validation of PROMs for use across SpA subtypes. Future studies may consider validating PROMs for use across different SpA subtypes.
© 2018 Elsevier Inc. All rights reserved.
* Corresponding author. E-mail address: [email protected] (Y.H. Kwan).
1 Co-first authors.
https://doi.org/10.1016/j.semarthrit.2018.02.016
0049-0172/© 2018 Elsevier Inc. All rights reserved.
Introduction
Spondyloarthritis (SpA) is a heterogeneous group of chronic diseases involving inflammation of the axial and peripheral joints, entheses and other extra-articular sites [1]. These disabling diseases are associated with decreased performance of activities of daily living, quality of life (QoL) and work productivity [2]. Although SpA has commonly been divided into various subtypes, these subtypes share similar axial or peripheral manifestations. SpA is also phenotypically diverse and unpredictable; hence, a patient with SpA may experience symptoms from different SpA subtypes over time [3]. SpA is commonly divided into axial and peripheral SpA. Axial SpA (axSpA) subtypes include ankylosing spondylitis (AS) and non-radiographic axial spondyloarthritis (nr-axSpA), while peripheral SpA subtypes include psoriatic arthritis (PsA), inflammatory bowel disease-associated arthritis (IBD-SpA), reactive
arthritis (ReA), and undifferentiated SpA (USpA) [4]. The goal of SpA management is to control signs and symptoms by adjusting pharmacological and non-pharmacological treatments according to disease activity, which may be assessed using patient-reported outcome measures (PROMs) [5]. PROMs are measures of patients’ own perception of their health status [6]. Thus, PROMs should be completed by the patients themselves [7]. PROMs reflect the impact of disease or treatment response from the patient’s perspective, adding monitoring value to clinical measures [8]. The PROMs that were originally developed and validated for SpA are subtype-specific [9]. Examples of PROMs for AS include the Bath Ankylosing Spondylitis Disease Activity Index (BASDAI) and the Bath Ankylosing Spondylitis Functional Index (BASFI) [10]. An example of a PROM for PsA is the psoriatic arthritis quality of life (PsAQoL) questionnaire [11]. SpA subtypes may affect the physical, social and psychological aspects of individuals' lives, and disease burden and PROM scores have been reported to be similar across subtypes [12–14]. These findings suggest that PRO domains may be collated into a set of PROMs for use in both axial and peripheral SpA [15]. Existing studies have validated PROMs only within individual SpA subtypes [16], and no systematic review has summarized PROMs for SpA in general. Therefore, we conducted a systematic literature review to identify studies investigating the measurement properties of PROMs for SpA, and evaluated the methodological quality of these studies and the level of evidence for the measurement properties of the PROMs.
Methods
This systematic review was guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [17]. The methodological quality of all included studies was assessed using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist [18], and the results were used to determine the overall evidence for each PROM [19].
Search strategy
The PubMed®, Embase®, and PsycINFO® (Ovid) databases were searched for articles published before 30 June 2017. A search strategy (Supplementary Tables 1–3) with three components was used [20]: disease terms, construct of interest, and measurement properties. Where available, the sensitivity of the searches was enhanced using search filters developed by Terwee et al. [21] and the PROM Group, University of Oxford [22]. The search records were downloaded into Endnote X7 and duplicates were removed.
Article selection
All titles and abstracts were screened independently by two reviewers (K.P. and J.Q.L.). A third reviewer (Y.H.K.) was consulted when the two reviewers disagreed. For potentially relevant articles, the full text was independently reviewed by the same two reviewers for inclusion and exclusion. The final pool of relevant articles also included articles hand-searched from the reference lists of included articles. Articles were included if they were full-text original publications in English that validated PROMs for the following SpA subtypes: AS, PsA, IBD-SpA, nr-axSpA, and USpA. In addition, study participants had to be more than 18 years old, and the study had to evaluate the PROMs for at least one of the nine measurement properties listed in the COSMIN checklist. The nine measurement properties are internal consistency, reliability, measurement error, content validity, structural validity, hypothesis testing, cross-cultural validity, criterion validity, and responsiveness. Their definitions are presented by Mokkink et al. [23]. Articles were excluded if the PROMs were completed by proxy, or if patients were diagnosed with ReA or juvenile idiopathic arthritis (JIA). These exclusions were not used to construct the search strategy, to avoid the omission of relevant studies. If only part of the study population consisted of SpA patients, the article was included if results were reported separately for this group of patients. Study type (e.g., randomized controlled trial, cross-sectional study, or cohort study) was not used as an exclusion criterion when assessing measurement properties, so that this systematic literature review could provide a comprehensive overview of the measurement properties of all types of PROMs.
Data extraction
Where available, the following data were extracted from the articles by two reviewers (K.P. and J.Q.L.). (1) General characteristics of the study populations: sample size, age, gender, and country where the study was conducted. (2) Disease characteristics of the study populations: disease studied and duration of illness. (3) Characteristics of PROMs: language version used, domains assessed, number of domains and items, and response scale.
The PROMs identified from the systematic literature review were stratified according to their domains of interest using the World Health Organization Quality of Life Assessment (WHOQOL) structure [24]. The six domains of the WHOQOL structure are physical, psychological, level of independence, social relationships, environment and spirituality/religion/personal beliefs. Overall quality of life and general health perceptions are included in the WHOQOL structure but are not grouped under any domain. The WHOQOL structure was used because it was developed for use in various cultural settings and reflects the multi-dimensional nature of quality of life [24].
Assessment of methodological quality
All relevant articles were independently evaluated by two reviewers (K.P. and J.K.P.) for methodological quality using the COSMIN checklist. Please refer to Mokkink et al. [18] for the COSMIN checklist. Any disagreements between the reviewers were resolved with a third reviewer (Y.H.K.). The COSMIN checklist details standards for every measurement property, which are used to rate the methodological quality of the studies [18]. Thus, the assessments to be completed depended on the measurement properties studied in each article. Measurement properties that were not assessed in any study were omitted from the results table for the assessment of methodological quality. Each standard in the COSMIN checklist has a set of items, which were rated individually on a 4-point scale (“poor,” “fair,” “good,” or “excellent”). The item with the lowest rating determined the overall rating for the measurement property [25].
Assessment of quality of measurement properties
The quality of the measurement properties of PROMs was assessed using the quality criteria of Terwee et al. [19]. First, the measurement properties evaluated by the studies were identified. Next, according to the results from the study of each measurement
property, a “positive” (+), “indeterminate” (?) or “negative” (−) rating was assigned. Please refer to Terwee et al. [19] for the quality criteria of measurement properties.
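The "lowest rating counts" rule described under the assessment of methodological quality can be illustrated with a minimal Python sketch. The function name `overall_rating` and the example ratings are hypothetical and are not taken from the COSMIN checklist itself; only the 4-point scale and the rule that the lowest item rating sets the overall rating come from the text.

```python
# The 4-point scale named in the text, ordered from worst to best.
RATING_ORDER = ["poor", "fair", "good", "excellent"]


def overall_rating(item_ratings):
    """Return the lowest item rating, which sets the overall rating
    for one measurement property in one study."""
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    # list.index raises ValueError for any rating outside the scale.
    return min(item_ratings, key=RATING_ORDER.index)


# Hypothetical item ratings for one measurement property in one study.
example = ["excellent", "good", "fair", "good"]
print(overall_rating(example))  # fair
```

Under this rule a single weak item caps the rating of the whole property, which is why a study can be rated "poor" for a property even when most items were handled well.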
Evidence synthesis
For each PROM, an evidence synthesis across all studies was conducted for each measurement property using the table of levels of evidence (please refer to Terwee [20] for the levels of evidence). The quality rating of each measurement property of a PROM was weighted by the methodological quality of the studies that assessed it, so that studies of higher methodological quality determined the strength of the evidence [26]. For example, if two studies had “good” methodological quality in their assessment of the same PROM for internal consistency, and both reported a Cronbach's α of more than 0.70 (indicating positive (+) quality for internal consistency), the evidence synthesis would indicate that this PROM had “strong” positive evidence (+++) for internal consistency. Evidence synthesis determined whether the measurement properties of PROMs had “strong” positive/negative (+++/−−−), “moderate” positive/negative (++/−−), “limited” positive/negative (+/−), “conflicting” (±), or “unknown” (?) evidence [20]. If no studies validated a PROM for a given measurement property, no level of evidence was assigned for that measurement property of the PROM, and it was omitted from the results table of evidence synthesis.

Table 1 Characteristics of articles included.
General characteristics                      n (%) (N = 125)
Number of unique countries involved          39
Number of unique PROMs studied               60
Sample size
  <30                                        5 (4.0%)
  30–49                                      8 (6.4%)
  50–99                                      33 (26.4%)
  ≥100                                       79 (63.2%)
Mean ageᵃ
  20 < mean age ≤ 30                         3 (2.4%)
  30 < mean age ≤ 40                         25 (20.0%)
  40 < mean age ≤ 50                         61 (48.8%)
  50 < mean age ≤ 60                         26 (20.8%)
Proportion of males (x)ᵇ
  <0.5                                       21 (16.8%)
  0.5 ≤ x < 0.6                              22 (17.6%)
  0.6 ≤ x < 0.7                              27 (21.6%)
  0.7 ≤ x < 0.8                              34 (27.2%)
  0.8 ≤ x < 0.9                              14 (11.2%)
  ≥0.9                                       5 (4.0%)
Disease characteristics                      n (%) (N = 125)
Disease type
  SpA in general                             7 (5.6%)
  axSpA                                      3 (2.4%)
  nr-axSpA                                   2 (1.6%)
  AS                                         75 (60.0%)
  PsA                                        45 (36.0%)
  IBD-SpA                                    1 (0.8%)
  USpA                                       3 (2.4%)
Disease durationᵃ
  0 < mean disease duration ≤ 10             35 (28.0%)
  10 < mean disease duration ≤ 20            61 (48.8%)
  20 < mean disease duration ≤ 30            6 (4.8%)
  30 < mean disease duration ≤ 40            2 (1.6%)
All percentages were calculated with a denominator of the total number of studies (n = 125).
Abbreviations: AS, ankylosing spondylitis; axSpA, axial spondyloarthritis; IBD-SpA, inflammatory bowel disease-associated arthritis; nr-axSpA, non-radiographic axial spondyloarthritis; PsA, psoriatic arthritis; ReA, reactive arthritis; SpA, spondyloarthritis; USpA, undifferentiated SpA.
ᵃ Some values were in the form of median, range, or were not reported. Attempts were made to contact the authors for clarification.
ᵇ Some values were not reported. Attempts were made to contact the authors for clarification.

Results
Search results and characteristics of articles included
A total of 10,254 articles were obtained from the database search (Fig.), of which 1285 duplicates were excluded. A review of
the titles and abstracts excluded 8785 articles. A full-text review then excluded 64 articles, for the reasons provided in the Fig., and the addition of five articles from hand-searching resulted in 125 relevant articles. The characteristics of the relevant studies are presented in Table 1. The studies were mostly conducted in the United Kingdom (n = 17), the United States (n = 15), Canada (n = 14), and Turkey (n = 14). Sixty unique PROMs were identified from the included studies.
Characteristics of PROMs
Fig. Flow chart of the systematic literature review.
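The article counts reported for the flow chart can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `remaining_after_screening` is hypothetical, and the numbers are the ones stated in the search results (10,254 retrieved, 1285 duplicates, 8785 excluded on title and abstract, 64 excluded on full text, five added by hand-searching).

```python
def remaining_after_screening(retrieved, duplicates, excluded_title_abstract,
                              excluded_full_text, hand_searched):
    """Apply each exclusion step in order, then add hand-searched articles."""
    after_duplicates = retrieved - duplicates
    after_title_abstract = after_duplicates - excluded_title_abstract
    after_full_text = after_title_abstract - excluded_full_text
    return after_full_text + hand_searched


# Counts as reported in the search results.
included = remaining_after_screening(10254, 1285, 8785, 64, 5)
print(included)  # 125
```

The result matches the 125 relevant articles reported, so the stated counts are internally consistent.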
The characteristics of the identified PROMs are presented in Supplementary Table 4. Some PROMs were studied in patients with AS only (n = 25), some in patients with PsA only (n = 14), and some in two or more SpA subtypes (n = 21). PROMs for SpA were validated in 39 countries. All 60 unique PROMs identified from the systematic literature review were self-administered questionnaires. Studies that involved axSpA included more than one SpA subtype (nr-axSpA and AS). Out of 60 PROMs, 17 PROMs were
Table 2 PROM domains studied in SpA subtypes.
Columns: WHOQOL domains; PROMs; Axial SpA (AS, nr-axSpA); Peripheral SpA (PsA, IBD-SpA, USpA)
[Check marks (✓) by SpA subtype could not be recovered from the source layout; the PROMs listed under each WHOQOL domain are retained below.]
Physical health: AIMS, AIMS2, AS-AIMS2, ASAS HI, ASQoL, BASDAI, Body chart, DASH, EASi-QoL, EQ-5D, EQ-VAS, FACIT-fatigue, Fatigue, HUI-3, MAF, MFI, miniBASDAI, Multidimensional PROM, MUMQ, Nocturnal back pain, Nocturnal pain VAS, Pain VAS, PGI, PsAID, PSI, PtGADA, SASPA, SF-12, SF-36, SF-6D, Sleep VAS, Spinal pain VAS, VITACORA-19, Worst Itch, WTP
Psychological: AIMS, AIMS2, AS-AIMS2, ASAS EF, ASAS HI, CASQ, EASi-QoL, EQ-5D, HUI-3, MAF, MFI, Multidimensional PROM, MUMQ, PASS, PCS, PGI, PsAID, SF-12, SF-36, SF-6D, SPS-6, VITACORA-19
Level of independence: AIMS, AIMS2, AS-AIMS2, ASAS HI, ASES-AS, AS-WIS, BASFI, CASQ, DASH, DFI, EASi-QoL, EQ-5D, HAQ-DI, HAQ-S, HAQ-Skin, HUI-3, MAF, Multidimensional PROM, MUMQ, PGI, PsAID, RLDQ, SF-36, SF-6D, SPS-6, SQUASH, SRPQ, s-SRPQ, VITACORA-19, WHODAS II, WPAI:SpA
Social relations: AIMS2, AS-AIMS2, ASAS EF, ASAS HI, CASQ, DASH, PGI, PsAID, SF-36, SF-6D, SRPQ, s-SRPQ, VITACORA-19, WHODAS II
Environment: ASAS EF, ASAS HI, EASi-QoL, HAQ-S, IPAQ, MUMQ, PGI, PsAID, PsAQoL, SQUASH, VITACORA-19
Overall quality of life and general health perceptions: ASQoL, BASG, Multidimensional PROM, PGA, PJA, PSA, RS
✓, group of PROM domains studied in the SpA subtype (reactive arthritis and juvenile idiopathic arthritis were not included in this study).
Abbreviations: AIMS, arthritis impact measurement scales; AS-AIMS2, ankylosing spondylitis arthritis impact measurement scales 2; ASES-AS, arthritis self-efficacy scale for ankylosing spondylitis; ASAS HI, Assessment of SpondyloArthritis international Society health index; ASQoL, ankylosing spondylitis quality of life; AS-WIS, ankylosing spondylitis work instability scale; BASDAI, Bath ankylosing spondylitis disease activity index; BASFI, Bath ankylosing spondylitis functional index; BASG, Bath ankylosing spondylitis patient global score; CASQ, combined ankylosing spondylitis questionnaire; DASH, disabilities of arm, shoulder, and hand questionnaire; DFI, Dougados functional index; EF, environmental factor; EQ-5D, EuroQol-5 dimensions; EASi-QoL, evaluation of ankylosing spondylitis quality of life; FACIT, functional assessment of chronic illness therapy; HAQ-DI, health assessment questionnaire disability index; HAQ-S, health assessment questionnaire for the spondyloarthropathies; HUI-3, health utilities index 3; IPAQ, international physical activity questionnaire; MUMQ, Maastricht utility measurement questionnaire; mFSS, modified fatigue severity scale; MAF, multidimensional assessment of fatigue scale; PASS, patient acceptable symptom state; PCS, pain catastrophizing scale; PGA, patient global assessment; PGI, patient generated index; PJA, patient joint assessment; PROM, patient reported outcome measure; PSA, patient skin assessment; PsAID, psoriatic arthritis impact of disease questionnaire; PsAQoL, psoriatic arthritis quality of life; PSI, psoriasis symptom inventory; PtGADA, patient global assessment of disease activity; RLDQ, revised Leeds disability questionnaire; RS, well-being rating scale; SF-6D, short-form six-dimension; SF-36, short form 36 health surveys; SASPA, Stockerau activity score for psoriatic arthritis; SQUASH, short questionnaire to assess health enhancing physical activity; SRPQ, social role participation questionnaire in patients with ankylosing spondylitis; s-SRPQ, short form of the social role participation questionnaire in patients with ankylosing spondylitis; VITACORA-19, psoriatic arthritis quality of life questionnaire; WPAI, work productivity and activity impairment questionnaire; WHODAS II, World Health Organization disability assessment schedule II; WPS, work productivity survey; WTP, willingness-to-pay.
developed based on qualitative methodologies to elicit patients' perspectives. Table 2 presents the PROM domains of interest studied among the various SpA subtypes. All PROMs were categorized into domains of interest according to the WHOQOL structure [24]. PROMs for different SpA subtypes were observed to have similar domains of interest, of which the two most common were physical health (58.3% of PROMs) and level of independence (51.7% of PROMs). These domains were identified
Table 3 Evidence synthesis of measurement properties for each PROM.
Instrument; Study population; Number of studies; Internal consistency; Reliability; Measurement error; Content validity; Structural validity; Hypothesis testing; Cross-cultural validity; Criterion validity; Responsiveness
AIMS
AIMS-global
AIMS-pain
AIMS-Physical
AIMS2
AIMS2-physical function
AIMS2-physical function; pain; psychological function; and social function
AS-AIMS2
ASAS EF
ASAS HI
ASES-AS
ASQoL
AS-WIS
BASDAI
BASDAI-fatigue
BASFI
BASG
Body chart
CASQ
DASH
DFI
EASi-QoL
EQ-5D
EQ-VAS
FACIT-fatigue
Fatigue scale
HAQ-Pain
HAQ-DI-physical function; pain; psychological function; and social function b
HAQ-DI
HAQ-S
HAQ-S-stiffness
HAQ-skin
HUI-3
IPAQ
MAF
MFI
MiniBASDAI
Multidimensional PROM-CASQ
PsA PsA PsA AS PsA AS PsA
2 1 1 2 2 1 1
0 0 0 0 0 0 ?
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
+ ? + − + − 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
? 0 0 0 ? ? ?
AS AS Nr-axSpA AS AxSpA Nr-axSpA AS AS AS SpA AS AxSpA IBD-SpA Nr-axSpA PsA USpA SpA AS AS AxSpA IBD-SpA Nr-axSpA PsA USpA AS IBD-SpA PsA USpA AS AS PsA AS PsA SpA USpA AS AS PsA AS AS PsA PsA PsA PsA
1 1 1 2 1 1 1 11 2 1 23 2 1 1 5 3 2 1 29 1 1 1 3 3 10 1 2 1 1 1 1 9 1 1 1 2 2 3 1 1 1 2 1 1
+ 0
+ ?
0 0
0 0
+ 0
− −
0 0
0 0
0 0
+
?
0
?
?
++
0
0
0
? ++ +
+ ++ ++
0 0 0
0 +++ 0
0 +++ ?
+++ ++ ++
0 ? +
0 0 ?
? + 0
++
++
?
+++
−
++
0
0
?
a
− ++
0 ?
0 +++
a
++
++
+ ±
0 ?
0 0
? +
?
+
?
+++
0
±
0
0
++
a
0 + 0 +
0 0 0 0
0 0 0 0
a
+ 0 −−
? 0 +
+ + − ±
0 0 0 ?
0 0 0 0
? + 0 ±
+++ 0
+ ±
0 ?
++ 0
? 0
+++ ++
0 0
0 0
? ?
a
+ ?
0 0
0 0
a
0
+ ++
0 0
0 ?
? ?
?
0 0 0
0 0 0
0 0 0
0
? + 0
0 0 0
0 0 0
? 0 ?
AS PsA SpA AS PsA SpA PsA PsA AS AS AS AS AS AS PsA IBD-SpA
2 9 1 6 1 1 1 1 1 1 1 1 1 1 1 1
++
0
0
0
+
+++
0
0
?
?
++
0
0
0
±
0
0
?
0 0 0 0 0 0 0 +
0 0 0 + ? − 0 ?
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 ? 0 0 ?
? ? − ? 0 + + 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 ? 0 ?
? a a
a a
Table 3 (continued)
Instrument; Study population; Number of studies; Internal consistency; Reliability; Measurement error; Content validity; Structural validity; Hypothesis testing; Cross-cultural validity; Criterion validity; Responsiveness
Multidimensional PROM-joint pain; spinal pain
Multidimensional PROM-morning stiffness; PGA
Multidimensional PROM-modified rheumatology index
MUMQ
Nocturnal back pain NRS
Nocturnal pain VAS
Pain VAS
PASS
PCS
PGA
PGI
PJA
PSA
PsAID-9
PsAID-12
PsAQoL
PSI
PtGADA
RLDQ
RS
SASPA
SF-12
SF-36
SF-36-PF
SF-36-physical function; pain; and psychological function a
SF-6D
Sleep VAS
SPS-6
Spinal pain VAS
SQUASH
SRPQ
s-SRPQ
VITACORA-19
WHODAS II
Worst itch NRS
WPAI:SpA
WTP
AS PsA IBD-SpA AS PsA IBD-SpA AS PsA IBD-SpA AS AS AxSpA Nr-axSpA AS AS PsA USpA AS AS PsA AS PsA PsA PsA PsA PsA PsA AS AxSpA Nr-axSpA AS SpA AS PsA AS AS AxSpA Nr-axSpA PsA SpA PsA PsA
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 4 1 1 1 6 1 2 2 1 2 6 1 1 1 1 4 1 1 1 1 3 1 1 3 1 2 1
AS PsA PsA AS AS AS AS PsA AS PsA AS PsA AS PsA
1 1 1 1 1 1 2 1 1 2 1 1 1 1
+
?
0
0
0
+
0
0
?
0
0
0
0
0
+
0
0
0
+
?
0
0
?
+
0
0
?
0 0
? +
0 0
0 ?
0 0
− ?
0 0
0 0
− ?
a
0 ?
0 ?
0 0
a
0 0
0 0
0 0
? ?
? + + + + + ? ? ++ + 0
0 0 0 0 0 0 0 0 ? 0 0
0 0 0 0 0 0 ? ? +++ 0 ?
0 ? ++ + + + + ++ ++ + ?
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 ? 0 0 0
0 0 + ? 0 0 ? ? ? ? 0
++
++
0
−−
?
++
0
0
?
a
0 + 0
? 0 0 0
0 0 0 ?
a
+ 0 ++
+ 0 +
+ ? + ++
0 0 0 0
0 0 0 0
? ? ? ?
++ ?
0 0
0 0
0 0
+ 0
? 0
0 0
0 0
0 ?
0
−
?
0
0
++
0
0
?
a
0 + 0 + ++
0 0 0 0 ?
0 0 0 0 ?
a
0 ? 0 ? +
0 0 0 0 0
0 0 0 0 0
? 0 ? 0 0
0 + 0 0 0 0
0 0 0 0 0 0
0 ++ 0 ? 0 ?
? ++ 0
++ ++ ? 0 ? +
0 0 0 0 0 0
0 0 0 0 0 0
0 ? 0 ? 0
a
a
++ a
0 a a
0 0 +++ + a
? a
0 ? ++ ++ 0 a
0 0
a
a
++ a
0 a a
0 0 ? ? a
0 a
0 0
a
0 0
0, measurement property was not assessed by any study. +++/−−−, strong positive/negative evidence (defined as consistent positive/negative findings in multiple studies of good methodological quality, or in one study of excellent methodological quality). ++/−−, moderate positive/negative evidence (defined as consistent positive/negative findings in multiple studies of fair methodological quality, or in one study of good methodological quality). +/−, limited positive/negative evidence (defined as a positive/negative finding in one study of fair methodological quality). ±, conflicting evidence (defined as conflicting findings). ?, unknown (defined as only studies of poor methodological quality).
Abbreviations: AS, ankylosing spondylitis; axSpA, axial spondyloarthritis; IBD-SpA, inflammatory bowel disease-associated arthritis; nr-axSpA, non-radiographic axial spondyloarthritis; PsA, psoriatic arthritis; SpA, spondyloarthritis; USpA, undifferentiated SpA; NRS, numerical rating scale; VAS, visual analog scale; AIMS, arthritis impact measurement scales; AS-AIMS2, ankylosing spondylitis arthritis impact measurement scales 2; ASES-AS, arthritis self-efficacy scale for ankylosing spondylitis; ASAS HI, Assessment of SpondyloArthritis international Society health index; ASQoL, ankylosing spondylitis quality of life; AS-WIS, ankylosing spondylitis work instability scale; BASDAI, Bath ankylosing spondylitis disease activity index; BASFI, Bath ankylosing spondylitis functional index; BASG, Bath ankylosing spondylitis patient global score; CASQ, combined ankylosing spondylitis questionnaire; DASH, disabilities of arm, shoulder, and hand questionnaire; DFI, Dougados functional index; EF, environmental factor; EQ-5D, EuroQol-5 dimensions; EASi-QoL, evaluation of ankylosing spondylitis quality of life; FACIT, functional assessment of chronic illness therapy; HAQ-DI, health assessment questionnaire disability index; HAQ-S, health assessment questionnaire for the spondyloarthropathies; HUI-3, health utilities index 3; IPAQ, international physical activity questionnaire; MUMQ, Maastricht utility measurement questionnaire; mFSS, modified fatigue severity scale; MAF, multidimensional assessment of fatigue scale; PASS, patient acceptable symptom state; PCS, pain catastrophizing scale; PGA, patient global assessment; PGI, patient generated index; PJA, patient joint assessment; PROM, patient reported outcome measure; PSA, patient skin assessment; PsAID, psoriatic arthritis impact of disease questionnaire; PsAID-9, 9-item psoriatic arthritis impact of disease questionnaire; PsAID-12, 12-item psoriatic arthritis impact of disease questionnaire; PsAQoL, psoriatic arthritis quality of life; PSI, psoriasis symptom inventory; PtGADA, patient global assessment of disease activity; RLDQ, revised Leeds disability questionnaire; RS, well-being rating scale; SF-6D, short-form six-dimension; SF-36, short form 36 health surveys; SF-36-PF, SF-36 physical function subscale; SASPA, Stockerau activity score for psoriatic arthritis; SQUASH, short questionnaire to assess health enhancing physical activity; SRPQ, social role participation questionnaire in patients with ankylosing spondylitis; s-SRPQ, short form of the social role participation questionnaire in patients with ankylosing spondylitis; VITACORA-19, psoriatic arthritis quality of life questionnaire; WPAI, work productivity and activity impairment questionnaire; WHODAS II, World Health Organization disability assessment schedule II; WPS, work productivity survey; WTP, willingness-to-pay.
a Internal consistency and structural validity do not apply to PROMs that measure only one item.
b Study assessed selected items or domains of the PROM to assess its measurement properties.
in PROMs that were studied across SpA subtypes: (1) the physical health domain in BASDAI and Short Form 36 Health Surveys (SF-36) and (2) the level of independence domain in BASFI, Health Assessment Questionnaire Disability Index (HAQ-DI), Health Assessment Questionnaire for the Spondyloarthropathies (HAQ-S), Revised Leeds Disability Questionnaire (RLDQ), and SF-36. Across SpA subtypes, the psychological and social relations domains were assessed only by SF-36, and the environment domain was assessed only by HAQ-S.
Assessment of methodological quality
The results from the assessment of methodological quality of studies are presented in Supplementary Table 5. Among the 125 studies, the two most commonly assessed measurement properties were hypothesis testing (82.4%) and reliability (60.0%). Of the studies that assessed PROMs for hypothesis testing and reliability, 77.7% and 42.7%, respectively, had “fair” or better methodological quality. After hypothesis testing and reliability, internal consistency (52.8%) and responsiveness (41.6%) were the next most commonly assessed. Of the studies that assessed PROMs for internal consistency and responsiveness, 34.8% and 17.3%, respectively, had “fair” or better methodological quality. The less commonly assessed measurement properties were structural validity (20.8%), content validity (16.8%), measurement error (3.2%), and criterion validity (2.4%). Of the 44 studies (35.2%) in which PROMs were translated, 6.8% assessed the cross-cultural validity measurement property.
Evidence synthesis
The results from the evidence synthesis of PROMs are summarized in Table 3. Of the unique PROMs assessed for hypothesis testing, 61.5% had “limited,” “moderate” or “strong” positive evidence. Of the unique PROMs assessed for reliability, 69.2% had “limited” or “moderate” positive evidence.
“Limited,” “moderate,” and “strong” levels of evidence refer to PROMs validated by two or more studies with “fair,” “good,” and “excellent” methodological quality respectively. PROMs with at least five measurement properties with “limited” or better evidence included the ankylosing spondylitis quality of life (ASQoL), BASFI and the psoriatic arthritis quality of life questionnaire (VITACORA-19).
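The levels of evidence used in Table 3 can be expressed as a small decision rule. The Python sketch below follows the definitions given in the footnote to Table 3; it is an illustration under stated assumptions, not the authors' procedure. The function name `level_of_evidence` is hypothetical, each study result is assumed to be either "+" or "-" (indeterminate results are not modeled), and the handling of mixed study qualities (for example, one "good" study plus one "fair" study counted as moderate) is an assumption.

```python
def level_of_evidence(studies):
    """Synthesize evidence for one PROM and one measurement property.

    studies: list of (quality, rating) pairs, where quality is one of
    "poor", "fair", "good", "excellent" and rating is "+" or "-".
    Returns "0", "?", "±", or one to three "+" or "-" signs.
    """
    if not studies:
        return "0"  # measurement property not assessed by any study
    # Studies of poor methodological quality do not contribute evidence.
    usable = [(quality, rating) for quality, rating in studies
              if quality != "poor"]
    if not usable:
        return "?"  # only studies of poor methodological quality
    signs = {rating for _, rating in usable}
    if len(signs) > 1:
        return "±"  # conflicting findings
    sign = signs.pop()
    qualities = [quality for quality, _ in usable]
    n_excellent = qualities.count("excellent")
    n_good_or_better = n_excellent + qualities.count("good")
    if n_excellent >= 1 or n_good_or_better >= 2:
        strength = 3  # strong: multiple good studies or one excellent study
    elif n_good_or_better == 1 or len(qualities) >= 2:
        strength = 2  # moderate: one good study or multiple fair studies
    else:
        strength = 1  # limited: one fair study
    return sign * strength


# The worked example from the Methods: two "good" studies, both positive.
print(level_of_evidence([("good", "+"), ("good", "+")]))  # +++
```

The code uses the ASCII hyphen "-" for negative ratings where the tables use the minus sign.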
Discussion To the best of our knowledge, this is the first systematic review to summarize the PROMs for SpA and to assess their overall level of evidence. As there are PROMs dedicated for use in certain SpA subtypes, a summary of PROMs for SpA together with information on their overall level of evidence may inform the usefulness of PROMs for SpA in general. This is needed especially where PRO domains have been reported to be similar in PsA and axSpA [12,13], suggesting the possible use of a single set of PROMs for
SpA in general. A summary of 125 studies identified 60 unique PROMs for SpA. Our study revealed that 61.5% of unique PROMs assessed for hypothesis testing had “limited,” “moderate,” or “strong” positive evidence while 69.2% of unique PROMs assessed for reliability had “limited” or “moderate” positive evidence. These following PROMs were assessed for more measurement properties and had higher levels of evidence than other PROMs: ASQoL and BASFI for AS, and VITACORA-19 for PsA. All three PROMs were created based on patient feedback. Our study has several strengths. We used 3 databases and sensitive search filters to capture as many potentially relevant articles as possible. Its rigor was established using the PRISMA statement and COSMIN. The PRISMA statement was used because it improves the transparency and clarity of systematic review [27]. COSMIN was used because it is a consensus-based standard to evaluate the methodological quality of studies on measurement properties and the quality of measurement properties of PROMs [28]. The ASQoL, BASFI, and VITACORA-19 were assessed for more measurement properties and had higher levels of evidence than other PROMs. After the ASQoL, BASFI, and VITACORA-19, the BASDAI and PsAQoL were extensively assessed. Although BASDAI and BASFI were studied in most SpA subtypes, their measurement properties were not properly assessed for peripheral SpA. Few studies used study populations of 100 or more PsA patients to validate BASDAI and BASFI (1 versus 0 studies), hence the BASDAI and BASFI are recommended to be used for AS than for any SpA subtype. The structural validity, content validity, measurement error and criterion validity measurement properties were least commonly assessed across studies, which may be addressed in future research. However, it is worth noting that measurement properties may not be equally important depending on the use of the PROM. 
For example, cross-cultural validity may not be important if the original version of a PROM was used in the intended study population. This may be one reason why some measurement properties were not studied in the articles.

“Poor” methodological quality ratings were frequently observed in studies that assessed PROMs for internal consistency (65.2%) and responsiveness (82.7%). These ratings contributed to “unknown” (?) levels of evidence for the internal consistency (25.0%) and responsiveness (78.9%) measurement properties of PROMs. Future studies that validate PROMs should consider adhering to standards of methodological quality to raise the level of evidence of the measurement properties of PROMs.

One limitation of this study was the inclusion of English full-text articles only. Full-text articles were necessary as they are peer-reviewed and recommended by Terwee et al. [29]. Moreover, details of the study methods were needed to assess the methodological quality of each study. The 15 foreign-language articles that would otherwise have been eligible for full-text screening made up only 8.1% of the included articles. Data synthesis was conducted on all available data from studies on SpA in general rather than on individual subtypes. When available, we indicated the proportion of SpA subtypes in the study populations (Supplementary Table 5). Data from randomized controlled trials were also not included when evaluating
responsiveness of PROMs, and this limitation should be addressed in future studies.

Out of 60 unique PROMs identified, only 21 PROMs were validated for two or more SpA subtypes. Thus, it cannot be assumed that all PROMs identified in this study can be used across SpA subtypes. Future work may consider validating PRO domains relevant to patients across the SpA subtypes. These studies will also need to meet standards of methodological quality.

Out of 60 PROMs, only 17 PROMs were created using patient feedback. In addition, the insufficient evidence for content validity suggests that future qualitative studies could help validate PROMs for use across different SpA subtypes. Qualitative studies may elucidate whether the items in PROMs are relevant and comprehensive for the construct to be measured [25].

In conclusion, this study has identified 60 unique PROMs through a systematic review and evaluated their level of evidence, adjusted using results from an assessment of methodological quality. The ASQoL and BASFI for AS and VITACORA-19 for PsA were validated by studies with “fair” or better methodological quality for more measurement properties than other PROMs. Future studies may consider validating PROMs to be used across different SpA subtypes.
Appendix A. Supporting information

Supplementary data associated with this article can be found in the online version at http://dx.doi.org/10.1016/j.semarthrit.2018.02.016.
References

[1] Parma A, Cometi L, Leone MC, Lepri G, Talarico R, Guiducci S. One year in review 2016: spondyloarthritis. Clin Exp Rheumatol 2017;35:3–17.
[2] Ramonda R, Marchesoni A, Carletto A, Bianchi G, Cutolo M, Ferraccioli G, et al. Patient-reported impact of spondyloarthritis on work disability and working life: the ATLANTIS survey. Arthritis Res Ther 2016;18:78, http://dx.doi.org/10.1186/s13075-016-0977-2.
[3] Baeten D, Breban M, Lories R, Schett G, Sieper J. Are spondylarthritides related but distinct conditions or a single disease with a heterogeneous phenotype? Arthritis Rheum 2013;65:12–20, http://dx.doi.org/10.1002/art.37829.
[4] Raychaudhuri SP, Deodhar A. The classification and diagnostic criteria of ankylosing spondylitis. J Autoimm 2014;48:128–33, http://dx.doi.org/10.1016/j.jaut.2014.01.015.
[5] Smolen JS, Schöls M, Braun J, Dougados M, FitzGerald O, Gladman DD, et al. Treating axial spondyloarthritis and peripheral spondyloarthritis, especially psoriatic arthritis, to target: 2017 update of recommendations by an international task force. Ann Rheum Dis 2017.
[6] Dawson J, Doll H, Fitzpatrick R, Jenkinson C, Carr AJ. The routine use of patient reported outcome measures in healthcare settings. Br Med J 2010. http://dx.doi.org/10.1136/bmj.c186.
[7] Bryan S, Davis J, Broesch J, Doyle-Waters MM, Lewis S, McGrail K, et al. Choosing your partner for the PROM: a review of evidence on patient-reported outcome measures for use in primary and community care. Healthc Policy 2014;10:38–51.
[8] van der Heijde D, Joshi A, Pangan AL, Chen N, Betts K, Mittal M, et al. ASAS40 and ASDAS clinical responses in the ABILITY-1 clinical trial translate to meaningful improvements in physical function, health-related quality of life and work productivity in patients with non-radiographic axial spondyloarthritis. Rheumatology 2016;55:80–8, http://dx.doi.org/10.1093/rheumatology/kev267.
[9] Kiltz U, Gossec L, Baraliakos X, Braun J. PROMs for Spondyloarthritis. In: El Miedany Y, editor. PROMs Rheumatoid Disease. Cham: Springer International Publishing; 2016. p. 121–47.
[10] Haywood KL, Garratt AM, Dawes PT. Patient-assessed health in ankylosing spondylitis: a structured review. Rheumatology 2005;44(5):577–86, http://dx.doi.org/10.1093/rheumatology/keh549.
[11] Mease PJ. Measures of psoriatic arthritis: Tender and Swollen Joint Assessment, Psoriasis Area and Severity Index (PASI), Nail Psoriasis Severity Index (NAPSI), Modified Nail Psoriasis Severity Index (mNAPSI), Mander/Newcastle Enthesitis Index (MEI), Leeds Enthesitis Index (LEI), Spondyloarthritis Research Consortium of Canada (SPARCC), Maastricht Ankylosing Spondylitis Enthesis Score (MASES), Leeds Dactylitis Index (LDI), Patient Global for Psoriatic Arthritis, Dermatology Life Quality Index (DLQI), Psoriatic Arthritis Quality of Life (PsAQOL), Functional Assessment of Chronic Illness Therapy–Fatigue (FACIT-F), Psoriatic Arthritis Response Criteria (PsARC), Psoriatic Arthritis Joint Activity Index (PsAJAI), Disease Activity in Psoriatic Arthritis (DAPSA), and Composite Psoriatic Disease Activity Index (CPDAI). Arthritis Care Res 2011;63:S64–85, http://dx.doi.org/10.1002/acr.20577.
[12] Michelsen B, Fiane R, Diamantopoulos AP, Soldal DM, Hansen IJW, Sokka T, et al. A comparison of disease burden in rheumatoid arthritis, psoriatic arthritis and axial spondyloarthritis. PLOS One 2015;10:e0123582, http://dx.doi.org/10.1371/journal.pone.0123582.
[13] Zink A, Thiele K, Huscher D, Listing J, Sieper J, Krause A, et al. Healthcare and burden of disease in psoriatic arthritis. A comparison with rheumatoid arthritis and ankylosing spondylitis. J Rheumatol 2006;33:86–90.
[14] Kwan YH, Fong W, Tan VIC, Lui NL, Malhotra R, Ostbye T, et al. A systematic review of quality-of-life domains and items relevant to patients with spondyloarthritis. Semin Arthritis Rheum 2017. http://dx.doi.org/10.1016/j.semarthrit.2017.04.002.
[15] Nash P, Mease PJ, Braun J, van der Heijde D. Seronegative spondyloarthropathies: to lump or split? Ann Rheum Dis 2005;64(Suppl. 2):ii9–13, http://dx.doi.org/10.1136/ard.2004.033654.
[16] Callahan LF. The history of patient-reported outcomes in rheumatology. Rheum Dis Clin North Am 2016;42:205–17, http://dx.doi.org/10.1016/j.rdc.2016.01.012.
[17] Moher D, Liberati A, Tetzlaff J, Altman DG, The PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLOS Med 2009;6:e1000097, http://dx.doi.org/10.1371/journal.pmed.1000097.
[18] Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW, Knol DL, et al. The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: an international Delphi study. Qual Life Res 2010;19:539–49, http://dx.doi.org/10.1007/s11136-010-9606-8.
[19] Terwee CB, Bot SD, de Boer MR, van der Windt DA, Knol DL, Dekker J, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol 2007;60:34–42, http://dx.doi.org/10.1016/j.jclinepi.2006.03.012.
[20] Terwee CB. Protocol for systematic reviews of measurement properties. COSMIN; 2011.
[21] Terwee CB, Jansma EP, Riphagen II, de Vet HCW. Development of a methodological PubMed search filter for finding studies on measurement properties of measurement instruments. Qual Life Res 2009;18:1115–23, http://dx.doi.org/10.1007/s11136-009-9528-5.
[22] Mackintosh A, Comabella CCi, Hadi M, Gibbons E, Fitzpatrick R. PROM Group Construct and Instrument Type Filters. COSMIN; 2010.
[23] Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW, Knol DL, et al. The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J Clin Epidemiol 2010;63:737–45, http://dx.doi.org/10.1016/j.jclinepi.2010.02.006.
[24] The World Health Organization quality of life assessment (WHOQOL): Position paper from the World Health Organization. Soc Sci Med 1995;41:1403–9, http://dx.doi.org/10.1016/0277-9536(95)00112-K.
[25] Terwee CB, Mokkink LB, Knol DL, Ostelo RW, Bouter LM, de Vet HC. Rating the methodological quality in systematic reviews of studies on measurement properties: a scoring system for the COSMIN checklist. Qual Life Res 2012;21:651–7, http://dx.doi.org/10.1007/s11136-011-9960-1.
[26] Terwee CB, Knol DL, de Vet HCW, Mokkink LB. Systematic reviews of measurement properties. Measurement in Medicine: A Practical Guide. Cambridge: Cambridge University Press; 2011. p. 275–314.
[27] Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gøtzsche PC, Ioannidis JPA, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration. Br Med J 2009;339, http://dx.doi.org/10.1136/bmj.b2700.
[28] Mokkink LB, Prinsen CAC, Bouter LM, de Vet HCW, Terwee CB. The COnsensus-based Standards for the selection of health measurement instruments (COSMIN) and how to select an outcome measurement instrument. Brazilian J Phy Ther 2016;20:105–13, http://dx.doi.org/10.1590/bjpt-rbf.2014.0143.
[29] Terwee CB, Prinsen CA, Ricci Garotti MG, Suman A, de Vet HC, Mokkink LB. The quality of systematic reviews of health-related outcome measurement instruments. Qual Life Res 2016;25:767–79, http://dx.doi.org/10.1007/s11136-015-1122-4.