CLINICAL GASTROENTEROLOGY AND HEPATOLOGY 2004;2:769–777
Responsiveness and Interpretation of a Symptom Severity Index Specific to Upper Gastrointestinal Disorders

DENNIS A. REVICKI,* ANNE M. RENTZ,‡ JAN TACK,§ VINCENZO STANGHELLINI,‖ NICHOLAS J. TALLEY,¶ PETER KAHRILAS,# CHRISTINE DE LA LOGE,** ELYSE TRUDEAU,** and DOMINIQUE DUBOIS‡‡

*Center for Health Outcomes Research, MEDTAP International, Bethesda, Maryland; ‡Center for Health Outcomes Research, MEDTAP International, Sindelfingen, Germany; §Department of Gastroenterology, University of Leuven, Leuven, Belgium; ‖Department of Internal Medicine and Gastroenterology, University of Bologna, Bologna, Italy; ¶Department of Medicine, University of Sydney, Sydney, Australia; #Division of Gastroenterology, Northwestern University, Chicago, Illinois; **Mapi Values, Lyon, France; and ‡‡Johnson & Johnson Pharmaceutical Services, LLC, Beerse, Belgium
Background & Aims: Determining clinically meaningful change of patient-reported outcome measures is important for evaluating effectiveness of treatments for gastrointestinal (GI) diseases. This study evaluates responsiveness of the Patient Assessment of Gastrointestinal Disorders–Symptom Severity Index (PAGI-SYM) in gastroesophageal reflux disease (GERD) and dyspepsia.

Methods: The PAGI-SYM was based on a review of the published literature and interviews with patients and clinicians. Items were developed to be linguistically and culturally appropriate for multicountry studies. The PAGI-SYM includes 6 subscales: heartburn/regurgitation, fullness/early satiety, nausea/vomiting, bloating, upper abdominal pain, and lower abdominal pain. Subjects with GERD (n = 810) or dyspepsia (n = 767) participated in this multicountry, observational study. All subjects completed the PAGI-SYM, a global symptom relief questionnaire, and a measure of patient-rated change in GI-related symptoms, the Overall Treatment Effect (OTE) scale. Responsiveness was evaluated at 8 weeks by comparing groups by disease, symptom relief, and OTE (improved, stable, and worsened).

Results: Subjects reporting symptom relief reported significantly lower (better) PAGI-SYM scores than those reporting no symptom relief (P < 0.0001 to P < 0.0005). Subjects with improvements in overall GI symptoms exhibited significant decreases in PAGI-SYM subscale scores compared with those who remained the same or worsened (all P values < 0.0001). Effect sizes ranged from 0.21–1.28, and standard errors of measurement ranged from 0.29–0.63, depending on subscale and disease sample.

Conclusions: The PAGI-SYM is a brief symptom severity instrument that measures common GI symptoms. Results suggest that the PAGI-SYM is responsive and sensitive to change in clinical status in subjects with GERD or dyspepsia.
Abbreviations used in this paper: ANCOVA, analysis of covariance; GERD, gastroesophageal reflux disease; GI, gastrointestinal; OTE, Overall Treatment Effect; PAGI-SYM, Patient Assessment of Gastrointestinal Disorders–Symptom Severity Index; SEM, standard error of measurement.
© 2004 by the American Gastroenterological Association 1542-3565/04/$30.00 PII: 10.1053/S1542-3565(04)00348-9

Gastrointestinal (GI) disorders such as gastroesophageal reflux disease (GERD) and functional dyspepsia occur frequently in the general population.1–5 For many, diagnosis is made primarily on reports of symptom frequency, type, and severity. Despite negative diagnostic tests, many people experience interference with everyday activities and health-related quality of life.2,6–8 For example, patients with GERD might or might not have negative diagnostic test results, such as endoscopy or pH-metry, and the results of these tests might not be correlated with symptom reports. Patient-reported symptom severity and disease-specific health-related quality of life outcomes are important for assessing treatment of GI diseases because they reflect the patient’s direct experiences of symptoms and their functioning and well-being. To be maximally useful in clinical trial and clinical practice applications, these measures need to have evidence supporting their reliability, validity, and responsiveness.

Frequently, the scores based on self-reported, multi-item questionnaires are not completely understood by clinicians caring for patients. Therefore, information needs to be developed as to the magnitude of differences and changes in scores that are clinically significant.9 Guidance is also needed as to the interpretation of minimal important and clinically relevant differences in scores on these patient-reported outcomes. Patient-reported symptom and health status outcomes are increasingly included in clinical trials to evaluate competing therapies. Therefore, there is the need to have symptom severity and health-related quality of life instruments that are reliable and valid and that can detect important changes in clinical status. European and U.S.
REVICKI ET AL.
regulatory agencies require psychometric evidence on self-assessed instruments before their use in supporting claims of clinical benefit.10–12 This psychometric evidence includes information on the instrument’s responsiveness and guidelines for interpretation of differences or changes in scores.

This report describes the evaluation of the responsiveness of the Patient Assessment of Gastrointestinal Disorders–Symptom Severity Index (PAGI-SYM), a self-administered symptom severity tool for use in patients with GERD, dyspepsia, and gastroparesis. An international validation study was conducted including data collected from 617 U.S. and 1130 European subjects with GERD, dyspepsia, or gastroparesis (U.S. only).13,14 The development procedures and psychometric evaluation of the PAGI-SYM have been reported elsewhere,14 and the responsiveness of the PAGI-SYM in patients with gastroparesis has been evaluated.13 The current report focuses on the responsiveness and clinical interpretation of the PAGI-SYM in subjects with GERD or dyspepsia.
Methods

Study Design/Overall Approach

Study sample. U.S. GERD and dyspepsia subjects were recruited from a national survey of GI disorders,4,15 whereas the European GERD and dyspepsia subjects were recruited from clinical centers. Eligible subjects provided consent to participate in the study, were able to read and understand their primary language, and were 18 years of age or older. Exclusion criteria included a history of gastric surgery or cancer of the GI tract or psychiatric disorders or cognitive impairments that would interfere with participation. Institutional review board or ethics committee approvals from the participating organizations were obtained before the start of the psychometric evaluation study.

Diagnostic criteria. GERD was defined as heartburn with a frequency greater than or equal to 2 days per week and either bitter, sour, acid taste in mouth or regurgitation of fluid/food in mouth. GERD subjects were excluded if their primary medical complaint was another GI disease such as irritable bowel syndrome, peptic ulcer, or dyspepsia based on self-reported symptoms and diagnoses in the U.S. subjects and clinician evaluation in the European subjects. Dyspepsia was defined in 1 of 3 ways: (1) early satiety plus postprandial fullness, ≥2 times per week, but no constipation or vomiting (a subject must have had both early satiety and postprandial fullness, but needed to meet only 1 frequency criterion); (2) frequent nausea, ≥1 day per week, with or without vomiting; or (3) Rome II criteria of pain or discomfort centered in the upper abdomen.16 A history of relevant symptoms during the previous 3 months was required. If a subject had evidence of organic disease that was likely to explain the
CLINICAL GASTROENTEROLOGY AND HEPATOLOGY Vol. 2, No. 9
symptoms, evidence that dyspepsia was exclusively relieved by defecation or associated with a change in stool frequency or form, or heartburn as the primary complaint, the subject was excluded.
Patient Assessment of Gastrointestinal Disorders–Symptom Severity Index

The PAGI-SYM is composed of 20 items and 6 subscales: heartburn/regurgitation (7 items), nausea/vomiting (3 items), postprandial fullness/early satiety (4 items), bloating (2 items), upper abdominal pain (2 items), and lower abdominal pain (2 items) (see Appendix). Subscale scores are calculated by averaging across items comprising the subscale; scores vary from 0 (none or absent) to 5 (very severe). The half-scale rule is applied for missing data (i.e., the subscale score is calculated by using the mean of nonmissing items; when more than 50% of items are missing, the score is set to missing). The PAGI-SYM subscale scores have good internal consistency and test-retest reliability and evidence supporting content and construct validity.13,14 Internal consistency reliability ranged from 0.79–0.91, and test-retest reliability ranged from 0.60–0.82 for the PAGI-SYM subscales.

Demographic characteristics. Data on patient gender, age, marital status, educational level, and employment status were collected at baseline.

Clinical status measures. Patient-rated global severity of GI problems was measured on a 6-point Likert scale, ranging from 0 (none/absent) to 5 (very severe). Patient-rated global symptom relief was assessed with a dichotomous question (was symptom relief adequate over the past 2 weeks, yes or no). Both global items were collected at baseline and after 8 weeks. Change in symptoms was assessed by patients completing the Overall Treatment Effect scale (OTE) at 8 weeks.17 First, respondents indicated whether their GI-related symptoms had improved, remained the same, or worsened since the last evaluation. If subjects indicated that symptoms had improved, they were then asked to rate the degree of improvement on a 7-point scale from “Almost the same, hardly better at all” (1) to “A very great deal better” (7). If subjects indicated that symptoms had worsened, they were asked to rate the degree of worsening on a 7-point scale from “Almost the same, hardly worse at all” (−1) to “A very great deal worse” (−7). For the OTE, a score between −1 and 1 was considered as no change in GI symptoms. The OTE has been used in several instrument development and clinical studies.9,18
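The scoring and classification rules described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' scoring program: the function names `subscale_score` and `classify_ote` are hypothetical, missing item responses are represented as `None`, and coding "remained the same" as an OTE score of 0 is an assumption made for the example.

```python
from typing import Optional, Sequence


def subscale_score(items: Sequence[Optional[float]]) -> Optional[float]:
    """Mean of the non-missing item ratings (each 0-5) for one subscale.

    Half-scale rule: when more than 50% of the items are missing,
    the subscale score itself is treated as missing (None).
    """
    answered = [x for x in items if x is not None]
    missing = len(items) - len(answered)
    if not items or missing * 2 > len(items):
        return None
    return sum(answered) / len(answered)


def classify_ote(score: int) -> str:
    """Map a signed OTE rating (-7..7) to improved / stable / worsened.

    Scores from -1 to 1 are treated as no change, as stated in the text;
    0 standing for "remained the same" is an assumption of this sketch.
    """
    if not -7 <= score <= 7:
        raise ValueError("OTE score must lie between -7 and 7")
    if score > 1:
        return "improved"
    if score < -1:
        return "worsened"
    return "stable"
```

For a 2-item subscale such as bloating, one missing item is exactly 50% missing, so the score is still computed from the remaining item; for the 7-item heartburn/regurgitation subscale, 4 missing items exceed the 50% threshold and the score becomes missing.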
Data Collection Procedures

Data collection was completed by telephone interview for GERD and dyspepsia subjects in the U.S. Subjects were identified from respondents to a nationally representative survey of upper GI symptoms4 and interviewed at baseline and after 8 weeks. The European GERD and dyspepsia subjects were identified from patients treated within clinical centers. A separate study was conducted to evaluate whether PAGI-SYM
September 2004
RESPONSIVENESS OF THE PAGI-SYM
Table 1. Demographic Characteristics for PAGI-SYM Validation Study

                                                    Disease group
                                   Total sample     GERD (n = 810)   Dyspepsia (n = 767)
Mean age (SD)                      48.1 (14.9)      50.0 (14.7)      46.2 (15.0)
Female (%)                         60.3             56.3             64.6
Married/living with partner (%)    68.0             71.1             64.7
Employed full- or part-time (%)    55.2             52.2             58.4

SD, standard deviation.
scores varied by self-completion or interviewer administration.19 There were no significant differences in scores by mode of administration.
Statistical Analysis

Analyses were performed to evaluate the relationship between PAGI-SYM scores and global symptom severity and responsiveness of the PAGI-SYM in the GERD and dyspepsia samples. Analysis of covariance (ANCOVA) was used to compare PAGI-SYM subscale scores by levels of global disease severity (i.e., none, mild, moderate, severe). Age and gender were included in the ANCOVA models.

We evaluated the relationship of the PAGI-SYM scores to change in clinical status as measured by the OTE and global symptom relief. In the absence of a single, unequivocal indicator of change or stability, it is recommended to use several methods to assess responsiveness.9 Three complementary methods for examining responsiveness were used: (1) between-subject change, (2) effect size, and (3) standard error of measurement (SEM).

Between-subject change was evaluated on the basis of change in clinical status and observed changes in GI-related
Figure 1. Mean PAGI-SYM scores by global disease severity in the GERD sample. ANCOVA model, baseline data, adjusted for age and gender.
Figure 2. Mean PAGI-SYM scores by global disease severity in the dyspepsia sample. ANCOVA model, baseline data, adjusted for age and gender.
symptoms. With independent sample t tests, we compared change in PAGI-SYM scores from baseline to 8 weeks between patients experiencing symptom relief versus those who did not. The relationship between the PAGI-SYM scores and change in clinical status was also evaluated by using ANCOVA models. Subjects were classified as improving, stable, or worsening in their GI-related symptoms by using the patient-rated OTE at 8 weeks. ANCOVA models, adjusting for gender and age, were used to compare mean PAGI-SYM scores by change in clinical status. These ANCOVAs were performed by diagnosis group.

We estimated effect sizes for subjects reporting improvements in clinical status by using 2 methods.20 The first method, effect size 1, is a quantitative measure of change in score and provides a means of standardizing the comparison between groups. Effect size 1 equals the mean baseline to 8-week change in PAGI-SYM scores for subjects who improved divided by the standard deviation of baseline scores for all patients: (mean baseline score − mean 8-week score)/(standard deviation of baseline scores). The second estimate, effect size 2, uses the same numerator but limits the denominator to the standard deviation of score changes among stable patients only: (mean score change)/(standard deviation of score changes among stable patients).

SEM was also estimated for PAGI-SYM scores.21,22 SEM reflects the variability between an individual’s observed score and the true score. It takes into consideration that some observed change might be due to random measurement error. SEM equals the standard deviation of the PAGI-SYM score multiplied by the square root of 1 minus its internal consistency reliability coefficient. There is a close relationship between the minimally important difference and one SEM. Effect sizes and SEMs were calculated for the 2 disease-specific groups.
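The three quantities defined above can be written out as a short Python sketch. It is an illustration under stated assumptions rather than the authors' analysis code: the function names are hypothetical, scores are passed as paired lists (baseline and 8-week values in the same subject order), and the sample standard deviation (n − 1 denominator) is assumed because the text does not say which estimator was used.

```python
import math
import statistics
from typing import Sequence


def mean_change(baseline: Sequence[float], week8: Sequence[float]) -> float:
    """Mean baseline score minus mean 8-week score (positive = improvement)."""
    return statistics.mean(baseline) - statistics.mean(week8)


def effect_size_1(baseline_improved: Sequence[float],
                  week8_improved: Sequence[float],
                  baseline_all: Sequence[float]) -> float:
    """Mean change in improved subjects / SD of baseline scores in all subjects."""
    return mean_change(baseline_improved, week8_improved) / statistics.stdev(baseline_all)


def effect_size_2(baseline_improved: Sequence[float],
                  week8_improved: Sequence[float],
                  baseline_stable: Sequence[float],
                  week8_stable: Sequence[float]) -> float:
    """Mean change in improved subjects / SD of score changes in stable subjects."""
    stable_changes = [b - w for b, w in zip(baseline_stable, week8_stable)]
    return mean_change(baseline_improved, week8_improved) / statistics.stdev(stable_changes)


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - internal consistency reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)
```

As a purely hypothetical illustration, a subscale with a standard deviation of 1.0 and a reliability of 0.91 would have an SEM of about 0.30 under this formula.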
Table 2. Mean PAGI-SYM Subscale Scores by Symptom Relief Status at 8 Weeks

                           Dyspepsia symptom relief           GERD symptom relief
                           No, mean (SD)   Yes, mean (SD)     No, mean (SD)   Yes, mean (SD)
Heartburn/regurgitation    1.35 (1.12)     0.63 (0.79)        2.12 (1.07)     0.95 (0.83)
Fullness/early satiety     2.27 (1.21)     1.11 (1.03)        2.07 (1.20)     0.92 (0.96)
Nausea/vomiting            1.12 (1.14)     0.48 (0.75)        1.17 (1.11)     0.45 (0.74)
Bloating                   2.41 (1.35)     1.33 (1.20)        2.15 (1.33)     1.09 (1.12)
Upper abdominal pain       2.12 (1.27)     1.07 (1.09)        1.99 (1.28)     0.97 (1.02)
Lower abdominal pain       1.66 (1.41)     0.77 (1.04)        1.45 (1.36)     0.60 (0.95)

NOTE. T tests by symptom relief status. All results significant at P < 0.0001.
Results

Study Sample

The PAGI-SYM validation study included 810 GERD patients (51.4%) and 767 dyspepsia patients (48.6%) for a total of 1577 subjects (Table 1). The mean age of respondents was 48.1 years (standard deviation, 14.9), 60.3% were female, 68% were married or living with partner, and 55.2% were employed either full- or part-time. Twenty-eight percent of the sample was from the U.S., with the remaining subjects from France (16.9%), Germany (13.5%), Italy (12.7%), the Netherlands (13.6%), or Poland (15.0%).

Patient assessment of gastrointestinal disorders–symptom severity index scores by global disease severity. The PAGI-SYM subscale scores varied significantly by global disease severity in the GERD and dyspepsia samples (Figures 1 and 2). In the GERD sample, subjects reporting greater global symptom severity also reported higher scores on the PAGI-SYM subscales (Figure 1). These differences were greatest for heartburn/regurgitation (P < 0.0001) and upper abdominal pain (P < 0.0001). In the dyspepsia group, subjects with more severe disease reported higher (worse) PAGI-SYM subscale scores (Figure 2). The differences were largest on fullness/early satiety and upper abdominal pain (both P < 0.0001).
Between-patient change. Responsiveness to changes in clinical symptom status was evaluated by assessing the relationship between PAGI-SYM subscale scores and patient ratings of changes in overall symptom relief and overall GI symptoms during the period between the baseline and 8-week follow-up assessment (Tables 2 and 3). Mean PAGI-SYM scores at 8 weeks are summarized by symptom relief status in Table 2. In both the GERD and dyspepsia samples, mean PAGI-SYM subscale scores were statistically significantly different between those subjects reporting global symptom relief compared with those who did not report symptom relief (all P < 0.0001). The largest differences in mean scores in the dyspepsia sample were seen for fullness/early satiety, bloating, and upper abdominal pain. In the GERD sample, the largest differences in mean scores were seen for heartburn/regurgitation, fullness/early satiety, bloating, and upper abdominal pain.

Mean baseline to 8-week changes in PAGI-SYM subscale scores were compared between subjects reporting symptom relief and those reporting no relief (Table 3). PAGI-SYM total and subscale change scores differed significantly by patient-rated symptom relief at 8 weeks. Statistically significant differences were observed for all mean PAGI-SYM subscale scores in the dyspepsia (P < 0.001 to P < 0.0001) and GERD samples (P < 0.05 to
Table 3. Baseline to 8-Week Change in PAGI-SYM Scores by Symptom Relief Status at 8 Weeks

                           Dyspepsia symptom relief                         GERD symptom relief
                           No, mean (SD)   Yes, mean (SD)   Difference     No, mean (SD)   Yes, mean (SD)   Difference
Heartburn/regurgitation    0.07 (0.78)     −0.37 (0.78)     0.44a          −0.22 (0.90)    −0.90 (1.06)     0.67a
Fullness/early satiety     −0.10 (0.87)    −0.91 (1.19)     0.81a          0.04 (0.89)     −0.54 (1.04)     0.58a
Nausea/vomiting            −0.20 (0.92)    −0.49 (1.01)     0.29b          −0.07 (0.97)    −0.40 (0.94)     0.32b
Bloating                   −0.04 (1.15)    −0.92 (1.29)     0.88a          −0.13 (1.22)    −0.53 (1.15)     0.40b
Upper abdominal pain       −0.21 (1.23)    −1.16 (1.33)     0.94a          −0.18 (1.29)    −0.79 (1.27)     0.61a
Lower abdominal pain       0.01 (1.20)     −0.43 (1.17)     0.44a          −0.03 (1.22)    −0.24 (1.11)     0.22c

a P < 0.0001. b P < 0.001. c P < 0.05.
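The Difference column in Table 3 appears to be the mean change in the no-relief group minus the mean change in the relief group. That reading is an inference from the numbers, not something the table states. The sketch below recomputes the column from the rounded means as a consistency check; the names `TABLE3`, `recomputed_difference`, and `max_discrepancy` are hypothetical.

```python
# (no-relief mean change, relief mean change, reported difference), from Table 3
TABLE3 = {
    "dyspepsia": {
        "heartburn/regurgitation": (0.07, -0.37, 0.44),
        "fullness/early satiety": (-0.10, -0.91, 0.81),
        "nausea/vomiting": (-0.20, -0.49, 0.29),
        "bloating": (-0.04, -0.92, 0.88),
        "upper abdominal pain": (-0.21, -1.16, 0.94),
        "lower abdominal pain": (0.01, -0.43, 0.44),
    },
    "GERD": {
        "heartburn/regurgitation": (-0.22, -0.90, 0.67),
        "fullness/early satiety": (0.04, -0.54, 0.58),
        "nausea/vomiting": (-0.07, -0.40, 0.32),
        "bloating": (-0.13, -0.53, 0.40),
        "upper abdominal pain": (-0.18, -0.79, 0.61),
        "lower abdominal pain": (-0.03, -0.24, 0.22),
    },
}


def recomputed_difference(no_relief: float, relief: float) -> float:
    """No-relief mean change minus relief mean change, rounded to 2 decimals."""
    return round(no_relief - relief, 2)


def max_discrepancy(table: dict) -> float:
    """Largest absolute gap between recomputed and reported differences."""
    return max(
        abs(recomputed_difference(no, yes) - reported)
        for rows in table.values()
        for no, yes, reported in rows.values()
    )
```

Every recomputed value falls within about 0.01 of the reported one, which is what would be expected if the published differences were calculated from unrounded means.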
Table 4. Responsiveness Statistics for the PAGI-SYM (Mean Change Scores)

                        PAGI-SYM subscale scores
Patient-rated change    Heartburn/      Fullness/       Nausea/    Bloating   Upper abdominal   Lower abdominal
                        regurgitation   early satiety   vomiting              pain              pain
GERD
  Improved (a)          −1.17           −0.77           −0.57      −0.73      −1.03             −0.40
  No change (b)         −0.42           −0.13           −0.13      −0.17      −0.40             −0.06
  Worsened (c)          0.28            0.48            0.38       0.24       0.68              0.22
  P value (d)           <0.0001         <0.0001         <0.0001    <0.0001    <0.0001           <0.0001
Dyspepsia
  Improved (e)          −0.49           −1.12           −0.69      −1.15      −1.45             −0.67
  No change (f)         −0.11           −0.25           −0.19      −0.25      −0.43             −0.08
  Worsened (g)          0.46            0.06            0.22       0.12       0.41              0.33
  P value (d)           <0.0001         <0.0001         <0.0001    <0.0001    <0.0001           <0.0001

(a) N ranges from 360–348. (b) N ranges from 317–305. (c) N ranges from 48–46. (d) Two-tailed P value from ANCOVA model adjusting for gender and age. (e) N ranges from 348–317. (f) N ranges from 312–259. (g) N ranges from 56–46.
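One simple way to read Table 4 is to check that, for every subscale, the mean change is most negative in the improved group, closer to zero in the no-change group, and positive in the worsened group. The sketch below encodes the table values and tests that ordering; it is an illustrative check only, and the names `TABLE4` and `follows_expected_order` are hypothetical.

```python
# (improved, no change, worsened) mean baseline to 8-week change, from Table 4
TABLE4 = {
    "GERD": {
        "heartburn/regurgitation": (-1.17, -0.42, 0.28),
        "fullness/early satiety": (-0.77, -0.13, 0.48),
        "nausea/vomiting": (-0.57, -0.13, 0.38),
        "bloating": (-0.73, -0.17, 0.24),
        "upper abdominal pain": (-1.03, -0.40, 0.68),
        "lower abdominal pain": (-0.40, -0.06, 0.22),
    },
    "dyspepsia": {
        "heartburn/regurgitation": (-0.49, -0.11, 0.46),
        "fullness/early satiety": (-1.12, -0.25, 0.06),
        "nausea/vomiting": (-0.69, -0.19, 0.22),
        "bloating": (-1.15, -0.25, 0.12),
        "upper abdominal pain": (-1.45, -0.43, 0.41),
        "lower abdominal pain": (-0.67, -0.08, 0.33),
    },
}


def follows_expected_order(improved: float, stable: float, worsened: float) -> bool:
    """True when improved < stable < 0 < worsened (lower scores are better)."""
    return improved < stable < 0 < worsened


def subscales_in_order(table: dict) -> int:
    """Count disease/subscale pairs whose mean changes follow the ordering."""
    return sum(
        follows_expected_order(*row)
        for rows in table.values()
        for row in rows.values()
    )
```

All 12 disease-by-subscale combinations follow the ordering, which matches the pattern the text describes for each subscale.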
P < 0.0001). In all cases, subjects reporting symptom relief also reported greater decreases in PAGI-SYM total and subscale scores than those subjects reporting no symptom relief.

Mean baseline to 8-week changes in subscale scores by change in clinical status, on the basis of patient ratings, are summarized in Table 4. There were statistically significant differences in baseline to 8-week mean change in heartburn/regurgitation scores by patient-assessed change in clinical status in the GERD (P < 0.0001) and dyspepsia groups (P < 0.0001). In the GERD sample, mean change in heartburn/regurgitation scores differed between the improved (−1.17), no change (−0.42), and worsened (0.28) groups (all P < 0.0001). For the dyspepsia sample, heartburn/regurgitation change scores differed between the improved (−0.49), no change (−0.11), and worsened (0.46) groups (all P < 0.0001).

Statistically significant differences in 8-week mean change in fullness/early satiety scores were observed by patient-rated changes in symptom status in the GERD (P < 0.0001) and dyspepsia groups (P < 0.0001). The improved group showed a 0.77–1.12 point decrease in fullness/early satiety scores, whereas the worsened group increased 0.06–0.48 points. We found statistically significant differences in bloating change scores between the patient-rated clinical status in both the GERD (P < 0.0001) and dyspepsia groups (P < 0.0001). The improved group demonstrated 0.73–1.15-point decreases in mean bloating scores, whereas the stable and worsened groups showed changes ranging from −0.17 to −0.25 points and 0.12 to 0.24 points, respectively.

There were statistically significant differences in changes in nausea/vomiting score among the improved, stable, and worsened groups in the GERD (P < 0.0001) and dyspepsia samples (P < 0.0001). Subjects reporting improvements in clinical status had decreases of 0.57–0.69 points, whereas those in the stable group changed only −0.13 to −0.19 points and those in the worsened group changed 0.22–0.38 points. Mean baseline to 8-week changes in upper abdominal pain and lower abdominal pain scores were significantly different (both P < 0.0001) by change in clinical status groups. For upper abdominal pain, the improved group reported the largest decreases in scores (−1.03 to −1.45), with the worsened group increasing in upper abdominal pain scores during the 8-week period (0.41–0.68). This pattern of results was observed for the lower abdominal pain scores, although the magnitude of effects was smaller.

Effect sizes. Effect sizes were estimated on the basis of patient-rated improvement in GI symptoms for the PAGI-SYM subscale scores (Table 5). Effect sizes associated with improved versus no change groups ranged from 0.30 (lower abdominal pain, GERD sample) to 1.28 (upper abdominal pain, dyspepsia sample), depending on method, for PAGI-SYM subscale scores. In the GERD sample, the largest effect sizes were observed for heartburn/regurgitation (1.14, 1.16) and upper abdominal pain (0.80, 0.82) scores. For the dyspepsia sample, the largest effect sizes were seen on the upper abdominal pain (1.22, 1.28), fullness/early satiety (1.00, 1.02), and bloating (0.83, 0.85) subscales.

Standard error of measurement. SEMs were calculated for the PAGI-SYM subscale scores (Table 5). The SEMs ranged from 0.29 (for heartburn/regurgitation in
Table 5. Effect Size and SEM for PAGI-SYM Subscale Scores by Disease Group

                    PAGI-SYM subscale
                    Heartburn/      Fullness/       Nausea/    Bloating   Upper abdominal   Lower abdominal
                    regurgitation   early satiety   vomiting              pain              pain
GERD
  Effect size 1     1.14            0.64            0.54       0.55       0.80              0.32
  Effect size 2     1.16            0.68            0.56       0.55       0.82              0.30
  SEM               0.31            0.41            0.40       0.54       0.42              0.32
Dyspepsia
  Effect size 1     0.48            1.00            0.66       0.85       1.22              0.42
  Effect size 2     0.45            1.02            0.66       0.83       1.28              0.41
  SEM               0.29            0.42            0.42       0.53       0.47              0.37
the dyspepsia sample) to 0.54 (for bloating in the GERD sample) for the PAGI-SYM subscale scores. In both the GERD and dyspepsia samples, the smallest SEMs were seen for heartburn/regurgitation and lower abdominal pain scores.
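To put the Table 5 effect sizes on a familiar scale, the sketch below labels each effect size 1 value with Cohen's conventional benchmarks (0.20 small, 0.50 medium, 0.80 large). Applying these cut points here is an assumption of the example, not part of the analysis reported in the Results, and the names `EFFECT_SIZE_1`, `cohen_label`, and `large_effects` are hypothetical.

```python
# Effect size 1 values from Table 5, keyed by disease group and subscale
EFFECT_SIZE_1 = {
    "GERD": {
        "heartburn/regurgitation": 1.14,
        "fullness/early satiety": 0.64,
        "nausea/vomiting": 0.54,
        "bloating": 0.55,
        "upper abdominal pain": 0.80,
        "lower abdominal pain": 0.32,
    },
    "dyspepsia": {
        "heartburn/regurgitation": 0.48,
        "fullness/early satiety": 1.00,
        "nausea/vomiting": 0.66,
        "bloating": 0.85,
        "upper abdominal pain": 1.22,
        "lower abdominal pain": 0.42,
    },
}


def cohen_label(effect_size: float) -> str:
    """Label an effect size with Cohen's conventional benchmarks."""
    if effect_size >= 0.80:
        return "large"
    if effect_size >= 0.50:
        return "medium"
    if effect_size >= 0.20:
        return "small"
    return "negligible"


def large_effects(group: str) -> list:
    """Subscales in one disease group whose effect size 1 is labeled large."""
    return [name for name, es in EFFECT_SIZE_1[group].items()
            if cohen_label(es) == "large"]
```

Under these benchmarks the large effects are heartburn/regurgitation and upper abdominal pain in the GERD sample, and fullness/early satiety, bloating, and upper abdominal pain in the dyspepsia sample, in line with the subscales the text singles out.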
Discussion

The PAGI-SYM was developed to measure the severity of GI symptoms on the basis of patient self-assessments. The PAGI-SYM was developed and psychometrically evaluated in parallel in 6 countries,14 so it can be used in international clinical trials and for monitoring outcomes in clinical practice. Previous research has demonstrated that the PAGI-SYM is reliable and valid in subjects with GERD, dyspepsia, or gastroparesis.13,14 The findings of this study indicate that the PAGI-SYM varies significantly by measures of global disease severity and has preliminary evidence supporting responsiveness in samples of subjects with GERD or dyspepsia.

In the GERD and dyspepsia samples, PAGI-SYM subscale scores varied significantly by global disease severity, with higher (worse) scores observed in those subjects who rated their GI disease as moderate to severe. As expected, we observed significant differences in mean PAGI-SYM change scores from baseline to 8 weeks between respondents who reported symptom relief and those reporting no symptom relief. For all comparisons, those subjects who reported symptom relief had mean PAGI-SYM scores that were significantly less severe than those who reported no symptom relief. There were significant differences between those subjects with and without symptom relief on baseline to 8-week change in heartburn/regurgitation (difference, 0.67 points), upper abdominal pain (difference, 0.61 points), and fullness/early satiety (difference, 0.58 points) scores in the GERD sample and for upper abdominal pain (difference, 0.94 points), bloating (difference, 0.88 points), and fullness/early satiety (difference, 0.81 points) scores in the dyspepsia sample. Clearly, symptom relief was strongly associated with mean symptom severity and with changes in symptom severity as measured by the PAGI-SYM.

Subjects who rated themselves as improved over 8 weeks showed significantly larger decreases in mean PAGI-SYM subscale scores compared with those patients who remained unchanged or who worsened. These findings were consistently demonstrated in both the GERD and dyspepsia samples in this study. In the current study, physicians and patients agreed on 84% of the assessments about change in clinical status (i.e., improved, stable, worsened).

For GERD patients, the most relevant PAGI-SYM subscale is heartburn/regurgitation. In GERD subjects who improved, a decrease (indicating improvements in symptom severity) of 1.2 points was observed in mean heartburn/regurgitation scores. This change was significantly different and better than those changes observed in the worsening or stable groups. For the other PAGI-SYM subscales, the largest improvements were seen in upper abdominal pain scores compared with a more modest decrease in the stable or increases in worsening subjects. Dyspepsia subjects who improved had 1.1-point decreases in fullness/early satiety, 1.2-point decreases in bloating scores, and 1.5-point decreases in upper abdominal pain scores. Subjects who remained stable or worsened reported minor changes in fullness/early satiety, bloating, and upper abdominal pain scores.

Effect sizes for the PAGI-SYM subscale scores varied depending on specific subscale, disease group, and effect size method. For the GERD sample, effect sizes for heartburn/regurgitation ranged from 1.14–1.16, suggesting large effects in this important and relevant symptom measure. According to Cohen,23 large effect sizes are 0.80 or greater. The findings suggest that changes or differences of 0.75 points are clearly clinically significant in GERD patients, and minimally important differences
might be 0.55 points. We also observed that there was a 0.51-point difference in heartburn/regurgitation scores between patients reporting no restricted activity days and those reporting 3–6 restricted activity days.14 The SEM for the PAGI-SYM heartburn/regurgitation subscale scores was 0.31, representing an approximately 0.32-point meaningful change in these scores, after adjusting for reliability. Previous research has found that the minimally important difference is generally equivalent to one SEM.9,21,22 On the basis of the validity study data, the SEMs, and effect size estimates, a treatment difference of 0.30–0.55 points is recommended as the minimally important difference for heartburn/regurgitation scores in GERD studies.

For the dyspepsia sample, effect sizes for fullness/early satiety, bloating, and upper abdominal pain ranged from 0.83–1.28 depending on effect size method, reflecting relatively large effect sizes. The results indicate that changes or differences of 0.87 points in fullness/early satiety, 0.90 points in bloating, and 1.0 points in upper abdominal pain scores are definitely clinically significant. The minimally important differences for these subscale scores might range from 0.60–0.70 points. We also observed that there were differences in fullness/early satiety (difference, 0.64 points), bloating (difference, 0.31 points), and upper abdominal pain (difference, 0.60 points) scores between patients reporting no restricted activity days and those reporting 3–6 restricted activity days.14 The SEMs observed in this study indicate that a 0.47-point difference in fullness/early satiety scores, a 0.72-point difference in bloating scores, and a 0.56-point difference in upper abdominal pain scores might be clinically meaningful. On the basis of these data, a treatment difference of 0.30–0.70 points is recommended as the minimally important difference for the fullness/early satiety, bloating, and upper abdominal pain scores in patients with dyspepsia.

Two broad approaches have been used to examine the clinical significance associated with changes in patient-reported outcome measures: anchor-based and distribution-based methods.9,24,25 Anchor-based methods take patient-reported outcome scores and link them to an independent criterion measure, such as clinical status based on physician or laboratory measures. The independent indicators can be a clinical measure commonly used and understood by clinicians that is associated with the disease of interest or other patient-reported instruments with established measurement qualities. Distribution-based methods rely on statistical distributions of patient-reported outcome measures from a given study. These methods include between-person standard deviations or
effect sizes26 and SEM.21,22 Multiple indicators (anchors) and methods are used to evaluate clinical significance, and it is the triangulation of findings from these different methods that determines the minimal important difference for the patient-reported outcome measures.9 In the current study, we used a combination of methods to assess responsiveness and have found evidence supporting the clinical sensitivity of the PAGI-SYM.

Specific subscales in the PAGI-SYM might be most useful for clinical studies in GI disorders. For GERD studies, the obvious primary end point should be the heartburn/regurgitation score, whereas the other PAGI-SYM subscales should be considered secondary end points. It is uncertain which subscales might be most useful in studies comparing dyspepsia treatments. However, the fullness/early satiety, bloating, and upper abdominal pain subscales of the PAGI-SYM seemed most responsive to changes in clinical status among dyspepsia subjects in the current study.

Several potential limitations to this study should be considered when interpreting these results. First, data were collected by using telephone interviews and self-completed questionnaires. Although there might be some differences by mode of administration, a separate study demonstrated no differences between telephone and self-completed PAGI-SYM scores.19 Second, patients in this observational study were treated by using different medications according to usual clinical management by their physicians. No detailed information is available as to these treatments, so it is not possible to differentiate the effectiveness of different therapies. Finally, we combined data from U.S. and European patients for the responsiveness study, and there might be some differences in clinical characteristics between these groups. However, we completed extensive psychometric analyses by country and detected no substantive differences across countries.14

In summary, the PAGI-SYM is a brief 20-item symptom severity measure capturing data on common GI symptoms. In this study, we demonstrated responsiveness to changes in clinical status in more than 1500 subjects with GERD or dyspepsia. Additional studies are needed to confirm the responsiveness of the PAGI-SYM in other samples of patients with GI disorders and for other countries. Selected subscales within the PAGI-SYM might be useful as clinical end points for clinical trials comparing medical therapies for GERD or dyspepsia. The PAGI-SYM instrument might also be successfully used in monitoring patient outcomes and quality of care in community medical practice settings.
CLINICAL GASTROENTEROLOGY AND HEPATOLOGY Vol. 2, No. 9
References
1. Drossman DA, Li Z, Andruzzi E, Temple RD, Talley NJ, Thompson WG, Whitehead WE, Janssens J, Funch-Jensen P, Corazziari E, Richter JE, Koch GG. U.S. householder survey of functional gastrointestinal disorders: prevalence, sociodemography, and health impact. Dig Dis Sci 1993;38:1569–1580.
2. Rome II Multinational Working Teams. Drossman DA, Corazziari E, Talley NJ, Thompson WG, Whitehead WE. Rome II: the functional gastrointestinal disorders. 2nd ed. McLean, VA: Degnon Associates, 2000.
3. Everhart JE. Overview. In: Everhart JE, ed. Digestive diseases in the United States: epidemiology and impact. Washington, DC: U.S. Department of Health and Human Services, Public Health Service, National Institute of Health, National Institute Diabetes and Digestive and Kidney Diseases, 1994.
4. Camilleri M, Whitehead WE, Stewart W, Kahrilas PJ, Sonnenberg A, Robinson P, Sloan S, Revicki D, Willian MK. A U.S. national survey of upper gastrointestinal symptoms in 21,000 community participants (abstr 851). Gastroenterology 2000;118:A144.
5. Frank L, Kleinman L, Ganoczy D, Farup C, McQuaid K, Tougas G, Eggleston A, Sloan S, Nguyen M. Upper gastrointestinal symptoms in North America: prevalence and relationship to healthcare utilization and quality of life. Dig Dis Sci 2000;45:809–818.
6. Drossman DA. Do the ROME criteria stand up? In: Talley NJ, ed. Functional dyspepsia and irritable bowel syndrome: concepts and controversies. Dordrecht, the Netherlands: Kluwer Academic Publishers, 1998:11–18.
7. Talley NJ, Fullerton S, Junghard O, Wiklund I. Quality of life in patients with endoscopy-negative heartburn: reliability and sensitivity of disease-specific instruments. Am J Gastroenterol 2002;96:1998–2004.
8. Havelund T, Lind T, Wiklund I, Glise H, Hernqvist H, Lauritsen K, Lundell L, Pedersen SA, Carlsson R, Junghard O, Stubberod A, Anker-Hansen O. Quality of life in patients with heartburn but without esophagitis: effects of treatment with omeprazole. Am J Gastroenterol 1999;94:1782–1789.
9. Guyatt GH, Osoba D, Wu AW, Wyrwich KW, Norman GR. Methods to explain the clinical significance of health status measures. Mayo Clin Proc 2002;77:371–383.
10. Revicki DA, Osoba D, Fairclough D, Barofsky I, Berzon R, Leidy NK, Rothman M. Recommendations on health-related quality of life research to support labeling and promotional claims in the United States. Qual Life Res 2000;9:887–900.
11. Scientific Advisory Committee of the Medical Outcomes Trust. Assessing health status and quality-of-life instruments: attributes and review criteria. Qual Life Res 2002;11:193–205.
12. Chassany O, Sagnier P, Marquis P, Fullerton S, Aaronson N, Group ERIoQA. Patient-reported outcomes: the example of health-related quality of life—a European guidance document for the improved integration of HRQoL assessment in the drug regulatory process. Drug Info J 2002;36:206–218.
13. Revicki DA, Rentz AM, Dubois D, Kahrilas P, Stanghellini V, Talley N, Tack J. Gastroparesis Cardinal Symptom Index (GCSI): development and validation of a patient reported assessment of severity of gastroparesis symptoms. Aliment Pharmacol Ther 2003;18:141–150.
14. Rentz AM, Ciesla G, Kahrilas P, Stanghellini V, Tack J, Talley N, de la Loge C, Trudeau E, Dubois D, Revicki DA. Development and psychometric evaluation of the patient assessment of upper gastrointestinal symptom severity index (PAGI-SYM) in patients with functional gastrointestinal disease. Qual Life Res (in press).
15. Jones M, Coulie B, Revicki DA, Stewart W, Whitehead WE. Combined factor and cluster analysis to identify subgroups with functional gastrointestinal disorders (abstr). Gastroenterology 2002;122:A571.
16. Talley NJ, Stanghellini V, Heading RC, Koch KL, Malagelada JR, Tytgat GN. Functional gastroduodenal disorders. Gut 1999;45(suppl 2):II37–II42.
17. Jaeschke R, Singer J, Guyatt GH. Measurement of health status: ascertaining the minimal clinically important difference. Control Clin Trials 1989;10:407–415.
18. Revicki DA, Sorensen S, Maton PN, Orlando RC. Health-related quality of life outcomes of omeprazole versus ranitidine in poorly responsive symptomatic gastroesophageal reflux disease. Dig Dis 1998;16:284–291.
19. Stanghellini V, Rentz AM, Schmier JK, Jones R, Dubois D, Peeters K, Revicki DA. Mode of administration of the patient assessment of upper gastrointestinal disorders-symptom severity (PAGI-SYM) and patient assessment of upper gastrointestinal disorders-quality of life (PAGI-QOL). American College of Gastroenterology Annual Meeting, 2001.
20. Kazis LE, Anderson JJ, Meenan RF. Effect sizes for interpreting changes in health status. Med Care 1989;27(suppl):S178–189.
21. Wyrwich KW, Nienaber NA, Tierney WM, Wolinsky FD. Linking clinical relevance and statistical significance in evaluating intraindividual changes in health-related quality of life. Med Care 1999;37:469–478.
22. Wyrwich KW, Tierney WM, Wolinsky FD. Further evidence supporting an SEM-based criterion for identifying meaningful intra-individual changes in health-related quality of life. J Clin Epidemiol 1999;52:861–873.
23. Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. Hillsdale, NJ: Lawrence Erlbaum Associates, 1988.
24. Crosby RD, Kolotkin RL, Williams GR. Defining clinically meaningful change in health-related quality of life. J Clin Epidemiol 2003;56:395–407.
25. Lydick E, Epstein RS. Interpretation of quality of life changes. Qual Life Res 1993;2:221–226.
26. Hays RD, Anderson RT, Revicki DA. Assessing reliability and validity of measurement in clinical trials. In: Fayers PM, ed. Quality of life assessment in clinical trials: methods and practice. Oxford: Oxford University Press, 1998.
Address requests for reprints to: Dennis Revicki, Ph.D., Center for Health Outcomes Research, MEDTAP International, 7101 Wisconsin Avenue, Suite 600, Bethesda, Maryland 20814. e-mail: [email protected]; fax: (301) 654-9864.
Dr. Talley is a consultant for Fanizzi Associates, Novartis, Oridion, Solvay, and Yamanouchi Pharma America, Inc. and has received research support from Merck, Forest, AstraZeneca, Novartis, and Solvay. Dr. Revicki and Ms. Rentz have received research support from AstraZeneca, Novartis, and Johnson & Johnson.
For more information, contact Dennis Revicki, Ph.D., Center for Health Outcomes Research, MEDTAP International, 7101 Wisconsin Avenue, Suite 600, Bethesda, Maryland 20814; telephone: (301) 654-9729; fax: (301) 654-9864; e-mail: [email protected].
For permission to use the PAGI-SYM instrument, contact Natasha Serrano at [email protected].
September 2004