The Breathlessness, Cough, and Sputum Scale


The Breathlessness, Cough, and Sputum Scale* The Development of Empirically Based Guidelines for Interpretation Nancy Kline Leidy, PhD; Stephen I. Rennard, MD, FCCP; Jordana Schmier, MA; M. Kathryn C. Jones, MA, MSc; and Mitch Goldman, MD, PhD

Background: A patient report of respiratory symptoms in COPD is essential to successfully monitoring disease, adjusting treatment, and evaluating outcomes.

Objective: To develop empirically based guidelines for interpreting mean changes in symptom scores using the Breathlessness, Cough, and Sputum Scale (BCSS).

Methods: Analyses were performed on data from three multinational trials (n = 2,971). Mean changes in BCSS score with treatment were examined by physician and patient ratings of treatment efficacy, juxtaposed with percentage change in symptoms, statistical effect size (ES), ΔFEV1, and change in St. George Respiratory Questionnaire (SGRQ) score. BCSS scores during an exacerbation were examined relative to changes in peak expiratory flow and rescue medication use.

Results: Mean baseline BCSS total score was 5.2 ± 2 (± SD); 90% of scores were between 2 and 9. Highly efficacious treatment (n = 257; physician rating) was associated with a ΔBCSS total score of −1.3 ± 1.8, representing a 24% improvement (ES = 0.7), and corresponding to a 10% improvement in FEV1 and ΔSGRQ score total of −10.3 ± 13.8. Similar changes in BCSS score were observed during recovery from an exacerbation (n = 713; −1.3 ± 1.8). Mean change with moderately efficacious treatment (n = 965) was −0.7 ± 1.8, a 13% improvement (ES = 0.3) corresponding to ΔSGRQ total score of −6.8 ± 12.6. Mildly efficacious treatment (n = 891) was associated with a change of −0.35, a 7% improvement (ES = 0.18), with a ΔFEV1 < 1% and ΔSGRQ total score of −2.6 ± 11.7.

Conclusions: Patient-reported daily symptom data are sensitive to change and useful for both observational studies and controlled clinical trials of patients with COPD. A mean ΔBCSS total score > 1.0 represents substantial symptomatic improvement, changes of approximately 0.6 can be interpreted as moderate, and changes of 0.3 can be considered small.
(CHEST 2003; 124:2182–2191)

Key words: bronchitis; clinical trials; COPD; cough; dyspnea; emphysema; exacerbation; health-related quality of life; instrument; patient-reported outcome; sputum; St. George Respiratory Questionnaire; symptoms

Abbreviations: BCSS = Breathlessness, Cough, and Sputum Scale; ES = effect size; HRQL = health-related quality of life; Mb = baseline mean value; Mt = treatment mean value; σb = baseline variance in Breathlessness, Cough, and Sputum Scale score; PEF = peak expiratory flow; SGRQ = St. George Respiratory Questionnaire

*From MEDTAP International (Dr. Leidy and Ms. Schmier), Bethesda, MD; University of Nebraska Medical Center (Dr. Rennard), Omaha, NE; and AstraZeneca (Dr. Goldman and Ms. Jones), Wilmington, DE. This study was supported by AstraZeneca Pharmaceuticals LP. Dr. Jones and Dr. Goldman are employees of AstraZeneca. Dr. Rennard has received research grant support, honoraria for speaking, and fees for consultancy from AstraZeneca. Neither he nor his family have any ownership position in AstraZeneca. Manuscript received October 15, 2002; revision accepted July 23, 2003. Reproduction of this article is prohibited without written permission from the American College of Chest Physicians (e-mail: [email protected]). Correspondence to: Nancy Kline Leidy, PhD, Center for Health Outcomes Research, MEDTAP International, Inc., 7101 Wisconsin Ave, Suite 600, Bethesda, MD 20814; e-mail: [email protected]
www.chestjournal.org

COPD refers to a cluster of diseases (including chronic bronchitis and emphysema) characterized by progressive reduction in expiratory airflow with increasing breathlessness, and often with cough and sputum production.1 Consensus statements published by the Global Initiative for Chronic Obstructive Lung Disease,1 as well as the European Respiratory Society,2 the American Thoracic Society,3 and the British Thoracic Society,4 emphasize the importance of assessing these symptoms, in addition to spirometry, in the diagnosis of COPD. Although pulmonary function is an important objective marker of the degree of airflow limitation, the patient’s report of symptom severity is an independent measure essential to successfully monitor disease activity, adjust treatment, and evaluate outcomes of care.1

There are relatively few instruments to assess the severity of respiratory symptoms and symptom outcomes in the COPD population within the context of a clinical trial. The Baseline Dyspnea Index/Transition Dyspnea Index,5 modified Medical Research Council Dyspnea Scale,6 Oxygen Cost Diagram,7 and Borg Scale8 have been used to evaluate breathlessness but do not address cough or sputum. The length and 3-month recall period of the chronic lung disease severity index9 are not conducive to evaluating treatment outcomes in controlled clinical trials. Condition-specific health-related quality of life (HRQL) measures, such as the St. George Respiratory Questionnaire (SGRQ)10 or the Chronic Respiratory Questionnaire,11 were not designed to assess symptom severity per se, but rather the impact of symptoms on daily functioning and well-being. None of these measures prospectively assesses the severity or variability of symptoms day to day. In fact, as of this date, there are no published, validated (reliable, valid, responsive) measures to assess day-to-day symptom severity in COPD.

The Breathlessness, Cough, and Sputum Scale (BCSS) was developed to meet the need for a precise, easy-to-use instrument for tracking the severity of respiratory symptoms and evaluating efficacy of treatment in clinical trials of patients with COPD. Designed as part of a daily diary, the BCSS asks subjects to assess and record the severity of three symptoms of COPD: breathlessness, cough, and sputum. The diary format of the BCSS enables investigators, and potentially clinicians, to assess symptom variability, including the variance associated with acute exacerbations, and to evaluate the trajectory of symptom severity over time in this patient population. The symptoms of breathlessness, cough, and sputum have been identified as cardinal symptoms of COPD in various consensus statements,1–4 and are those most likely to be affected by pharmacotherapy designed to ameliorate and control respiratory symptoms in this population.

The BCSS is a brief, three-item, patient-reported outcome measure in which each of the three symptoms assessed by the measure is represented by a single item (see Appendix). Patients are asked to evaluate each symptom/item on a 5-point Likert-type scale, ranging from 0 to 4, with higher scores indicating a more severe manifestation of the symptom.

A daily total score is expressed as the sum of three item scores, with a range of 0 to 12. For greater precision as a clinical trial outcome, weekly and period (eg, baseline, treatment, follow-up) scores can be computed by aggregating daily scores over time.

Results of a previous investigation12 provided evidence that the BCSS is a reliable and valid measure of symptom severity. Item and total scale scores were found to be internally consistent (Cronbach α = 0.70 daily; 0.95 to 0.99 over time) and reproducible under stable conditions, in this case, a situation in which both the physician and patient agreed that the treatment produced no change or was mildly effective. Intraclass correlation coefficients for item and total scores ranged from 0.74 to 0.78. Values for both indicators of reliability exceeded the guideline of 0.70 for group-level analyses.13 The study also provided evidence of the concurrent, convergent, divergent, and discriminant validity of the BCSS.

The previous investigation12 also indicated the instrument is responsive to change, with preliminary tests suggesting a mean decline of 1 point on the BCSS total scale signifies a substantial reduction in symptom severity. Mean score changes considered small (minimal) to moderate and changes observed during the course of an acute exacerbation of COPD have not yet been determined. The purpose of this study was to evaluate the magnitude of mean change scores observed in the BCSS under various scenarios and develop empirically based guidelines for interpreting results of clinical trials and observational or naturalistic studies in which this instrument is used.
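The scoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the scoring software used in the trials: the dictionary layout, the function names, and the choice of an arithmetic mean to aggregate daily totals into a period score are assumptions made for the example (the text says only that daily scores are aggregated over time).

```python
from statistics import mean

# The three BCSS items; each is rated 0 (least severe) to 4 (most severe).
ITEMS = ("breathlessness", "cough", "sputum")


def daily_total(entry):
    """Sum the three item scores of one diary day into a total of 0 to 12."""
    for item in ITEMS:
        if entry[item] not in range(5):
            raise ValueError(f"{item} score must be an integer from 0 to 4")
    return sum(entry[item] for item in ITEMS)


def period_score(entries):
    """Aggregate daily totals over a period (assumed here: arithmetic mean)."""
    return mean(daily_total(e) for e in entries)


# Hypothetical diary entries, invented for illustration.
baseline_days = [
    {"breathlessness": 2, "cough": 2, "sputum": 1},
    {"breathlessness": 3, "cough": 2, "sputum": 1},
]
treatment_days = [
    {"breathlessness": 2, "cough": 1, "sputum": 1},
    {"breathlessness": 1, "cough": 1, "sputum": 1},
]

change = period_score(treatment_days) - period_score(baseline_days)
print(f"Change in BCSS total score: {change:+.2f}")
```

A negative change indicates fewer or milder symptoms, which matches the sign convention used for ΔBCSS throughout the article.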

Materials and Methods

Design

Secondary analyses were performed on available, blinded data from 3,643 patients undergoing treatment during the course of three phase-IIIa, multicenter, multinational, randomized, double-blind, placebo-controlled clinical trials evaluating the safety and efficacy of sibenadet, a novel dual-dopamine D2 receptor/β2 adrenoreceptor agonist developed to treat bronchoconstriction and ameliorate respiratory symptoms in patients with COPD.14,15 Two of the trials involved a 12-week treatment period (n = 2,440), and the third trial was 26 weeks in length (n = 1,203).15 The statistical analysis plan for the secondary analysis was developed by the authors based on the study protocols before receiving the data, and all analyses were performed across treatment groups.

Study Subjects

All patients were between 35 years and 80 years of age with a history of COPD ≥ 2 years, a smoking history, and a percentage of predicted FEV1 between 20% and 70%. Exclusion criteria included a complicating comorbid condition or need for domiciliary oxygen.15

CHEST / 124 / 6 / DECEMBER, 2003

Measures

In these trials, the BCSS was administered as a daily diary, yielding multiple assessments over time for each subject. Consistent with its use in clinical trials14,15 and the validation study,12 daily scores for each patient were aggregated over the baseline period and again over the final 4 weeks of treatment, defined by the end of study or study discontinuation. Unless otherwise specified, change scores reflect the difference between these two values.

In addition to the BCSS, physician and patient ratings of treatment efficacy at the end of the study, FEV1, SGRQ scores, evening peak flow (PEF) [available in two of the three studies], rescue medication use, and physician-determined severity of the first exacerbation during the study period were used in the analyses. Physician and patient ratings of treatment efficacy were appraised on a 5-point scale, from “highly effective” to “made condition worse.” The SGRQ is a widely used, condition-specific measure of health status or HRQL.10 Scores range from 0 to 100, with higher scores indicating poorer health status. Guidelines for interpretation suggest changes in the SGRQ total score of ± 4 points are clinically meaningful, ± 8 are considered moderate, while ± 12 are interpreted as a large change in health status.16

For the subgroup of patients experiencing at least one exacerbation during the study period, symptom severity (BCSS), PEF, and rescue medication use prior to, during, and immediately following the first exacerbation were examined for the subgroup as a whole and by physician-defined severity of exacerbation (mild, moderate, severe). COPD exacerbations were defined as worsening symptoms of COPD requiring drug therapy in addition to study drug, rescue medication, and doses of concomitant COPD medications.14

Analytical Approach

Analyses were performed across treatment groups using aggregated data from the three clinical trials.
A triangulation approach was used, where mean score changes on the BCSS were examined in light of mean changes observed in related, commonly used clinical parameters and statistical indicators of magnitude. The following clinical parameters were used in this study: patient and physician ratings of treatment efficacy, FEV1, SGRQ score, presence of an exacerbation, and PEF and rescue medication use during an exacerbation. Percentage change and effect size (ES) served as statistical indicators. The analytical procedures were as follows.

First, three different methods were used to stratify the data into five groups based on ratings of treatment efficacy: highly effective, moderately effective, mildly effective, not effective, and condition made worse. The three rating methods were physician rating of treatment efficacy, patient rating of treatment efficacy, and convergence of physician and patient ratings, that is, the subgroup of patients for whom both the physician and the patient agreed on their ratings of treatment efficacy. The purpose of this procedure was to provide insight into the magnitude of BCSS score changes associated with various levels of treatment efficacy as perceived by physicians and patients. Although the stratification process reduced the effective sample size from the 3,643 randomized patients to 2,971 patients, due to the availability of ratings data as well as complete data on all clinical parameters, the relatively large sample sizes in four of the five efficacy subgroups (n = 195 to n = 1,063) engendered confidence in the stability of the parameter estimates. Results for the subgroup of patients rated “worse” should be considered preliminary due to relatively small sample sizes (n = 24 to n = 62). Because this study was designed to develop interpretation guidelines, ie, to assign meaning to change scores rather than to test the statistical significance of change, only one series of statistical tests was performed at this step in the analyses. Paired t tests were used to confirm that any changes in BCSS score observed in the smaller, “not effective” groups were not significantly different from zero. The statistical significance of change observed in the remaining groups was not relevant to the purposes of this study.

Second, the mean BCSS total and item change scores, stratified by level of treatment efficacy using the three methods outlined above, were juxtaposed with percentage change in symptom severity, statistical ES, ΔFEV1, and ΔSGRQ score. ES estimates were computed by subtracting the baseline mean value (Mb) from the treatment mean value (Mt) and dividing by the baseline variance in BCSS score (σb) [ES = (Mt − Mb)/σb], and were interpreted as follows: values at or near 0.10 were considered small, values at or near 0.30 were regarded as moderate, and values near or > 0.50 were interpreted as large.17 Changes in SGRQ score were evaluated in light of recommended guidelines for the measure as outlined above.16 Percentage change and ΔFEV1 provided further clinical context for the interpretation of changes in BCSS score.

Finally, to further evaluate the meaning associated with ΔBCSS scores under conditions of known change in the patient’s clinical condition, scores during the 7 days preceding a physician-documented exacerbation, the first 7 days of the exacerbation, and the first 7 days following medically confirmed resolution were also examined. First, mean daily scores were computed across patients and exhibited graphically, to note the daily patterns of symptom change during this period. Mean values were then computed for each of the three time periods, consistent with the principle of data aggregation over time used in clinical trials. These mean values were then juxtaposed with data on percentage ΔBCSS score, ES, and available data on PEF and rescue medication use during these time periods. The BCSS was also tested for sensitivity to change under this clinical condition using repeated-measures analyses of variance. PEF and rescue medication use were subjected to the same statistical tests to confirm that the three parameters were all moving in the same direction, ie, reflected expected changes in the clinical condition of these patients. Finally, Spearman rank-correlation coefficients (rs) were used to evaluate the consistency of patient ranking based on the magnitude of changes in BCSS score, PEF, and rescue medication use during exacerbations.
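The two statistical indicators of magnitude described above, percentage change and ES = (Mt − Mb)/σb, can be illustrated with a short Python sketch. The function names are hypothetical, and σb is treated here as the baseline standard deviation, which is an assumption of this example (the abbreviation list describes σb as the baseline variance in BCSS score). The input values are rounded figures used only for illustration and do not reproduce any subgroup-specific calculation.

```python
def percent_change(baseline_mean, treatment_mean):
    """Percentage change from baseline: 100 * (Mt - Mb) / Mb."""
    return 100.0 * (treatment_mean - baseline_mean) / baseline_mean


def effect_size(baseline_mean, treatment_mean, baseline_sd):
    """ES = (Mt - Mb) / sigma_b, with sigma_b assumed to be the baseline SD."""
    return (treatment_mean - baseline_mean) / baseline_sd


# Illustrative inputs: a baseline mean of 5.18 (SD 1.97) and a treatment
# mean 1.26 points lower. These are assumptions for the example only.
mb, sd_b = 5.18, 1.97
mt = mb - 1.26

print(f"Percentage change: {percent_change(mb, mt):.1f}%")
print(f"Effect size (signed): {effect_size(mb, mt, sd_b):.2f}")
```

The sketch returns a signed ES so that the direction of change is kept; a magnitude-only value can be obtained with `abs()`.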

Results

Of the 3,643 patients enrolled in the three trials, 2,971 patients were eligible for these secondary analyses based on the existence of data necessary for the analyses outlined above. Within this group, 1,348 patients had an exacerbation during the course of the trial, of whom 713 patients had BCSS data for use in these analyses. Demographic and clinical characteristics of the analytical sample are provided in Table 1. Mean baseline BCSS total score for the sample was 5.18 ± 1.97 on the 12-point scale (mean ± SD), with 90% of the sample scoring between 2 and 9.

Table 1. Sample Demographic and Clinical Characteristics*

| Characteristics | Data |
| --- | --- |
| Sample size, No.† | 2,971 |
| Male gender | 2,224 (75) |
| Mean age, yr | 63.5 ± 8.8 |
| White race | 2,936 (98.8) |
| Smoking history, pack-years | 45.4 ± 13.73 |
| Pulmonary function | |
| FEV1, L | 1.29 ± 0.49 |
| FEV1 % predicted | 41.3 ± 13.6 |
| FVC, L | 2.62 ± 0.84 |
| FEV1/FVC | 0.50 ± 0.12 |
| Exacerbations‡ | |
| ≥ 1 | 1,348 (45) |
| Complete BCSS data for analysis | 713 (53) |

*Data are presented as mean ± SD or No. (%) unless otherwise indicated.
†With data suitable for secondary analysis.
‡Physician-identified exacerbation of COPD.

Triangulation results for the BCSS total and item scores for patients with complete data on all indicators are provided in Tables 2 and 3. Treatment considered highly efficacious was associated with a mean ΔBCSS total score of more than 1.0 point (ie, beyond −1.0), which corresponded to > 20% improvement in symptoms and an ES > 0.60, exceeding the 0.50 value considered large. Highly effective treatment was also associated with a ΔSGRQ total score greater than the 8-point change regarded as a moderate change in health status, and an 8 to 10% improvement in FEV1. For patients in whom treatment was considered moderately efficacious, a mean ΔBCSS total score of at least −0.60 was observed, corresponding to > 10% improvement in symptoms and an ES of at least 0.30, considered moderate. Mean ΔSGRQ total score exceeded the 4-point guideline for meaningful improvement in health status, while FEV1 improved by approximately 5%. Finally, mean ΔBCSS total score associated with treatment considered mildly effective was approximately −0.35, representing a 6 to 7% improvement in symptoms and an ES nearing 0.20, larger than the 0.10 regarded as small. In this case, mean ΔSGRQ score was less than the 4-point guideline for minimal change, with no change in FEV1. Mean BCSS total score did not change in the subgroup of patients for whom the treatment was considered ineffective (t = −0.12, p = 0.90 for physician rating of treatment efficacy; t = −1.89, p = 0.06 for patient rating of treatment efficacy; and t = −1.47, p = 0.14 for the subgroup of patients and physicians who concurred on their efficacy ratings).

Mean changes in the three items comprising the BCSS are shown in Table 3. Because of the consistency in results across the three methods, only results for physician ratings of treatment efficacy are shown. Change scores for SGRQ subscales are provided to further aid in the interpretation process. Breathlessness, cough, and sputum scores for treatment considered highly effective changed by an average of −0.56, −0.40, and −0.30, respectively, from baseline to the end of treatment. These changes reflected > 20% improvement in each of the three symptoms and ESs of 0.79 (breathlessness, large), 0.52 (cough, moderate to large), and 0.37 (sputum, moderate). Concomitant ΔFEV1 and ΔSGRQ score are provided in Table 2.

Mean daily BCSS total scores over time for the exacerbation period by severity of exacerbation are shown in Figure 1. Mean values for each of the three arbitrarily defined phases of exacerbation and mean change over time are shown in Table 4. As expected, there were significant time and severity-by-time interaction effects for the BCSS total score (p < 0.0001). Rescue medication use also showed a significant time (p < 0.0001) and severity-by-time interaction (p < 0.001). Mean ΔBCSS scores for periods of symptomatic decline (1.24, 24% deterioration) and recovery (−1.28, 20% improvement) were similar to those observed with highly effective treatment. PEF data (n = 448) showed a 4% decline and 5% improvement during deterioration and recovery, respectively. As one might expect, the most remarkable symptomatic improvements were seen in patients recovering from a severe exacerbation, where a mean ΔBCSS total score of −1.81 was observed (n = 99), with a parallel (24%) decline in rescue medication use (n = 69). The individual item scores deteriorated an average of 0.31 ± 0.60 for breathlessness, 0.50 ± 0.73 for cough, and 0.43 ± 0.68 points for sputum; with recovery, scores improved by −0.39 ± 0.66, −0.49 ± 0.75, and −0.39 ± 0.70 for the three symptoms, respectively.

There was a statistically significant correspondence between ΔBCSS score and change in both PEF and rescue medication use during the course of the exacerbation (p < 0.0001). As peak flow declined, BCSS total score increased (symptoms became more severe; rs = 0.51). Correlations between PEF and the three symptoms comprising the BCSS were −0.44, −0.46, and −0.45 for breathlessness, cough, and sputum, respectively. Similarly, as BCSS scores increased, there was a concomitant increase in rescue medication use (rs = 0.37 for the total score; rs = 0.40, rs = 0.30, and rs = 0.27 for breathlessness, cough, and sputum, respectively). During the recovery period, BCSS scores (symptoms) improved as PEF improved (rs = 0.47 for the total score; rs = −0.45, rs = −0.34, and rs = −0.36 for breathlessness, cough, and sputum, respectively) and rescue medication use declined (rs = 0.39 for the total score; rs = 0.40, rs = 0.29, and rs = 0.29 for breathlessness, cough, and sputum, respectively).
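For readers unfamiliar with the Spearman rank-correlation coefficient (rs) reported above, the following pure-Python sketch shows one standard way to compute it: rank each variable (averaging ranks for ties) and take the Pearson correlation of the ranks. It is not the authors' analysis code, the function names are hypothetical, and the patient-level values are invented for illustration.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rs: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-patient changes during an exacerbation (invented data):
delta_bcss = [1.5, 0.5, 2.0, 1.0, 0.0]   # symptom score change
delta_pef = [-20, -5, -30, -10, 0]       # peak flow change

print(f"rs = {spearman(delta_bcss, delta_pef):.2f}")
```

In this invented data set the patients with the largest symptom increases also have the largest falls in peak flow, so rs is close to −1; real data would show weaker agreement.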

Table 2. Change From Baseline to End of Study in Mean BCSS Total Score by Physician and Patient Ratings of Treatment Efficacy Juxtaposed With Mean Δ SGRQ Total Score and FEV1

Column groups: Physician (n = 2,971); Patient (n = 2,272); Physician and Patient* (n = 1,887).

| Efficacy Rating† | Statistic | Physician: BCSS Total | Physician: SGRQ Total | Physician: FEV1, L | Patient: BCSS Total | Patient: SGRQ Total | Patient: FEV1, L | Physician and Patient*: BCSS Total | Physician and Patient*: SGRQ Total | Physician and Patient*: FEV1, L |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Highly effective | Mean (SD) Δ | −1.26 (1.81) | −10.27 (13.82) | 0.14 (0.39) | −1.13 (1.85) | −9.78 (13.05) | 0.11 (0.34) | −1.36 (1.82) | −11.73 (14.16) | 0.15 (0.39) |
| | No. | 257 | 257 | 257 | 428 | 428 | 428 | 195 | 195 | 195 |
| | Change, % | −24.3 | −21.3 | 10.0 | −21.6 | −20.0 | 8.6 | −26.7 | −24.5 | 10.0 |
| | ES | 0.67 | 0.60 | 0.27 | 0.60 | 0.58 | 0.22 | 0.77 | 0.69 | 0.27 |
| Moderately effective | Mean (SD) Δ | −0.67 (1.82) | −6.80 (12.59) | 0.08 (0.31) | −0.60 (1.79) | −6.35 (12.39) | 0.06 (0.31) | −0.63 (1.78) | −7.06 (12.77) | 0.08 (0.32) |
| | No. | 965 | 965 | 965 | 1,063 | 1,063 | 1,063 | 666 | 666 | 666 |
| | Change, % | −13.0 | −14.0 | 5.8 | −12.0 | −13.1 | 4.4 | −12.4 | −14.4 | 5.8 |
| | ES | 0.34 | 0.41 | 0.16 | 0.31 | 0.39 | 0.12 | 0.33 | 0.43 | 0.17 |
| Mildly effective | Mean (SD) Δ | −0.36 (1.74) | −2.56 (11.73) | 0.01 (0.25) | −0.33 (1.69) | −1.39 (11.07) | 0.00 (0.25) | −0.37 (1.75) | −1.26 (11.69) | 0.00 (0.26) |
| | No. | 891 | 891 | 891 | 798 | 798 | 798 | 486 | 486 | 486 |
| | Change, % | −6.9 | −5.2 | 0.6 | −6.3 | −2.8 | 0.0 | −7.0 | −2.5 | 0.2 |
| | ES | 0.18 | 0.15 | 0.02 | 0.17 | 0.08 | 0.01 | 0.19 | 0.07 | 0.01 |
| Not effective | Mean (SD) Δ | −0.01 (1.65) | 0.67 (10.78) | −0.02 (0.23) | 0.12 (1.64) | 1.44 (10.64) | −0.01 (0.24) | 0.11 (1.62) | 1.65 (10.46) | −0.01 (0.24) |
| | No. | 819 | 819 | 819 | 621 | 621 | 621 | 516 | 516 | 516 |
| | Change, % | 0.1 | 1.4 | −1.5 | 2.4 | 3.0 | 0.1 | 2.0 | 3.4 | −0.6 |
| | ES | 0.00 | 0.04 | 0.04 | 0.06 | 0.09 | 0.02 | 0.05 | 0.10 | 0.02 |
| Made worse | Mean (SD) Δ | −0.25 (1.81) | 5.05 (9.34) | −0.05 (0.21) | 0.11 (1.84) | 6.48 (11.78) | 6.48 (11.78) | −0.53 (1.89) | 5.81 (9.62) | −0.06 (0.18) |
| | No. | 39 | 39 | 39 | 62 | 62 | 62 | 24 | 24 | 24 |
| | Change, % | −4.3 | 9.7 | −5.3 | 1.9 | 12.8 | 12.8 | −8.7 | 10.9 | −6.2 |
| | ES | 0.11 | 0.31 | 0.14 | 0.05 | 0.43 | 0.43 | 0.25 | 0.39 | 0.18 |

*Subgroup in which physicians and patients concurred on treatment efficacy.
†ES = (Mt − Mb)/σb.

Table 3. Change From Baseline to End of Study in Mean BCSS Item Scores by Physician Rating of Treatment Efficacy, Juxtaposed With Mean Change in SGRQ Subscale Scores

Column groups: BCSS Items (Breathlessness, Cough, Sputum); SGRQ Subscales (Symptoms, Activities, Impacts).

| Efficacy Rating* | Statistic | Breathlessness | Cough | Sputum | Symptoms | Activities | Impacts |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Highly effective | Mean (SD) Δ | −0.56 (0.70) | −0.40 (0.73) | −0.30 (0.76) | −14.91 (20.49) | −9.01 (17.00) | −9.46 (16.34) |
| | No. | 257 | 257 | 257 | 257 | 257 | 257 |
| | Change, % | −28.7 | −23.1 | −20.1 | −25.2 | −14.5 | −25.7 |
| | ES | 0.79 | 0.52 | 0.37 | 0.70 | 0.44 | 0.50 |
| Moderately effective | Mean (SD) Δ | −0.35 (0.71) | −0.20 (0.75) | −0.11 (0.72) | −9.88 (19.80) | −5.89 (15.04) | −6.35 (14.78) |
| | No. | 965 | 965 | 965 | 965 | 965 | 965 |
| | Change, % | −18.1 | −11.8 | −7.6 | −17.0 | −9.3 | −16.7 |
| | ES | 0.48 | 0.25 | 0.14 | 0.48 | 0.31 | 0.33 |
| Mildly effective | Mean (SD) Δ | −0.21 (0.69) | −0.13 (0.72) | −0.02 (0.72) | −5.27 (19.18) | −1.99 (14.36) | −1.95 (14.24) |
| | No. | 891 | 891 | 891 | 891 | 891 | 891 |
| | Change, % | −10.4 | −7.6 | −1.2 | −9.3 | −3.1 | −5.1 |
| | ES | 0.28 | 0.16 | 0.02 | 0.25 | 0.11 | 0.10 |
| Not effective | Mean (SD) Δ | −0.05 (0.66) | −0.03 (0.70) | 0.08 (0.69) | −1.46 (19.84) | 0.42 (13.45) | 1.54 (13.05) |
| | No. | 819 | 819 | 819 | 819 | 819 | 819 |
| | Change, % | −2.6 | −1.9 | 5.7 | −2.5 | 0.6 | 4.2 |
| | ES | 0.07 | 0.04 | 0.09 | 0.07 | 0.02 | 0.09 |
| Made worse | Mean (SD) Δ | 0.02 (0.67) | −0.11 (0.72) | −0.16 (0.75) | 5.98 (16.99) | 4.44 (14.39) | 5.06 (10.54) |
| | No. | 39 | 39 | 39 | 39 | 39 | 39 |
| | Change, % | 1.0 | −6.2 | −1.02 | 9.8 | 6.2 | 13.3 |
| | ES | 0.03 | 0.11 | 0.16 | 0.29 | 0.25 | 0.27 |

*ES = (Mt − Mb)/σb.

Discussion The purpose of this study was to develop guidelines for interpreting group-level changes in BCSS scores in clinical studies of COPD. A triangulation method was used, juxtaposing BCSS change scores by clinical indicators of treatment success commonly used in clinical practice and trials involving patients with COPD (physician and patient perception of treatment efficacy, FEV1, and SGRQ scores) along with statistical indicators of magnitude (percentage change and ES). Daily and weekly BCSS scores were also examined under conditions of known change, that is, from the period immediately prior to a medically confirmed exacerbation to the first 7 days of the exacerbation period, and the first 7 days following physician determined resolution. In this case, ⌬BCSS scores were examined in light of two objective clinical measures commonly used to evaluate treatment success under these conditions (PEF and rescue medication use), as well as the statistical indicators of magnitude. It is important to note that this study was a secondary analysis of an existing data set, rather than a prospective study of change. In order to fully utilize the available data without bias, a comprehensive statistical analysis plan was developed by investigawww.chestjournal.org

tors who did not develop the instrument and who were blind to the data at the time the plan was written (N.K.L., J.S.). These investigators performed the analyses independently and according to the prespecified, a priori analysis plan. All results were examined and critiqued by clinical investigators (S.I.R., M.G.), who also assisted in the interpretation and presentation of results. Thus, every step possible was taken to minimize bias. The results of this secondary analysis suggests that a mean improvement in BCSS total score with treatment approaching or exceeding 1 point represents a substantial improvement in symptoms in the COPD population. This value corresponded to ratings by both physicians and patients that the treatment was highly efficacious. The corresponding mean ⌬SGRQ total score exceeded the guideline of 8 points considered a large improvement in health status, indicating this level of symptom relief was accompanied by substantial improvements in patient functioning and well-being. Symptomatic improvements ⬎ 1 point were also observed during recovery from acute exacerbations of COPD, and were accompanied by improvements in PEF and a reduction in rescue medication use. Together, these results indicate mean changes ⬎ 1 point on the BCSS total CHEST / 124 / 6 / DECEMBER, 2003

2187

Figure 1. Mean BCSS total score for patients with complete data (n ⫽ 713, total; n ⫽ 219, mild; n ⫽ 395, moderate; n ⫽ 99, severe) for the pre-exacerbation period (7 days before exacerbation), exacerbation period (first 7 days of exacerbation), and postexacerbation period (first 7 days after medically confirmed resolution) by exacerbation severity. (Sample variability [SD] for the three severity groups during exacerbation: 2.18 to 2.38, mild, 2.18 to 2.44, moderate; and 2.24 to 2.64, severe).

score represent a relatively dramatic improvement in symptoms for patients with COPD. That is not to say that further improvement would not be possible or desirable, but rather that treatment leading to a group-level mean change of ⱖ 1 point on the BCSS total score corresponds to substantial symptomatic relief in this population. Mean improvements in BCSS total score of ⫺ 0.60 to ⫺ 0.70 were observed in patients for whom treatment was considered moderately effective. This change was associated with smaller yet substantial percentage improvements in symptoms, along with clinically meaningful changes in health status or HRQL. Once again, patients experienced sufficient symptomatic relief to also realize meaningful, though less dramatic, improvements in daily function and well-being. Together, these results suggest that a mean ⌬BCSS total score of or near 0.60 can be interpreted as a moderate-to-large symptomatic improvement from both clinical and empirical perspectives. A mean ⌬BCSS total score of ⫺ 0.30 to ⫺ 0.40 accompanied treatment considered mildly efficacious and corresponded to 6 to 7% improvement in symptoms and negligible change in HRQL. How2188

ever, results from interviews with COPD patients conducted as part of a cognitive debriefing study for the BCSS suggest patients can perceive and appreciate a small improvement in symptoms, regardless of any concomitant improvement in day-to-day activities (data not shown). One could argue, therefore, that a ⌬BCSS total score ⬎ 0.30 to 0.35, though small, represents clinically meaningful symptomatic relief to patients with COPD. Mean BCSS values of ⫺ 0.01 to ⫺ 0.12 were observed in patients for whom treatment was perceived, by physicians and patients, as ineffective. No changes in pulmonary function or health status were observed in this group. These results attest to the reproducibility of the BCSS under conditions of clinical stability. They also cast doubt on the clinical significance of any treatment in which a mean ⌬BCSS total score of ⬍ 0.20 is observed. Interpretation of individual items comprising the BCSS requires further quantitative and qualitative study of symptom-specific change. However, based on the results of this and previous studies,12 mean change scores of 0.25 to 0.50 on breathlessness, 0.15 to 0.40 on cough, and 0.10 to 0.30 on sputum could be interpreted as moderate-to-large changes, with Clinical Investigations

www.chestjournal.org

CHEST / 124 / 6 / DECEMBER, 2003

2189

Table 4. Mean BCSS Total Scores Immediately Prior, During, and Immediately After Acute Exacerbation by Exacerbation Severity, Juxtaposed With PEF and Rescue Medication Use*

| Measure | Physician Rating of Exacerbation Severity† | No. | Prior, Mean (SD) | During, Mean (SD) | Change, Mean (SD) | Change, % | ES | After, Mean (SD) | Change, Mean (SD) | Change, % | ES |
|---|---|---|---|---|---|---|---|---|---|---|---|
| BCSS Total Score | Overall | 713 | 5.29 (2.15) | 6.54 (2.24) | 1.24 (1.74) | 23.5 | 0.58 | 5.26 (2.15) | −1.28 (1.80) | −19.6 | 0.57 |
| BCSS Total Score | Mild | 219 | 4.77 (2.10) | 5.78 (2.19) | 1.01 (1.51) | 21.2 | 0.48 | 4.94 (2.19) | −0.84 (1.53) | −14.6 | 0.39 |
| BCSS Total Score | Moderate | 395 | 5.33 (2.05) | 6.72 (2.15) | 1.39 (1.77) | 26.0 | 0.68 | 5.33 (2.10) | −1.39 (1.80) | −20.7 | 0.65 |
| BCSS Total Score | Severe | 99 | 6.32 (2.28) | 7.50 (2.20) | 1.19 (2.03) | 18.8 | 0.52 | 5.70 (2.21) | −1.81 (2.10) | −24.1 | 0.82 |
| PEF, mL | Overall | 448 | 248 (83.6) | 238 (83) | −10 (22) | −4.0 | 0.12 | 250 (84.2) | 12.18 (28.0) | 5.1 | 0.15 |
| PEF, mL | Mild | 120 | 269.71 (83.71) | 261 (83) | −9 (19) | −3.3 | 0.11 | 266.27 (85.25) | 5.36 (10.13) | 2.1 | 0.06 |
| PEF, mL | Moderate | 259 | 247.29 (83.92) | 237 (83) | −11.60 (24) | −4.3 | 0.13 | 249.56 (83.92) | 12.87 (27.31) | 5.4 | 0.16 |
| PEF, mL | Severe | 69 | 213.33 (70.67) | 204 (73) | −9 (23) | −4.3 | 0.13 | 225.49 (78.22) | 21.42 (38.00) | 10.5 | 0.29 |
| Rescue Medication Use, Puffs/d | Overall | 683 | 4.6 (4.2) | 5.3 (6.0) | 0.8 (3.8) | 16.6 | 0.18 | 4.6 (4.2) | −0.7 (3.5) | −12.4 | 0.11 |
| Rescue Medication Use, Puffs/d | Mild | 205 | 3.4 (3.8) | 4.0 (4.4) | 0.6 (2.4) | 17.3 | 0.15 | 3.9 (4.1) | −0.1 (2.1) | −2.2 | 0.02 |
| Rescue Medication Use, Puffs/d | Moderate | 384 | 4.7 (4.0) | 5.4 (4.3) | 0.70 (1.9) | 14.7 | 0.17 | 4.8 (3.9) | −0.7 (2.3) | −12.2 | 0.15 |
| Rescue Medication Use, Puffs/d | Severe | 94 | 6.7 (5.3) | 8.1 (11.6) | 1.4 (8.7) | 21.2 | 0.27 | 6.2 (5.9) | −1.9 (7.3) | −24.0 | 0.17 |

*Patients with complete data over three time periods (n = 713): prior (7 days before exacerbation), during (first 7 days of examination), and after (first 7 days after medically confirmed resolution).
†Change, % Change, and ES are for changes from the before to during and from during to after exacerbation periods. ES = (Mt − Mb)/σb.

the descending values across the three symptoms consistent with the dominance of breathlessness in the selection and evaluation of treatment in this population. These values are proposed as a basis for further clinical and qualitative investigation. Results of this study also substantiate the need for observational and qualitative studies of the pattern and meaning associated with day-to-day symptom variability and the predictive value of increasing scores over time in patients with COPD. The current study showed a progressive increase in symptoms prior to medical confirmation of an exacerbation, implying a delay in seeking treatment. Unfortunately, this waiting period impedes timely intervention for acute exacerbations to reduce the severity and duration of the exacerbation.18 The tendency for patients to tolerate increasing symptoms may also contribute to the underreporting of exacerbations described by Seemungal and colleagues.19,20 Daily monitoring of the key symptoms of COPD may be a useful clinical tool for helping patients track the status of their disease, detect subtle yet clinically meaningful increases in symptom severity, modify their treatment or seek care, and initiate early treatment of a developing exacerbation. Consensus documents and new guidelines for the classification, treatment, and management of COPD1 suggest symptom management should play a primary role in the treatment of stable disease. In this context, the evaluation of new treatments requires a robust quantitative method for symptom assessment. The current study indicates daily patient reports of symptom severity, using instruments such as the BCSS, are sensitive to the effects of treatment in this population, and could therefore be used to assess therapeutic interventions. Current guidelines also recognize exacerbations as therapeutic targets in COPD. 
Several therapies have been suggested to decrease exacerbation frequency, severity, or duration21–25; these studies have been hampered, however, by a lack of consensus on the definition for both exacerbation events and severity. A recent consensus panel26 suggested tying exacerbation definition and severity assessment to health-care utilization. While such definitions are operationally convenient and relevant for health economic analyses, variations in health-care delivery systems complicate the generalizability of results obtained from studies using them. Since COPD exacerbations are characterized by symptomatic worsening, an instrument such as the BCSS, which provides simple and robust quantification of symptoms and can be repeated on a daily basis, may prove particularly valuable for evaluating exacerbations in patients with COPD.

In summary, the purpose of this study was to establish guidelines for interpreting mean change scores in clinical trials in which the BCSS is used to evaluate symptomatic improvement in patients with COPD. A triangulation approach was used, in which changes in BCSS scores were evaluated in light of changes observed in familiar clinical and statistical indicators of improvement in this population. Based on this evaluation, a mean improvement of ≥ 1.0 on the BCSS total score is an indication that, on average, patients in a given treatment group experienced dramatic symptomatic relief with treatment; changes of −0.60 to −0.70 represent moderate-to-large symptomatic improvement; and mean changes at or near −0.35 indicate the treatment under evaluation provides small, but clinically meaningful symptom relief. It is doubtful that treatment with a group-level mean change smaller in magnitude than 0.20 offers meaningful symptomatic benefit. A prospective study with a priori hypotheses addressing change in BCSS scores and other clinical indicators of improvement, along with qualitative interviews with patients and clinicians concerning the magnitude of symptomatic change with treatment, would be needed to further this tool for clinical research. Nonetheless, these results provide evidence that patient-reported diary card data can be sensitive to change in patients with COPD and may serve as valuable tools both for clinical research and for assessing the effect of medical and therapeutic interventions in such patients.
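The interpretive guidelines summarized above can be illustrated with a minimal Python sketch. It is not part of the study: the function names are hypothetical, and because the text gives anchor values (1.0, 0.60 to 0.70, about 0.35, and 0.20) rather than contiguous bands, the cut-points placed between those anchors are an assumption made only for illustration. The effect size follows the formula in the Table 4 footnote, (Mt − Mb)/σb.

```python
def effect_size(mean_before: float, mean_after: float, sd_before: float) -> float:
    """Signed effect size per the Table 4 footnote: (Mt - Mb) / sigma_b."""
    if sd_before <= 0:
        raise ValueError("sd_before must be positive")
    return (mean_after - mean_before) / sd_before


def interpret_mean_change(delta_bcss: float) -> str:
    """Label a group-level mean change in BCSS total score.

    Negative values mean fewer symptoms. The band edges between the
    published anchor values are an ASSUMPTION of this sketch.
    """
    improvement = -delta_bcss
    if improvement >= 1.0:
        return "substantial"
    if improvement >= 0.6:
        return "moderate"
    if improvement >= 0.3:
        return "small"
    if improvement < 0.2:
        return "doubtful benefit"
    return "borderline"  # 0.2 to 0.3: not characterized in the text


if __name__ == "__main__":
    # Overall BCSS recovery from exacerbation in Table 4:
    # during 6.54 (SD 2.24), after 5.26.
    es = effect_size(6.54, 5.26, 2.24)
    print(f"ES magnitude: {abs(es):.2f}")
    print(interpret_mean_change(5.26 - 6.54))
```

Table 4 reports effect sizes as magnitudes, so the sketch takes the absolute value when comparing with the published figures. These labels apply to group means in a trial, not to changes in an individual patient.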

Appendix: BCSS

The BCSS was used in clinical trials as part of a patient's daily diary card. Below is a sample instruction page from the diary card.

Please complete in the evening (prior to going to bed):
Please enter day (Monday, Tuesday, etc.):
Please record the date (day/month):

How much difficulty did you have breathing today?
0 = None: unaware of any difficulty
1 = Mild: noticeable during strenuous activity (eg, running)
2 = Moderate: noticeable during light activity (eg, bedmaking)
3 = Marked: noticeable when washing or dressing
4 = Severe: almost constant, present even when resting

How was your cough today?
0 = None: unaware of coughing
1 = Rare: cough now and then
2 = Occasional: less than hourly
3 = Frequent: one or more times an hour
4 = Almost constant: never free of cough or need to cough

How much trouble was your sputum today?
0 = None: unaware of any difficulty
1 = Mild: rarely caused problem
2 = Moderate: noticeable as a problem
3 = Marked: caused a great deal of inconvenience
4 = Severe: an almost constant problem

For permission to use this instrument, contact Dr. Mitchell Goldman at [email protected]
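As a rough illustration of how diary entries from this instrument might be handled in analysis, the following Python sketch validates the three 0-to-4 item ratings and derives a daily total. The scoring rule is an assumption here: this appendix does not state it, and the sketch takes the total to be the simple sum of the three items (range 0 to 12). The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class DiaryEntry:
    """One evening's BCSS ratings; each item is an integer from 0 to 4."""
    breathlessness: int
    cough: int
    sputum: int

    def __post_init__(self) -> None:
        for name in ("breathlessness", "cough", "sputum"):
            value = getattr(self, name)
            if not isinstance(value, int) or not 0 <= value <= 4:
                raise ValueError(f"{name} must be an integer from 0 to 4")

    @property
    def total(self) -> int:
        # ASSUMPTION: total score is the sum of the three items (0 to 12).
        return self.breathlessness + self.cough + self.sputum


def mean_total(entries: Sequence[DiaryEntry]) -> float:
    """Average daily total over a period, for example a 7-day window."""
    if not entries:
        raise ValueError("at least one diary entry is required")
    return sum(entry.total for entry in entries) / len(entries)


if __name__ == "__main__":
    week = [DiaryEntry(2, 1, 1), DiaryEntry(3, 2, 1), DiaryEntry(2, 2, 1)]
    print(mean_total(week))
```

Averaging over a window mirrors the 7-day periods used in Table 4, although how missing days were handled in the trials is not described in this section.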

References

1 National Heart, Lung, and Blood Institute. Global Initiative for Chronic Obstructive Lung Disease. Bethesda, MD: US Department of Health and Human Services, Public Health Service, National Institutes of Health, 2001. Publication No. 2701A. Available from: http://www.goldcopd.com/workshop/index.html; accessed November 10, 2003
2 Siafakas NM, Vermeire P, Price NB, et al. Optimal assessment and management of chronic obstructive pulmonary disease (COPD): The European Respiratory Society Task Force. Eur Respir J 1995; 8:1398–1420
3 American Thoracic Society Board of Directors. Standards for the diagnosis and care of patients with chronic obstructive pulmonary disease. Am J Respir Crit Care Med 1995; 152:S77–S120
4 COPD Guideline Group of the Standards of Care Committee of the British Thoracic Society. BTS guidelines for the management of chronic obstructive pulmonary disease. Thorax 1997; 52(suppl 5):S1–S28
5 Mahler DA, Faryniarz K, Tomlinson D, et al. Impact of dyspnea and physiologic function on general health status in patients with chronic obstructive pulmonary disease. Chest 1992; 102:395–401
6 Eltayara L, Becklake MR, Volta CA, et al. Relationship between chronic dyspnea and expiratory flow limitation in patients with chronic obstructive pulmonary disease. Am J Respir Crit Care Med 1996; 154(6 pt 1):1726–1734
7 McGavin CR, Artvinli M, Naoe H, et al. Dyspnoea, disability, and distance walked: comparison of estimates of exercise performance in respiratory disease. BMJ 1978; 2:241–243
8 Borg GA. Psychophysical bases of perceived exertion. Med Sci Sports Exerc 1982; 14:377–381
9 Selim AJ, Ren XS, Fincke G, et al. A symptom-based measure for the severity of chronic lung disease: results from the Veterans Health Study. Chest 1997; 111:1607–1614
10 Jones PW, Quirk FH, Baveystock CM, et al. A self-complete measure of health status for chronic airflow limitation: the St. George's Respiratory Questionnaire. Am Rev Respir Dis 1992; 145:1321–1327
11 Guyatt GH, Berman LB, Townsend M, et al. A measure of quality of life for clinical trials in chronic lung disease. Thorax 1987; 42:773–778
12 Leidy NK, Schmier J, Jones MKC, et al. Evaluating symptoms in COPD: Validation of the Breathlessness, Cough and Sputum Scale. Respir Med 2003; 97(Suppl A):S59–S70
13 Nunnally JC, Bernstein IH. Psychometric theory. 3rd ed. New York, NY: McGraw-Hill, 1994
14 Celli B, Halpin D, Hepburn R, et al. Symptoms are an important outcome in chronic obstructive pulmonary disease clinical trials: results of a 3-month comparative study using the Breathlessness, Cough and Sputum Scale (BCSS). Respir Med 2003; 97(Suppl A):S35–S43
15 Laursen LC, Lindqvist A, Hepburn T, et al. The role of the novel D2/β2-agonist, Viozan (sibenadet HCl), in the treatment of symptoms of COPD: results of a large-scale clinical investigation. Respir Med 2003; 97(Suppl A):S23–S33
16 Jones PW, Bosh TK. Quality of life changes in COPD patients treated with salmeterol. Am J Respir Crit Care Med 1997; 155:1283–1289
17 Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. Hillsdale, NJ: Lawrence Erlbaum Associates, 1988
18 Seemungal T, Harper-Owen R, Bhowmik A, et al. Respiratory viruses, symptoms, and inflammatory markers in acute exacerbations and stable chronic obstructive pulmonary disease. Am J Respir Crit Care Med 2001; 164:1618–1623
19 Seemungal T, Donaldson GC, Paul EA, et al. Effect of exacerbation on quality of life in patients with chronic obstructive pulmonary disease. Am J Respir Crit Care Med 1998; 157:1418–1422
20 Seemungal TAR, Donaldson GC, Bhowmik A, et al. Time course and recovery of exacerbations in patients with chronic obstructive pulmonary disease. Am J Respir Crit Care Med 2000; 161:1608–1613
21 Casaburi R, Mahler DA, Jones PW, et al. A long-term evaluation of once-daily inhaled tiotropium in chronic obstructive pulmonary disease. Eur Respir J 2002; 19:217–224
22 Vincken W, van Noord JA, Greefhorst AP, et al. Improved health outcomes in patients with COPD during 1 yr's treatment with tiotropium: Dutch/Belgian Tiotropium Study Group. Eur Respir J 2002; 19:209–216
23 Szafranski W, Cukier A, Ramirez A, et al. Efficacy and safety of budesonide/formoterol in the management of chronic obstructive pulmonary disease. Eur Respir J 2003; 21:74–81
24 Paggiaro PL, Dahle R, Bakran I, et al. Multicentre randomised placebo-controlled trial of inhaled fluticasone propionate in patients with chronic obstructive pulmonary disease: International COPD Study Group. Lancet 1998; 351:773–780
25 Burge PS, Calverley PM, Jones PW, et al. Randomised, double blind, placebo controlled study of fluticasone propionate in patients with moderate to severe chronic obstructive pulmonary disease: the ISOLDE trial. BMJ 2000; 320:1297–1303
26 Rodriguez-Roisin R. Toward a consensus definition for COPD exacerbations. Chest 2000; 117(Suppl 2):398S–401S
