Assessing drugs for the treatment of heart failure


JOHN REYNOLDS HAMPTON, M.D., Nottingham, United Kingdom

The aims of treatment in patients with heart failure, as with any other condition, are to relieve symptoms and prolong life. A secondary objective is to do so at the lowest possible economic cost. When treatment of the cause of heart failure is possible (which usually means some form of surgery, but could include treatment of a primary disease, such as thyrotoxicosis), then this is obviously the treatment of choice. For patients in whom there is no such definitive treatment, a wide and increasing variety of drugs is available. When new antifailure drugs are to be developed there are therefore two problems: first, to ensure that they are more effective than placebo treatment, and second, to compare them with existing drugs. Both of these tests can present difficulties: it is unreasonable to withhold established effective treatment in order to conduct a placebo-controlled trial, and when drugs are being compared, it is unlikely that any new medication will be dramatically superior to an old one. The more effective the treatments already available, the harder it becomes to evaluate a new drug.

It is obviously unethical to include patients who have heart failure, which causes severe breathlessness, fatigue, and ankle swelling, in a trial in which standard antifailure drugs are withheld. Trials in which patients are to be treated only with the new compound or with placebo are therefore limited to subjects with trivial symptoms. Digoxin and diuretics are effective in relieving symptoms, and at the very least a new drug can only be compared with placebo in a trial design that allows digoxin and diuretics to be given to all patients. In many patients, diuretic therapy may lead to as much symptomatic improvement as there is likely to be, so that the extra effect of the new drug may be difficult to evaluate. The problem is made worse by the fact that the exercise tolerance of many patients with heart failure improves spontaneously over a period of weeks or months. This is true when they are included in a clinical trial and are observed and frequently exercised in the hospital. A considerable “placebo effect” has been seen in trials of several agents, and this has led to doubts about their efficacy.

There are two possible approaches to this problem. Superficially, the simplest is to evaluate drug efficacy on the basis of a “quality-of-life” questionnaire. Scientifically, the best way to evaluate a drug would be to record measurements of patients’ exercise tolerance as accurately as possible. Unfortunately, both of these approaches have their problems.

ASSESSING EFFICACY IN RELIEF OF SYMPTOMS

Quality-of-life questionnaires can be general or can be disease-specific, with questions aimed at the particular symptoms of heart failure. There is no general agreement as to which type of questionnaire is most appropriate, and few questionnaires have been validated in patients with heart failure in a way that demonstrates that their results do indeed correlate with the severity of the disease. Although general questionnaires are preferred because they reflect the way patients generally feel, these questionnaires may record changes unrelated to heart failure; for example, they might merely record an improvement in well-being that results from the withdrawal of some other medication that caused side effects.

May 29, 1991

The American Journal of Medicine

Volume 90 (suppl 5B)

SYMPOSIUM ON IBOPAMINE / HAMPTON

Figure 1. Pattern of oxygen uptake in a single patient with heart failure exercised with two different protocols. [Two panels: modified Bruce protocol and fixed workload protocol; horizontal axis, minutes of exercise.]

Accurate evaluation of exercise capability is difficult because many well-established exercise tolerance tests are available, including treadmill tests, bicycle ergometer tests, and tests in which a patient’s ability to walk freely is evaluated. Within each of these groups of tests, different exercise protocols may provide different results. Treadmill testing is the most commonly used method of evaluating the response of patients with heart failure to new treatments, but there is no generally accepted exercise protocol. A rapid build-up of workload, as used in the Bruce protocol, is likely to have a different effect on a patient than a protocol in which the workload is increased slowly. In the former type of test, a patient is likely to stop because of breathlessness. In the latter type, onset of fatigue may well be more important.

Detailed measurement of the patient’s maximum oxygen uptake (V̇O2max) during exercise does not solve this problem. Figure 1 shows the pattern of oxygen uptake in a single patient with heart failure who was exercised to his symptom-limited maximum with two different treadmill protocols. When the Bruce protocol was used, there was a progressive increase in V̇O2 with each exercise stage until the test was discontinued because of breathlessness. When the patient performed a test at a fixed speed and slope on the treadmill, a plateau of V̇O2 was quickly achieved and maintained until the test was discontinued because of fatigue. The maximum V̇O2 was lower in the second test than in the first. Nonetheless, the value obtained in each case could not be taken to be the patient’s maximum V̇O2.

Figure 2 shows some results from a study (unpublished data) in which a group of 10 patients was exercised using the Bruce protocol, and then again using a fixed-speed protocol on a treadmill. The exercise duration with the former protocol was greater than with the latter. When these patients were treated with the vasodilator flosequinan, the mean exercise duration achieved with the two protocols was almost identical, indicating that the relative increase induced by flosequinan treatment was considerably greater when the fixed-speed protocol was used. Figure 3 shows similar data recalculated to indicate the total work done. The difference between the results of the exercise tests with the patients taking placebo has now been reversed, with the fixed-speed protocol indicating greater exercise tolerance than the Bruce protocol. Flosequinan treatment improved exercise tolerance, but the relative increase remained similar regardless of the exercise protocol used. These results indicate that a markedly different impression of a drug’s efficacy may be obtained if different exercise protocols are used, and if results are expressed in different ways.

Figure 2. Mean exercise duration in a group of patients treated at random with placebo or flosequinan, and tested with two different treadmill protocols. [Bar chart: Bruce and fixed protocols (treadmill exercise test); vertical scale 0 to 1,000; bars for placebo and flosequinan.]

Figure 3. Results of the same study shown in Figure 2, except that total work on the treadmill has been calculated. [Bar chart: fixed and Bruce protocols (treadmill exercise test); vertical scale 0 to 40.]

TABLE I
Percentage Increase in Exercise Time Measured by Four Different Tests in a Trial in Which Enoximone or Placebo Was Added to Maintenance Therapy with Captopril and Diuretics [1]

Test                      Improvement in Exercise Tolerance (%)
Treadmill, Bruce
Treadmill, fixed rate
Corridor walk
Pedometer at home
[Values not recoverable from the source; surviving fragment: “31 7x ;”]

Table I shows the results of a study [1] in which enoximone or placebo treatment was added in patients who still had symptoms of heart failure despite treatment with digoxin, diuretics, and captopril. Four different exercise tests were used: two involving a treadmill (Bruce protocol and fixed workload), a test in which the patient was allowed to walk down a corridor at his own speed, and finally a test in which the amount of “free-range” activity undertaken at home was measured over the course of a week by means of pedometers. Table I shows the increase in exercise capacity associated with active treatment, expressed as a percentage of the pretreatment value. A reasonable increase was demonstrated with both treadmill tests, although the fixed-rate protocol showed twice the increase seen with the Bruce protocol. In the corridor walk test and in the “free-range” evaluation made with pedometers, however, there was virtually no increase associated with enoximone therapy.

It is debatable which of these results represents the “true” effect of extra treatment. Supervised exercise on a treadmill protocol presumably reflects the patient’s potential, and evaluated in this way enoximone is evidently an effective additional treatment. However, the corridor walk test, and perhaps to a greater extent the evaluation of distance walked by pedometer, are more accurate reflections of how the patient feels and what he finds himself able to do. By these standards, the addition of enoximone to a diuretic and an angiotensin-converting enzyme (ACE) inhibitor produces no benefit. It is evident that by choosing an appropriate test, a drug can be shown to have a wide variety of effects.
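The dependence of the "relative increase" on the baseline achieved with each protocol can be made concrete with a little arithmetic. The sketch below is illustrative only: the durations are hypothetical numbers chosen to mimic the pattern described for Figure 2 (a longer placebo duration with the Bruce protocol, nearly identical durations on active treatment), and the function name `relative_increase` is an assumption, not something taken from the studies.

```python
def relative_increase(baseline: float, treated: float) -> float:
    """Return the change from baseline as a percentage of the baseline value."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (treated - baseline) / baseline


# Hypothetical mean exercise durations in seconds (NOT data from the studies).
durations = {
    "Bruce": {"placebo": 600.0, "active": 700.0},
    "Fixed workload": {"placebo": 400.0, "active": 700.0},
}

# Same treated duration, different baselines, hence different relative gains.
results = {
    protocol: relative_increase(d["placebo"], d["active"])
    for protocol, d in durations.items()
}

for protocol, pct in results.items():
    print(f"{protocol}: {pct:.0f}% increase over placebo")
```

With these made-up figures the same drug appears to give a much larger percentage improvement on the fixed workload protocol, purely because the placebo baseline is lower.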


EVALUATION OF THE EFFECT OF A DRUG ON FATALITY IN PATIENTS WITH HEART FAILURE

The treatment of symptoms is always of paramount importance, and it is difficult to withhold treatment that improves fatigue simply to conduct a placebo-controlled trial with mortality as its endpoint. Although one large trial is in progress in which asymptomatic patients with left ventricular dysfunction are being treated with an ACE inhibitor or placebo, it is arguable whether these patients have the same clinical syndrome that is usually called “heart failure.” Mortality-endpoint trials in true heart failure have to be conducted with all patients on background treatment with diuretics and possibly digoxin, which are, of course, drugs that have never been evaluated for an effect on mortality.

Patients with very severe heart failure improve symptomatically when ACE inhibitors are given in addition to diuretics. The CONSENSUS trial [2] showed that in severely ill patients, an ACE inhibitor also prolongs survival. Any further mortality-endpoint study in patients with severe heart failure should be conducted with all patients receiving both diuretics and an ACE inhibitor. This does not necessarily apply to patients with mild or moderate heart failure. In this group of patients ACE inhibitors have not been shown to prolong survival, so it is still ethical, and important, to compare the effect on fatality of different drugs that are equally effective in relieving symptoms.

Evaluating the effect of a new drug is difficult when it has to be compared with placebo. The problem is greater when it has to be compared with existing treatments that are known to be effective. Unless there is good reason to believe that the new drug is markedly superior to the old, a clinical trial to demonstrate its effect will have to be large. In general, new drugs only confer a marginal advantage once an effective treatment is available.
Suppose, for example, that in a group of patients with moderate heart failure the fatality rate in a given period is expected to be 15%, and that the new treatment is expected to reduce this to 10% (an absolute reduction of 5%, but a relative reduction of 30%). To have 90% power to demonstrate this difference at the 5% (p < 0.05) level, 1,840 patients will have to be included in the trial. If the next treatment to be tested, whether by comparison with or by addition to the proven therapy, is expected to produce an equal 30% improvement in fatality (10% reduced to 6.7%), approximately 3,000 patients will have to be included. If, more realistically, it is supposed that the new drug will confer only a 20% further benefit (10% reduced to 8%), then approximately 8,000 patients will have to be included in the trial. On these grounds alone, it seems likely that many new treatments will never be adequately investigated for a possible effect on mortality.

Another major theoretical problem is the effect of “crossover” between treatments during the course of a clinical trial. For example, consider an attempt to compare an established drug “A” with a new drug “B” for an effect on mortality. Patients in heart failure trials will inevitably deteriorate symptomatically, and when this happens the physician caring for the patient will wish him to be treated with the established drug. The patient is likely to be withdrawn from the trial and the code broken, and if he is found to be taking the established drug “A,” treatment will be continued. If, however, he is found to be taking “B,” then he will probably be changed to “A.”

Figure 4, a-e. Theoretical outcome in a trial in which patients with heart failure are treated with a conventional drug “A” or a new drug “B,” in which patients treated with “B” may cross over to treatment “A.” [Five panels, a to e: cumulative fatality (%) against time; annotations mark “Change to ‘A’” and “Patient withdrawn.”]

Figure 4a shows what one might hope will happen in a situation where the two drugs are known to be equally effective in relieving symptoms. There is no crossover between groups, the fatality rate on “B” is less than on “A,” and at the end of the trial intention-to-treat analysis reveals a significant difference between the two treatments.

Figure 4b shows what would happen if the two drugs had the same effect on fatality. Although there would be a tendency for patients on the new drug “B” to be changed to the old drug “A” because of symptomatic deterioration, the death rates would be identical all through the trial, and neither explicative nor intention-to-treat analysis would reveal any difference between them.

Figure 4c shows how intention-to-treat analysis might miss a beneficial effect of “B” in terms of fatality. Although the fatality rate in patients treated with “B” is beginning to separate from that in patients treated with “A” as the trial progresses, because of symptomatic deterioration a significant proportion of those being treated with “B” is changed over to “A.” The fatality curves of the two groups might then either remain parallel or even converge, so that at the end of the trial intention-to-treat analysis would show no difference between them. Explicative analysis, making allowance for the number of days patients remained on their allocated treatment, might demonstrate the superiority of “B.”

Figure 4d shows that intention-to-treat analysis might also fail to reveal a harmful effect of drug “B” when patients who deteriorate are changed over to drug “A” and the fatality curves then remain parallel or converge. At the end of the trial there is no difference between the groups, but explicative analysis, allowing for time on treatment, would point to a less favorable effect of “B.”

The examples shown in Figures 4c and 4d might suggest that explicative analysis is preferred when two drugs are being compared under circumstances when there may be crossover between treatment groups. Figure 4e, however, shows that intention-to-treat analysis is essential for safety purposes. Suppose drug “B” accelerates the course of the disease. As the trial progresses, the fatality curves will separate, but because of symptomatic deterioration, patients on “B” will be changed to “A.” Because their underlying condition has worsened on drug “B,” the fatality curves continue to diverge, and by intention-to-treat analysis there is a significant difference between treatment groups at the end of the trial, although explicative analysis would not demonstrate this.
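The trial sizes quoted earlier in this section (fatality reduced from 15% to 10%, from 10% to 6.7%, and from 10% to 8%) can be checked approximately. The following is a minimal sketch, assuming the standard normal-approximation formula for comparing two proportions with equal group sizes and a two-sided test; the article does not say which formula was used, and the function name `total_sample_size` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def total_sample_size(p_control: float, p_treated: float,
                      alpha: float = 0.05, power: float = 0.90) -> int:
    """Total patients (two equal groups) needed to detect p_control vs p_treated.

    Assumption: unpooled normal approximation, two-sided significance level alpha.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)               # about 1.28 for 90% power
    variance = p_control * (1.0 - p_control) + p_treated * (1.0 - p_treated)
    per_group = (z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2
    return 2 * ceil(per_group)


# Fatality rates taken from the worked example in the text.
scenarios = [(0.15, 0.10), (0.10, 0.067), (0.10, 0.08)]
sizes = [total_sample_size(p0, p1) for p0, p1 in scenarios]

for (p0, p1), n in zip(scenarios, sizes):
    print(f"{p0:.1%} -> {p1:.1%}: about {n} patients in total")
```

Under these assumptions the totals come out in the region of 1,800, 2,900, and 8,600 patients, which is of the same order as the figures given in the text; small differences are to be expected from the choice of formula and rounding.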

Figure 5. The confidence intervals (CI) of the results of hypothetical trials designed to demonstrate the difference between treatments (top two bars) or to demonstrate similarity of treatments (bottom three bars). [Panel title: comparison of drugs, 95% CI for relative difference; horizontal axis runs from % increase through 0 to % decrease; bars are labeled as different or not different.]

These hypothetical examples of trial results show how difficult it is to compare drugs in terms of their effect on fatality when the use of the drugs also affects patients’ symptoms. It is evident that intention-to-treat analysis cannot be considered as the sole arbiter, even though it is of paramount importance in terms of safety. Hence intention-to-treat and “explicative” analysis must both be considered, and it will also be necessary to take into account the number of days the patient remains on the treatment to which he was originally allocated.
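The way crossover can hide or reveal an effect, depending on the analysis chosen, can be sketched with a deliberately simple deterministic model. Everything here is an assumption made for illustration: constant per-period fatality rates, a fixed fraction of patients on “B” switching to “A” each period, no crossover in the “A” arm, and “explicative” analysis approximated as deaths per period of exposure while still on the allocated drug. The names `run_trial` and `TrialSummary` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrialSummary:
    itt_fatality_a: float        # cumulative fatality, arm allocated to "A"
    itt_fatality_b: float        # cumulative fatality, arm allocated to "B"
    on_treatment_rate_a: float   # deaths per period of exposure on "A"
    on_treatment_rate_b: float   # deaths per period of exposure while still on "B"


def run_trial(hazard_a: float, hazard_b: float, crossover_rate: float,
              periods: int, hazard_after_switch: Optional[float] = None) -> TrialSummary:
    """Expected outcomes when patients on "B" may be switched to "A"."""
    if hazard_after_switch is None:
        hazard_after_switch = hazard_a  # switched patients behave like arm "A"
    alive_a, on_b, switched = 1.0, 1.0, 0.0
    deaths_a = deaths_on_b = deaths_after_switch = 0.0
    exposure_a = exposure_b = 0.0
    for _ in range(periods):
        exposure_a += alive_a
        died = alive_a * hazard_a
        deaths_a += died
        alive_a -= died

        exposure_b += on_b
        died_on_b = on_b * hazard_b
        died_switched = switched * hazard_after_switch
        deaths_on_b += died_on_b
        deaths_after_switch += died_switched
        on_b -= died_on_b
        switched -= died_switched

        moved = on_b * crossover_rate  # survivors on "B" changed to "A"
        on_b -= moved
        switched += moved
    return TrialSummary(deaths_a, deaths_on_b + deaths_after_switch,
                        deaths_a / exposure_a, deaths_on_b / exposure_b)


# Beneficial "B" (as in Figure 4c): crossover narrows the intention-to-treat gap.
no_cross = run_trial(0.02, 0.01, crossover_rate=0.0, periods=24)
cross = run_trial(0.02, 0.01, crossover_rate=0.10, periods=24)

# Stylized safety case (compare Figure 4e): harm appears only after the switch.
harm = run_trial(0.02, 0.02, crossover_rate=0.10, periods=24,
                 hazard_after_switch=0.04)
```

In the first pair of runs the intention-to-treat difference shrinks once crossover is allowed, while the on-treatment rate for “B” is unchanged. In the last run the on-treatment rates are identical and only the intention-to-treat comparison shows the excess deaths, which is the reason both analyses have to be considered.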

PROBLEM OF ESTABLISHING SIMILARITY

Up to now clinical trials have been primarily concerned with demonstration of “difference” between treatments, and usually of difference between active and placebo treatments. Now that we have a choice of active treatments for heart failure, we have to consider “similarity.” It is important to know whether treatments are similar in terms of fatality reduction so that a patient can be given the drug that relieves his symptoms best and causes the fewest side effects. If one treatment is superior in prolonging life, that would be the treatment of choice unless there is a very marked difference from other drugs in the relief of symptoms. If two drugs have the same effect on fatality, then a patient can try both and take for the long term the drug with the fewest side effects that makes him feel best.

Figure 5 shows the problem of defining “similarity” in terms of confidence intervals. The top two bars represent the 95% confidence intervals for the results of two trials designed to demonstrate a difference between treatments, with the point estimate (relative risk reduction) shown as a vertical line within the bars. In the top bar, the 95% confidence interval for the relative risk reduction is narrow and clear of the line of zero effect; hence there can be confidence that the treatments are different. In the trial represented by the second bar, the relative risk reduction is the same, but the confidence interval is wide; hence a difference between treatments would not be demonstrated.

The lower three bars represent the results of trials designed to demonstrate “similarity” between treatments. In the first of these there is no difference at all between the two treatments, and the confidence interval around this lack of difference is small. It would seem entirely reasonable to conclude that the treatments have a similar effect. It is unlikely that in any trial two treatments will have an identical effect, and the lower two bars represent trials in which one treatment showed a modest reduction in event rate compared with the other. In both trials the confidence interval around the demonstrated difference includes the line of no effect. In the upper of these two trials the confidence interval is relatively narrow, and there cannot be a great difference between treatments in either direction. A clinician would probably consider the treatments “similar.” In the bottom trial, the confidence interval is wide and includes the possibility of either treatment conferring considerable benefit; no clinician would accept the result as demonstrating “similarity” between the two.

This sequence of possibilities shows that “similarity” of treatments is a relative rather than an absolute concept: “not shown to be different” is not the same as “can be accepted to be similar.” We have to define the limits of the confidence interval that are acceptable, that is, limits that will allow us to consider treatments to be “similar.” Narrow confidence intervals imply large trials, and these require immense financing.
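The distinction between “not shown to be different” and “can be accepted to be similar” can be expressed as a simple rule applied to a confidence interval. The sketch below assumes the usual large-sample interval for a relative risk computed on the log scale; the event counts are hypothetical, and the 15% margin is an arbitrary illustrative choice, since setting that margin is exactly the open problem described above. The names `relative_risk_ci` and `interpret` are assumptions.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def relative_risk_ci(events_new: int, n_new: int, events_old: int, n_old: int,
                     level: float = 0.95):
    """Relative risk (new / old) with a log-scale normal-approximation interval."""
    rr = (events_new / n_new) / (events_old / n_old)
    se = sqrt(1 / events_new - 1 / n_new + 1 / events_old - 1 / n_old)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return rr, exp(log(rr) - z * se), exp(log(rr) + z * se)


def interpret(lower: float, upper: float, margin: float = 0.15) -> str:
    """Classify an interval; `margin` is a hypothetical limit for 'similar'."""
    if upper < 1.0 or lower > 1.0:
        return "different"
    if lower > 1.0 - margin and upper < 1.0 + margin:
        return "similar"
    return "not shown to be different"


# Hypothetical trials: (deaths on new drug, n, deaths on old drug, n).
trials = {
    "large, clear benefit": (100, 1000, 150, 1000),
    "large, no difference": (1000, 10000, 1000, 10000),
    "small, no difference": (20, 200, 20, 200),
}

verdicts = {}
for name, counts in trials.items():
    rr, lower, upper = relative_risk_ci(*counts)
    verdicts[name] = interpret(lower, upper)
    print(f"{name}: RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f}) -> {verdicts[name]}")
```

The two “no difference” trials have the same point estimate, but only the large one has an interval narrow enough to fall inside the chosen margin; the small one shows no difference without establishing similarity.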

CONCLUSION

Evaluating therapy in heart failure is very difficult, and is becoming more so as we have to choose between treatments that have been shown to be superior to placebo. The value of detailed physiologic measurements is limited by the different physiologic effects of different exercise tests, with each having a different relationship to the type of exercise that a patient undertakes. Drugs that can be shown to reduce symptoms evaluated by one test may have little effect on the result of other tests. Evaluating effect on fatality is difficult when patients have symptoms that must be treated; many factors make the comparison of drugs (as opposed to the comparison of a drug and placebo) difficult, and it seems intrinsically unlikely that new drugs will have a dramatically greater effect on fatality than old ones. The problem we confront is defining the limits within which we are prepared to consider the results of two drugs to be “similar.” Once two drugs are accepted to have a similar effect on fatality, it becomes easier to allow a patient to decide for himself which he prefers.

REFERENCES

1. Cowley AJ, Stainer K, Fulwood L, Muller AF, Hampton JR: Effects of enoximone in patients with heart failure uncontrolled by captopril and diuretics. Int J Cardiol 1990; 28 (1): 45-53.
2. CONSENSUS Trial Study Group: Effects of enalapril on mortality in severe congestive heart failure. N Engl J Med 1987; 316: 1429-35.
