Journal of Clinical Epidemiology 61 (2008) 99–101
COMMENTARY
Objective and perspective determine the choice of composite endpoint

George F. Borm*, Steven Teerenstra, Gerhard A. Zielhuis

Department of Epidemiology and Biostatistics, Radboud University Nijmegen Medical Centre, 133, PO Box 9101, Geert Grooteplein 21, NL-6500 HB Nijmegen, The Netherlands

Accepted 1 October 2007
Abstract

The most important consideration in the choice of study design and endpoint is that these two features match and represent the objective and the perspective of the trial as closely as possible. The mechanism may also be helpful, but arguments based on the mechanism, structure or levels of the variables, or based on practical considerations, such as (statistical) efficiency, must always be secondary considerations. © 2008 Elsevier Inc. All rights reserved.

Keywords: Statistical design; Clinical trial; Composite endpoints; Intermediate endpoints; Multiplicity; Statistical power
* Corresponding author. Tel.: +31-24-361-7667; fax: +31-24-3613505. E-mail address: [email protected] (G.F. Borm).
0895-4356/08/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi: 10.1016/j.jclinepi.2007.10.001

1. Introduction

The use of composite endpoints was reviewed in terms of the rationale, potential problems, and advantages by Ferreira-Gonzáles et al., who defined a composite endpoint as the occurrence of any event in a given set of events [1]. Freemantle and Calvert broadened this concept to health indices and put forward endpoints that were the mean values of a number of variables [2–4]. Borm et al. discussed an even more general concept and accepted any combination of variables as a composite endpoint [5]. For example, an endpoint can be not only the occurrence of any event (disjunctive endpoint) or the occurrence of all events (conjunctive endpoint), but also the occurrence of a number of events, or the weighted mean of variables, possibly in combination with the occurrence of certain events. They also addressed composite objectives and gave the following example: the objective of a study is met when at least one out of a number of (partial) objectives is met.

In her letter, Adamiak made interesting and important comments on the articles on composite endpoints by Ferreira-Gonzáles and Borm [6]. She pointed out that these articles focused on the technical aspects of composite endpoints and lacked guidance on how to choose them. Endpoints can refer to different levels of aggregation and various levels of reality: they can refer to processes in the body, such as the restoration of health, or they can be clinical choices, for example, the decision to refrain from further treatment. Adamiak questioned whether these levels can always be combined and requested a theory that provides guidance on when and how to combine different endpoints into a single composite outcome. Below, we make a first sketch of such a theory and use it to discuss Adamiak’s comments.

To illustrate our theory, we use the Digital In Vitro Fertilization (Digital IVF) trial, which aims to show that the use of an Internet web site improves patient well-being [5]. In the trial, the Internet is used to inform and support patients on IVF treatment. The patients in the experimental group have access to an interactive web site that is used to exchange information between gynecologists, general practitioners, and patients. This is intended to help streamline the procedure and make it easier for patients to be optimally informed and to express their preferences and opinions. The patients in the control group receive standard care.

2. The objective

A study starts with a research question, usually formulated as a global objective (Fig. 1). The global aim of the Digital IVF trial is to show that the intervention improves patient well-being. A more precise aim may be to show that it improves patient satisfaction and quality of life during the IVF procedure. At this stage, the formulation of the objective is still too global for statistical evaluation. We still have to choose, for example, the patient satisfaction and quality-of-life scales we want to use, and we have to decide on the number and timing of the assessments and on how to combine them. To make these decisions, the trial needs to be considered in more detail.
Fig. 1. The relation between the mechanism of action and the objective of a trial.

3. The mechanism

Usually, information is available about the mechanism that determines the outcome of the intervention, or assumptions have been made about the variables and relevant factors. The variables B1, B2, ... and X1, X2, ... are assumed to be involved in the mechanism (Fig. 1). Potential endpoints are shown in bold italics. The variables B1, B2, ... represent factors that may have an impact on the outcome, but they are not influenced by the other variables or by the intervention. In the IVF trial, patient satisfaction and quality of life at various points during the IVF procedure are potential endpoints, whereas the age of the patient, the number of previous IVF attempts, etc., are factors that may have an impact on the outcome, but are not influenced by the other variables. Following Adamiak’s request, it is necessary to consider different levels of reality, for example, a within-patient level (age, health status, quality of life), a social level (education, profession), the geographic level (distance to the IVF center), and the financial level (cost). The levels are indicated by ovals.

4. The perspective

More precise formulation of the objective requires choosing exactly which assessments should be included in the objective. In the IVF trial, it was decided to measure patient satisfaction using the Patient Satisfaction Questionnaire III (PSQ III) scale and to measure quality of life using the FertiQol scale. It was also decided to measure them only once, just before the end of the IVF procedure, that is, before the result of the procedure (pregnancy yes/no) was known. The latter choice was made because the success of the procedure may have a strong impact on the quality of life, so an assessment at a later time may not properly represent the effect of the use of the web site. However, if the aim of the study had been to evaluate the quality of life of the patient throughout the whole IVF procedure, including a possible pregnancy, assessments at a later stage would need to be considered. This illustrates that the exact objective and the corresponding assessments depend on the perspective.

The perspective indicates the point of view from which the intervention is evaluated. In the IVF trial, possible perspectives are the opinion of the patient during or at the end of the procedure, the opinion of the patient several weeks later, the opinion of the partner or the clinician, etc. Other types of perspective are individual health vs. public health, efficacy vs. cost-effectiveness, drug action vs. patient benefit, intention to treat (efficacy in all the individuals who are prescribed a drug) vs. per protocol (efficacy in those who take the drug as prescribed), etc. The perspective is an important factor in the design and evaluation of the trial.
5. The endpoint

The variables that form part of the objective may have to be combined and a choice has to be made of which hypotheses to test. In the IVF trial, possible objectives are improvements in mean PSQ III and mean FertiQol scores. Another objective could be that one of the scores improves, whereas the other does not deteriorate [5]. These objectives can be shown graphically by plotting the differences ΔPS in PSQ III scores between the groups on one axis and the differences ΔQoL in FertiQol scores on the other. The points that correspond with the objective of the trial (i.e., “the successful outcome region”) then illustrate the desired outcome of the trial. Figure 2 shows a successful outcome region: on average, patient satisfaction and quality-of-life scores increase, whereas neither decreases by more than 0.2 points [5].

This approach can be used in two ways: first decide which results of the trial constitute a successful outcome and then construct the corresponding composite endpoint, or vice versa. Basically, choosing an endpoint, whether composite or not, is a question of indicating the successful outcome region, that is, selecting outcomes of the study that can be considered to represent benefit. This means that the perspective is important, because benefit is not an absolute entity, but depends on who benefits or who decides about the definition of benefit.
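As a rough illustration of a successful outcome region, the sketch below tests whether an observed pair of between-group differences in PSQ III and FertiQol scores falls inside a region like the one in Fig. 2. It assumes one possible reading of that region: the average of the two differences is positive and neither difference is below -0.2. The function name, the equal weighting of the two scales, and the use of point estimates rather than confidence regions are assumptions made for illustration, not the analysis plan of the trial.

```python
def in_success_region(d_ps, d_qol, margin=0.2):
    """Check whether a pair of between-group differences lies in the
    assumed successful outcome region.

    d_ps   -- difference in mean PSQ III score (experimental minus control)
    d_qol  -- difference in mean FertiQol score (experimental minus control)
    margin -- largest deterioration tolerated on either scale (assumed 0.2)
    """
    # Assumption: "on average the scores increase" is read as a positive
    # unweighted mean of the two differences.
    improves_on_average = (d_ps + d_qol) / 2 > 0
    # Neither score may deteriorate by more than the margin.
    no_relevant_loss = d_ps >= -margin and d_qol >= -margin
    return improves_on_average and no_relevant_loss


print(in_success_region(0.5, -0.1))   # small loss on one scale, within the margin
print(in_success_region(0.5, -0.3))   # loss on FertiQol exceeds the margin
print(in_success_region(-0.1, 0.05))  # no improvement on average
```

Changing the rule inside the function changes the region, which is the point made in the text: defining the region and defining the composite endpoint are two views of the same choice.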
Fig. 2. The successful outcome region of the Digital IVF trial.
A consequence of all these considerations is that in the choice of composite endpoint, the mechanism, including the levels of the variables, is irrelevant. In fact, sometimes it is essential to combine variables from different levels. A well-known example is the cost-effectiveness ratio, which combines the within-patient level (efficacy) and the financial level (cost). Contrastingly, health indices usually combine variables from a single (within-patient) level. In the IVF trial, some patients may opt to seek psychological counseling. If this decision is considered an undesirable outcome, it can be included in the composite endpoint. The composite endpoint will then consist of two variables from the within-patient level (PSQ III and FertiQol scores) and one variable that is based on a decision made by the patient (psychological counseling). The latter decision can be linked to multiple levels, because it may not only be influenced by the opinion and advice of her general practitioner, but also by the opinions of others and even by whether or not the medical insurance company covers the cost of the counseling.
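A composite of this kind, two scores plus one patient decision, could be sketched per patient as follows. Everything specific here is hypothetical: the source does not say how the three variables would be combined, so the sketch assumes standardized scores, equal weights, a threshold of zero, and that seeking counseling always counts as failure. The records are made-up toy values.

```python
def patient_success(psq_z, fertiqol_z, sought_counseling,
                    weights=(0.5, 0.5), threshold=0.0):
    """Hypothetical per-patient composite endpoint.

    Success requires that the weighted mean of the two standardized
    scores exceeds the threshold AND that the patient did not seek
    psychological counseling.
    """
    w_ps, w_qol = weights
    weighted_mean = w_ps * psq_z + w_qol * fertiqol_z
    return weighted_mean > threshold and not sought_counseling


def success_rate(patients):
    """Proportion of patients who meet the composite endpoint."""
    outcomes = [patient_success(*p) for p in patients]
    return sum(outcomes) / len(outcomes)


# Made-up records: (standardized PSQ III, standardized FertiQol, counseling)
toy_group = [
    (0.4, 0.2, False),   # good scores, no counseling
    (0.6, 0.1, True),    # good scores, but counseling was sought
    (-0.3, 0.1, False),  # weighted mean below the threshold
    (0.1, 0.3, False),   # good scores, no counseling
]
print(success_rate(toy_group))
```

The code treats the counseling decision exactly like the two scores, regardless of the level at which each variable arises; only its role in the definition of benefit matters.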
6. The scope

The mechanism may be relevant to another aspect of the objective: the scope. Scope reflects the level of generalization that is required. For example, a study that aims to provide results that are valid in the local or national situation may have to be designed differently from a study that aims to provide results that are valid worldwide. In the IVF study, the scope is restricted to IVF patients in the Netherlands. If the scope had been the European Community, the organization of the medical care in the various countries would need to be taken into account. In that case, a model is required that describes how the organization of the care influences (modifies) the outcome of the intervention. On the basis of that model, the outcome of the study can be adjusted to the local situation.

Another factor that may have an impact on patient satisfaction is the traveling distance to the IVF center: patients who live far from the center may benefit more from the intervention. This can be investigated by including the distance to the center as a covariate in the analysis. The results can be used to estimate the efficacy of the web site method in subgroups who live at different distances from the IVF center. Extrapolation of the efficacy to situations in which the distances are systematically larger than those in the IVF study requires additional assumptions about the mechanism, and decisions have to be made about whether extrapolation makes sense and how it should be modeled (linear, quadratic, etc.).

In short, the mechanism makes it possible to generalize results to situations in which trial data are lacking or insufficient. A disadvantage of this approach is that the results are based on assumptions about the mechanism, not on facts. The results may also be based on imprecise information: the power of the study or the statistical precision may be too low to provide robust evidence about subgroups. Nevertheless, it is often unavoidable to extrapolate trial results to situations that are not properly represented in the trial, because practical constraints and limitations make it impossible to investigate all possible aspects of an intervention in all possible situations.
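One way to make the distance example concrete is a regression model with a treatment-by-distance interaction. The notation is an assumption introduced here for illustration, since the source does not specify a model: $Y_i$ is the patient satisfaction score of patient $i$, $T_i$ indicates the web site group, and $d_i$ is the traveling distance to the IVF center.

```latex
% Linear dependence of the treatment effect on distance (assumed model)
Y_i = \beta_0 + \beta_1 T_i + \beta_2 d_i + \beta_3 T_i d_i + \varepsilon_i,
\qquad
\mathrm{E}[Y \mid T = 1, d] - \mathrm{E}[Y \mid T = 0, d] = \beta_1 + \beta_3 d .

% Quadratic alternative (assumed model)
Y_i = \beta_0 + \beta_1 T_i + \beta_2 d_i + \beta_3 T_i d_i
      + \beta_4 d_i^2 + \beta_5 T_i d_i^2 + \varepsilon_i,
\qquad
\mathrm{E}[Y \mid T = 1, d] - \mathrm{E}[Y \mid T = 0, d]
  = \beta_1 + \beta_3 d + \beta_5 d^2 .
```

Within the range of distances observed in the trial, the two models may give similar estimates of the effect, but for larger $d$ the term $\beta_5 d^2$ can dominate. This is why an extrapolated effect rests on the assumed form of the mechanism rather than on the trial data.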
7. The limitations

Important factors in the design of a study are the practical limitations, such as the maximum possible size of the study, maximum duration of the follow-up period, or ethical considerations. When statistical efficiency or shorter follow-up is the reason to use composite endpoints, the assumed mechanism may help to select these composite endpoints. An example of such a composite endpoint is the use of recurrence-free survival as the primary endpoint in oncology trials. The rationales behind this choice are that recurrence of the tumor is assumed to predict survival and that a study with recurrence-free survival as an endpoint requires a shorter follow-up period than a study that evaluates survival. However, the increased efficiency may come at a price, because endpoints based on assumed mechanisms may later prove to be unreliable [7].
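How recurrence-free survival combines two events can be sketched as follows: the composite event is whichever of recurrence or death is observed first, and a patient with neither event is censored at the end of follow-up. The function and argument names are hypothetical, and times are assumed to be measured in months from randomization.

```python
def recurrence_free_survival(recurrence_time, death_time, followup_time):
    """Return (time, event_observed) for the composite endpoint.

    recurrence_time, death_time -- months to the event, or None if the
                                   event did not occur
    followup_time               -- months of follow-up for this patient
    """
    # Keep only the events that occurred within the follow-up period.
    observed = [t for t in (recurrence_time, death_time)
                if t is not None and t <= followup_time]
    if observed:
        # The composite event happens at the earliest component event.
        return min(observed), True
    # Neither event observed: censored at the end of follow-up.
    return followup_time, False


print(recurrence_free_survival(14, 30, 60))      # recurrence comes first
print(recurrence_free_survival(None, 22, 60))    # death without recurrence
print(recurrence_free_survival(None, None, 60))  # censored
```

In the first call the composite event is recorded at 14 months, whereas an analysis of survival alone would have to wait until month 30. This shows where the shorter follow-up comes from and why the gain depends on recurrence actually predicting survival.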
8. Conclusion

The most important consideration in the choice of study design and endpoint is that these two features match and represent the objective and the perspective of the trial as closely as possible. The mechanism may also be helpful, but arguments based on the mechanism, structure, or levels of the variables, or based on practical considerations, such as (statistical) efficiency, must always be secondary considerations.

References

[1] Ferreira-Gonzáles I, Permanyer-Miralda G, Busse JW, Bryant DM, Montori VM, Alonso-Coello P, et al. Methodologic discussions for using and interpreting composite endpoints are limited, but still identify major concerns. J Clin Epidemiol 2007;60:651–7.
[2] Freemantle N, Calvert M. Weighting the pros and cons for composite outcomes in clinical trials. J Clin Epidemiol 2007;60:658–9.
[3] Ferreira-Gonzáles I, Permanyer-Miralda G, Busse JW, Bryant DM, Montori VM, Alonso-Coello P, et al. Composite endpoints in clinical trials: the trees and the forest. J Clin Epidemiol 2007;60:660–1.
[4] Freemantle N, Calvert M. Composite outcomes-final comment for now. J Clin Epidemiol 2007;60:662.
[5] Borm GF, van der Wilt GJ, Kremer JAM, Zielhuis GA. A generalized concept of power helped to choose optimal endpoints in clinical trials. J Clin Epidemiol 2007;60:375–81.
[6] Adamiak GT. Multilevel endpoints and the problem of theoretical aggregation and overlapping. J Clin Epidemiol 2008;61:198–9.
[7] Fleming TR, DeMets DL. Surrogate end points in clinical trials: are we being misled? Ann Intern Med 1996;125(7):605–13.