The Statistical Analysis of Adherence Data Obtained from Markers
Abraham Silvers*, Michael L. Russell, and William Insull, Jr.
Research and Demonstration Center of Heart and Blood Vessels, Baylor College of Medicine, and Lipid Research Center, Houston, Texas
ABSTRACT: Adherence markers provide new kinds of clinical trial data. Adherence data on individual participants obtained from markers can be used for designing the trial sample size and stratification, evaluating the adequacy of randomization, directing the management of adherence, and analyzing and interpreting the trial's final results. Examples of these are presented. Analyses of adherence data can employ conventional procedures. Box plot techniques are proposed for flagging extreme values of the distribution of adherence either at a single time point or in a time continuum. The effect of differences in distribution of the markers in two or more groups in a clinical trial is discussed. Survival analysis is illustrated as one technique in univariate and multivariate analysis of markers with covariate effects. Design considerations with markers in a clinical trial are discussed, with particular emphasis on some of the assumptions and biases that must be considered for the analysis.
INTRODUCTION
Patient nonadherence to a prescribed regimen is a problem in many aspects of clinical trials and medical practice. The level of nonadherence to a regimen may be affected by a variety of causes, such as life-style, educational level, social and economic status, cultural background, and the adequacy of physicians, nurses, and other medical personnel as counselors. In clinical trials, it can be an insidious problem, because evaluation of efficacy is based on the assumption of complete adherence to the regimen. However, specific statistical techniques are available for use in various aspects of a clinical trial (i.e., design, quality control, data analysis) to aid the clinician to monitor and to control the effects of patient nonadherence. An important consideration for control and analysis is the definition of adherence. In the best case a patient's adherence should be based on an actual biological measure of adherence instead of a more indirect measure, such as a relative pill count. The use of biological adherence markers in a clinical trial would be a major improvement over other available measures. This article addresses the statistical evaluation of the data from such an adherence marker.
Address reprint requests to: Abraham Silvers, Ph.D., EPRI, 3412 Hillview, Palo Alto, Calif. 94303.
*Present address: Electrical Power Research Institute.
Controlled Clinical Trials 5:544-555 (1984). © Elsevier Science Publishing Co., Inc. 1984, 52 Vanderbilt Ave., New York, New York 10017. 0197-2456/84/$03.00
DEFINITIONS AND ASSUMPTIONS
The goals of the proposed statistical analyses are:
1. To establish the distribution of adherence, indicating the level of variance.
2. To determine if substantially unequal distributions of adherence exist in two or more groups.
Initially we must consider the definition of adherence level. Adherence may be defined either as a continuous variable or as a categorical variable. For example, as a categorical variable it may be defined as high (i.e., 80%-100% adherence), moderate (i.e., 40%-80% adherence), or low (i.e., <40% adherence). After the participants have been distributed by adherence level into these groups, the adherence outliers can be identified. An outlier can be defined as an adherence level below an expected limit. Outliers are part of the distribution of adherers, in a region that would be considered unacceptable. These arbitrary definitions also may be sufficient to define a reasonable stratification for achieving comparable groups.
Once the definition of adherence level has been determined, the distribution of partial adherence in both treatment groups can be calculated. If the distributions are from the same population, that is, if the distribution location and shape parameters are not significantly different, or the distributions are not markedly displaced with respect to each other, the trial may proceed. If the converse exists (Figure 1), serious bias may be present, which could compromise the trial results. If a serious bias exists due to the inability of clinical management to correct the poor adherence, it may be necessary to make adjustments in the analysis phase so that comparisons can be made. Generally, this is possible if the distributions in both groups are not widely different. If they are, it may reflect inadequate randomization.
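The two steps above (assigning an adherence level, then comparing the distributions in two groups) can be sketched in Python. This is only an illustration under stated assumptions: the function names and the data are hypothetical, the shared 40% and 80% boundaries are assigned to the higher band, and the two-sample Kolmogorov-Smirnov distance (the largest gap between the two empirical cumulative distributions) is one possible measure of how far the distributions are displaced, not a procedure named in this article.

```python
from bisect import bisect_right


def adherence_category(percent):
    """Map an adherence percentage to 'low', 'moderate', or 'high'.

    Assumption: the bands in the text share their boundaries, so 40% is
    counted as 'moderate' and 80% as 'high' here.
    """
    if percent < 40:
        return "low"
    if percent < 80:
        return "moderate"
    return "high"


def ks_distance(sample_a, sample_b):
    """Largest absolute gap between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    points = sorted(set(a) | set(b))
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in points
    )


# Hypothetical marker-based adherence percentages for two treatment groups.
group1 = [95, 88, 72, 91, 35, 84, 60, 99]
group2 = [55, 42, 78, 30, 65, 81, 25, 48]

# Count participants in each adherence level for group 1.
counts = {}
for value in group1:
    level = adherence_category(value)
    counts[level] = counts.get(level, 0) + 1

print(counts)
print(ks_distance(group1, group2))
```

A distance near 0 suggests the two groups have similar adherence distributions; a distance near 1 suggests the kind of displacement shown in Figure 1. Judging whether a given distance is "significantly different" would require a formal test, which this sketch does not attempt.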
Finally, several assumptions common to clinical trials must be recognized: (1) the adherence level is correlated with the efficacy of the drug (i.e., complete adherence is associated with the known efficacy of the drug), and (2) the outcome data analyzed from the trial do not differentiate the partial adherer from the complete adherer. The basis for differential treatment is to minimize the variability of adherence, that is, to correct differences in adherence within the trial cohort. For example, in cancer trials, differential attention is given to the reduction of toxicity in order to return the patient to the normal status or health. This is the accepted treatment goal.
IDENTIFYING NONADHERENCE PATIENTS
Adherence data are collected at various points in a clinical trial and, over time, several different patterns may be observed. The patterns can occur in any phase of a trial and may change with time, either for the individual or for the group. Figure 2 depicts the adherence level over time. In this diagram, a stable value over time is observed for data from an adherence marker. The pattern described by the "best fit" line shows that there does not appear to be a relationship between adherence and time (Table 1). Figures 3 and 4 and Table 2 show the converse, that is, there appears to be a relationship between adherence and time. Figure 4 shows a continuing trend of increasing adherence with time. In Figure 5, the sudden rise in the data from an adherence marker over months is seen. In these data, we observe several consistent patterns of nonadherence. These regularities are associated with nonadherence and can be defined in terms of a deviation from the expected norm. This combination of a regular pattern with a deviation from adherence norms can be described by using specific statistical techniques. This statistical process requires that the data be organized, significant patterns identified, statistical parameters estimated, and regular patterns defined.
Figure 1 Hypothetical example of differences between treatment groups for distributions of partial adherence. Axes: relative frequency versus adherence levels. --#--, group 1; --.--, group 2.
Figure 2 Stable pattern of adherence over time. See Table 1. Axes: adherence level versus months. --*--, group 1; fitted line, 105.1 + 1.038X.
Table 1 Description of Stable Pattern of Adherence Over Time (See Fig. 2)
Parameter Table
Parameter    Fitted value    Standard deviation    T value     Sig. lev.
Intercept    105.0667        6.178198              17.00605    0.0001
Slope        1.038086        0.8515666             1.219031    0.2576
Analysis of Variance Table
Source        Sum of squares    D.F.    Mean square    F value     Sig. lev.
Regression    115.8953          1       115.8953       1.486001    0.258
Residual      623.9309          8       77.99136       --          --
Linear fit to: Curve 1 of COM2. Number of data points = 10. Correlation coefficient R = 0.3957929. R2 = 0.156652. Standard deviation of regression = 8.831272.
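The "best fit" line and the quantities reported in Tables 1 and 2 (fitted intercept and slope, the T value of the slope, R2, and the standard deviation of regression) correspond to an ordinary least-squares fit of adherence level on time. The following Python sketch shows one way to compute them. The function name and the data are hypothetical; the article's own software and data are not available, so this is an illustration of the technique rather than a reproduction of the authors' analysis.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares fit of y on x with basic regression diagnostics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    # Standard deviation of regression uses n - 2 degrees of freedom.
    resid_sd = math.sqrt(ss_resid / (n - 2))
    slope_se = resid_sd / math.sqrt(sxx)
    return {
        "intercept": intercept,
        "slope": slope,
        "slope_se": slope_se,
        # A perfect fit has zero standard error; report an infinite T value.
        "t_slope": slope / slope_se if slope_se > 0 else math.inf,
        "r_squared": 1 - ss_resid / ss_total,
        "resid_sd": resid_sd,
    }


# Hypothetical monthly adherence-marker readings.
months = [0, 1, 2, 3]
levels = [1, 3, 2, 4]
fit = linear_fit(months, levels)
print(fit)
```

A slope T value close to zero, as in Table 1 (1.038086/0.8515666, about 1.219), is consistent with a stable pattern; a large T value, as in Table 2, points to a relationship between adherence and time.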
Table 2 Description of Pattern of Increasing Adherence With Time (See Fig. 4)
Parameter Table
Parameter    Fitted value    Standard deviation    T value     Sig. lev.
Intercept    30.95896        6.850053              4.519521    0.0009
Slope        0.1757996       0.05806217            3.027782    0.0115
Analysis of Variance Table
Source        Sum of squares    D.F.    Mean square    F value     Sig. lev.
Regression    1081.07           1       1081.07        9.167463    0.0115
Residual      1297.172          11      117.9247       --          --
Linear fit to: Curve 1 of COM4. Number of data points = 13. Correlation coefficient R = 0.6742158. R2 = 0.454567. Standard deviation of regression = 10.85931.
Figure 3 Pattern of decreasing adherence with time. Axes: adherence level versus days. Fitted line, 93.1 - .3214X.
Figure 4 Pattern of increasing adherence with time. See Table 2. Axes: adherence level versus days. Fitted line, 30.96 + .1758X.
Figure 5 Graphical description of increasing adherence with time, using monthly rate of new nonadherence cases. Axes: nonadherence cases versus months.
Figure 6 Example of similar patterns of adherence over time for two treatment groups. Axes: adherence level versus months. --*--, group 1; --#--, group 2.
The recognition of these patterns in the adherence data is the critical first step for analysis and its management. Further implications of a pattern can be determined by comparing data from the experimental group with data from the control group. In Figure 6, both the experimental group and the control group have the same adherence distribution over time, indicating a successful trial regarding adherence. If the groups do not have the same distribution, there may be serious difficulty in the trial.
A main objective of this evaluation is the detection of unusual events that represent disproportionate nonadherence. The mathematical term "outlier" is used to label an unusual event that can have a disproportionate effect on the process being monitored because the event lies outside the expected range for the data. An outlier is an effect beyond the defining limits. Because averaging techniques may mask the effect or the importance of an outlier, the data must be scanned to identify outliers. The process of identifying them is called "flagging." An outlier represents a violation of the established standard for adherence. Outliers may be obvious or they may require some statistical procedures to detect them.
Useful procedures are box graphic displays. A box display depicting continuous data is called a box plot. The box graphic display in Figure 7 is for an adherence marker. A box plot is a compact way of showing the nature of the distribution of data points; as in Figure 7, it consists of a box, two tails, and outliers. The box spans the middle 50% of the points (from the 25th to the 75th percentile). It is bisected by the median, and a plus denotes the mean. Each tail connects the box to the most extreme point that lies within 1.5 times the interquartile range of the box. The points outside this range are displayed as outliers (Fig. 8).
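The flagging rule just described can be sketched in Python. The sketch rests on assumptions that are not in the article: quartiles are computed with the "inclusive" method of the standard library's statistics.quantiles (other quartile conventions give slightly different boxes), and points are split into mild outliers (between 1.5 and 3 interquartile ranges from the box) and extreme outliers (more than 3 interquartile ranges), a common convention that the text itself does not define. The function name and the data are hypothetical.

```python
import statistics


def boxplot_summary(values):
    """Box plot statistics with outliers flagged by the interquartile-range rule."""
    data = sorted(values)
    # Assumed quartile convention: linear interpolation, 'inclusive' method.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr
    # Points inside the inner fences; the tails end at their extremes.
    inside = [v for v in data if inner_low <= v <= inner_high]
    mild = [
        v for v in data
        if outer_low <= v < inner_low or inner_high < v <= outer_high
    ]
    extreme = [v for v in data if v < outer_low or v > outer_high]
    return {
        "mean": statistics.mean(data),
        "median": median,
        "q1": q1,
        "q3": q3,
        "lower_adjacent": inside[0],
        "upper_adjacent": inside[-1],
        "mild_outliers": mild,
        "extreme_outliers": extreme,
    }


# Hypothetical adherence levels (percent) for one treatment group.
adherence = [8, 75, 85, 88, 90, 91, 92, 94, 95, 96, 98, 100]
summary = boxplot_summary(adherence)
print(summary)
```

In this hypothetical group the mean is pulled down by the two flagged participants while the median is not, which illustrates why averaging alone may mask outliers.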
Figure 7 Boxplot for adherence for two treatment groups. 1--group 1; 2--group 2. See text.
Figure 8 Demonstration of outlier, X, in boxplot analysis. 1--group 1.
Further tables can be produced that display the values of the mean, the median, first quartile value, third quartile value, lowest value in a box, highest value in a box, mild outliers, and extreme outliers. Box plots for a control group may be compared to box plots in an experimental group. There may be differences between the two groups in one or more dimensions of the box plot analysis. In particular, differences in outliers can be displayed (Figs. 9 and 10, Table 3).
Some of the previous examples show the serial relationship depicting change in the outcome over time, or what is called the time trend (see Figs. 2-4). The trend shows whether the outcome for one day differs from that for another. It is useful to look at the trend by means of a box plot. In Figure 11, the intervals are months and the mean and median increase with time. A trend can signal possible future adherence problems. The analysis of outliers and trends may require the use of other sophisticated statistical techniques, which are not discussed here.
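A time trend of the kind shown in Figure 11 can be examined by summarizing adherence within each interval and checking whether the summaries move in one direction. The short Python sketch below does this for the mean and median. The function names and the readings are hypothetical, and a strict interval-to-interval increase is only one simple, assumed way to describe a trend.

```python
import statistics


def interval_summaries(readings):
    """Group (interval, adherence level) pairs and summarize each interval."""
    grouped = {}
    for interval, level in readings:
        grouped.setdefault(interval, []).append(level)
    return [
        (interval, statistics.mean(levels), statistics.median(levels))
        for interval, levels in sorted(grouped.items())
    ]


def strictly_increasing(values):
    """True when every value exceeds the one before it."""
    return all(later > earlier for earlier, later in zip(values, values[1:]))


# Hypothetical readings: (month, adherence level).
readings = [
    (1, 40), (1, 50), (1, 60),
    (2, 55), (2, 60), (2, 80),
    (3, 70), (3, 75), (3, 95),
]
summaries = interval_summaries(readings)
means = [mean for _, mean, _ in summaries]
medians = [median for _, _, median in summaries]
print(summaries)
print(strictly_increasing(means), strictly_increasing(medians))
```

The same per-interval grouping could feed a box plot summary for each month, so that outliers and trend are inspected together.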
Figure 9 Difference between two groups in several dimensions of the boxplot analysis of adherence. 1--adherence level; 2--group 2 adherence.
Table 3 Difference Between Two Groups in Several Dimensions of the Boxplot Analysis of Adherence (See Fig. 9)
Row    Name                 X value    Mean        Median    First quartile    Third quartile    Lower ADJa    Upper ADJa
1      Adherence Level      1          71.30769    89        45                92.5              8             96
2      Group 2 Adherence    2          90.33333    95        91.25             98                91            100
Mild/extreme outliers: (3,4)
aADJ (adjacent) is extreme value in tail.
Figure 10 Differences among groups in several dimensions of the boxplot analysis of adherence. 1--group 1; 2--group 2; 3--group 3; 4--group 4; 5--group 5.
IDENTIFYING FACTORS ASSOCIATED WITH NONADHERENCE
Figure 11 Time trend of adherence using boxplot analysis. 1--time 1; 2--time 2; 3--time 3.
A final topic is the concept of refining the data to determine factors associated with nonadherence. In general, if multiple factors may be influencing adherence, statistical techniques such as regression analysis can be used. The independent contributions of variables to nonadherence can be described by a multivariate model. However, a univariate pattern analysis as a first step may reveal potential difficulties associated with nonadherence. For example, patterns may indicate that nonadherence is a problem for a particular group defined by such characteristics as age, sex, education level, and so forth.
Figure 12 Differences between groups by survival analysis. Axes: adherence proportion versus months. --*--, group 1; --#--, group 2.
Figure 13 Differences between groups by survival analysis considering variable of travel. Axes: adherence proportion versus months. --*--, in town; --#--, on the road.
A useful technique is survival analysis, which loosely describes the future trend of nonfailures. In this case, partial adherence may be called a failure. The time course of a group of individuals subject to failure can be described by a survival curve, which represents the proportion of the group who are adherent at specific times. For example, in the two survival curves of Figure 12, the second group is becoming nonadherent more quickly than the first group. Further analysis may indicate that variables in the second group cause it to be far more nonadherent than the first group. For example, individuals in the second group may be out of town more often than those of the first group (Fig. 13), or the second group may have education levels different from the first group. Variables such as these may be the cause for the difference in the two survival curves. This univariate analysis can be enhanced by a regression analysis, such as the Cox proportional hazards linear model, which may indicate the variables that independently affect nonadherence.
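A survival curve of the adherent proportion can be estimated with the product-limit (Kaplan-Meier) method, sketched below in Python. The article does not state which estimator was used for Figures 12 and 13, so the choice of Kaplan-Meier is an assumption, as are the function name and the data. A "failure" is taken to be the first month at which a participant is classed as partially adherent; participants still adherent at their last observation are treated as censored.

```python
def kaplan_meier(times, failed):
    """Product-limit estimate of the proportion still adherent.

    times  -- month of failure or of last observation for each participant
    failed -- True if the participant became nonadherent at that time,
              False if the participant was censored (still adherent)
    Returns a list of (time, adherent proportion) at each failure time.
    """
    pairs = sorted(zip(times, failed))
    at_risk = len(pairs)
    proportion = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        current_time = pairs[i][0]
        failures = 0
        leaving = 0
        # Collect everyone whose follow-up ends at this time.
        while i < len(pairs) and pairs[i][0] == current_time:
            failures += int(pairs[i][1])
            leaving += 1
            i += 1
        if failures:
            proportion *= 1 - failures / at_risk
            curve.append((current_time, proportion))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data for one group (months).
times = [2, 3, 3, 5, 8]
failed = [True, True, False, True, False]
curve = kaplan_meier(times, failed)
print(curve)
```

Computing one such curve per group (for example, in town versus on the road) gives the kind of comparison shown in Figures 12 and 13. Assessing covariates jointly, as with the Cox model mentioned above, requires a regression fit that is beyond this sketch.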
SUMMARY
Several statistical approaches were discussed for handling adherence data. A major emphasis was that averaging may be misleading and may not help the trial leader. In-depth analysis of nonadherence needs to uncover the reasons for nonadherence. The statistician should work closely with the behavioral counselor to change the partial adherer to a complete adherer, which limits the number of outliers and skewness of the distribution. Many other areas need elucidation, which probably will be forthcoming due to the growing interest in this topic. For example, sample size determination in which potential nonadherers are considered dropouts may be part of a clinical design. The use of other approaches to flagging outliers and the use of exploratory graphic procedures for detecting nonacceptable adherence rates need enumeration. The incorporation of nonparametric procedures, especially nonparametric density estimation techniques for describing the distribution of adherers, is another area for further elaboration.