Some methodologic issues in analyzing data from a randomized adolescent tobacco and alcohol use prevention trial



Journal of Clinical Epidemiology 56 (2003) 332–340

Donald J. Slymen a,*, John P. Elder a, Alan J. Litrownik b, Guadalupe X. Ayala b, Nadia R. Campbell a

a Graduate School of Public Health, San Diego State University, 5500 Campanile Drive, San Diego, CA 92182-4162, USA
b SDSU/UCSD Joint Doctoral Program in Clinical Psychology, San Diego State University, 6363 Alvarado Ct., San Diego, CA 92120-4913, USA

Received 26 July 2001; received in revised form 3 December 2002; accepted 4 December 2002

Abstract

Three issues concerning the design and analysis of randomized behavioral intervention studies are illustrated and discussed within the framework of a tobacco and alcohol prevention trial among migrant Latino adolescents. The first issue arises when subjects are randomized in clusters rather than individually. Because subject observations cannot be assumed to be independent, information pertaining to the degree of clustering must be reported, and analyses must take the clustering into account. The second issue concerns the impact of compliance to the intervention and the importance of measuring compliance in the experimental and attention-control groups. A compliance analysis should control for participant contact with study personnel. Investigators must consider ways of constructing a compliance measure that is common to both conditions. Third, because outcomes are measured repeatedly over time, we illustrate the importance of assessing the impact of missing-data patterns on outcomes and the extent to which the patterns may modify the treatment effect. © 2003 Elsevier Inc. All rights reserved.

Keywords: Behavioral interventions; Data analysis; Statistical models

1. Introduction

Randomized behavioral intervention studies have increased in number and complexity as researchers attempt to devise strategies to effectively treat and prevent lifestyle-related diseases. Although these studies share many traits with traditional therapeutic clinical trials, they also possess unique characteristics that suggest analytical issues that should be examined as part of the intervention evaluation. Interventions for tobacco or alcohol prevention, improved nutrition, and increasing physical exercise are challenging from many perspectives, including data analysis, which frequently involves longitudinal and clustered data.

We examine the following three issues within the framework of a randomized tobacco and alcohol prevention trial among migrant Latino adolescents in San Diego County (Sembrando Salud or “Sowing the Seeds of Health”): (1) assessing and adjusting for cluster randomization; (2) assessing compliance to the intervention; and (3) assessing patterns of missing data, which may affect the evaluation, using pattern-mixture modeling. The third issue arises due to repeated measures taken over time for each subject. This dependence of within-subject observations occurs in longitudinal data and must be accounted for in the analysis. Readers unfamiliar with longitudinal data methods should note the importance of accommodating the dependence of repeated measures and of the three issues listed above. Although these issues have been discussed individually in statistics textbooks and in the current literature, few examples of their application to one trial are available. This article reinforces the importance of evaluating these components in behavioral intervention studies.

* Corresponding author. Tel.: 619-594-6439; fax: 619-594-6112. E-mail address: [email protected] (D.J. Slymen). 0895-4356/03/$ – see front matter © 2003 Elsevier Inc. All rights reserved. doi: 10.1016/S0895-4356(03)00012-X

2. Methods

Sembrando Salud was designed as a community-based tobacco and alcohol use prevention program targeting migrant adolescents. The participants were identified through the Migrant Education Program in San Diego County, California. Twenty-two schools in San Diego County participated in the study, yielding 660 adolescents and one adult caregiver per child. Schools were randomly assigned to an experimental intervention designed to prevent the use of tobacco and alcohol or to an attention-control first aid/home safety program.


After recruitment at a given school, trained evaluation assistants, who were bilingual, bicultural, and blinded to condition, conducted the face-to-face surveys at baseline (M1), with parents and adolescents assessed simultaneously in separate areas. Based on the predetermined random assignment of schools, parents and adolescents were exposed to eight sessions of the experimental or attention-control intervention over a 7- to 10-week period, adjusting for school closures (e.g., vacations). Once the group educational sessions were completed, a post-assessment (M2) was conducted on 637 of the participants. One- and 2-year followups (M3 and M4) were completed on 587 and 537 participants, respectively. Surveys were conducted face to face at M1 and M2 and by phone at M3 and M4. A $10 incentive for completion of the assessments was provided to each family to minimize attrition. Rates of attrition were similar across the experimental and attention-control conditions. Educational sessions were held during evening hours and arranged to accommodate families’ schedules. The educational programs for the experimental and attention-control conditions had equivalent formats except for the specific content of the material. Small group sessions were held weekly for 8 weeks for the adolescents, and parents attended the first, second, and eighth sessions. Both programs targeted skill development whereby social learning techniques (e.g., modeling, rehearsal, and reinforcement) were used. The programs were specifically tailored to a migrant Latino audience. Participants in both conditions followed the same format in the educational program and in subsequent booster phone calls and newsletters; the only difference was the content of the material. This allowed for an assessment of participant adherence on outcome measures using information from both groups to minimize bias. The educational sessions were conducted in groups. 
A total of 23 Group Leaders were recruited from local universities and colleges, screened by project staff, randomized to one of the two educational programs (13 in the experimental condition and 10 in the attention-control condition), and trained. Group Leaders were monitored throughout the intervention period to ensure that the programs were being presented as designed. The clustering of observations at the Group Leader level was examined in detail even though randomization occurred at the school level. School-level clustering altered beta coefficients and standard errors (SEs) by less than 5%, and P values were not altered. Therefore, to simplify the presentation, only clustering at the Group Leader level is examined.

A 201-item survey was constructed using previously developed scales or items and was translated into Spanish and back-translated. The primary outcomes to assess the impact of the intervention on adolescent tobacco/alcohol use were 30-day smoking and drinking and susceptibility to smoking [1,2] and alcohol, all dichotomous outcomes. Adolescents were coded as susceptible to smoking if they were current smokers, if they did not show a firm resolve not to smoke in the future, if they would accept a cigarette from a friend, or if they intended to smoke in the next year. Similar criteria were applied to alcohol. Because of the low prevalence rates of 30-day smoking, this outcome was not examined in detail beyond the initial main analysis (i.e., compliance was not examined).

Subject participation was measured in several ways based on information collected on the number of sessions attended, homework completed, and boosters and newsletters received. To illustrate the compliance analysis, participation level was based on the number of intervention sessions attended plus the number of homework assignments completed; the scores ranged from 0 to 15.

Due to the binary longitudinal nature of the four outcomes, generalized estimating equations (GEE) were used for the primary analyses [3]. For each outcome, models were constructed treating the post-intervention measures (M2, M3, and M4) as a set of repeated measures and adjusting for the baseline measure M1. The models included indicator variables to represent the time effects, the intervention effect, and the time by intervention interaction. The interaction assesses whether any intervention effect was consistent across the time periods. If the interaction was not significant, then the term was dropped and a second model was fitted with only the time and group effects. All models were refitted adjusting for age, gender, and baseline acculturation. The GENMOD procedure in SAS was used to obtain the estimates. Further details about Sembrando Salud are found elsewhere [4,5]. In this article, we examine 30-day drinking and susceptibility to tobacco and alcohol to illustrate concepts.

Sembrando Salud follows a design that is typical of randomized behavioral intervention studies. Baseline data are collected. Subjects are randomized in clusters to one of two conditions: experimental intervention or an attention-control.
The experimental or attention-control programs are delivered, and the extent to which each subject was exposed to the program (participation rate) is assessed. Follow-up measurements are taken repeatedly to assess the effectiveness of the intervention and to assess how well the effect is maintained during follow-up. In spite of efforts to ensure subjects’ return for measurement evaluation, some loss is inevitable, giving rise to missing data and various patterns of missingness by measurement period. Subsequent sections address the issues of analysis in the presence of cluster randomization, compliance, and patterns of missing data.

2.1. Adjusting for clustered observations

Donner and Klar [6], Murray [7], and others have made the case for adjusting for clustered observations in randomized studies. Because subjects within a cluster are not independent, it is inappropriate to use statistical methods that assume independence among all subjects. The lack of independence arises for different reasons depending on the nature of the cluster and the study design. In Sembrando Salud, clustering effects may arise from two sources: children attending the same school and participants who are attending the same group session. The school clustering was negligible and is not addressed further. Because the group clustering is more critical and represents a stronger degree of dependence, we measured the degree of clustering and assessed its impact on data analysis. Although some of the clustering may be attributed to regional similarities as it pertains to specific parts of San Diego County, the larger influence is attributed to the effect of personal interactions among members of the same intervention condition and being taught by the same Group Leader.

The intraclass correlation is often used as a measure of the degree of clustering and is useful for sample size estimation because it is a necessary component for adjustment [8]. Although recent cluster randomization trials routinely report correlation estimates [9–12], it can be difficult to obtain this information from the literature. An estimate of the correlation can be obtained by performing a one-factor ANOVA with “cluster” as the factor and using the formula:

\[ \hat{\rho} = \frac{\mathrm{MS}_{\mathrm{Between}} - \mathrm{MS}_{\mathrm{Within}}}{\mathrm{MS}_{\mathrm{Between}} + (m - 1)\,\mathrm{MS}_{\mathrm{Within}}} \]

where MS_Between and MS_Within represent the between-cluster and within-cluster mean square estimates from the ANOVA, and m is the average cluster size. Another useful quantity is the design effect or inflation factor:

\[ 1 + (m - 1)\,\hat{\rho} \]

which indicates the degree to which a sample size estimated assuming independence among subjects must be inflated to account for the effect of clustering.

Table 1 displays the intraclass correlation and design effect estimates for susceptibility to smoking and drinking and 30-day drinking by intervention status and measurement period. The average cluster sizes (m) ranged from 6.9 to 9.4. Intraclass correlations ranged from 0 to 0.065. There are no obvious trends with respect to outcome measure, intervention status, or measurement period.
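The two formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy data are hypothetical, and truncating a negative estimate at zero is our assumption (suggested by the 0.0 entries in Table 1), not something the article states.

```python
from statistics import mean


def anova_icc(clusters):
    """Estimate the intraclass correlation from a one-factor ANOVA.

    clusters: list of lists, one inner list of outcomes (0/1 here) per cluster.
    Returns (rho_hat, m), where m is the average cluster size.
    """
    k = len(clusters)                      # number of clusters
    n = sum(len(c) for c in clusters)      # total number of subjects
    grand = sum(sum(c) for c in clusters) / n
    ss_between = sum(len(c) * (mean(c) - grand) ** 2 for c in clusters)
    ss_within = sum((y - mean(c)) ** 2 for c in clusters for y in c)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    m = n / k
    rho = (ms_between - ms_within) / (ms_between + (m - 1) * ms_within)
    # Assumption: negative estimates are reported as 0.
    return max(rho, 0.0), m


def design_effect(m, rho):
    """Inflation factor 1 + (m - 1) * rho."""
    return 1 + (m - 1) * rho


# Hypothetical toy data: three groups of four adolescents, binary outcome.
toy = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 0, 1, 0]]
rho_hat, m_bar = anova_icc(toy)
print(rho_hat, m_bar, design_effect(m_bar, rho_hat))

# With the largest correlation in Table 1 (0.065) and an assumed average
# cluster size of 8.5 (inside the reported 6.9 to 9.4 range):
print(round(design_effect(8.5, 0.065), 2))
```

The last line shows how a correlation that looks small can still produce a design effect near the upper end of the reported range once the average cluster size is around eight or nine.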
Design effects ranged from 1.0 to 1.49. A design effect of 1.49, for example, indicates that a sample size should be inflated by 49% to overcome the lack of independence for a study of comparable design. Investigators who are considering group educational sessions as the primary intervention component using similar outcomes may wish to examine sample sizes using the range of intraclass correlations seen in Table 1 to determine the impact of this particular type of clustering.

In Sembrando Salud, the primary outcomes are binary and measured on individuals over time. In addition, individual-level characteristics were incorporated into the models for adjustment. Therefore, regression models were fitted to the data. Two approaches have been used to adjust for clustering when individual-level data are involved: mixed effects regression models and GEEs. Details of these methods are found elsewhere [13–15]. We chose GEE, preferring a population-average interpretation rather than a subject-specific interpretation, because the primary emphasis of the study was to determine the average difference between control and experimental conditions.

The impact of clustering can best be examined by fitting models with and without adjusting for clustering and observing the differences in regression coefficients, SEs, and tests of hypotheses. To illustrate the differences, GEE models were fitted to each of the three binary outcomes at M4; each model included an indicator for intervention condition and the corresponding baseline measure (M1). The cluster adjustment was at the group level, formed from the delivery of the educational sessions in groups ranging in size from 2 to 14. There were a total of 70 groups included in this analysis. Because all outcomes were binary, the models were specified with a logit link and binomial error distribution. An exchangeable correlation structure was specified to account for the within-group correlation; however, the empirical (robust) variance estimates were reported. The unadjusted analysis is essentially a particular form of a generalized linear model with a logit link and binomial error. The results are displayed in Table 2.
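As a rough sketch of the two analyses being compared, the model can be written as follows. The notation is ours rather than the article's: i indexes groups, j and k index adolescents within a group, x_i is the intervention indicator, and \alpha is the exchangeable working correlation.

```latex
\[
\operatorname{logit}\Pr\bigl(Y^{M4}_{ij} = 1\bigr)
  = \beta_0 + \beta_1 Y^{M1}_{ij} + \beta_2 x_i ,
\qquad
\operatorname{Corr}\bigl(Y^{M4}_{ij}, Y^{M4}_{ik}\bigr) = \alpha \quad (j \neq k).
\]
```

Under this reading, the unadjusted analysis treats all observations as independent (\alpha = 0), whereas the adjusted analysis estimates \alpha within groups and bases the SE of \beta_2 on the empirical variance estimate.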
Table 1. Intraclass correlations and design effects with groups as the clustering variable for selected outcomes stratified by intervention condition and measurement period

                                              Measurement period (a)
Outcome measure              Intervention     M2              M3              M4
Susceptibility to smoking    Control          0.049 (1.37)    0.010 (1.07)    0.048 (1.28)
                             Experimental     0.0   (1.0)     0.006 (1.05)    0.039 (1.28)
30-day drinking              Control          0.017 (1.13)    0.017 (1.11)    0.022 (1.13)
                             Experimental     0.036 (1.31)    0.0   (1.0)     0.053 (1.39)
Susceptibility to drinking   Control          0.065 (1.49)    0.030 (1.20)    0.059 (1.35)
                             Experimental     0.004 (1.03)    0.011 (1.088)   0.0   (1.0)

Abbreviations: M2, post-assessment; M3, 1-year follow-up; M4, 2-year follow-up.
(a) Values are intraclass correlations with design effects in parentheses.

Table 2. Analysis at M4 only using generalized estimating equations to assess the impact of group clustering on the effect of intervention

                               Unadjusted for clustering       Adjusted for clustering
Outcome (experimental/control) Beta     SE      P value        Beta     SE      P value
Susceptibility to smoking      −0.337   0.228   .14            −0.323   0.256   .21
30-day drinking                 0.261   0.380   .49             0.240   0.428   .57
Susceptibility to drinking      0.348   0.203   .09             0.340   0.218   .12

Abbreviations: M4, 2-year follow-up; SE, standard error. All models included the corresponding baseline measure (not shown) and the intervention status variable. To adjust for clustering at the group level, generalized estimating equations were used.

In this instance, the results are consistent between the unadjusted and adjusted analyses. The beta estimates represent the log odds ratio of the event (30-day drinking or susceptibility to smoking or drinking) for the experimental condition relative to the control condition. The percent changes observed in the betas from unadjusted to adjusted ranged from 2.3% to 4.2% toward 0, and the SE estimates of the betas increased by 7.4% to 12.6%. Consequently, all of the P values were slightly higher after adjustment. In general, one would expect larger SEs after adjustment, which would result in more conservative tests. Similar results were observed at M2 and M3 and are not presented. Although the differences in this example were not large enough to change the interpretation substantively, the observed changes do suggest the need to evaluate the degree of clustering carefully.

Investigators are becoming more familiar with mixed effects and GEE models to account for clustering, but it is questionable how widespread the practice is. The incorporation of these approaches in well-known statistical packages, such as SAS, Stata, and SUDAAN, has helped to accelerate their use. However, because of the complexity of the methodology, there is a need to train data analysts in the appropriate use of these approaches.

2.2. Participant adherence

In most clinical trials and behavioral intervention studies, efforts are made to measure the extent to which the subject was exposed to the intervention. This exposure may take the form of pill counts in drug studies, attendance at group intervention activities, and other markers of participation. The quantitative measures of exposure may be as simple as a dichotomous indicator of achieving a satisfactory level of exposure or a continuous response that might be perceived as measuring a “dose” of the intervention. The exact means of characterizing this exposure vary considerably depending upon the nature of the study.

In clinical trials, there is evidence that patients who comply with taking medication, regardless of whether they are in the active treatment or placebo conditions, have better outcomes [16,17]. Those who are more compliant may have more positive expectations and may benefit more from any intervention. It is important to control for this effect. Assessing the relationship between compliance to medication and health outcomes only in the active treatment group may be misleading.
In behavioral intervention studies, it is often the case that, in spite of having an attention-control condition, the measure of compliance can be described only in the experimental condition and not in the control condition. To obtain the same measure in the control condition, the participants must have comparable exposure to non-experimental educational materials in a format identical to that used with experimental subjects. Such was the case in Sembrando Salud. This design serves to remove any effects caused by participant contact with study personnel. Therefore, we may examine the degree of exposure to the tobacco and alcohol program in the experimental condition, contrasted with the degree of exposure to first aid/home safety materials in the control condition, and its relationship to the three tobacco and alcohol outcome measures.

Comparable components of the intervention format included number of sessions attended by youth and parents, number of homework assignments, and number of boosters and newsletters received. To capture exposure to the eight group sessions and seven homework assignments by youth, a score ranging from 0 to 15 was constructed as the number of sessions attended plus the number of homework assignments completed. This equal weighting reflects the investigators’ belief in the equal importance of the group sessions and homework assignments to the intervention.

Table 3 displays descriptive statistics showing the rates of 30-day drinking and susceptibility to smoking and alcohol stratified by exposure to the intervention for control and experimental conditions at M2. A “dose-response” can be measured by assessing the differential rate of change in the outcome as compliance increases between the control and experimental conditions. Table 3 suggests that, for susceptibility to smoking and alcohol, there is a decrease in rates in the experimental condition as compliance increases compared with that of the control condition. The trend for 30-day drinking is less clear. A formal assessment of adherence can be carried out by evaluating the interaction between adherence and condition.
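The adherence score and the strata used for the descriptive rates can be sketched as follows. The function names are hypothetical; the 8-session and 7-homework limits and the four score bands are taken from the text and Table 3.

```python
def adherence_score(sessions_attended, homework_completed):
    """Equal-weight adherence score: sessions (0-8) plus homework (0-7)."""
    if not 0 <= sessions_attended <= 8:
        raise ValueError("sessions_attended must be between 0 and 8")
    if not 0 <= homework_completed <= 7:
        raise ValueError("homework_completed must be between 0 and 7")
    return sessions_attended + homework_completed


def adherence_stratum(score):
    """Map a 0-15 score to the four bands used in Table 3."""
    if not 0 <= score <= 15:
        raise ValueError("score must be between 0 and 15")
    if score <= 1:
        return "0-1"
    if score <= 8:
        return "2-8"
    if score <= 12:
        return "9-12"
    return "13-15"


def percent_positive(n_positive, n_total):
    """Percent positive within a stratum, rounded to one decimal place."""
    return round(100 * n_positive / n_total, 1)


score = adherence_score(sessions_attended=6, homework_completed=4)
print(score, adherence_stratum(score))
# Control condition, lowest stratum, susceptibility to smoking (22 of 66):
print(percent_positive(22, 66))
```

Keeping the score as a plain sum mirrors the equal weighting described above; a weighted score would change only `adherence_score`.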
Table 3. Percent susceptible to smoking and alcohol and 30-day drinking rates by degree of adherence to intervention as measured by number of intervention sessions and homework by youth stratified by intervention condition at M2

                               Percent positive (number positive/total) (b)
Outcome, adherence score (a)   Control           Experimental
Susceptibility to smoking
  0–1                          33.3 (22/66)      34.2 (27/79)
  2–8                          41.0 (25/61)      37.1 (36/97)
  9–12                         35.6 (21/59)      26.3 (26/99)
  13–15                        35.4 (34/96)      21.5 (17/79)
30-day drinking
  0–1                           4.6 (3/66)       12.7 (10/79)
  2–8                           6.6 (4/61)        9.3 (9/97)
  9–12                          5.1 (3/59)        3.0 (3/99)
  13–15                         6.3 (6/96)        8.8 (7/80)
Susceptibility to alcohol
  0–1                          34.9 (23/66)      40.3 (31/77)
  2–8                          45.0 (27/60)      41.2 (40/97)
  9–12                         25.4 (15/59)      24.7 (24/97)
  13–15                        32.3 (31/96)      19.2 (15/78)

Abbreviation: M2, post-assessment.
(a) Sum of intervention sessions + homework, youth only.
(b) Number positive per total sample.

For each outcome, a model is fitted that includes terms for the baseline (M1) value for the outcome, the condition indicator variable, the adherence measure (which is left as a continuous variable), and the adherence-by-condition interaction. The GEE method was used to account for the clustering effect of the group leaders. In Table 4, the first three columns display the results of this analysis. The interaction term is significant at the .05 level for susceptibility to smoking and drinking. The beta estimates indicate that there are decreasing rates in susceptibility among experimental subjects compared with the control subjects as compliance increases. This trend is not significant for 30-day drinking. The models were repeated adjusting for age, gender, and acculturation. The P values for the interaction terms were .046 for smoking susceptibility and .047 for alcohol susceptibility, indicating marginal significance.

Similar analyses were carried out at M3 and M4; however, the same trends were not apparent. None of the interaction terms was significant. The lack of significant findings at M3 and M4 might suggest the importance of extending the intervention over a longer period of time or the need for more intense booster measures.
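To make the interaction concrete, the coefficients in the first three columns of Table 4 can be turned into condition-specific adherence slopes. The sketch below assumes the intervention indicator is coded 1 for experimental and 0 for control (the table labels it “experimental/control”, but the coding is our assumption); the function and variable names are hypothetical.

```python
import math

# (adherence main effect, adherence-by-intervention interaction) at M2,
# taken from the "with attention-control" columns of Table 4.
TABLE4_WITH_CONTROL = {
    "susceptibility to smoking": (0.032, -0.077),
    "30-day drinking": (0.025, -0.081),
    "susceptibility to alcohol": (-0.004, -0.072),
}


def implied_slopes(main_effect, interaction):
    """Log-odds change per adherence point in each condition.

    Assumes indicator coding: control = 0, experimental = 1.
    """
    control_slope = main_effect
    experimental_slope = main_effect + interaction
    return control_slope, experimental_slope


for outcome, (main, inter) in TABLE4_WITH_CONTROL.items():
    control, experimental = implied_slopes(main, inter)
    print(
        f"{outcome}: control slope {control:+.3f}, "
        f"experimental slope {experimental:+.3f}, "
        f"odds ratio per point (experimental) {math.exp(experimental):.3f}"
    )
```

Under this coding, the implied experimental slope for smoking susceptibility is 0.032 − 0.077 = −0.045, which agrees with the adherence beta reported in Table 4 for the experimental condition analyzed alone. The interaction itself (−0.077) is the difference between the experimental and control slopes, which is the quantity the attention-control design makes estimable.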

It is often the case that compliance cannot be measured in the attention-control condition using the same marker as in the experimental condition because the exact format of intervention delivery and assessment was not followed. In that situation, an analysis of adherence can be carried out only in the experimental condition. Just as the overall assessment of the experimental effect may be biased without an attention-control condition, such an analysis of adherence can suffer from similar limitations. For comparison, Table 4 includes an analysis of adherence for each outcome using only the experimental condition. The two terms in the model are the adherence main effect variable and the baseline value for the outcome. The table indicates that the results may differ substantially from those of the analysis that includes attention-control information. Whereas the analysis of adherence for susceptibility to alcohol was only marginally significant with the attention-control group included (P = .034), it is highly significant without it (P = .003). For susceptibility to smoking, a marginally significant result (P = .036) is not significant when the attention-control group information is excluded (P = .084). The attention-control condition can account for study personnel influence and secular trends, neither of which can be distinguished from a potential relationship with compliance using only the experimental subjects.

The compliance analysis does not take the place of the overall analysis to assess the differences between the experimental and control conditions. It is meant to augment the main analysis and to provide insight into the possible influence of participant variability.

Table 4. Comparison of analyses for dose response to the adherence to the intervention with and without the attention-control condition evaluated at M2

                                          With attention-control       Without attention-control
Outcome and term                          Beta     SE      P value     Beta     SE      P value
Susceptibility to smoking
  Adherence main effect                    0.032   0.024   .19         −0.045   0.026   .084
  Experimental/control indicator           0.343   0.335   .31
  Adherence by intervention interaction   −0.077   0.037   .036
30-day drinking
  Adherence main effect                    0.025   0.040   .54         −0.056   0.045   .21
  Experimental/control indicator           0.982   0.564   .082
  Adherence by intervention interaction   −0.081   0.059   .17
Susceptibility to alcohol
  Adherence main effect                   −0.004   0.023   .88         −0.078   0.026   .003
  Experimental/control indicator           0.386   0.364   .29
  Adherence by intervention interaction   −0.072   0.034   .034

Abbreviations: M2, post-assessment; SE, standard error.

2.3. Assessment of missing data patterns

The compliance analysis examines the influence of subject participation during intervention delivery. However, it is assumed that we are still able to measure the outcome on these subjects regardless of their level of exposure to the education materials. In an intervention study where repeated measures are taken post-intervention, it is possible that subjects will not have observed outcomes on one or more of the measurement periods. Subjects may drop out of the study over time, or they may miss intervening measurements for various reasons. The result is different patterns of missing data, which may influence the treatment assessment. For example, the difference between control and experimental conditions may favor the experimental group among subjects with complete data, whereas the treatment effect may diminish as the number of missing visits increases or diminish when stratified into early or late patterns of missed visits. It is the potential dependence of the treatment effect on the missing data patterns that we wish to explore.

In Sembrando Salud, because there were three measurement periods after the intervention, eight possible patterns (2^3 = 8) of missing data could arise. The eight patterns along with the frequencies observed for susceptibility to smoking and alcohol are displayed in Table 5, letting M represent missing values and O represent observed values. For smoking susceptibility, complete data are available on 517 subjects. The patterns OOM and OMM have the two largest frequencies among the incomplete patterns, with 53 and 56 subjects, respectively. There are 13 subjects with missing data at all three post-intervention periods. Similar frequencies are seen for alcohol susceptibility.

The basic idea is to stratify the sample according to these patterns or simpler combinations of the patterns and to incorporate the strata as covariates into regression models to examine the effects of the missing-data patterns on the outcome measures. The patterns are represented by dummy variables in the model. Not all patterns can be represented in the regressions: The MMM pattern provides no data, and the frequencies for several others are too small. However, we can express other meaningful patterns of missing data by combining categories. For example, it is possible to construct a set of patterns based on none missing, one missing, or two missing represented by two dummy variables.
Or, more simply, a contrast of none missing versus at least one missing is represented by one dummy variable. In other situations, it may be more meaningful to focus on patterns representing a progressively missing process where all missing data are due to dropouts. For example, in the patterns OOO, OOM, and OMM, the first missing period implies that all subsequent periods are missing as well. Once the patterns are determined, the dummy variables are constructed and entered into the model as main effects and as interactions with other terms. This allows an assessment of the impact of the missing-data patterns on the outcome and whether the relationship between outcome and other effects varies by missing-data pattern (interaction).

Table 5. Missing data patterns for three post-intervention repeated measures and frequencies in Sembrando Salud

Pattern              Frequency: susceptibility to
M2    M3    M4       Smoking    Alcohol
O     O     O        517        516
O     O     M         53         52
O     M     O          7          4
O     M     M         56         54
M     O     O          7         11
M     O     M          3          4
M     M     O          1          1
M     M     M         13         15

Abbreviations: M, missing value; O, observed value.

To illustrate pattern-mixture modeling with Sembrando Salud, a simple dichotomous variable is constructed with complete data (OOO) coded as 1 and all other patterns coded as 0. From a practical perspective, other meaningful but more complex patterns yielded sparse frequencies and were not feasible to examine. Table 6 displays prevalence rates for susceptibility to smoking and alcohol broken down by whether or not at least one period is missing, intervention condition, and measurement period. Among those with no missing data, the rates of smoking susceptibility are lower in the experimental condition compared with the control condition; however, this trend is reversed among those with at least one missing period. The same trend is not evident with alcohol susceptibility.

Two classes of models (pattern-mixture models and selection models) have been described [18–20] for handling missing observations. Whereas selection models require detailed information about the missing data process, pattern-mixture models avoid this requirement. With pattern-mixture models, the population is stratified by the missing data patterns, and this information is incorporated into the modeling process. Pattern-mixture models may be applied to a variety of longitudinal models, such as mixed effects regression models and generalized estimating equations [21–23], as long as the approach allows for missing data over time. GEE is used here with the same setup for binary outcomes previously described.
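The pattern coding described above can be sketched in Python. The helper names are hypothetical, and representing a missing measurement as `None` is our assumption about the data layout; the frequencies are the smoking-susceptibility column of Table 5.

```python
from itertools import product

PERIODS = ("M2", "M3", "M4")

# All 2**3 = 8 patterns, in the same order as Table 5 (OOO first, MMM last).
ALL_PATTERNS = ["".join(p) for p in product("OM", repeat=len(PERIODS))]


def pattern(outcomes):
    """Build a pattern string such as 'OOM' from a period -> value mapping.

    A value of None (or an absent key) is treated as missing.
    """
    return "".join("O" if outcomes.get(p) is not None else "M" for p in PERIODS)


def complete_data_indicator(pat):
    """1 for complete data (OOO), 0 for every other pattern."""
    return 1 if pat == "OOO" else 0


def n_missing(pat):
    """Number of missing post-intervention periods."""
    return pat.count("M")


def is_monotone_dropout(pat):
    """True when every period after the first missing one is also missing."""
    return "MO" not in pat


# Smoking-susceptibility frequencies from Table 5.
FREQ_SMOKING = dict(zip(ALL_PATTERNS, (517, 53, 7, 56, 7, 3, 1, 13)))

usable_incomplete = sum(
    n for pat, n in FREQ_SMOKING.items() if pat not in ("OOO", "MMM")
)
print(ALL_PATTERNS)
print(pattern({"M2": 1, "M3": None, "M4": 0}))
print(FREQ_SMOKING["OOO"], usable_incomplete)
```

Collapsing to the single complete-data indicator is the simplest of the codings mentioned; `n_missing` and `is_monotone_dropout` correspond to the alternative groupings (none, one, or two missing; dropout-only patterns) that the sparse frequencies made impractical here.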
For smoking and alcohol susceptibility, the initial model consisted of the following terms: baseline susceptibility, time period (two dummy variables), intervention condition indicator, complete data indicator (yes/no), interaction of intervention with time (two dummy variables), interaction of intervention with complete data indicator, interaction of time with complete data (two dummy variables), and the three-way interaction of time by intervention by complete data (two dummy variables). Consequently, this model augments the basic model used to assess the intervention by including the complete data main effect and interactions with time, intervention, and time by intervention. Table 7 displays the initial fitted model for smoking susceptibility. The three-way interaction was not significant and was removed. In the next fitted model, the two-way interaction of compete data with time was not significant and was removed. However, the complete data main effect and the interaction between complete data and the intervention remained significant and were retained. This final model is also shown in Table 7. The parameter estimates reflect the trends seen in the prevalence rates in Table 6. Among participants with complete data, the experimental subjects showed lower prevalence rates of smoking susceptibility compared with the control subjects over time. However, among those with at least one missing period, there is a marked difference in rates, with the experimental subjects demonstrating higher


Table 6
Prevalence rates for pattern of at least one missing measurement period on susceptibility to smoking and alcohol stratified by intervention status and measurement period

                                                    Measurement period
Missing data pattern        Intervention status     M2                M3               M4
Susceptibility to smoking
  None missing              Control                 37.3 (82/220)a    34.1 (75/220)    23.2 (51/220)
  None missing              Experimental            26.9 (80/297)     27.6 (82/297)    16.8 (50/297)
  At least one missing      Control                 31.2 (19/61)      18.8 (6/32)      16.7 (1/6)
  At least one missing      Experimental            47.3 (26/55)      54.8 (17/31)     22.2 (2/9)
Susceptibility to alcohol
  None missing              Control                 32.9 (72/219)     32.0 (70/219)    25.6 (56/219)
  None missing              Experimental            31.7 (94/297)     32.7 (97/297)    31.7 (94/297)
  At least one missing      Control                 39.3 (24/61)      36.4 (12/33)     14.3 (1/7)
  At least one missing      Experimental            30.6 (15/49)      32.4 (11/34)     44.4 (4/9)

Abbreviations: M2, post-assessment; M3, 1-year follow-up; M4, 2-year follow-up.
a Number of subjects per total sample.

rates of smoking susceptibility compared with the control subjects. One explanation is that the reasons for missing measurement periods (largely attributed to dropping out of the study) varied between the control and experimental groups. The rates are similar for the control subjects across the missing data patterns, whereas rates for the experimental group varied (lower for none missing and higher for at least one missing). A plausible explanation may be that it is clear to the adolescents in the experimental condition that the program is trying to get them to decide not to smoke. If they do not see the benefits of abstaining and have peers or parents who smoke, then the message is inconsistent with their “intentions” (health beliefs, etc.), and they may be more likely to drop out. They also may be more likely to skip measurement because they know what the program wants to hear, and they are unable to say it. It has been observed that Latinos often give socially desirable responses when participating in surveys or research projects and may avoid reporting undesirable behaviors [24].

A similar analysis was carried out for alcohol susceptibility. An initial model was fitted using the same terms described above. In subsequent models, all of the three-way interaction terms and the two-way terms involving complete data and the main effect for complete data were eliminated. Consequently, the missing data pattern of complete data versus at least one missing had no effect on the assessment of treatment for susceptibility to alcohol (not shown).

Table 7
Pattern-mixture model to assess the impact of at least one missing measurement period on log-odds of susceptibility to smoking: initial and final models

Term                                        Beta estimate   Standard error   P value
Initial model
  Intercept                                 -1.57           0.28             <.001
  Baseline susceptibility                    1.68           0.14             <.001
  Time: M3/M2                               -0.68           0.60             .25
  Time: M4/M2                               -1.20           1.12             .29
  Intervention (experimental/control)        0.84           0.42             .04
  Complete data (yes/no)                     0.35           0.31             .26
  Intervention × M3/M2                       1.08           0.75             .15
  Intervention × M4/M2                      -0.18           1.46             .90
  Complete data × M3/M2                      0.52           0.62             .40
  Complete data × M4/M2                      0.41           1.14             .72
  Intervention × complete data              -1.31           0.46             .0043
  M3/M2 × intervention × complete data      -0.88           0.79             .26
  M4/M2 × intervention × complete data       0.28           1.49             .85
Final model
  Intercept                                 -1.70           0.24             <.001
  Baseline susceptibility                    1.68           0.14             <.001
  Time: M3/M2                               -0.23           0.17             .18
  Time: M4/M2                               -0.82           0.20             <.001
  Intervention (experimental/control)        1.03           0.35             .0031
  Complete data (yes/no)                     0.52           0.26             .043
  Intervention × M3/M2                       0.31           0.23             .17
  Intervention × M4/M2                       0.13           0.27             .62
  Intervention × complete data              -1.55           0.37             <.001

Abbreviations: M2, post-assessment; M3, 1-year follow-up; M4, 2-year follow-up.
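To make the interpretation of the final smoking model in Table 7 concrete, the sketch below turns its log-odds estimates into odds ratios for experimental versus control subjects within each missing-data stratum. The coefficients are copied from Table 7; the 0/1 coding (1 = experimental, 1 = complete data, M2 as the reference period) is an assumption inferred from the table labels, and the function names are hypothetical.

```python
import math

# Final-model estimates for smoking susceptibility, copied from Table 7 (log-odds scale).
FINAL_MODEL = {
    "intercept": -1.70,
    "baseline": 1.68,
    "time_M3": -0.23,
    "time_M4": -0.82,
    "intervention": 1.03,
    "complete": 0.52,
    "intervention_x_M3": 0.31,
    "intervention_x_M4": 0.13,
    "intervention_x_complete": -1.55,
}


def log_odds(baseline: int, period: str, experimental: int, complete: int) -> float:
    """Linear predictor of the final model; indicators are 0/1 (assumed coding)."""
    b = FINAL_MODEL
    m3 = 1 if period == "M3" else 0
    m4 = 1 if period == "M4" else 0
    return (
        b["intercept"]
        + b["baseline"] * baseline
        + b["time_M3"] * m3
        + b["time_M4"] * m4
        + b["intervention"] * experimental
        + b["complete"] * complete
        + b["intervention_x_M3"] * experimental * m3
        + b["intervention_x_M4"] * experimental * m4
        + b["intervention_x_complete"] * experimental * complete
    )


def intervention_odds_ratio(period: str, complete: int) -> float:
    """Odds ratio, experimental vs control, within one missing-data stratum."""
    diff = log_odds(0, period, 1, complete) - log_odds(0, period, 0, complete)
    return math.exp(diff)


for complete in (1, 0):
    for period in ("M2", "M3", "M4"):
        print(complete, period, round(intervention_odds_ratio(period, complete), 2))
```

Under these assumptions the odds ratio at M2 is about 0.59 for participants with complete data and about 2.8 for those with at least one missing period, which mirrors the reversal seen in the Table 6 prevalence rates.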

3. Conclusion

This article illustrates some methodologic issues that are common in analyzing randomized behavioral intervention studies. The general type of study described here is a cluster randomization trial of an experimental condition compared with an attention-control in which the two conditions follow a similar intervention format with compliance assessment, and repeated measures are taken during the post-intervention period. The intent is to subject such a study to more rigorous analysis adhering to the design of the study and taking advantage of its unique characteristics. Hence, these issues should be understood within the context of this randomized study design. There may be other circumstances in behavioral and community studies where randomization may not be appropriate and where quasi-experimental designs involving multiple baselines or interrupted time series may be a better choice. It is not our intent to suggest the issues described here would necessarily apply to other designs.


Although the appropriate power considerations and analyses for cluster-randomized trials are well documented, there is a need to alert data analysts to these issues. As summarized by Donner and Klar [6], recent reviews of the public health and medical literature suggest that these methods are not widely disseminated or routinely used. Authors are beginning to report intraclass correlations as part of the main trial publication or in follow-up articles. However, given the variety of cluster types and outcomes, it is difficult to find critical information on estimates of the degree of clustering that is needed for power calculations.

We discuss a compliance analysis to illustrate the importance of measuring “dosage” in the experimental and attention-control groups. This feature of the design allows for removal of bias due to participant contact with study personnel from the adherence analysis through an evaluation of the adherence by treatment condition interaction. This analysis is a considerable improvement over those that rely only on compliance information from the experimental condition. However, this analytical approach requires careful consideration at the design stage of crafting the attention-control condition not only to control for the amount of attention all participants receive, but also to allow for a compliance measurement common to both conditions.

In spite of our best efforts to minimize attrition and missed visits, some loss of information over time is inevitable. Pattern-mixture modeling is a useful technique to explore the impact of missing data in longitudinal studies. For intervention trials where repeated measures are taken post-intervention, one may examine the influence of missing-data patterns on the outcome and the extent to which the patterns may modify the treatment effect.
Methods for analyzing longitudinal data make assumptions about the missing data mechanism, and these assumptions may be explored to some extent by using pattern-mixture modeling to augment the primary analyses. Missing data mechanisms are often described as one of three types: missing completely at random (MCAR), where the missingness does not depend on the observed or unobserved responses; missing at random (MAR), in which missingness depends on observed data but not on the unobserved response; or non-ignorable missing, where the mechanism depends on the unobserved response. Park and Lee [25] describe the use of pattern-mixture models with GEE and construct a test to examine the MCAR assumption in GEE. Our simple example, using only two patterns, was meant to illustrate the use of this technique and to bring it to the attention of a wider audience.

As more sophisticated methods of design and analysis become available for randomized behavioral intervention studies, a more in-depth examination of experimental effects can be carried out, and analyses can be constructed that adhere to the study design and examine underlying assumptions of the statistical models. This article attempts to bring some methods forward to an audience that can benefit from these advances.
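The three missing-data mechanisms named in the Conclusion (missing completely at random, missing at random, and non-ignorable) can be illustrated with a small simulation. This is a sketch under stated assumptions, not an analysis of the trial data: the covariate, the outcome probabilities, the missingness probabilities, and the function names are all hypothetical.

```python
import random


def simulate(mechanism: str, n: int = 100_000, seed: int = 1):
    """Return (x, y, observed) tuples for n hypothetical subjects."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random() < 0.5                   # fully observed binary covariate
        y = rng.random() < (0.4 if x else 0.2)   # binary outcome, more likely when x is true
        if mechanism == "MCAR":
            p_miss = 0.3                         # ignores both x and y
        elif mechanism == "MAR":
            p_miss = 0.5 if x else 0.1           # depends only on the observed covariate
        elif mechanism == "NONIGNORABLE":
            p_miss = 0.5 if y else 0.1           # depends on the outcome itself
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        rows.append((x, y, rng.random() >= p_miss))
    return rows


def prevalence(rows, observed_only: bool, x_value=None) -> float:
    """Outcome prevalence, optionally among observed subjects and within one x stratum."""
    ys = [y for x, y, obs in rows
          if (obs or not observed_only) and (x_value is None or x == x_value)]
    return sum(ys) / len(ys)


for mech in ("MCAR", "MAR", "NONIGNORABLE"):
    rows = simulate(mech)
    print(mech,
          round(prevalence(rows, False), 3),        # everyone
          round(prevalence(rows, True), 3),         # observed subjects only
          round(prevalence(rows, False, True), 3),  # everyone with x true
          round(prevalence(rows, True, True), 3))   # observed subjects with x true
```

In this setup the observed-only prevalence tracks the full-sample prevalence under MCAR, does so under MAR only after conditioning on the covariate, and is off in both comparisons when missingness depends on the outcome itself.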


Acknowledgment

This research was supported by the National Cancer Institute (grant RO1 CA58858).

References

[1] Pierce JP, Farkas AJ, Evans N, et al. An improved surveillance measure for adolescent smoking? Tob Control 1995;4(Suppl 1):S47–56.
[2] Pierce JP, Choi WS, Gilpin EA, et al. Validation of susceptibility as a predictor of which adolescents take up smoking in the United States. Health Psychol 1996;15:355–61.
[3] Liang KY, Zeger SL. Longitudinal data analysis using generalized linear models. Biometrika 1986;74:13–22.
[4] Elder JP, Campbell NR, Litrownik AJ, et al. Predictors of cigarette and alcohol susceptibility and use among Hispanic migrant adolescents. Prev Med 2000;31:115–23.
[5] Litrownik AJ, Elder JP, Campbell NR, et al. Evaluation of a tobacco and alcohol use prevention program for Hispanic migrant adolescents: promoting the protective factor of parent-child communication. Prev Med 2000;31:124–33.
[6] Donner A, Klar N. Design and analysis of cluster randomization trials in health research. London: Arnold; 2000.
[7] Murray DM. Design and analysis of group-randomized trials. New York: Oxford University Press; 1998.
[8] Donner A, Birkett N, Buck C. Randomization by cluster: sample size requirements and analysis. Am J Epidemiol 1981;114:906–14.
[9] Slymen DJ, Hovell MF. Cluster versus individual randomization in adolescent tobacco and alcohol studies: illustrations for design decisions. Int J Epidemiol 1997;26:767–71.
[10] Gulliford MC, Ukoumunne OC, Chinn S. Components of variance and intraclass correlations for the design of community-based surveys and intervention studies: data from the Health Survey for England 1994. Am J Epidemiol 1999;149:876–83.
[11] Murray DM, Rooney BL, Hannan PJ, et al. Intraclass correlation among common measures of adolescent smoking: estimates, correlates, and applications in smoking prevention studies. Am J Epidemiol 1994;140:1038–50.
[12] Siddiqui O, Hedeker D, Flay BR, et al. Intraclass correlation estimates in a school-based smoking prevention study: outcome and mediating variables, by sex and ethnicity. Am J Epidemiol 1996;144:425–33.
[13] Hedeker D, Gibbons RD, Flay BR. Random-effects regression models for clustered data with an example from smoking prevention research. J Consult Clin Psychol 1994;62:757–65.
[14] Norton EC, Bieler GS, Ennett ST, et al. Analysis of prevention program effectiveness with clustered data using generalized estimating equations. J Consult Clin Psychol 1996;64:919–26.
[15] Diggle PJ, Liang KY, Zeger SL. Analysis of longitudinal data. Oxford: Oxford University Press, Clarendon Press; 1994.
[16] Kaplan RM, Ries AL. Adherence in the patient with pulmonary disease. In: Hodgkin K, Connors GL, Bell CW, editors. Pulmonary rehabilitation: guidelines to success. Philadelphia: Lippincott; 2000. p. 347–61.
[17] Epstein LH. The direct effects of compliance upon outcome. Health Psychol 1984;3:385–93.
[18] Little RJA. Pattern-mixture models for multivariate incomplete data. J Am Stat Assoc 1993;88:125–34.
[19] Little RJA. A class of pattern-mixture models for normal incomplete data. Biometrika 1994;81:471–83.
[20] Little RJA. Modeling the drop-out mechanism in repeated-measure studies. J Am Stat Assoc 1995;90:1112–21.
[21] Hedeker D, Gibbons RD. Application of random-effects pattern-mixture models for missing data in longitudinal studies. Psychol Methods 1997;2:64–78.


[22] Park T, Lee SY. Simple pattern-mixture models for longitudinal data with missing observations: analysis of urinary incontinence data. Stat Med 1999;18:2933–41.
[23] Verbeke G, Molenberghs G. Linear mixed models for longitudinal data. New York: Springer-Verlag; 2000.
[24] Marin G, Marin B. Research with Hispanic populations. Newbury Park (CA): Sage; 1991. p. 105–7.
[25] Park T, Lee SY. A test of missing completely at random for longitudinal data with missing observations. Stat Med 1997;16:1859–71.