Controlled Clinical Trials 24 (2003) 324–340

Constructing a database of individual clinical trials for longitudinal analysis

Christopher H. Schmid, Ph.D.,a,* Marcia Landa, M.S.,a Tazeen H. Jafar, M.D., M.P.H.,b Ioannis Giatras, M.D.,c Tauqeer Karim, M.D.,d Manoj Reddy, M.D.,d Paul C. Stark, Sc.D.,a Andrew S. Levey, M.D.,d for the Angiotensin-Converting Enzyme Inhibition in Progressive Renal Disease (AIPRD) Study Group

a Biostatistics Research Center, Division of Clinical Care Research, Department of Medicine, New England Medical Center, Boston, Massachusetts, USA
b Department of Medicine, The Aga Khan University, Karachi, Pakistan
c Hygeia Hospital, Athens, Greece
d Division of Nephrology, Department of Medicine, New England Medical Center, Boston, Massachusetts, USA

Manuscript received April 1, 2002; manuscript accepted October 28, 2002

Abstract

Individual patient data are often required to evaluate how patient-specific factors modify treatment effects. We describe our experience combining individual patient data from 1946 subjects in 11 randomized controlled trials evaluating the effect of angiotensin-converting enzyme (ACE) inhibitors for treating nondiabetic renal disease. We sought to confirm the results of our meta-analysis of group data on the efficacy of ACE inhibitors in slowing the progression of renal disease, as well as to determine whether any study or patient characteristics modified the beneficial effects of treatment. In particular, we wanted to find out if the mechanism of action of ACE inhibitors could be explained by adjusting for follow-up blood pressure and urine protein. Each trial site sent a database of multiple files and multiple records per patient containing longitudinal data of demographic, clinical, and medication variables to the data coordinating center. The databases were constructed in several different languages using different software packages with unique file formats and variable names. Over 4 years, we converted the data into a standardized database of more than 60,000 records. We overcame a variety of problems including inconsistent protocols for measurement of key variables; varying definitions of the baseline time; varying follow-up times and intervals; differing medication-reporting protocols; missing variables; incomplete, missing, and implausible data values; and concealment of key data in text fields. We discovered that it was easier and more informative to request computerized data files and merge them ourselves than to ask the investigators to abstract partial data from their files. Although combining longitudinal data from different trials based on different protocols in different languages is complex, costly, and time-intensive, analyses based on individual patient data are extremely informative. Funding agencies must be encouraged to provide support to collaborative groups combining databases. © 2003 Elsevier Science Inc. All rights reserved.

Keywords: ACE inhibitors; Combining information; Individual patient meta-analysis; Heterogeneity; Meta-analysis; Meta-regression; Renal disease

* Corresponding author: Christopher H. Schmid, Ph.D., Biostatistics Research Center, New England Medical Center Box 63, 750 Washington St., Boston, MA 02111. Tel.: 1-617-636-5179; fax: 1-617-636-5560. E-mail address: [email protected]

0197-2456/03/$—see front matter © 2003 by Elsevier Inc. All rights reserved. doi:10.1016/S0197-2456(02)00319-7

Introduction

Randomized clinical trials (RCTs) have become the gold standard for evidence in clinical medicine. This eminence has spawned a proliferation of studies in industry and academia seeking to “prove” that a treatment, technique, or device is superior to or at least the equivalent of a competitor. Inevitably, trial designs overlap and, not infrequently, trial results conflict. Although these apparent contradictions can often be attributed to an insufficient size of some or all of the studies, quite often results are truly heterogeneous and require explanation.

When heterogeneity among studies arises from different patient characteristics, treatment formulations, or experimental conditions, it suggests that all treatments do not work for all patients to the same degree. Discovering how treatment efficacy and safety vary across populations and treatment protocols obviously has important implications from the patient’s perspective. It also affects the provider who needs to make treatment decisions for individual patients, the payer who needs to determine how best to allocate scarce financial resources, and the policy expert who must recommend appropriate guidelines.

The most common statistical approach for investigating and evaluating heterogeneity is regression analysis, testing for interactions between treatment and other factors. When only summary data are available from each study, meta-regression of study treatment effects (e.g., log odds ratios) on study characteristics or factor summary statistics, such as means, is the standard choice [1,2]. Because the unit of analysis is the study, meta-regression can usefully detect interactions with factors that apply uniformly to all subjects in the study (e.g., blinding or use of placebo) but has little power to estimate interactions with factors that vary at the subject level (e.g., age or blood pressure) [3]. Meta-regression may be useful, though, when studies sample distinct subpopulations. In studying the relationship between mortality and time to treatment with a thrombolytic agent for patients having an acute myocardial infarction [1,4], several trials were explicitly designed to test treatment efficacy among patients presenting long after chest pain onset. These studies found a smaller benefit than did those studies in which patients presented earlier. Nevertheless, to detect interactions with subject-level factors, it will often be necessary to use the records of individual patients.

Meta-analyses combining individual patient data have been infrequent because of their extremely intensive use of resources and need for obtaining detailed information that may no longer be or may never have been available. Descriptions of the technique have noted its required heavy investment of time, labor, and money and the need for an experienced, committed team of researchers [5,6].
Nevertheless, the ability to perform patient-level and subgroup analyses may lead to patient-specific clinical recommendations. Additional benefits accrue from the potential to find and correct inconsistencies noted in the data summaries, fill in missing values, obtain additional follow-up and unpublished data, and verify the adequacy of study quality and analysis [5]. Although individual-patient analyses have increasingly appeared in the clinical literature, aided by such collaborative groups as the Early Breast Cancer Trialists Collaborative Group [7] and the Cochrane Collaboration [8], most have sought only to improve treatment comparisons by obtaining updated event times from which to construct survival curves [9,10]. Few have attempted to develop multivariate regression models to better understand treatment efficacy, particularly using follow-up data. As described by Selker et al. [6,11], however, such modeling can not only bring more power to investigate treatment effects in subpopulations, but can also extend the generalizability and robustness of treatment recommendations by incorporating a broad spectrum of patients. Olkin has labeled meta-analysis with original data as the highest level of evidence [12].

Combining longitudinal data poses special problems. In any longitudinal study, accurate recording of dates is essential. This task is complicated when data from different studies are being combined retrospectively. Recording may be inconsistent across studies, information may be incomplete, and investigators may be unable to resolve discrepancies that occurred some time ago. Although some trialists and meta-analysts advocate large simple trials in which only minimal patient-level data are collected to avoid some of these problems [13], this approach prevents the construction of multivariate models that may uncover important treatment mechanisms of action [11,14,15].

In this paper, we present our experience with assembling longitudinal data on individual patients collected from different clinical trials into a single database. Regression analyses of a variety of outcomes are intended to expand on a meta-analysis of the literature based on summary data, further investigating the mechanisms by which the treatment acts successfully. We discuss the objectives of the study, the formation of the collaborative group, and the mechanics of database construction before concluding with some recommendations.

Angiotensin-Converting Enzyme Inhibition in Progressive Renal Disease (AIPRD) Study

End-stage renal disease (ESRD) affects more than 340,000 Americans and costs more than 15 billion dollars each year in dialysis, renal transplantation, Medicare-approved drug treatment, and home health care, in addition to costs to other providers and many nonmonetary costs to patients and caregivers [16]. The prevalence of earlier stages of renal disease is much higher [17]. Angiotensin-converting enzyme (ACE) inhibitors have been proven highly effective in slowing the progression of renal disease due to diabetes [18]. Evidence of efficacy in nondiabetic renal disease has been more problematic, as the completed clinical trials exhibited substantial heterogeneity of effects. We showed in a meta-analysis of 11 RCTs published in 1997, however, that treatment regimens including an ACE inhibitor reduced the risk of development of ESRD by 40% in nondiabetic renal disease compared with regimens that did not include an ACE inhibitor [19].

Despite the importance of the finding of this large effect, several questions remained unanswerable from this analysis of summary data from the literature. Our analysis had focused
on the number of ESRD events in each trial over the course of patient follow-up, which varied between 1 and 4 years across the trials. But other than counting events, we had made no use of the extensive data collected in each trial at regular intervals during follow-up. We hoped to be able to answer these questions by analysis of the individual patient data from the clinical trials.

Chronic renal disease is characterized by hypertension, proteinuria, and a progressive decline in glomerular filtration rate (GFR), manifested by a progressive rise in serum creatinine and onset of signs and symptoms of ESRD. ACE inhibitors lower blood pressure and urine protein excretion, which may favorably affect the progression of renal disease [20,21]. However, it is not known whether the effectiveness of ACE inhibitors in slowing the development of ESRD derives solely from these two consequences or whether they have an additional effect through another mechanism of action. To answer this question, we proposed to adjust the ACE inhibitor effect for changes in follow-up levels of blood pressure and urine protein. Ultimately, we hoped to be able to recommend optimal levels of blood pressure and urine protein excretion for managing patients.

Second, we sought to answer questions related to administration of the medications. Do ACE inhibitors work equally well for all patients with nondiabetic renal disease, or are there treatment interactions that would indicate differential efficacy? Although doses and concomitant medications were not randomized, and thus their differential efficacy could be confounded by aspects of study design or execution, we also hoped to gain some information about appropriate dosage and the effect of concomitant medications on ACE inhibitor response.

Third, we wanted to explore several methodologic issues. What tasks and how much effort would be required to combine longitudinal data from studies stored in different formats and languages? How would results from analysis of the individual patient data compare with results based on analysis of the summary data for detecting interactions? In this paper, we address the first of these methodologic issues and describe our experience with combining databases. We have discussed several of the clinical issues in recent publications [14,15] and are currently preparing manuscripts that will discuss the other issues.

Development of AIPRD Study Group collaborative infrastructure

The initial meta-analysis described in detail procedures for searching the literature to find published or unpublished English-language RCTs on the effect of ACE inhibitors for treating renal disease in mainly nondiabetic samples [19]. The authors contacted the principal investigators of each of the 14 trials that met the selection criteria and invited them to form a collaborative group that would assemble all the individual patient data into a single database for future analyses by the group. Investigators were assured that their data would not be shared with other members of the study group or with investigators outside the study group without their consent. They were also informed of the study group’s publication policy. Coinvestigators from the original studies were also invited to participate in the collaboration.

Eleven principal investigators agreed to participate, forming the ACE Inhibition in Progressive Renal Disease (AIPRD) Study Group in 1996 [22–31; B.M. Brenner, personal communication; R. Toto, personal communication]. These 11 trials represent 95% of the 2057 participants enrolled in the 14 studies identified at that time. The investigators at the other three studies did not respond to requests to supply data. All 11 studies required hypertension or decreased renal function for patient entry and each had common exclusion criteria [14]. The similar inclusion and exclusion criteria, together with the randomization to treatment, eliminated many of the comparability issues that can plague the combination of data from observational studies [6].

The AIPRD Study Group includes 25 members from Australia, Denmark, France, Italy, The Netherlands, Sweden, and the United States. The data coordinating center, located at New England Medical Center in Boston under the direction of a nephrologist (A.S.L.), has in the past 5 years comprised the efforts of two statisticians (C.H.S., P.C.S.), a programmer (M.L.), and four clinical fellows (T.H.J., I.G., T.K., M.R.).

Authorship policy

As has been discussed elsewhere, authorship can be a contentious issue in meta-analysis [5]. Trialists are extremely protective of their data. Having invested immense effort in collecting their data, they have no wish for others to receive the credit for subsequent new discoveries. Some will refuse to part with their data even after publication of results, believing it unfair for an outside team to gain recognition from use of their data.

We set up a policy of selecting writing committees for each proposed manuscript. Any member of the collaboration may propose a manuscript topic, which is then reviewed by an executive committee of three members. If approved, the topic is communicated to the general membership and volunteers for the writing committee are solicited. The analysis is usually performed at the data coordinating center where the database resides, but portions of the database may be sent to other study group members if they have proposed the analysis and wish to take the lead on it. Once analyses are completed and manuscripts drafted, members of the writing committee comment and suggest revisions as appropriate. All manuscripts cite the AIPRD Study Group in authorship along with the names of members of the writing committees. To date, the group has published three manuscripts [14,15,19] and is preparing six others for publication.

Collaborators meetings

The AIPRD Study Group has met three times for collaborators meetings at the American Society of Nephrology (ASN) annual meetings in 1998, 1999, and 2000. The common forum at the ASN has not only served to gather participants but has reduced costs, since investigators have not required additional funds to attend. These meetings have provided a means for the members of the clinical trials groups to meet the coordinating center staff as well as to learn about and discuss progress on the database, analyses undertaken, and future work. At each ASN meeting, members of the coordinating center have also presented results orally and in poster form. Thus, in addition to attending the collaborators meeting, members have been able to participate in the dissemination of their work to colleagues.


Before each meeting, the coordinating center staff have prepared and mailed to study group members data books that contain large numbers of tables describing the data and analyses carried out. Data books are also sent as supporting material to writing committee members along with manuscript drafts. This material has helped members reconnect to the project when their input is needed. We have found collaborators meetings to be invaluable for keeping the project on track, answering questions, resolving data queries, and making decisions about data and analytic issues.

Defining a database structure

Upon agreement to compile the databases in 1996, each trial site sent to the data coordinating center original patient records and supporting documentation containing longitudinal data on demographic, clinical, and medication variables. Each database contained multiple files and multiple records per patient, including formatting codes, each with unique file formats and variable names, in several different software packages and languages (Table 1). Construction of the final AIPRD study database was completed 4 years later in 2000.

Assembling individual study data

Before describing the structure that we used in setting up the AIPRD database, it will be useful to review the formats of the component databases from the individual trials. To expedite standardization, we originally requested that the groups extract specific information including demographics, cause and duration of renal disease, coexisting conditions, previous medications, dietary sodium and protein intake, and various renal disease outcomes, side effects, fatal and nonfatal events, withdrawals, and longitudinal follow-up. The longitudinal measurements, consisting of blood pressure, serum creatinine concentration, creatinine clearance, GFR, urine protein excretion, urine albumin excretion, and antihypertensive medications, were to be reported every 6 months for the first year and every year thereafter. Several groups did return data in this format; others provided their complete files including all follow-up visits and dates. We decided to retain the additional information. Consequently, for some studies we have exact dates of physiologic measurements and medication changes, whereas for others we have knowledge of these values at prespecified times but lack precise information on times of measurement and medication changes.

Table 1 records the variety of different database formats, languages, and documentation received. This mix required substantial initial effort to untangle, especially because the varying clarity and completeness of the documentation required communication with the study data management teams. Because of the multinational nature of the 11 studies, measurement units and medications in several needed to be translated into English. This required use of foreign-language medical dictionaries, international medication compendia and databases, and consultation with staff pharmacists and multilingual associates.

Mindful of the additional statistical power to be gained from longer follow-up, we also enquired of each study team whether additional follow-up could be gathered. Each team informed us that resources for such an effort were not available. Some teams had disbanded and no longer maintained contact with study subjects; others with more recent information had no plans to follow subjects longer than specified in their protocols. Because the endpoints we sought were not available for all studies in public databases, we reluctantly abandoned any efforts to extend follow-up. This would appear to be the common fate of most attempts to update outcomes other than mortality in meta-analyses.

Table 1. Characteristics of AIPRD studies

Study 1 [22]: published 1992; Italy; Italian; 121 patients; 1784 visits; DBase format; 3 years planned follow-up; blinding: no; advice on dietary protein restriction: yes; ACEI: captopril 12.5–50 mg/day; control medication: nifedipine; concomitant medications: D, F.

Study 2 [23]: published 1992; Denmark; English; 70 patients; 320 visits; paper format; 2 years planned follow-up; blinding: no; advice on dietary protein restriction: yes; ACEI: enalapril 2.5 mg/day; control medication: NS; concomitant medications: B, C, D, G.

Study 3 (a): published 1993; United States; English; 112 patients; 2685 visits; SAS format; 3 years planned follow-up; blinding: yes; advice on dietary protein restriction: yes; ACEI: enalapril 5–40 mg/day; control medication: placebo; concomitant medications: B, D, E, F.

Study 4 (b): published 1993; United States; English; 124 patients; 4131 visits; SAS format; 3 years planned follow-up; blinding: yes; advice on dietary protein restriction: yes; ACEI: enalapril 5–40 mg/day; control medication: placebo; concomitant medications: B, D, E, F, G.

Study 5 [24]: published 1994; Holland; English; 103 patients; 585 visits; Excel format; 4 years planned follow-up; blinding: yes; advice on dietary protein restriction: yes; ACEI: enalapril 10–40 mg/day; control medication: A/A; concomitant medications: C, D.

Study 6 [25]: published 1994; France; French; 100 patients; 917 visits; Excel format; 3 years planned follow-up; blinding: no; advice on dietary protein restriction: no; ACEI: enalapril 5–10 mg/day; control medication: A/A; concomitant medications: C, D, F.

Study 7 [26]: published 1994; Australia; English; 51 patients; 141 visits; paper format; 1 year planned follow-up; blinding: yes; advice on dietary protein restriction: yes; ACEI: enalapril 5–20 mg/day; control medication: nifedipine; concomitant medications: B, D, E.

Study 8 [27]: published 1995; Sweden; Swedish; 260 patients; 2143 visits; SAS format; 2 years planned follow-up; blinding: no; advice on dietary protein restriction: no; ACEI: cilazapril 2.5–5 mg/day; control medication: A/A; concomitant medications: B, C, D.

Study 9 [28]: published 1996; Australia; English; 70 patients; 660 visits; SAS format; 2 years planned follow-up; blinding: yes; advice on dietary protein restriction: yes; ACEI: enalapril 5 mg/day; control medication: placebo; concomitant medications: B, C, D, E, F.

Study 10 [29]: published 1996; Italy; Italian, French, German; 583 patients; 8223 visits; SAS format; 3 years planned follow-up; blinding: yes; advice on dietary protein restriction: yes; ACEI: benazepril 10 mg/day; control medication: placebo; concomitant medications: B, C, D, E, F, G.

Study 11 [30,31]: published 1999; Italy; English; 323 patients; 3819 visits; text format; 3 years planned follow-up; blinding: yes; advice on dietary protein restriction: yes; ACEI: ramipril 1.25–5 mg/day; control medication: placebo; concomitant medications: B, C, D, E, F, G.

a Personal communication (B.M. Brenner). b Personal communication (R. Toto).
ACEI = ACE inhibitor; A/A = atenolol/acebutolol; B = beta-adrenergic blockers; C = calcium channel blockers; D = diuretics; E = peripheral alpha-adrenergic blockers; F = central alpha-adrenergic agonists; G = vasodilators.


Defining a centralized data structure

We decided to set up the AIPRD database in a relational structure that was easy to query and use for data entry. Because some variables consisted of longitudinal measurements with multiple records per patient while others were single values, we constructed the database as a set of five tables (“Study,” “Patients,” “Outcomes,” “Visits,” and “Medications”) with accompanying forms suitable for data entry for those studies supplying paper forms. Each of these tables is described more completely in the following sections. Briefly, the Study table consisted of one record per study containing information pertinent to all the patients in the study. The Patients and Outcomes tables contained one record of multiple variables for each patient, and the Visits and Medications tables consisted of multiple records per patient, each containing variables relating to a single visit or medication prescription. In each form, validity checks were set up to minimize potential data entry errors. These checks included triggers to alert the data entry operator about out-of-range entries and checkboxes for categorical variables that would allow only one entry per set of responses.

Fig. 1 depicts the process involved in constructing this database from the raw data files. Because the coordinating center processed databases as they were received, implementation was not as smooth as the figure might indicate. Several redesigns of the final data structure were required in order to accommodate data features new to the latest incoming study. For example, the decision to switch away from the structure we had initially requested required us not only to redesign the database but also to reenter the data from the paper forms. Altogether, designing the new database, entering the data from the original submissions, and checking the data required approximately 2–3 person-years of effort, an amount substantially greater than that estimated previously by Stewart and Clarke [5].

Fig. 1. Flow chart for construction of database. Rectangles indicate tables or files used and ovals indicate programs or processes through which these files were converted in order to create the final database. ACE = angiotensin-converting enzyme; ESRD = end-stage renal disease; GFR = glomerular filtration rate.
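
As a concrete illustration of this five-table layout, the sketch below creates a minimal relational schema in which the study and patient numbers serve as the linking keys, together with a simple out-of-range check analogous to the data entry triggers described above. The column names, value ranges, and the use of SQLite are assumptions made for illustration; the AIPRD database itself was not built with this code.

```python
import sqlite3

# Minimal sketch of the five-table AIPRD layout described above.
# Table and column names are illustrative, not the originals.
ddl = """
CREATE TABLE study (
    study_id        INTEGER PRIMARY KEY,
    ace_inhibitor   TEXT,
    control_type    TEXT,      -- placebo or other non-ACE inhibitor
    blinded         INTEGER,   -- 0/1
    planned_years   REAL
);
CREATE TABLE patients (
    study_id        INTEGER REFERENCES study(study_id),
    patient_id      INTEGER,
    treatment_group TEXT,
    age             REAL,
    sex             TEXT,
    PRIMARY KEY (study_id, patient_id)
);
CREATE TABLE outcomes (
    study_id        INTEGER,
    patient_id      INTEGER,
    begin_date      TEXT,
    end_date        TEXT,
    esrd_date       TEXT,
    death_date      TEXT,
    PRIMARY KEY (study_id, patient_id),
    FOREIGN KEY (study_id, patient_id) REFERENCES patients(study_id, patient_id)
);
CREATE TABLE visits (
    study_id        INTEGER,
    patient_id      INTEGER,
    visit_date      TEXT,
    systolic_bp     REAL,
    serum_creatinine REAL,
    urine_protein   REAL,
    FOREIGN KEY (study_id, patient_id) REFERENCES patients(study_id, patient_id)
);
CREATE TABLE medications (
    study_id        INTEGER,
    patient_id      INTEGER,
    drug_name       TEXT,
    drug_class      TEXT,      -- e.g., ACE inhibitor, beta-adrenergic blocker
    dose_mg_per_day REAL,
    start_date      TEXT,
    stop_date       TEXT,
    FOREIGN KEY (study_id, patient_id) REFERENCES patients(study_id, patient_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(ddl)

# A simple out-of-range check in the spirit of the data entry triggers described above.
def systolic_in_range(value: float) -> bool:
    return 60 <= value <= 260
```

Keeping the patient and study numbers as a compound key in every table is what makes the longitudinal Visits and Medications records easy to join back to the single-record Patients and Outcomes tables.
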

Constructing database tables

Study table

The Study table consists of one record for each of the 11 studies detailing information specific to the study that might have an effect on its outcomes. Included are protocol features such as the type of control (placebo or other non-ACE inhibitor) and ACE inhibitor used, whether or not the study was blinded and randomized, year of publication, number of years of planned follow-up, and presence of dietary protein or sodium restrictions for participants.

Patients table

The final Patients table consists of 1946 records, one for each patient in each of the 11 studies. This includes 66 patients with diabetes. In most analyses, we concentrated on the 1860 nondiabetics with complete baseline information, but for this discussion we describe the entire database. Each patient is uniquely identified by his or her study number and patient number within study. The patient and study numbers are the links used to connect the database tables that hold the different types of data. In addition to the treatment group, the Patients table lists a variety of demographic (age, sex, race, height, and weight), disease history (renal biopsy, cause and duration of renal disease, and seven coexisting conditions), and previous medication variables. Cause of renal disease is set up so that only one choice is allowed. Multiple entries are allowed for coexisting conditions and previous medications. Although coexisting conditions were recorded in many of the study databases, they were inconsistently defined and so were not usable in statistical analyses. Most other variables were complete, but height was not recorded in one study and several studies did not take renal biopsies or note the duration of disease.

Outcomes table

The Outcomes table consists of one record per patient listing the date of inclusion in the study (begin date), the last date of follow-up (end date), and dates for ESRD, mortality, nonfatal events, and withdrawals. When an exact date was not available, generally for those studies (1 and 6) providing information at specified months, the event month relative to the baseline was given instead. Nonfatal event dates were not completely available in all studies, however, and some studies only provided some of the nonfatal events. The Outcomes table also gives cause of death (unavailable for study 1) and reason for withdrawal from the study (unavailable for study 1). Definitions of adverse events varied across studies, so we only analyzed study withdrawals.
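
To show how an Outcomes record of this form supports survival-type analyses, here is a minimal sketch that derives a follow-up time and an ESRD indicator from the begin, end, and event dates. The field names and the censoring rules are illustrative assumptions, not the study group's actual code.

```python
from datetime import date
from typing import Optional, Tuple

def time_to_esrd(begin: date,
                 end: date,
                 esrd: Optional[date],
                 death: Optional[date]) -> Tuple[float, int]:
    """Return (follow-up time in months, ESRD event indicator).

    Follow-up ends at ESRD if it occurred, otherwise at death or the
    last recorded visit, at which point the patient is censored.
    """
    if esrd is not None:
        stop, event = esrd, 1
    elif death is not None:
        stop, event = death, 0
    else:
        stop, event = end, 0
    months = (stop - begin).days / 30.4375  # average month length
    return months, event

# Example: a hypothetical patient followed from 1994-03-01 who reached ESRD on 1996-09-15.
print(time_to_esrd(date(1994, 3, 1), date(1997, 1, 1), date(1996, 9, 15), None))
```
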


Visits table

The Visits table stores blood pressure, serum creatinine concentration, creatinine clearance, GFR, urine protein excretion, urine albumin excretion, urine creatinine excretion, urine volume, cholesterol, and triglycerides recorded at each patient visit. Each of the 31,345 records in the table contains measurements from a single dated visit. These variables were not all measured or recorded consistently across studies. Blood pressures were measured both while sitting and standing; GFR was measured in 9 of the 11 studies by a variety of techniques. Differing units of measurement were easily standardized, but in some studies it was necessary to impute missing values. For example, creatinine clearance could be computed from the ratio of urine creatinine excretion to serum creatinine concentration or from serum creatinine concentration using the Cockcroft and Gault formula [32]. The total urine protein excretion rate could be computed from the total protein concentration and urine volume. When urine volume was missing, we imputed values from the predictions of a regression of protein excretion on protein concentration fit to patients with both values available. When protein concentration was missing, we generally had a dipstick reading from which we could roughly estimate the concentration.

Medications table

Each unique medication taken during the study period by any patient is stored as a record in the Medications table. The 28,073 records include data from 10 of the 11 studies; one study did not supply medication data. Each record stores the start and stop date (or month) for each medication as well as the medication name, dose (mg/day), and its categorization as an ACE inhibitor, beta-adrenergic blocker, calcium channel blocker, diuretic, peripheral alpha-adrenergic blocker, central alpha-adrenergic agonist, or vasodilator.

The simplicity of this structure belies the large amount of work required for its construction. Several of the studies supplied medication data as text strings that combined drug name, dosage, and schedule. We needed to develop programming algorithms to parse these text strings to pull out the components, convert doses to a daily amount, and assign drug class. Several of the databases also supplied this text in languages other than English, thus requiring translation into English. In addition, studies often neglected to supply the date on which a medication stopped. We then had to infer this date from the next start date or the end of study, making the assumption that the medication was not stopped between visits. This assumption was especially problematic when the patient was taking more than one medication because it was not then obvious which medications might have been stopped.

Because the studies used five different types of ACE inhibitors, all with varying standard dosages, we needed to construct a rule to standardize dose. As seven studies tested enalapril, we converted the others to “enalapril-equivalents,” using the rule that enalapril 1.0 mg was equivalent to benazepril 1.0 mg, cilazapril 4.0 mg, and ramipril 4.0 mg. Dosing information was not available for captopril in the one study in which it was used, so no conversion was developed for it. The inclusion of all treatments, both concomitant and randomized, in the Medications table allowed for many additional analyses in addition to intent to treat.
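
Two of the derivations mentioned above, the Cockcroft-Gault estimate of creatinine clearance used for the Visits table and the conversion of ACE inhibitor doses to enalapril-equivalents used for the Medications table, can be sketched as follows. The Cockcroft-Gault formula itself is standard [32], but the unit conventions, function names, and captopril handling are illustrative assumptions rather than the study group's actual code.

```python
def cockcroft_gault(age_years: float, weight_kg: float,
                    serum_creatinine_mg_dl: float, female: bool) -> float:
    """Estimated creatinine clearance (mL/min) by the Cockcroft-Gault formula [32]."""
    crcl = (140 - age_years) * weight_kg / (72 * serum_creatinine_mg_dl)
    return crcl * 0.85 if female else crcl

# Enalapril-equivalent conversion using the rule stated in the text:
# enalapril 1.0 mg = benazepril 1.0 mg = cilazapril 4.0 mg = ramipril 4.0 mg.
MG_PER_ENALAPRIL_MG = {
    "enalapril": 1.0,
    "benazepril": 1.0,
    "cilazapril": 4.0,
    "ramipril": 4.0,
}

def enalapril_equivalent(drug: str, dose_mg_per_day: float) -> float:
    """Convert an ACE inhibitor dose to enalapril-equivalents (mg/day).

    Captopril is left unconverted because dosing information was not
    available in the one study that used it; here it simply raises.
    """
    factor = MG_PER_ENALAPRIL_MG.get(drug.lower())
    if factor is None:
        raise ValueError(f"no enalapril-equivalent rule for {drug}")
    return dose_mg_per_day / factor

print(round(cockcroft_gault(55, 70, 2.0, female=False), 1))  # 41.3 mL/min
print(enalapril_equivalent("ramipril", 5.0))                 # 1.25 mg/day enalapril-equivalent
```
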


In longitudinal analyses, treatment can be considered a time-dependent variable, and different variables, such as whether or not the patient is currently on the randomized treatment, whether the patient has recently changed treatments, or whether the patient is on his or her most common treatment, could all be used to measure effects [33].
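
A minimal sketch of two of the Medications-table tasks described above: parsing a free-text prescription into name, daily dose, and drug class, and then deriving a time-dependent "on ACE inhibitor" indicator at a given visit date. The regular expression, the drug-class lookup, and the rule of carrying a prescription with no stop date forward to the end of study are simplified assumptions, not the actual AIPRD algorithms.

```python
import re
from datetime import date
from typing import List, Optional, Tuple

DRUG_CLASS = {"enalapril": "ACE inhibitor", "atenolol": "beta-adrenergic blocker"}

def parse_medication(text: str) -> Tuple[str, float, str]:
    """Split a string such as 'enalapril 10 mg twice daily' into
    (name, total daily dose in mg, drug class)."""
    m = re.match(r"\s*([A-Za-z]+)\s+([\d.]+)\s*mg(?:\s+(\w+)(?:\s+times)?\s+daily)?",
                 text, re.IGNORECASE)
    if not m:
        raise ValueError(f"cannot parse: {text!r}")
    name = m.group(1).lower()
    dose = float(m.group(2))
    times = {"once": 1, "twice": 2, "three": 3}.get((m.group(3) or "once").lower(), 1)
    return name, dose * times, DRUG_CLASS.get(name, "other")

def on_ace_inhibitor(visit: date,
                     prescriptions: List[Tuple[str, date, Optional[date]]],
                     end_of_study: date) -> int:
    """1 if any ACE inhibitor prescription covers the visit date, else 0.
    A missing stop date is carried forward to the end of study."""
    for name, start, stop in prescriptions:
        if DRUG_CLASS.get(name) == "ACE inhibitor" and start <= visit <= (stop or end_of_study):
            return 1
    return 0

print(parse_medication("Enalapril 10 mg twice daily"))   # ('enalapril', 20.0, 'ACE inhibitor')
rx = [("enalapril", date(1994, 1, 10), None)]
print(on_ace_inhibitor(date(1994, 6, 1), rx, date(1996, 1, 1)))  # 1
```
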

Reconciling dates and error cleaning

Dates

To define lengths of courses of treatment, changes in physiologic parameters, and outcome event times, we needed to reconcile the different formats for dates we received from the trials into a common format with a single baseline time and follow-up dates that specified the time elapsed since baseline. Studies 2, 5, 7, and 8 listed values every 6 months for the first year and every 12 months thereafter; study 1 provided measurements every 2 months; study 6 provided some variables every 3 months and others every 6 months; studies 10 and 11 supplied visit numbers and exact dates; and studies 3, 4, and 9 supplied dates at unspecified intervals. When studies provided combinations of dates and months or dates and visit numbers, we needed to match records where both variables were present with those where only the month or visit was present.

Baseline

Ideally, the baseline date should be the date when the patient was randomized and study treatment (ACE inhibitor versus control) was begun. Unfortunately, studies specified three types of dates that could be construed as baseline: inclusion date, randomization date, and start of medication date. Theoretically, a protocol could specify the inclusion date as the date on which the patient was accepted into the study, followed a short time later by randomization, and then still later by start of medication. Practically, these events might be simultaneous, as the patient would not be enrolled until randomization and would not be randomized until the medication was to be started. In any event, if studies included multiple dates, we used the date when the study medication was begun as the baseline date if this was available (studies 3, 4, 9, and 11). Trial-defined study start dates, baseline dates, or inclusion dates were used for five more studies (2, 5, 7, 8, and 10). In studies 1 and 6, which supplied only month number, the month zero visit was taken as the baseline, and the starting date was arbitrarily assigned as January 1, 1980 so that each patient would have a begin date. This simplified programming, although the actual date played no role in our analyses.

Logistically, determining these dates from the information provided often required linking several files because the medication and patient visit information resided in physically different locations. In addition, when multiple dates were provided, they needed to be matched against each other to search for inconsistencies. Since the majority of the databases were supplied in SAS format, we wrote SAS programs for carrying out the manipulations needed to standardize the data. For those studies supplied in other formats, we converted the files into SAS using the DBMS Engines software provided with SAS.


In several studies, assignment of a baseline visit date required relaxation of our definitions when the initial medication date did not match a visit date. Several studies collected prebaseline levels of blood pressure, serum creatinine, urine protein, and GFR. We chose as the baseline the visit that occurred nearest to and before the begin date, or month zero if the data were recorded monthly. Studies 10 and 11 provided only visit numbers, not study begin dates, but documentation indicated that visit 5 in study 10 and visit 3 in study 11 constituted the begin date. We used the values at the designated visits for baseline, unless they were unavailable, in which case we looked to an earlier visit. In a few cases, we accepted measurements taken up to 8 days after the begin date if no previous measurements were available. Reconciliation was complicated in study 8 by its labeling of the beginning of the prestudy washout period as baseline, when in fact the baseline was the next date recorded at the end of the washout period, often several months later.

Follow-up

In addition to defining the baseline date, we needed to determine the end date for each patient, ideally the end of follow-up while assigned to the study treatment (ACE inhibitor versus control). All of the studies had a prespecified length of follow-up in their protocol. The study end date was defined as the last visit recorded or the study end date noted in the file. The latter could reference a date of withdrawal for a medication side effect, fatal or nonfatal event, protocol violation, or loss to follow-up. Occurrence of a nonfatal renal event (doubling of baseline serum creatinine concentration or onset of ESRD) also terminated follow-up in some studies. The end date was assigned the date of ESRD or death when either of these events occurred. Otherwise, most trials provided a single field that indicated the end date of the study. One study, which provided data at 2-month intervals with no dates, supplied dropout, death, and ESRD information in the medication field; the end date was inferred from this.

Usually, dates were provided for each follow-up visit and for each time of medication change. In four studies that provided only visit or month numbers, times of follow-up were reconstructed from the visit/month schedule and the study begin date. For example, when only the follow-up months were available, visit dates were computed by adding 30.4375 times the month number to the begin date. Not all studies reported follow-up at the same times, although all followed their own regular schedule. Nevertheless, not all follow-up information was recorded at each visit, especially for measurements like urine protein, which required substantial effort to collect. Furthermore, three studies sent follow-up data only for the original months requested, thereby excluding some of their collected data points. Two studies also provided data for an extended period after the follow-up period defined by the protocol.
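
The date reconstructions described in the Baseline and Follow-up sections reduce to a few small rules, sketched below. The preference order among candidate baseline dates and the 30.4375-day average month come from the text; the function names, record layout, and rounding to whole days are illustrative assumptions.

```python
from datetime import date, timedelta
from typing import Optional

def baseline_date(medication_start: Optional[date],
                  randomization: Optional[date],
                  inclusion: Optional[date]) -> date:
    """Pick the baseline: the medication start date if available, otherwise a
    trial-defined randomization or inclusion date."""
    for candidate in (medication_start, randomization, inclusion):
        if candidate is not None:
            return candidate
    raise ValueError("no usable baseline date")

def visit_date_from_month(begin: date, month_number: int) -> date:
    """Reconstruct a visit date when only the follow-up month was supplied,
    using an average month length of 30.4375 days (rounded to whole days)."""
    return begin + timedelta(days=round(30.4375 * month_number))

begin = baseline_date(date(1994, 3, 1), None, date(1994, 2, 20))
print(begin)                              # 1994-03-01 (medication start preferred)
print(visit_date_from_month(begin, 6))    # roughly six months after baseline
```
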


Reconciliation, data cleaning, and correction

In any combination of databases that requires variable standardization, it is crucial to reconcile the new database by checking any information, particularly summary statistics, that may have been published in the original study reports. These summary statistics include numbers of patients and means and standard deviations in each treatment arm, as well as results from analyses, such as treatment group comparisons and regressions, that are explained in enough detail to be reconstructed. As we carried out these tasks, it became apparent that the investigators had excluded certain patients from some analyses. In some cases, we were able to find these exclusions in their descriptions of the methods used, but in others the coordinating center team needed to recontact the investigators.

Our investigations also uncovered some implausible values as well as missing values. We detected implausible values by running computer programs that screened for outliers and by manually checking values. These manual checks included examination of trends in longitudinal series that might flag values unreasonable in relation to other values in that patient’s sequence. All inconsistencies noted by the computer algorithms were manually checked by clinicians. We queried investigators by e-mail or fax about any questionable or missing values. In one case, two members of the coordinating center traveled to one of the sites to gather some data. In some cases, the investigators were able to help, but in others they could not, either because their data management team had disbanded or because they could not uncover the reason for the error in the data. When no help was forthcoming, investigators at the coordinating center made their own decisions. In most cases, implausible values were set to missing. In a few cases where the errors appeared obvious, such as the improper placement of a decimal point, an appropriate correction was made.

In some cases, reconciliation of inconsistencies occurred after a long delay, following completion of analyses and publication of results. Rather than retain incorrect data, all computations were repeated after correction of errors and compared with the previous analyses and publications. There were no substantial differences in results or conclusions; thus errata were not published. Later publications use the corrected database. All error corrections were embedded in computer code from which modified datasets could be automatically generated. This method served two purposes. First, it maintained the original form of the study datasets, and second, it provided written documentation of decisions and changes.

In the end, we chose not to analyze some variables that were extremely incomplete due to lack of collection in some studies or missing values. These data included hemoglobin, triglycerides, cholesterol, urine albumin, coexisting conditions, duration of hypertension, and prior history of medication use.
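
The practice of embedding every correction in code, so that a cleaned dataset can be regenerated from the untouched originals while documenting each decision, can be sketched as follows. The screening limits, record layout, and correction entries are hypothetical examples, not the actual AIPRD edits.

```python
import copy

# Hypothetical plausibility limits used to flag outliers for clinical review.
LIMITS = {"systolic_bp": (60, 260), "serum_creatinine": (0.2, 20.0)}

def flag_outliers(record: dict) -> list:
    """Return the names of variables falling outside the screening limits."""
    return [var for var, (lo, hi) in LIMITS.items()
            if record.get(var) is not None and not lo <= record[var] <= hi]

# Each correction is recorded as code, so the original data stay untouched
# and the entry itself documents the decision that was made.
CORRECTIONS = [
    # (study, patient, visit index, variable, corrected value, reason) - hypothetical
    (3, 17, 4, "serum_creatinine", 2.1, "decimal point misplaced; 21 -> 2.1"),
    (6, 42, 2, "systolic_bp", None, "implausible value set to missing"),
]

def apply_corrections(visits: dict) -> dict:
    """Regenerate the analysis dataset from the original records."""
    cleaned = copy.deepcopy(visits)
    for study, patient, idx, var, value, _reason in CORRECTIONS:
        cleaned[(study, patient)][idx][var] = value
    return cleaned

raw = {(3, 17): [{"serum_creatinine": 1.8}, {}, {}, {}, {"serum_creatinine": 21.0}],
       (6, 42): [{}, {}, {"systolic_bp": 400.0}]}
print(flag_outliers(raw[(3, 17)][4]))            # ['serum_creatinine']
cleaned = apply_corrections(raw)
print(cleaned[(3, 17)][4]["serum_creatinine"])   # 2.1
```
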


Discussion

Our purpose in writing this description of our experience has been to inform others who also might wish to combine longitudinal data from studies with different protocols for retrospective analysis. Differences among study protocols and their database structures can make combining longitudinal data from different RCTs an extremely complex and time-intensive task. The use of different software packages, recording of information in different languages, and different intervals for reporting measurements all complicate data synthesis. We hope that some of the lessons we learned can be of use to future endeavors.

Many of the difficulties we faced no doubt derived from the nature of our collaboration, originating from a meta-analysis of group data without any institutional funding. This handicap slowed the assembly and processing of data, which led to several false starts in database construction. Although each study quickly agreed to contribute data, the coordinating team was able to provide little support for each team’s effort and in fact spent considerable time searching for financial support for the project. Not all of the collaborators, especially those with older studies, were able to send their data as quickly as others. The first database was received in September 1996 and the last not until May 1998. Although some reluctance to part with primary data certainly played a role in these delays, the more common explanation was that study teams from the older studies had either disbanded or moved on to new projects and lacked the financial resources, time, or knowledge to respond promptly. At the same time, this lack of support prevented us from convening any preliminary meeting of the collaborative group to standardize definitions, address inconsistencies, and clarify issues. As a result, problems were addressed as they arose and their resolution sometimes undid earlier efforts.

The easiest studies to assemble, clean, and integrate were those with intact data management and analysis groups that could be queried. Luckily, these included the largest studies in AIPRD, so that many questions could be answered. The hardest to work with were those for whom the only available contact was a clinical investigator unfamiliar with the details of database construction. These naturally tended to be the older studies. In future efforts, we would choose to request complete databases, as these are the most informative as well as the easiest to provide, contrary to our initial beliefs.

Nevertheless, the real problem was the variety of types of data formats and documentation. Because clinical trials have different objectives and resources available to them, standardization of data reporting beyond the broad criteria listed in the CONSORT statement [34] may not be possible. Prospective meta-analysis, the simultaneous design and execution of different studies whose results will be combined at their conclusion, provides a potential solution. Certainly, standardizing protocols, data collection formats, variable definitions, and analytic techniques would simplify the clerical tasks of combining data. Coordinating objectives would also presumably improve the return on information collected. At the very least, inferences about dose effects would not necessarily be confounded with studies. As meta-analysis becomes a more common technique for systematic review and as clinicians receive better research training, investigators will undoubtedly begin to consider future uses of their data by others in their study planning. Nevertheless, there will still often be additional studies to include in the meta-analysis.

A related issue is that of access to data. The National Institutes of Health now advocate sharing of data by investigators, and the Freedom of Information Act in the United States now mandates that all data collected with federal funding be available to those who request it. But data collected in other nations or in the distant past may not have such protections, and bias may result. Because we were able to collect most of the available data on ACE inhibitors for nondiabetic renal disease, our results are robust. Others may not be so fortunate. It is hoped that in the future the wide availability of electronic data storage and computer networking will alleviate this difficulty.

Researchers must convey to funding agencies the advantages of prospectively and retrospectively combining databases, and thereby extending their useful lifetime, because the benefits come with considerable costs attached. These costs are not easily funded from industry sources because such projects typically are not intended to promote a specific agent, but rather a class of agents.


The funding of US$348,000 for 3 years that we received from the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) amounted to US$179 per patient. Development of the Thrombolytic Predictive Instrument [11], five logistic regression models to predict the risk of 30-day and 1-year mortality, cardiac arrest, stroke, and major bleeding among patients presenting to the emergency department with a myocardial infarction, involved combining 12 databases and cost US$741,000 for 3 years, or approximately US$150 per patient, 10 years ago. These costs far exceed the US$20 per patient suggested by Stewart and Clarke in their 1995 discussion of individual patient meta-analysis [5]. Several explanations in addition to monetary inflation can be advanced to explain this discrepancy. First, as Stewart and Clarke note, support through grant funding tends to be more expensive than that through core funding because of administrative costs. Second, developing complex models for multiple outcomes and examining medication effects require collecting information on many covariates, which differs entirely from the Oxford model of collecting a small number of key patient descriptors on a large number of patients [13]. Third, our data are longitudinal, whereas others have only collected baseline covariates and event times. Nevertheless, phase III clinical trials can be much more expensive. The recently completed African American Study of Kidney Disease and Hypertension cost more than US$50 million, for example.

In our experience, many scientific review panel members do not recognize the substantial time and expertise needed to pool massive amounts of data. One review of our initial grant submission questioned why we needed so much money since “the databases already exist in computerized format.” Apparently, at least in the United States, many experts give lower priority to combining existing clinical trials databases for retrospective analyses compared with collection and analysis of new data. At the suggestion of staff from NIDDK, we requested review by a more methodologically oriented study section for our resubmission. While this review was more favorable, the proposal was not funded until NIDDK convened an internal review group with a range of both nephrologic and methodologic expertise.

In the end, combining databases is a massive task that requires much concentrated collaborative effort of clinicians, statisticians, programmers, and data managers. The link between clinician and programmer must be especially close during data cleaning. Automation simplifies the process, but only if proper clinical parameters are taken into account. Translating medical terms, categorizing medications, interpreting and standardizing doses, and reconciling different units and modes of measurement all depend on clinical expertise but are efficiently implemented by computer. In our experience, the clinical fellows and statistical programmer spent countless intensive hours together laboring to clean the data.

Nevertheless, the potential benefits from the collaborative effort can be rewarding. Patient-specific data are more reliable, analyses are more detailed, and results are more credible. The conclusions drawn may lead to treatment recommendations or they may indicate a need for further research. For example, our analyses found an important treatment interaction demonstrating that patients starting with higher levels of urine protein experienced greater treatment benefit than those with lower levels.
Conversely, patients with low urine protein levels received very little, if any, benefit. This interaction was only detectable with the patient-level data and not with summary values. Whatever the outcome of the analysis, the expert composition of the collaborative group will give wider endorsement and dissemination to the results [5] and will accelerate future research and focus it on the key questions that need to be answered.


In the end, the cost of the collaboration may well pay for itself in better use of existing information and reduction in future expenditures on new phase III clinical trials.

Acknowledgments

Members of the Angiotensin-Converting Enzyme Inhibition in Progressive Renal Disease Study Group include Drs. P.C. Zucchelli (Malpigi-Bologna, Italy); A. Kamper and S. Strandgaard (Copenhagen, Denmark); R.D. Toto (Dallas, Texas, United States); B.M. Brenner and N.E. Medias (Boston, Massachusetts, United States); B.G. Delano (New York, New York, United States); S. Shahinfar (Blue Bell, Pennsylvania, United States); G.G. van Essen, P.E. de Jong, A.J. Apperloo and D. de Zeeuw (Groningen, Netherlands); P. Landais and J.P. Grunfeld (Paris, France); K.M. Bannister (Adelaide, Australia); A. Himmelmann and L. Hansson (Goteborg, Sweden); B.U. Ihle and G.J. Becker (Melbourne, Australia); G. Maschio, C. Marcantoni, and L. Oldrizzi (Verona, Italy); G. Remuzzi, P. Ruggenenti, and A. Perna (Bergamo, Italy).

We thank the reviewers for their excellent comments, which we have incorporated into the manuscript. This work was supported by grants RO1-HS10064 from the Agency for Healthcare Research and Quality (Dr. Schmid), RO1-DK53869A from the National Institute of Diabetes and Digestive and Kidney Diseases (Dr. Levey), grant 1097-5 from the Dialysis Clinic, Inc. (Dr. Jafar), Paul Teschan Research Fund (Dr. Jafar), New England Medical Center-St. Elizabeth’s Hospital Medical Center Clinical Research Fellowship Program (Dr. Jafar), and an unrestricted grant from Merck Research Laboratories (Dr. Levey).

References

[1] Schmid CH. Exploring heterogeneity in randomized trials via meta-analysis. Drug Inf J 1999;3:211–224.
[2] Thompson SG, Sharp SJ. Explaining heterogeneity in meta-analysis: a comparison of methods. Stat Med 1999;18:2693–2708.
[3] Lambert PC, Sutton AJ, Abrams KR, Jones DR. A comparison of summary patient-level covariates in meta-regression with individual patient data meta-analysis. J Clin Epidemiol 2002;55:86–94.
[4] Fibrinolytic Therapy Trialists’ (FTT) Collaborative Group. Indications for fibrinolytic therapy in suspected acute myocardial infarction: collaborative overview of early mortality and major morbidity results from all randomised trials of more than 1000 patients. Lancet 1994;343:311–322.
[5] Stewart LA, Clarke MJ. Practical methodology of meta-analyses (overviews) using updated individual patient data. Stat Med 1995;14:2057–2079.
[6] Selker HP, Griffith JL, Beshansky JR, et al. The thrombolytic predictive instrument project: combining clinical study data bases to take medical effectiveness research to the streets. In: Grady ML, Schwartz HA, editors. Medical Effectiveness Research Data Methods. Pub. AHCPR92-0056. Washington, DC: Agency for Health Care Policy and Research, Department of Health and Human Services, 1992. p. 9–31.
[7] Early Breast Cancer Trialists’ Collaborative Group. Systemic treatment of early breast cancer by hormonal, cytotoxic, or immune therapy. 133 randomised trials involving 31,000 recurrences and 24,000 deaths among 75,000 women. Lancet 1992;339:1–15, 71–85.
[8] Chalmers I. The Cochrane collaboration: preparing, maintaining, and disseminating systematic reviews of the effects of health care. Ann N Y Acad Sci 1993;703:156–163.
[9] Duchateau L, Pignon JP, Bijnens L, et al. Individual patient- versus literature-based meta-analysis of survival data: time to event and event rate at a particular time can make a difference, an example based on head and neck cancer. Control Clin Trials 2001;22:538–547.
[10] Jeng GT, Scott JR, Burmeister LF. A comparison of meta-analytic results using literature vs. individual patient data. JAMA 1995;274:830–836.
[11] Selker HP, Griffith JL, Beshansky JR, et al. Patient-specific predictions of outcomes in myocardial infarction for real-time emergency use: A Thrombolytic Predictive Instrument. Ann Intern Med 1997;127:538–556.
[12] Olkin I. Statistical and theoretical considerations in meta-analysis. J Clin Epidemiol 1995;48:133–146.
[13] Peto R, Collins R, Gray R. Large-scale randomized evidence: large, simple trials and overviews of trials. J Clin Epidemiol 1995;48:23–40.
[14] Jafar TH, Schmid CH, Landa M, et al., for the AIPRD Study Group. Angiotensin-converting enzyme inhibitors and progression of nondiabetic renal disease: a meta-analysis of patient-level data. Ann Intern Med 2001;135:73–87.
[15] Jafar TH, Stark PC, Schmid CH, et al., for the AIPRD Study Group. Proteinuria as a modifiable risk factor for the progression of non-diabetic renal disease. Kidney Int 2001;60:1131–1140.
[16] United States Renal Data System. 2001 Annual Report of the U.S. Renal Data System. Bethesda, Maryland: National Institutes of Health, National Institute of Diabetes and Digestive and Kidney Diseases, 2001.
[17] National Kidney Foundation. K/DOQI clinical practice guidelines for chronic kidney disease: evaluation, classification and stratification. Am J Kidney Dis 2002;39:S1–S266.
[18] The ACE Inhibitors in Diabetic Nephropathy Trialist Group. Should all patients with type 1 diabetes mellitus and microalbuminuria receive angiotensin-converting enzyme inhibitors? A meta-analysis of patient-level data. Ann Intern Med 2001;134:370–379.
[19] Giatras I, Lau J, Levey AS. Effect of angiotensin-converting-enzyme inhibitors on the progression of nondiabetic renal disease: a meta-analysis of randomized trials. Ann Intern Med 1997;127:337–345.
[20] Brenner BM, Meyer TW, Hostetter TH. Dietary protein intake and the progressive nature of kidney disease: the role of hemodynamically mediated glomerular injury in the pathogenesis of progressive glomerular sclerosis in aging, renal ablation, and intrinsic renal disease. N Engl J Med 1982;307:652–659.
[21] Remuzzi G, Bertani T. Is glomerulosclerosis a consequence of altered glomerular permeability to macromolecules? Kidney Int 1990;38:384–394.
[22] Zucchelli P, Zuccala A, Borghi M, et al. Long-term comparison between captopril and nifedipine in the progression of renal insufficiency. Kidney Int 1992;42:452–458.
[23] Kamper AL, Strandgaard S, Leyssac PP. Effect of enalapril on the progression of chronic renal failure. Am J Hypertens 1992;5:423–430.
[24] van Essen GG, Apperloo AJ, Rensma PL, et al. Are angiotensin converting enzyme inhibitors superior to beta blockers in retarding progressive renal function decline? Kidney Int Suppl 1997;63:S58–S62.
[25] Hannedouche T, Landais P, Goldfarb B, et al. Randomized controlled clinical trial of enalapril and beta-blockers in nondiabetic chronic renal failure. BMJ 1994;309:833–837.
[26] Bannister KM, Weaver A, Clarkson AR, Woodroffe AJ. Effect of angiotensin-converting enzyme and calcium channel inhibition on progression of IgA nephropathy. In: Clarkson AR, Woodroffe AJ, editors. Contrib Nephrol 1995;111:184–192.
[27] Himmelmann A, Hansson L, Hansson BG, et al. ACE inhibition preserves renal function better than beta-blockade in the treatment of essential hypertension. Blood Press 1995;4:85–90.
[28] Ihle BU, Whitworth JA, Shahinfar S, Cnaan A, Kincaid-Smith PS. Angiotensin-converting-enzyme inhibition in non-diabetic progressive renal insufficiency: a controlled double-blind trial. Am J Kidney Dis 1996;27:489–495.
[29] Maschio G, Alberti D, Janin G, et al., and the Angiotensin-Converting-Enzyme Inhibition in Progressive Renal Insufficiency Study Group. Effect of the angiotensin-converting-enzyme inhibitor benazepril on the progression of chronic renal insufficiency. N Engl J Med 1996;334:939–945.
[30] The GISEN Group (Gruppo Italiano di Studi Epidemiologici in Nefrologia). Randomised placebo-controlled trial of effect of ramipril on decline in glomerular filtration rate and risk of terminal renal failure in proteinuric, non-diabetic nephropathy. Lancet 1997;349:1857–1863.
[31] Ruggenenti P, Perna A, Gherardi G, et al. Renoprotective properties of ACE-inhibition in non-diabetic nephropathies with non-nephrotic proteinuria. Lancet 1999;354:359–364.
[32] Cockcroft DW, Gault MH. Prediction of creatinine clearance from serum creatinine. Nephron 1976;16:31–41.
[33] White IR, Pocock SJ. Statistical reporting of clinical trials with individual changes from allocated treatment. Stat Med 1996;15:249–262.
[34] Begg C, Cho M, Eastwood S, et al. Improving the quality of reporting of randomized controlled trials: the CONSORT statement. JAMA 1996;276:637–639.