
Epidemiologic Methods Developments: A Look Forward to the Year 2032

ROSS L. PRENTICE, PHD

This article responds to a request by the Editors for a perspective on potential epidemiologic methods developments between now and the year 2032, when the American College of Epidemiology will have its 50th Annual Meeting. The response begins by describing the need for enhanced methods in epidemiologic research and goes on to suggest some approaches to satisfying such needs. The suggested approaches include the more extensive use of biomarkers for exposure assessment, the greater standardization of data analysis and reporting methods, and enhancement of the interplay between observational studies and randomized controlled trials. It is argued that a phased approach to epidemiologic hypothesis evaluation may often be needed, with hypotheses that are promising in observational studies subjected to controlled trials having well-selected intermediate outcomes. It is also argued that a multidisciplinary, coordinated community of scientists interested in disease risk estimation and disease prevention will be needed for epidemiologic research to fulfill its potential over the next 25 years. Ann Epidemiol 2007;17:906–910. © 2007 Elsevier Inc. All rights reserved.

KEY WORDS: Biomarkers, Cohort, Case-Control, Hazard Ratio, High-Dimensional Data, Observational Studies, Randomized Controlled Trials, Reporting Standards.

From the Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle. Address correspondence to: Ross L. Prentice, PhD, Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave North, M3-A410, P.O. Box 19024, Seattle, WA 98109-1024. Tel.: (206) 667-4264; fax: (206) 667-4142. E-mail: [email protected]. Received May 15, 2007; accepted July 11, 2007.

INVITED COMMENTARY

INTRODUCTION

The past 25 years have seen remarkable progress and convergence in epidemiologic methods and practices. During this time, hospital-based case-control studies substantially gave way to population-based case-control studies that pay due attention to the avoidance of selection bias. Several additional large cohort studies with periodic exposure assessments and long-term follow-up were initiated during this period, both in the United States and elsewhere. Cohort studies typically offer good protection against recall bias and selection bias. As such studies have matured, they have come to fulfill a rather central role in epidemiologic research, even for relatively rare diseases, sometimes through consortia of cohorts. Observational study results have contributed to the rationale for a number of randomized controlled disease prevention or screening trials. The randomized controlled trial (RCT) design has the important advantage of providing protection against confounding by baseline risk factors, and it typically provides a setting for unbiased outcome ascertainment. Recent biologic technology developments, including single nucleotide polymorphism (SNP) chips, provide novel opportunities to accelerate learning about genetic aspects of disease risk and about disease pathways and mechanisms.

On the data analytic side, logistic regression methods became the workhorse for summarizing epidemiologic associations about a quarter century ago (1–3), extending earlier important work on odds ratio estimation (4) in a manner that allows a thorough accommodation of multiple potential confounding factors. Over the intervening period these methods have often given way to hazard ratio (Cox regression) methods (5, 6), which allow, for example, a fuller account of exposure histories in a cohort study setting. Continued methodologic advances will be needed in the next 25 years to address outstanding important topics in epidemiologic research. A cohort study setting will be used below to examine these topics and to suggest approaches to resolving them.

METHODOLOGIC NEEDS IN EPIDEMIOLOGIC RESEARCH

The basic ingredients of an epidemiologic cohort study are rather simple. Let z(u)′ = {z₁(u), z₂(u), …} denote a set of numerically coded variables that describe an individual's exposures and other characteristics at time u following selection into a cohort. Let Z(t) = {z(u); u < t} denote the corresponding 'covariate' history up to follow-up time t, including a baseline covariate history Z(0) that may include information pertaining to time periods prior to cohort enrollment. Epidemiologic studies typically concern the association between aspects of Z(t) and the occurrence (hazard) rate for a health event (disease) of interest, denoted λ{t; Z(t)}. The Cox regression model specifies


Selected Abbreviations and Acronyms
RCT = randomized controlled trial
SNP = single nucleotide polymorphism

λ{t; Z(t)} = λ₀(t) exp{x(t)′β},  (i)

where x(t)′ = {x₁(t), …, xₚ(t)} is a regression p-vector formed from Z(t) and t, and β′ = (β₁, …, βₚ) is a corresponding hazard ratio parameter to be estimated. The time-varying feature of x(t) provides an opportunity to examine hazard ratios for an exposure of interest as a function of preceding exposure duration or, more generally, as a function of the preceding exposure history up to time t. The fact that the baseline hazard rate function λ₀ in (i) need not be restricted, and can be generalized to differ among strata formed from {Z(t), t}, lends a good deal of flexibility to this modeling framework. The availability of nonparametric estimators of the cumulative baseline hazard function allows this regression procedure to be used for absolute risk, as well as relative risk, estimation.
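To make the use of model (i) concrete, the sketch below fits a Cox regression to a small simulated cohort. The Python lifelines package, the variable names, and all parameter values are illustrative choices made here, not part of the article.

```python
# Minimal sketch: fitting the Cox model (i) to a simulated cohort.
# Assumes the open-source numpy, pandas, and lifelines packages;
# the data and parameter values are invented for illustration.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(2032)
n = 5000
exposure = rng.binomial(1, 0.4, n)          # x1: binary exposure
age = rng.normal(60, 8, n)                  # x2: baseline risk factor

# Event times from lambda{t; Z(t)} = lambda0 * exp(b1*x1 + b2*x2),
# with a constant baseline hazard lambda0 = 0.01
log_hr = 0.5 * exposure + 0.03 * (age - 60)
event_time = rng.exponential(1.0 / (0.01 * np.exp(log_hr)))
censor_time = rng.uniform(0, 120, n)        # administrative censoring

df = pd.DataFrame({
    "time": np.minimum(event_time, censor_time),
    "event": (event_time <= censor_time).astype(int),
    "exposure": exposure,
    "age": age,
})

cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event")
cph.print_summary()   # exp(coef) column gives estimated hazard ratios
```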

Suppose now that a cohort study is conducted and hazard ratio parameters β are estimated to describe the association between an exposure and a disease. Under what circumstances can the association be regarded as reliably estimated? Under what circumstances can a reliably estimated association be afforded a causal interpretation? The development of research strategies that can answer these two questions is the fundamental methodologic need in epidemiologic research over the next 25 years!

RELIABLE ASSOCIATION PARAMETER ESTIMATION

To examine possible ways of addressing the first question, suppose that a standard approach to confounding control has been taken, with established, possibly time-varying, risk factors for the study disease included in Z(t) and carefully modeled through x(t) in (i). Under this scenario one might expect estimates of the same association to substantially agree across studies, even if risk factor distributions differ among study populations and, to some extent, even if some confounders are missing from each study. Since consistency across studies is a major route to gaining confidence in observational associations, it is useful to consider the sources of bias in association parameter estimation and methodologic strategies for reducing such bias.

A quite important source of bias arises from measurement error in exposure variables of interest. Under a classical measurement error model, the measured exposure equals the actual exposure (e.g., an element of x(t)) plus 'noise' that is independent of the actual exposure and, importantly, independent of all study subject characteristics. It is well recognized that hazard ratios, or odds ratios, tend to be biased toward the null under this classical measurement model (at least with a single exposure variable), and that the attenuated association parameters will not agree across studies unless the ratio of the noise variance to the actual exposure variance is common across studies.
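The attenuation phenomenon is easy to exhibit by simulation. The following sketch, again assuming the lifelines package and invented parameter values, fits the same Cox model twice: once with the actual exposure and once with a classically mismeasured version of it.

```python
# Sketch: attenuation of a Cox log hazard ratio under classical
# measurement error (illustrative simulation; not from the article).
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(7)
n = 20000
x = rng.normal(0, 1, n)                # 'actual' exposure
w = x + rng.normal(0, 1, n)            # measured = actual + independent noise

t = rng.exponential(1.0 / (0.02 * np.exp(0.4 * x)))
c = rng.uniform(0, 80, n)
df = pd.DataFrame({"time": np.minimum(t, c),
                   "event": (t <= c).astype(int),
                   "x": x, "w": w})

for col in ("x", "w"):
    fit = CoxPHFitter().fit(df[["time", "event", col]],
                            duration_col="time", event_col="event")
    print(col, float(fit.params_[col]))
# Expected pattern: the estimate for x is near 0.4, while the estimate
# for w is roughly halved, since var(noise)/var(x) = 1 implies an
# attenuation factor of about 1/2.
```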

Moreover, for self-reported exposures, a measurement model that is more flexible than the classical model will typically be needed. The measurement error properties of such exposures as dietary patterns, physical activity patterns, environmental exposures, occupational exposures, and medication histories, to name just a few, are likely to differ among individuals in a study population, perhaps in relation to such variables as body mass, age, ethnicity, and behavioral factors, and are likely to differ among study populations. For example, recent reports (7, 8) on the association between dietary fat, adjusted for energy, and postmenopausal breast cancer find a positive association when a food record (diary) is used to assess dietary intake, but fail to do so when a food frequency questionnaire is used. These studies indicate that measurement error may be highly influential in this important research area, and they point to the need for objective biomarkers of exposure that can be used to calibrate self-report data and thereby yield association parameter estimates that apply to the 'actual' exposure.

Progress on this important measurement error topic can be expected in the near term where exposure biomarkers are available that themselves adhere to a classical measurement model. For example, ongoing studies in the Women's Health Initiative cohorts will use biomarkers of total energy, protein energy, and activity-related energy expenditure to calibrate corresponding self-report estimates in dietary and physical activity association studies of several chronic diseases. Odds ratio and hazard ratio models and estimation procedures for these analyses are already emerging (9–12). However, there is a great need for the development of additional exposure biomarkers having a classical error structure, and for the conduct of human feeding studies to facilitate the use of blood nutrient concentration biomarkers that are unlikely to adhere to a classical measurement model. Note also that these same measurement modeling developments are needed to control confounding by difficult-to-measure exposures. For example, it may be that epidemiologic association studies to date on such important outcomes as coronary heart disease, diabetes, or breast cancer have been substantially uncontrolled for confounding by dietary and physical activity variables.
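One simple reading of the biomarker calibration strategy is regression calibration: in a subsample with an objective biomarker assumed to follow a classical error model, regress the biomarker on the self-report and covariates, then carry the calibrated exposure into the disease model. The numpy sketch below is a hypothetical illustration of that idea only; the variable names, the linear calibration form, and all parameter values are assumptions.

```python
# Sketch of regression calibration with an exposure biomarker.
# All names, the linear form, and parameter values are illustrative.
import numpy as np

rng = np.random.default_rng(11)
n, m = 20000, 1000                      # cohort size; biomarker subsample
x = rng.normal(0, 1, n)                 # actual exposure (unobserved)
bmi = rng.normal(27, 4, n)              # covariate distorting the self-report
self_report = 0.5 * x - 0.05 * (bmi - 27) + rng.normal(0, 1, n)
biomarker = x + rng.normal(0, 0.5, n)   # classical error structure

# Calibration equation, fitted in the biomarker subsample only
sub = slice(0, m)
design = np.column_stack([np.ones(m), self_report[sub], bmi[sub]])
coef, *_ = np.linalg.lstsq(design, biomarker[sub], rcond=None)

# Calibrated exposure for the full cohort, to be used in place of the
# raw self-report in an odds ratio or hazard ratio analysis
full_design = np.column_stack([np.ones(n), self_report, bmi])
x_calibrated = full_design @ coef
print(np.corrcoef(x, x_calibrated)[0, 1])  # correlation with actual exposure
```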

Measurement error in outcome ascertainment may be another source of bias in association parameter estimation. Differential ascertainment of the timing of disease diagnosis among study subjects may be intrinsic to disease processes that develop over many years (e.g., several neurologic diseases) or may be induced by such health-related behaviors as screening mammography or prostate-specific antigen testing. These latter behaviors may in turn correlate with other disease risk factors, implying a need to carefully standardize for disease screening behavior in association parameter estimation. Exposures that can affect the results of disease screening tests, such as postmenopausal hormone therapy effects on mammograms, or finasteride or dutasteride effects on prostate-specific antigen levels, are particularly challenging to study and may require the development of a suitable outcome ascertainment measurement model to avoid bias. This is a relatively undeveloped aspect of epidemiologic methodology that will be of increasing importance in upcoming years.

Now consider the association between a well-measured (or well-calibrated) exposure and a well-measured outcome. Association parameters, such as the hazard ratio parameters in (i), may depend on other factors as well as on the details of preceding exposure histories. While ratio parameters are expected to be more robust to effect modification than are corresponding absolute risk parameters, some dependence of hazard ratios (or odds ratios) on disease risk factors should not be considered exceptional and would usefully be allowed for in data analysis. For example, one might project that future reporting could include hazard ratio parameter estimates and confidence intervals that have been standardized to a specified distribution for key disease risk factors. In addition to standard analyses, each cohort (or case-control) study could estimate a hazard ratio function that flexibly depends on an agreed-upon set of risk factors, with the standardized hazard ratio estimate a simple weighted average of the resulting hazard ratio estimates, and with weights defined by a standard risk factor distribution. This type of hazard ratio reporting could be readily implemented following the identification of a set of risk factors and the specification of a corresponding standard joint distribution. In addition to enhancing comparability across studies, the inclusion of standardized hazard ratio estimates in study reports could reduce the emphasis on 'interactions' of hazard ratios with study subject characteristics, with its attendant multiple testing limitations. This type of standardization could also extend to a specified exposure duration distribution or, perhaps better, average hazard ratio estimates could be presented over several prespecified exposure durations as a routine aspect of epidemiologic reporting.
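A minimal sketch of this standardized reporting idea follows, under the assumption that the averaging is done on the log hazard ratio scale (one reasonable choice; the article does not specify), with stratum-specific Cox fits standing in for a flexibly modeled hazard ratio function. The strata, weights, and data are invented for illustration.

```python
# Sketch of a 'standardized' hazard ratio: estimate the exposure hazard
# ratio within strata of a risk factor, then average on the log scale
# using weights from an agreed-upon standard risk factor distribution.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(3)
n = 30000
stratum = rng.integers(0, 3, n)                    # e.g., coarse age bands
exposure = rng.binomial(1, 0.5, n)
true_log_hr = np.array([0.2, 0.4, 0.6])[stratum]   # effect modification
t = rng.exponential(1.0 / (0.02 * np.exp(true_log_hr * exposure)))
c = rng.uniform(0, 60, n)
df = pd.DataFrame({"time": np.minimum(t, c),
                   "event": (t <= c).astype(int),
                   "exposure": exposure, "stratum": stratum})

standard_weights = np.array([0.5, 0.3, 0.2])   # specified standard distribution
log_hrs = []
for s in range(3):
    sub = df.loc[df["stratum"] == s, ["time", "event", "exposure"]]
    fit = CoxPHFitter().fit(sub, duration_col="time", event_col="event")
    log_hrs.append(float(fit.params_["exposure"]))

standardized_hr = np.exp(np.dot(standard_weights, log_hrs))
print(round(standardized_hr, 3))
```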

METHODS TO ASCRIBE A CAUSAL INTERPRETATION AND THE ROLE OF RANDOMIZED CONTROLLED TRIALS

The value of a reliably estimated exposure-disease association is much enhanced if convincing evidence can be provided that the exposure itself affects disease risk (i.e., causes or prevents disease). It is unfortunately true that one can never be certain in an observational study that confounding has been adequately controlled. Modern observational study data analysis methods that attempt to emulate a randomized trial enhance the ability to make good use of measured confounders, which may be time-varying (13), and are likely to become more widely used in upcoming years. However, the possibility that some important, nonredundant confounders have been omitted can never be ruled out, as has recently been reinforced (14) by a prominent critic of epidemiologic methods, based on a view of the differences between clinical trial and observational study results on postmenopausal hormone therapy and coronary heart disease.

In fact, the interface between clinical trials and observational studies can provide a key area for the further development of epidemiologic methods and research strategies in upcoming decades. Joint analyses of data from clinical trials and observational studies of a given exposure, when both types of study are available in comparable populations, allow a direct assessment of residual confounding. When this type of analysis was carried out using the Women's Health Initiative randomized controlled trials (RCTs) and observational cohort study, it appeared that much of the cardiovascular disease hazard ratio discrepancy (higher hazard ratios from the clinical trial) could be attributed to observational study estimates being dominated by long-term exposure while clinical trial estimates primarily reflect relatively short-term exposure, both for estrogen plus progestin (15) and for estrogen alone (16). After controlling for a substantial list of potential confounding factors, and allowing for these differences in exposure patterns, the evidence for residual confounding was weak for coronary heart disease and venous thromboembolism, though some evidence remained for stroke. This analytic exercise highlighted the need for cohort studies to include an adequate number of recent exposure initiators, and the need to allow hazard ratios to depend on exposure timing and duration in data analysis.

There are other reasons for cultivating a smooth transition between observational studies and RCTs in epidemiologic research. For example, observational studies may establish an association between fruit and vegetable consumption and the risk of cardiovascular disease or cancer, but may lack the specificity to identify the responsible nutrients or biologically active components. In addition, RCTs are well suited to examining the benefits and risks of an intervention among persons meeting age and other eligibility criteria, whereas observational data on persons who self-select the behavior in question are often limited, even within very large cohort studies.
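One common way to let hazard ratios depend on time since exposure initiation is episode splitting of the follow-up records. The sketch below assumes lifelines' CoxTimeVaryingFitter and a single duration cutpoint; the two-interval split and all parameter values are assumptions made for illustration, not the analysis of references 15 and 16.

```python
# Sketch: duration-dependent hazard ratios via episode splitting.
# Each exposed subject's follow-up is split at one cutpoint so that
# 'early' and 'late' exposure get separate hazard ratio parameters.
import numpy as np
import pandas as pd
from lifelines import CoxTimeVaryingFitter

rng = np.random.default_rng(5)
n, cut = 20000, 2.0                      # split follow-up at 2 'years'
exposed = rng.binomial(1, 0.5, n)

# Piecewise-exponential event times: HR ~ exp(0.6) early, exp(0.1) late
h0 = 0.05
h_early = h0 * np.exp(0.6 * exposed)
h_late = h0 * np.exp(0.1 * exposed)
t1 = rng.exponential(1.0 / h_early)
t = np.where(t1 < cut, t1, cut + rng.exponential(1.0 / h_late))
c = rng.uniform(0, 10, n)
time, event = np.minimum(t, c), (t <= c).astype(int)

rows = []
for i in range(n):
    if time[i] <= cut:                   # single record on [0, time)
        rows.append((i, 0.0, time[i], event[i], exposed[i], 0))
    else:                                # split at the duration cutpoint
        rows.append((i, 0.0, cut, 0, exposed[i], 0))
        rows.append((i, cut, time[i], event[i], 0, exposed[i]))
long_df = pd.DataFrame(rows, columns=["id", "start", "stop", "event",
                                      "exposed_early", "exposed_late"])

ctv = CoxTimeVaryingFitter()
ctv.fit(long_df, id_col="id", event_col="event",
        start_col="start", stop_col="stop")
ctv.print_summary()    # separate hazard ratios for early vs late exposure
```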

It is well recognized that full-scale disease screening or prevention RCTs are logistically difficult and expensive; hence, few can be conducted at any point in time. Also, because of cost constraints, such trials are typically designed to have power only for the overall intervention-versus-control comparison for key clinical outcomes, making them sensitive to departures from adherence assumptions or from other design specifications. These issues imply the need for a careful developmental process, typically involving observational studies and basic research, before a full-scale trial is undertaken. They also imply value in joint analyses of RCT and observational data, in which the clinical trial provides an anchor for adjusting observational data on topics where the RCT data alone lack power or are otherwise insufficient.

INTERMEDIATE OUTCOME TRIALS AND HIGH-DIMENSIONAL BIOLOGIC DATA

With observational studies limited by confounding uncertainties, and full-scale RCTs limited by cost, a research agenda built on these study designs alone is somewhat incomplete for developing reliable disease prevention approaches and recommendations. The same is true of the sources of data available to generate new hypotheses for disease prevention. RCTs having intermediate outcomes have the potential to substantially fill this research gap. While intermediate outcome trials have long been a part of the research agenda leading up to full-scale prevention trials, only recently has it become possible to conceive of the simultaneous use of a rather comprehensive set of intermediate markers relevant to a broad range of intervention benefits and risks.

The development of practical, high-dimensional biologic data of various types can be expected to greatly stimulate epidemiologic research throughout the next 25 years and beyond. Already, high-dimensional tagging SNP sets (17) in the 100,000 to 500,000 range are available, require modest DNA volumes, and can be applied in high-throughput fashion. These strategies are already being applied to several thousand cases and controls for each of several prominent chronic diseases. Such studies can be expected to provide much insight into the importance of genotype in determining the risk of major complex diseases, and to provide leads toward further understanding of disease pathways and mechanisms, as well as targets for disease prevention, screening, or treatment. The design and analysis issues surrounding the use of these data are substantial, primarily because of the likelihood of many false positives if testing takes place at conventional significance levels. Multistage designs that screen out many of the unpromising SNPs early (18, 19) and novel procedures for controlling false discovery rates (20) seem likely to become the norm in this area.
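The false discovery rate procedure of reference 20 is short enough to state in code. A minimal sketch follows, with simulated p-values standing in for SNP association tests.

```python
# Sketch of the Benjamini-Hochberg step-up procedure (reference 20) for
# controlling the false discovery rate across many association tests.
# The p-values below are simulated placeholders.
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Return a boolean array marking discoveries at FDR level q."""
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)
    thresholds = q * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    discoveries = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])   # largest i with p_(i) <= q*i/m
        discoveries[order[: k + 1]] = True
    return discoveries

rng = np.random.default_rng(19)
null_p = rng.uniform(size=99000)           # markers with no association
signal_p = rng.beta(0.1, 40, size=1000)    # a small set of true signals
pvals = np.concatenate([null_p, signal_p])
print(benjamini_hochberg(pvals).sum(), "discoveries at q = 0.05")
```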


Biomarkers that are modified by an exposure will be needed, however, for the development and initial evaluation of prevention interventions. While it is too early to speculate on the nature of the biomarkers that will prove most useful for this purpose, measures that assess high-dimensional protein concentrations in plasma can be used to illustrate the concept. Technologies for interrogating the proteome in a high-dimensional fashion are under intensive development, for example, using separation techniques followed by mass spectrometry in subfractions (21), or antibody arrays (22). As the knowledge base develops concerning the relationship between circulating protein concentrations and a range of diseases, it may become possible to usefully assess the likely benefits and risks of an intervention by examining its effects on the expression of a large number of proteins. This type of high-dimensional intermediate outcome trial could be used to screen intervention concepts arising from observational epidemiology or other sources, or to project the health implications of hypothetical chemopreventive or behavioral interventions. Proteomic, metabolomic (small molecule) (23), and other types of biomarker data are likely to become integral to epidemiologic research within the next 25 years, though the related technologies may take some years to stabilize. The most promising interventions could then be subjected to full-scale intervention trials with clinical outcomes, if the public health implications are sufficiently great.

ORGANIZATION OF THE POPULATION SCIENCE RESEARCH COMMUNITY

The developments and potential developments alluded to above will require an enhanced level of organization and collaboration in the population science community. This is already occurring, for example, in high-dimensional genotype association studies, where consortia of cohort and case-control studies have formed to provide the needed infrastructure. Coordination across disease outcomes is another important topic in prevention research. Behavioral or chemopreventive interventions often have the potential to affect the risk of a number of chronic diseases, and the utility of an otherwise promising intervention for one disease can be thwarted by an unanticipated adverse effect on another clinical outcome, as illustrated by recent studies of cyclooxygenase-2 inhibitors for the prevention of colorectal adenoma. The development and evaluation of biomarkers of exposure, disease risk, or disease diagnosis is fundamentally a multidisciplinary effort that requires the involvement of basic and clinical scientists, in addition to population scientists (see Doll and Peto (24) for related comments dating back to the time the American College of Epidemiology was initiated).


Uncertainties concerning the needed and preferred population science research agenda imply the need both for related innovative methodologic research and for an open dialogue within and beyond the community of scientists interested in epidemiology and population science. Taken together, these lines of thinking suggest the need for additional forums composed of scientists interested in disease prevention and population science, to help identify and solicit needed research and to help prioritize disease prevention hypotheses that merit further development and evaluation using intermediate or clinical outcomes. One possible form for such a forum would be a standing population sciences cooperative group, with interdisciplinary representation from both National Institutes of Health scientists and the external scientific community. Perhaps the American College of Epidemiology can help to foster this or other organizational innovation well before its 50th Annual Meeting.

This work was supported by National Institutes of Health grants CA53996, CA86368, and CA106320.

REFERENCES


1. Prentice RL, Pyke R. Logistic disease incidence models and case-control studies. Biometrika. 1979;66:403–411.
2. Breslow NE, Day NE. Statistical Methods in Cancer Research. Volume I: The Analysis of Case-Control Studies. IARC Scientific Publication No. 32. Lyon, France: International Agency for Research on Cancer; 1980.
3. Breslow NE, Day NE. Statistical Methods in Cancer Research. Volume II: The Design and Analysis of Cohort Studies. IARC Scientific Publication No. 82. Lyon, France: International Agency for Research on Cancer; 1987.
4. Mantel N, Haenszel W. Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst. 1959;22:719–748.
5. Cox DR. Regression models and life-tables (with discussion). J R Stat Soc B. 1972;34:187–220.
6. Kalbfleisch JD, Prentice RL. The Statistical Analysis of Failure Time Data. 2nd ed. New York: Wiley; 2002.
7. Bingham SA, Luben R, Welch A, Wareham N, Khaw K-T, Day N. Are imprecise methods obscuring a relationship? Report from the EPIC Norfolk Prospective Cohort Study. Lancet. 2003;362:212–214.
8. Freedman LS, Potischman N, Kipnis V, Midthune D, Schatzkin A, Thompson FE, et al. A comparison of two dietary instruments for evaluating the fat-breast cancer relationship. Int J Epidemiol. 2006;35:1011–1021.
9. Carroll RJ, Freedman L, Kipnis V, Li L. A new class of measurement error models, with application to dietary data. Can J Stat. 1998;26:467–477.
10. Prentice RL, Sugar E, Wang CY, Neuhouser M, Peterson R. Research strategies and the use of nutrient biomarkers in studies of diet and chronic disease. Public Health Nutr. 2002;5:977–984.
11. Sugar EA, Wang CY, Prentice RL. Logistic regression with exposure biomarkers and flexible measurement error. Biometrics. 2007;63:143–151.
12. Shaw P. Estimation Methods for Cox Regression with Nonclassical Measurement Error [doctoral dissertation]. Seattle: Department of Biostatistics, University of Washington; 2006.
13. Robins JM. The analysis of randomized and non-randomized AIDS treatment trials using a new approach to causal inference in longitudinal studies. In: Sechrest L, Freeman H, Mulley A, eds. Health Services Research Methodology: A Focus on AIDS. Washington, DC: US Public Health Service, National Center for Health Services Research; 1989:113–159.
14. Taubes G. Epidemiology Monitor interviews 10 years after publication of controversial science on epidemiology. Epidemiol Monitor. 2005;26:4–30.
15. Prentice RL, Langer R, Stefanick ML, Howard BV, Pettinger M, Anderson G, et al., for the Women's Health Initiative Investigators. Combined postmenopausal hormone therapy and cardiovascular disease: toward resolving the discrepancy between Women's Health Initiative clinical trial and observational study results. Am J Epidemiol. 2005;162:404–414.
16. Prentice RL, Langer RD, Stefanick ML, Howard BV, Pettinger M, Anderson GL, et al., for the Women's Health Initiative Investigators. Combined analysis of Women's Health Initiative observational and clinical trial data on postmenopausal hormone treatment and cardiovascular disease. Am J Epidemiol. 2006;163:589–599.
17. Hinds DA, Stuve LL, Nilsen GB, Halperin E, Eskin E, Ballinger DG, et al. Whole-genome patterns of common DNA variation in three human populations. Science. 2005;307:1072–1079.
18. Prentice RL, Qi L. Aspects of the design and analysis of high-dimensional SNP studies for disease risk estimation. Biostatistics. 2006;7:339–354.
19. Skol AD, Scott LJ, Abecasis GR, Boehnke M. Joint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies. Nat Genet. 2006;38:209–213.
20. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B. 1995;57:289–300.
21. Wang H, Clouthier SG, Galchev V, Misek DE, Duffner U, Min CK, et al. Intact-protein-based high-resolution three-dimensional quantitative analysis system for proteome profiling of biological fluids. Mol Cell Proteomics. 2005;4:618–625.
22. Wang X, Yu J, Sreekumar A, Varambally S, Shen R, Giacherio D, et al. Autoantibody signatures in prostate cancer. N Engl J Med. 2005;353:1224–1235.
23. Shurubor YI, Matson WR, Martin RJ, Kristal BS. Relative contribution of specific sources of systematic errors and analytic imprecision to metabolite analysis by HPLC-ECD. Metabolomics. 2005;1:159–168.
24. Doll R, Peto R. The Causes of Cancer: Quantitative Estimates of Avoidable Risks of Cancer in the United States Today. New York: Oxford University Press; 1981. p. 1259–1260.