Genetic associations: false or true?


Update

TRENDS in Molecular Medicine

cells are, at least in part, sustained by poor production of IFN-γ by DCs in the presence of unopposed activation of GATA-3. There are additional mechanisms by which DCs might contribute to CTLA-4-dependent control of T-cell homeostasis. For example, IFN-γ released as a response to B7 activation by CTLA-4 can act in an autocrine or paracrine fashion to induce the expression of the enzyme indoleamine 2,3-dioxygenase (IDO) [16]. IDO catalyzes the oxidative cleavage of the indole ring of tryptophan, the least abundant and therefore most important essential amino acid for cell growth and proliferation. Cells synthesizing IDO could affect immune homeostasis in several ways, including tryptophan depletion, prevention of T-cell cycle progression, enhancement of activation-induced T-cell death, and direct regulatory effects of apoptotic tryptophan catabolites [19,20]. Interestingly, Th1 cells seem to be more susceptible than Th2 cells to the activity of these catabolites. It is possible that IDO-dependent effects contribute significantly to the maintenance of T-cell homeostasis [20], and could therefore play an important role in CTLA-4-dependent regulation of the size and composition of T-cell pools.

Vol.9 No.4 April 2003

Conclusions
The factors that influence the survival and homeostasis of naive, effector and memory T cells have long remained obscure. In recent years, it has become clear that the homeostasis of distinct T-cell pools is a dynamic process regulated by internal stimuli. Recent evidence indicates that CTLA-4 might be one such internal stimulus. This new perspective on CTLA-4 function offers clues to the mechanisms that regulate the size and composition of the peripheral lymphocyte pool. However, the effects on T-cell homeostasis of actions targeting co-stimulatory pathways are complex [21–23] and, hence, there are still many unanswered questions and a need for further work.

References
1 Jameson, S.C. (2002) Maintaining the norm: T-cell homeostasis. Nat. Rev. Immunol. 2, 547–556
2 Nakajima, H. et al. (1997) The common cytokine receptor γ chain plays an essential role in regulating lymphoid homeostasis. J. Exp. Med. 185, 189–195
3 Lodolce, J.P. et al. (1998) IL-15 receptor maintains lymphoid homeostasis by supporting lymphocyte homing and proliferation. Immunity 9, 669–676
4 Chan, K.F. et al. (2000) Signaling by the TNF receptor superfamily and T cell homeostasis. Immunity 13, 419–422
5 Gudmundsdottir, H. and Turka, L.A. (2001) A closer look at homeostatic proliferation of CD4+ T cells: costimulatory requirements and role in memory formation. J. Immunol. 167, 3699–3707
6 Fry, T.J. and Mackall, C.L. (2001) Interleukin-7: master regulator of peripheral T-cell homeostasis? Trends Immunol. 22, 564–571
7 Boursalian, T.E. and Bottomly, K. (1999) Survival of naive CD4 T cells: roles of restricting versus selecting MHC class II and cytokine milieu. J. Immunol. 162, 3795–3801
8 Marrack, P. et al. (2000) Homeostasis of αβ TCR+ T cells. Nat. Immun. 1, 107–111
9 Bour-Jordan, H. and Bluestone, J.A. (2002) CD28 function: a balance of costimulatory and regulatory signals. J. Clin. Immunol. 22, 1–7
10 Thompson, C.B. and Allison, J.P. (1997) The emerging role of CTLA-4 as an immune attenuator. Immunity 7, 445–450
11 Takahashi, T. et al. (2000) Immunologic self-tolerance maintained by CD25+CD4+ regulatory T cells constitutively expressing cytotoxic T lymphocyte-associated antigen 4. J. Exp. Med. 192, 303–310
12 Salomon, B. and Bluestone, J.A. (2001) Complexities of CD28/B7: CTLA-4 costimulatory pathways in autoimmunity and transplantation. Annu. Rev. Immunol. 19, 225–252
13 Bour-Jordan, H. et al. (2003) CTLA-4 regulates the requirement for cytokine-induced signals in TH2 lineage commitment. Nat. Immun. 4, 182–188
14 Waterhouse, P. et al. (1995) Lymphoproliferative disorders with early lethality in mice deficient in Ctla-4. Science 270, 985–988
15 Tivol, E.A. et al. (1995) Loss of CTLA-4 leads to massive lymphoproliferation and fatal multiorgan tissue destruction, revealing a critical negative regulatory role of CTLA-4. Immunity 3, 541–547
16 Grohmann, U. et al. (2002) CTLA-4–Ig regulates tryptophan catabolism in vivo. Nat. Immun. 3, 1097–1101
17 Murphy, K.M. and Reiner, S.L. (2002) The lineage decisions of helper T cells. Nat. Rev. Immunol. 2, 933–944
18 Ho, I.C. and Glimcher, L.H. (2002) Transcription: tantalizing times for T cells. Cell 109 (Suppl.), S109–S120
19 Munn, D.H. (2002) Tolerogenic antigen-presenting cells. Ann. New York Acad. Sci. 961, 343–345
20 Fallarino, F. et al. (2002) T cell apoptosis by tryptophan catabolism. Cell Death Differ. 9, 1069–1077
21 Bachmann, M.F. et al. (2001) Normal pathogen-specific immune responses mounted by CTLA-4-deficient T cells: a paradigm reconsidered. Eur. J. Immunol. 31, 450–458
22 Yu, X. et al. (2000) The role of B7 costimulation in CD4/CD8 T cell homeostasis. J. Immunol. 164, 3543–3553
23 Prlic, M. et al. (2001) Homeostatic expansion occurs independently of costimulatory signals. J. Immunol. 167, 5664–5668

1471-4914/03/$ - see front matter © 2003 Elsevier Science Ltd. All rights reserved. doi:10.1016/S1471-4914(03)00026-1

Genetic associations: false or true?

John P.A. Ioannidis

Clinical and Molecular Epidemiology Unit, Department of Hygiene and Epidemiology, University of Ioannina School of Medicine and Biomedical Research Institute, Foundation for Research and Technology – Hellas, Ioannina, Greece

Genetic association studies for multigenetic diseases are like fishing for the truth in a sea of trillions of candidate analyses. Red herrings are unavoidably common, and bias might cause serious misconceptions. However, a sizeable proportion of identified genetic associations are probably true. Meta-analysis, a rigorous, comprehensive, quantitative synthesis of all the available data, might help us to separate the true from the false.

Corresponding author: John P.A. Ioannidis ([email protected]).
http://tmm.trends.com


Most common major diseases appear to have a complex hereditary component. Coronary artery disease, cancer, stroke, diabetes, depression and dementia are typical examples, for which not one but several genes modulate the disease risk. However, there are more than one million common variants in our genome [1,2], and identifying those that are truly responsible for each disease represents an unprecedented challenge; the number of candidate analyses that could be undertaken is astronomical. Considering the possible combinations of polymorphisms, diseases, outcomes, genetic contrasts, subgroup analyses and replication experiments, there are trillions of genetic association analyses that could be performed (Table 1).

Data fishing is not new to epidemiologists [3] but, in the past, candidate risk factors were limited to a group of well-known players: smoking, nutrition, occupation, drinking, reckless driving, drugs and sex. A few hundred occupational hazards and a few thousand nutritional parameters (assuming we consider boiled potatoes, baked potatoes and French fries as separate risk factors) are no match for the number of candidate genetic risk factors. Methods of large-scale genotyping might facilitate analyses [4], but even if we perform 100 trillion of these overnight, we will still have to deal with five trillion false associations along with an unknown number of true ones. This is the consequence of chance and of our own rules.


We use statistical tests to claim that an association exists when the probability (p) value is < 0.05. Hence, if we run 100 tests for diverse associations, we would expect five of them to have a p value < 0.05 simply by chance alone [5]. Suppose we take 1000 patients with diabetes mellitus and 1000 disease-free controls, and have an astrologer record 100 astrological parameters for each subject. If we run the data, we might find five outrageous astrological parameters that statistically significantly predispose humans to diabetes.

Chance can play a different trick for true associations. Suppose that ten laboratories are testing whether polymorphism X affects the risk of stroke, and the truth is that X increases the odds of suffering a stroke by 30%. As a result of random variability, the ten laboratories get quite different estimates, hovering around the true value of 30%. Professor Winner finds a formally statistically significant 100% increase in odds, Doctor Onthespot finds a 30% increase, but fails to reach statistical significance owing to a limited amount of data, and Professor Badluck finds no increased risk at all. Other investigators find trends for 10–60% increased odds. A major journal then publishes Professor Winner’s discovery of a polymorphism that doubles the risk of stroke. This is the winner’s curse: the first ‘positive’ study, even on a true association, provides inflated estimates compared to the reality.
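Both effects of chance described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the author's analysis: test statistics are drawn directly from a normal distribution, each hypothetical study estimates the log odds ratio with the same standard error (0.2, an arbitrary illustrative value), and the function names are invented for this example.

```python
import math
import random
from statistics import NormalDist, mean

# Two-sided 5% significance threshold on the z scale (about 1.96).
Z_CRIT = NormalDist().inv_cdf(0.975)


def false_positive_fraction(n_tests, rng):
    """Fraction of truly null associations that reach p < 0.05 by chance.

    Assumption: under the null hypothesis each test statistic is a
    standard normal draw, independent of all the others.
    """
    hits = sum(abs(rng.gauss(0.0, 1.0)) > Z_CRIT for _ in range(n_tests))
    return hits / n_tests


def winners_curse(true_or, se_log_or, n_studies, rng):
    """Compare all studies with the statistically significant 'winners'.

    Each hypothetical study estimates the log odds ratio with normal
    error of standard deviation se_log_or. Returns the geometric-mean
    odds ratio over all studies and over the studies that found a
    significant increase in odds.
    """
    true_log_or = math.log(true_or)
    estimates = [rng.gauss(true_log_or, se_log_or) for _ in range(n_studies)]
    winners = [b for b in estimates if b / se_log_or > Z_CRIT]
    return math.exp(mean(estimates)), math.exp(mean(winners))


rng = random.Random(2003)  # fixed seed so the run is reproducible
null_fraction = false_positive_fraction(100_000, rng)
all_or, winner_or = winners_curse(true_or=1.3, se_log_or=0.2,
                                  n_studies=10_000, rng=rng)

print(f"null tests with p < 0.05: {null_fraction:.3f}")
print(f"odds ratio, all studies:  {all_or:.2f}")
print(f"odds ratio, winners only: {winner_or:.2f}")
```

With these assumed settings, a study can only reach significance if its estimated odds ratio exceeds roughly 1.48, so every "winner" necessarily overstates the true value of 1.3, while the average over all studies stays close to it.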

Table 1. Fishing analyses: counting fish in the sea of genetic association analyses

Multiplier | Parameter | Example
>10^6 | Different gene variants | There are more than one million known single-nucleotide polymorphisms and this figure increases substantially if we also consider other gene variants. Let us consider the APOE (apolipoprotein E) gene ε polymorphism
>10^3 | Different diseases with known, possible or speculated genetic component | According to MEDLINE, in the year 2002 alone, studies were published addressing the relationship of this polymorphism with familial Alzheimer’s disease, sporadic Alzheimer’s disease, colorectal cancer, fatty liver, atherosclerosis, hyperlipidemia, acute ischemic stroke, spina bifida, coronary artery disease, normal tension glaucoma, hypertension, Parkinson’s disease, diabetic nephropathy, pre-eclampsia, hepatitis-C-related liver disease, cerebrovascular disease, coronary artery disease post-renal transplantation, non-specified cognitive impairment, childhood nephrotic syndrome, spontaneous abortion, multiple sclerosis, alcohol withdrawal, cognitive dysfunction after coronary artery surgery, alcoholic chronic pancreatitis, alcoholic cirrhosis, macular toxicity from chloroquine, macular edema, aortic-valve stenosis, vascular dementia, type-II diabetes mellitus, and migraine
>10 | Different outcomes to consider per disease | For coronary artery disease one might use as outcome(s): death (any cause), death (from myocardial infarction), sudden death, myocardial infarction, unstable angina, stable angina, coronary artery stenosis, arrhythmia, congestive heart failure, myocardial hibernation, response to medical treatment, response to interventional treatment, vessel restenosis, creatine-kinase elevations, troponin-I elevations, troponin-T elevations, myoglobin elevations, ejection fraction on echocardiography, high-risk angiographic characteristics of coronary lesions, or various combinations of such endpoints
>10 | Possible genetic contrasts | Three common alleles (ε2, ε3, ε4) result in six possible genotypes (3/3, 4/3, 3/2, 4/4, 4/2 and 2/2). Alleles (or genotypes) can be compared with each other alone or in various combinations. In the simplest case of two alleles, there are seven potential genetic contrasts
>10 | Tempting subgroup splits | Men versus women, European descent versus African descent versus Asian descent, comparisons of specific ethnicities, young versus older adults, various age cut-offs, dozens of nutritional-intake subgroups, smokers versus non-smokers, and several pharmacogenetic interactions
>10 | Replications on the same question | There are already more than ten different published studies of the association of the APOE ε polymorphism with ischemic heart disease alone. It is possible that more investigators might be working on it or might have already completed their analyses, but their data are still unpublished
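The "trillions" figure follows from multiplying the lower bounds in Table 1. The short calculation below is a sketch that makes the arithmetic explicit; the dictionary keys are shorthand labels chosen for this example, and the false-positive count assumes, purely for illustration, that every analysis is a null association tested at p < 0.05.

```python
from math import prod

# Lower bounds on each multiplier, as listed in Table 1.
MULTIPLIERS = {
    "gene variants": 10**6,
    "diseases": 10**3,
    "outcomes per disease": 10,
    "genetic contrasts": 10,
    "subgroup splits": 10,
    "replications": 10,
}

# Number of distinct candidate analyses (a lower bound).
total_analyses = prod(MULTIPLIERS.values())

# At the 5% significance level, one in twenty null analyses is expected
# to look 'positive' by chance alone (illustrative assumption: all null).
expected_false_positives = total_analyses // 20

print(f"candidate analyses:        {total_analyses:,}")
print(f"expected chance positives: {expected_false_positives:,}")
```

The product is ten trillion candidate analyses, and the same one-in-twenty rate applied to 100 trillion analyses gives the five trillion false associations mentioned in the text.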


Empirical evidence, bias and heterogeneity
Lohmueller et al. have recently shown that these problems are not just theory [6]. In a large-scale empirical evaluation of 25 postulated genetic associations, they showed that the results of the first statistically significant (‘positive’) study were inflated when compared with the results of other published research on the same question. This is consistent with prior large-scale observations [7].

Omitting the first ‘positive’ study, is there still an association when only the remaining studies are considered? Lohmueller et al. captured 301 such ‘validation’ studies and combined their results [6]. To do this, one cannot simply add up the numbers, and an entire discipline, meta-analysis, has developed around how to perform these grand-scale combinations of data [8]. In this era of globalization, meta-analysis has revolutionized many scientific fields, including clinical medicine, where it is already considered to be the highest level of scientific evidence [9]. Now its time appears to have come in genetics.

In eight of the 25 genetic associations examined by Lohmueller et al. [6], the combined results of the subsequent studies were formally statistically significant, whereas the other 17 proposed associations were not replicated, and are probably therefore red herrings fished out of the sea of chance. Thus, perhaps about 30% of the current claims of significant genetic associations are indeed true.

So far we have dealt with a perfect world, in which no bias and no genuine diversity exist in estimating genetic associations. However, ‘negative’ studies that do not reach formal statistical significance might never be published. This publication bias is well documented in clinical research and it is probably more common in basic sciences [10], with the consequence that published estimates of effects might be inflated relative to the truth. Lohmueller et al.
[6] evaluated whether publication bias could explain the apparently replicated genetic associations. They estimated that all replications could be reduced to the level of chance findings if 84% of studies are not published [6]. This is unlikely, but we do not really know how many studies remain unpublished. Lohmueller et al. also applied statistical tests to evaluate whether smaller studies tend to report larger effects, an indirect hint for publication bias [11]. Positive signals were seen in only three of the 25 associations, which is encouraging, but in a larger set of 55 genetic associations we have recently documented positive signals in up to 38% of postulated associations [12]. The threat of bias should not be understated. Even if ‘negative’ studies are eventually published, the likely delay means that the scientific literature might be distorted for several years [13]. Modeling tests (cumulative meta-analysis and recursive cumulative meta-analysis [14,15]) can capture the evolution of the combined results over time and the gradual dissipation of postulated genetic effects [7].

Variability in the results of different studies on the same question does not necessarily represent bias. Genetic associations can have a different strength in different populations and in different settings. Analysis of four genetic associations by Lohmueller et al. [6] suggested that racial diversity was not strong enough to explain the heterogeneity. However, more data will be needed to reach safe conclusions on how common genuine heterogeneity is and what causes it. One size might not fit all [16]. Individualizing the genetic risk will require heterogeneity to be taken into account, including the interaction of genes with acquired environmental risk factors. However, this requires far more data than we would typically expect to have available.

A big picture of small effects
It is likely that most genuine genetic associations represent modest effects with odds ratios of 1.1–1.5 (i.e. a 10–50% relative increase in the likelihood of getting a disease). Any one polymorphism usually explains only 1–8% of the overall disease risk in the population. This might sound quite small, but the additive effect of several such risk factors could make up the 20–70% of the overall disease risk that is attributed to genetic factors in most common diseases.

An adequately powered study to detect single genetic associations would typically require one thousand subjects or more, depending on the prevalence of the implicated polymorphism and the exact odds ratio. However, most genetic association studies conducted to date have been underpowered to perform the detection task. Sample sizes have typically been 100–300 and rarely > 1000 (Fig. 1) [12]. The required sample size is even larger if the investigator wants to document the interaction of several gene variants or the interaction of genetic and environmental parameters.

How do we achieve this? One option is to conduct single very large studies with many thousands of subjects. Such studies should be encouraged, but they are costly and unlikely to be feasible for all research questions. Most of these questions will continue to be addressed using relatively small studies by many investigators around the globe. Carefully conducted meta-analyses of these data are essential for clarifying whether these associations are true or not. Meta-analyses must be performed and reported with high standards [17], and should be all-inclusive.
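To see why modest odds ratios call for a thousand subjects or more, a rough sample-size calculation helps. The sketch below uses the standard normal-approximation formula for comparing two proportions (an assumption of this example, not a method stated in the article), with a hypothetical variant carried by 30% of controls, a two-sided significance level of 0.05 and 80% power; the function names are invented for illustration.

```python
from math import ceil
from statistics import NormalDist


def carrier_fraction_in_cases(p_controls, odds_ratio):
    """Carrier frequency among cases implied by the odds ratio."""
    odds_cases = odds_ratio * p_controls / (1.0 - p_controls)
    return odds_cases / (1.0 + odds_cases)


def subjects_per_group(p_controls, odds_ratio, alpha=0.05, power=0.80):
    """Approximate cases (and equally many controls) needed.

    Uses the unpooled normal approximation for a two-sided test of the
    difference between two proportions.
    """
    p_cases = carrier_fraction_in_cases(p_controls, odds_ratio)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_controls * (1.0 - p_controls) + p_cases * (1.0 - p_cases)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_cases - p_controls) ** 2)


# Hypothetical variant carried by 30% of controls.
for odds_ratio in (1.1, 1.3, 1.5):
    n = subjects_per_group(p_controls=0.30, odds_ratio=odds_ratio)
    print(f"odds ratio {odds_ratio}: about {n} cases and {n} controls")
```

Under these assumptions an odds ratio of 1.3 needs on the order of a thousand cases and a thousand controls, and the requirement grows steeply as the odds ratio approaches 1.1, which is consistent with the point that studies of 100–300 subjects are underpowered.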
Consortium approaches, in which individual subject data are contributed from several teams to common databases, should also be encouraged [18]. Primary genetic association studies must also be conducted with appropriate attention to study design, analysis and reporting, and all scientific data should be widely available. Final proof of the identity of molecular-level risk factors (e.g. single base-pair changes) for human disease requires comprehensive large-scale evidence from thousands of individuals, and this is a prime opportunity to bridge excellence in molecular science with well-conducted clinical and epidemiological research.

[Figure 1: bar chart of number of studies (y-axis, 0–150) by sample size (x-axis, bins from 0–100 to >1500).]
Fig. 1. Sample size in 579 genetic association studies addressing 55 different genetic associations that have been evaluated in meta-analyses with binary outcomes (for a list of these associations see Ref. [12]). Sample size refers to subjects or alleles, in the case and control groups, involved in the targeted genetic contrast.

Acknowledgements
J.P.A.I. is also an Adjunct Professor of Medicine at the Division of Clinical Care Research, Department of Medicine, Tufts – New England Medical Center, Boston, MA, USA.

References
1 Sachidanandam, R. et al. (2001) A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature 409, 928–933
2 Taylor, J.G. et al. (2001) Using genetic variation to study human disease. Trends Mol. Med. 7, 507–512
3 Michels, K.B. and Rosner, B.A. (1996) Data trawling: to fish or not to fish. Lancet 348, 1152–1153
4 Mohlke, K.L. et al. (2002) High-throughput screening for evidence of association by using mass spectrometry genotyping on DNA pools. Proc. Natl. Acad. Sci. U. S. A. 99, 16928–16933
5 Oxman, A.D. and Guyatt, G.H. (1992) A consumer’s guide to subgroup analyses. Ann. Intern. Med. 116, 78–84
6 Lohmueller, K.E. et al. (2003) Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nat. Genet. 33, 177–182
7 Ioannidis, J.P. et al. (2001) Replication validity of genetic association studies. Nat. Genet. 29, 306–309
8 Cooper, H. and Hedges, L.V. (1994) The Handbook of Research Synthesis, Russell Sage Foundation
9 Harbour, R. and Miller, J. (2001) A new system for grading recommendations in evidence based guidelines. Br. Med. J. 323, 334–336
10 Easterbrook, P.J. et al. (1991) Publication bias in clinical research. Lancet 337, 867–872
11 Lau, J. et al. (1997) Quantitative synthesis in systematic reviews. Ann. Intern. Med. 127, 820–826
12 Ioannidis, J.P. et al. (2003) Genetic associations in large versus smaller studies: an empirical assessment. Lancet 361, 567–571
13 Ioannidis, J.P. (1998) Effect of the statistical significance of results on the time to completion and publication of randomized efficacy trials. J. Am. Med. Assoc. 279, 281–286
14 Lau, J. et al. (1992) Cumulative meta-analysis of therapeutic trials for myocardial infarction. N. Engl. J. Med. 327, 248–254
15 Ioannidis, J. and Lau, J. (2001) Evolution of treatment effects over time: empirical insight from recursive cumulative meta-analyses. Proc. Natl. Acad. Sci. U. S. A. 98, 831–836
16 Lau, J. et al. (1998) Summing up evidence: one answer is not always enough. Lancet 351, 123–127
17 Stroup, D.F. et al. (2000) Meta-analysis of observational studies in epidemiology: a proposal for reporting. J. Am. Med. Assoc. 283, 2008–2012
18 Ioannidis, J.P. et al. (2002) Meta-analysis of individual participants’ data in genetic epidemiology. Am. J. Epidemiol. 156, 204–210

1471-4914/03/$ - see front matter © 2003 Elsevier Science Ltd. All rights reserved. doi:10.1016/S1471-4914(03)00030-3
