Opportunities and Challenges for Genome Sequencing in the Clinic


CHAPTER THREE

Gianpiero L. Cavalleri*,1, Norman Delanty*,†

*Molecular and Cellular Therapeutics, The Royal College of Surgeons in Ireland, Dublin, Ireland
†Department of Neurology, Beaumont Hospital, Dublin, Ireland
1Corresponding author: e-mail address: [email protected]

Contents
1. Introduction
2. Milestones in the Development of Genetic Mapping
3. Strategies for the Application of Next-Generation Sequencing in Genetic Mapping
4. Consequence of Gene Discovery in Research Programs
   4.1 What is required to bring it to clinic?
5. Challenges Around Incorporating Sequencing to the Clinic
References

Abstract

Human genome sequencing technology is developing rapidly. These developments are providing exciting opportunities for genetic mapping of human traits, ranging from accelerated discovery of mutations underlying relatively simple Mendelian disorders to more genetically complex human diseases. This chapter outlines the development of whole-genome sequencing in a historical context of genetic mapping and explores the impact that sequencing is having on gene discovery study design. Using the example of epilepsy, the authors outline the opportunities and barriers for the translation of genetic predictors from discovery to the clinic. Finally, the authors discuss the practical challenges of actual implementation of whole-genome sequencing to the clinic.

1. INTRODUCTION

Whole-genome sequencing represents the ultimate step in assaying human genetic variation at the base sequence level. Over the past two decades, epidemiological studies have illustrated a clear genetic component to a wide range of diseases of public health importance. During this period, a variety of linkage and association-based methods have been developed and applied to map disease-associated genes, with notable success. The development of whole-genome sequencing, combined with the falling cost of the technology, holds great promise for accelerating the delivery of genetic predictors of human diseases and related treatment. The expectation is that over the coming decade, major inroads will be made into the understanding of the genetics of common complex diseases, in particular, providing a platform for the translation of such findings to the clinic, with resulting improvements in patient care.

This chapter seeks to outline the development of whole-genome sequencing in a historical context of genetic mapping and explore the impact that sequencing is having on gene discovery study design. We then, using the example of epilepsy, outline the opportunities and barriers for the translation of genetic predictors to the clinic. Finally, we discuss the challenges of actual implementation of whole-genome sequencing in the clinic.

Advances in Protein Chemistry and Structural Biology, Volume 89. ISSN 1876-1623. http://dx.doi.org/10.1016/B978-0-12-394287-6.00003-3. © 2012 Elsevier Inc. All rights reserved.

2. MILESTONES IN THE DEVELOPMENT OF GENETIC MAPPING

The human genome is approximately 3 billion base pairs in length, and a pairwise comparison will show that any two ‘unrelated’ individuals differ at around 0.1% of base sequence (or around 3.5 million bases; Pelak et al., 2010). These differences are important: they, in part, determine what makes us all morphologically and behaviorally unique individuals. By allowing the assessment of this variation, in a complete manner, at a relatively low price, the development of whole-genome sequencing opens up various new avenues for the acceleration of gene discovery.

To appreciate the significance of whole-genome sequencing, it is important to be familiar with the milestones in methodological development for disease gene mapping in humans. Until now, mapping disease-causing variation was more about the coupling of limited genotyping and sequencing capacity with a very basic map of the human genome. Indeed, the very first disease genes were identified without any knowledge of a genetic map, simply because no such map existed. In the case of sickle cell anemia and phenylketonuria, the breakthrough was the identification of the affected protein (β-globin and phenylalanine hydroxylase, respectively), which then allowed the generation of probes that could be hybridized to restriction digests of human chromosomes (Flavell, Kooter, De Boer, Little, & Williamson, 1978). Restriction digests of human DNA were the first step toward the development of a very crude genetic map of the human genome and a platform for linkage mapping (Botstein, White, Skolnick, & Davis, 1980). The identification of highly variable microsatellites, regularly spaced across the genome, refined the genetic map further (Donis-Keller et al., 1987).

By testing for cosegregation of microsatellites with a disease phenotype in large pedigrees, one can statistically ‘link’ the disease to a specific region of the genome. Sequencing through that region for potentially disease-causing variation was an arduous and expensive task, given that the linked regions were often several megabases in size, and Sanger sequencing is limited to reads of around 300 bp/reaction. Nevertheless, linkage analysis has proven a highly effective tool for the identification of Mendelian disease genes, in particular where clear segregation patterns following theoretical disease models (dominant, recessive, etc.) drive highly significant linkage scores (LOD scores), providing the confidence to invest in the difficult follow-up fine mapping.

Given the success of linkage analysis in mapping disease genes for familial disorders (genes and related disease phenotypes for over 3500 Mendelian disorders are described in the “Online Mendelian Inheritance in Man” database), the community turned its attention to more common, sporadic, and presumably genetically complex diseases of public health importance, such as cardiovascular disease, diabetes, and obesity. The technique of linkage evolved in the sense that it was applied to larger, more complex pedigrees, and nonparametric approaches removed the restrictions of Mendelian disease models, which are of course absent from more complex, sporadic conditions. Further, the resolution of the genetic map of the human genome improved steadily (Dib et al., 1996; Kong et al., 2004; Matise et al., 2003). However, despite these developments, linkage has largely proven a disappointment when applied to common diseases. The apparent failure of linkage led to the emergence of ‘association’ as a viable and theoretically more powerful alternative for genetic mapping in common disease.
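
The LOD score referred to above can be made concrete with a small calculation. The sketch below assumes the simplest textbook setting, which the chapter does not spell out: phase-known, fully informative meioses, of which a given number are recombinant between the marker and the disease locus. The function names `lod_score` and `max_lod`, and the grid search over the recombination fraction, are illustrative choices and not the method of any particular linkage package.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """LOD = log10(L(theta) / L(0.5)) for phase-known, informative meioses.

    theta is the recombination fraction between marker and disease locus;
    theta = 0.5 corresponds to no linkage (the null hypothesis).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    non_recombinants = meioses - recombinants
    likelihood = (theta ** recombinants) * ((1.0 - theta) ** non_recombinants)
    if likelihood == 0.0:
        # e.g. theta = 0 but at least one recombinant was observed
        return float("-inf")
    null_likelihood = 0.5 ** meioses
    return math.log10(likelihood / null_likelihood)


def max_lod(recombinants: int, meioses: int, steps: int = 50):
    """Return (best LOD, theta at which it occurs) over an evenly spaced grid."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    return max((lod_score(recombinants, meioses, t), t) for t in grid)
```

Under these assumptions, ten informative meioses with no recombinants give a LOD of about 3.0 at a recombination fraction of zero, which illustrates why large pedigrees with clear segregation were needed to drive significant scores.
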
Where linkage examines segregation between genotype and phenotype within familial data, association examines the frequency distribution of a particular variant (or set of variants) across unrelated samples assigned to case and control cohorts. A significant overrepresentation of a variant in either the case or the control cohort would be indicative of ‘association’ between that variant and the (disease) phenotype in question. The association strategy was motivated further by a highly cited paper by Risch and Merikangas, which argued that association was theoretically more powerful than linkage for mapping genetic variation for complex traits (Risch & Merikangas, 1996). A fundamental component of early association-based mapping was the selection of particular candidate genes (or regions) based on biological knowledge of the trait, which inherently limited the approach.
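
For a single variant, the case:control comparison described above reduces to a 2 × 2 table of allele counts. The following is a minimal sketch, assuming a plain Pearson chi-square test on allele counts without continuity correction and with no empty cells; the function name is hypothetical and the counts in the usage note are invented purely for illustration.

```python
import math


def allelic_association(case_alt, case_ref, control_alt, control_ref):
    """Return (odds_ratio, chi_square, p_value) for a 2x2 allele-count table.

    Assumes all four counts are positive. The p value uses the identity
    P(X > x) = erfc(sqrt(x / 2)) for a chi-square variable with 1 degree
    of freedom.
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi_square = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return odds_ratio, chi_square, p_value
```

With invented counts of 30 variant and 70 reference alleles in cases against 10 and 90 in controls, the function returns an odds ratio of about 3.9 and a chi-square of 12.5 (p of roughly 0.0004). A single nominally significant test of this kind does not account for multiple testing or population structure.
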


Candidate gene selection contrasts with the ‘genomic’ nature of linkage, where knowledge of disease biology was not a prerequisite to success. With the benefit of hindsight, it is clear that association studies in genetic mapping quickly led to a flood of false-positive associations being reported in the literature (Campbell & Rudan, 2002). The statistical analysis was not conducted to appropriately rigorous standards. Several factors contributed to the problem, including a lack of effective approaches to controlling for cryptic population stratification and a systematic failure to control for multiple testing (at the level of both the number of phenotypes and genetic variants considered in a given experiment). The resulting situation was unsustainable for the community of human geneticists.

The experience gained through the development and application of genome-wide association studies (GWAS) changed the situation in genetic mapping dramatically. Linkage mapping (as the name suggests) takes advantage of the manner in which individuals in a pedigree inherit ‘linked’ genetic alleles along chromosomes and also how the relationship between these alleles is broken down by recombination. GWAS similarly takes advantage of linkage disequilibrium (LD) patterns, but at a population level. Through the early part of the new millennium, various research groups began characterizing LD at a fine-scale level in the human genome. The emerging picture was one of a block-like structure across the human genome, of extended stretches of LD (Reich et al., 2001). The nature and extent of this LD is attractive for association studies because, in theory, a subset of carefully selected variants could “tag” or represent other variants across the genome. This concept motivated the HapMap project, a major international effort to characterize linkage disequilibrium across the genome in various global populations (International HapMap Consortium, 2005).
The HapMap project in turn provided the dataset for the selection of genome-wide tagging sets, which, combined with progress in chip-based genotyping technology (from companies such as Illumina and Affymetrix), led to the development of so-called GWAS chips. The concept of genome-wide association was born. The first published GWAS illustrated the potential of the approach. When GWAS was applied to 96 cases with age-related macular degeneration (AMD) and 50 controls, a variant in the complement factor H gene was discovered to have an odds ratio of 7.4 for AMD (Klein et al., 2005).

The genomic nature of the GWAS approach addressed the major limitation of candidate gene association studies, but it also provided datasets to address the statistical issues that plagued the candidate gene approach: population stratification and multiple testing. When analyzed using tools such as principal components analysis, the genotypes provided across hundreds of thousands of loci (in multiple individuals) allowed remarkable resolution of a population structure that appeared shaped largely by geographic barriers (Li et al., 2008; Novembre et al., 2008). Methods were quickly developed to integrate this individual-level ancestry into regression-based association analysis, in large part addressing the issue of cryptic population structure (Price et al., 2006), or to provide more accurate estimates of stratification-induced genomic ‘inflation’ (Devlin, Bacanu, & Roeder, 2004). Similarly, the huge datasets forced the community to develop and apply rigorous methods for controlling for multiple testing. A consensus was quickly reached on a threshold for genome-wide significance (the often quoted value of p = 5 × 10⁻⁸). The number of reliable and reproducible genetic predictors of complex traits steadily increased (for a complete catalog, see: http://www.genome.gov/gwastudies/).

The development and application of GWAS had a major impact on how genetic mapping studies were designed and conducted. The importance of cohort size for study ‘power’ was well understood before the GWAS era, but studies were nevertheless, on the whole, relatively small (<500 samples). This changed dramatically with GWAS. Over a short timeframe consortia developed, focusing on a variety of heavily studied traits and diseases. These consortia grew to the point that within 5 years of the birth of GWAS, studies involving over 100,000 samples pooled across many research groups were reported in the literature (Lango Allen et al., 2010; Speliotes et al., 2010). Although positive findings were possible from much smaller studies (e.g., the AMD finding discussed above), it quickly became apparent that effect sizes corresponding to an odds ratio of >2 were rare. Massive, consortia-led studies became the norm.
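
Two of the statistical safeguards mentioned above can be sketched in a few lines. The chapter quotes the genome-wide threshold without deriving it; a commonly cited rationale, assumed here rather than taken from the text, is a Bonferroni correction of a 0.05 error rate over roughly one million independent common-variant tests. The genomic inflation factor is sketched in its usual form, as the ratio of the observed median association statistic to the median expected under the null. Function and constant names are hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    return alpha / n_tests


def genomic_inflation(chi_square_stats) -> float:
    """Genomic inflation factor (lambda) from 1-df chi-square test statistics.

    Values near 1 suggest little inflation; values well above 1 can indicate
    population stratification or other systematic bias.
    """
    return statistics.median(chi_square_stats) / CHI2_1DF_MEDIAN
```

Calling `bonferroni_threshold(0.05, 1_000_000)` reproduces the quoted value of 5 × 10⁻⁸ under the one-million-tests assumption. In practice the inflation factor would be computed over all variants tested in a study, not over a handful of values.
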
It is interesting to note the effect the success of GWAS has had on study design. The elegant and incredibly effective “linkage” design, so successfully applied to traits following Mendelian segregation patterns, became all but unfashionable. Most studies ignored pedigree structure, and in fact actively avoided it, favoring large collections of unrelated individuals (and indeed filtering out related samples from cohorts). This “sledgehammer” approach was generally convenient, as established epidemiological studies could often provide thousands of (unrelated) DNA samples. Ironically, the effects being detected and reported by these huge studies (typically odds ratios in the range of 1.05–1.3) have little or no clinical relevance, although they can sometimes (when the causal variant is identified) inform on disease biology.


The cost of next-generation sequencing became relatively affordable around the period (2009) that the community of human geneticists realized that ‘standard’ GWAS was (in most, but not all cases) unable to fully characterize the genetic architecture of a trait or disease (Manolio et al., 2009). We stress “relatively affordable” because in 2009, the cost of a full genome sequence was still around €50,000. Initial gene-mapping efforts incorporating whole-genome sequencing therefore focused on carefully selected pedigrees, typically involving Mendelian disorders that could not be explained by more traditional methods (e.g., linkage or homozygosity mapping). An early success story was the identification of compound heterozygous mutations in SH3TC2 as causal in a pedigree with Charcot–Marie–Tooth neuropathy (Lupski et al., 2010). By sequencing just one individual (the proband), the authors were able to filter the resulting set of approximately 3.4 million single-nucleotide variants down to a remarkable two variants, both of which, when genotyped in the larger pedigree, segregated with the disease phenotype (Lupski et al., 2010).

3. STRATEGIES FOR THE APPLICATION OF NEXT-GENERATION SEQUENCING IN GENETIC MAPPING

The arrival of whole-genome sequencing has had a major influence on study design in genetic mapping. Where GWAS drove the assembly of large case:control cohorts, often through multinational consortia (as described above), the arrival of sequencing has placed emphasis back on pedigrees. Broadly speaking, there are four strategies for the application of next-generation sequencing in genetic mapping: (i) sequencing one or a number of affected individuals from within a large pedigree, (ii) sequencing trios (unaffected parents and affected child) for the identification of de novo mutations, (iii) sequencing large numbers of unrelated cases, and (iv) sequencing extremes of a trait distribution.

The selection of one or several individuals from a disease pedigree for sequencing is dependent in part on the nature of the pedigree in question. If candidate genes for the disease are already well described, sequencing of a single affected individual, with follow-up segregation analysis of candidate causal variation, may be sufficient. The case of SH3TC2 and Charcot–Marie–Tooth disease outlined above is a good example of this strategy. If the disease gene is unknown, sequencing distantly related affected individuals from a large pedigree can be very powerful. The extent of genomic sharing (identical by descent, IBD) between any pair of related individuals in a pedigree decreases as the genetic distance between those two individuals increases. Under a model where one segregating locus is causing the disease phenotype in a pedigree, sequencing and characterizing regions that are IBD between the two most distantly related affected individuals is the most cost-effective way of identifying a causal haplotype harboring the disease-causing variation.

A further strategy for sequencing within a disease pedigree is to integrate results from linkage analysis with sequence data from the proband. When linkage is applied to small pedigrees, or to cases where the penetrance of a mutation is incomplete, study power is compromised. The limitations of Sanger sequencing are exposed in these situations, leading to study ‘dead ends.’ Next-generation sequencing reopens the case on pedigrees where LOD scores failed to reach the threshold for genome-wide significance, as sequence from an affected individual allows interrogation for candidate mutations within (modest) linkage peaks (Bailey-Wilson & Wilson, 2011). A good example of the combined linkage and sequence approach is the report by Sobreira et al., in which a deletion in PTPN11 was identified as the causal mutation in a family with metachondromatosis (Sobreira et al., 2010).

Sequencing parent–child trios is a powerful approach for the identification of de novo disease-causing variation. The rate of mutation across coding regions of the human genome is approximately 1.38 × 10⁻⁸ per base pair per generation, which equates to roughly 40 de novo mutations per generation (Tennessen et al., 2012). Deleterious de novo mutations can of course be disease causing, but until recently, a genomic assessment at the nucleotide level was not practical. The first high-profile example of the combined trio/exome sequencing approach for the assessment of de novo mutation was reported in 2010 by a Dutch group, who sequenced 10 individuals (and unaffected parents) with unexplained learning disability.
Six of the ten individuals carried likely pathogenic de novo mutations (Vissers et al., 2010). A similar approach has been applied on a much larger scale to sporadic cases of autism spectrum disorder, showing an important (but limited) role for de novo variation in this condition (Neale et al., 2012; Sanders et al., 2012).

The strategy of sequencing large numbers of unrelated cases is analogous to that typically applied in GWAS: searching for variants that are enriched in case genomes when compared with control genomes. The approach is expensive but is better suited to more complex, less familial conditions. Study power is a major issue with association-style sequencing approaches; unless effect sizes are large, sizeable cohorts of cases and controls are required to provide sufficient power to detect effects (Nelson et al., 2012; Tennessen et al., 2012).
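
Two quantities from the pedigree and trio strategies above lend themselves to back-of-envelope calculation. The sketch below makes assumptions that are not in the chapter: relatedness is summarized by Wright's coefficient of relationship for non-inbred pedigrees (the expected proportion of alleles shared IBD), and the de novo count is taken as the quoted per-base rate multiplied by a genome length of about 3 billion base pairs, that is, per transmitted haploid genome copy. Function names are hypothetical.

```python
def coefficient_of_relationship(path_meioses) -> float:
    """Wright's coefficient of relationship for a non-inbred pedigree.

    path_meioses lists, for each path connecting the two relatives through a
    common ancestor, the number of meioses along that path. Each path
    contributes (1/2) ** meioses.
    """
    return sum(0.5 ** m for m in path_meioses)


def expected_de_novo(rate_per_bp: float, genome_length_bp: float) -> float:
    """Expected number of new mutations in one transmitted genome copy."""
    return rate_per_bp * genome_length_bp
```

For first cousins (two paths of four meioses each) the coefficient is 0.125, and for second cousins (two paths of six meioses each) it falls to about 0.03, so the regions shared IBD by distantly related affected individuals cover only a small part of the genome. With the figures quoted in the text, `expected_de_novo(1.38e-8, 3e9)` gives about 41, in line with the "roughly 40" stated above.
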


A cost-effective “two-stage” variation on this approach is, in stage 1, to sequence a subset of unrelated cases and controls and then, based on p values from univariate case:control association, to prioritize for stage 2 a subset of variation for follow-up genotyping (using, e.g., a custom array chip) in larger case and control cohorts (Li, Lewinger, Gauderman, Murcray, & Conti, 2011). This approach has recently been applied to epilepsy, although no clear, reproducible risk variants were identified (Heinzen et al., 2012).

The concept of the “extremes of a phenotype” approach to gene mapping has long been appreciated. The first clear example in the context of rare variation was reported in 2004, when a team conducted candidate gene (Sanger) sequencing across three genes known to harbor Mendelian mutations for low plasma levels of high-density lipoprotein cholesterol (HDL-C). Missense sequence variants were significantly more common in individuals at the low HDL-C extreme than in those at the high HDL-C extreme (Cohen et al., 2004).

4. CONSEQUENCE OF GENE DISCOVERY IN RESEARCH PROGRAMS

4.1. What is required to bring it to clinic?

The development of next-generation sequencing and integration of the technique into genetic mapping studies holds great promise for accelerated discovery in disease genomics. Discovery in basic research presents opportunity for translation to the clinic. The following is a discussion, using the example of epilepsy, of the challenges surrounding translation of gene discovery efforts to the clinic.

Epilepsy encompasses a diverse group of brain disorders characterized by recurrent unprovoked seizures (Berg & Cross, 2010; Duncan, Sander, Sisodiya, & Walker, 2006). With a lifetime prevalence of between 1% and 2%, epilepsy is the most common serious disorder of the brain. Six million people in Europe have epilepsy and approximately 300,000 individuals are diagnosed each year. One-third of individuals with epilepsy are refractory to treatment and continue to have habitual seizures (Kwan, Schachter, & Brodie, 2011). Refractory epilepsy can have profound negative psychosocial consequences for the patient across educational, vocational, and psychological domains. Refractory epilepsy is also a strong risk factor for sudden unexplained death in epilepsy (SUDEP), a complication with obvious, devastating impact. There are many risk factors for epilepsy in general, including genetic abnormalities, CNS insults, brain lesions, and precipitating injuries in early life such as complicated febrile convulsions, meningitis, and encephalitis; there is also important interaction between these factors (Berkovic, Mulley, Scheffer, & Petrou, 2006).

4.1.1 Epilepsy has a significant genetic component

One percent of epilepsies exhibit a clear familial, or Mendelian, pattern of inheritance. In the other 99% of common, sporadic cases, the influence of genes is also considerable. Twin studies indicate a heritability of 70–80% (Berkovic, Howell, Hay, & Hopper, 1998; Kjeldsen, Corey, Christensen, & Friis, 2003), and risks for sporadic epilepsy are increased two- to fourfold in the first-degree relatives of individuals with idiopathic or nonlesional focal epilepsy (Ottman, Lee, Hauser, & Risch, 1998; Winawer, Rabinowitz, Pedley, Hauser, & Ottman, 2003). Heritability studies to date also suggest that both shared and seizure-specific genetic risk factors exist (Ottman et al., 1998; Winawer et al., 2003).

Since the discovery of CHRNA4 as an epilepsy gene in humans (Steinlein et al., 1995), linkage strategies applied to rare monogenic forms of epilepsy have led to the identification of more than 20 genes with a major effect on susceptibility (see Fig. 3.1; Helbig, Scheffer, Mulley, & Berkovic, 2008; Turnbull et al., 2005; Weber & Lerche, 2008). Almost all of the genes encode ion channels, providing insight into the pathogenic mechanisms that can lead to epilepsy and leading to the conceptualization of epilepsy as a “channelopathy.” However, the gene identifications in Mendelian epilepsy syndromes apply to an extremely small proportion of affected individuals.

Figure 3.1 A timeline of genetic mapping in epilepsy (time axis 1989–2013). Eras shown: linkage studies in familial epilepsy; candidate gene association studies in sporadic epilepsy. Events marked: first epilepsy gene (Steinlein et al., 1995); first HLA marker for carbamazepine hypersensitivity (Chung et al., 2004); discovery of CNVs in epilepsy (Helbig et al., 2009; Mefford et al., 2010); first epilepsy GWAS (Kasperaviciute et al., 2010); first exome-sequence epilepsy study (Heinzen et al., 2012).

Only approximately 10% of people with epilepsy have affected relatives and, of these, only a fraction show Mendelian segregation patterns. As alluded to above, the methodology underpinning linkage analysis falters when applied to many familial and all nonfamilial forms of the condition. The genes that influence risk in the vast majority of epilepsies therefore remain to be identified and are likely to reveal new pathogenic mechanisms that have relevance for treatment or prevention.

4.1.2 Association mapping in sporadic epilepsy

Initial efforts at association mapping in epilepsy were restricted to small (underpowered) studies focusing on candidate genes. Reflecting the situation in other disease fields, these studies experienced limited success, and few findings have been confirmed (Cavalleri et al., 2005; Tan, Mulley, & Berkovic, 2004). Several major genome-wide association studies are now underway in epilepsy, but few findings have emerged from these studies so far. In a large GWAS involving over 3000 patients and 4000 controls, we observed only a marginal contribution of common SNPs in conferring risk broadly among sporadic focal epilepsies (Kasperaviciute et al., 2010), a finding similar to that in other neuropsychiatric disorders such as schizophrenia, autism, and bipolar disorder (Carroll & Owen, 2009; Purcell et al., 2009). A more recent effort in patients of Chinese descent highlighted variation in CAMSAP1L1 as a potential risk factor for epilepsy, although independent replication is required (Guo et al., 2012).

On the other hand, clear connections are being established between copy number variants (CNVs) and risk for epilepsy. Recent studies of idiopathic generalized epilepsies (Helbig et al., 2009; Mefford et al., 2010) have found a strong association with a microdeletion at 15q13.3, also reported in schizophrenia (Stefansson et al., 2008), mental retardation (Sharp et al., 2008), and autism (Pagnamenta et al., 2009). In a larger study involving nearly 4000 cases and 1500 controls, we observed a clear enrichment of large heterozygous deletions in patients with epilepsy, with deletions exceeding two megabases observed exclusively in epilepsy patients (Heinzen et al., 2010). The identified deletions are scattered across the genome, with only a few loci affected multiple times.

There is strong evidence that genetic factors also play an important role in SUDEP. Long QT syndrome (LQTS) and Brugada syndrome are recognized causes of (sometimes fatal) ventricular arrhythmias and are caused by mutations in several different potassium and sodium channel genes, some of which are also epilepsy genes.


Indeed, some SUDEP patients have been shown to carry mutations in LQTS genes (Tu, Bagnall, Duflou, & Semsarian, 2011; Tu, Waterhouse, Duflou, Bagnall, & Semsarian, 2011). There have also been cases of SUDEP reported in patients carrying known epilepsy-causing mutations. Examples include two patients (from the same family) with GEFS+ caused by an SCN1A mutation (Hindocha et al., 2008) and a Dravet syndrome patient also carrying a pathogenic SCN1A mutation (Le Gal et al., 2010).

In terms of clinical utility, under gold-standard care, mutation screening is used in the diagnostic workup of several forms of epilepsy, including generalized epilepsies with febrile seizures plus (GEFS+, screening of SCN1A and SCN1B), severe myoclonic epilepsy (Dravet syndrome, screening of SCN1A, SCN2A, SCN3A, SCN9A, and SCN1B), infantile spasms (e.g., screening of CDKL5), seizures with cortical malformations (e.g., screening of FLNA), epilepsy with learning disability (screening of various genes and CNVs), and some idiopathic forms of generalized epilepsy (Pal, Pong, & Chung, 2010). However, despite the diversity of seizure-related conditions for which genetic testing is of value, it should be noted that most of these cases are familial and rare (accounting for <1% of cases of epilepsy). A major issue with the integration of genetic testing into more sporadic forms of epilepsy (e.g., CNVs and sporadic epilepsy) is the lack of correlation between a given mutation and specific, clinically useful aspects of the epilepsy phenotype (Mulley & Mefford, 2011; Sisodiya & Mefford, 2011).

4.1.3 Association mapping in epilepsy pharmacogenomics

Programs of pharmacogenomics research in epilepsy are less well developed than those focused on predisposition. Despite the availability of genome-wide association and, more recently, whole-genome sequencing, genetic mapping studies in epilepsy treatment have largely been limited to candidate gene association studies (Cavalleri, McCormack, Alhusaini, Chaila, & Delanty, 2011). Nevertheless, a number of reliable associations have been identified. Individuals carrying CYP2C9*2 and *3 alleles have a decreased capacity to metabolize phenytoin, a commonly prescribed anti-epileptic drug (AED). Variation in metabolism due to CYP2C9 genotype can affect dose and, consequently, the incidence of dose-related adverse drug reactions including dizziness, ataxia, and slurred speech (Aynacioglu et al., 1999; Hung, Lin, Chen, Chang, & Liou, 2004). The HLA-B*1502 allele is a strong predictor of Stevens–Johnson syndrome, a rare but devastating adverse reaction associated with use of carbamazepine, the most commonly prescribed AED (Chung et al., 2004). However, the effect of the *1502 allele is limited to populations of Asian descent, in which the allele is polymorphic (Alfirevic et al., 2006; Lonjou et al., 2008). An apparently more global (in terms of ethnicity) predictor was recently discovered when we and others identified HLA-A*3101 as a predictor of carbamazepine-related maculopapular exanthema, hypersensitivity syndrome, and Stevens–Johnson syndrome in populations of European and Asian descent (McCormack et al., 2011; Ozeki et al., 2011). Clinical uptake of pharmacogenomic tests has been slow, at least in part because prospective trials are lacking and genetic tests can be slow and expensive. See Cavalleri et al. (2011) for a recent review of epilepsy pharmacogenomics.

4.1.4 Accelerating discovery

Within the next few years, the ability of the gene-mapping community to identify pathogenic, epilepsy-related variation is set to accelerate further. Two large genetic mapping efforts in epilepsy are underway, and it is hoped that, from 2011 to 2015, these and other established efforts will act as major discovery pipelines for epilepsy. The first is an FP7-funded project (EpiPGX, www.epipgx.eu) focused specifically on delivering pharmacogenetic biomarkers for guiding treatment in epilepsy. The project will involve the application of GWAS and targeted whole-genome sequencing to 8000 epilepsy patients from different centers across Europe. The second is a NINDS-funded project (Epi4K, www.epgp.org/epi4k) focused specifically on epilepsy predisposition, which will involve whole-genome sequencing in 4000 epilepsy patients from Europe, Australia, and the United States (Epi4K, 2012).
4.1.5 Translating discovery

The translation, over the coming years, of advances in epilepsy genomics (and clinical genomics in general) will mean significant challenges as well as opportunities for health care and society (Auffray et al., 2012; Green & Guyer, 2011; Hudson, 2011). If genetic discovery sits in isolation, then there is a danger that translation to real-life clinical care will lag behind for many years to come. This is particularly relevant to epilepsy, as the heterogeneity of the condition and the many AEDs used to treat it underscore the need for clear stratification of patients according to genetic risk, allowing further study in a clinical environment.


A major challenge in translating discovery genetics to the clinic is to determine the clinical utility of findings from large-scale gene-mapping research projects. Using the example of epilepsy, the only way we can understand the translational implications of the risk factors identified by large-scale discovery efforts is by stratifying patients currently under treatment by those risk factors and following the course of each patient's epilepsy over time, to understand how genetic variants determine treatment response and prognosis. By studying the genetic data in the full richness of the clinic, one can shift the focus of understanding to the point of clinical care. Characterized genetic stratifiers can then provide the platform for structured clinical trials of novel treatments based on biologically defined groups of patients.

The challenges regarding the translation of pharmacogenomic discovery to the clinic are slightly different. Focusing on epilepsy, genetic variation across HLA and P450 genes provides accepted examples of clinically valuable pharmacogenomic predictors, of carbamazepine-induced Stevens–Johnson syndrome and phenytoin metabolism, respectively. Yet uptake of these tests at epilepsy clinics is relatively poor. In part, this is because the clinical community is, generally speaking, much more comfortable integrating novel genetic tests after prospective trials have been conducted and reported. Taking the example of HLA-B*1502 and Stevens–Johnson syndrome, the discovery was made in 2004, but the subsequent prospective trial was only completed in 2011 (Chen, 2011). Indeed, this remains the only prospective trial for a known epilepsy pharmacogenomic test attempted to date.

5. CHALLENGES AROUND INCORPORATING SEQUENCING TO THE CLINIC

The cost of high-throughput sequencing is plummeting. Today an exome sequence costs in the region of €800, comparable to the price of a routine MRI scan in epilepsy. Although access to a patient’s genome sequence is today of limited clinical use (particularly in the case of epilepsy), its value will steadily increase as our knowledge of genetic risk factors continues to improve. It is therefore critical that we begin to develop our ability to act on the resulting information. We need to formally examine how best to incorporate this knowledge in a busy real-life clinical setting. We need to develop systems that allow seamless integration of emerging genomic information. Next-generation sequencing offers a fantastic opportunity to accelerate and optimize genetic screening for the rare but diverse forms of epilepsy


where mutation detection is of clinical value (see discussion above). Up to now, mutation screening has been conducted through Sanger sequencing, a slow and expensive process. For example, a screen of SCN1A in a single patient costs in the region of €800. Given that more than 20 “epilepsy” genes (causing Mendelian forms) are already known, one can appreciate that a negative screen of a prime candidate such as SCN1A does not rule out an obvious cause at another locus. Exome screening offers an immediate alternative that is both cost effective and efficient, screening all known epilepsy genes in a single process. There are issues surrounding the reliability of exome sequencing in the context of clinical genetics (Desai & Jere, 2012), but they will inevitably be addressed given the clear benefits of the genome-wide screen. Indeed, the diagnostic potential of next-generation sequencing in epilepsy has recently been illustrated: targeted resequencing of 265 candidate epilepsy genes revealed putative causal mutations in 16 of the 33 patients tested (Lemke et al., 2012).

The extent and depth of genetic diversity being revealed by large-scale sequencing studies point to the challenges ahead in assigning causality to a particular variant or set of variants (Nelson et al., 2012; Tennessen et al., 2012). In epilepsy, given the number of genes known to harbor seizure-inducing variants, identifying causal variants beyond those that are either de novo or segregate clearly in a pedigree is a massive challenge. Central repositories of variation discovered in epilepsy genomes might help address this issue, although the population specificity of rare variation will complicate the picture further.

As our ability to interpret genetic variation in a disease context improves, so too does the challenge of communicating this information to the patient.
There are many issues to be resolved here including, for example, how the information is presented to the clinical team and indeed how to train clinical teams to handle extensive genetic information (especially when relevant training in medical schools tends to be heavily focused on Mendelian genetics). How a patient responds to being presented with extensive genetic information is also poorly understood. The case of a single disease-causing mutation is in itself a challenge in that counseling often extends to family members. But the situation changes dramatically when you consider the genome as a whole, and the full spectrum of health-related information that is contained within it. For example, in the case of a patient with familial epilepsy, should counseling be limited to a mutation in SCN1A or rather should it extend to other clinically relevant details such as risk factors for celiac disease or


diabetes? This challenge will become more acute as our understanding of genetic variation continues to evolve and improve. The practical challenges to be addressed before genomics data can be incorporated into the clinic include data interpretation, storage, effective reporting, patient and peer education, as well as ethical, social, and economic issues. Furthermore, practitioners’ perception of the clinical relevance of genomic information, citizens’ concerns about rights and privacy in the genomic era, implications of identified genetic risk for the patient and other family members, and the perceived cost of the technology are all potential barriers to the translation of genomics into clinical practice. Informed policy, legislation, and guidelines must emerge from national authorities, professional bodies, and government to guide the safe and effective application of genomic medicine for all citizens.
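The cost and yield figures quoted in this section lend themselves to a short worked calculation. The Python sketch below is illustrative only. It assumes that screening each known Mendelian epilepsy gene by Sanger sequencing costs about the same as the SCN1A screen quoted above (an assumption made here for illustration, not a figure from the literature), it treats "more than 20 genes" as exactly 20, and it attaches a Wilson score interval to the 16-of-33 diagnostic yield reported by Lemke et al. (2012) to show how much uncertainty remains in a cohort of that size. All function and variable names are hypothetical.

```python
from math import sqrt

# Approximate figures quoted in the text (euro, circa 2012).
SANGER_COST_PER_GENE = 800   # screen of SCN1A in a single patient
EXOME_COST = 800             # one exome sequence
KNOWN_EPILEPSY_GENES = 20    # the text says ">20"; 20 is a lower bound


def sequential_sanger_cost(n_genes, cost_per_gene=SANGER_COST_PER_GENE):
    """Cost of screening n_genes one at a time by Sanger sequencing.

    ASSUMPTION: every gene costs the same as SCN1A. Real costs vary
    with gene size, so this is a rough illustration only.
    """
    return n_genes * cost_per_gene


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


sanger_total = sequential_sanger_cost(KNOWN_EPILEPSY_GENES)
cost_ratio = sanger_total / EXOME_COST

# Diagnostic yield from targeted resequencing (Lemke et al., 2012).
yield_point = 16 / 33
low, high = wilson_interval(16, 33)

print(f"Sequential Sanger screen of {KNOWN_EPILEPSY_GENES} genes: EUR {sanger_total}")
print(f"Single exome: EUR {EXOME_COST} (ratio {cost_ratio:.0f}x)")
print(f"Diagnostic yield: {yield_point:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Under these assumptions, screening the known genes one by one would cost about twenty times as much as a single exome, and the interval around the reported yield runs from roughly one-third to two-thirds of patients, a reminder that the estimate comes from a small cohort.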

REFERENCES

Alfirevic, A., Jorgensen, A. L., Williamson, P. R., Chadwick, D. W., Park, B. K., & Pirmohamed, M. (2006). HLA-B locus in Caucasian patients with carbamazepine hypersensitivity. Pharmacogenomics, 7(6), 813–818.
Auffray, C., Caulfield, T., Khoury, M. J., Lupski, J. R., Schwab, M., & Veenstra, T. (2012). Looking back at genomic medicine in 2011. Genome Medicine, 4(1), 9.
Aynacioglu, A. S., Brockmoller, J., Bauer, S., Sachse, C., Guzelbey, P., Ongen, Z., et al. (1999). Frequency of cytochrome P450 CYP2C9 variants in a Turkish population and functional relevance for phenytoin. British Journal of Clinical Pharmacology, 48(3), 409–415.
Bailey-Wilson, J. E., & Wilson, A. F. (2011). Linkage analysis in the next-generation sequencing era. Human Heredity, 72(4), 228–236.
Berg, A. T., & Cross, J. H. (2010). Towards a modern classification of the epilepsies? Lancet Neurology, 9(5), 459–461.
Berkovic, S. F., Howell, R. A., Hay, D. A., & Hopper, J. L. (1998). Epilepsies in twins: Genetics of the major epilepsy syndromes. Annals of Neurology, 43(4), 435–445.
Berkovic, S. F., Mulley, J. C., Scheffer, I. E., & Petrou, S. (2006). Human epilepsies: Interaction of genetic and acquired factors. Trends in Neurosciences, 29(7), 391–397.
Botstein, D., White, R. L., Skolnick, M., & Davis, R. W. (1980). Construction of a genetic linkage map in man using restriction fragment length polymorphisms. American Journal of Human Genetics, 32(3), 314–331.
Campbell, H., & Rudan, I. (2002). Interpretation of genetic association studies in complex disease. The Pharmacogenomics Journal, 2(6), 349–360.
Carroll, L. S., & Owen, M. J. (2009). Genetic overlap between autism, schizophrenia and bipolar disorder. Genome Medicine, 1(10), 102.
Cavalleri, G. L., Lynch, J. M., Depondt, C., Burley, M. W., Wood, N. W., Sisodiya, S. M., et al. (2005). Failure to replicate previously reported genetic associations with sporadic temporal lobe epilepsy: Where to from here? Brain, 128(Pt 8), 1832–1840.
Cavalleri, G. L., McCormack, M., Alhusaini, S., Chaila, E., & Delanty, N. (2011). Pharmacogenomics and epilepsy: The road ahead. Pharmacogenomics, 12(10), 1429–1447.
Chen, P., Lin, J. J., Lu, C. S., Ong, C. T., Hsieh, P. F., Yang, C. C., et al.; Taiwan SJS Consortium. (2011). Carbamazepine-induced toxic effects and HLA-B*1502 screening in Taiwan. The New England Journal of Medicine, 364(12), 1126–1133.


Chung, W. H., Hung, S. I., Hong, H. S., Hsih, M. S., Yang, L. C., Ho, H. C., et al. (2004). Medical genetics: A marker for Stevens-Johnson syndrome. Nature, 428(6982), 486.
Cohen, J. C., Kiss, R. S., Pertsemlidis, A., Marcel, Y. L., McPherson, R., & Hobbs, H. H. (2004). Multiple rare alleles contribute to low plasma levels of HDL cholesterol. Science, 305(5685), 869–872.
Desai, A. N., & Jere, A. (2012). Next-generation sequencing: Ready for the clinics? Clinical Genetics, 81(6), 503–510.
Devlin, B., Bacanu, S. A., & Roeder, K. (2004). Genomic control to the extreme. Nature Genetics, 36(11), 1129–1130; author reply 1131.
Dib, C., Faure, S., Fizames, C., Samson, D., Drouot, N., Vignal, A., et al. (1996). A comprehensive genetic map of the human genome based on 5,264 microsatellites. Nature, 380(6570), 152–154.
Donis-Keller, H., Green, P., Helms, C., Cartinhour, S., Weiffenbach, B., Stephens, K., et al. (1987). A genetic linkage map of the human genome. Cell, 51(2), 319–337.
Duncan, J. S., Sander, J. W., Sisodiya, S. M., & Walker, M. C. (2006). Adult epilepsy. The Lancet, 367(9516), 1087–1100.
Epi4K: Gene discovery in 4,000 genomes. (2012). Epilepsia, 53(8), 1457–1467.
Flavell, R. A., Kooter, J. M., De Boer, E., Little, P. F., & Williamson, R. (1978). Analysis of the beta-delta-globin gene loci in normal and Hb Lepore DNA: Direct determination of gene linkage and intergene distance. Cell, 15(1), 25–41.
Green, E. D., & Guyer, M. S. (2011). Charting a course for genomic medicine from base pairs to bedside. Nature, 470(7333), 204–213.
Guo, Y., Baum, L. W., Sham, P. C., Wong, V., Ng, P. W., Lui, C. H., et al. (2012). Two-stage genome-wide association study identifies variants in CAMSAP1L1 as susceptibility loci for epilepsy in Chinese. Human Molecular Genetics, 21(5), 1184–1189.
Heinzen, E. L., Depondt, C., Cavalleri, G. L., Ruzzo, E. K., Walley, N. M., Need, A. C., et al. (2012). Exome sequencing followed by large-scale genotyping fails to identify single rare variants of large effect in idiopathic generalized epilepsy. American Journal of Human Genetics, 91(2), 293–302.
Heinzen, E. L., Radtke, R. A., Urban, T. J., Cavalleri, G. L., Depondt, C., Need, A. C., et al. (2010). Rare deletions at 16p13.11 predispose to a diverse spectrum of sporadic epilepsy syndromes. American Journal of Human Genetics, 86(5), 707–718.
Helbig, I., Mefford, H. C., Sharp, A. J., Guipponi, M., Fichera, M., Franke, A., et al. (2009). 15q13.3 microdeletions increase risk of idiopathic generalized epilepsy. Nature Genetics, 41(2), 160–162.
Helbig, I., Scheffer, I. E., Mulley, J. C., & Berkovic, S. F. (2008). Navigating the channels and beyond: Unravelling the genetics of the epilepsies. Lancet Neurology, 7(3), 231–245.
Hindocha, N., Nashef, L., Elmslie, F., Birch, R., Zuberi, S., Al-Chalabi, A., et al. (2008). Two cases of sudden unexpected death in epilepsy in a GEFS+ family with an SCN1A mutation. Epilepsia, 49(2), 360–365.
Hudson, K. L. (2011). Genomics, health care, and society. The New England Journal of Medicine, 365(11), 1033–1041.
Hung, C. C., Lin, C. J., Chen, C. C., Chang, C. J., & Liou, H. H. (2004). Dosage recommendation of phenytoin for patients with epilepsy with different CYP2C9/CYP2C19 polymorphisms. Therapeutic Drug Monitoring, 26(5), 534–540.
International HapMap Consortium. (2005). A haplotype map of the human genome. Nature, 437(7063), 1299–1320.
Kasperaviciute, D., Catarino, C. B., Heinzen, E. L., Depondt, C., Cavalleri, G. L., Caboclo, L. O., et al. (2010). Common genetic variation and susceptibility to partial epilepsies: A genome-wide association study. Brain, 133(Pt 7), 2136–2147.
Kjeldsen, M. J., Corey, L. A., Christensen, K., & Friis, M. L. (2003). Epileptic seizures and syndromes in twins: The importance of genetic factors. Epilepsy Research, 55(1–2), 137–146.


Klein, R. J., Zeiss, C., Chew, E. Y., Tsai, J. Y., Sackler, R. S., Haynes, C., et al. (2005). Complement factor H polymorphism in age-related macular degeneration. Science, 308(5720), 385–389.
Kong, X., Murphy, K., Raj, T., He, C., White, P. S., & Matise, T. C. (2004). A combined linkage-physical map of the human genome. American Journal of Human Genetics, 75(6), 1143–1148.
Kwan, P., Schachter, S. C., & Brodie, M. J. (2011). Drug-resistant epilepsy. The New England Journal of Medicine, 365(10), 919–926.
Lango Allen, H., Estrada, K., Lettre, G., Berndt, S. I., Weedon, M. N., Rivadeneira, F., et al. (2010). Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature, 467(7317), 832–838.
Le Gal, F., Korff, C. M., Monso-Hinard, C., Mund, M. T., Morris, M., Malafosse, A., et al. (2010). A case of SUDEP in a patient with Dravet syndrome with SCN1A mutation. Epilepsia, 51(9), 1915–1918.
Lemke, J. R., Riesch, E., Scheurenbrand, T., Schubach, M., Wilhelm, C., Steiner, I., et al. (2012). Targeted next generation sequencing as a diagnostic tool in epileptic disorders. Epilepsia, 53(8), 1387–1398.
Li, J. Z., Absher, D. M., Tang, H., Southwick, A. M., Casto, A. M., Ramachandran, S., et al. (2008). Worldwide human relationships inferred from genome-wide patterns of variation. Science, 319(5866), 1100–1104.
Li, D., Lewinger, J. P., Gauderman, W. J., Murcray, C. E., & Conti, D. (2011). Using extreme phenotype sampling to identify the rare causal variants of quantitative traits in association studies. Genetic Epidemiology, 35(8), 790–799.
Lonjou, C., Borot, N., Sekula, P., Ledger, N., Thomas, L., Halevy, S., et al. (2008). A European study of HLA-B in Stevens-Johnson syndrome and toxic epidermal necrolysis related to five high-risk drugs. Pharmacogenetics and Genomics, 18(2), 99–107.
Lupski, J. R., Reid, J. G., Gonzaga-Jauregui, C., Rio Deiros, D., Chen, D. C., Nazareth, L., et al. (2010). Whole-genome sequencing in a patient with Charcot-Marie-Tooth neuropathy. The New England Journal of Medicine, 362(13), 1181–1191.
Manolio, T. A., Collins, F. S., Cox, N. J., Goldstein, D. B., Hindorff, L. A., Hunter, D. J., et al. (2009). Finding the missing heritability of complex diseases. Nature, 461(7265), 747–753.
Matise, T. C., Sachidanandam, R., Clark, A. G., Kruglyak, L., Wijsman, E., Kakol, J., et al. (2003). A 3.9-centimorgan-resolution human single-nucleotide polymorphism linkage map and screening set. American Journal of Human Genetics, 73(2), 271–284.
McCormack, M., Alfirevic, A., Bourgeois, S., Farrell, J. J., Kasperaviciute, D., Carrington, M., et al. (2011). HLA-A*3101 and carbamazepine-induced hypersensitivity reactions in Europeans. The New England Journal of Medicine, 364(12), 1134–1143.
Mefford, H. C., Muhle, H., Ostertag, P., von Spiczak, S., Buysse, K., Baker, C., et al. (2010). Genome-wide copy number variation in epilepsy: Novel susceptibility loci in idiopathic generalized and focal epilepsies. PLoS Genetics, 6(5), e1000962.
Mulley, J. C., & Mefford, H. C. (2011). Epilepsy and the new cytogenetics. Epilepsia, 52(3), 423–432.
Neale, B. M., Kou, Y., Liu, L., Ma’ayan, A., Samocha, K. E., Sabo, A., et al. (2012). Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature, 485(7397), 242–245.
Nelson, M. R., Wegmann, D., Ehm, M. G., Kessner, D., St Jean, P., Verzilli, C., et al. (2012). An abundance of rare functional variants in 202 drug target genes sequenced in 14,002 people. Science, 337, 100–104.
Novembre, J., Johnson, T., Bryc, K., Kutalik, Z., Boyko, A. R., Auton, A., et al. (2008). Genes mirror geography within Europe. Nature, 456(7218), 98–101.
Ottman, R., Lee, J. H., Hauser, W. A., & Risch, N. (1998). Are generalized and localization-related epilepsies genetically distinct? Archives of Neurology, 55(3), 339–344.


Ozeki, T., Mushiroda, T., Yowang, A., Takahashi, A., Kubo, M., Shirakata, Y., et al. (2011). Genome-wide association study identifies HLA-A*3101 allele as a genetic risk factor for carbamazepine-induced cutaneous adverse drug reactions in Japanese population. Human Molecular Genetics, 20(5), 1034–1041.
Pagnamenta, A. T., Wing, K., Akha, E. S., Knight, S. J., Bolte, S., Schmotzer, G., et al. (2009). A 15q13.3 microdeletion segregating with autism. European Journal of Human Genetics, 17(5), 687–692.
Pal, D. K., Pong, A. W., & Chung, W. K. (2010). Genetic evaluation and counseling for epilepsy. Nature Reviews Neurology, 6(8), 445–453.
Pelak, K., Shianna, K. V., Ge, D., Maia, J. M., Zhu, M., Smith, J. P., et al. (2010). The characterization of twenty sequenced human genomes. PLoS Genetics, 6(9).
Price, A. L., Patterson, N. J., Plenge, R. M., Weinblatt, M. E., Shadick, N. A., & Reich, D. (2006). Principal components analysis corrects for stratification in genome-wide association studies. Nature Genetics, 38(8), 904–909.
Purcell, S. M., Wray, N. R., Stone, J. L., Visscher, P. M., O’Donovan, M. C., Sullivan, P. F., et al. (2009). Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature, 460(7256), 748–752.
Reich, D. E., Cargill, M., Bolk, S., Ireland, J., Sabeti, P. C., Richter, D. J., et al. (2001). Linkage disequilibrium in the human genome. Nature, 411(6834), 199–204.
Risch, N., & Merikangas, K. (1996). The future of genetic studies of complex human diseases. Science, 273(5281), 1516–1517.
Sanders, S. J., Murtha, M. T., Gupta, A. R., Murdoch, J. D., Raubeson, M. J., Willsey, A. J., et al. (2012). De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature, 485(7397), 237–241.
Sharp, A. J., Mefford, H. C., Li, K., Baker, C., Skinner, C., Stevenson, R. E., et al. (2008). A recurrent 15q13.3 microdeletion syndrome associated with mental retardation and seizures. Nature Genetics, 40(3), 322–328.
Sisodiya, S. M., & Mefford, H. C. (2011). Genetic contribution to common epilepsies. Current Opinion in Neurology, 24(2), 140–145.
Sobreira, N. L., Cirulli, E. T., Avramopoulos, D., Wohler, E., Oswald, G. L., Stevens, E. L., et al. (2010). Whole-genome sequencing of a single proband together with linkage analysis identifies a Mendelian disease gene. PLoS Genetics, 6(6), e1000991.
Speliotes, E. K., Willer, C. J., Berndt, S. I., Monda, K. L., Thorleifsson, G., Jackson, A. U., et al. (2010). Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nature Genetics, 42(11), 937–948.
Stefansson, H., Rujescu, D., Cichon, S., Ingason, A., Steinberg, S., Fossdal, R., et al. (2008). Large recurrent microdeletions associated with schizophrenia. Nature, 455(7210), 232–236.
Steinlein, O. K., Mulley, J. C., Propping, P., Wallace, R. H., Phillips, H. A., Sutherland, G. R., et al. (1995). A missense mutation in the neuronal nicotinic acetylcholine receptor alpha 4 subunit is associated with autosomal dominant nocturnal frontal lobe epilepsy. Nature Genetics, 11(2), 201–203.
Tan, N. C., Mulley, J. C., & Berkovic, S. F. (2004). Genetic association studies in epilepsy: “The truth is out there”. Epilepsia, 45(11), 1429–1442.
Tennessen, J. A., Bigham, A. W., O’Connor, T. D., Fu, W., Kenny, E. E., Gravel, S., et al. (2012). Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science, 337, 64–69.
Tu, E., Bagnall, R. D., Duflou, J., & Semsarian, C. (2011). Post-mortem review and genetic analysis of sudden unexpected death in epilepsy (SUDEP) cases. Brain Pathology, 21(2), 201–208.
Tu, E., Waterhouse, L., Duflou, J., Bagnall, R. D., & Semsarian, C. (2011). Genetic analysis of hyperpolarization-activated cyclic nucleotide-gated cation channels in sudden unexpected death in epilepsy cases. Brain Pathology, 21(6), 692–698.


Turnbull, J., Lohi, H., Kearney, J. A., Rouleau, G. A., Delgado-Escueta, A. V., Meisler, M. H., et al. (2005). Sacred disease secrets revealed: The genetics of human epilepsy. Human Molecular Genetics, 14(17), 2491–2500.
Vissers, L. E., de Ligt, J., Gilissen, C., Janssen, I., Steehouwer, M., de Vries, P., et al. (2010). A de novo paradigm for mental retardation. Nature Genetics, 42(12), 1109–1112.
Weber, Y. G., & Lerche, H. (2008). Genetic mechanisms in idiopathic epilepsies. Developmental Medicine and Child Neurology, 50(9), 648–654.
Winawer, M. R., Rabinowitz, D., Pedley, T. A., Hauser, W. A., & Ottman, R. (2003). Genetic influences on myoclonic and absence seizures. Neurology, 61(11), 1576–1581.