Perspective
Using Human Genetics to Drive Drug Discovery: A Perspective
Caroline S. Fox

The probability of success of developing medicines to treat human disease can be improved by leveraging human genetics. Different types of genetic data and techniques, including genome-wide association, whole-exome sequencing, and whole-genome sequencing, can be used to gain insight into human disease. Layering different types of genetic evidence from Mendelian disease, coding variants, and common variation can bolster support for a genetic target. Human knockouts offer the potential to perform reverse genetic screens in humans to identify physiologically relevant targets. Other components of a good genetic target include protective loss-of-function mutations, some degree of known biology, tractability, and a clean on-target safety profile. In addition to using human genetics to inspire new drug programs, phenome-wide association studies can be used to identify alternative indications or repurposing opportunities. This information can be combined into a 5-step approach for selecting a genetic target for validation, which is presented in detail in this review. Finally, current challenges in leveraging human genetics are highlighted, including the difficulties translating certain types of genetic data, the relatively small number of bona fide disease-associated coding rare variants, and current sample sizes of large well-curated biobanks linked to comprehensive genetic information.
Introduction Developing medicines to treat human disease is the ultimate translational activity of basic science. Issues with research and development productivity in the pharmaceutical industry have been well articulated,1 with an 80-fold decline since the 1950s in the number of new drug approvals relative to investment. Concomitantly, the cost of drug development has increased dramatically and it is now estimated to cost $2.6 billion to bring a single drug to market.2 This cost is high because it includes the cost of failed drugs; <10% of drugs initially tested in phase 1 studies ultimately gain approval. The reasons for the high failure rate have been linked to issues with efficacy, safety, and toxicology.3 The Role of Human Genetics in Driving Drug Development Because of the high cost of failures, insights that increase the probability of success of bringing a drug program to approval are critical to boost the productivity and efficiency of drug development. It has been proposed that using germline human genetics to drive drug discovery can overcome some of these issues.4-6 Drug targets with underlying human genetic support linked to their indication are 2-fold more likely to gain approval in the United States and European Union as compared with drug targets without human genetic support.5 This observation is bolstered by the recent success of drugs targeting PCSK9. Mutations in PCSK9 were initially identified in French families with hyperlipidemia.7 Soon after, investigators identified coding mutations in this gene in association with low low-density lipoprotein (LDL) cholesterol levels.8 It was later discovered that pcsk9, or proprotein AJKD Vol XX | Iss XX | Month 2019
Complete author and article information provided before references. Am J Kidney Dis. XX(XX): 1-9. Published online Month X, XXXX. doi: 10.1053/j.ajkd.2018.12.045
© 2019 by the National Kidney Foundation, Inc.
convertase subtilisin kexin 9, degrades the LDL receptor and provides synergy in cholesterol lowering above and beyond statin treatment. Clinical trials have now shown that pcsk9 inhibition can lower LDL cholesterol levels by 60% and reduce the risk for coronary disease events by 20%.9 Thus, the interest in using human genetics to drive drug discovery has been spurred by the recent success of PCSK9 inhibitors.

There are several features of PCSK9 as a human genetic target that made it particularly compelling. First, the initial mutations were coding mutations identified through exome sequencing,8 which have the advantage over other genomic technologies of identifying the actual genetic target. Second, the mutations initially identified were protective loss-of-function (LoF) mutations, providing firm evidence that inhibition of the target was the therapeutic hypothesis. Third, much of the pathway of cholesterol metabolism was already known at the time of discovery of PCSK9, making the mechanistic elucidation clearer than a situation in which the biology is unknown. Fourth, the target was tractable, and finally, the mechanism was generalizable beyond carriers of the relatively rare mutation.10 Thus, many characteristics exemplified by the success of PCSK9 are relevant to the success of leveraging human genetics in general. This article highlights each of these components to describe how human germline genetics can be used to drive drug discovery.

Types of Genetic Data

There are several different types of germline genetic data that can be used for drug discovery. Box 1 defines analytic methods and terminology used throughout the text. The leading 3 methodologies currently in use are genome-wide association data, whole-exome sequencing, and whole-genome sequencing. Table 1 shows the benefits and key considerations for each approach, highlighting the importance of a balanced and diverse portfolio when building a toolbox for using human genetics to drive the drug discovery process. In general, genome-wide association has proved successful in the identification of genetic loci for complex disorders. Whole-exome sequencing has demonstrated success in diseases characterized by coding mutations, such as Mendelian disease, and populations enriched for LoF variants, including founder and consanguineous populations. Whole-genome sequencing is still relatively new and its utility is still being evaluated.

Genomic data from somatic sources, including tissues and tumors, are also important tools to drive drug discovery. However, given the difference in the way the data are integrated into mechanism elucidation and drug discovery, they are beyond the scope of the present article and are not discussed in detail. In addition, the majority of this review focuses on the challenges and opportunities in developing drugs for common complex disorders.

Box 1. Analytical Methods and Terminology Discussed in the Text
Allelic series: Combination of mutations across the entire spectrum of disease or phenotypic perturbation
Common variant: Genetic variant that occurs with a frequency of >1%
Consanguineous population: Populations with a high degree of breeding among members of the same family
Druggable target: Protein, peptide, or nucleic acid with known biological activity that can be biologically modulated with a small molecule, biologic, or other pharmacologic entity
Founder population: Population characterized by bottle-necking (new population established by a small number of people), which results in limited genetic variation within the population
Genome-wide association study (GWAS): Forward genetic screen whereby a single phenotype is linked to a comprehensive set of genotypes to uncover genetic mutations associated with the phenotype of interest
Germline genetic mutation: Mutations that are present in the DNA of every cell of the body and can be passed to offspring
High-throughput screen (HTS): Scientific method used to identify classes or families of chemical compounds that modulate a biologically meaningful trait from a high-throughput assay
Human knockout (KO): “Experiment of nature” defined by the complete absence of a gene that can provide insights into the physiology and genetics of the associated encoded protein
Loss-of-function (LoF) mutation: Coding mutation that results in partial or complete disruption of the encoded protein
Off-target safety issue: Adverse event or safety finding that occurs through modulation by the compound and not through the mechanism of the target
On-target safety issue: Adverse event or safety finding that occurs through the mechanism of action of the target
Phenome-wide association study (PheWAS): Reverse genetic screen that begins with a genetic variant of interest and works backward to identify the associated phenotype(s)
Pleiotropy: Situation in which a gene is associated with >1 trait
Rare variant: Genetic variant that occurs with a frequency of <1%
Somatic mutation: Genomic mutation that occurs in any cell of the body except reproductive cells and is not passed on to offspring
Therapeutic hypothesis: Directionality of modulation of a drug target, either to agonize or inhibit
Whole-exome sequencing (WES): Comprehensive sequencing assessment of exomes, the protein-coding portion of the genome (accounts for approximately 1% of the entire genome)
Whole-genome sequencing (WGS): Comprehensive sequencing of the entire genome, including introns and exons
Table 1. Types of Genetic Data With Benefits and Considerations

Genome-wide association study (GWAS)
Benefits:
• Comprehensive assessment of common variants
• Large data sets currently exist with thousands of robust findings
• Covers multiple diseases and phenotypes of interest
Considerations:
• Most variants identified are noncoding
• Causal gene unclear without follow-up work
• Difficult to translate into causal biology
Examples:
• UMOD45
• PPARG for diabetes
• CASR for serum calcium
• EDN1 for vascular disease
• APOL1, PLA2R1, CFH, HLA-DQ for IgAN57,58
• HLA-DQB1, MTCO3P1 for childhood SSNS59
• PLA2R1 and HLA-DQA1 for MN60

Whole-exome sequencing (WES)
Benefits:
• Can identify loss-of-function (LoF)/gain-of-function (GoF) mutations
• Can identify human knockouts (KOs) in consanguineous populations
• Translational pathway clearer than other genomic data types
Considerations:
• Utility generally unproved in the general outbred population
• May require extremely large sample sizes (>150,000) with substantial no. of cases for adequate power, which currently do not exist
Examples:
• PCSK9 for hyperlipidemia
• GREB1L for CAKUT61
• ABCC8 and pulmonary HTN62
• HSD17B13 and chronic liver disease63
• ANGPTL3 and CVD64
• APOC3 and triglycerides20

Whole-genome sequencing (WGS)
Benefits:
• Provides comprehensive assessment of coding and noncoding genome
Considerations:
• Expensive and not presently scalable for adequate power for meaningful discovery
• Value of rare noncoding variants currently unproved
Examples:
• 100,000 Genomes Project by Genomics England (www.genomicsengland.co.uk)
• TOPMED (www.nhlbiwgs.org/)

Abbreviations: CAKUT, congenital anomalies of the kidney and urinary tract; CVD, cardiovascular disease; HTN, hypertension; IgAN, immunoglobulin A nephropathy; MN, membranous nephropathy; SSNS, steroid-sensitive nephrotic syndrome.

Characteristics of a Good Genetic Target

There are several relevant considerations in the determination of what makes a good genetic target from a germline genetics perspective. Different types of underlying complementary genetic evidence can help build support. There are several different types of genetic evidence to consider, and they essentially fall into 3 axes (Table 2): Mendelian disease, rare coding variants in nonsyndromic diseases, and common variation, usually obtained through genome-wide association studies (GWAS).

Genes underlying Mendelian disease, as catalogued comprehensively in Online Mendelian Inheritance in Man (OMIM),11 are a great starting point for capturing genes with strong effect sizes because they tend to have extreme phenotypes with high penetrance across a wide range of organ systems. There are nearly 3,000 genes in OMIM that have been comprehensively linked to rare disease syndromes,12 and the majority of these involve coding mutations directly associated with protein coding changes and disease. Mendelian disease can additionally point toward a mechanism of action for a gene or pathway. It has been estimated that 20% to 25% of Mendelian genes are located near loci for common variants,11 highlighting how genes that play a role in Mendelian disease also underlie common disease.

There are several examples of therapeutics developed that were inspired by genes underlying Mendelian diseases. For example, rare mutations in SRD5A2 are associated with congenital deficiency of 5α-reductase. This enzyme is responsible for converting testosterone to dihydrotestosterone. Deficiency of this enzyme results in pseudofeminization syndrome and male pseudohermaphroditism.13 Because the clinical phenotype includes a small prostate gland and full scalp hair, it was hypothesized that inhibition of 5α-reductase could be used as a treatment for prostatic hypertrophy and male-pattern baldness. This therapeutic hypothesis led to the development of finasteride, which has been comprehensively reviewed elsewhere.14 Other prospective examples include ivacaftor to potentiate a mutant form of CFTR in cystic fibrosis,15 orexin antagonists such as suvorexant for sleep based on mutations in genes that cause narcolepsy,16 and compounds in development for pain that leverage the observation that a mutation in SCN9A occurs in individuals with insensitivity to pain.

Coding mutations in association with nonsyndromic diseases and phenotypes (which are generally rare, defined as present in <1% of the population) are the second large class of genetic information that can be used to bolster support for a genetic target. These mutations have many similar features to Mendelian mutations, except that they tend to be more common, have much lower penetrance, and have less severe phenotypic consequences. Coding mutations have the advantage of directly implicating a target, whereas other methods (such as genome-wide association) implicate a region of the genome but not the direct target. In addition, coding mutations often lead to the disruption of the quality or quantity of a protein, lending clarity to the therapeutic hypothesis. The most ideal situation is a protective LoF mutation because this most directly suggests that inhibition of the protein product of the genetic target may result in modification of the disease of interest.

Human knockouts (KOs) offer potential for identifying drug targets and on-target safety issues. In model organisms, gene disruption paired with the study of phenotypic consequences provides important insights into gene function.17 Extending this into humans can provide similar insights. Humans carry LoF-causing genetic variants, and it is estimated that there are approximately 100 known LoF variants per individual, with up to 20 genes completely disrupted per individual, although the majority of these are tolerated and in nonessential genes.18 Homozygous LoF carriers, or human KOs, are enriched in consanguineous populations, characterized by a high degree of parental relatedness.19,20 There are advantages to identifying and studying LoF and human KOs, including direct identification of the target and directionality, insights into gene function and physiology, and identification of the maximal effect of perturbation and associated efficacy and safety consequences. For this to be effective for comprehensive drug discovery, large databases of human KOs need to be linked to a wide range of clinical data, with the ability to recall participants based on genotype for targeted deep phenotyping experiments. The notion of creating databases of human KOs has been dubbed the human KO project21 and is gaining traction, although scalable databases do not presently exist.
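As a rough illustration of the first step in building such a database, the sketch below flags individuals who are homozygous for a predicted LoF variant. It is a minimal sketch under stated assumptions, not a description of any existing pipeline: the function name `find_human_knockouts`, the input layout, and the toy records are all hypothetical, and compound heterozygotes are deliberately ignored because resolving them would require phased data.

```python
from collections import defaultdict


def find_human_knockouts(lof_calls):
    """Map each gene to the people homozygous for a predicted LoF variant.

    lof_calls: iterable of (person_id, gene, alt_allele_count) tuples, where
    alt_allele_count is 0, 1, or 2 copies of the LoF allele.
    Simplification (assumption): compound heterozygotes, who carry two
    different LoF variants in the same gene, are not detected here.
    """
    knockouts = defaultdict(set)
    for person_id, gene, alt_allele_count in lof_calls:
        if alt_allele_count == 2:  # both copies carry the LoF allele
            knockouts[gene].add(person_id)
    return {gene: sorted(people) for gene, people in knockouts.items()}


# Invented toy calls; IDs and gene labels are placeholders, not real data.
toy_calls = [
    ("P1", "GENE_A", 1),
    ("P2", "GENE_A", 2),
    ("P3", "GENE_B", 0),
    ("P4", "GENE_B", 2),
    ("P5", "GENE_B", 2),
]
human_kos = find_human_knockouts(toy_calls)
print(human_kos)
```

In practice the resulting list would then be joined to clinical records so that carriers can be recalled by genotype for deep phenotyping, which is the linkage the text identifies as still missing at scale.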
Table 2. Criteria for Selecting a Good Genetic Target

Present in OMIM (Online Mendelian Inheritance in Man)
Benefit:
• Causal mutation for rare disease
• Large effect size with high penetrance
• Coding mutations with associated protein coding changes
• Often point to mechanism of action for genetic target or pathway
Considerations:
• Uncertain generalizability to non–mutation carriers
• Nearly half of all genes in OMIM not conclusively linked to disease

Common variation
Benefit:
• Well-powered, thousands of findings exist across a range of phenotypes
• Common in the general population
• Assumed generalizability to sporadic cases
Considerations:
• Majority of findings derived from GWAS (see Table 1 for additional limitations)
• Small effect sizes, which can be difficult to model in vitro, particularly when using genetic perturbation models

Rare coding variation
Benefit:
• Along with common variation, can be used to create an allelic series
• Often disrupt the quality or quantity of a protein
• Directly implicate the target
• Tends to have larger effect sizes than common variants
Considerations:
• Generally derived from WES data (see Table 1 for additional limitations)
• Due to evolutionary pressures, often present in low frequencies in outbred populations
• Questionable generalizability to non–mutation carriers, particularly if extremely rare

Some degree of known biology
Benefit:
• Working out mechanism of action can be clearer
Considerations:
• Limits target consideration to those with known biology

Tractability
Benefit:
• Targetable by available pharmaceutical modalities; typically refers to small molecules but has been expanded to more contemporary approaches including biologics
Considerations:
• Majority of genetic targets not targetable by small molecules
• Many new modalities being developed; early applications mostly aimed at rare diseases

Clean on-target safety profile
Benefit:
• Human genetics can be used to predict on-target safety issues
Considerations:
• Methods not currently validated; positive and negative predictive values remain uncertain
Abbreviations: GWAS, genome-wide association study; WES, whole-exome sequencing.
One limitation of relying on genetic targets from Mendelian disease or rare coding variants is the question of whether the underlying pathway will be relevant outside of mutation carriers, who tend to be the sole carriers of these rare alleles. One way to bolster confidence in the generalizability of a pathway beyond mutation carriers is to overlay data from common variants, generally derived from GWAS, on top of genes from Mendelian disease and rare variant studies. The combination of mutations across the entire spectrum of disease or phenotypic perturbation, known as an allelic series, can increase confidence that perturbation of the mechanism implicated by the set of mutations in a given target is relevant across a population and not simply limited to those with a known mutation and associated syndrome. A good example of a target that represents several different types of genetic support is IFIH1. IFIH1 has been associated with systemic lupus erythematosus in GWAS. In addition, very rare mutations in this gene are linked to Aicardi-Goutieres syndrome 7 (OMIM #606951), characterized by induction of interferon signaling in response to double-stranded RNA stimuli and an overall heightened inflammatory state consistent with lupus.22 Taken together, data supporting a genetic target from disparate and robust sources can build confidence around the association of a target with a disease indication.

In addition to the 3 key axes of underlying human genetic support, there are several other attributes of a good genetic target (Table 2). Having some degree of known biology in the biological pathway implicated by the target can increase interest in a target because working out a de novo mechanism of action can be time consuming. In addition to mechanism, tractability is key; the target should be targetable by available pharmaceutical modalities. This generally means that a target should be accessible through small molecules or by an antibody, although there are many promising tools on the horizon, including RNA interference knockdown, antisense oligonucleotide knockdown, genome editing, and viral and nonviral gene therapy. Finally, the target should have a clean on-target safety profile. The section below on phenome-wide association studies (PheWAS) discusses how human genetics can be used to predict on-target safety issues. It is also important to note that the characteristics of a good genetic target are mostly hypothetical because the field is too nascent to use modeling to predict key attributes in a quantitative fashion.

GWAS: Opportunities and Challenges

GWAS have emerged as one of the most powerful forms of locus identification, with thousands of findings, each linking a single trait or disease to a region of the genome. A GWAS is essentially a forward genetic screen, in which the phenotype is known and the question is to uncover the genetic mutations underlying the phenotype. Genome-wide association is predicated on 3 concepts. The first is the notion of linkage disequilibrium, which highlights that the genome is built of blocks of correlated mutations. This correlation structure allows the majority of the common variation in the genome to be assayed with a parsimonious set of single-nucleotide polymorphisms (SNPs). The second key feature to enable genome-wide association is the advent of high-throughput assay chips that allow for the simultaneous measurement of approximately 1 million SNPs that cover the majority of the genome. The third key feature is scalability; genome-wide association generally requires large sample sizes to be successful because of the multiple testing penalty paid from testing so many SNPs at once. The success of GWAS has been in the ability to successfully leverage these key concepts. The first study of more than 1 million participants was recently published,23 highlighting the ability to scale GWAS to the level that is necessary to identify hundreds of genetic loci that often underlie a single complex trait. The success of genome-wide association was recently reviewed at its 10-year anniversary.24 Many of the results from GWAS are stored in the GWAS catalogue, co-sponsored by the European Molecular Biology Laboratory European Bioinformatics Institute and National Human Genome Research Institute.25

The benefits and considerations of genome-wide association are highlighted in Table 1. Specifically, genome-wide association is a robust approach that has yielded thousands of well-replicated findings across many different clinically relevant diseases and phenotypes. However, there are several challenges, the majority of which have to do with our limited ability to translate the associations from GWAS into bona fide mechanistic insights. First, genome-wide association links a phenotype to a region of the genome, not a direct gene target or mutation. There may be dozens of genes in any given region or the region may be a gene desert. This requires follow-up methods including fine-mapping, co-localization, or other functional genomics techniques to home in on the causal variant.
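Two of the concepts above lend themselves to a small numerical illustration. The sketch below is an assumption-laden toy, not part of the article's methods: it computes the standard r² measure of linkage disequilibrium between two biallelic SNPs from phased haplotypes, and it uses a Bonferroni correction (a technique the text does not name) to show how testing roughly 1 million SNPs leads to the conventional genome-wide significance level of about 5 × 10⁻⁸. The function names and haplotype data are hypothetical.

```python
from collections import Counter


def ld_r_squared(haplotypes):
    """r^2 between two biallelic SNPs from phased haplotypes.

    haplotypes: list of (a, b) tuples, each allele coded 0 or 1,
    giving the alleles at SNP A and SNP B on one chromosome.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at SNP B
    p_ab = counts[(1, 1)] / n                 # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction (assumed here)."""
    return alpha / n_tests


# Invented haplotypes: the first pair of SNPs always travel together,
# the second pair are statistically independent.
perfect_ld = [(0, 0)] * 6 + [(1, 1)] * 4
no_ld = [(0, 0), (0, 1), (1, 0), (1, 1)] * 5

print(ld_r_squared(perfect_ld))   # close to 1: one SNP tags the other
print(ld_r_squared(no_ld))        # 0: neither SNP predicts the other
print(bonferroni_threshold(0.05, 1_000_000))
```

When r² is near 1, genotyping one SNP captures nearly all of the information at the other, which is why a parsimonious SNP set can cover most common variation; the stringent per-test threshold is what drives the need for very large sample sizes.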
Techniques including computational methods such as co-localization26,27 and laboratory-based techniques including massively parallel reporter assays28,29 have demonstrated limited success, with the most likely causal variant or gene in a region confidently identified <20% of the time. In particular, the tissue-specific relevance of GWAS findings renders this process particularly complex. Second, the majority (>90%) of GWAS findings are in noncoding regions,30 making the genetic mechanism of action more difficult to elucidate with standard tools. Third, effect sizes tend to be small, making modeling in vitro more challenging.31

There are several retrospective examples in which GWAS findings co-localize with known relevant biology and drug targets, serving as a positive control for the method. These examples include PPARG for diabetes32 and CASR for serum calcium33 and phosphorus.34 However, these are retrospective examples, in which the GWAS finding was able to identify known drug targets. Given the mentioned issues with translating GWAS, it is important to focus on prospective examples (ie, when the human genetic finding came before the biological and mechanistic elucidation). One such example is the EDN1 gene in association with 5 different vascular diseases,35 which nicely highlights the complexity of mechanism elucidation from a noncoding GWAS finding. Thus, the major limitation in using GWAS to drive drug discovery is our current ability to translate the findings to mechanism of action. This is exemplified by the paucity of published reports that have delineated mechanisms from GWAS and the lack of established drug programs in the public domain derived solely from GWAS.

Using PheWAS to Predict Safety Issues and Identify Repurposing Opportunities

PheWAS are reverse genetic screens, in which one starts with a genetic marker of interest and looks to find associated phenotypes. The concept of PheWAS was originally described in 201336 with the analysis of 3,144 SNPs previously implicated by GWAS and 1,358 traits identified using electronic medical records in approximately 13,000 individuals. This concept has been scaled up to include full coverage of the common variant genome, larger panels of phenotypes, and larger sample sizes.37 Because PheWAS starts with a genetic marker of interest, the technique is particularly useful to identify pleiotropy, defined as a situation in which a gene is associated with more than 1 trait. Depending on the directionality of the association, PheWAS findings can be leveraged either for insight into on-target adverse events and potential safety issues or for repurposing opportunities, in which an existing compound or medicine is applied to a new indication. For example, a PheWAS of 521,000 individuals from 23andMe examined phenotypic associations for mutations in TRAF3IP2.38 Associations with TRAF3IP2 were identified for Crohn’s disease, acne, and nose bleeds. These findings suggest that manipulation of the target therapeutically for treatment of Crohn’s disease could potentially result in increased risk for either acne or nose bleeds. Another example is the association of a SNP in CETP with both high-density lipoprotein (HDL) levels and age-related macular degeneration.
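The reverse-screen logic can be sketched in a few lines. The example below is a toy under stated assumptions, not the method used in the cited studies: it takes carrier status for one variant, tests it against each phenotype with a Pearson chi-square test on a 2×2 table, applies a Bonferroni correction across phenotypes, and reports the direction of each association through an odds ratio with a 0.5 continuity adjustment. All names (`phewas`, `trait_A`, and so on) and all data are invented.

```python
import math


def chi2_1df_pvalue(a, b, c, d):
    """Pearson chi-square p-value (1 df) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        return 1.0
    chi2 = n * (a * d - b * c) ** 2 / margins
    return math.erfc(math.sqrt(chi2 / 2))  # survival function for 1 df


def phewas(carriers, everyone, phenotypes, alpha=0.05):
    """Test one variant against many phenotypes; return significant hits.

    carriers: set of person IDs carrying the variant.
    everyone: set of all person IDs in the cohort.
    phenotypes: dict mapping phenotype name -> set of affected person IDs.
    """
    threshold = alpha / len(phenotypes)  # Bonferroni across phenotypes
    noncarriers = everyone - carriers
    hits = []
    for name, affected in phenotypes.items():
        a = len(carriers & affected)      # carriers with the phenotype
        b = len(carriers - affected)      # carriers without it
        c = len(noncarriers & affected)   # noncarriers with the phenotype
        d = len(noncarriers - affected)   # noncarriers without it
        p_value = chi2_1df_pvalue(a, b, c, d)
        odds_ratio = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
        if p_value < threshold:
            hits.append((name, odds_ratio, p_value))
    return sorted(hits, key=lambda hit: hit[2])


# Invented cohort of 200 people, 40 of whom carry the variant.
everyone = set(range(200))
carriers = set(range(40))
phenotypes = {
    "trait_A": set(range(30)) | set(range(40, 50)),   # enriched in carriers
    "trait_B": {i for i in everyone if i % 4 == 0},   # unrelated to carrier status
    "trait_C": set(range(100, 160)),                  # depleted in carriers
}
hits = phewas(carriers, everyone, phenotypes)
for name, odds_ratio, p_value in hits:
    print(name, round(odds_ratio, 2), p_value)
```

In this toy the variant shows pleiotropy with opposite directions: an odds ratio above 1 for one trait and below 1 for another. Read against the therapeutic hypothesis, the direction is what separates a potential on-target safety signal from a potential repurposing opportunity.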
SNPs in CETP are associated with HDL levels10; these same SNPs are associated in a direction-consistent manner with age-related macular degeneration,39 raising concern that pharmacologic inhibition of the CETP enzyme to increase HDL levels might inadvertently increase the risk for age-related macular degeneration. Fortunately, the REVEAL trial has been completed and there was no reported imbalance of age-related macular degeneration events in the treatment versus control arm.40 PheWAS is positioned as an exciting application of reverse genetic screens in human genetics, but the technique is not currently validated in terms of its ability to predict positive and negative results. In addition, it is important to keep in mind that this technique is only useful for on-target adverse events; off-target adverse events can only be predicted by looking at actual drug compounds in the setting of safety and toxicology studies.

In addition to leveraging genetic information for on-target safety signal identification, PheWAS can be used for repurposing of drug compounds. A recent example highlights the power of this approach. Pulley et al created the Accelerating Drug Development and Repurposing Incubator to leverage electronic health records linked to DNA samples from BioVU.41 Among 35,000 genotyped participants, the authors highlight several repurposing opportunities, including TNF for cervical cancer, PLA2G7 for glomerulonephritis, and CACNB2 for hypoadrenalism. As larger biobanks with more comprehensive phenotyping become linked to genotyping information, this technique will be increasingly useful to identify repurposing opportunities.

Specific Examples of Leveraging Genetics

The power and limitations of GWAS and PheWAS are aptly highlighted by results from the CKDGen consortium, a global consortium dedicated to uncovering genetic loci for kidney function and related traits. This consortium has uncovered more than 50 loci associated with estimated glomerular filtration rate and chronic kidney disease (CKD).42-44 These findings highlight genes in the area of kidney development, transmembrane transport activity, kidney structure, and regulation of glucose metabolism. Similar to GWAS findings of other traits, the majority of these findings are in noncoding regions and are linked to regulatory regions in renal but not extrarenal tissues. One exception from the CKDGen consortium is rs4293393, a SNP found in the promoter region of the UMOD gene, which encodes Tamm-Horsfall protein.45 This SNP, one of the leading GWAS findings for CKD, has been found to be associated with both CKD and uromodulin levels in a direction-consistent manner. Additionally, higher urinary uromodulin levels are associated with the future development of CKD. Together, these findings suggest that variation in the UMOD gene may influence the risk for developing CKD through urinary uromodulin levels. However, the same allele associated with protection for CKD is also associated with increased risk for kidney stones,46 raising the concern that manipulation of this axis for the prevention of CKD may inadvertently increase the risk for kidney stones. These examples highlight how GWAS and PheWAS can be leveraged to build or refute support for a given target using human genetics.

Approach to Selecting a Genetic Target and an Example: APOL1

There are 5 steps to selecting a genetic target for validation (Fig 1). The example given next provides a stepwise approach to moving toward a small-molecule high-throughput screen. The first step is to mine human genetics for promising findings following the principles outlined above and presented in Table 2. The second step is to confirm unmet medical need and commercial viability. Step 3 is to determine the therapeutic hypothesis. Step 4 is to use human KOs for safety modeling. Step 5 is the most challenging, which is to elucidate the mechanism of action sufficiently such that an assay can be created for high-throughput chemical screening.

Figure 1. Steps to selecting a genetic target for validation for small-molecule screening; APOL1 is provided as an example. The figure shows 5 steps leading to HTS:
1. Mine human genetics for promising findings: APOL1 has strong and compelling human genetics support
2. Confirm unmet medical need and commercial viability: APOL1 demonstrates substantial unmet medical need and commercial viability
3. Determine the therapeutic hypothesis: inhibition of ApoL1 may reduce CKD risk
4. Use human knock-outs for safety modeling: Human APOL1 KO is viable
5. Elucidate mechanism of action: APOL1 demonstrates complex biology
Abbreviations: CKD, chronic kidney disease; HTS, high-throughput screen; KO, knockout.
These steps can be applied to APOL1, one of the best known targets for end-stage kidney disease. The chromosomal region of interest, 22q12, was originally discovered in association with nondiabetic kidney disease and focal segmental glomerulosclerosis in 2 separate publications.47,48 The target was originally misattributed to the gene MYH9 (which is in the same genomic region as APOL1) due to its prior association with a Mendelian form of pediatric kidney disease. In 2010, investigators conducted conditional analyses of the chromosomal region of interest (22q12) and uncovered that the primary causal gene in the region was APOL1, which had increased in frequency in the West African population due to its protective nature against trypanosomal disease (African sleeping sickness).49 Mutations in APOL1 are both relatively common in the African population and confer substantial risk for disease, leading it to account for approximately 70% of the black-white difference in end-stage kidney disease.48 Thus, mining of human genetics for end-stage kidney disease uncovered APOL1.

After human genetics has been mined, unmet medical need and commercial viability must be assessed (step 2). Approximately 12% of blacks in the United States harbor 2 copies of the mutant APOL1 risk alleles, translating to nearly 5 million individuals and almost 1 million with active CKD. In terms of market size, this number is commercially sufficient for the majority of pharmaceutical companies outside of the orphan disease indication. The unmet medical need in APOL1 mutation carriers is underscored by a post hoc analysis of AASK (African American Study of Kidney Disease and Hypertension). This analysis demonstrated that patients with 2 copies of the risk alleles for APOL1 have a progressive decline in kidney function despite receiving standard of care,50 suggesting substantial unmet medical need in this patient population.
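The market-size figures in step 2 can be sanity-checked with simple arithmetic. The sketch below uses only the rounded numbers quoted in the text; the two derived quantities (the implied base population and the implied share of carriers with active CKD) are back-calculations for illustration, not figures reported by the article, and the variable names are hypothetical.

```python
# Rounded inputs taken from the text.
carrier_fraction = 0.12          # share carrying 2 APOL1 risk alleles
carriers = 5_000_000             # "nearly 5 million individuals"
carriers_with_ckd = 1_000_000    # "almost 1 million with active CKD"

# Back-calculated quantities (assumptions, derived only from the inputs above).
implied_population = carriers / carrier_fraction
implied_ckd_share = carriers_with_ckd / carriers

print(f"Implied base population: about {implied_population / 1e6:.1f} million")
print(f"Implied share of carriers with active CKD: about {implied_ckd_share:.0%}")
```

Because the inputs are rounded ("nearly", "almost"), the outputs should be read as order-of-magnitude checks on internal consistency rather than as epidemiologic estimates.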
The third step is to determine the therapeutic hypothesis, or directionality of the pharmacologic intervention (ie, is the goal to raise or to lower the level of the target). For genetic variation, this can be more complex than simply looking at the association of a circulating protein level and an outcome because genetic associations are relative to an arbitrarily designated referent allele. For the case of APOL1, overexpression of ApoL1 is toxic in podocytes, HEK 293, and DLD-1 cells, and overexpression in human podocytes leads to lysosomal damage.51,52 Taken together, these findings suggest that increased ApoL1 levels are damaging; thus, the therapeutic hypothesis is to decrease ApoL1 levels.51,52

The fourth step is to use human KOs, when available, for safety modeling. As discussed, human KOs are experiments of nature that can provide insights into the physiology and safety of a gene by virtue of its complete deficiency.53 In addition, human KOs can provide insight into the potential safety consequences of lowering an endogenous protein to very low levels. In the case of APOL1, a farmer in India presented with recurrent trypanosomal disease; molecular evaluation demonstrated that he was missing both copies of
the APOL1 gene due to frameshift mutations in both alleles.54 The patient was otherwise healthy and, importantly, had a normal creatinine level. This information suggests that lowering APOL1 levels will not be incompatible with life. It also suggests that a potential on-target safety issue may be trypanosomal disease if pharmacologic inhibition of the target is administered in endemic regions.

The fifth step is the biological elucidation of the target, and this is the most difficult step, particularly in human genetics, because the tools to elucidate function are still in their infancy. Transgenic expression of human APOL1 mutations in mice has been shown to disrupt endosomal trafficking and autophagy, leading to inflammation of the podocytes with ultimate podocyte death and glomerular scarring.55 Others have shown that in human embryonic kidney cells, the APOL1 risk variants are associated with loss of intracellular potassium and induction of stress-activated protein kinase pathways.56 There are several factors that make the biologic elucidation of this target difficult. First, the target is expressed in certain primates (humans, gorillas, and baboons) but not in many other standard animal models. Second, human podocytes are notoriously difficult to culture in vitro, requiring the use of less relevant cell types. Finally, the target itself is intracellular, within the podocyte, rendering it classically “undruggable” from the small-molecule or biologics perspective. These issues highlight the need to combine high-quality cell biological and genomic data to advance mechanistic insight from genetic association data. It is conceivable that novel modalities such as RNA interference or antisense oligonucleotide techniques will be necessary.

Challenges and Opportunities

There is enormous potential in leveraging human genetics for drug discovery, as illustrated in this article. However, some challenges exist. First, the majority of genotype-phenotype association data are GWAS.
Although there have been numerous GWAS findings,25 a major challenge is translating GWAS into meaningful biological insights, mechanism of action, and pharmacologic targetability. There have been few success stories of target elucidation, and the translational path using GWAS alone is complex because the majority of GWAS findings are in noncoding regions with small effect sizes, requiring systematic functional elucidation to identify causal variants. Although the effect size of a GWAS finding does not correlate with the effect that can be achieved from pharmacologic manipulation when dissected post hoc,5 a small effect size can make modeling a finding in vitro challenging.31 In addition, the limited number of success stories to date dampens enthusiasm. However, developing tools to translate GWAS is a rapidly evolving field, and confidence exists that the necessary tools will be developed soon. Other limitations of using human genetics to drive drug discovery include the relatively small number of bona fide rare coding variant findings. As the numbers of individuals
with exome sequencing grow, particularly in unique populations with the ability to meaningfully detect rare mutations with large phenotypic consequences, this limitation will be overcome. PheWAS is a promising technique for elucidating on-target safety profiles but remains unvalidated in terms of its positive and negative predictive value. Genotype-phenotype associations do not establish causality on their own, and determining causality, mechanism of action, and targetability requires extensive laboratory work. Finally, and most importantly, having underlying human genetic support for a target and its indication merely increases the probability of success, but does not ensure success.

Conclusions

Using human genetics to drive drug discovery offers many opportunities. First, it provides insight into biology and disease associations in the most relevant model system: the human. Second, genetics can help pinpoint a pathway of interest that was not previously suspected in association with a disease. Third, genetics can help identify potential on-target safety issues, alternative indications, and repurposing opportunities. Layering genetic support onto relevant biology to identify the most promising targets is the key challenge and opportunity.

Article Information

Author’s Affiliation: Merck & Co, Inc, Boston, MA. Address for Correspondence: Caroline S. Fox, MD, MPH, Merck & Co, Inc, 33 Ave Louis Pasteur, Boston, MA 02115. E-mail: caroline.
[email protected] Support: None. Financial Disclosure: Dr Fox is a full-time employee of Merck Sharp & Dohme Corp, a subsidiary of Merck & Co, Inc, Kenilworth, NJ. As such, she receives compensation and owns stock in the company. Peer Review: Received August 11, 2018, in response to an invitation from the journal. Reviewed by 3 external peer reviewers, with direct editorial input from an Associate Editor and a Deputy Editor. Accepted in revised form December 24, 2018.
References

1. Scannell JW, Blanckley A, Boldon H, Warrington B. Diagnosing the decline in pharmaceutical R&D efficiency. Nat Rev Drug Discov. 2012;11(3):191-200. 2. DiMasi JA, Grabowski HG, Hansen RW. Innovation in the pharmaceutical industry: new estimates of R&D costs. J Health Econ. 2016;47:20-33. 3. Waring MJ, Arrowsmith J, Leach AR, et al. An analysis of the attrition of drug candidates from four major pharmaceutical companies. Nat Rev Drug Discov. 2015;14(7):475-486. 4. Fox CS, Hall JL, Arnett DK, et al. Future translational applications from the contemporary genomics era: a scientific statement from the American Heart Association. Circulation. 2015;131(19):1715-1736. 5. Nelson MR, Tipney H, Painter JL, et al. The support of human genetic evidence for approved drug indications. Nat Genet. 2015;47(8):856-860.
6. Plenge RM, Scolnick EM, Altshuler D. Validating therapeutic targets through human genetics. Nat Rev Drug Discov. 2013;12(8):581-594. 7. Abifadel M, Varret M, Rabes JP, et al. Mutations in PCSK9 cause autosomal dominant hypercholesterolemia. Nat Genet. 2003;34(2):154-156. 8. Cohen J, Pertsemlidis A, Kotowski IK, Graham R, Garcia CK, Hobbs HH. Low LDL cholesterol in individuals of African descent resulting from frequent nonsense mutations in PCSK9. Nat Genet. 2005;37(2):161-165. 9. Sabatine MS, Giugliano RP, Keech AC, et al. Evolocumab and clinical outcomes in patients with cardiovascular disease. N Engl J Med. 2017;376(18):1713-1722. 10. Willer CJ, Schmidt EM, Sengupta S, et al. Discovery and refinement of loci associated with lipid levels. Nat Genet. 2013;45(11):1274-1283. 11. Chong JX, Buckingham KJ, Jhangiani SN, et al. The genetic basis of Mendelian phenotypes: discoveries, challenges, and opportunities. Am J Hum Genet. 2015;97(2):199-215. 12. Amberger JS, Bocchini CA, Schiettecatte F, Scott AF, Hamosh A. OMIM.org: Online Mendelian Inheritance in Man (OMIM(R)), an online catalog of human genes and genetic disorders. Nucleic Acids Res. 2015;43(database issue):D789-D798. 13. Imperato-McGinley J, Miller M, Wilson JD, Peterson RE, Shackleton C, Gajdusek DC. A cluster of male pseudohermaphrodites with 5 alpha-reductase deficiency in Papua New Guinea. Clin Endocrinol (Oxf). 1991;34(4):293-298. 14. Rittmaster RS. Finasteride. N Engl J Med. 1994;330(2):120-125. 15. Davis PB, Yasothan U, Kirkpatrick P. Ivacaftor. Nat Rev Drug Discov. 2012;11(5):349-350. 16. Dubey AK, Handu SS, Mediratta PK. Suvorexant: the first orexin receptor antagonist to treat insomnia. J Pharmacol Pharmacother. 2015;6(2):118-121. 17. Brown SD, Moore MW. Towards an encyclopaedia of mammalian gene function: the International Mouse Phenotyping Consortium. Dis Model Mech. 2012;5(3):289-292. 18. MacArthur DG, Balasubramanian S, Frankish A, et al. 
A systematic survey of loss-of-function variants in human protein-coding genes. Science. 2012;335(6070):823-828. 19. Narasimhan VM, Hunt KA, Mason D, et al. Health and population effects of rare gene knockouts in adult humans with related parents. Science. 2016;352(6284):474-477. 20. Saleheen D, Natarajan P, Armean IM, et al. Human knockouts and phenotypic analysis in a cohort with a high rate of consanguinity. Nature. 2017;544(7649):235-239. 21. Mullard A. Calls grow to tap the gold mine of human genetic knockouts. Nat Rev Drug Discov. 2017;16(8):515-518. 22. Rice GI, Del Toro Duany Y, Jenkinson EM, et al. Gain-of-function mutations in IFIH1 cause a spectrum of human disease phenotypes associated with upregulated type I interferon signaling. Nat Genet. 2014;46(5):503-509. 23. Lee JJ, Wedow R, Okbay A, et al. Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nat Genet. 2018;50(8):1112-1121. 24. Visscher PM, Wray NR, Zhang Q, et al. 10 years of GWAS discovery: biology, function, and translation. Am J Hum Genet. 2017;101(1):5-22. 25. MacArthur J, Bowler E, Cerezo M, et al. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res. 2017;45(D1):D896-D901. 26. Schaid DJ, Chen W, Larson NB. From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat Rev Genet. 2018;19(8):491-504.
27. Hormozdiari F, van de Bunt M, Segre AV, et al. Colocalization of GWAS and eQTL signals detects target genes. Am J Hum Genet. 2016;99(6):1245-1260. 28. Ernst J, Melnikov A, Zhang X, et al. Genome-scale high-resolution mapping of activating and repressive nucleotides in regulatory regions. Nat Biotechnol. 2016;34(11):1180-1190. 29. Ulirsch JC, Nandakumar SK, Wang L, et al. Systematic functional dissection of common genetic variation affecting red blood cell traits. Cell. 2016;165(6):1530-1545. 30. Maurano MT, Humbert R, Rynes E, et al. Systematic localization of common disease-associated variation in regulatory DNA. Science. 2012;337(6099):1190-1195. 31. MacRae CA, Pollak MR. Effect size does matter: the long road to mechanistic insight from genome-wide association. Circulation. 2015;132(21):1943-1945. 32. Mahajan A, Wessel J, Willems SM, et al. Refining the accuracy of validated target identification through coding variant fine-mapping in type 2 diabetes. Nat Genet. 2018;50(4):559-571. 33. O’Seaghdha CM, Yang Q, Glazer NL, et al. Common variants in the calcium-sensing receptor gene are associated with total serum calcium levels. Hum Mol Genet. 2010;19(21):4296-4303. 34. Kestenbaum B, Glazer NL, Kottgen A, et al. Common genetic variants associate with serum phosphorus concentration. J Am Soc Nephrol. 2010;21(7):1223-1232. 35. Gupta RM, Hadaya J, Trehan A, et al. A genetic variant associated with five vascular diseases is a distal regulator of endothelin-1 gene expression. Cell. 2017;170(3):522-533 e515. 36. Denny JC, Bastarache L, Ritchie MD, et al. Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nat Biotechnol. 2013;31(12):1102-1110. 37. Verma A, Lucas A, Verma SS, et al. PheWAS and beyond: the landscape of associations with medical diagnoses and clinical measures across 38,662 individuals from Geisinger. Am J Hum Genet. 2018;102(4):592-608. 38. 
Ehm MG, Aponte JL, Chiano MN, et al. Phenome-wide association study using research participants’ self-reported data provides insight into the Th17 and IL-17 pathway. PLoS One. 2017;12(11):e0186405. 39. Fritsche LG, Igl W, Bailey JN, et al. A large genome-wide association study of age-related macular degeneration highlights contributions of rare and common variants. Nat Genet. 2016;48(2):134-143. 40. HPS3/TIMI55–REVEAL Collaborative Group; Bowman L, Hopewell JC, Chen F, et al. Effects of anacetrapib in patients with atherosclerotic vascular disease. N Engl J Med. 2017;377(13):1217-1227. 41. Pulley JM, Shiry-Rice JK, Lavieri RR, et al. Accelerating precision drug development and drug repurposing by leveraging human genetics. Assay Drug Dev Technol. 2017;15(3):113-119. 42. Pattaro C, Teumer A, Gorski M, et al. Genetic associations at 53 loci highlight cell types and biological pathways relevant for kidney function. Nat Commun. 2016;7:10023. 43. Kottgen A, Pattaro C, Boger CA, et al. New loci associated with kidney function and chronic kidney disease. Nat Genet. 2010;42(5):376-384. 44. Kottgen A, Glazer NL, Dehghan A, et al. Multiple loci associated with indices of renal function and chronic kidney disease. Nat Genet. 2009;41(6):712-717. 45. Kottgen A, Hwang SJ, Larson MG, et al. Uromodulin levels associate with a common UMOD variant and risk for incident CKD. J Am Soc Nephrol. 2010;21(2):337-344.
46. Gudbjartsson DF, Holm H, Indridason OS, et al. Association of variants at UMOD with chronic kidney disease and kidney stones-role of age and comorbid diseases. PLoS Genet. 2010;6(7):e1001039. 47. Kopp JB, Smith MW, Nelson GW, et al. MYH9 is a major-effect risk gene for focal segmental glomerulosclerosis. Nat Genet. 2008;40(10):1175-1184. 48. Kao WH, Klag MJ, Meoni LA, et al. MYH9 is associated with nondiabetic end-stage renal disease in African Americans. Nat Genet. 2008;40(10):1185-1192. 49. Genovese G, Friedman DJ, Ross MD, et al. Association of trypanolytic ApoL1 variants with kidney disease in African Americans. Science. 2010;329(5993):841-845. 50. Parsa A, Kao WH, Xie D, et al. APOL1 risk variants, race, and progression of chronic kidney disease. N Engl J Med. 2013;369(23):2183-2196. 51. Nichols B, Jog P, Lee JH, et al. Innate immunity pathways regulate the nephropathy gene apolipoprotein L1. Kidney Int. 2015;87(2):332-342. 52. Lan X, Jhaveri A, Cheng K, et al. APOL1 risk variants enhance podocyte necrosis through compromising lysosomal membrane permeability. Am J Physiol Renal Physiol. 2014;307(3):F326-F336. 53. Perdigoto C. Mutations: dawn of the human knockout project. Nat Rev Genet. 2017;18(6):328-329. 54. Vanhollebeke B, Truc P, Poelvoorde P, et al. Human Trypanosoma evansi infection linked to a lack of apolipoprotein L-I. N Engl J Med. 2006;355(26):2752-2756. 55. Beckerman P, Bi-Karchin J, Park AS, et al. Transgenic expression of human APOL1 risk variants in podocytes induces kidney disease in mice. Nat Med. 2017;23(4):429-438. 56. Olabisi OA, Zhang JY, VerPlank L, et al. APOL1 kidney disease risk variants cause cytotoxicity by depleting cellular potassium and inducing stress-activated protein kinases. Proc Natl Acad Sci U S A. 2016;113(4):830-837. 57. Gharavi AG, Kiryluk K, Choi M, et al. Genome-wide association study identifies susceptibility loci for IgA nephropathy. Nat Genet. 2011;43(4):321-327. 58. Kiryluk K, Li Y, Scolari F, et al. 
Discovery of new risk loci for IgA nephropathy implicates genes involved in immunity against intestinal pathogens. Nat Genet. 2014;46(11):1187-1196. 59. Jia X, Horinouchi T, Hitomi Y, et al. Strong association of the HLA-DR/DQ locus with childhood steroid-sensitive nephrotic syndrome in the Japanese population. J Am Soc Nephrol. 2018;29(8):2189-2199. 60. Sekula P, Li Y, Stanescu HC, et al. Genetic risk variants for membranous nephropathy: extension of and association with other chronic kidney disease aetiologies. Nephrol Dial Transplant. 2017;32(2):325-332. 61. Sanna-Cherchi S, Khan K, Westland R, et al. Exome-wide association study identifies GREB1L mutations in congenital kidney malformations. Am J Hum Genet. 2017;101(5):789-802. 62. Bohnen MS, Ma L, Zhu N, et al. Loss-of-function ABCC8 mutations in pulmonary arterial hypertension. Circ Genom Precis Med. 2018;11(10):e002087. 63. Abul-Husn NS, Cheng X, Li AH, et al. A protein-truncating HSD17B13 variant and protection from chronic liver disease. N Engl J Med. 2018;378(12):1096-1106. 64. Dewey FE, Gusarova V, Dunbar RL, et al. Genetic and pharmacologic inactivation of ANGPTL3 and cardiovascular disease. N Engl J Med. 2017;377(3):211-221.