Segregation, linkage, GWAS, and sequencing


Chapter 2

Andrea R. Waksmunski (a,b,c), Leighanne R. Main (a,b,c), Jonathan L. Haines (a,b,c)

(a) Department of Genetics and Genome Sciences, Case Western Reserve University, Cleveland, OH, United States
(b) Cleveland Institute for Computational Biology, Case Western Reserve University, Cleveland, OH, United States
(c) Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, United States

Segregation analysis

Segregation: How is a trait inherited?

The principles of modern genetics were first characterized by an Austrian monk, Gregor Mendel, who proposed that units of heredity are passed down from parents to offspring [1]. By examining the frequency of traits in the offspring, he was able to describe the Law of Segregation [1, 2], which ultimately helped define multiple inheritance patterns, including additive, dominant, recessive, autosomal, and sex-linked. Before the era of recombinant DNA, the statistical tool of segregation analysis was often used to determine the mode of inheritance of a trait based on these segregation ratios [3]. This was done to determine whether a trait had strong enough genetic underpinnings to warrant additional genetic studies, which were generally laborious and expensive. To perform a segregation analysis, investigators ascertain information regarding individuals’ phenotypes from multiple families. These analyses do not require the collection of biospecimens such as DNA [4].

Genetics and Genomics of Eye Disease, https://doi.org/10.1016/B978-0-12-816222-4.00002-2
Copyright © 2020 Elsevier Inc. All rights reserved.

Simple vs. complex segregation analysis

Segregation analyses are categorized as either simple or complex. Simple (or classical) segregation analyses statistically determine whether the ratio of offspring with the trait of interest in a nuclear family deviates from the proportions expected for Mendelian inheritance [5]. Complex segregation analyses elucidate the transmission of a trait in families by testing various models of inheritance [5]. For qualitative traits, complex segregation analyses estimate population-level parameters such as allele frequencies, penetrance parameters, transmission probabilities (probability of the child’s trait given the parents’ traits), and familial correlations, including the correlation between parents, between siblings, and between parent and offspring [6]. For quantitative traits, similar parameters are estimated, except that the penetrance parameters consist of genotype means and an environmental variance [7].
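As a minimal sketch of the simple (classical) case, the following Python snippet tests whether the proportion of affected offspring departs from the 1/4 expected under autosomal recessive inheritance when both parents are carriers. The counts are hypothetical, the function name is illustrative, and the sketch ignores ascertainment bias, which a real segregation analysis has to correct for because families are usually identified through an affected child.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of every outcome that is no more likely
    than the observed count k under the null proportion p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # The small tolerance guards against floating-point ties.
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))


# Hypothetical data: 30 affected among 120 offspring of carrier x carrier matings.
affected, total = 30, 120
expected_ratio = 0.25  # autosomal recessive expectation

p_value = binomial_two_sided_p(affected, total, expected_ratio)
print(f"observed ratio = {affected / total:.3f}, p = {p_value:.3f}")
```

A large p-value here only means the data are compatible with the Mendelian ratio; as noted for segregation analysis in general, it does not prove that mode of inheritance.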

Advantages of segregation analysis

Historically, when familial aggregation of a trait was observed, segregation analysis was used to determine its mode of inheritance in families and to guide further studies to identify the genetic variation underlying the trait. Complex segregation analysis allows larger pedigrees to be analyzed and can be used to study either quantitative or qualitative traits [5, 7, 8]. This analysis can model multiple genetic risk factors for a trait as well as environmental factors and phenocopies [5, 9]. Phenocopies resemble the result of genetic variation but are actually the product of a different etiology, including environmental factors and random chance [9]. The knowledge gained from a complex segregation analysis can help inform model-based linkage analyses to locate the genetic determinants of the trait [4, 5]. This information increases the power of those analyses because model misspecification is less likely [4]. With the advancement of genome-wide DNA assays and analyses, it has become more efficient and cost-effective to interrogate the genetic variation of a trait directly, which allows the level of heritability and the underlying variation to be determined simultaneously.

Disadvantages of segregation analysis

While segregation analyses have been instrumental in identifying the modes of inheritance for numerous traits and diseases, they have limitations. Their study designs are restricted to determining trait transmission in families, not the general population. The complexity of the models that can be fitted is also limited because the amount of data needed to assess segregation grows with the number of parameters estimated for the model [5]. These analyses are most informative for studying Mendelian conditions rather than complex traits, which do not usually follow monogenic patterns of inheritance [10]. Although this analysis provides evidence for the segregation of a phenotype, it cannot directly prove that the inferred mode of inheritance is true [5]. Additional statistical analyses and functional studies are required to demonstrate that it occurs biologically. Moreover, current genomic approaches do not require data from segregation analyses as input, and the results from genome-wide studies provide more detailed information about inheritance than segregation analyses would. Therefore, in modern genetic epidemiology, segregation analyses are rarely performed in research studies.

I. Introduction to gene mapping


Segregation analyses for ocular traits and diseases

Because numerous eye conditions have a strong genetic component, multiple studies have used segregation analyses to explain their inheritance patterns. In 1914, a study aimed to determine whether the transmission of myopia was consistent with Mendelian modes of inheritance [11]. Segregation analyses were also used to detect the modes of inheritance for retinitis pigmentosa, which is one of the most common types of inherited retinal degeneration. Studies demonstrated that this disease has multiple patterns of inheritance, including autosomal dominant, autosomal recessive, X-linked, and mitochondrial modes of inheritance [12–14]. A segregation analysis was performed for age-related maculopathy in families from the Beaver Dam Eye Study [15]. The findings suggested that Mendelian transmission of a major gene could not be rejected and were consistent with a major effect accounting for about 60% of the variation in the observed age-related maculopathy scores in each eye [15]. More recent publications have described segregation analyses for inherited retinal degeneration [16–18], juvenile onset primary open-angle glaucoma (POAG) [19], and familial keratoconus [20]. In these studies, the segregation analysis was often coupled with next-generation sequencing or other statistical analyses, such as genome-wide association analyses.

Linkage analysis

What is genetic linkage?

Mendel’s Law of Independent Assortment states that different genetic loci are inherited independently of one another [1]. This is true if loci are on different chromosomes (as were Mendel’s original pea traits) or far apart on the same chromosome. However, it was soon observed that some loci tend to be inherited together (essentially being linked to each other), in violation of this law. How frequently two loci are inherited together can be measured statistically and transformed into a relative distance [2, 21, 22]. The key to using genetic linkage is having at least two measured traits.

Linkage: What genetic variation contributes to a trait?

Linkage analysis is a statistical method for discovering the locations of loci underlying a trait of unknown position by testing for co-segregation with genetic polymorphisms of known position in the genome [22]. In contrast to segregation analyses, which determine the mode of inheritance for a trait and do not require biospecimens, methods to detect genetic linkage require DNA from families to be genotyped for polymorphisms with known positions [4]. The map positions of the genotyped polymorphisms are used to estimate the recombination fraction between them and the hypothesized locus of the trait of interest. Conceptually, the recombination fraction (θ) between two loci is estimated from the number of recombinants and nonrecombinants counted in a pedigree [23]. These can be counted directly in rare cases, but θ is typically estimated statistically [23]. Linkage analyses were initially used to study the genetic etiology of Mendelian conditions but have also been performed for complex traits [24, 25].


Types of linkage analysis

Multiple types of linkage analyses have been developed to interrogate the genetic basis of qualitative and quantitative traits. These include two-point and multipoint linkage analyses as well as model-free and model-based linkage analyses. Two-point linkage analyses (also called single-point analyses) individually examine the likelihood of linkage between the trait locus and a single genetic polymorphism, such as a single-nucleotide polymorphism (SNP) [26]. Multipoint analyses examine the likelihood of linkage among multiple polymorphisms and the trait locus in a region or on an entire chromosome [26]. Either of these approaches can be utilized in model-free or model-based linkage analyses. By definition, model-free linkage does not require the investigator to define a model (which includes parameters such as mode of inheritance) prior to performing the analysis. This type of analysis is more robust when the mode of inheritance is unknown for a trait [4]. It considers whether allele sharing among siblings in a family is the result of being identical by descent (IBD), which occurs when the shared alleles were inherited by each child from a common ancestor [4]. For this analysis, the proportions of sib-pairs with 0, 1, and 2 alleles shared IBD at a single locus are estimated and compared to those expected under the assumption of no genetic linkage [27]. The base-10 logarithm (log10) of this likelihood ratio corresponds to the maximum logarithm of the odds (LOD) score. There is significant evidence of linkage if the maximum LOD score (MLS) is > 3.0 [22, 27]. For model-based linkage analyses, the researcher defines model parameters, such as trait allele frequency, mode of inheritance with penetrance values, and marker allele frequencies, to be used in the statistical analysis [23].
In linkage analyses, the recombination fraction between the trait locus and the marker of known position is estimated and compared to 0.5 (the expected recombination fraction for unlinked loci) in a likelihood ratio [22]. The log10 of this ratio is the “log-odds” or LOD score [22]. In smaller scans of the genome, a LOD score > 3.0, which is equivalent to a pointwise p-value of about $10^{-4}$, is considered significant evidence for genetic linkage [28]. For genome-wide significance, the LOD score should be > 3.3, which corresponds to a p-value of $4.9 \times 10^{-5}$ [28].
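The two likelihood ratios described above can be illustrated with a short Python sketch. It makes simplifying assumptions that are not part of the chapter: meioses are phase-known so recombinants can be counted directly, IBD sharing is observed without error, and the sib-pair LOD is unconstrained (it does not apply the restrictions used in published MLS methods). All function names and counts are hypothetical.

```python
from math import log10


def lod_score(recombinants: int, nonrecombinants: int, theta: float) -> float:
    """Two-point LOD at a given theta for phase-known meioses:
    log10[ theta^R * (1 - theta)^NR / 0.5^(R + NR) ]."""
    n = recombinants + nonrecombinants
    log_likelihood = 0.0
    if recombinants:
        log_likelihood += recombinants * log10(theta)
    if nonrecombinants:
        log_likelihood += nonrecombinants * log10(1 - theta)
    return log_likelihood - n * log10(0.5)


def max_lod(recombinants: int, nonrecombinants: int) -> tuple:
    """Maximum-likelihood theta (capped at 0.5) and the LOD at that theta."""
    n = recombinants + nonrecombinants
    theta_hat = min(recombinants / n, 0.5)
    return theta_hat, lod_score(recombinants, nonrecombinants, theta_hat)


def sib_pair_lod(n0: int, n1: int, n2: int) -> float:
    """Unconstrained LOD comparing observed proportions of sib pairs sharing
    0, 1, or 2 alleles IBD with the no-linkage expectation (1/4, 1/2, 1/4)."""
    n = n0 + n1 + n2
    lod = 0.0
    for count, null_prob in zip((n0, n1, n2), (0.25, 0.5, 0.25)):
        if count:
            lod += count * log10((count / n) / null_prob)
    return lod


# Hypothetical pedigree data: 2 recombinants among 20 informative meioses.
theta_hat, lod = max_lod(recombinants=2, nonrecombinants=18)
print(f"theta_hat = {theta_hat:.2f}, LOD = {lod:.2f}")

# Hypothetical affected sib pairs sharing 0, 1, and 2 alleles IBD.
print(f"sib-pair LOD = {sib_pair_lod(10, 50, 40):.2f}")
```

In practice θ is rarely obtained by direct counting, so dedicated linkage software evaluates the pedigree likelihood instead; the sketch only shows how the ratio against θ = 0.5 turns into a LOD score.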

Advantages of linkage analysis

Although linkage analyses were most prevalent in classical genetic epidemiological studies, their utility has reemerged in the genomics era as a result of their advantageous qualities. Linkage analyses are optimal for identifying rare, highly penetrant variants that co-segregate with a trait within families [29]. These rare variants, which could account for a significant portion of the trait’s heritability, are often difficult to detect with other analyses because genome-wide association studies (GWAS) rely on nonfamily, case-control designs [30]. In linkage analyses, the number of polymorphisms required to detect linkage is relatively low compared to methods for elucidating genetic association. Older genome-wide linkage analyses utilized 300–600 microsatellite polymorphisms, which cover map position intervals of 10 centimorgans and 5 centimorgans, respectively [31]. More recent genome-wide linkage studies use 4000–6000 biallelic SNP markers. Because of the relatively low number of polymorphisms, correction for multiple comparisons is not onerous. The power of these analyses to identify linkage is enhanced when the study cohort is enriched for individuals with the disease allele. This can be accomplished by ascertaining individuals from population isolates, like the Amish, or based on trait characteristics such as illness severity, clinical subtypes, or age of onset [31]. Linkage analyses are also unhindered by allelic heterogeneity because genetic linkage considers the physical proximity between genomic loci rather than alleles [29].

Disadvantages of linkage analysis

While linkage analyses have successfully elucidated thousands of susceptibility loci for human traits, they are primarily used for studying families rather than populations. Additionally, linkage analyses are not well suited for detecting genetic variants of small effect when the study’s sample size is small [32]. This becomes especially problematic when studying complex traits, for which the genetic variance is typically accounted for by multiple loci of low to modest effect. Consequently, these analyses are most effective for studying traits and disorders that exhibit Mendelian modes of inheritance, which are rare in the general population [31]. Model-based linkage analyses are sensitive to inaccuracies in the specified model, such as an incorrect mode of inheritance or an overestimated disease allele frequency [5]. LOD scores in linkage analyses are deleteriously affected by errors in genotyping and/or phenotyping [33]. Locus and clinical heterogeneity can also negatively affect the detection of genetic linkage [33]. The chromosomal region identified by linkage analysis is also relatively large (often 20–50 million base pairs) and requires additional fine-mapping approaches, such as linkage disequilibrium mapping or association testing, to pinpoint the genetic determinant of the trait [34].

Linkage analyses for ocular traits and diseases

Linkage analyses have been vital for mapping the genes responsible for both Mendelian and complex eye traits. The genetic etiology of age-related macular degeneration (AMD) has been extensively studied with linkage studies. A 21-member family with a high incidence of AMD was ascertained for two-point linkage analysis, which identified a 9 centimorgan region on the q arm of chromosome 1 [35]. Investigators mapped nominally significant genomic loci for AMD on chromosomes 5, 9, 12, 15, 16, 18, and 20 using a genome-wide model-free linkage analysis of 34 large families from the Beaver Dam Eye Study [36]. Pooled model-based and model-free linkage analyses for AMD demonstrated the disease relevance of regions on chromosomes 1, 2, 10, and 17 [37]. In 2005, a genome-scan meta-analysis (GSMA) was performed using data from six AMD linkage screens [36–41] to increase the power to detect loci for AMD [42]. This study found significant evidence for linkage to AMD on chromosome 10q26 and nominally significant linkage regions on chromosomes 1q, 2p, 3p, 4q, 12q, and 16q [42]. More recently, a chromosome-specific multipoint linkage analysis was performed in Amish families with AMD [43]. This study included liability classes for carriers of particular genetic variants (Y402H and P503A) in the complement factor H (CFH) gene in the statistical models and determined that these variants were modestly responsible for the significant linkage signal obtained in a separate genome-wide analysis [43]. Additionally, linkage mapping for juvenile open-angle glaucoma, which is a rare Mendelian form of POAG, facilitated the discovery of a locus for the common and complex form of the disease [44]. Evidence for genetic linkage for refractive errors was found on chromosomes 3, 7, and 22 using model-free linkage analysis on 834 sib-pairs in 486 families within the Beaver Dam Eye Study [45]. A genomic locus for congenital cavitary optic disc anomalies on chromosome 14 was also identified by a statistical approach that coupled genome-wide linkage analysis with fine-mapping methods [46].

Genome-wide association studies

What is a GWAS?

Because genetic linkage and candidate gene studies could not explain all the heritability of complex diseases, investigators turned to GWAS, which became feasible as the number of genotyped individuals grew [32]. GWAS generally compare variation in the genome between those affected by the disease (cases) and those who are not (controls) [47]. These studies can also be applied to quantitative traits and endophenotypes for a disease [48]. This methodology is based on the “common disease-common variant” hypothesis, which suggests that common diseases are influenced by underlying variants that are common across the population [49]. GWAS have been successfully used to identify associated variants for diseases for over a decade. The first GWAS in 2002 [50] was followed by the first GWAS of common genomic variants in 2005 [51] and the first large-scale, high-coverage GWAS for complex traits in 2007 [52]. Usually, DNA microarrays such as SNP arrays are used to find significant differences in allele frequencies between cases and controls. Initial studies focused on only approximately 1000 SNPs, but as a result of rapid technological improvements, larger SNP arrays of 600,000–5,000,000 SNPs are now used and can be customized [53]. Although examining SNPs is most common, data for GWAS can also be generated from whole genome sequencing (WGS) or whole exome sequencing (WES). These approaches are not currently favored because of their high price and the limited increase in associated variants found compared to SNP arrays [48].
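The core comparison in a case-control GWAS, a difference in allele frequency at one SNP, can be sketched as a chi-square test on a 2 × 2 table of allele counts. This is an illustrative Python example rather than the pipeline of any particular study: the counts are hypothetical, the function name is made up for the sketch, and real analyses typically use regression models that adjust for covariates such as ancestry.

```python
from math import erfc, sqrt


def allelic_chi_square(case_minor, case_major, control_minor, control_major):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table of
    allele counts. Returns (chi2, p_value, odds_ratio)."""
    table = [[case_minor, case_major], [control_minor, control_major]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    odds_ratio = (case_minor * control_major) / (case_major * control_minor)
    return chi2, p_value, odds_ratio


# Hypothetical SNP: 1000 cases and 1000 controls, i.e. 2000 alleles per group.
chi2, p_value, odds_ratio = allelic_chi_square(600, 1400, 500, 1500)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}, OR = {odds_ratio:.2f}")
```

A genome-wide scan repeats a test of this kind at every genotyped SNP, which is why the significance threshold has to account for the number of tests performed.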

Advantages and disadvantages of GWAS

Although GWAS have uncovered thousands of moderate to low-effect SNPs for hundreds of human traits, the complete genetic architecture of complex traits has remained elusive. Most of the associated variants found in GWAS have been common (minor allele frequency > 1%) in the population and have very small overall effects on the penetrance of the trait [48, 54, 55]. Therefore, a large portion of the heritability of many complex traits remains unexplained by known variants. GWAS also mainly focus on the statistical likelihood of trait-associated variation and are unable to implicate the direct biological consequences of the results. Supporting evidence from functional studies must corroborate GWAS results for the associated variants to be considered causal for human diseases. Since large sample sizes are also needed to properly perform a GWAS, many consortia have been established to help ascertain and aggregate thousands of cases and controls for a trait of interest [56]. For instance, investigators have collaborated to form the following consortia focused on eye diseases: the International Age-related Macular Degeneration Genomics Consortium (IAMDGC), the International Glaucoma Genetics Consortium, the National Eye Institute Glaucoma Human genetics collaBORation Heritable Overall Operational Database (NEIGHBORHOOD) Consortium, and the Consortium on Refractive Error and Myopia (CREAM) [57–63]. Despite these international efforts to generate large cohorts of cases and controls, GWAS have primarily been conducted in populations of European descent. This can lead to issues with population stratification and admixture [47]. As additional diverse populations are studied, the number of variants associated with disease is also predicted to grow.

Another limitation for GWAS is the correction for multiple testing in the statistical analysis. The Bonferroni adjustment is the most commonly used method and is based on the following equation: $\alpha_{\mathrm{GWAS}} = \alpha / n$, where α is the family-wise significance level and n represents the number of tests (or SNPs) being used. Point-wise significance in GWAS is considered a p-value < $5 \times 10^{-8}$ for α = 0.05 [32, 64, 65]. If 0.05 were used for declaring GWAS significance instead of $5 \times 10^{-8}$, then for one million independent SNPs, about 50,000 could be false positives. Bonferroni adjustments tend to be overly conservative because they assume that all tests are independent, even though many SNPs are inherited together and, therefore, are not independent [66].
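The arithmetic behind the genome-wide threshold can be checked in a few lines of Python. This is a small sketch using the figures quoted in the text (α = 0.05 and one million independent SNPs); the function names are illustrative, and the false-positive count assumes that none of the tested SNPs is truly associated.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise level at alpha."""
    return alpha / n_tests


def expected_false_positives(threshold: float, n_tests: int) -> float:
    """Expected number of null tests that pass a given per-test threshold."""
    return threshold * n_tests


alpha = 0.05
n_snps = 1_000_000

per_test = bonferroni_threshold(alpha, n_snps)
naive = expected_false_positives(alpha, n_snps)
corrected = expected_false_positives(per_test, n_snps)

print(f"per-test threshold: {per_test:.1e}")
print(f"expected false positives at 0.05: {naive:,.0f}")
print(f"expected false positives after correction: {corrected:.2f}")
```

Because linked SNPs are correlated, the effective number of independent tests is smaller than the number of SNPs genotyped, which is the sense in which the adjustment is conservative.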

Examples of GWAS for eye diseases

GWAS have been extremely successful in identifying loci associated with both AMD and glaucoma. The first GWAS of common SNPs was performed on AMD in 2005 [51] and identified a specific complement factor H (CFH) variant that increased an individual’s risk of developing AMD. Simultaneously, three additional papers using three different study designs identified the same variant in CFH [67–69]. This seminal event started the explosion of GWAS for human traits. Since then, 52 independent genomic variants across 34 loci have been found for AMD and explain up to 65% of the heritability of the disease [59, 60, 70–73]. However, about 35% of AMD heritability is still unexplained by known genetic variants. This missing heritability might be explained by rare variants, as suggested by a 2011 study that identified a highly penetrant CFH allele carried by only a very small part of the population [74]. Additional rare variants in other AMD susceptibility loci have since been identified in the general population [60, 75]. GWAS have also been performed to interrogate the genetic architecture of glaucoma. A total of 71 loci have been found for POAG, and the role of gene-gene interactions in POAG has been investigated [58, 61, 76]. These studies helped find novel POAG pathways and led to a better understanding of the underlying mechanisms and heritability of the disease [77]. As technology advances and more samples become available with help from various consortia, GWAS continue to find new susceptibility loci for eye diseases and inspire possible treatments for these conditions [78].

DNA sequencing

Overview of sequencing technologies: Past to present

Knowing the sequence of the human genome has been critical in understanding the etiology of many diseases. The first method of DNA sequencing (chain-termination sequencing) was developed in 1977 by Frederick Sanger and colleagues [79] (Fig. 1). In reactions containing known concentrations of nucleotides and dideoxynucleotides, DNA synthesis proceeds until a dideoxynucleotide is added to the growing strand; synthesis then terminates because the dideoxynucleotide lacks the 3′-hydroxyl group needed to bind the next nucleotide. The resulting fragments were then run on gels to visualize where termination occurred and determine the overall sequence of the fragment [79]. Soon after, a faster and more cost-effective sequencing method (shotgun sequencing) was used to help decode the human genome [80–83].

[Figure 1 is a graphical timeline that did not survive text extraction. Recoverable entries:
1977: Sanger sequencing (Sanger and colleagues), chain-termination sequencing [79]
1981: Shotgun sequencing (Sanger and colleagues) [80]
2001: First human genome was published.
2004: Complete human genome was published for academic research. The National Human Genome Research Institute (NHGRI) started an initiative to reduce the cost of sequencing.
2005–2014 (the year of each entry is not recoverable): 454 sequencing (Roche); Genome Analyzer (Illumina, formerly Solexa); SOLiD (Applied Biosystems); Ion Torrent (Life Technologies); HiSeq 2000 (Illumina); MiSeq (Illumina); SMRT sequencing (Pacific Biosciences, PacBio); nanopore sequencing (Oxford Nanopore Technologies). Methods listed: sequencing-by-synthesis, sequencing-by-ligation, single-molecule real-time sequencing [86–88], and nanopore sequencing [89–91].]

FIG. 1 Timeline of sequencing technologies. First-generation technologies are highlighted in yellow. Second-generation (also called next-generation) technologies are in blue. Third-generation technologies are in green. The developer and methods are noted for each technology.


In 2004, the complete human genome was published for academic research, and the National Human Genome Research Institute (NHGRI) started an initiative to reduce the cost of sequencing. This led to the development of high-throughput sequencing technologies, sometimes referred to as next-generation (“next-gen”) and third-generation technologies [81–83]. The most prominent types of sequencing include sequencing by synthesis, sequencing by ligation, single-molecule real-time sequencing, and nanopore sequencing (Fig. 1). Sequencing by synthesis is by far the most widely used methodology and discriminates between nucleotides via fluorescent labeling, differences in ion release, or changes in voltage due to a pH change. In sequencing by ligation, each nucleotide is uniquely labeled with a fluorescent probe that is detected when DNA ligase adds the nucleotide to the growing complementary strand, which is then sequenced. However, this method was found to have issues with palindromic regions and is therefore no longer commonly used [85]. Single-molecule real-time (SMRT) sequencing avoids clonal amplification and allows for the direct sequencing of modified DNA, which includes epigenetic markers [86]. In addition to determining base pair composition, SMRT sequencing generates kinetic profiles of DNA polymerase and the extending strand. This makes it possible to examine various DNA methylation states [87, 88]. Nanopore sequencing measures current changes in the surrounding fluid as DNA moves nucleotide by nucleotide through a motorized pore [89–91]. Both SMRT and nanopore sequencing are promising technologies but currently have higher error rates than other high-throughput methods [92].

Types of sequencing

Sequencing can examine different components of the genome. For example, WGS determines the nucleotide composition of most of the genome, although difficulties still exist with regions of high guanine-cytosine (GC) content and repetitive sequences. WES, on the other hand, examines only the protein-coding regions of the genome [93]. For WES, DNA first must be filtered so that only the protein-coding regions are present. This can be done with an array-based capture or an in-solution capture. In array-based capture methods, DNA microarrays with oligonucleotides that correspond to the human genome are used. For in-solution capture, custom probes are used to target the DNA of interest [94]. WES is commonly used to make clinical diagnoses [95, 96]. WGS and WES have aided in the discovery of genetic variants for numerous inherited ocular diseases. For instance, researchers performed WES to identify new variants for retinitis pigmentosa [97]. WGS has also helped identify new variants for inherited retinal diseases and contributed to a better understanding of the alterations in the genetic landscape of uveal melanoma [98, 99].

In addition to examining variation in single nucleotides, genome-wide sequencing efforts enable examination of structural and epigenetic variation in the human genome, especially in the context of human disease. Methylation profiling has shown differences in the retinal cells of individuals with AMD compared to controls [100, 101]. Chromatin immunoprecipitation sequencing (ChIP-seq) is used to identify DNA-protein interactions and histone modifications [102]. Chromatin structure and organization are also important for disease diagnostics and can be examined with both DNase I hypersensitive sites sequencing (DNase-seq) and the more recent assay for transposase-accessible chromatin using sequencing (ATAC-seq) [103–105]. Analysis of ATAC-seq data in AMD identified a decrease in chromatin accessibility in those affected by AMD. The authors also suggested that smoking, which is a known risk factor for AMD, is linked to chromatin accessibility and proposed a possible mechanism for how smoking affects AMD progression at the molecular level [106]. The three-dimensional shape of DNA has also been explored using chromosome conformation capture techniques such as 3C, 4C, 5C, and Hi-C [107–110]. RNA sequencing (RNA-seq) is also commonly used to look at expression profiles of different cells and tissue types [111]. These methods have also been applied to single cells, which will be helpful for studying rare populations of cells and for characterizing the development of complex organs such as the eye. Recently, new subtypes of retinal ganglion cells (RGCs) have been found using single-cell RNA-seq [112]. Various other cell types and subtypes are expected to emerge as single-cell sequencing grows in popularity and decreases in cost.

Advantages and disadvantages of sequencing

Despite the immense progress made in recent years, sequencing still has many disadvantages. In addition to the cost and time required for sequencing, the short read lengths obtained from these methods lead to inaccuracy in the sequence results [84, 113]. Repetitive sequences have also remained an issue for these technologies and have made it impossible to sequence 4%–9% of the human genome [114, 115]. Sequencing technology continues to improve with time, and more accurate results should be expected in the future. In contrast to sequencing approaches, DNA microarrays and targeted genotyping offer cheaper, less data-intensive methods for discovering the underlying genetic components of diseases. Both methods use specific sequences of DNA to examine the DNA fragments of interest. However, both methods likely miss data that may be pertinent to understanding the mechanisms of a disease or trait of interest. For instance, noncoding regions that have an impact on the gene of interest are often excluded with these methods [116]. Some DNA microarray probes are also nonspecific and may inadvertently bind to similar sequences, such as those that occur within gene families or splice variants of genes. Finally, microarrays are not optimal for novel variant discovery since their probes are designed based on known genetic variants; therefore, areas that have not yet been annotated in the genome are omitted from the array [117]. DNA microarrays are still beneficial for generating data on the large sample sizes required for GWAS, for which it is still not economically or technologically feasible for many labs to do WGS [48].

References € ber Pflanzen-Hybriden. [Experiments on Plant Hybrids], 4, Verhandlungen des [1] G. Mendel, Versuche u naturforschenden Vereines in Brunn, der naturfoschung Vereins, 1866, pp. 3–47. [2] T.H. Morgan, The Physical Basis of Heredity, J.B. Lippincott, 1919. [3] R.C. Elston, Segregation analysis, Adv. Hum. Genet. 11 (63–120) (1981) 372–373. Retrieved from, https://www. ncbi.nlm.nih.gov/pubmed/7023205. [4] A. Schnell, J. Witte, Family-Based Study Designs, 2008, pp. 19–28, https://doi.org/10.3109/9781420052923-3. [5] G.P. Jarvik, Complex segregation analyses: uses and limitations, Am. J. Hum. Genet. 63 (4) (1998) 942–946. Retrieved from, https://www.ncbi.nlm.nih.gov/pubmed/9758633https://doi.org/10.1086/302075.

I. Introduction to gene mapping

References

17

[6] N.E. Morton, S. Yee, R. Lew, Complex segregation analysis, Am. J. Hum. Genet. 23 (6) (1971) 602–611. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/5132068.
[7] N.E. Morton, C.J. MacLean, Analysis of family resemblance. 3. Complex segregation of quantitative traits, Am. J. Hum. Genet. 26 (4) (1974) 489–503. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/4842773.
[8] J.M. Lalouel, D.C. Rao, N.E. Morton, R.C. Elston, A unified model for complex segregation analysis, Am. J. Hum. Genet. 35 (5) (1983) 816–826. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/6614001.
[9] R.C. Elston, K.C. Yelverton, General models for segregation analysis, Am. J. Hum. Genet. 27 (1) (1975) 31–45. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/1171617.
[10] G.D. Smith, S. Ebrahim, ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int. J. Epidemiol. 32 (1) (2003) 1–22.
[11] J.A. Wilson, The factor of hereditary in myopia, Br. Med. J. 2 (2800) (1914) 393–395. Retrieved from https://www.jstor.org/stable/25311034.
[12] A.C. Bird, X-linked retinitis pigmentosa, Br. J. Ophthalmol. 59 (4) (1975) 177–199. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/1138842.
[13] J.A. Boughman, P.M. Conneally, W.E. Nance, Population genetic studies of retinitis pigmentosa, Am. J. Hum. Genet. 32 (2) (1980) 223–235. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/7386458.
[14] M. Jay, On the heredity of retinitis pigmentosa, Br. J. Ophthalmol. 66 (7) (1982) 405–416. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/7093178.
[15] I.M. Heiba, R.C. Elston, B.E. Klein, R. Klein, Sibling correlations and segregation analysis of age-related maculopathy: the Beaver Dam Eye Study, Genet. Epidemiol. 11 (1) (1994) 51–67. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/8013888. https://doi.org/10.1002/gepi.1370110106.
[16] P. Biswas, J.L. Duncan, B. Maranhao, I. Kozak, K. Branham, L. Gabriel, … R. Ayyagari, Genetic analysis of 10 pedigrees with inherited retinal degeneration by exome sequencing and phenotype-genotype association, Physiol. Genomics 49 (4) (2017) 216–229. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/28130426. https://doi.org/10.1152/physiolgenomics.00096.2016.
[17] P. Biswas, M.A. Naeem, M.H. Ali, M.Z. Assir, S.N. Khan, S. Riazuddin, … R. Ayyagari, Whole-exome sequencing identifies novel variants that co-segregates with autosomal recessive retinal degeneration in a Pakistani pedigree, Adv. Exp. Med. Biol. 1074 (2018) 219–228. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/29721947. https://doi.org/10.1007/978-3-319-75402-4_27.
[18] L. Bryant, O. Lozynska, A.M. Maguire, T.S. Aleman, J. Bennett, Prescreening whole exome sequencing results from patients with retinal degeneration for variants in genes associated with retinal degeneration, Clin. Ophthalmol. 12 (12) (2018) 49–63. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/29343940. https://doi.org/10.2147/OPTH.S147684.
[19] V. Gupta, B.I. Somarajan, S. Gupta, A.K. Chaurasia, S. Kumar, P. Dutta, … K. Nischal, The inheritance of juvenile onset primary open angle glaucoma, Clin. Genet. 92 (2) (2017) 134–142. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/27779752. https://doi.org/10.1111/cge.12906.
[20] A. Davidson, E. Borasio, V. Cipriani, P. Liskova, V. Plagnol, S.J. Tuft, A.J. Hardcastle, Identification and segregation analysis of rare ZNF469 coding variants in familial Keratoconus, Invest. Ophthalmol. Vis. Sci. 55 (13) (2014). Retrieved from WOS:000433205506032.
[21] R.C. Elston, Linkage and association, Genet. Epidemiol. 15 (6) (1998) 565–576. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/9811419. https://doi.org/10.1002/(SICI)1098-2272(1998)15:6<565::AID-GEPI2>3.0.CO;2-J.
[22] N.E. Morton, Sequential tests for the detection of linkage, Am. J. Hum. Genet. 7 (3) (1955) 277–318. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/13258560.
[23] R.C. Elston, Methods of linkage analysis—and the assumptions underlying them, Am. J. Hum. Genet. 63 (4) (1998) 931–934. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/9758631.
[24] J.L. Haines, M.A. Pericak-Vance, Genetic Analysis of Complex Diseases, second ed., John Wiley & Sons, 2005.
[25] E.S. Lander, N.J. Schork, Genetic dissection of complex traits, Science 265 (5181) (1994) 2037–2048. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/8091226.
[26] N. Risch, Linkage strategies for genetically complex traits. II. The power of affected relative pairs, Am. J. Hum. Genet. 46 (2) (1990) 229–241. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/2301393.
[27] N. Risch, Linkage strategies for genetically complex traits. III. The effect of marker polymorphism on analysis of affected relative pairs, Am. J. Hum. Genet. 46 (2) (1990) 242–253. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/2301394.

[28] E. Lander, L. Kruglyak, Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results, Nat. Genet. 11 (3) (1995) 241–247. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/7581446. https://doi.org/10.1038/ng1195-241.
[29] J. Ott, J. Wang, S.M. Leal, Genetic linkage analysis in the age of whole-genome sequencing, Nat. Rev. Genet. 16 (5) (2015) 275–284. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/25824869. https://doi.org/10.1038/nrg3908.
[30] J. McClellan, M.C. King, Genetic heterogeneity in human disease, Cell 141 (2) (2010) 210–217. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/20403315. https://doi.org/10.1016/j.cell.2010.03.032.
[31] M. Baron, The search for complex disease genes: fault by linkage or fault by association? Mol. Psychiatry 6 (2) (2001) 143–149. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/11317215. https://doi.org/10.1038/sj.mp.4000845.
[32] N. Risch, K. Merikangas, The future of genetic studies of complex human diseases, Science 273 (5281) (1996) 1516–1517. https://doi.org/10.1126/science.273.5281.1516.
[33] R. Mayeux, Mapping the new frontier: complex genetic disorders, J. Clin. Invest. 115 (6) (2005) 1404–1407. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/15931374. https://doi.org/10.1172/JCI25421.
[34] M. Boehnke, Limits of resolution of genetic linkage studies: implications for the positional cloning of human disease genes, Am. J. Hum. Genet. 55 (2) (1994) 379–390. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/8037215.
[35] M.L. Klein, D.W. Schultz, A. Edwards, T.C. Matise, K. Rust, C.B. Berselli, … T.S. Acott, Age-related macular degeneration. Clinical features in a large family and linkage to chromosome 1q, Arch. Ophthalmol. 116 (8) (1998) 1082–1088. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/9715689.
[36] S.K. Iyengar, D. Song, B.E. Klein, R. Klein, J.H. Schick, J. Humphrey, … R.C. Elston, Dissection of genomewide-scan data in extended families reveals a major locus and oligogenic susceptibility for age-related macular degeneration, Am. J. Hum. Genet. 74 (1) (2004) 20–39. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/14691731. https://doi.org/10.1086/380912.
[37] D.E. Weeks, Y.P. Conley, H.J. Tsai, T.S. Mah, S. Schmidt, E.A. Postel, … M.B. Gorin, Age-related maculopathy: a genomewide scan with continued evidence of susceptibility loci within the 1q31, 10q26, and 17q25 regions, Am. J. Hum. Genet. 75 (2) (2004) 174–189. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/15168325. https://doi.org/10.1086/422476.
[38] G.R. Abecasis, B.M. Yashar, Y. Zhao, N.M. Ghiasvand, S. Zareparsi, K.E. Branham, … A. Swaroop, Age-related macular degeneration: a high-resolution genome scan for susceptibility loci in a population enriched for late-stage disease, Am. J. Hum. Genet. 74 (3) (2004) 482–494. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/14968411. https://doi.org/10.1086/382786.
[39] J. Majewski, D.W. Schultz, R.G. Weleber, M.B. Schain, A.O. Edwards, T.C. Matise, … M.L. Klein, Age-related macular degeneration–a genome scan in extended families, Am. J. Hum. Genet. 73 (3) (2003) 540–550. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/12900797. https://doi.org/10.1086/377701.
[40] J.H. Schick, S.K. Iyengar, B.E. Klein, R. Klein, K. Reading, R. Liptak, … R.C. Elston, A whole-genome screen of a quantitative trait of age-related maculopathy in sibships from the Beaver Dam Eye Study, Am. J. Hum. Genet. 72 (6) (2003) 1412–1424. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/12717633. https://doi.org/10.1086/375500.
[41] J.M. Seddon, S.L. Santangelo, K. Book, S. Chong, J. Cote, A genomewide scan for age-related macular degeneration provides evidence for linkage to several chromosomal regions, Am. J. Hum. Genet. 73 (4) (2003) 780–790. Retrieved from WOS:000185676100006. https://doi.org/10.1086/378505.
[42] S.A. Fisher, G.R. Abecasis, B.M. Yashar, S. Zareparsi, A. Swaroop, S.K. Iyengar, … B.H. Weber, Meta-analysis of genome scans of age-related macular degeneration, Hum. Mol. Genet. 14 (15) (2005) 2257–2264. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/15987700. https://doi.org/10.1093/hmg/ddi230.
[43] J.D. Hoffman, J.N. Cooke Bailey, L. D’Aoust, W. Cade, J. Ayala-Haedo, D. Fuzzell, … J.L. Haines, Rare complement factor H variant associated with age-related macular degeneration in the Amish, Invest. Ophthalmol. Vis. Sci. 55 (7) (2014) 4455–4460. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/24906858. https://doi.org/10.1167/iovs.13-13684.
[44] E.M. Stone, J.H. Fingert, W.L.M. Alward, T.D. Nguyen, J.R. Polansky, S.L.F. Sunden, … V.C. Sheffield, Identification of a gene that causes primary open angle glaucoma, Science 275 (5300) (1997) 668–670. Retrieved from WOS:A1997WF07700040. https://doi.org/10.1126/science.275.5300.668.
[45] A.P. Klein, P. Duggal, K.E. Lee, C.Y. Cheng, R. Klein, J.E. Bailey-Wilson, B.E. Klein, Linkage analysis of quantitative refraction and refractive errors in the Beaver Dam Eye Study, Invest. Ophthalmol. Vis. Sci. 52 (8) (2011) 5220–5225. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/21571680. https://doi.org/10.1167/iovs.10-7096.
[46] D.C. Wang, X.Y. Pan, J.D. Ji, S. Gu, X.T. Sun, C. Jiang, … C. Zhao, A large family with inherited optic disc anomalies: a correlation between a new genetic locus and complex ocular phenotypes, Sci. Rep. 7 (2017). Retrieved from WOS:000407400500031. https://doi.org/10.1038/s41598-017-07730-7.
[47] G.M. Clarke, C.A. Anderson, F.H. Pettersson, L.R. Cardon, A.P. Morris, K.T. Zondervan, Basic statistical analysis in genetic case-control studies, Nat. Protoc. 6 (2) (2011) 121–133. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/21293453. https://doi.org/10.1038/nprot.2010.182.
[48] P.M. Visscher, N.R. Wray, Q. Zhang, P. Sklar, M.I. McCarthy, M.A. Brown, J. Yang, 10 Years of GWAS discovery: biology, function, and translation, Am. J. Hum. Genet. 101 (1) (2017) 5–22. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/28686856. https://doi.org/10.1016/j.ajhg.2017.06.005.
[49] N.J. Schork, S.S. Murray, K.A. Frazer, E.J. Topol, Common vs. rare allele hypotheses for complex diseases, Curr. Opin. Genet. Dev. 19 (3) (2009) 212–219. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/19481926. https://doi.org/10.1016/j.gde.2009.04.010.
[50] K. Ozaki, Y. Ohnishi, A. Iida, A. Sekine, R. Yamada, T. Tsunoda, … T. Tanaka, Functional SNPs in the lymphotoxin-alpha gene that are associated with susceptibility to myocardial infarction, Nat. Genet. 32 (4) (2002) 650–654. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/12426569. https://doi.org/10.1038/ng1047.
[51] R.J. Klein, C. Zeiss, E.Y. Chew, J.Y. Tsai, R.S. Sackler, C. Haynes, … J. Hoh, Complement factor H polymorphism in age-related macular degeneration, Science 308 (5720) (2005) 385–389. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/15761122. https://doi.org/10.1126/science.1109557.
[52] The Wellcome Trust Case Control Consortium, Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls, Nature 447 (7145) (2007) 661–678. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/17554300. https://doi.org/10.1038/nature05911.
[53] T. LaFramboise, Single nucleotide polymorphism arrays: a decade of biological, computational and technological advances, Nucleic Acids Res. 37 (13) (2009) 4181–4193. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/19570852. https://doi.org/10.1093/nar/gkp552.
[54] J.R. Black, S.J. Clark, Age-related macular degeneration: genome-wide association studies to translation, Genet. Med. 18 (4) (2016) 283–289. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/26020418. https://doi.org/10.1038/gim.2015.70.
[55] D.B. Goldstein, Common genetic variation and human traits, N. Engl. J. Med. 360 (17) (2009) 1696–1698. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/19369660. https://doi.org/10.1056/NEJMp0806284.
[56] T.A. Manolio, F.S. Collins, N.J. Cox, D.B. Goldstein, L.A. Hindorff, D.J. Hunter, … P.M. Visscher, Finding the missing heritability of complex diseases, Nature 461 (7265) (2009) 747–753. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/19812666. https://doi.org/10.1038/nature08494.
[57] H. Aschard, J.H. Kang, A.I. Iglesias, P. Hysi, J.N. Cooke Bailey, A.P. Khawaja, … L.R. Pasquale, Genetic correlations between intraocular pressure, blood pressure and primary open-angle glaucoma: a multi-cohort analysis, Eur. J. Hum. Genet. 25 (11) (2017) 1261–1267. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/28853718. https://doi.org/10.1038/ejhg.2017.136.
[58] J.N. Cooke Bailey, S.J. Loomis, J.H. Kang, R.R. Allingham, P. Gharahkhani, C.C. Khor, … Consortium, A, Genome-wide association analysis identifies TXNRD2, ATXN2 and FOXC1 as susceptibility loci for primary open-angle glaucoma, Nat. Genet. 48 (2) (2016) 189–194. Retrieved from WOS:000369043900017. https://doi.org/10.1038/ng.3482.
[59] L.G. Fritsche, W. Chen, M. Schu, B.L. Yaspan, Y. Yu, G. Thorleifsson, … Consortium, A. M. D. G, Seven new loci associated with age-related macular degeneration, Nat. Genet. 45 (4) (2013) 433–439. 439e431–432. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/23455636. https://doi.org/10.1038/ng.2578.
[60] L.G. Fritsche, W. Igl, J.N. Cooke Bailey, F. Grassmann, S. Sengupta, J.L. Bragg-Gresham, … I.M. Heid, A large genome-wide association study of age-related macular degeneration highlights contributions of rare and common variants, Nat. Genet. 48 (2) (2016) 134–143. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/26691988. https://doi.org/10.1038/ng.3448.
[61] A.P. Khawaja, J.N.C. Bailey, N.J. Wareham, R.A. Scott, M. Simcoe, R.P. Igo, … Consortium, N, Genome-wide analyses identify 68 new loci associated with intraocular pressure and improve risk prediction for primary open-angle glaucoma, Nat. Genet. 50 (6) (2018) 778–782. Retrieved from WOS:000433621000004. https://doi.org/10.1038/s41588-018-0126-8.
[62] Y. Lu, V. Vitart, K.P. Burdon, C.C. Khor, Y. Bykhovskaya, A. Mirshahi, … T.Y. Wong, Genome-wide association analyses identify multiple loci associated with central corneal thickness and keratoconus, Nat. Genet. 45 (2) (2013) 155–163. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/23291589. https://doi.org/10.1038/ng.2506.
[63] V.J. Verhoeven, P.G. Hysi, R. Wojciechowski, Q. Fan, J.A. Guggenheim, R. Hohn, … C.J. Hammond, Genome-wide meta-analyses of multiancestry cohorts identify multiple new susceptibility loci for refractive error and myopia, Nat. Genet. 45 (3) (2013) 314–318. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/23396134. https://doi.org/10.1038/ng.2554.
[64] C.J. Hoggart, T.G. Clark, M. De Iorio, J.C. Whittaker, D.J. Balding, Genome-wide significance for dense SNP and resequencing data, Genet. Epidemiol. 32 (2) (2008) 179–185. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/18200594. https://doi.org/10.1002/gepi.20292.
[65] M.I. McCarthy, G.R. Abecasis, L.R. Cardon, D.B. Goldstein, J. Little, J.P. Ioannidis, J.N. Hirschhorn, Genome-wide association studies for complex traits: consensus, uncertainty and challenges, Nat. Rev. Genet. 9 (5) (2008) 356–369. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/18398418. https://doi.org/10.1038/nrg2344.
[66] The International HapMap Consortium, A haplotype map of the human genome, Nature 437 (7063) (2005) 1299–1320. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/16255080. https://doi.org/10.1038/nature04226.
[67] A.O. Edwards, R. Ritter 3rd, K.J. Abel, A. Manning, C. Panhuysen, L.A. Farrer, Complement factor H polymorphism and age-related macular degeneration, Science 308 (5720) (2005) 421–424. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/15761121. https://doi.org/10.1126/science.1110189.
[68] G.S. Hageman, D.H. Anderson, L.V. Johnson, L.S. Hancox, A.J. Taiber, L.I. Hardisty, … R. Allikmets, A common haplotype in the complement regulatory gene factor H (HF1/CFH) predisposes individuals to age-related macular degeneration, Proc. Natl. Acad. Sci. U. S. A. 102 (20) (2005) 7227–7232. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/15870199. https://doi.org/10.1073/pnas.0501536102.
[69] J.L. Haines, M.A. Hauser, S. Schmidt, W.K. Scott, L.M. Olson, P. Gallins, … M.A. Pericak-Vance, Complement factor H variant increases the risk of age-related macular degeneration, Science 308 (5720) (2005) 419–421. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/15761120. https://doi.org/10.1126/science.1110359.
[70] S. Arakawa, A. Takahashi, K. Ashikawa, N. Hosono, T. Aoi, M. Yasuda, … M. Kubo, Genome-wide association study identifies two susceptibility loci for exudative age-related macular degeneration in the Japanese population, Nat. Genet. 43 (10) (2011) 1001–1004. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/21909106. https://doi.org/10.1038/ng.938.
[71] W. Chen, D. Stambolian, A.O. Edwards, K.E. Branham, M. Othman, J. Jakobsdottir, … A. Swaroop, Genetic variants near TIMP3 and high-density lipoprotein-associated loci influence susceptibility to age-related macular degeneration, Proc. Natl. Acad. Sci. U. S. A. 107 (16) (2010) 7401–7406. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/20385819. https://doi.org/10.1073/pnas.0912702107.
[72] J.A. Fagerness, J.B. Maller, B.M. Neale, R.C. Reynolds, M.J. Daly, J.M. Seddon, Variation near complement factor I is associated with risk of advanced AMD, Eur. J. Hum. Genet. 17 (1) (2009) 100–104. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/18685559. https://doi.org/10.1038/ejhg.2008.140.
[73] Y. Yu, T.R. Bhangale, J. Fagerness, S. Ripke, G. Thorleifsson, P.L. Tan, … J.M. Seddon, Common variants near FRK/COL10A1 and VEGFA are associated with advanced age-related macular degeneration, Hum. Mol. Genet. 20 (18) (2011) 3699–3709. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/21665990. https://doi.org/10.1093/hmg/ddr270.
[74] S. Raychaudhuri, O. Iartchouk, K. Chin, P.L. Tan, A.K. Tai, S. Ripke, … J.M. Seddon, A rare penetrant mutation in CFH confers high risk of age-related macular degeneration, Nat. Genet. 43 (12) (2011) 1232–1236. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/22019782. https://doi.org/10.1038/ng.976.
[75] J.M. Seddon, Y. Yu, E.C. Miller, R. Reynolds, P.L. Tan, S. Gowrisankar, … S. Raychaudhuri, Rare variants in CFI, C3 and C9 are associated with high risk of advanced age-related macular degeneration, Nat. Genet. 45 (11) (2013) 1366–1370. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/24036952. https://doi.org/10.1038/ng.2741.
[76] S.S. Verma, J.N. Cooke Bailey, A. Lucas, Y. Bradford, J.G. Linneman, M.A. Hauser, … Consortium, N, Epistatic gene-based interaction analyses for glaucoma in eMERGE and NEIGHBOR consortium, PLoS Genet. 12 (9) (2016) e1006186. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/27623284. https://doi.org/10.1371/journal.pgen.1006186.
[77] K. Abu-Amero, A.A. Kondkar, K.V. Chalam, An updated review on the genetics of primary open angle glaucoma, Int. J. Mol. Sci. 16 (12) (2015) 28886–28911. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/26690118. https://doi.org/10.3390/ijms161226135.

[78] R. Sofat, J.P. Casas, A.R. Webster, A.C. Bird, S.S. Mann, J.R. Yates, … A.D. Hingorani, Complement factor H genetic variant and age-related macular degeneration: effect size, modifiers and relationship to disease subtype, Int. J. Epidemiol. 41 (1) (2012) 250–262. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/22253316. https://doi.org/10.1093/ije/dyr204.
[79] F. Sanger, S. Nicklen, A.R. Coulson, DNA sequencing with chain-terminating inhibitors, Proc. Natl. Acad. Sci. U. S. A. 74 (12) (1977) 5463–5467. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/271968.
[80] S. Anderson, Shotgun DNA sequencing using cloned DNase I-generated fragments, Nucleic Acids Res. 9 (13) (1981) 3015–3027. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/6269069.
[81] E.S. Lander, L.M. Linton, B. Birren, C. Nusbaum, M.C. Zody, J. Baldwin, … International Human Genome Sequencing Consortium, Initial sequencing and analysis of the human genome, Nature 409 (6822) (2001) 860–921. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/11237011. https://doi.org/10.1038/35057062.
[82] J.A. Schloss, How to get genomes at one ten-thousandth the cost, Nat. Biotechnol. 26 (10) (2008) 1113–1115. Retrieved from WOS:000259926000025. https://doi.org/10.1038/nbt1008-1113.
[83] J.C. Venter, M.D. Adams, E.W. Myers, P.W. Li, R.J. Mural, G.G. Sutton, … X. Zhu, The sequence of the human genome, Science 291 (5507) (2001) 1304–1351. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/11181995. https://doi.org/10.1126/science.1058040.
[83a] M. Margulies, M. Egholm, W.E. Altman, S. Attiya, J.S. Bader, L.A. Bemben, … J.M. Rothberg, Genome sequencing in microfabricated high-density picolitre reactors, Nature 437 (7057) (2005) 376–380. https://doi.org/10.1038/nature03959.
[83b] J.C. Dohm, C. Lottaz, T. Borodina, H. Himmelbauer, Substantial biases in ultra-short read data sets from high-throughput DNA sequencing, Nucleic Acids Res. 36 (16) (2008) e105. https://doi.org/10.1093/nar/gkn425.
[83c] J. Guo, N. Xu, Z. Li, S. Zhang, J. Wu, D.H. Kim, … J. Ju, Four-color DNA sequencing with 3’-O-modified nucleotide reversible terminators and chemically cleavable fluorescent dideoxynucleotides, Proc. Natl. Acad. Sci. U. S. A. 105 (27) (2008) 9145–9150. https://doi.org/10.1073/pnas.0804023105.
[84] L. Liu, Y.H. Li, S.L. Li, N. Hu, Y.M. He, R. Pong, … M. Law, Comparison of next-generation sequencing systems, J. Biomed. Biotechnol. (2012). Retrieved from WOS:000307669100001. https://doi.org/10.1155/2012/251364.
[84a] A. Valouev, J. Ichikawa, T. Tonthat, J. Stuart, S. Ranade, H. Peckham, … S.M. Johnson, A high-resolution, nucleosome position map of C. elegans reveals a lack of universal sequence-dictated positioning, Genome Res. 18 (7) (2008) 1051–1063. https://doi.org/10.1101/gr.076463.108.
[85] Y.F. Huang, S.C. Chen, Y.S. Chiang, T.H. Chen, K.P. Chiu, Palindromic sequence impedes sequencing-by-ligation mechanism, BMC Syst. Biol. 6 (Suppl 2) (2012) S10. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/23281822. https://doi.org/10.1186/1752-0509-6-S2-S10.
[85a] A. Mellmann, D. Harmsen, C.A. Cummings, E.B. Zentz, S.R. Leopold, A. Rico, … H. Karch, Prospective genomic characterization of the German enterohemorrhagic Escherichia coli O104:H4 outbreak by rapid next generation sequencing technology, PLoS One 6 (7) (2011) e22751. https://doi.org/10.1371/journal.pone.0022751.
[85b] J.M. Rothberg, W. Hinz, T.M. Rearick, J. Schultz, W. Mileski, M. Davey, … J. Bustillo, An integrated semiconductor device enabling non-optical genome sequencing, Nature 475 (7356) (2011) 348–352. https://doi.org/10.1038/nature10242.
[85c] S.S. Ajay, S.C. Parker, H.O. Abaan, K.V. Fajardo, E.H. Margulies, Accurate and comprehensive sequencing of personal genomes, Genome Res. 21 (9) (2011) 1498–1505. https://doi.org/10.1101/gr.123638.111.
[85d] E. Borgstrom, S. Lundin, J. Lundeberg, Large scale library generation for high throughput sequencing, PLoS One 6 (4) (2011) e19119. https://doi.org/10.1371/journal.pone.0019119.
[85e] L. Liu, N. Hu, B. Wang, M. Chen, J. Wang, Z. Tian, … D. Lin, A brief utilization report on the Illumina HiSeq 2000 sequencer, Mycology 2 (3) (2011) 169–191. https://doi.org/10.1080/21501203.2011.615871.
[85f] O. Harismendy, R.B. Schwab, L. Bao, J. Olson, S. Rozenzhak, S.K. Kotsopoulos, … K.A. Frazer, Detection of low prevalence somatic mutations in solid tumors with ultra-deep targeted sequencing, Genome Biol. 12 (12) (2011) R124. https://doi.org/10.1186/gb-2011-12-12-r124.
[85g] N.J. Loman, R.V. Misra, T.J. Dallman, C. Constantinidou, S.E. Gharbia, J. Wain, M.J. Pallen, Performance comparison of benchtop high-throughput sequencing platforms, Nat. Biotechnol. 30 (5) (2012) 434–439. https://doi.org/10.1038/nbt.2198.
[86] M.O. Carneiro, C. Russ, M.G. Ross, S.B. Gabriel, C. Nusbaum, M.A. DePristo, Pacific biosciences sequencing technology for genotyping and variation discovery in human data, BMC Genomics 13 (2012) 375. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/22863213. https://doi.org/10.1186/1471-2164-13-375.

[87] Z. Feng, G. Fang, J. Korlach, T. Clark, K. Luong, X. Zhang, … E. Schadt, Detecting DNA modifications from SMRT sequencing data by modeling sequence context dependence of polymerase kinetic, PLoS Comput. Biol. 9 (3) (2013) e1002935. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/23516341. https://doi.org/10.1371/journal.pcbi.1002935.
[88] B.A. Flusberg, D.R. Webster, J.H. Lee, K.J. Travers, E.C. Olivares, T.A. Clark, S.W. Turner, Direct detection of DNA methylation during single-molecule, real-time sequencing, Nat. Methods 7 (6) (2010) 461–465. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/20453866. https://doi.org/10.1038/nmeth.1459.
[89] P.M. Ashton, S. Nair, T. Dallman, S. Rubino, W. Rabsch, S. Mwaigwisya, … J. O’Grady, MinION nanopore sequencing identifies the position and structure of a bacterial antibiotic resistance island, Nat. Biotechnol. 33 (3) (2015) 296–300. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/25485618. https://doi.org/10.1038/nbt.3103.
[90] M. Jain, I.T. Fiddes, K.H. Miga, H.E. Olsen, B. Paten, M. Akeson, Improved data analysis for the MinION nanopore sequencer, Nat. Methods 12 (4) (2015) 351–356. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/25686389. https://doi.org/10.1038/nmeth.3290.
[91] Y. Wang, Q. Yang, Z. Wang, The evolution of nanopore sequencing, Front. Genet. 5 (2014) 449. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/25610451. https://doi.org/10.3389/fgene.2014.00449.
[92] W. Zhang, N. Huang, J. Zheng, X. Liao, J. Wang, H.D. Li, A sequence-based novel approach for quality evaluation of third-generation sequencing reads, Genes (Basel) 10 (1) (2019). Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/30646604. https://doi.org/10.3390/genes10010044.
[93] J.K. Teer, J.C. Mullikin, Exome sequencing: the sweet spot before whole genomes, Hum. Mol. Genet. 19 (R2) (2010) R145–R151. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/20705737. https://doi.org/10.1093/hmg/ddq333.
[94] S.B. Ng, E.H. Turner, P.D. Robertson, S.D. Flygare, A.W. Bigham, C. Lee, … J. Shendure, Targeted capture and massively parallel sequencing of 12 human exomes, Nature 461 (7261) (2009) 272–276. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/19684571. https://doi.org/10.1038/nature08250.
[95] M. Choi, U.I. Scholl, W. Ji, T. Liu, I.R. Tikhonova, P. Zumbo, … R.P. Lifton, Genetic diagnosis by whole exome capture and massively parallel DNA sequencing, Proc. Natl. Acad. Sci. U. S. A. 106 (45) (2009) 19096–19101. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/19861545. https://doi.org/10.1073/pnas.0910672106.
[96] S.B. Ng, K.J. Buckingham, C. Lee, A.W. Bigham, H.K. Tabor, K.M. Dent, … M.J. Bamshad, Exome sequencing identifies the cause of a mendelian disorder, Nat. Genet. 42 (1) (2010) 30–35. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/19915526. https://doi.org/10.1038/ng.499.
[97] S. Zuchner, J. Dallman, R. Wen, G. Beecham, A. Naj, A. Farooq, … M.A. Pericak-Vance, Whole-exome sequencing links a variant in DHDDS to retinitis pigmentosa, Am. J. Hum. Genet. 88 (2) (2011) 201–206. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/21295283. https://doi.org/10.1016/j.ajhg.2011.01.001.
[98] J.M. Ellingford, S. Barton, S. Bhaskar, S.G. Williams, P.I. Sergouniotis, J. O’Sullivan, … G.C. Black, Whole genome sequencing increases molecular diagnostic yield compared with current diagnostic testing for inherited retinal disease, Ophthalmology 123 (5) (2016) 1143–1150. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/26872967. https://doi.org/10.1016/j.ophtha.2016.01.009.
[99] B. Royer-Bertrand, M. Torsello, D. Rimoldi, I. El Zaoui, K. Cisarova, R. Pescini-Gobert, … C. Rivolta, Comprehensive genetic landscape of uveal melanoma by whole-genome sequencing, Am. J. Hum. Genet. 99 (5) (2016) 1190–1198. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/27745836. https://doi.org/10.1016/j.ajhg.2016.09.008.
[100] A. Hunter, P.A. Spechler, A. Cwanger, Y. Song, Z. Zhang, G.S. Ying, … J.L. Dunaief, DNA methylation is associated with altered gene expression in AMD, Invest. Ophthalmol. Vis. Sci. 53 (4) (2012) 2089–2105. Retrieved from WOS:000303669400046. https://doi.org/10.1167/iovs.11-8449.
[101] V.F. Oliver, A.E. Jaffe, J. Song, G. Wang, P. Zhang, K.E. Branham, … S.L. Merbs, Differential DNA methylation identified in the blood and retina of AMD patients, Epigenetics 10 (8) (2015) 698–707. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/26067391. https://doi.org/10.1080/15592294.2015.1060388.
[102] D.S. Johnson, A. Mortazavi, R.M. Myers, B. Wold, Genome-wide mapping of in vivo protein-DNA interactions, Science 316 (5830) (2007) 1497–1502. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/17540862. https://doi.org/10.1126/science.1141319.

[103] A.P. Boyle, S. Davis, H.P. Shulha, P. Meltzer, E.H. Margulies, Z. Weng, … G.E. Crawford, High-resolution mapping and characterization of open chromatin across the genome, Cell 132 (2) (2008) 311–322. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/18243105. https://doi.org/10.1016/j.cell.2007.12.014.
[104] J.D. Buenrostro, P.G. Giresi, L.C. Zaba, H.Y. Chang, W.J. Greenleaf, Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position, Nat. Methods 10 (12) (2013) 1213–1218. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/24097267. https://doi.org/10.1038/nmeth.2688.
[105] M.R. Corces, A.E. Trevino, E.G. Hamilton, P.G. Greenside, N.A. Sinnott-Armstrong, S. Vesuna, … H.Y. Chang, An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues, Nat. Methods 14 (10) (2017) 959–962. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/28846090. https://doi.org/10.1038/nmeth.4396.
[106] J. Wang, C. Zibetti, P. Shang, S.R. Sripathi, P. Zhang, M. Cano, … J. Qian, ATAC-Seq analysis reveals a widespread decrease of chromatin accessibility in age-related macular degeneration, Nat. Commun. 9 (1) (2018) 1364. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/29636475. https://doi.org/10.1038/s41467-018-03856-y.
[107] J. Dekker, K. Rippe, M. Dekker, N. Kleckner, Capturing chromosome conformation, Science 295 (5558) (2002) 1306–1311. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/11847345. https://doi.org/10.1126/science.1067799.
[108] J. Dostie, T.A. Richmond, R.A. Arnaout, R.R. Selzer, W.L. Lee, T.A. Honan, … J. Dekker, Chromosome conformation capture carbon copy (5C): a massively parallel solution for mapping interactions between genomic elements, Genome Res. 16 (10) (2006) 1299–1309. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/16954542. https://doi.org/10.1101/gr.5571506.
[109] E. Lieberman-Aiden, N.L. van Berkum, L. Williams, M. Imakaev, T. Ragoczy, A. Telling, … J. Dekker, Comprehensive mapping of long-range interactions reveals folding principles of the human genome, Science 326 (5950) (2009) 289–293. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/19815776. https://doi.org/10.1126/science.1181369.
[110] M. Simonis, P. Klous, E. Splinter, Y. Moshkin, R. Willemsen, E. de Wit, … W. de Laat, Nuclear organization of active and inactive chromatin domains uncovered by chromosome conformation capture-on-chip (4C), Nat. Genet. 38 (11) (2006) 1348–1354. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/17033623. https://doi.org/10.1038/ng1896.
[111] Y. Chu, D.R. Corey, RNA sequencing: platform selection, experimental design, and data interpretation, Nucleic Acid Ther. 22 (4) (2012) 271–274. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/22830413. https://doi.org/10.1089/nat.2012.0367.
[112] B.A. Rheaume, A. Jereen, M. Bolisetty, M.S. Sajid, Y. Yang, K. Renna, … E.F. Trakhtenberg, Single cell transcriptome profiling of retinal ganglion cells identifies cellular subtypes, Nat. Commun. 9 (2018). Retrieved from WOS:000438856500001. https://doi.org/10.1038/s41467-018-05134-3.
[113] J.A. Reuter, D.V. Spacek, M.P. Snyder, High-throughput sequencing technologies, Mol. Cell 58 (4) (2015) 586–597. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/26000844. https://doi.org/10.1016/j.molcel.2015.05.004.
[114] J. Huddleston, M.J.P. Chaisson, K.M. Steinberg, W. Warren, K. Hoekzema, D. Gordon, … E.E. Eichler, Discovery and genotyping of structural variation from long-read haploid genome sequence data, Genome Res. 27 (5) (2017) 677–685. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/27895111. https://doi.org/10.1101/gr.214007.116.
[115] International Human Genome Sequencing Consortium, Finishing the euchromatic sequence of the human genome, Nature 431 (7011) (2004) 931–945. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/15496913. https://doi.org/10.1038/nature03001.
[116] C.A. Scacheri, P.C. Scacheri, Mutations in the noncoding genome, Curr. Opin. Pediatr. 27 (6) (2015) 659–664. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/26382709. https://doi.org/10.1097/MOP.0000000000000283.
[117] R. Bumgarner, Overview of DNA microarrays: types, applications, and their future, Curr. Protoc. Mol. Biol. Chapter 22 (2013) Unit 22.1. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/23288464. https://doi.org/10.1002/0471142727.mb2201s101.
