Origin of the PSEN1 E280A mutation causing early-onset Alzheimer's disease

Origin of the PSEN1 E280A mutation causing early-onset Alzheimer's disease

Alzheimer’s & Dementia - (2013) 1–7 Research Article Origin of the PSEN1 E280A mutation causing early-onset Alzheimer’s disease Matthew A. Lallia, H...

2MB Sizes 55 Downloads 52 Views

Alzheimer’s & Dementia - (2013) 1–7

Research Article

Origin of the PSEN1 E280A mutation causing early-onset Alzheimer’s disease Matthew A. Lallia, Hannah C. Coxb, Mary L. Arcilaa, Liliana Cadavidc, Sonia Morenoc, Gloria Garciac, Lucia Madrigalc, Eric M. Reimand, Mauricio Arcos-Burgosc,e, Gabriel Bedoyaf, Mary E. Brunkowb, Gustavo Glusmanb, Jared C. Roachb, Leroy Hoodb, Kenneth S. Kosika,*, Francisco Loperac a

Neuroscience Research Institute, Department of Molecular Cellular Developmental Biology, University of California, Santa Barbara, CA, USA b Institute for Systems Biology, Seattle, WA, USA c Grupo de Neurociencias de Antioquia, Universidad de Antioquia, Medellın, Colombia d Banner Alzheimer’s Institute, Phoenix, AZ, USA e Translational Genomics Group, Genome Biology Department, John Curtin School of Medical Research, ANU College of Medicine, Biology and Environment, The Australian National University, Canberra, ACT, Australia f Grupo de Genetica Molecular (GENMOL), Universidad de Antioquia, Medellın, Colombia

Abstract

Background: A mutation in presenilin 1 (E280A) causes early-onset Alzheimer’s disease. Understanding the origin of this mutation will inform medical genetics. Methods: We sequenced the genomes of 102 individuals from Antioquia, Colombia. We applied identity-by-descent analysis to identify regions of common ancestry. We estimated the age of the E280A mutation and the local ancestry of the haplotype harboring this mutation. Results: All affected individuals share a minimal haplotype of 1.8 Mb containing E280A. We estimate a time to most recent common ancestor of E280A of 10 (95% credible interval, 7.2–12.6) generations. We date the de novo mutation event to 15 (95% credible interval, 11–25) generations ago. We infer a western European geographic origin of the shared haplotype. Conclusions: The age and geographic origin of E280A are consistent with a single founder dating from the time of the Spanish Conquistadors who began colonizing Colombia during the early 16th century. Ó 2013 The Alzheimer’s Association. All rights reserved.

Keywords:

Alzheimer’s disease; PSEN1; population genetics; whole-genome sequencing

1. Introduction A glutamic acid-to-alanine mutation at codon 280 (E280A) in the presenilin 1 gene (PSEN1) causing earlyonset familial Alzheimer’s disease (EOFAD) at a mean age of 49 years [1] and displaying a fully penetrant autosomal dominant transmission was first discovered in a kindred in the state of Antioquia, Colombia [2,3]. Mutations in PSEN1 are the most common cause of autosomal dominant

Conflicts of Interest: The authors have none to declare. *Corresponding author. Tel: 11-805-893-5222; Fax: 11-805-893-2017. E-mail address: [email protected]

Alzheimer’s disease, with w185 mutations affecting more than 400 families [4]. Almost 70% of pathological mutations in PSEN1 occur in exons 5 to 8 [4] and act through altering g-secretase activity and the production of amyloid-b peptide Ab42 [5]. The Antioquian kindred, with more than 5000 members, is the largest in the world, and all affected individuals have the same mutation, present the same phenotype, come from the same geographical region, and share common last names [6]. Recent whole-genome sequencing of six affected individuals from this kindred revealed an extensive stretch of identity-by-descent (IBD) containing the causal E280A mutation, suggesting the introduction of this allele into the

1552-5260/$ - see front matter Ó 2013 The Alzheimer’s Association. All rights reserved. http://dx.doi.org/10.1016/j.jalz.2013.09.005

M.A. Lalli et al. / Alzheimer’s & Dementia - (2013) 1–7

2

population through a recent common founder event [7]. In this study, we investigate a large sample of subjects from this extended kindred through whole-genome sequencing. Our goals were (1) to determine the extent of IBD surrounding the E280A, (2) to estimate the age of the E280A mutation, and (3) to determine on which genetic background the mutation arose. 2. Methods 2.1. Participants Nearly 1000 individuals in the extended kindred have been identified and grouped into 24 subpedigrees (Fig. 1A). Each pedigree was assigned an identification code consisting of the letter C (referring to Colombia), followed by a number indicating the order in which they were found. To assemble these 24 pedigrees and find the possible first founder of E280A, we used several strategies: interviews with elder healthy individuals of each affected family, interviews with genealogists and historians of the region, study of local historical and genealogical books, and examination of last wills and ecclesiastical records (baptismal, confirmations, marriage, death, census, and so forth) dating back as far as 1540. At the end of this research, historical records suggested that the affected population has a common ancestor from Spain [6], and we have assembled a large pedigree showing common ancestry for 13 families (Fig. 1B). Individuals in the extended kindred are enrolled in the Colombian Alzheimer’s Prevention Initiative registry. This registry comprises more than 2700 living members from

C6

C4

the Antioquian EOFAD kindred. Approximately 30% of these members carry the PSEN1 E280A mutation. In total, this kindred is estimated to have w5000 living relatives, including w1500 mutation carriers. Enrollment in this study was pursuant to approval by the institutional review board at the Universidad de Antioquia and the Western Institutional Review Board, Institute for Systems Biology. Additional descriptions of the PSEN1 E280A Antioquian population, study participants, and relevant procedures have been provided previously and extensively [1]. 2.2. Whole-genome sequencing Including six genomes from our previous study, DNA was collected for 102 samples in the Antioquian EOFAD cohort, and sequencing was performed by Complete Genomics Inc. (CGI, Mountain View, CA) using unchained combinatorial probe anchor chemistry [8]. Of the 102 samples, 74 individuals harbored the E280A mutation. Our study analyzes chromosome 14 (n 5 204) and includes 75 chromosomes bearing the E280A mutation (one subject was homozygous for the mutation). Reads were aligned to the reference genome (National Center for Biotechnology Information Build 37, Genome Reference Consortium Human Build 37) and variants were called by CGI. The average depth of sequencing coverage across chromosome 14 was more than 50, allowing for sensitive and accurate single nucleotide polymorphism detection. Previous studies estimate the error rate of CGI sequencing methods to be w1 ! 1025 [9]. Only variants with high confidence scores assigned by CGI were included in our analysis.

1745

Yarumal C3, C12 C14, C15

1770 C7, C17

1790

C11, C18, C22 C13

1812

C1 C19 C9, C16 Medellín C23

1850 C2, C5, C10, C21, C25

1890

C8, C24

1910 1930

Fig. 1. Geographic map of E280A carriers and a pedigree reconstructed from historical and genealogical accounts. (A) Twenty-four families affected with early-onset familial Alzheimer’s disease caused by E280A are highlighted on the map of the Department of Antioquia, Colombia. (B) For 13 of the recruited families with E280A, genealogical records demonstrate a common pair of founders in 1745. The living members of this family are distributed throughout most of the regions indicated in view A, consistent with the hypothesis that all affected individuals in Antioquia share an even more distant common founder with a similar dispersal pattern of descendants.

M.A. Lalli et al. / Alzheimer’s & Dementia - (2013) 1–7

2.3. Haplotype reconstruction The presence of an extensive region of IBD around PSEN1 common in all carriers of E280A implies that the E280A allele resides on this haplotype. This allows us to reconstruct an extended haplotype using systematic long-range phasing (SLRP v1.0) [10], which is ideal for use in a recent founder population such as the Antioquian EOFAD cohort. 2.4. Estimating the age of E280A After reconstructing haplotype phase using systematic long-range phasing, we selected informative markers in a w30-Mb region centered on E280A to estimate the age of E280A. We used two independent methods to determine the age of the E280A allele: a single-marker method [11,12] and the coalescent modeling of intra-allelic variability by DMLE12.2 software (DMLE) [13]. The singlemarker method is based on the expected exponential decay of a given haplotype over generation time and it estimates the time since the most recent common ancestor (TMRCA) for a given mutation. In contrast, DMLE estimates the actual age of the de novo mutation event, thus we expect an older estimated age from DMLE. The single-marker method, unlike DMLE, fails to account for population growth rate, which leads to a further underestimation of the mutation age. In addition, the superexponential growth of the study population will lead to a further underestimation of the mutation age. We estimated the mutation age in generations (g) with a single-marker method using the following formula [11]:   d 2pn ln p12p n g5

lnð12qÞ

3

Here, pd is the frequency of a given marker on the haplotype bearing the E280A allele, and pn is the overall frequency of a given marker in the sample population (n 5 204 chromosomes). q is the recombination rate between a given marker and E280A. For this equation to work, pd must be greater than pn. Physical positions (base pair) were converted to genetic distances (centi-Morgans) using the genetic map obtained from the HapMap2 database lifted over to GRCh37 [14]. Physical positions absent from this map were interpolated. Because our age estimates are based on the breakdown of linkage disequilibrium resulting from recombination, we identified recombination breakpoints around the mutation and sorted these by position (Fig. E1). Constraining our marker selection to IBD segments, we ensure that pd will be greater than pn. Because many haplotypes in our study share the same end points, we chose markers that distinguish haplotypes of different lengths to capture the extent of haplotype diversity in our data set. These markers are listed in Table 1. We then used a second method, modeling the intra-allelic coalescent using DMLE [13], to estimate the age of the mutation. Haplotypes for disease and nondisease chromosomes were constructed with the markers used in the previous section. Growth rate per generation was estimated by   ln PPot Growth Rategen 5 ; g where, Pt is the current population size, Po is the initial population size, and g is the number of generations between the current population size and the population size at the moment of mutation origin (assuming 25 years

:

Table 1 Single marker estimates of time to most recent common ancestor of E280A Marker

pd*

pny

qz

Generations

Agex

rs1504606 rs11850859 rs61992164 rs191808201 rs1106494 rs55921409 rs2332457 rs8020803 E280A rs205813 rs117346379 rs1652593 rs61985647 rs61988101 rs36038796

0.68 0.73 0.39 0.35 0.81 0.89 0.87 0.89 — 0.79 0.72 0.67 0.57 0.53 0.47

0.41 0.57 0.36 0.17 0.59 0.61 0.62 0.5 — 0.5 0.31 0.57 0.28 0.34 0.36

0.15 0.13 0.12 0.12 0.10 0.07 0.03 0.03 — 0.05 0.07 0.10 0.12 0.14 0.17

4.8 7.2 24.1 12.0 5.9 4.4 15.3 8.6 — 10.6 7.5 14.2 7.0 8.2 9.5

120 180 600 300 150 110 380 210 — 270 190 350 180 210 240

*Frequency of a marker on a haplotype bearing E280A. y Overall frequency of a marker in the sample population. z Recombination rate between a given marker and E280A. x Conversion from generation to age uses a rate of 25 years per generation. Age was rounded to the nearest decade.

4

M.A. Lalli et al. / Alzheimer’s & Dementia - (2013) 1–7

Fig. 2. Minimal haplotype containing E280A. All carriers of E280A share a minimal common haplotype spanning 1.8 Mb surrounding presenilin 1 (PSEN1) E280A from rs2158987 to rs10135303 (chromosome 14: 72696625–74484310). Positions are given in GRCh37 coordinates. For a complete list of the single nucleotide polymorphisms on this haplotype, see Table E1.

3.1. Haplotype reconstruction

2000

Reconstructing variant phase reveals a common extended haplotype among all carriers of the E280A mutation comprising 580 single nucleotide polymorphisms and spanning a minimum interval of 1.8 Mb from rs2158987 to rs10135303, and extending through approximately 20 genes (Fig. 2, Table E1). None of the 129 chromosomes lacking E280A share this distinct segment of IBD. Some individuals share a common extended

1000

1500

95% CI

No. of simulations

We constructed reference panels from HapMap [14], the 1000 Genomes Project [15], and the Human Genome Diversity Panel (HGDP) [16,17]. To infer the locus-specific ancestry along the disease-containing chromosomes in our data set, we used two state-of-the-art algorithms: LAMPLD (v1.0) [18] and MULTIMIX (v1.1.0) [19]. Both programs are well suited to detect local ancestry from three reference populations and are validated on Latino populations. We used SHAPEIT2 (v2.r644) [20] to phase the HGDP data set to create reference ancestry panels as input for these algorithms. To resolve the continental origin of E280A, we constructed a reference panel for each of the three continents of potential origin. The Native American reference panel comprises the five Amerindian populations of the HGDP. The African panel contains three West African populations, including the HGDP Yoruban and Mandenka, and the Yoruba in Ibadan, Nigeria from the 1000 Genomes Project. We selected the Utah residents with northern and western European ancestry to represent European ancestry. We estimated the continental background of each haplotype spanning E280A in our data set, including those containing the mutation (E280A1) and those without the mutation (E280A2). To resolve on which specific European background E280A arose, we constructed additional reference ancestry

3. Results

500

2.5. Local ancestry estimation

panels from the HGDP and the 1000 Genomes Project to represent northern, western, and southern Europe.

0

per generation). As an admixture of Native American, European, and African populations, the Antioquian E280A kindred has a historical maximum g of w20 since the advent of Europeans to the New World. We estimate an initial population between 2 to 100 founders, and a present-day size of around 5000. This yields a superexponential generational population growth rate of w0.2 to 0.4. We include 75 of 1500 disease-bearing chromosomes or w5% of all carriers. We ran DMLE12.2 with 100,000 burn-in iterations followed by 100,000 simulations of the mutation age.

10

15

20

25

30

35

Mutation Age (g)

Fig. 3. Estimation of mutation age by DMLE12.2 software. After 100,000 burn-in runs, the age of the mutation was simulated 100,000 times by DMLE12.2 software. Mutation age is estimated in generation time (g). A 95% credible interval (CI) is shaded between 11 and 24 generations. The red line indicates the maximum likelihood estimated age of the mutation at 15.3 generations.

M.A. Lalli et al. / Alzheimer’s & Dementia - (2013) 1–7

A

Native American

5

B

Native American

European

European

African

African

CEU

1

Proportion estimated ancestry

0.9

HGDP Pima HGDP Maya

HGDP Mandenka HGDP Yoruba YRI

HGDP Colombia HGDP Karitiana HGDP Surui

0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0

LAMP-LD

C

North European

FIN

West European

MULTIMIX

D

North European West European

HGDP Orcadian

South European

South European

1

GBR

HGDP French HGDP Bergamo

TSI HGDP Tuscan IBS

Proportion estimated ancestry

0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 HGDP Sardinian

C.

0 LAMP-LD

MULTIMIX

Fig. 4. (A–D) Geographic origin of the haplotype spanning E280A. Using the ancestry reference panel comprising the populations depicted in (A), LAMP-LD and MULTIMIX estimate unanimously a European ancestry of E280A in (B) for the haplotypes containing E280A (E280A1). Continental ancestry estimation is also shown for haplotypes spanning the region that do not contain the mutation (E280A2). Similarly, using the reference ancestry panels depicted (C), both algorithms strongly support a western European origin for haplotypes bearing E280A (D). CEU, Utah residents (CEPH) with northern and western European ancestry; FIN, Finnish in Finland; GBR, British in England and Scotland; HGDP, Human Genome Diversity Project; IBS, Iberian population in Spain; TSI, Toscans in Italy; YRI, Yoruba in Ibadan, Nigeria.

mutation-bearing haplotype spanning over 60 Mb. The presence of this conserved haplotype indicates a single founder event for the E280A allele in Antioquia, and the length of the haplotype suggests recent origin. A recombination w1 Mb telomeric to the mutation marks the beginning of the minimal common haplotype. This recombination likely happened very early in the founding of the population because the mutation-bearing haplotypes are almost equally split between two extended haplotypes upstream of this recombination (Fig. E1). This confounds the determination of the haplotype of the original founder haplotype telomeric to E280A. Centromeric to E280A, the original founder haplotype can be deter-

mined by consensus. Carriers of E280A share, on average, 11.7 6 6 Mb of the ancestral haplotype telomeric to E280A. 3.2. Estimated age of E280A Having reconstructed the phase of the haplotype containing E280A, we next sought to estimate the age of this mutation. Estimating the age of E280A provides insights into history and demography of the Antioquian founder population. The single-marker method produced an average estimated TMRCA of 9.9 generations (95% credible interval [CI], 7.2–12.6; Table 1). This corresponds to

6

M.A. Lalli et al. / Alzheimer’s & Dementia - (2013) 1–7

approximately 250 years [95% CI, 179–316], using an average rate of 25 years per generation. This estimate is consistent with genealogical records of this kindred that trace back to ca. 1780 (Fig. 1B), coincident with the founding of Yarumal in 1787, a town where many E280A carriers reside. Modeling the intra-allelic coalescent with DMLE yielded an estimated age of 15.3 generations (95% CI, 11–25; Fig. 3), which places the estimated time of origin of the mutation during the early 17th century, a time when waves of Spanish colonizers admixed with the indigenous population of Antioquia. Taken together, these estimates reflect a recent origin of the E280A in Antioquia that is consistent with genealogy and historical events in Antioquia. 3.3. Geographic origin of E280A Latin American populations are recent admixtures of Native American, European, and African ancestry. The Antioquian population is characterized by the admixture of European males with Native American females, with a small proportion of African ancestry introduced through the Atlantic slave trade [21]. After colonization of the region, Antioquia remained fairly isolated [22]. An interesting question is whether the E280A arose on a European, Native American, or African genetic background. The goal of local ancestry estimation is to determine the ancestral origin of each segment along a diploid chromosome. As a first approach to determine the ancestry of the E280A allele, we constructed a phased reference panel and simply checked whether any individuals in this panel shared the common extended haplotype identified in our study. Remarkably, only one individual in the entire reference panel comprising more than 2000 genomes shared the common haplotype. Sequenced in the 1000 Genomes Project, this individual was selected from Colombia in Medellin, the capital of Antioquia. Because this individual is a member of a sequencing trio, deterministic phasing guarantees a high accuracy of haplotype reconstruction. Thus, although not providing information on the geographic origin of the E280A allele, this finding reflects the accurate haplotype reconstruction in our study. We constructed a reference panel for each of the three continents of potential origin of E280A (Fig. 4A) and estimated the continental background of each haplotype spanning E280A in our data set, including those containing the mutation (E280A1) and those without the mutation (E280A2). Both algorithms of local ancestry estimation estimate unambiguously a European ancestry of E280A (Fig. 4B), allowing us to exclude the possibility of an African or Native American origin of this allele. In contrast, the local ancestry estimation of the haplotypes not containing E280A had a mixture of continental genetic back-

grounds, in similar proportions to overall admixture estimates of other samples from Antioquian populations [23,24]. To resolve further the specific background on which E280A arose, we constructed reference panels representing North, West, and South Europe (Fig. 4C). Both LAMP-LD and MULTIMIX estimated that E280A arose on a western European genetic background (Fig. 4D). Because of the low numbers of haplotypes sampled in our reference populations, and the lack of an exact haplotype match in any of these panels, it is not possible at this time to resolve further the geographic origin of E280A within Europe using these reference panels. Last, we asked whether E280A arose on a background from a genetically isolated population such as Basque, Bedouin, Adygei, and Sardinian. Although previous studies have suggested possible Basque or Sephardic contributions to the Antioquian population [21], our local ancestry analysis did not support these as likely genetic backgrounds on which the E280A mutation arose.

4. Discussion We estimated the age and geographic origin of the autosomal dominant point mutation E280A in PSEN1 that causes EOFAD. An extended and conserved haplotype containing E280A in all carriers of this mutation indicates a recent single founder event that defines an extended kindred. With an estimated TMRCA of E280A of 10 generations ago and a predicted de novo mutation event around 15 generations ago, this mutation likely arose in the post-Columbian era, early during the 17th century during the Spanish colonization of Colombia. The predicted western European ancestry of the haplotype on which E280A resides is consistent with a Spanish origin of the initial mutation carrier and founder of the E280A extended kindred. Given the recent origin and founder effect in the EOFAD cohort, genomewide local ancestry estimation and subsequent admixture mapping may reveal additional variants that influence the disease in this population [25].

Acknowledgments Support for this work came from a National Institutes of Health Fogarty grant (R01 AG029802; K.S.K.), the Errett Fisher Foundation (K.S.K.), and the University of Luxembourg–Institute for Systems Biology Program, Colciencias (grants 111540820543 and 111540820512; F.L.). The funding sources had no role in the data collection, data analysis, data interpretation, or writing of the report. K.S.K. had access to all the data in the study and had final responsibility for the decision to submit for publication. We extend our gratitude to our research subjects and other

M.A. Lalli et al. / Alzheimer’s & Dementia - (2013) 1–7

members of the PSEN1 E280A mutation kindred for their participation.

RESEARCH IN CONTEXT

1. Systematic review: We performed a systematic review by searching PubMed for the term E280A. We included all studies concerning the Antioquian kindred affected by early-onset familial Alzheimer’s disease. Although a Spanish origin was suggested for the E280A mutation, no genetic study to date demonstrates this definitively. 2. Interpretation: Our study estimates an age and geographic origin of E280A consistent with a single founder dating from the time of the Spanish Conquistadors who began colonizing Colombia during the early 16th century. 3. Future directions: Using the same methods of local ancestry estimation demonstrated in this work, we will expand our analysis genomewide in the same individuals. Genomewide local ancestry estimation and subsequent admixture mapping may reveal additional variants that influence the disease in this population.

References [1] Acosta-Baena N, Sepulveda-Falla D, Lopera-Gomez CM, JaramilloElorza MC, Moreno S, Aguirre-Acevedo D, et al. Pre-dementia clinical stages in presenilin 1 E280A familial early-onset Alzheimer’s disease: a retrospective cohort study. Lancet Neurol 2011;10:213–20. [2] Cornejo W, Lopera F, Uribe CS, Salinas M. Descripcion de una familia con demencia presenil tipo Alzheimer. Acta Med Colomb 1987; 12:55–61. [3] Lopera F, Arcos M, Madrigal L, Kosik KS, Cornejo W, Ossa J. Demencia tipo Alzheimer con agregacion familiar en Antioquia, Colombia. Acta Neurol Colomb 1994;10:173–87. [4] Cruts M, Theuns J, Van Broeckhoven C. Locus-specific mutation databases for neurodegenerative brain diseases. Hum Mutat 2012; 33:1340–4. [5] Scheuner D, Eckman C, Jensen M, Song X, Citron M, Suzuki N, et al. Secreted amyloid beta-protein similar to that in the senile plaques of Alzheimer’s disease is increased in vivo by the presenilin 1 and 2 and APP mutations linked to familial Alzheimer’s disease. Nat Med 1996;2:864–70. [6] Lopera F, Ardilla A, Martinez A, Madrigal L, Arango-Viana JC, Lemere CA, et al. Clinical features of early-onset Alzheimer disease in a large kindred with an E280A presenilin-1 mutation. JAMA 1997;277:793–9.

7

[7] Lalli MA, Garcia G, Madrigal L, Arcos-Burgos M, Arcila ML, Kosik KS, et al. Exploratory data from complete genomes of familial Alzheimer disease age-at-onset outliers. Hum Mutat 2012;33:1630–4. [8] Drmanac R, Sparks AB, Callow MJ, Halpern AL, Burns NL, Kermani BG, et al. Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays. Science 2010; 327:78–81. [9] Roach JC, Glusman G, Smit AF, Huff CD, Hubley R, Shannon P, et al. Analysis of genetic inheritance in a family quartet by whole-genome sequencing. Science 2010;328:636–9. [10] Palin K, Campbell H, Wright AF, Wilson JF, Durbin R. Identity-bydescent-based phasing and imputation in founder populations using graphical models. Genet Epidemiol 2011;35:853–60. [11] Risch N, Deleon D, Ozelius L, Kramer P, Almasy L, Singer B, et al. Genetic analysis of idiopathic torsion dystonia in Ashkenazi Jews and their recent descent from a small founder population. Nat Genet 1995;9:152–9. [12] Ichida K, Hosoyamada M, Kamatani N, Kamitsuji S, Hisatome I, Shibasaki T, et al. Age and origin of the G774A mutation in SLC22A12 causing renal hypouricemia in Japanese. Clin Genet 2008;74:243–51. [13] Reeve JP, Rannala B. DMLE1: Bayesian linkage disequilibrium gene mapping. Bioinformatics 2002;18:894–5. [14] International HapMap Consortium. A second generation human haplotype map of over 3.1 million SNPs. Nature 2007;449:851–61. [15] 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature 2012;491:56–65. [16] Rosenberg NA, Pritchard JK, Weber JL, Cann HM, Kidd KK, Zhivotovsky LA, et al. Genetic structure of human populations. Science 2002;298:2381–5. [17] Patterson N, Moorjani P, Luo Y, Mallick S, Rohland N, Zhan Y, et al. Ancient admixture in human history. Genetics 2012;192:1065–93. [18] Baran Y, Pasaniuc B, Sankararaman S, Torgerson DG, Gignoux C, Eng C, et al. Fast and accurate inference of local ancestry in Latino populations. Bioinformatics 2012;28:1359–67. [19] Churchhouse C, Marchini J. Multiway admixture deconvolution using phased or unphased ancestral panels. Genet Epidemiol 2013; 37:1–12. [20] Delaneau O, Zagury JF, Marchini J. Improved whole-chromosome phasing for disease and population genetic studies. Nat Methods 2013;10:5–6. [21] Carvajal-Carmona LG, Soto ID, Pineda N, Ortiz-Barrientos D, Duque C, Ospina-Duque J, et al. Strong Amerind/white sex bias and a possible Sephardic contribution among the founders of a population in northwest Colombia. Am J Hum Genet 2000; 67:1287–95. [22] Bedoya G, Montoya P, Garcia J, Soto I, Bourgeois S, Carvajal L, et al. Admixture dynamics in Hispanics: a shift in the nuclear genetic ancestry of a South American population isolate. Proc Natl Acad Sci U S A 2006;103:7234–9. [23] Galanter JM, Fernandez-Lopez JC, Gignoux CR, Barnholtz-Sloan J, Fernandez-Rozadilla C, Via M, et al. Development of a panel of genome-wide ancestry informative markers to study admixture throughout the Americas. PLoS Genet 2012;8:e1002554. [24] Bryc K, Velez C, Karafet T, Moreno-Estrada A, Reynolds A, Auton A, et al. Colloquium paper: genome-wide patterns of population structure and admixture among Hispanic/Latino populations. Proc Natl Acad Sci U S A 2010;2:8954–61. [25] Chakraborty R, Weiss KM. Admixture as a tool for finding linked genes and detecting that difference from allelic association between loci. Proc Natl Acad Sci U S A 1988;85:9119–23.

M.A. Lalli et al. / Alzheimer’s & Dementia - (2013) 1–7

7.e1

Fig. E1. (A, B) Identity-by-descent (IBD) in haplotypes containing E280A. Haplotypes containing E280A were reconstructed and colored by IBD. Because the recombination events up- and downstream of E280A are independent, we separated the haplotypes into their centromeric (A) and telomeric (B) segments, and sorted them by length. Two major distinct haplotypes ( blue and orange) comprise the centromeric segments. By consensus, we cannot determine which (if either) of these haplotypes belonged to the first founder of this kindred. Telomeric to E280A, an extended haplotype (beige) is shared among all E280A carriers, implying that this is the original ancestral haplotype. The extensive length of the IBD surrounding E280A confirms a single common founder for this mutation.

Table E1 Five hundred eighty SNPs define the minimal haplotype shared among all carriers of E280A No.

dbSNP ID

MAF

Chr 14 position

Nucleotide change

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28

rs2158987 rs847309 rs2529647 rs2529646 rs847308 rs847307 rs847306 rs847305 rs847304 rs847303 rs847302 rs847301 rs847300 rs6574054 rs847279 rs847278 rs699362 rs699361 rs699360 rs847277 rs847276 rs847275 rs8022968 rs8022525 rs847274 rs847269 rs847268 rs847267

0.022 0.01 0.309 0.309 0.309 0.316 0.003 0.317 0.434 0.119 0.319 0.121 0.01 0.318 0.313 0.314 0.316 0.317 0.315 0.316 0.315 0.314 0.315 0.315 0.315 0.322 0.298 0.297

72696625 72696877 72697908 72697911 72698590 72699077 72699451 72699790 72699799 72700062 72701140 72701681 72701713 72702425 72703275 72703361 72703933 72703945 72703955 72704494 72704505 72704544 72704580 72704584 72704596 72705328 72705544 72705568

C.G G.A G.A T.C C.T C.T T.C A.G A.G G.A T.C C.A A.G T.C G.A C.T G.A A .T A.C A.G C.T G.A G.A A .T A.G A.G T.C A.G (Continued )

M.A. Lalli et al. / Alzheimer’s & Dementia - (2013) 1–7

7.e2

Table E1 (Continued ) No.

dbSNP ID

MAF

Chr 14 position

Nucleotide change

29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90

rs847266 rs847265 rs847264 rs847263 rs847262 rs847261 rs847259 rs847256 rs847255 rs847253 rs847252 rs699359 rs847250 rs847249 rs847247 rs847246 rs2215014 rs847245 rs847241 rs847240 rs847239 rs847238 rs847237 rs847235 rs847234 rs847345 rs77331431 rs699369 rs699368 rs699367 rs6574057 rs2239258 rs2239257 rs11849202 rs1029616 rs1568404 rs917284 rs1915131 rs3819550 rs2238211 rs7144038 rs7144185 rs7149034 rs10483843 rs6574060 rs6574061 rs847341 rs847340 rs4578585 rs1879289 rs1879291 rs1879292 rs7152907 rs12184980 rs2079500 rs737499 rs737498 rs873671 rs7144302 rs7144049 rs4903007 rs11158953

0.297 0.293 0.297 0.296 0.284 0.139 0.285 0.136 0.009 0.009 0.181 0.145 0.18 0.15 0.14 0.141 0.029 0.083 0.402 0.269 0.269 0.014 0.285 0.302 0.01 0.428 0.053 0.437 0.304 0.009 0.452 0.459 0.451 0.389 0.399 0.394 0.411 0.444 0.446 0.382 0.381 0.459 0.395 0.441 0.378 0.269 0.141 0.009 0.379 0.423 0.389 0.38 0.379 0.379 0.423 0.379 0.423 0.421 0.374 0.416 0.457 0.396

72705657 72705855 72705963 72706015 72706835 72707163 72709083 72709924 72710245 72710624 72710800 72711069 72713310 72713488 72714411 72714918 72715544 72716686 72719407 72720337 72720464 72720785 72722205 72723421 72723508 72729450 72730015 72731204 72732516 72733043 72734186 72734947 72735003 72735625 72736899 72736969 72737111 72738437 72738673 72739796 72740856 72740895 72741070 72741686 72742583 72742656 72743337 72743475 72743533 72744439 72744591 72744774 72745114 72745259 72745815 72746118 72746493 72746778 72747523 72747907 72748337 72748924

T.G C.T G.C G.A C.A T.C T.G T.G G.T A.G A .T C.G A.G T.C T.C G.A C.T G.A C.T C.T A .T G.A A.G A.G A.G G.A T.C G.A A.G T .A C.A C.G T.C G.C G.A C.T G.T A.C T .A T.C C.G C.T G.A A.G T.C C.G T.C C.A A.G C.T A.G A.G T.C G.A A.C T.C A.G T.C T.C G.A T.C T.G (Continued )

M.A. Lalli et al. / Alzheimer’s & Dementia - (2013) 1–7

7.e3 Table E1 (Continued ) No.

dbSNP ID

MAF

Chr 14 position

Nucleotide change

91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152

rs847339 rs6574065 rs35584903 rs1099969 rs34870583 rs847338 rs3886822 rs2239255 rs2238207 rs2238206 rs7151952 rs2238205 rs2238204 rs4899430 rs2681766 rs10873250 rs11628263 rs8003581 rs147010083 rs7160019 rs12885387 rs12147474 rs2286069 rs4903030 rs2189810 rs34473506 rs1157500 rs2283375 rs144673437 rs757653 rs757652 rs740079 rs2239222 rs2283374 rs2283373 rs2283372 rs2283371 rs189345065 rs4903031 rs12434785 rs763732 rs2238169 rs112255338 rs2239228 rs2074648 rs2074647 rs17780615 rs8020134 rs7148786 rs7150443 rs11847034 rs11851319 rs56754001 rs72726187 rs58435870 rs10146857 rs10130313 rs11158961 rs56011838 rs55824212 rs10145697 rs57977724

0.025 0.269 0.458 0.025 0.267 0.009 0.353 0.399 0.257 0.256 0.259 0.373 0.271 0.267 0.005 0.001 0.447 0.393 0.007 0.293 0.449 0.122 0.3 0.364 0.478 0.124 0.176 0.128 0.002 0.398 0.048 0.017 0.406 0.475 0.146 0.183 0.393 0.006 0.164 0.329 0.237 0.467 0.011 0.282 0.122 0.098 0.113 0.309 0.447 0.011 0.373 0.334 0.105 0.101 0.103 0.364 0.268 0.092 0.285 0.279 0.362 0.103

72749082 72751893 72752976 72753003 72753018 72754111 72754262 72755506 72757384 72757572 72757762 72758614 72758720 72759512 72825269 72843762 72886274 72887527 72966480 72970037 72975499 72977199 72977524 72984261 72984879 72988567 72995814 72996672 72997324 73006498 73006579 73010297 73011885 73015223 73015288 73015497 73015668 73017920 73019236 73023875 73024252 73026117 73026532 73028325 73028706 73029178 73029609 73029707 73033653 73033751 73034801 73034834 73036256 73037045 73037289 73037708 73039648 73039895 73042070 73042083 73042463 73042598

G.T G.A C.A G.C G.A T.C T .A T.C A.G C.G T.C A.C C.T G.A A.G G.T T.C T .A C.T A.C A.C G.A T.C A .T A.G C.T A.C T.C G.A G.C A.G A.G A.G G.A A .T G.A A.G G.T A.G C.A C.T C.G C.T A.G A.G G.A C.T T.C A.G T.C G.A T.C C.G G.A G.A A.G A .T G.T T.C A.G A.G G.A (Continued )

M.A. Lalli et al. / Alzheimer’s & Dementia - (2013) 1–7

7.e4

Table E1 (Continued ) No.

dbSNP ID

MAF

Chr 14 position

Nucleotide change

153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214

rs59559187 rs56236280 rs2526908 rs12588279 rs2526906 rs2332886 rs12893389 rs2107683 rs35592119 rs2526901 rs2526900 rs2253590 rs2526897 rs2526950 rs7159355 rs7492545 rs17118221 rs2526940 rs12885159 rs6574086 rs6574087 rs2332887 rs35140796 rs12101179 rs17118280 rs55772147 rs72728529 rs2107689 rs2107688 rs717517 rs7160623 rs2803952 rs55737436 rs2803955 rs2526935 rs929328 rs2803956 rs45470093 rs2526933 rs79513088 rs2803959 rs67504864 rs2526932 rs2803961 rs2803962 rs12590634 rs2803963 rs55886104 rs994759 rs2526925 rs17180845 rs2803966 rs2526922 rs2526920 rs757655 rs2247658 rs113716952 rs2526919 rs1557926 rs12434611 rs58322420 rs28620548

0.107 0.098 0.308 0.402 0.152 0.416 0.416 0.042 0.417 0.148 0.151 0.14 0.145 0.181 0.493 0.492 0.053 0.116 0.457 0.492 0.492 0.491 0.487 0.163 0.052 0.396 0.009 0.113 0.004 0.03 0.481 0.112 0.405 0.004 0.436 0.279 0.353 0.366 0.266 0.069 0.266 0.34 0.267 0.267 0.276 0.37 0.4 0.368 0.447 0 0.439 0.422 0.419 0.358 0.273 0.446 0.412 0.421 0.313 0.331 0.415 0.452

73042679 73043720 73045809 73047858 73049405 73052019 73052082 73052121 73052247 73052703 73053031 73054742 73054951 73056740 73061541 73062428 73062542 73062700 73063150 73063630 73063661 73063910 73064376 73064446 73065478 73067165 73068923 73069175 73069317 73070475 73070655 73070834 73071079 73074425 73076595 73077854 73079464 73079976 73080440 73080603 73080618 73080832 73081068 73082826 73083240 73083982 73084761 73091106 73096614 73097058 73100649 73100786 73101118 73101727 73104995 73105681 73105741 73107771 73110653 73111602 73120249 73121288

C.T G.A A .T A .T G.A C.A G.C C.G A.G A.G G.A A.G A.C A .T T.C C.T C.A T.C C.T C.T C.T G.A A .T T.C G.A C.T G.A G.A T.G A.G T.C A.G G.A T .A A.C A.G A.G G.A C.G G.A T.C T.G A.G T.C C.T C.T G.A G.C A.C A.G T.C C.T A.G A.G C.T T.C G.A A.G C.G G.A G.C C.T (Continued )