Forensic Science International: Genetics 21 (2016) 1–4
Contents lists available at ScienceDirect
Forensic Science International: Genetics journal homepage: www.elsevier.com/locate/fsig
Short communication
Genetic diversity of 38 insertion–deletion polymorphisms in Jewish populations J.F. Ferraguta , R. Pereirab,c, J.A. Castroa , C. Ramona , I. Nogueirob,c, A. Amorimb,c,d, A. Picornella,* a Institut Universitari d’Investigació en Ciències de la Salut (IUNICS) i Laboratori de Genètica, Departament de Biologia, Universitat de les Illes Balears, Carretera de Valldemossa, km 7.5, 07122 Palma de Mallorca, Illes Balears, Spain b i3S—Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto, Portugal c IPATIMUP—Instituto de Patologia e Imunologia Molecular da Universidade do Porto, Porto, Portugal d Faculdade de Ciências da Universidade do Porto, Porto, Portugal
A R T I C L E I N F O
A B S T R A C T
Article history: Received 19 August 2015 Received in revised form 22 October 2015 Accepted 4 November 2015
Population genetic data of 38 non-coding biallelic autosomal indels are reported for 466 individuals, representing six populations with Jewish ancestry (Ashkenazim, Mizrahim, Sephardim, North African, Chuetas and Bragança crypto-Jews). Intra-population diversity and forensic parameters values showed that this set of indels was highly informative for forensic applications in the Jewish populations studied. Genetic distance analysis demonstrated that this set of markers efficiently separates populations from different continents, but does not seem effective for molecular anthropology studies in Mediterranean region. Finally, it is important to highlight that although the genetic distances between Jewish populations were small, significant differences were observed for Chuetas and Bragança Jews, and therefore, specific databases must be used for these populations. ã 2015 Elsevier Ireland Ltd. All rights reserved.
Keywords: Indels Jews Ashkenazim Sephardim Mizrahim Chuetas Bragança
1. Introduction Indels are length polymorphisms created by small insertions or deletions of one or more nucleotides in the genome. These markers have a series of special characteristics that justify its increasing interest as a tool for a wide range of purposes, including population genetic studies, ancestry affiliation, and forensics. Indels combine many of the desirable characteristics of both SNPs and STRs, such as a widely spread distribution throughout the genome [1–3]; origin from a single rare mutation event and so unlikely to present recurrent mutations [4]; amenable to PCR design of short amplicons, improving the chances of amplification success of degraded samples [5,6]; significant differences in allele frequencies between populations, providing good results in population differentiation studies, and the ease of analysis by multiplex PCR and capillary electrophoresis [1,7,5]. In 2009, Pereira et al. [5] described a new multiplex for human identification combining 38 small non-coding biallelic autosomal indels into a single multiplex, with proven success for genotyping of degraded samples.
* Corresponding author. Fax: +34 971 173184. E-mail address:
[email protected] (A. Picornell). http://dx.doi.org/10.1016/j.fsigen.2015.11.003 1872-4973/ ã 2015 Elsevier Ireland Ltd. All rights reserved.
Jews can be traced back to populations occupying a small geographic area, in the Middle East, several thousand years ago and have maintained continuous cultural and religious traditions despite a series of Diasporas. Contemporary Jews comprise several communities that can be classified according to the location where each community developed. Among others, these include Middle Eastern Jews (“Mizrahim”) (Iran and Iraq), who have always resided in the Near East; the Askenazim who lived in communities of central and eastern Europe; the Sephardim (from Sepharad, Hebrew word for Hispania) who, after their expulsion from the Iberian Peninsula in the late 15th century, lived in other Mediterranean countries (especially Bulgaria and Turkey); and the North African Jews, comprising both Sephardim and Mizrahim [8–10]. Chuetas (Majorca, Spain) and the crypto-Jewish communities in Portugal (especially in Belmonte and Bragança district) are the only current Iberian populations whose ancestors can be traced to the original Sephardic Jewish populations, because of their peculiar history that kept the memory of their Jewish origin through centuries, and their inbreeding, which has prevented their gradual assimilation into the general population, as it happened with most of converted Iberian Jews [11,12]. Geneticists have studied Jewish populations since the turn of the 20th century in an attempt to unravel what must be a
2
J.F. Ferragut et al. / Forensic Science International: Genetics 21 (2016) 1–4
complex system of interrelationships among Jewish communities and their non-Jewish neighbours. These studies have provided evidences for shared Middle Eastern ancestry among major Jewish Diaspora groups and variable degrees of admixture with local populations [13–19]. Regarding the Jewish descendants in Iberia, genetic studies have shown that both Chuetas and Bragança Jews present a significant persistence of a Jewish heritage as well as signs of introgression from their host non-Jewish populations [20–22]. In the present study we aimed to characterize the diversity of the 38 indel markers [5] in populations with Jewish ancestry, and compare their distribution in Jewish and non-Jewish populations. Finally we evaluated the usefulness of the indel set both in forensic casework and population genetics, especially in Chuetas and in Bragança crypto-Jewish populations, due to the genetic heritage of their singular history.
interest: MP (matching probability), PD (power of discrimination), PIC (polymorphisms information content), PE (power of exclusion) and TPI (typical paternity index) were computed using Powerstats v1.2 spreadsheet [24]. To examine the relationship of the populations under study and with other published population data [5,25,26], FST genetic distances were represented in a multidimensional scaling (MDS) plot using SPSS v.15.0 (SPSS, Inc., Chicago, IL, USA). Population structure was further assessed with STRUCTURE 2.3.4 software using default settings (admixture model; correlated allele frequencies) [27,28]. All runs included a burn-in period of 50,000 iterations followed by 100,000 MCMC repetitions (K = 1 9), and were repeated ten times each in order to test the consistency of the results. 3. Results and discussion 3.1. Intra-population variability
2. Materials and methods 2.1. DNA samples DNA samples from 466 unrelated individuals with known Jewish ancestry were obtained after informed consent: 136 Chueta individuals (Majorca, Spain) belonged to the collection of the Genetics Laboratory, University of the Balearic Islands; 55 Jews from Bragança (NE Portugal) collected by the Institute of Pathology and Molecular Immunology, University of Porto; and 275 individuals of the National Laboratory for the genetics of Israeli populations at Tel-Aviv University. Following classical criteria, these samples were categorized into four groups: Ashkenazi (55), Middle Eastern (28 Iranian and 26 Iraqi), North African (34 Moroccan, 13 Libyan and 13 Tunisian), and Sephardic (62 Turkish and 44 Bulgarian). 2.2. Indel genotyping The 38 indels were genotyped using the PCR multiplex protocol originally described in Pereira et al. [5]. Amplification products were separated by capillary electrophoresis in a 3130 Genetic Analyzer (Applied Biosystems) and fragment sizes and allele calls were determined automatically using GeneMapper v3.2 ID software (Applied Biosystems). Typing quality and allele designation were warranted by the analysis of control samples of known genotype (GHEP-ISFG collaborative exercise; details available at www.gep-isfg.org/en/working-commissions/collaborative-exercise-indels-2012.html). 2.3. Data analysis Allele frequencies, Hardy–Weinberg equilibrium analysis, AMOVA and populations pairwise genetic distances (FST) were calculated using the Arlequin v.3.5 software [23]. Gene Diversity m
was calculated as 1 Si ðpi Þ2 . Statistical parameters of forensic
A total of 932 chromosomes were analyzed in this study. Genotypic data and allele frequencies for each indel marker and population (Sephardic, North African, Middle Eastern and Ashkenazi Jews, Chuetas and Jews from Bragança) are shown in Supplementary Tables S1 and S2. No deviations from Hardy–Weinberg equilibrium were observed in the populations studied after Bonferroni's correction for multiple tests (p > 0.00132), showing that no significant levels of substructure were found. Supplementary material related to this article found, in the online version, at http://dx.doi.org/10.1016/j.fsigen.2015.11.003. High genetic diversities (GD) were observed in all populations analyzed (Supplementary Table S3), with mean values ranging between 0.4148–0.4336 (Table 1). The highest values were found for rs2307579 (Y05) and rs34511541 (G08) and expected heterozygosities ranged between 0.4861–0.4989 and between 0.4650–0.5000, respectively, while the lowest heterozygosities were observed for rs2307839 (Y09) in Ashkenazi (0.1503) and rs2308171 (R09) in North African Jews (0.1528). These GD values were in the range found in other studies [5,25,26], although the markers showing the highest and the lowest variability differed between populations. Supplementary material related to this article found, in the online version, at http://dx.doi.org/10.1016/j.fsigen.2015.11.003. Linkage disequilibrium (LD) test showed no significant gametic association between all 38 markers after Bonferroni correction, with the exception of the pair rs16624–rs2067238 in Ashkenazi population (p 105), in accordance with the reported founder effect and genetic drift of this population, indicating an early bottleneck, probably corresponding to initial migrations of ancestral Ashkenazim to Europe [29]. Still, it must be said that many generations have elapsed after the founder bottleneck and therefore this LD value may be a spurious result, and since the two indels are located at different chromosomes, this association is rapidly vanished, provided random mating is established.
Table 1 Diversity and Forensic Parameters for the set of 38 indels studied in the 6 studied populations with Jewish ancestry. Obs Het Sephardic Jews North African Jews Middle Eastern Jews Ashkenazi Jews Chuetas Bragança Jews
0.4322 0.4204 0.4096 0.4412 0.4338 0.4387
Exp Het 0.0768 0.0995 0.0871 0.1050 0.0641 0.0844
0.4244 0.4190 0.4148 0.4187 0.4336 0.4335
0.0738 0.0831 0.0743 0.0799 0.0555 0.0574
Combined MP
Combined PD
Combined PE
1.9581E + 14 1.5530E + 14 1.2510E + 14 6.3431E + 13 3.3360E + 14 2.0471E + 14
0.9999999999999950 0.9999999999999940 0.9999999999999920 0.9999999999999840 0.9999999999999970 0.9999999999999950
99.72% 99.68% 99.50% 99.85% 99.70% 99.79%
Obs Het: observed heterozygosity; Exp Het: expected heterozygosity; MP: matching probability; PD: power of discrimination; PE: power of exclusion.
J.F. Ferragut et al. / Forensic Science International: Genetics 21 (2016) 1–4
3
Table 2 Paternity indices (PI) and probabilities (W) based on the results obtained with STRs, indels, and with both type of markers in a paternity case with two paternal mutations.
PI W
STRs AmpFlSTR Identifiler (Applied Biosystems) Powerplex1 System (Promega)
INDELS (38 Indel-plex)
STRs + INDELs
0.827 45.259%
12530.759 99.992%
10360.431 99.990%
happened when subsequently typed for the set of 38 autosomal indels (Supplementary Table S5). Joining the information from STRs and indels resulted in a combined PI value of 10,360 substantially contributing to solve the case. Supplementary material related to this article found, in the online version, at http://dx.doi.org/10.1016/j.fsigen.2015.11.003.
3.2. Forensic efficiency Forensic parameters of interest were calculated for each indel marker and population (Supplementary Table S4). As expected from the levels of diversity, rs34511541 (G08) showed the highest values of power of discrimination (PD) in different populations, while the lowest values in PD and power of exclusion in trios (PE) were observed in rs2308171 (R09), except in Ashkenazi and Bragança Jews. Supplementary material related to this article found, in the online version, at http://dx.doi.org/10.1016/j.fsigen.2015.11.003. The combined matching probability (MP) (Table 1) ranged from a value of 1 in 6.34 10+13 (in Ashkenazim) to 1 in 3.34 10+14 (in Chuetas). Considering the essentially biallelic nature of small indels, a high combined PE was also obtained in all the populations (>0.995 in all cases). Although values differed slightly between populations, the set of loci was highly informative in all the studied populations with Jewish ancestry, providing a suitable short amplicon marker set to complement STRs in forensic casework.
3.4. Inter-population variability AMOVA analysis of the six Jewish populations studied showed a low but statistically significant value (FST = 0.00871; p < 105). Interestingly, pairwise comparisons (Supplementary Table S6) evidenced the existence of a sub-set composed of the Jewish populations (Sephardic, North Africa, Mizrahim and Ashkenazi) with non-significant genetic distances, whereas the converted Jewish populations (Bragança Jews and Chuetas) presented significant differences with all other populations tested in this study. However, hierarchical AMOVA showed lack of homogeneity within these sub-sets (FSC = 0.00250; p = 0.01075). Supplementary material related to this article found, in the online version, at http://dx.doi.org/10.1016/j.fsigen.2015.11.003. A comparison was also performed between the populations in this work and other populations previously studied for the same set of indel markers (Portugal, Angola, Mozambique, Macau and Taiwan [5], South East Spain [26], Brazil (Rio de Janeiro and Terenas) [25], and Majorca (Chueta's host population) (GHEP-ISFG collaborative exercise www.gep-isfg.org/en/working-commissions/collaborative-exercise-indels-2012.html). MDS results (stress value < 0.199) showed greater structure than would be obtained from a random dataset [31]. The MDS plot (Fig. 1)
3.3. Application of this set of indels to a challenging paternity investigation A paternity case involving mother, an alleged father and one girl was investigated. In the initial conventional STR analysis, mismatches were observed between the alleged father and the child in two STRS (D2S1338 and D5S818) resulting in a paternity index (PI) of 0.827 (Table 2), if the possibility of mutation was considered. The alleged father matched perfectly when 12 additional X-STRs were typed, in order to investigate the possibility that he was not the father but a close relative [30], and the same
-1
Terenas (Brazil)
0 Asia Rio de Janeiro (Brazil)
1
North African Jews
Middle Eastern Jews
Portugal
2 Majorca
Ashkenazi Jews Sephardic Jews
South East Spain Chuetas 3
Bragança Jews
Africa
Stress: 0.17861 RSQ: 0.93959
4
-1
0
1
2
3
Fig. 1. Multidimensional scaling (MDS) plot of the pairwise genetic distances between the 6 populations with Jewish ancestry (showed as circles) in this study and South European populations (Portugal, South East Spain and Majorca), Brazilian populations (Rio Janeiro and Terenas), Asian and African populations. All non-Jewish populations are shown as black squares.
4
J.F. Ferragut et al. / Forensic Science International: Genetics 21 (2016) 1–4
indicated that European, African and Asian populations are distantly positioned. Terenas (with major Amerindian ancestry living in Mato Grosso do Sul, Center-West Brazil) was closer to the Asians, while Rio de Janeiro was placed between Europe and Africa, as expected due to their origin. Jewish and South European populations (Portugal, Majorca and South East Spain) clustered together, indicating that this set of markers can separate efficiently between populations from different continents, but does not appear to be a suitable tool for studying the genetic sub-structure in Mediterranean context. Within the Mediterranean region, taking into account the genetic distances matrix (Supplementary Table S6), we can observe two groups without significant differences within them; one is the aforementioned Jewish group, and the other includes the South European non-Jewish populations (Portugal, Majorca and South East Spain). North African Jews showed no significant distances with any population of any group. Regarding the converted Jewish populations, Chuetas presented significant differences with all other populations, including both the Jews and their host population, Majorca; but Braganza Jews were not significantly different to Portugal, their host population. A Bayesian clustering analysis was performed with the same samples in STRUCTURE software in order to deeper scrutinize the substructure of the Mediterranean region for this set of 38 neutral, slow evolving, autosomal markers. The results consistently revealed worse ln P(D) values for K > 1 (data not shown), failing to detect distinct clusters in the data set and supporting that any possible genetic structure in the populations here analyzed could not be discerned with the level of resolution afforded by these 38 loci. 4. Conclusions This study provides a useful database of the 38 indel multiplex [5] for populations with Jewish ancestry including two Sephardic isolates, the Chuetas and the Crypto-Jews from Bragança. The results obtained demonstrated also their usefulness for general forensic purposes, especially in situations where conventional STRs may result in low paternity indices. However, this set of markers does not seem adequate for molecular anthropology studies in subcontinental level, namely in the Mediterranean region. It is important to highlight that although the genetic distances between Jewish populations were low, significant differences were observed for both Chuetas and Bragança Jews with the others. Therefore, specific databases for these Jewish populations must be used in the forensic field to correctly weigh the value of the evidence based on these indel markers. This study follows the guidelines for publication of population data proposed by the journal [32]. Acknowledgments This work was partially supported by grant AAEE24/2014 from the Direcció General de R + D + I (Comunitat Autònoma de les Illes Balears) and European Regional Development Fund (ERDF). RP (SFRH/BPD/81986/2011) and IN (SFRH/BD/73336/2010) are recipients of grants awarded by the Portuguese Foundation for Science and Technology (FCT) and co-financed by the European Social Fund (Human Potential Thematic Operational Programme— POPH). References [1] J.L. Weber, D. David, J. Heil, Y. Fan, C. Zhao, et al., Human diallelic insertion/ deletion polymorphisms, Am. J. Hum. Genet. 71 (2002) 854–862. [2] The 1000 Genomes Project Consortium, A map of human genome variation from population-scale sequencing, Nature 467 (2010) 1061–1073.
[3] R.E. Mills, W.S. Pittard, J.M. Mullaney, U. Farooq, T.H. Creasy, et al., Natural genetic variation caused by small insertions and deletions in the human genome, Genome Res. 21 (2011) 830–839. [4] M.W. Nachman, S.L. Crowell, Estimate of the mutation rate per nucleotide in humans, Genetics 156 (2000) 297–304. [5] R. Pereira, C. Phillips, C. Alves, A. Amorim, A. Carracedo, et al., A new multiplex for human identification using insertion/deletion polymorphisms, Electrophoresis 30 (2009) 3682–3690. [6] C. Romanini, M.L. Catelli, A. Borosky, R. Pereira, M. Romero, et al., Typing short amplicon binary polymorphisms: supplementary SNP and Indel genetic information in the analysis of highly degraded skeletal remains, Forensic Sci. Int. Genet. 6 (2012) 469–476. [7] N. Yang, H. Li, L.A. Criswell, P.K. Gregersen, M.E. Alarcon-Riquelme, et al., Examination of ancestry and ethnic affiliation using highly informative diallelic DNA markers: application to diverse and admixed populations and implications for clinical epidemiology and forensic medicine, Hum. Genet. 118 (2005) 382–392. [8] S.W. Baron, Social Religious History of the Jews, Columbia University Press, New York, 1937. [9] H.H. Ben-Sasson, A History of the Jewish People, Harvard University Press, 1976. [10] R.M. Goodman, Genetic Disorders Among the Jewish People, The Johns Hopkins University Press, 1979. [11] E. Laub, J.F. Laub, El Mito Triunfante: Estudio Antropológico-social De Los Chuetas Mallorquines, M. Font, Palma de Mallorca, 1987. [12] J.C. Martins, Portugal E Os Judeus: Judaísmo E Anti-semitismo No Século XX, Vega, Lisbon, 2006. [13] R. Patai, J. Patai, The Myth of the Jewish Race, C. Scribner’s Sons, New York, 1975. [14] B. Bonné-Tamir, A. Adam, Genetic Diversity Among Jews: Diseases and Markers at the DNA Level, Oxford University Press, 1992. [15] M.F. Hammer, A.J. Redd, E.T. Wood, M.R. Bonner, H. Jarjanazi, et al., Jewish and Middle Eastern non-Jewish populations share a common pool of Ychromosome biallelic haplotypes, Proc. Natl. Acad. Sci. U. S. A. 97 (2000) 6769–6774. [16] M.G. Thomas, M.E. Weale, A.L. Jones, M. Richards, A. Smith, et al., Founding mothers of Jewish communities: geographically separated Jewish groups were independently founded by very few female ancestors, Am. J. Hum. Genet. 70 (2002) 1411–1420. [17] D.M. Behar, B. Yunusbayev, M. Metspalu, E. Metspalu, S. Rosset, et al., The genome-wide structure of the Jewish people, Nature 466 (2010) 238–242. [18] G. Atzmon, L. Hao, I. Pe’er, C. Velez, A. Pearlman, et al., Abraham’s children in the genome era: major Jewish diaspora populations comprise distinct genetic clusters with shared Middle Eastern ancestry, Am. J. Hum. Genet. 86 (2010) 850–859. [19] H. Ostrer, K. Skorecki, The population genetics of the Jewish people, Hum. Genet. 132 (2013) 119–127. [20] A. Picornell, J.A. Castro, M. Misericordia Ramon, Genetics of the Chuetas (Majorcan Jews): a comparative study, Hum. Biol. 69 (1997) 313–328. [21] C. Crespí, J. Mila, N. Martínez-Pomar, A. Etxagibel, I. Muñoz-Saa, et al., HLA polymorphism in a Majorcan population of Jewish descent: comparison with Majorca, Minorca, Ibiza (Balearic Islands) and other Jewish communities, Tissue Antigens 60 (2002) 282–291. [22] I. Nogueiro, J.C. Teixeira, A. Amorim, L. Gusmao, L. Alvarez, Portuguese crypto-Jews: the genetic heritage of a complex history, Front. Genet. 6 (12) (2015) . [23] L. Excoffier, H.E.L. Lischer, Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows, Mol. Ecol. Resour. 10 (2010) 564–567. [24] A. Tereba, Tools for analysis of population statistics, Profiles DNA 2 (1999) 14–16. [25] F. Manta, A. Caiafa, R. Pereira, D. Silva, A. Amorim, et al., Indel markers: genetic diversity of 38 polymorphisms in Brazilian populations and application in a paternity investigation with post mortem material, Forensic Sci. Int. Genet. 6 (2012) 658–661. [26] M. Saiz, M.J. Alvarez-Cubero, L.J. Martinez-Gonzalez, J.C. Alvarez, J.A. Lorente, Population genetic data of 38 insertion–deletion markers in South East Spanish population, Forensic Sci. Int. Genet. 13 (2014) 236–238. [27] J.K. Pritchard, M. Stephens, P. Donnelly, Inference of population structure using multilocus genotype data, Genetics 155 (2000) 945–959. [28] D. Falush, M. Stephens, J.K. Pritchard, Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies, Genetics 164 (2003) 1567–1587. [29] D.M. Behar, M.F. Hammer, D. Garrigan, R. Villems, B. Bonne-Tamir, et al., MtDNA evidence for a genetic bottleneck in the early history of the Ashkenazi Jewish population, Eur. J. Hum. Genet. 12 (2004) 355–364. [30] N. Pinto, M. Magalhaes, E. Conde-Sousa, C. Gomes, R. Pereira, et al., Assessing paternities with inconclusive STR results: the suitability of bi-allelic markers, Forensic Sci. Int. Genet. 7 (2013) 16–21. [31] K. Sturrock, J. Rocha, A multidimensional scaling stress evaluation table, Field Methods 12 (2000) 49–60. [32] A. Carracedo, J.M. Butler, L. Gusmao, A. Linacre, W. Parson, et al., New guidelines for the publication of genetic population data, Forensic Sci. Int. Genet. 7 (2013) 217–220.