Accepted Manuscript

Title: Analysis of 24 Y chromosomal STR haplotypes in a Chinese Han population sample from Henan Province, Central China
Authors: Meisen Shi, Yaju Liu, Juntao Zhang, Rufeng Bai, Xiaojiao Lv, Shuhua Ma
PII: S1872-4973(15)00071-X
DOI: http://dx.doi.org/doi:10.1016/j.fsigen.2015.04.001
Reference: FSIGEN 1336
To appear in: Forensic Science International: Genetics
Received date: 31-12-2014
Revised date: 29-3-2015
Accepted date: 3-4-2015
Please cite this article as: Meisen Shi, Yaju Liu, Juntao Zhang, Rufeng Bai, Xiaojiao Lv, Shuhua Ma, Analysis of 24 Y chromosomal STR haplotypes in a Chinese Han population sample from Henan Province, Central China, Forensic Science International: Genetics, http://dx.doi.org/10.1016/j.fsigen.2015.04.001
Analysis of 24 Y chromosomal STR haplotypes in a Chinese Han population sample from Henan Province, Central China

Meisen Shi a,b,c, Yaju Liu d, Juntao Zhang d, Rufeng Bai b,c, Xiaojiao Lv c, Shuhua Ma a

a Department of Radiology, First Affiliated Hospital, Medical College of Shantou University, Shantou 515041, P.R. China
b Collaborative Innovation Center of Judicial Civilization, P.R. China
c Key Laboratory of Evidence Science (China University of Political Science and Law), Ministry of Education, Beijing 100088, P.R. China
d Xuchang Institute of Forensic Sciences, Public Security Bureau of Henan Province, Xuchang 461000, P.R. China

Correspondence and phone calls about the paper should be directed to Shuhua Ma at the following address, phone and fax number, and e-mail address:
Shuhua Ma, Ph.D., Department of Radiology, First Affiliated Hospital, Medical College of Shantou University, Shantou 515041, Guangdong, P.R. China
Fax: +86-10-68621175
E-mail:
[email protected]
Highlights
► In the first paragraph of the "Results", the statement that "no locus duplication was observed" was revised to "without observation of duplications in one-allele loci".
► The distribution of non-unique haplotypes of the 24 Y-STR loci was included in Table S3.
► The change in haplotype distribution after the addition of additional Y-STRs to the Yfiler loci is shown in Table S4b.
► A reduced median-joining network for the non-unique haplotype data was constructed using the Network software v.4.610 (Fig. S2).
► The last two references were updated.
Abstract

We analyzed haplotypes for 24 Y chromosomal STRs (Y-STRs), including the 17 Yfiler loci (DYS19, DYS385a/b, DYS389I/II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS635 and Y-GATA-H4) and 7 additional STRs (DYS388, DYS444, DYS447, DYS449, DYS522 and DYS527a/b), in 1100 unrelated Chinese Han individuals from Henan Province using the AGCU Y24 STR kit. The calculated gene diversity (GD) values ranged from 0.4105 for DYS388 to 0.9647 for DYS385a/b. The discriminatory capacity (DC) was 72.91% with 802 observed haplotypes using the 17 Yfiler loci; adding the 7 Y-STRs to the Yfiler system increased the DC to 79.09% with 870 observed haplotypes. Among the 7 additional Y-STRs, DYS449, DYS527a/b, DYS444 and DYS522 were the major contributors to the improved discrimination. In the analysis of molecular variance, the Henan Han population clustered with populations of Han origin and differed significantly from the non-Han populations. In the present study, we report 24 Y-STR population data for the Henan Han population, and we emphasize the need to add markers to the commonly used 17 Yfiler loci to achieve a higher discriminatory capacity in a population with low genetic diversity.

Keywords: Y-STR; Yfiler; Haplotypes; Multidimensional scaling; Henan Han; Central China

1. Population

Henan is the birthplace of Chinese civilization, with over 5,000 years of history, and remained China's cultural, economic and political center until approximately 1,000 years ago. Henan is China's third most populous province, with a population of over 94 million in 2010 (including its minority ethnic population), of which the Han ethnicity makes up 98.8%. Blood samples were collected from 1100 unrelated healthy male individuals of the Chinese Han population living in Henan Province, Central China.
All participants signed informed consent and, at the same time, provided information about their birthplace, parents and grandparents. Their ancestors had lived in the region for at least three generations.

2. DNA extraction

DNA was extracted using a QIAamp DNA mini kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions.

3. Amplification and genotyping
The 17 Yfiler loci (DYS19, DYS385a/b, DYS389I/II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS635 and Y-GATA-H4) and 7 additional Y-STRs (DYS388, DYS444, DYS447, DYS449, DYS522 and DYS527a/b) were amplified using the AGCU Y24 STR kit (AGCU ScienTech, Wuxi, China) (Fig. S1). The PCR products were separated by capillary electrophoresis on an ABI PRISM 3500XL Genetic Analyzer (Life Technologies, USA). Fragment sizes were assigned using GeneMapper ID-X Software Version 3.2 (Life Technologies). Allele designations were determined by comparing the sample fragments with the allelic ladders provided with the kit. To ensure correct allele calls, Y-STR typing results for more than 120 Henan Han samples and the 9948 male control DNA (Promega Corporation, Madison, WI, USA) were confirmed to be comparable with those obtained with the AmpFlSTR® Yfiler™ kit (Life Technologies, USA). The updated recommendations of the DNA Commission of the International Society of Forensic Genetics for the analysis of Y-STR systems were followed [1].

4. Analysis of Data

Allelic frequencies were estimated by direct gene counting. Gene and haplotype diversities were calculated according to Nei's formula [2]. The discrimination capacity was calculated as the proportion of different haplotypes in the sample. Pairwise Rst values and the associated p-values (10,000 permutations) were calculated with Arlequin software Version 3.5 (http://cmpg.unibe.ch/software/arlequin3.5) to measure the genetic distances between the Yfiler haplotypes (DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS385a/b, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS635, Y-GATA-H4) of our population and those of 17 other populations, taken from published data or from data on neighbouring populations submitted to the Y-STR haplotype reference database (YHRD). To illustrate the relationships between populations based on pairwise Rst, a multidimensional scaling (MDS) plot was created using SPSS 15.
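The diversity and discrimination capacity statistics defined in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and toy data are hypothetical and are not part of the study's workflow, and the diversity is taken in its usual unbiased form, n/(n-1) × (1 − Σ p_i²). Only the last two lines use figures reported in this paper (802 and 870 distinct haplotypes among 1100 samples).

```python
from collections import Counter


def nei_diversity(observations):
    """Nei's unbiased diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(observations)
    if n < 2:
        raise ValueError("need at least two observations")
    counts = Counter(observations)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def discrimination_capacity(haplotypes):
    """Proportion of distinct haplotypes among all sampled haplotypes."""
    return len(set(haplotypes)) / len(haplotypes)


# Hypothetical toy data: alleles at one locus typed in 10 males.
toy_alleles = [12, 12, 12, 12, 12, 12, 12, 13, 13, 14]
gd = nei_diversity(toy_alleles)

# Hypothetical toy haplotypes, each a tuple of allele calls.
toy_haplotypes = [(16, 13), (16, 13), (15, 12), (14, 13)]
dc = discrimination_capacity(toy_haplotypes)

# Discrimination capacity recomputed from the counts reported in this paper.
dc_17_percent = round(100 * 802 / 1100, 2)
dc_24_percent = round(100 * 870 / 1100, 2)
```

Applied to a single locus, `nei_diversity` gives the gene diversity; applied to whole haplotypes (for example as tuples), the same function gives the haplotype diversity.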
A reduced median-joining network for the non-unique haplotype data was constructed using the Network software v.4.610 (fluxus-engineering.com) [3]. Because of their complex repeat structure, DYS385a/b and DYS527a/b were excluded from this analysis. DYS389 is a special marker that yields two values at the same location; for the analysis, the two values were designated "DYS389I" and "DYS389B" (= DYS389II minus DYS389I).

5. Quality control

Experiments were performed in the Key Laboratory of Evidence Science (China University of Political Science and Law), which is accredited according to the ISO 17025 standard. We participated in the Y-STR haplotype reference database (YHRD) quality assurance exercise in 2009, typing the YHRD core loci as well as the additional loci DYS437, DYS448, DYS456, DYS458, DYS635 and Y-GATA-H4. The haplotypes used in this study have been submitted to the Y chromosome STR haplotype reference database (http://www.yhrd.org) under accession number YA003924.

6. Results

We successfully obtained haplotypes for all 24 Y-STRs from 1100 male individuals in the Henan Han population, without observing duplications at one-allele loci or null alleles based on the observed fragment sizes. The locus or haplotype diversity and the number of observed alleles or haplotypes for the 20 single-copy STR loci and the 2 multi-copy STRs are summarized in Supplementary Material Tables S1 and S2. The haplotype distribution of the 24 Y-STR loci is shown in Table S3. The number of haplotypes and the haplotype diversities after the addition of additional Y-STRs to the Yfiler loci are shown in Table S4. The Rst values calculated to measure genetic distances between the 17-locus Yfiler haplotypes of 18 populations (n=9,009), with their statistical significance, are shown in Table S5.

Other remarks

In this study, among the 20 single-copy loci, DYS449 was the most informative locus, with a gene diversity of 0.8692, while DYS388, whose allele distribution is sharply peaked with a single allele at a frequency above 0.7, was the least informative locus, with a gene diversity of 0.4104. The 2 multi-copy Y-STRs, DYS385a/b and DYS527a/b, were more diverse than most of the single-copy Y-STRs, with diversity values of 0.9647 and 0.9304, respectively. A total of 870 different haplotypes of the 24 Y-STRs were identified in the 1100 studied male individuals, of which 754 were individual specific (68.55%). The most common haplotype, #796, was observed in 22 males, a frequency of 2.00% (DYS19: 16, DYS389I: 13, DYS389II: 32, DYS390: 25, DYS391: 10, DYS392: 11, DYS393: 13, DYS437: 14, DYS438: 11, DYS439: 11, DYS448: 19, DYS456: 16, DYS458: 17, DYS635: 23, Y-GATA-H4: 13, DYS385a/b: 11–15, DYS388: 12, DYS444: 14, DYS447: 24, DYS449: 32, DYS522: 10, DYS527a/b: 21–23) (Table S3). The unrelatedness of these 22 males was confirmed by the analysis of autosomal STRs using the AmpFlSTR® Identifiler® kit (Life Technologies, USA). The occurrence of the remaining haplotypes ranged from 1 to 12 times.
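The haplotype counts reported in this section are linked by simple bookkeeping, which the following Python sketch reproduces. The helper name `summarize` and the toy haplotype labels are hypothetical; the final lines only recombine figures stated in this paper (1100 samples, 870 distinct haplotypes, 754 individual-specific haplotypes, 22 carriers of the most common haplotype) as a consistency check.

```python
from collections import Counter


def summarize(haplotypes):
    """Count distinct, individual-specific and shared haplotypes in a sample."""
    counts = Counter(haplotypes)
    n = len(haplotypes)
    unique = sum(1 for c in counts.values() if c == 1)
    top_haplotype, top_count = counts.most_common(1)[0]
    return {
        "n": n,
        "distinct": len(counts),
        "unique": unique,                    # haplotypes seen exactly once
        "shared": len(counts) - unique,      # haplotypes seen two or more times
        "chromosomes_in_shared": n - unique,
        "top_haplotype": top_haplotype,
        "top_frequency": top_count / n,
    }


# Hypothetical toy sample: labels stand in for full 24-locus haplotypes.
toy_summary = summarize(["A", "A", "A", "B", "B", "C", "D"])

# Consistency check on the figures reported in this paper.
n, distinct, unique, top_count = 1100, 870, 754, 22
shared = distinct - unique                # non-unique haplotypes
chromosomes_in_shared = n - unique        # Y chromosomes carrying them
unique_percent = round(100 * unique / n, 2)
top_percent = round(100 * top_count / n, 2)
```

With the reported counts, this yields 116 non-unique haplotypes carried by 346 Y chromosomes, which agrees with the numbers given for the median-joining network.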
Our data also showed that 116 haplotypes were shared by two or more individuals in the population, suggesting that there are potentially a few common ancestors. In the reduced median-joining network relating the 116 non-unique haplotypes, which represent 346 Y chromosomes (Fig. S2), most haplotypes were dispersed sporadically without a core haplotype. Therefore, further analysis of Y-SNPs in more samples, together with the distribution of Y chromosomal haplogroups, would help to elucidate the relations between those non-unique haplotypes and the population history of the Henan Han.

We compared the haplotype resolution of the combined Y-STRs. With the 17 Yfiler loci, 802 different haplotypes were observed, of which 662 (60.18%) were observed once, giving a discriminatory capacity of 72.91%. Adding the 7 Y-STRs (DYS388, DYS444, DYS447, DYS449, DYS522 and DYS527a/b) to the 17 Yfiler loci improved the discrimination capacity to 79.09%, with 870 observed haplotypes in the 1100 Henan Han samples. Haplotypes shared at the Yfiler loci were separated into more distinct haplotypes by key markers, as indicated in Table S4b. It was clear that the additional Y-STRs allowed for higher haplotype resolution. DYS449 gave the largest improvement, followed by DYS527a/b, DYS444, DYS522 and DYS447, whereas the addition of DYS388 produced no change. The low diversity of DYS388 in other populations has also been reported previously [4, 5]. Combining these 7 Y-STRs with the Yfiler loci did not greatly improve the discriminatory capacity (DC) or haplotype diversity (HD) in the Henan Han samples, indicating that the discrimination power of the 24 Y-STR haplotypes in Henan Han is low for forensic and kinship casework. Therefore, further investigation into other markers is needed to achieve more discrimination in the Henan population.

Our haplotype data were also compared against the data available in the Y Chromosome Haplotype Reference Database (YHRD, Release 48), which currently includes 84,256 haplotypes in its Yfiler haplotype data set. Five hundred and eighty-two (72.57%) of the haplotypes detected in the Henan Han population found zero matches in YHRD when Yfiler haplotypes were compared. The haplotypes that did find matches in YHRD matched only haplotypes detected in Asian populations. The most common haplotype matched 36 East Asian samples and 2 Eurasian Altaic samples in the Yfiler haplotype set.

To illustrate the genetic relationships more extensively, the studied data were compared via AMOVA on the Yfiler loci with data from 17 reference populations (published and referenced in the YHRD), namely Zhejiang Han (n=4451) [6], Beijing Han (n=207) (YHRD, Accession # YA003470), Shanxi Han (n=222) [7], Mudanjiang Han (n=859) [8], Luzhou Han (n=424) [9], South Han (n=119) [10], Taiwan Han (n=200) [11], Liaoning Manchu (n=231) [12], Lhasa Tibetan (n=167) [13], Ningxia Hui (n=143) [14], Qinghai Salar (n=133) [15], Xinjiang Kazakh (n=121) [16], Xinjiang Uyghur (n=217) [16], Guangxi Yao (n=100) [17], Guangxi Yi (n=105) [17], Guangxi Jing (n=103) [17] and Guangxi Zhuang (n=107) [17], with the statistical significance determined by a permutation test (10,000 replicates, Table S5). AMOVA showed that 93.93% of the variation was found within populations, whereas 6.07% was among populations (fixation index FST = 0.06066). Pairwise analysis showed no significant difference (P > 0.05) between Henan Han and Beijing Han (Rst=0.0035). For the other samples of Chinese Han origin, from Zhejiang, Shanxi, Mudanjiang, Luzhou, South China and Taiwan, the Rst values were significant but low (0.0028~0.0418).
In the comparison with the remaining minority populations, highly significant distances were observed (P=0.00000), with the corresponding Rst values ranging from 0.0028 to 0.2492. Among the Chinese minority ethnic groups, the Manchu population has the smallest genetic distance from the Henan Han population. This is probably because their ancestors mixed more with Han Chinese during their early settlement; they governed China and greatly influenced Chinese history for nearly 300 years during the Qing Dynasty. The MDS plot (Fig. 1) constructed from the Rst distance matrix shows that the Henan population clusters with populations of Han origin, consistent with their historical ancestry, and stands far apart from the 9 other Chinese minority ethnic groups, the Manchu being the exception. The distribution pattern in the MDS plot was in good agreement with the geographic locations or the ethnic origins of these populations.

The work presented here is in compliance with the update of the guidelines and recommendations on forensic analysis using Y-chromosome STRs [18]. This paper follows the guidelines for publication of population data proposed by the journal [19].
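The MDS projection behind Fig. 1 was produced with SPSS 15, whose algorithm and settings are not described in this paper. Purely as an illustration of the general idea, the Python sketch below implements classical (Torgerson) scaling, a different and simpler variant swapped in here as an assumption. The function name `classical_mds` and the toy distance matrix are hypothetical. The sketch assumes a symmetric distance matrix whose leading eigenvalues are positive and distinct; a real Rst matrix need not be Euclidean, so negative eigenvalues are clipped to zero.

```python
import math


def classical_mds(dist, dims=2, iters=500):
    """Classical (Torgerson) MDS: double centering plus power iteration."""
    n = len(dist)
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]  # equals column means (symmetry)
    total_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        # Deterministic, non-constant start vector for the power iteration.
        v = [math.sin(i + d + 1) for i in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient v^T B v (v has unit length after iterating).
        eigval = sum(v[i] * sum(b[i][j] * v[j] for j in range(n))
                     for i in range(n))
        scale = math.sqrt(max(eigval, 0.0))  # clip negative eigenvalues
        for i in range(n):
            coords[i][d] = scale * v[i]
        # Deflate so the next pass finds the next eigenpair.
        b = [[b[i][j] - eigval * v[i] * v[j] for j in range(n)]
             for i in range(n)]
    return coords


# Hypothetical toy input: distances between the corners of a 3 x 1 rectangle.
toy_points = [(0.0, 0.0), (3.0, 0.0), (3.0, 1.0), (0.0, 1.0)]
toy_dist = [[math.hypot(p[0] - q[0], p[1] - q[1]) for q in toy_points]
            for p in toy_points]
embedding = classical_mds(toy_dist)
```

For Euclidean input such as the toy rectangle, the recovered coordinates reproduce the original pairwise distances up to rotation and reflection; for a pairwise Rst matrix the two-dimensional map is only an approximation of the distances.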
Conflict of interest

The authors state that they have no conflicts of interest.

Acknowledgments

We thank all sample donors for their contributions to this work and all those who helped with sample collection. This study was supported by the National Natural Science Foundation of China (NSFC, No. 81172902 and No. 81373745), the China Postdoctoral Science Foundation funded project (No. 2013M530371), and the Collaborative Innovation Center of Judicial Civilization, China.
References

[1] P.S. Walsh, D.A. Metzger, R. Higuchi, Chelex 100 as a medium for simple extraction of DNA for PCR-based typing from forensic material, Biotechniques 10 (1991) 506-513.
[2] M. Nei, Molecular Evolutionary Genetics, Columbia University Press, New York, 1987, pp. 176-179.
[3] H.J. Bandelt, P. Forster, A. Röhl, Median-joining networks for inferring intraspecific phylogenies, Mol. Biol. Evol. 16 (1999) 37-48.
[4] A. Mohyuddin, Q. Ayub, R. Qamar, T. Zerjal, A. Helgason, S.Q. Mehdi, C. Tyler-Smith, Y-chromosomal STR haplotypes in Pakistani populations, Forensic Sci. Int. 118 (2001) 141-146.
[5] E.Y. Lee, K.J. Shin, A. Rakha, J.E. Sim, M.J. Park, N.Y. Kim, W.I. Yang, H.Y. Lee, Analysis of 22 Y chromosomal STR haplotypes and Y haplogroup distribution in Pathans of Pakistan, Forensic Sci. Int. Genet. 11 (2014) 111-116.
[6] W.W. Wu, L.P. Pan, H.L. Hao, X.T. Zheng, J.F. Lin, D.J. Lu, Population genetics of 17 Y-STR loci in a large Chinese Han population from Zhejiang Province, Eastern China, Forensic Sci. Int. Genet. 5 (2011) e11-e13.
[7] R.F. Bai, Z. Zhang, Q.Z. Liang, D. Lu, L. Yuan, X. Yang, M.S. Shi, Haplotype diversity of 17 Y-STR loci in a Chinese Han population sample from Shanxi Province, Northern China, Forensic Sci. Int. Genet. 7 (2013) 214-216.
[8] Y. Liu, L. Liao, M. Gu, Y. Ye, Population genetics for 17 Y-STR loci in a Chinese Han population sample from Mudanjiang city, Northeast China, Forensic Sci. Int. Genet. 13 (2014) e16-e17.
[9] L. Bing, W.B. Liang, J.H. Pi, D.M. Zhang, D. Yong, H.B. Luo, L.S. Zhang, L. Zhang, Population genetics for 17 Y-STR loci (AmpFlSTR® Y-filer™) in Luzhou Han ethnic group, Forensic Sci. Int. Genet. 7 (2013) e23-e26.
[10] Y.K. Chen, Q. Li, D.C. Li, Z.H. Deng, Study on the genetic polymorphism of 17 Y chromosome specific STR loci of non-related male individuals in southern Chinese Han population, Exp. Lab Med. 26 (2008) 351-354, 386.
[11] T. Huang, Y. Hsu, J. Li, J. Chung, C. Shun, Polymorphism of 17 Y-STR loci in Taiwan population, Forensic Sci. Int. 174 (2007) 249-254.
[12] J. He, F. Guo, Population genetics of 17 Y-STR loci in Chinese Manchu population from Liaoning Province, Northeast China, Forensic Sci. Int. Genet. 7 (2013) e84-e85.
[13] B.F. Zhu, Y.M. Wu, C.M. Shen, T.H. Yang, Y.J. Deng, X. Xun, Y.F. Tian, J.C. Yan, T. Li, Genetic analysis of 17 Y-chromosomal STRs haplotypes of Chinese Tibetan ethnic group residing in Qinghai province of China, Forensic Sci. Int. 175 (2-3) (2008) 238-243.
[14] H. Guo, J.W. Yan, Z.P. Jiao, H. Tang, Q.X. Zhang, L. Zhao, N. Hua, H.F. Li, Y.C. Liu, Genetic polymorphisms for 17 Y-chromosomal STRs haplotypes in Chinese Hui population, Legal Med. (Tokyo) 10 (2008) 163-169.
[15] B.F. Zhu, C.M. Shen, X. Xun, J.W. Yan, Y.J. Den, J. Zhu, Y. Liu, Population genetic polymorphisms for 17 Y-chromosomal STRs haplotypes of Chinese Salar ethnic minority group, Legal Med. (Tokyo) 9 (2007) 203-209.
[16] W.J. Shan, Abdurahman Ablimit, W.J. Zhou, F.C. Zhang, Z.H. Ma, X.F. Zheng, Genetic polymorphism of 17 Y chromosomal STRs in Kazakh and Uighur populations from Xinjiang, China, Int. J. Legal Med. 128 (2014) 743-744.
[17] D.L. Feng, C.H. Liu, Z.R. Liang, C. Liu, Genetic polymorphism of 17 Y-STR loci in four minority populations in Guangxi of China, Hereditas (Beijing) 31 (2009) 921-935.
[18] A. Carracedo, J.M. Butler, L. Gusmão, A. Linacre, W. Parson, L. Roewer, P.M. Schneider, Update of the guidelines for the publication of genetic population data, Forensic Sci. Int. Genet. 10 (2014) A1-2.
[19] A. Carracedo, J.M. Butler, L. Gusmão, A. Linacre, W. Parson, L. Roewer, P.M. Schneider, New guidelines for the publication of genetic population data, Forensic Sci. Int. Genet. 7 (2013) 217-220.
Fig. 1 Multidimensional scaling (MDS) plot of the Henan Han population and 17 reference populations, based on pairwise Rst values. Acronyms are as follows: HN-H, Han Chinese population from Henan; SX-H, Han Chinese population from Shanxi; BJ-H, Han Chinese population from Beijing; ZJ-H, Han Chinese population from Zhejiang; SOUTH-H, Han Chinese population from South China; TAIWAN, Han Chinese population from Taiwan; MDJ-H, Han Chinese population from Mudanjiang; LZ-H, Han Chinese population from Luzhou; Yao, Chinese Yao ethnic group; Yi, Chinese Yi ethnic group; Jing, Chinese Jing ethnic group; Zhuang, Chinese Zhuang ethnic group; Manchu, Chinese Manchu ethnic group; Hui, Chinese Hui ethnic group; Uyghur, Chinese Uyghur ethnic group; Kazakh, Chinese Kazakh ethnic group.