Journal of Forensic and Legal Medicine 44 (2016) 10–13
Research Paper
Population study and mutation analysis for 28 short tandem repeat loci in southwest Chinese Han population
Qin Su a,1, Bo Jin a,b,1, Haibo Luo a, Yingbi Li a, Jin Wu a, Jing Yan a, Yiping Hou a, Weibo Liang a,*, Lin Zhang a,**
a Department of Forensic Genetics, West China School of Basic Science and Forensic Medicine, Sichuan University (West China University of Medical Sciences), Chengdu 610041, PR China
b Department of Forensic Medicine, North Sichuan Medical College, Nanchong 637000, PR China
Article info
Article history: Received 12 December 2015; received in revised form 8 August 2016; accepted 23 August 2016; available online 24 August 2016
Keywords: Short tandem repeats; Southwestern China; Population genetics; Mutations; Forensic practice
Abstract
Short tandem repeats (STRs) are the most widely used genetic markers in modern forensic practice. Because of their relatively unstable molecular structure, STRs show a high mutation rate. In the current study, we report 169 mutation events at 13 CODIS and 15 non-CODIS STR loci that were found in 5569 trio and duo paternity testing cases. Our results indicate that the locus-specific mutation rate varies among populations and is associated with the geometric mean of the longest run of perfect repeats (LRPR) and with heterozygosity. Along with previously published data, a forensic dataset of allele frequencies and locus-specific mutation rates for 13 CODIS and 15 non-CODIS STR loci in the southwest Chinese Han population has been established. The mutation rate data have important implications for interpreting forensic individual identification and paternity testing. © 2016 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
1. Introduction
Short tandem repeat (STR) loci, also known as microsatellites, are abundant in the human genome and widely used as first-line personal genetic markers in forensic practice. Based on comparative theory, the identification of individuals by STR typing has been considered the gold standard in forensic fields.1,2 Despite the great value of STR loci for forensic applications, most STR loci show higher mutation rates than coding genes,3 which can lead to incorrect conclusions in forensic analysis. Thus, it has been suggested that mutation events be taken into consideration when calculating the cumulative paternity index (CPI), or that additional STR loci be analyzed to provide more comparable DNA information and reach an unambiguous identification, especially in complex and/or disputed cases.4,5 The Combined DNA Index System (CODIS), including 13 STRs,
* Corresponding author. ** Corresponding author. E-mail addresses: [email protected] (W. Liang), [email protected] (L. Zhang).
1 Qin Su and Bo Jin contributed equally to this work and are considered co-first authors.
has been used worldwide since it was established by the FBI in 1997.6 STR loci including, but not limited to, the CODIS loci have been used in China for over two decades. It is therefore necessary to establish a database of mutation rates for CODIS and non-CODIS loci, which may have important implications for the interpretation of forensic cases. Mutation analyses of STRs in some Chinese populations have been carried out in recent years.7 Many new STR loci with excellent discriminating power in the Chinese population have been investigated and are frequently used in our practice. However, their mutation rates have not been well studied. Zhu et al. reported the mutation rates and 95% CIs of 28 STR loci, together with the mutation steps and parental origin.8 However, the factors that may contribute to mutation have not been studied. In the current study, we analyzed 28 STR loci, including 13 CODIS STRs (D3S1358, vWA, FGA, D8S1179, D21S11, D18S51, D5S818, D13S317, D16S539, TH01, TPOX, CSF1PO and D7S820) and 15 additional loci (Penta E, D2S441, D2S1338, Penta D, D10S1248, D19S433, D6S1043, D12S391, D11S2368, D13S325, D18S1364, D2S1772, D7S3048, D8S1132 and D22-GATA198B05), in the Han population living in Southwestern China, with the aim of analyzing the factors that may affect mutation rates. We found that the locus-specific STR mutation rate varies among populations and is associated with the geometric mean of the longest run of perfect repeats (LRPR) and with heterozygosity.
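The way a locus-specific mutation rate enters the cumulative paternity index can be illustrated with a minimal sketch. It assumes a simplified single-step mutation model that is not taken from this study: at a locus where the alleged father's allele differs by one repeat from the child's obligate paternal allele, the paternity index is approximated as μ/(4p), where μ is the locus mutation rate and p is the population frequency of the obligate allele. The function names, the allele frequency and the per-locus indices for matching loci are hypothetical.

```python
from math import prod

def mismatch_pi(mutation_rate, obligate_allele_freq):
    """Approximate paternity index at a locus with a one-step mismatch.

    Assumption (not from the paper): the relevant paternal allele is
    transmitted with probability 1/2, mutates with probability mu, and
    gains or loses one repeat with equal probability, so the numerator
    is mu / 4; the denominator is the obligate allele frequency.
    """
    if not 0 < obligate_allele_freq <= 1:
        raise ValueError("allele frequency must be in (0, 1]")
    return mutation_rate / (4 * obligate_allele_freq)

def cumulative_pi(locus_pis):
    """CPI as the product of per-locus paternity indices (independent loci)."""
    return prod(locus_pis)

# Hypothetical case: three matching loci plus a one-step mismatch at FGA,
# using the FGA rate reported in Table 1 (5.9e-3) and an illustrative
# allele frequency of 0.10.
matching_pis = [2.5, 4.0, 3.2]
fga_pi = mismatch_pi(5.9e-3, 0.10)
print(cumulative_pi(matching_pis + [fga_pi]))
```

Under this model a higher locus mutation rate makes a single mismatch less damaging to the CPI, which is why locus-specific rates matter in disputed cases.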
http://dx.doi.org/10.1016/j.jflm.2016.08.008 1752-928X/© 2016 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
2. Materials and methods
As described previously,8 DNA samples from 565 parents/child trios, 3558 father/child duos and 1446 mother/child duos in paternity cases were collected and genotyped for STRs. The Chelex-100® protocol was used to extract genomic DNA from peripheral blood or buccal cotton swab samples. The 28 STR loci (D3S1358, vWA, FGA, D8S1179, D21S11, D18S51, D5S818, D13S317, D16S539, TH01, TPOX, CSF1PO, D7S820, Penta E, D2S441, D2S1338, Penta D, D10S1248, D19S433, D6S1043, D12S391, D11S2368, D13S325, D18S1364, D2S1772, D7S3048, D8S1132 and D22-GATA198B05) were amplified using the multiplex PCR systems Goldeneye 20A kit (Peoplespot Incorporation, Beijing, China) and AGCU Expressmarker 22 kit (AGCU ScienTech Incorporation, Wuxi, Jiangsu, China) according to the manufacturers' instructions. PCR products were separated on ABI PRISM 310/3130 Genetic Analyzers (Applied Biosystems, Foster City, CA, USA). Data were analyzed with GeneMapper ID v3.2 software (Applied Biosystems). Mutations were first identified, and then the parental origin and number of steps were determined according to the definitions introduced by Brinkmann et al.3 and Weber.9 Null/silent alleles were excluded from this study.10
Fisher's exact test of Hardy–Weinberg equilibrium (HWE) for each locus, observed heterozygosity (Ho) and expected heterozygosity (He) were computed with Arlequin Ver 3.5.1.3 (http://cmpg.unibe.ch/software/arlequin3/). Polymorphism information content (PIC), power of discrimination (PD) and probability of paternity exclusion (PE) of the loci were calculated using PowerStats Ver 1.2 (http://www.promega.com/geneticidtools/). The mutation data of this study were compared with other reports using SPSS 22.0 (SPSS Incorporation, Chicago, Illinois, USA). The relationships between the locus-specific mutation rate and the longest run of perfect repeats (LRPR), and between the mutation rate and expected heterozygosity, were also investigated using SPSS 22.0. Other statistical tests are described in the article; a p value < 0.05 was taken to indicate a significant relationship between two variables.
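Because each of the 28 loci is tested for HWE at the 0.05 level, a multiple-testing adjustment is a natural companion to these tests. The sketch below applies a Bonferroni threshold to the five nominally significant p values reported in Section 3.1. The function name is hypothetical, and the reported p = 0.0000 is entered as 0.0.

```python
def bonferroni_significant(p_values, alpha=0.05, n_tests=None):
    """Return the Bonferroni threshold and the names whose p value falls below it.

    p_values: dict mapping locus name -> p value.
    n_tests:  total number of tests performed (defaults to len(p_values)).
    """
    n = n_tests if n_tests is not None else len(p_values)
    threshold = alpha / n
    significant = [name for name, p in p_values.items() if p < threshold]
    return threshold, significant

# Nominally significant HWE p values from Section 3.1; 28 loci were tested.
hwe_p = {
    "D12S391": 0.0000,
    "D18S1364": 0.0386,
    "D18S51": 0.0109,
    "Penta D": 0.0444,
    "TPOX": 0.0466,
}
threshold, significant = bonferroni_significant(hwe_p, n_tests=28)
print(threshold, significant)
```

Passing `n_tests=28` matters here: the threshold depends on the number of loci tested, not on the number of p values that happen to be listed.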
3. Results and discussion
3.1. General data of population genetics study
Based on the genotypes of 28 STR loci in 6134 unrelated individuals, deviations from Hardy–Weinberg equilibrium (HWE) were observed at D12S391 (p = 0.0000), D18S1364 (p = 0.0386), D18S51 (p = 0.0109), Penta D (p = 0.0444) and TPOX (p = 0.0466). After Bonferroni correction (i.e., 0.05/28 ≈ 0.0018), only D12S391 remained significant. Seventeen of the 28 loci had an observed heterozygosity (Ho) higher than 0.8. Ho ranged from 0.615 (TPOX) to 0.921 (Penta E). Penta E was the most informative locus (PD = 0.987), while TPOX (PD = 0.792) was the least informative. The combined PD of all 28 loci was approximately 1. The probability of paternity exclusion (PE) varied between 0.306 at TPOX and 0.833 at Penta E. The typical paternity index (TPI, the harmonic mean of the paternity index) varied from 1.291 at TPOX to 6.131 at Penta E. The combined PE and TPI were 0.999999999999 and 1.26 × 10^12, respectively. According to the information shown in Table S1, most of these loci show useful potential for forensic application.
3.2. The mutation rates in familial trios versus duos
As reported by Zhu et al.,8 169 mutation events were observed in 565 parents/child trios and 5004 parent/child duos. Among them, 25 mutation events occurred in trios and 144 in duos, and in 6 cases two STR exclusions were found. Combining duos and trios might lead to an underestimation of the mutation rate when the child was heterozygous at a particular locus and the available parent had
Table 1 Comparison of mutation rates and 95% confidence interval (CI) with other datasets.

This study:
Locus         Mutation rate (×10^-3)   95% CI (×10^-3)
CSF1PO        1.1                      0.4–2.5
D10S1248      0.4                      0.0–2.4
D11S2368      3.1                      0.8–8.0
D12S391       3.9                      2.3–6.1
D13S317       1.5                      0.6–3.0
D13S325       1.6                      0.2–5.6
D16S539       0.9                      0.3–2.2
D18S1364      2.3                      0.5–6.7
D18S51        2.3                      1.2–4.0
D19S433       0.9                      0.2–2.5
D21S11        1.7                      0.8–3.2
D2S1338       2.8                      1.4–5.2
D2S1772       0.8                      0.0–4.3
D2S441        0.4                      0.0–2.5
D3S1358       1.3                      0.5–2.7
D5S818        0.9                      0.3–2.2
D6S1043       2.8                      1.3–5.1
D7S3048       0.8                      0.0–4.3
D7S820        0.7                      0.1–2.1
D8S1132       2.3                      0.5–6.8
D8S1179       0.9                      0.3–2.4
FGA           5.9                      3.8–8.7
GATA198B05    2.3                      0.5–6.8
Penta D       0.5                      0.1–1.8
Penta E       1.7                      0.7–3.6
TH01          0.7                      0.2–2.2
TPOX          0.5                      0.1–1.7
vWA           2.6                      1.3–4.6

Comparison datasets (Qian et al.; Hohoff et al.; Lotte Henke et al.; Ana Carolina Mardini et al.), each reported as mutation rate (×10^-3) and 95% CI (×10^-3):
1.2
0.7e1.8
0.0
0.0e7.6
1.9
1.0e3.3
1.5
1.0e2.1
2.1 0.5
1.5e2.8 0.2e0.9
0.0
0.0e7.6
4.5 0.9
0.1e25 0.3e1.9
1.0
0.6e1.5
0.5
0.2e0.9
4.3
0.1e23.7
1.2
0.7e1.9
1.1
0.7e1.6
1.8 0.7 0.8 1.4
1.2e2.5 0.3e1.4 0.5e1.4 0.8e2.3
0.0 0.0 5.9 4.3
0.0e11.0 0.0e15.7 0.7e21.1 0.1e23.7
1.4 0.8 1.5 0.7
0.9e2.2 0.4e1.4 0.9e2.3 0.3e1.3
1.7
1.2e2.3
1.6
1.1e2.2
1.1 1.1 0.7
0.6e1.6 0.6e1.6 0.3e1.2
2.1 0.0
0.1e11.5 0.0e7.6
1.2 1.1
0.7e1.8 0.5e2.2
0.6 1.5
0.3e1.0 1.0e2.1
1.1
0.6e1.6
2.1
0.1e11.6
0.6e1.5
0.8e1.9 1.9e3.5
2.9 2.1
0.1e16.3 0.1e11.5
0.6e2.5 0.1e24.8 0.5e1.5 1.7e3.3
1.0
1.3 2.6
1.3 4.5 0.9 2.4
1.5 2.3
1.0e2.1 1.7e3.0
0.4 4.0 0.0 0.1 1.7
0.2e0.8 2.9e5.3 0.0e0.2 0.0e0.4 1.2e2.4
0.0e7.6 0.0e7.6 0.5e14.9
0.9 0.9 0.2 0.1 1.6
0.4e1.6 0.4e1.6 0.0e0.7 0.0e0.4 1.0e2.4
0.0 0.1 2.2
0.0e0.3 0.0e0.4 1.6e2.9
0.0 0.0 4.1
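Table 1 reports each mutation rate with a 95% CI, but the text does not state how the intervals were computed. One common choice for a binomial proportion is the exact (Clopper–Pearson) interval. The following is a minimal sketch under that assumption; the function names are hypothetical and the counts (1 mutation in 1295 allele transfers) are illustrative, not taken from the paper.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_term)
    return min(total, 1.0)

def _bisect(f, target, increasing):
    """Solve f(p) = target on (0, 1) for a monotone function f by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(80):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) CI for a proportion with k events in n trials."""
    if k == 0:
        lower = 0.0
    else:
        # Lower bound: P(X >= k | p) = alpha / 2, which increases with p.
        lower = _bisect(lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, True)
    if k == n:
        upper = 1.0
    else:
        # Upper bound: P(X <= k | p) = alpha / 2, which decreases with p.
        upper = _bisect(lambda p: binom_cdf(k, n, p), alpha / 2, False)
    return lower, upper

# Illustrative counts only: 1 mutation observed in 1295 allele transfers.
low, high = clopper_pearson(1, 1295)
print(f"rate = {1 / 1295 * 1e3:.1f}, CI = {low * 1e3:.1f}-{high * 1e3:.1f} (x10^-3)")
```

The exact interval is asymmetric and stays within [0, 1], which is useful when only a handful of mutations are observed at a locus.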
Fig. 1. Allele frequencies and mutation events at the FGA locus.
Fig. 2. Mutation rates and geometric mean of LRPR in 22 STR loci.
Fig. 3. Mutation rates and heterozygosity in 28 STR loci.
one shared allele and another allele differing by one repeat. To assess this, we performed a chi-square test on the average mutation rates of duos and trios and found no significant difference (P = 0.217). Therefore, we did not further discuss the possible underestimation of the mutation rate.
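A comparison of this kind can be sketched as a Pearson chi-square test on a 2 × 2 table of mutated versus non-mutated allele transfers. This is a minimal sketch, not the SPSS procedure used in the study: it applies no continuity correction, the function name is hypothetical, and the denominators in the usage line are placeholders because the number of allele transfers per group is not given in this section.

```python
import math

def chi_square_2x2(mut_a, n_a, mut_b, n_b):
    """Pearson chi-square test (1 df, no continuity correction) for two proportions.

    mut_a, mut_b: number of mutation events in each group.
    n_a, n_b:     number of allele transfers examined in each group.
    Assumes at least one event and one non-event overall (no zero margins).
    """
    table = [[mut_a, n_a - mut_a], [mut_b, n_b - mut_b]]
    row_sums = [n_a, n_b]
    col_sums = [mut_a + mut_b, (n_a - mut_a) + (n_b - mut_b)]
    total = n_a + n_b
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Event counts from the text (25 in trios, 144 in duos); the denominators
# below are placeholders, so the printed p value is illustrative only.
print(chi_square_2x2(25, 20000, 144, 100000))
```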
3.3. The mutation in different populations
To compare the results of the current study with those of previous studies,7,11–13 locus-specific mutation rates and 95% CIs in different populations are summarized in Table 1. These datasets show that, while the same locus shows varying mutation rates
among different populations, the overall pattern of mutation rates across the studied loci also differs. For example, TPOX and TH01 showed consistently low mutation rates in most datasets. In contrast, D21S11 and FGA showed opposite tendencies in different populations: D21S11 had the highest mutation rate in the study conducted by Hohoff et al. but moderate to low rates in other studies. In addition, most loci showed higher mutation rates in the current study than in the study by Qian et al., notably D13S317 and D6S1043 (p < 0.05), suggesting that mutation rates may vary among Chinese populations from different areas. In the study conducted by Hohoff et al., no mutation was found at half of the studied loci, and D16S539 and D21S11 showed higher mutation rates than all others. Between the two Caucasian datasets from different regions (German Caucasians11 and Brazilian Caucasians13), significant differences in mutation rate were detected only at D3S1358 and D8S1179. Interestingly, the FGA locus exhibited a high mutation rate in this study. As shown in Fig. 1, mutations were distributed relatively evenly among the alleles, suggesting that the mutation rate at the FGA locus is unrelated to allele length. Thus far, it is unclear why the mutation rate of FGA was much higher than in other studies. Possible reasons include differences in detection systems among laboratories as well as the geographic source of cases and the type of cases and subjects.
3.4. Linear regression between mutation rate and geometric means of the longest run of perfect repeats (LRPR)
Linear regression between the logarithm of the locus-specific mutation rate and the geometric mean of the longest run of perfect repeats (LRPR) was evaluated as previously reported.3 Because the D7S3048, D2S1772, D13S325, D8S1132, GATA198B05 and D10S1248 loci had relatively insufficient sample data, they were not included in this analysis.
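The regression described here can be sketched as an ordinary least-squares fit of the log mutation rate on the geometric mean of LRPR. This is a minimal sketch with assumptions: the base of the logarithm is not stated in the text, so base 10 is used; the function name is hypothetical; and the data points are invented for illustration rather than taken from Table S2.

```python
import math

def fit_log_rate(lrpr, rates):
    """Fit log10(rate) = intercept + slope * lrpr by least squares.

    Returns (slope, intercept, r_squared).
    """
    y = [math.log10(r) for r in rates]
    n = len(lrpr)
    mean_x = sum(lrpr) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in lrpr)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(lrpr, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - residual sum of squares / total sum of squares.
    ss_res = sum((v - (intercept + slope * x)) ** 2 for x, v in zip(lrpr, y))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Invented example values: geometric mean LRPR and mutation rate per locus.
example_lrpr = [8.0, 10.5, 12.0, 14.5, 16.0]
example_rates = [0.5e-3, 0.9e-3, 1.5e-3, 2.6e-3, 3.9e-3]
slope, intercept, r2 = fit_log_rate(example_lrpr, example_rates)
print(slope, intercept, r2)
```

A positive slope corresponds to mutation rates that rise with the length of the longest uninterrupted repeat run, and R² summarizes how much of the variation in log rate the fit explains.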
The results for the remaining 22 loci are shown in Table S2 and Fig. 2; the coefficient of determination R² between the two variables (with D21S11 excluded) was 0.7004. To some extent, this implies that the mutation rate increases with the geometric mean of LRPR, whereas the opposite trend was seen at the D21S11 locus. As DNA replication slippage appears to be more frequent at loci with a large number of repeat units,3 we speculate that the deviation seen at D21S11 is due to differences between population sample sets.
3.5. Correlation between the logarithm of mutation rate and heterozygosity
The relationship between the logarithm of the locus-specific mutation rate and expected heterozygosity was analyzed using Spearman's test. As shown in Table S3 and Fig. 3, a positive correlation was found between He and the log mutation rate, similar to that reported by Lu et al.14 for 24 loci. These results suggest that the logarithm of the locus-specific mutation rate increases with expected heterozygosity.
4. Conclusion
The current study revealed that the locus-specific mutation rate varies among populations and is associated with the geometric mean of LRPR and with heterozygosity. Along with previously published data,8 a forensic
dataset of allele frequencies and locus-specific mutation rates for 13 CODIS and 15 non-CODIS STR loci in the southwest Chinese Han population has been established. The mutation rate data have important implications for interpreting forensic individual identification and paternity testing.
Conflict of Interest Statement
We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.
Acknowledgments
This study was supported by grants from the National Natural Science Foundation of China (Nos. 81471827, 81202387), Applied Basic Research Programs of Science and Technology Commission Foundation of Sichuan Province (No. 2013JY0013), Outstanding Youth Fund of Sichuan University (No. 2014SCU04A14) and General Research Project of Education Department Foundation of Sichuan Province (No. 12SB224). We thank Dr. Junping Xin (Department of Pathology, University of Chicago) for assistance in editing and polishing the manuscript.
Appendix A. Supplementary data
Supplementary data related to this article can be found at http://dx.doi.org/10.1016/j.jflm.2016.08.008.
References
1. Bar W, et al. DNA recommendations. Further report of the DNA Commission of the ISFH regarding the use of short tandem repeat systems. International Society for Forensic Haemogenetics. Int J Leg Med. 1997;110(4):175–176.
2. Butler JM. Short tandem repeat typing technologies used in human identity testing. Biotechniques. 2007;43(4):ii–v.
3. Brinkmann B, et al. Mutation rate in human microsatellites: influence of the structure and length of the tandem repeat. Am J Hum Genet. 1998;62(6):1408–1415.
4. Henke J, Henke L. Which short tandem repeat polymorphisms are required for identification? Lessons from complicated kinship cases. Croat Med J. 2005;46(4):593–597.
5. Liu YX, et al. Multistep microsatellite mutation in a case of non-exclusion parentage. Forensic Sci Int Genet. 2015;16:205–207.
6. Butler JM. Genetics and genomics of core short tandem repeat loci used in human identity testing. J Forensic Sci. 2006;51(2):253–265.
7. Qian XQ, et al. Mutation rate analysis at 19 autosomal microsatellites. Electrophoresis. 2015;36(14):1633–1639.
8. Zhu W, et al. Mutation study of 28 autosomal STR loci in Southwest Chinese Han population. Forensic Sci Int Genet Suppl Ser. 2015;5:e298–e299.
9. Weber JL, Wong C. Mutation of human short tandem repeats. Hum Mol Genet. 1993;2(8):1123–1128.
10. Clayton TM, et al. Primer binding site mutations affecting the typing of STR loci contained within the AMPFlSTR SGM Plus kit. Forensic Sci Int. 2004;139(2):255–259.
11. Henke L, Henke J. Supplemented data on mutation rates in 33 autosomal short tandem repeat polymorphisms. J Forensic Sci. 2006;51(2):446–447.
12. Hohoff C, Schurenkamp M, Brinkmann B. Meiosis study in a population sample from Nigeria: allele frequencies and mutation rates of 16 STR loci. Int J Leg Med. 2009;123(3):259–261.
13. Mardini AC, et al. Mutation rate estimates for 13 STR loci in a large population from Rio Grande do Sul, Southern Brazil. Int J Leg Med. 2013;127(1):45–47.
14. Lu D, et al. Mutation analysis of 24 short tandem repeats in Chinese Han population. Int J Leg Med. 2012;126(2):331–335.