Legal Medicine 9 (2007) 33–37 www.elsevier.com/locate/legalmed
Announcement of population data
Sequence polymorphism of the mitochondrial DNA hypervariable regions I and II in 205 Singapore Malays
Hang Yee Wong a,*, June S.W. Tang a, Bruce Budowle b, Marc W. Allard c, Christopher K.C. Syn a, Wai Fun Tan-Siew a, Shui Tse Chow a
a DNA Profiling Laboratory, Centre for Forensic Science, Health Sciences Authority, 11 Outram Road, Singapore 169078, Singapore
b Federal Bureau of Investigation, Laboratory Division, Quantico, VA 22135, USA
c Department of Biological Sciences, George Washington University, Washington, DC 20052, USA
Received 2 February 2006; received in revised form 30 June 2006; accepted 18 August 2006
Abstract
Mitochondrial DNA sequences of the hypervariable regions HV1 and HV2 were analyzed in 205 unrelated ethnic Malays residing in Singapore as an initial effort to generate a database for forensic identification purposes. Sequence polymorphism was detected using PCR and direct sequencing analysis. A total of 152 haplotypes were found, defined by 152 polymorphic sites. Of the 152 haplotypes, 115 were observed only once and 37 were seen in multiple individuals. The most common haplotype (16223T, 16295T, 16362C, 73G, 146C, 199C, 263G, and 315.1C) was shared by 7 (3.41%) individuals, two haplotypes were shared by 4 individuals, seven haplotypes were shared by 3 individuals, and 27 haplotypes were shared by 2 individuals. Haplotype diversity and random match probability were estimated to be 0.9961 and 0.87%, respectively.
Crown Copyright © 2006 Published by Elsevier Ireland Ltd. All rights reserved.
Keywords: mtDNA; Haplotypes; Haplogroups; Heteroplasmy; Sequence polymorphism; Singapore Malay population
* Corresponding author. Tel.: +65 62130779; fax: +65 62130855. E-mail address: [email protected] (H.Y. Wong).

Population: Blood samples from 205 unrelated ethnic Malays residing in Singapore were collected on FTA paper (Whatman, Middlesex, UK) [1].

Purification: Purification of the FTA punches was performed in accordance with the manufacturer's protocol.

PCR amplification: The HV1 (positions 16024–16365) and HV2 (positions 73–340) regions were amplified in a single reaction using primers F15971 (TTA ACT CCA CCA TTA GCA CC) and R484 (TGA GAT TAG TAG TAT GGG AG). The PCR was performed in a total volume of 50 µl consisting of 1× PCR buffer (containing 1.5 mM MgCl2, 10 mM Tris–HCl and 50 mM KCl), 200 µM of each dNTP, 400 nM of each primer and 8 µg of BSA with 5 U AmpliTaq Gold (Applied Biosystems Inc., Foster City, USA). Amplification was performed on a PTC-200 DNA Engine Peltier Thermal Cycler (MJ Research, Inc., Massachusetts, USA) using the following conditions: 96 °C for 10 min, followed by 32 cycles of 94 °C for 20 s, 56 °C for 10 s and 72 °C for 30 s, and a hold at 4 °C. Post-amplification products were purified using ExoSAP-IT (USB Corp., OH, USA) in a 5:1 ratio [2]. Samples were randomly chosen from each batch of PCR amplification and subjected to electrophoresis on a 2% agarose gel to estimate the yield of the PCR products.

Sequencing: Sequencing reactions were performed in a total reaction volume of 20 µl consisting of 20 ng of template amplicon, 10 pmol of primer and the ABI BigDye Terminator Cycle Sequencing kit, on a PTC-200 under the following conditions: 96 °C for 1 min, followed by 25 cycles of 96 °C for 10 s, 50 °C for 5 s and 60 °C for 4 min, and a hold at 4 °C. The primers used for HV1 were F15971 (TTA ACT CCA CCA TTA GCA CC) and R16410 (GAG GAT GGT GGT CAA GGG AC), while primers used for HV2 were F15 (CAC CCT ATT AAC CAC TCA CG), R408 (CTG TTA AAA GTG CAT ACC GCC) and
1344-6223/$ - see front matter Crown Copyright 2006 Published by Elsevier Ireland Ltd. All rights reserved. doi:10.1016/j.legalmed.2006.08.007
H.Y. Wong et al. / Legal Medicine 9 (2007) 33–37
Table 1. Mitochondrial DNA HV1 and HV2 control region sequence polymorphism in 205 unrelated Malays residing in Singapore

| Haplotype | Variation in HV1^a | Variation in HV2^b | Frequency |
|---|---|---|---|
| HA1 | 37G, 183C, 189C, 223T, 261T, 362C | 152C, 315.1C | 2 |
| HA2 | 37G, 223T, 278T, 311C, 320T | 146C, 150T, 152C, 269T, 315.1C | 1 |
| HA3 | 42A, 70G, 183C, 189C, 209C, 223T, 233G | 143A, 152C, 315.1C | 1 |
| HA4 | 47A, 51G, 168T, 184A, 189C, 201T, 304C | 93G, 146C, 315.1C | 1 |
| HA5 | 47A, 93C, 129A, 223T, 261T, 356C | 150T, 152C, 315.1C | 1 |
| HA6 | 51G, 93C, 182C, 183C, 189C, 218T, 292T, 362C | 207A, 228A, 234G, 249D, 315.1C | 1 |
| HA7 | 51G, 168T, 311C | 146C, 315.1C | 1 |
| HA8 | 51G, 179T, 234T | 146C, 152C, 315.1C, 334C | 1 |
| HA9 | 51G, 182C, 183C, 189C, 362C | 315.1C | 1 |
| HA10 | 51G, 183C, 189C, 194C, 195C, 266T | 315.1C | 2 |
| HA11 | 51G, 184T, 223T, 325C, 362C | 195, 315.1C | 1 |
| HA12 | 51G, 185T, 223T, 362C | 195C, 315.1C | 1 |
| HA13 | 70G, 93C, 140C, 182C, 183C, 189C, 266A | 210G, 315.1C | 1 |
| HA14 | 86C, 129A, 183C, 189C, 223T, 297C | 150T, 152C, 199C, 315.1C | 1 |
| HA15 | 86C, 129A, 192T, 223T, 297C | 150T, 199C, 315.1C | 3 |
| HA16 | 86C, 129A, 209C, 223T, 272G | 151T, 152C, 225A, 249D, 315.1C, 316A | 1 |
| HA17 | 86C, 129A, 209C, 223T, 272G | 152C, 225A, 249D, 291T, 315.1C, 316A | 1 |
| HA18 | 86C, 147T, 183C, 184A, 189C, 217C, 235G | 146C, 315.1C | 1 |
| HA19 | 86C, 172C, 189C, 223T, 234T, 290T | 125C, 127C, 128T, 146C, 195C, 315.1C | 1 |
| HA20 | 92C, 140C, 172C, 189C, 223T, 278T | 249G, 315.1C, 319C | 1 |
| HA21 | 92C, 140C, 182C, 183C, 189C, 261T, 266A | 152C, 210G, 315.1C | 1 |
| HA22 | 92C, 148T, 182C, 183C, 189C, 223T, 362C | 150T, 152C, 185A, 315.1C | 1 |
| HA23 | 92C, 164G, 182C, 183C, 189C, 223T, 266T, 362C | 150T, 315.1C | 1 |
| HA24 | 93C, 126C, 207G, 292T, 309G, 318T | 151T, 152C, 315.1C | 1 |
| HA25 | 93C, 129A, 140C, 182C, 183C, 189C, 261T, 266A | 152C, 210G, 315.1C | 1 |
| HA26 | 93C, 129A, 223T, 256T, 271C | 204C, 315.1C | 1 |
| HA27 | 93C, 129A, 223T, 256T, 271C | 315.1C | 1 |
| HA28 | 93C, 140C, 183C, 189C, 266A | 146C, 210G, 315.1C | 1 |
| HA29 | 93C, 172C, 223T, 298C, 327T | 249D, 315.1C | 1 |
| HA30 | 93C, 183D, 186T, 189C, 223T, 271C, 311C | 151T, 315.1C | 1 |
| HA31 | 93C, 184A, 223T, 278T | 151T, 315.1C | 2 |
| HA32 | 93C, 185T, 223T, 260T, 294T, 298C | 152C, 189G, 207A, 249D, 315.1C | 2 |
| HA33 | 93C, 192T, 223T, 266T, 271C, 316G, 362C | 184A, 249D, 315.1C | 1 |
| HA34 | 93C, 209C, 223T, 224C, 263C, 278T, 319A | 146C, 150T, 151T, 152C, 315.1C | 1 |
| HA35 | 93C, 209C, 223T, 224C, 263C, 278T, 319A | 146C, 150T, 151T, 234G, 315.1C | 2 |
| HA36 | 93C, 223T, 295T, 362C | 146C, 199C, 315.1C | 3 |
| HA37 | 93C, 260T, 298C, 311C, 355T, 362C | 249D, 315.1C | 1 |
| HA38 | 93C, 260T, 298C, 355T, 362C | 204C, 207A, 249D, 315.1C | 1 |
| HA39 | 93C, 295T, 362C | 146C, 199C, 315.1C | 1 |
| HA40 | 108T, 129A, 162G, 172C, 189C, 304C | 249D, 293C, 315.1C | 1 |
| HA41 | 108T, 129A, 162G, 172C, 189C, 304C | 249D, 315.1C | 4 |
| HA42 | 108T, 129A, 162G, 172C, 288C, 304C | 150T, 249D, 315.1C | 1 |
| HA43 | 108T, 129A, 162G, 172C, 304C | 150T, 249D, 315.1C | 2 |
| HA44 | 108T, 129A, 162G, 172C, 304C | 249D, 315.1C | 2 |
| HA45 | 108T, 129A, 172C, 223T, 234T, 290T, 311C | 153G, 185A, 189G, 315.1C | 1 |
| HA46 | 108T, 129A, 172C, 223T, 234T, 290T | 153G, 185A, 189G, 315.1C | 1 |
| HA47 | 111T, 140C, 182C, 183C, 189C, 234T, 243C, 291T | 131C, 204C, 315.1C | 1 |
| HA48 | 111T, 168T, 172C, 183C, 189C, 223T, 362C | 152C, 315.1C | 1 |
| HA49 | 124C, 179A, 223T, 261T, 262T | 315.1C | 1 |
| HA50 | 124C, 223T, 248T, 362C | 315.1C | 1 |
| HA51 | 126C, 140C, 182C, 183C, 189C, 261T, 266A | 152C, 210G, 315.1C | 1 |
| HA52 | 126C, 214A, 223T, 271C, 278T, 298C | 152C, 195C, 204C, 315.1C | 2 |
| HA53 | 126C, 223T, 290T | 315.1C | 1 |
| HA54 | 126C, 231C, 311C | 143A, 228A, 315.1C | 1 |
| HA55 | 126C, 231C, 311C | 315.1C | 1 |
| HA56 | 126C, 292T, 294T, 296T | 152C, 315.1C | 1 |
| HA57 | 129A, 140C, 182C, 183C, 189C, 261T, 266A | 131C, 152C, 210G, 308D, 309D, 315.1C | 1 |
| HA58 | 129A, 140C, 182C, 183C, 189C, 266A | 152C, 210G, 315.1C | 1 |
| HA59 | 129A, 140C, 223T, 271C | 146C, 151T, 315.1C | 1 |
| HA60 | 129A, 140C, 271C | 143A, 146C, 151T, 315.1C | 2 |
| HA61 | 129A, 145A, 249C, 288C, 301T, 304C, 311C | 152C, 315.1C, 329A | 2 |
| HA62 | 129A, 148T, 172C, 223T, 256T, 305G, 309G | 152C, 315.1C | 1 |
| HA63 | 129A, 172C, 293G, 304C, 311C | 249D, 315.1C | 1 |
| HA64 | 129A, 172C, 294T, 304C, 362C | 152C, 249D, 315.1C | 2 |
| HA65 | 129A, 172C, 301T, 304C | 249D, 315.1C | 2 |
| HA66 | 129A, 172C, 304C, 311C | 249D, 315.1C | 2 |
| HA67 | 129A, 172C, 304C | 207A, 249D, 315.1C | 2 |
| HA68 | 129A, 172C, 304C | 249D, 315.1C | 2 |
| HA69 | 129A, 182C, 183C, 189C, 223T, 265G, 297C | 150T, 199C, 315.1C, 332T | 1 |
| HA70 | 129A, 182C, 183C, 189C, 223T, 297C | 150T, 199C, 315.1C, 332T | 1 |
| HA71 | 129A, 183C, 189C, 192T, 223T, 297C, 362C | 150T, 199C, 315.1C, 332T | 1 |
| HA72 | 129A, 183C, 189C, 223T, 295T, 362C | 146C, 199C, 315.1C | 1 |
| HA73 | 129A, 189C, 192T, 223T, 297C | 150T, 199C, 315.1C, 332T | 1 |
| HA74 | 129A, 209C, 223T, 272G | 152C, 225A, 249D, 315.1C, 316A | 1 |
| HA75 | 129A, 223T, 256T, 271C, 362C | 315.1C | 1 |
| HA76 | 129A, 223T, 257A, 261T | 150T, 315.1C | 2 |
| HA77 | 129A, 223T, 263C, 362C | 152C, 315.1C | 1 |
| HA78 | 129A, 223T, 311C | 199C, 204C, 250C, 310C | 1 |
| HA79 | 134T, 223T, 362C | 315.1C | 2 |
| HA80 | 136C, 153A, 223T, 274A, 311C | 315.1C | 1 |
| HA81 | 136C, 182C, 183C, 189C, 217C | 207A, 315.1C | 1 |
| HA82 | 136C, 183C, 189C, 217C, 319A | 207A, 315.1C | 1 |
| HA83 | 136C, 223T, 257A, 261T, 292T, 294T | 146C, 150T, 152C, 315.1C | 1 |
| HA84 | 136C, 223T, 257A, 261T, 292T, 294T | 150T, 207A, 315.1C | 1 |
| HA85 | 136C, 223T, 257A, 261T, 292T, 294T | 150T, 315.1C | 1 |
| HA86 | 140C, 182C, 183C, 189C, 217C, 274A, 335G | 146C, 150T, 195C, 315.1C | 1 |
| HA87 | 140C, 182C, 183C, 189C, 220C, 261T, 266A | 152C, 210G, 315.1C | 1 |
| HA88 | 140C, 182C, 183C, 189C, 261T, 266A | 152C, 210G, 315.1C | 3 |
| HA89 | 140C, 183C, 189C, 234T, 266A | 152C, 210G, 315.1C | 1 |
| HA90 | 140C, 183C, 189C, 243C | 103A, 152C, 189G, 315.1C | 1 |
| HA91 | 140C, 183C, 189C, 243C | 103A, 152C, 204C, 315.1C | 1 |
| HA92 | 140C, 183C, 189C, 243C | 103A, 152C, 315.1C | 3 |
| HA93 | 140C, 183C, 189C, 264T, 266A | 210G, 315.1C | 1 |
| HA94 | 140C, 183C, 189C, 266A, 356C | 210G, 228A, 234G, 315.1C | 1 |
| HA95 | 140C, 183C, 189C, 266A | 146C, 210G, 315.1C | 3 |
| HA96 | 140C, 183C, 189C, 266A | 152C, 210G, 315.1C | 1 |
| HA97 | 140C, 183C, 189C, 266A | 210G, 315.1C | 3 |
| HA98 | 140C, 223T, 261T, 362C | 152C, 195C, 315.1C | 1 |
| HA99 | 145A, 223T, 311C | 146C, 315.1C | 1 |
| HA100 | 147T, 183C, 184A, 189C, 217C, 235G, 356C | 204C, 315.1C | 1 |
| HA101 | 147T, 183C, 184A, 189C, 217C, 235G | 146C, 315.1C | 1 |
| HA102 | 147T, 183C, 184A, 189C, 217C, 235G | 315.1C | 2 |
| HA103 | 147T, 193T, 223T, 291T | 150T, 195C, 199C, 204C, 315.1C, 337D | 1 |
| HA104 | 148T, 182C, 183C, 189C, 223T, 362C | 150T, 152C, 315.1C | 1 |
| HA105 | 166G, 192T, 266T, 304C, 311C | 93G, 315.1C | 1 |
| HA106 | 167T, 223T, 246T, 311C, 362C | 195C, 315.1C | 1 |
| HA107 | 168T, 295T, 296T, 304C | 146C, 152C, 199C, 249G, 315.1C | 2 |
| HA108 | 172C, 304C, 311C | 152C, 249D, 315.1C | 1 |
| HA109 | 176C, 220C, 241C, 254G, 265G, 298C, 362C | 150T, 152C, 249D, 315.1C | 1 |
| HA110 | 176T, 223T, 278T, 354T | 199C, 315.1C | 1 |
| HA111 | 182C, 183C, 189C, 217C, 223T, 261T | 146C, 152C, 315.1C | 1 |
| HA112 | 182C, 183C, 189C, 217C, 261T, 362C | 146C, 315.1C | 1 |
| HA113 | 182C, 183C, 189C, 223T, 271C, 311C | 151T, 185A, 315.1C | 1 |
| HA114 | 183C, 189C, 223T, 304C | 199C, 315.1C | 1 |
| HA115 | 184T, 223T, 298C, 319A | 315.1C | 2 |
| HA116 | 185T, 223T, 260T, 298C | 152C, 214G, 249D, 315.1C | 1 |
| HA117 | 188T, 223T, 274A, 278T, 311C | 146C, 152C, 310C | 1 |
| HA118 | 189C, 223T, 229C, 294T, 311C, 362C | 151T, 152C, 315.1C | 1 |
| HA119 | 189C, 223T, 234T, 278T, 311C | 150T, 152C, 315.1C | 1 |
| HA120 | 189C, 223T, 278T | 150T, 152C, 228A, 315.1C | 1 |
| HA121 | 189C, 223T, 297C | 150T, 199C, 204C, 315.1C | 1 |
| HA122 | 189C, 223T, 311C | 146C, 199C, 315.1C | 1 |
| HA123 | 192T, 223T, 274A, 362C | 153G, 315.1C | 1 |
| HA124 | 192T, 288C, 304C, 309G | 143A, 183G, 315.1C | 1 |
| HA125 | 209C, 223T, 233G, 291T, 304C, 311C | 143A, 189G, 195C, 315.1C | 2 |
| HA126 | 214A, 223T, 230R, 278T | 152C, 195C, 204C, 207R, 315.1C | 1 |
| HA127 | 214T, 234T, 238C | 152C, 315.1C, 339G | 1 |
| HA128 | 223T, 239T, 263C, 325C | 315.1C | 1 |
| HA129 | 223T, 249C, 295T, 362C | 146C, 199C, 315.1C | 3 |
| HA130 | 223T, 257A, 261T, 292T, 294T | 150T, 195C, 315.1C | 2 |
| HA131 | 223T, 257A, 261T, 292T, 294T | 150T, 315.1C | 2 |
| HA132 | 223T, 261T, 302G, 362C | 152C, 315.1C | 2 |
| HA133 | 223T, 261T, 362C | 152C, 315.1C | 2 |
| HA134 | 223T, 263C, 274A, 311C, 343G, 357C | 315.1C | 1 |
| HA135 | 223T, 271C, 291T, 362C | 315.1C | 1 |
| HA136 | 223T, 286T, 362C | 315.1C | 1 |
| HA137 | 223T, 290T, 304C | 159C, 315.1C | 1 |
| HA138 | 223T, 291T, 362C | 152C, 193G, 315.1C | 1 |
| HA139 | 223T, 291T, 362C | 315.1C | 2 |
| HA140 | 223T, 295T, 311C, 362C | 146C, 199C, 315.1C | 1 |
| HA141 | 223T, 295T, 362C | 146C, 199C, 204C, 315.1C | 1 |
| HA142 | 223T, 295T, 362C | 146C, 199C, 315.1C | 7 |
| HA143 | 223T, 295T, 362C | 146C, 315.1C | 1 |
| HA144 | 223T, 298C, 327T | 249D, 315.1C | 1 |
| HA145 | 223T, 311C, 362C | 146C, 199, 315.1C | 1 |
| HA146 | 223T, 311C, 362C | 315.1C | 2 |
| HA147 | 223T, 324C | 143A, 234G, 315.1C | 1 |
| HA148 | 223T, 362C | 195C, 315.1C | 1 |
| HA149 | 223T, 362C | 315.1C | 1 |
| HA150 | 260T, 298C, 355T, 362C | 207A, 249D, 315.1C | 1 |
| HA151 | 289G | 195C, 315.1C | 4 |
| HA152 | 304C, 309G | 315.1C | 1 |
Each sequence was compared with the rCRS, and substituted bases were recorded following the International Union of Pure and Applied Chemistry (IUPAC) designations.
^a For ease of tabulation, HV1 positions are given with 16,000 subtracted. For example, the first base change for HA1, shown here as 37G, refers to 16037G.
^b As all the haplotypes contain the polymorphisms 73G and 263G in the HV2 region, these sites are not listed in the table. Length variation in the 303–309 C-stretch region is ignored.
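The shorthand described in the table footnotes can be undone mechanically. The following is a minimal sketch, not part of the original study: the function name `expand_haplotype` and the helper names are hypothetical, and it assumes each table cell is a comma-separated string exactly as printed. It adds 16,000 to each HV1 position and restores the shared HV2 sites 73G and 263G.

```python
import re

# Sites present in every haplotype and therefore omitted from the HV2 column
# (Table 1, footnote b).
SHARED_HV2 = ["73G", "263G"]


def _tokens(cell):
    """Split a table cell such as '223T, 295T, 362C' into variant strings."""
    return [tok.strip() for tok in cell.split(",") if tok.strip()]


def _position(variant):
    """Numeric position of a variant: '315.1C' -> 315.1, '249D' -> 249.0."""
    return float(re.match(r"\d+(?:\.\d+)?", variant).group())


def expand_haplotype(hv1_cell, hv2_cell):
    """Return the full-notation variant list for one row of Table 1."""
    hv1 = []
    for tok in _tokens(hv1_cell):
        pos, rest = re.fullmatch(r"(\d+)(.*)", tok).groups()
        hv1.append(f"{16000 + int(pos)}{rest}")  # footnote a: add 16,000 back
    hv2 = sorted(_tokens(hv2_cell) + SHARED_HV2, key=_position)
    return hv1 + hv2


# Row HA142, the most common haplotype in the data set.
print(", ".join(expand_haplotype("223T, 295T, 362C", "146C, 199C, 315.1C")))
# prints: 16223T, 16295T, 16362C, 73G, 146C, 199C, 263G, 315.1C
```

Applied to row HA142, the sketch reproduces the most common haplotype as it is written out in the abstract.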
R484 (TGA GAT TAG TAG TAT GGG AG). After cycle sequencing, reactions were purified using AutoSeq96 Plates (Amersham Biosciences, Uppsala, Sweden), which contain DNA-grade Sephadex G-50. Sequencing reactions were run on an ABI Prism 3100 Genetic Analyzer.

Results: See Table 1.

Analysis of data: Data were analysed with the ABI Prism Sequencing Analysis Software Version 3.7. Samples were sequenced in both the 5′ and 3′ directions, and additional sequencing was performed for samples with length heteroplasmy in the C-stretch regions. To minimize transcription errors, analysis was carried out both manually and electronically. For manual analysis, electropherograms were visually checked, and sequence data were exported into a Word document and then aligned with the revised Cambridge Reference Sequence (rCRS) [3] using Vector NTI Advance 9.0 (InforMax, Maryland, USA) to identify sequence polymorphisms. For electronic analysis, the entire data set was transferred directly to ABI SeqScape Software Version 2.1.1 to generate a list of sequence polymorphisms. The two sets of data were compared and any discrepancies resolved. Haplotype diversity was calculated using the equation $h = n(1 - \sum X_i^2)/(n - 1)$, where $n$ is the sample size and $X_i$ is the frequency of the $i$-th haplotype [4]. The random match probability was taken to be $P = \sum X_i^2$ [5], and the mean pairwise difference was calculated according to Nei [6].
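As a check on the two formulas above, the following sketch recomputes haplotype diversity and random match probability from the haplotype sharing pattern given in the abstract (115 haplotypes seen once, 27 seen twice, 7 seen three times, 2 seen four times and 1 seen seven times). The names `SPECTRUM` and `summarize` are illustrative and do not come from the original analysis, which the authors do not describe at the code level. The sketch also counts matching pairs, the quantity behind the empirical match frequency reported under Other remarks.

```python
from fractions import Fraction
from math import comb

# {number of individuals sharing a haplotype: number of such haplotypes},
# taken from the counts stated in the abstract.
SPECTRUM = {1: 115, 2: 27, 3: 7, 4: 2, 7: 1}


def summarize(spectrum):
    """Compute h, P and pairwise match counts from a sharing spectrum."""
    n = sum(size * count for size, count in spectrum.items())  # sample size
    # Sum of squared haplotype frequencies, sum(X_i^2), kept exact.
    sum_sq = sum(count * Fraction(size, n) ** 2 for size, count in spectrum.items())
    return {
        "n": n,
        "haplotypes": sum(spectrum.values()),
        "h": float(Fraction(n, n - 1) * (1 - sum_sq)),  # h = n(1 - sum X_i^2)/(n - 1)
        "P": float(sum_sq),                             # P = sum X_i^2
        # Each haplotype shared by k individuals contributes C(k, 2) matching pairs.
        "matching_pairs": sum(count * comb(size, 2) for size, count in spectrum.items()),
        "total_pairs": comb(n, 2),
    }


stats = summarize(SPECTRUM)
print(f"n = {stats['n']}, haplotypes = {stats['haplotypes']}")
print(f"h = {stats['h']:.4f}, P = {100 * stats['P']:.2f}%")
print(f"{stats['matching_pairs']} matches in {stats['total_pairs']} comparisons")
```

With these counts the sketch gives h = 0.9961 and P = 0.87%, with 81 matching pairs in 20,910 comparisons, in agreement with the reported values. Note that h defined this way equals one minus the proportion of matching pairs, so the two summaries are consistent by construction.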
Quality control: The Laboratory is accredited by ASCLD. External proficiency testing is obtained from ASCLD/DAB approved test providers.

Other remarks: In the present study, single nucleotide polymorphisms (SNPs) 309.1 and 309.2 were not considered, as these sites are not routinely used when comparing unknown forensic and reference samples. Based on the compiled data, a total of 132 and 75 haplotypes were observed in HV1 and HV2, respectively, of which 38 haplotypes of HV1 and 31 haplotypes of HV2 were shared by more than one individual. The most common haplotypes of HV1 and HV2 were found in 9 (4.39%) and 26 (12.68%) individuals, respectively. When the HV1 and HV2 regions were combined, 152 different mtDNA haplotypes were observed, of which 115 were observed only once and 37 were seen in multiple individuals. Of the 37 haplotypes, 27 were shared by 2 individuals, 7 were shared by 3 individuals and 2 were shared by 4 individuals, while the most common haplotype (16223T, 16295T, 16362C, 73G, 146C, 199C, 263G, 315.1C) was shared by 7 (3.41%) individuals. In this database, haplotype diversity and random match probability for the combined HV1 and HV2 regions were estimated as 0.9961 and 0.87%, respectively. When compared with the rCRS, a total of 152 polymorphic sites (107 from HV1, 45 from HV2) were observed. The mean pairwise difference for HV1 and HV2 combined haplotypes is 10.4 ± 4.8. Comparison of all possible sequence pairs gave 81 matches out of 20,910 comparisons, an empirical random match frequency of one in 258. In our population study, only one sample, M54 (Haplotype HA126), was observed to contain two point heteroplasmies. Both were purine transitions, 16230R and 207R, with height proportions of >30% (Fig. 1). Recommendations for the differentiation of artefacts from true point heteroplasmy include repeated sequencing [7], sequence analysis of the opposite strand [8] and assessment of the proportions of the two peaks [9]. We adopted all three suggestions and concluded that sample M54 did exhibit two point heteroplasmies. In this study, we have established a Singapore Malay mtDNA sequence database of the HV1 and HV2 regions for 205 individuals. These data will facilitate assessment of the significance of matches in mtDNA sequence forensic casework in Singapore, as well as population and evolutionary analyses.

Fig. 1. The electropherogram of the mtDNA HV1 and HV2 sequence of M54 (Haplotype HA126) shows clearly defined peaks G and A at position 16230, and G and A at position 207, respectively. Both nucleotide signals are well above the background noise and reproducible through independent sequencing reactions, which demonstrates a two-point heteroplasmy.

Acknowledgements
This research was supported by a research vote from Health Sciences Authority, Singapore. H.Y.W. and J.S.W.T. would like to extend their appreciation to Jodi Irwin, Armed Forces DNA Identification Laboratory, for her invaluable advice.

References
[1] Fujita Y, Kubo S. Application of FTA technology to extraction of sperm DNA from mixed body fluids containing semen. Legal Med. 2006;8:43–7.
[2] Dugan KA, Lawrence HS, Hares DR, Fisher CL, Budowle B. An improved method for post-PCR purification for mtDNA sequence analysis. J. Forensic Sci. 2002;47:811–8.
[3] Andrews RM, Kubacka I, Chinnery PF, Lightowlers RN, Turnbull DM, Howell N. Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA. Nat. Genet. 1999;23:147.
[4] Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 1989;123:585–95.
[5] Stoneking M, Hedgecock D, Higuchi RG, Vigilant L, Erlich HA. Population variation of human mtDNA control region sequences detected by enzymatic amplification and sequence-specific oligonucleotide probes. Am. J. Hum. Genet. 1991;48:370–82.
[6] Nei M. Molecular evolutionary genetics. New York: Columbia University Press; 1987.
[7] Parsons TJ, Muniec DS, Sullivan K, Woodyatt N, Alliston-Greiner R, Wilson MR, et al. A high observed substitution rate in human mitochondrial DNA control region. Nat. Genet. 1997;15:363–8.
[8] Parson W, Parsons TJ, Scheithauer R, Holland MM. Population data for 101 Austrian Caucasian mitochondrial DNA d-loop sequences: application of mtDNA sequence analysis to a forensic case. Int. J. Legal Med. 1998;111:124–32.
[9] Carracedo A, Bär W, Lincoln P, Mayr W, Morling N, Olaisen B, et al. DNA commission of the international society for forensic genetics: guidelines for mitochondrial DNA typing. Forensic Sci. Int. 2000;110:79–85.