Polymorphic sites in human mitochondrial DNA control region sequences: population data and maternal inheritance


Forensic Science International 98 (1998) 169–178

Anne Baasner*, Claudia Schäfer, Anke Junge, Burkhard Madea

Institute of Legal Medicine, University Bonn, Stiftsplatz 12, 53111 Bonn, Germany

Received 23 February 1998; received in revised form 24 September 1998; accepted 28 September 1998

Abstract

Sequence analysis of the mitochondrial DNA (mtDNA) control region is of central importance for forensic identity testing as well as for studies of human evolution. Here we report sequencing data for hypervariable regions I and II from 50 unrelated individuals from a western German population (Rhine area). In regions I and II, 52 and 26 sites of sequence polymorphism, respectively, were noted. Nucleotide substitutions rather than insertions/deletions accounted for most of the variation, and the distribution showed a strong bias towards transitions over transversions. Furthermore, we investigated uniparental inheritance in seven CEPH families, each with 7–9 maternal descendants. Most maternal relatives shared identical mtDNA sequences. Additionally, sequences were compared for father:child pairs and, as expected, no evidence for paternal transmission of mtDNA was observed. The high variability of mtDNA control region sequences makes them useful in forensic identity investigations. The data also indicate that the rate of new mutations from one generation to the next is very low. © 1998 Elsevier Science Ireland Ltd. All rights reserved.

Keywords: mtDNA; Hypervariable region I and II; Control region; Identity testing

*Corresponding author. Tel.: +49-228-738356; fax: +49-228-738339.
0379-0738/98/$ – see front matter © 1998 Elsevier Science Ireland Ltd. All rights reserved. PII: S0379-0738(98)00163-7

1. Introduction

PCR technology to amplify VNTR loci and short tandem repeats (STR) allows consistent typing of as little as 1 ng of genomic DNA. Even greater sensitivity can be obtained by analysing mitochondrial DNA, which is present in up to 10 000 copies per cell. Owing to less efficient DNA repair and a higher frequency of DNA replication errors [1], mitochondrial DNA, and the non-coding region in particular, is highly polymorphic. Variability in mtDNA can arise only through new mutations. In vertebrates the mutation rate of mitochondrial genes is approximately 10-fold higher than that of nuclear genes [2]. Because mtDNA is maternally inherited, comparisons can be made even between distant maternal relatives [3]. Consequently, unless mutations have occurred, the mtDNA sequences of mother and child and of siblings must be identical. All these characteristics make mtDNA sequencing of the D-loop region suitable for forensic identity testing [1].

To exploit the potential informativeness of mtDNA, an adequate population study must be performed to serve as a reference. Knowledge of the frequencies with which particular mitochondrial DNA sequences occur in a given population facilitates the interpretation of results from mtDNA sequence analysis in forensic casework. Here we present sequencing data for 50 unrelated individuals from a western German population (Rhine area). The results are compared with other Caucasian populations and their limitations for individualisation are discussed. Furthermore, we compared sequences of the two hypervariable regions (HVR I and HVR II) from 7 CEPH families (74 generational events; mother:child, grandmother:grandchild, or sibling pairs) to determine empirically the frequency with which maternal relatives differ in mtDNA sequence.

2. Materials and methods

2.1. DNA extraction from whole blood

DNA from 50 unrelated individuals (Rhine area) was prepared from whole blood using the ‘salting out’ procedure according to Miller et al. [4]. The amount of DNA was quantified photometrically. DNA from the CEPH (Centre d’Etude du Polymorphisme Humain) [5] families was kindly provided by the Institute of Human Genetics, Bonn. Seven CEPH families were evaluated: Utah pedigrees 102, 884, 1333, 1340, 1341, 1345 and 13294.

2.2. Amplification protocol

PCR was conducted in two stages.

First stage: 10 ng DNA, 1 µM of each primer (L15926 and H00580), 200 µM of each dNTP, 2.5 µl 10× PCR buffer, 1.5 mM MgCl2, 1.25 U Taq polymerase (Goldstar, Eurogentec), 25 µl total volume.

Second stage: 1 µl of the PCR product from the first stage, 0.5 µM primers (biotinylated L15997 with M13(-24)-H16401 for the L-strand of HVR I, M13(-24)-L15997 with biotinylated H16401 for the H-strand of HVR I; biotinylated L29 with M13(-24)-H408 for the L-strand of HVR II, M13(-24)-L29 with biotinylated H408 for the H-strand of HVR II), 200 µM of each dNTP, 5 µl 10× PCR buffer, 1.5 mM MgCl2, 2.5 U Taq polymerase (Goldstar, Eurogentec), 50 µl total volume.

Primer sequences [6]:
L15926 5′-TCAAAGCTTACACCAGTCTTGTCTTGTAAACC-3′
H00580 5′-TTGAGGAGGTAAGCTACATA-3′
L15997 5′-CACCATTAGCACCCAAAGCT-3′
H16401 5′-TGATTTCACGGAGGATGGTG-3′
L29 5′-GGTCTATCACCCTATTAACCAC-3′
H408 5′-CTGTTAAAAGTGCATACCGCCA-3′
M13(-24) 5′-CGACGTTGTAAAACGACGGCCAGT-3′

Temperature cycle (first stage): 94 °C for 45 s, 50 °C for 1 min, 72 °C for 5.5 min; 25 cycles (Biometra, Triothermoblock).
Temperature cycle (second stage): 94 °C for 45 s, 50 °C for 1 min, 72 °C for 3 min; 25 cycles (Biometra, Triothermoblock).

The PCR product was purified and desalted using the Qiagen purification kit (Qiagen, Hilden, Germany). The entire purified product was used for the sequencing reaction.

Table 1 Sequence polymorphism in the hypervariable region I of the human mitochondrial DNA

Table 2 Sequence polymorphism in the hypervariable region I of the human mitochondrial DNA

2.3. Sequencing reaction

PCR products were sequenced by a solid-phase sequencing procedure according to the supplier’s protocol (Amersham Pharmacia Biotech, Freiburg, Germany) using the fluorescent M13(-24) primer. Electrophoresis was carried out on an automated ALFexpress DNA sequencer (Amersham Pharmacia Biotech) for 600 min at 1500 V, 34 W, 3 W laser power and 50 °C. Sequence analysis was performed using the HLA SBTyper software, version 1.00 (Amersham Pharmacia Biotech).

3. Results

Alignments were made with the original Anderson sequence [7] and all differences were noted (Tables 1–3). The Anderson sequence was atypical at two positions: at position 263 (HVR II) guanine was usually present instead of adenine, and at positions 310–314 (HVR II) six cytosine nucleotides rather than five were usually present. Among the 50 individuals investigated, 35 different genotypes were observed for HVR I and 28 for HVR II. Six individuals shared the same genotype for HVR I as well as for HVR II. HVR I showed 52 sites of sequence polymorphism, whereas HVR II showed 29 (Table 4). The distribution showed a strong bias towards transitions, with a transition:transversion ratio of 100:5 in HVR I. No transversions were observed in HVR II. The majority of the transitions observed in the two hypervariable regions were pyrimidine transitions.

The 309.1 insertion is a common variant in the HVR II poly-C stretch and was shared by 24 individuals. Two individuals (samples 13 and 49) carried nine cytosine nucleotides at this position instead of the seven reported by Anderson et al. [7]. At position 44 within region II, one individual (sample 43) showed a 1 bp cytosine insertion. In hypervariable region I, insertions were observed within the poly-C stretch at positions 16184–16193. Sample 49 contained six cytosine nucleotides instead of the four in the reference sequence at position 16193. Sample 44 showed heteroplasmy at position 16183, with adenine predominating over cytosine. No deletions were observed.

Table 3 Sequence polymorphism in the hypervariable region II of the human mitochondrial DNA
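The comparison against the reference sequence described in the Results can be sketched in a few lines. This is a minimal illustration, not the SBTyper workflow used in the study: the function name `list_differences`, the `start` offset and both ten-base fragments are hypothetical, and the sketch assumes the two sequences are already aligned and of equal length, so it reports substitutions only. Length variants such as 309.1 would need a proper alignment step.

```python
from typing import List, Tuple


def list_differences(reference: str, sample: str, start: int) -> List[Tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) for every mismatch.

    Assumes both sequences are pre-aligned and equally long, so only
    substitutions are reported; insertions and deletions are out of scope.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        (start + i, r, s)
        for i, (r, s) in enumerate(zip(reference.upper(), sample.upper()))
        if r != s
    ]


# Made-up fragments for illustration only; they are NOT taken from the
# Anderson sequence. Numbering starts at an arbitrary position 70.
reference_fragment = "GGGATGCACG"
sample_fragment = "GGGGTGCACG"

differences = list_differences(reference_fragment, sample_fragment, start=70)
print(differences)
```

Each tuple gives the position followed by the reference base and the observed base, which is the information tabulated per sample in Tables 1–3.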

Table 4 Sequence polymorphism within the variable regions of mtDNA of 50 unrelated individuals

                                            HVR I                      HVR II
Total number of sequence polymorphisms      108                        212
Transition:transversion ratio               100:5                      135:0
Pyrimidine transitions      T→C             39                         34
                            C→T             47                         10
Purine transitions          A→G             11                         81
                            G→A             3                          10
Transversions               C→A             2                          –
                            A→C             1                          –
                            G→C             1                          –
                            A→T             1                          –
Insertions                  C               Position 16193 (1–2 bp)    Position 44 (1 bp); Position 309 (1–2 bp); Position 315 (1 bp)
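The transition:transversion ratios in Table 4 follow directly from the per-substitution counts. The sketch below re-derives them as a consistency check; the names `classify`, `tally` and `TABLE_4` are illustrative, and the dashes in the HVR II transversion rows are read as zero, which is an assumption about the table.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    if ref == alt:
        raise ValueError("not a substitution")
    # A transition stays within one chemical class (purine or pyrimidine).
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Counts as listed in Table 4; keys are reference base followed by new base.
# Dashes in the HVR II column are taken as 0 (assumption).
TABLE_4 = {
    "HVR I": {"TC": 39, "CT": 47, "AG": 11, "GA": 3, "CA": 2, "AC": 1, "GC": 1, "AT": 1},
    "HVR II": {"TC": 34, "CT": 10, "AG": 81, "GA": 10, "CA": 0, "AC": 0, "GC": 0, "AT": 0},
}


def tally(counts: dict) -> dict:
    """Sum the substitution counts by class."""
    totals = {"transition": 0, "transversion": 0}
    for change, n in counts.items():
        totals[classify(change[0], change[1])] += n
    return totals


ratios = {region: tally(counts) for region, counts in TABLE_4.items()}
for region, totals in ratios.items():
    print(region, f"{totals['transition']}:{totals['transversion']}")
```

The totals reproduce the 100:5 and 135:0 ratios given in the table.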

Sample 31 was the only individual showing adenine at position 263 (HVR II) instead of guanine. A nucleotide change from adenine to guanine at position 73 was observed in 27 samples. Although most sequences were represented only once in the database, one common HVR I genotype, identical to the Anderson sequence, was observed in 22% of individuals (samples 1, 7, 10, 11, 12, 16, 19, 24, 39, 43, 44).

The genetic diversity (equivalent to heterozygosity) of a population on the basis of mtDNA types is $h = (1 - \sum x^2)\,n/(n - 1)$, where $n$ is the sample size and $x$ is the frequency of each mtDNA type [8]. The probability that two randomly selected individuals from a population have identical mtDNA types is $P = \sum x^2$. In our study the genetic diversity was 0.9306 for HVR I and 0.9739 for HVR II. The probability of two random individuals having the same genotype was 0.044 for HVR I and 0.023 for HVR II. The combined h value for HVR I and HVR II was 0.9943, and the combined P value was 0.013.

Investigation of the seven CEPH families revealed that most maternal relatives shared identical mtDNA sequences. Altogether 74 generational events were investigated, and two instances of substitution were detected (one child of family 884 and one child of family 1345). Both were cytosine insertions in the HVR II poly-C stretch (309.1). Our data indicate a substitution rate of one in 37 generations (Table 5). One son of family 1340 showed heteroplasmy in HVR II at position 227 (a 70:30 mixture), with guanine predominating over adenine. In some other CEPH samples the sequence chromatograms suggested that heteroplasmy might be present at low levels, but firm conclusions could not be drawn from the direct sequencing results. Additionally, sequences were compared for father:child pairs and, as expected, no evidence for paternal transmission of mtDNA was observed.

Table 5 Sequence comparisons and observed mutations

Sample source     mtDNA lineages     Generations     Mutations     Rate per generation
CEPH families     13                 74              2             0.027 (1/37)
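The diversity h and the identity probability P defined in the Results can be computed from a list of observed mtDNA types as sketched here. The function name `diversity` and the ten-individual sample are hypothetical and serve only to show the arithmetic; the study's own genotype lists are not reproduced.

```python
from collections import Counter
from typing import Iterable, Tuple


def diversity(types: Iterable[str]) -> Tuple[float, float]:
    """Return (h, P) for a sample of mtDNA types.

    P is the sum of squared type frequencies x; h = (1 - P) * n / (n - 1),
    with n the number of individuals sampled.
    """
    counts = Counter(types)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("need at least two individuals")
    p_identity = sum((c / n) ** 2 for c in counts.values())
    h = (1.0 - p_identity) * n / (n - 1)
    return h, p_identity


# Hypothetical sample: type A seen four times, type B twice, four singletons.
sample_types = ["A"] * 4 + ["B"] * 2 + ["C", "D", "E", "F"]
h, p = diversity(sample_types)
print(f"h = {h:.4f}, P = {p:.4f}")
```

For this toy sample the squared frequencies sum to 0.24, so h is 0.76 × 10/9. Singletons contribute little to P, which is why a population in which most sequences occur only once has a diversity close to 1.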

4. Discussion

In all cases direct sequencing of the PCR products gave satisfactory results. Sequencing should be performed in both the forward and the reverse direction. We have confirmed the findings of others that information from both strands greatly enhances the ability to sequence a PCR-derived template unambiguously [9].

The distribution of sequence polymorphisms showed that most changes are transitions; within hypervariable region II, only transitions were observed. The two length mutations in the second region each involve the gain of a cytosine on either side of the thymine (base 309 of the reference sequence). This thymine residue therefore interrupts a region containing 12 to 14 consecutive cytosines, suggesting that length mutations in this region may arise by slipped mispairing involving this long series of cytosines [2]. The same mechanism could be responsible for the insertion of two cytosine nucleotides (sample 49) in the HVR I poly-C stretch at positions 16184–16193.

Overall, our sequencing results are comparable to other German and Caucasian population data [10,11]. The extent of variation in each population is given by the diversity value h. In our study the h value was 0.9306 for HVR I and 0.9739 for HVR II, and the combined h value was 0.9943. The probabilities of identity of 0.044 for HVR I, 0.023 for HVR II and 0.013 for the two regions taken together are similar to those described by other authors [12]. Our results show that both variable regions within the non-coding region have to be sequenced to obtain maximum information. A more extensive population study is needed for a more detailed statistical evaluation.

Most maternal relatives in the investigated CEPH families shared identical mtDNA sequences. The two substitutions were observed in the HVR II poly-C stretch (309.1). The 309.1 C insertion is a common variant, shared by approximately 50% of individuals in our database. Most individuals are heteroplasmic at low levels for length variants in this HVR II C-stretch region. Because our CEPH study covers a very restricted period of evolutionary time, it is reasonable that we would observe substitutions at the sites where they occur most rapidly [13]. Another possible explanation is that the cell lines display an artificially elevated rate of substitution owing to extended replication in culture. A previous study found no evidence for this explanation [13]; if it were the case, however, the data from the CEPH families would overestimate the actual substitution rate.

In our study heteroplasmy was clearly detected in one individual from the CEPH lineages and in one individual of the population study. The occurrence of such low levels of heteroplasmy suggests either that heteroplasmy is a very rare event or that it is not easily detectable by direct sequencing. A suitable estimate of the frequency of heteroplasmy is still lacking. A previous study suggested that heteroplasmic variants in the control region are more widespread than reported here [14]. The low frequency of heteroplasmy observed may therefore reflect the difficulty of detecting it by direct sequencing.

Because mtDNA is maternally inherited, a suspect's maternal relatives will carry the identical genotype unless mutations have occurred. Sequencing of the D-loop region of mtDNA is therefore of limited application in the identification of criminals [15] and is most likely to be useful in identifying the victims of crimes and accidents [3,16,17]. In general, mitochondrial DNA profiling cannot be used to identify an individual definitively, but it has considerable utility in providing corroborative evidence in criminal cases.
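The per-generation rate in Table 5 rests on two events in 74 generational events, so its sampling uncertainty is worth keeping in mind when comparing it with other estimates. The sketch below computes an exact (Clopper-Pearson) binomial interval with the standard library only. It is a supplementary illustration and not part of the original analysis: the helper names are hypothetical, and treating the 74 events as independent trials with a common rate is an assumption.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target: float, iters: int = 100) -> float:
    """Bisection on [0, 1] for f(p) = target, where f decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


def exact_interval(k: int, n: int, alpha: float = 0.05):
    """Clopper-Pearson interval for k events in n independent trials."""
    # Lower limit: P(X >= k | p) = alpha/2, i.e. P(X <= k-1 | p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper limit: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


events, generations = 2, 74  # values from Table 5
rate = events / generations
lower, upper = exact_interval(events, generations)
print(f"rate = {rate:.3f}, 95% interval = ({lower:.4f}, {upper:.4f})")
```

With so few observed events the interval is wide relative to the point estimate, which is consistent with the call above for a larger study.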

Acknowledgements

This work was supported by the Medical Institution of the University of Bonn (BONFOR, 148.01).

References

[1] M.R. Wilson, M. Stoneking, M.M. Holland, J.A. DiZinno, B. Budowle, Guidelines for the use of mitochondrial DNA sequencing in forensic science, Crime Laboratory Digest 20 (1993) 68–77.
[2] R.L. Cann, A.C. Wilson, Length mutation in human mitochondrial DNA, Genetics 104 (1983) 699–711.
[3] P.L. Ivanov, M.J. Wadhams, R.K. Roby, M.M. Holland, V.W. Weedn, T.J. Parsons, Mitochondrial DNA sequence heteroplasmy in the Grand Duke of Russia Georgij Romanov establishes the authenticity of the remains of Tsar Nicholas II, Nat Genet 12 (1996) 417–420.
[4] S.A. Miller, D.D. Dykes, H.F. Polesky, A simple salting out procedure for extracting DNA from human nucleated cells, Nucleic Acids Res 16 (1988) 1215.
[5] Dausset, Le centre d’étude du polymorphisme humain, La Presse Médicale 15 (1986) 1801–1802.
[6] K.M. Sullivan, R. Hopgood, P. Gill, Identification of human remains by amplification and automated sequencing of mitochondrial DNA, Int J Leg Med 106 (1992) 85–90.
[7] S. Anderson, A.T. Bankier, B.G. Barrell, M.H.L. de Bruijn, A.R. Coulson, J. Drouin, I.C. Eperon, D.P. Nierlich, B.A. Roe, F. Sanger, P.H. Schreier, A.J.H. Smith, R. Staden, I.G. Young, Sequence and organization of the human mitochondrial genome, Nature 290 (1981) 457–465.
[8] F. Tajima, Statistical method for testing the neutral mutation hypothesis by DNA polymorphism, Genetics 123 (1989) 585–595.
[9] S.M. Hyder, C. Hu, Y. Needleman, X.Y. Sonoda, V.V. Baker, Improved accuracy in direct automated DNA sequencing of small PCR products by optimizing the template concentration, BioTechniques 17 (1994) 478–482.
[10] S. Lutz, H.J. Weisser, J. Heizmann, S. Pollak, Location and frequency of polymorphic positions in the mtDNA control region of individuals from Germany, Int J Leg Med 111 (1998) 67–77.
[11] W. Parson, T.J. Parsons, R. Scheithauer, M.M. Holland, Population data for 101 Austrian Caucasians mitochondrial DNA d-loop sequences: Application of mtDNA sequence analysis to a forensic case, Int J Leg Med 111 (1998) 124–132.
[12] R. Piercy, K.M. Sullivan, N. Benson, P. Gill, The application of mitochondrial DNA typing to the study of white Caucasian genetic identification, Int J Leg Med 106 (1993) 85–90.
[13] T.J. Parsons, D.S. Muniec, K. Sullivan, N. Woodyatt, R. Alliston-Greiner, M.R. Wilson, D.L. Berry, K.A. Holland, V.W. Weedn, P. Gill, M.M. Holland, A high observed substitution rate in the human mitochondrial DNA control region, Nat Genet 15 (1997) 363–368.
[14] K.E. Bendall, V.A. Macaulay, J.R. Baker, B.C. Sykes, Heteroplasmic point mutations in the human mtDNA control region, Am J Hum Genet 59 (1996) 1276–1287.
[15] G. Barreto, A.R. Vago, C. Ginther, A.J.G. Simpson, S.D.J. Pena, Mitochondrial D-loop signatures produced by low-stringency single specific primer PCR constitute a simple comparative human identity test, Am J Hum Genet 58 (1996) 609–616.
[16] P. Gill, P.L. Ivanov, C. Kimpton, R. Piercy, N. Benson, G. Tully, I. Evett, Identification of the remains of the Romanov family by DNA analysis, Nat Genet 6 (1994) 130–135.
[17] M.M. Holland, D.B. Fisher, B.G. Mitchell, W.C. Rodriguez, J.J. Canik, C.R. Merril, V.W. Weedn, Mitochondrial DNA sequences analyses of human skeletal remains: identification of remains from the Vietnam war, J Forensic Sci 38 (1993) 542–553.